Other articles (40)

  • Creating farms of unique websites

    13 April 2011

    MediaSPIP platforms can be installed as a farm, with a single "core" hosted on a dedicated server and used by multiple websites.
    This allows (among other things): implementation costs to be shared between several different projects or individuals; rapid deployment of multiple unique sites; and creation of groups of like-minded sites, making it possible to browse media in a more controlled and selective environment than the major "open" (...)

  • Final creation of the channel

    12 March 2010

    Once your request has been approved, you can proceed with the actual creation of the channel. Each channel is a fully fledged site placed under your responsibility. The platform administrators have no access to it.
    Upon approval, you receive an email inviting you to create your channel.
    To do so, simply go to its address, in our example "http://votre_sous_domaine.mediaspip.net".
    At that point a password is requested; you simply need to (...)

  • The farm's regular Cron tasks

    1 December 2010

    Managing the farm relies on running several repetitive tasks, known as Cron tasks, at regular intervals.
    The super Cron (gestion_mutu_super_cron)
    This task, scheduled every minute, simply calls the Cron of every instance in the shared installation on a regular basis. Combined with a system Cron on the central site of the shared installation, this makes it possible to generate regular visits to the various sites and to prevent the tasks of rarely visited sites from being too (...)

On other sites (4465)

  • ffmpeg piped output producing incorrect metadata frame count

    8 December 2024, by Xorgon

    The short version: Using piped output from ffmpeg produces a file with incorrect metadata.

    1. ffmpeg -y -i .\test_mp4.mp4 -f avi -c:v libx264 - > output.avi to make an AVI file using the pipe output.
    2. ffprobe -v error -count_frames -show_entries stream=duration,nb_read_frames,r_frame_rate .\output.avi
    3. The output will show that the metadata does not match the actual frames contained in the video.

    Details below.

    Using Python, I am attempting to use ffmpeg to compress videos and put them in a PowerPoint. This works great; however, the video files themselves have incorrect frame counts, which can cause issues when I read from those videos in other code.

    


    Edit for clarification: by "frame count" I mean the metadata frame count. The actual number of frames contained in the video is correct, but querying the metadata gives an incorrect frame count.

    


    Having eliminated the PowerPoint aspect of the code, I've narrowed this down to the following minimal reproducing example of saving an output from an ffmpeg pipe:

    


from subprocess import Popen, PIPE

video_path = 'test_mp4.mp4'

ffmpeg_pipe = Popen(['ffmpeg',
                     '-y',  # Overwrite files
                     '-i', video_path,  # Input from file
                     '-f', 'avi',  # Output format
                     '-c:v', 'libx264',  # Codec
                     '-'],  # Output to pipe
                    stdout=PIPE)

new_path = "piped_video.avi"
# communicate() reads all of stdout and waits for ffmpeg to exit
video_bytes, _ = ffmpeg_pipe.communicate()
with open(new_path, "wb") as vid_file:
    vid_file.write(video_bytes)


    


    I've tested several different videos. One small example video that I've tested can be found here.

    


    I've tried a few different codecs with avi format and tried libvpx with webm format. For the avi outputs, the frame count usually reads as 1073741824 (2^30). Weirdly, for the webm format, the frame count read as -276701161105643264.

    


    Edit: This issue can also be reproduced with just ffmpeg in command prompt using the following command:
ffmpeg -y -i .\test_mp4.mp4 -f avi -c:v libx264 - > output.avi

    


    This is a snippet I used to read the frame count, but one could also see the error by opening the video details in Windows Explorer and seeing the total time as something like 9942 hours, 3 minutes, and 14 seconds.

    


import cv2

video_path = 'test_mp4.mp4'
new_path = "piped_video.webm"

# Frame count that OpenCV reports for the original file
cap = cv2.VideoCapture(video_path)
print(f"Original video frame count: {int(cap.get(cv2.CAP_PROP_FRAME_COUNT)):d}")
cap.release()

# Frame count that OpenCV reports for the piped file
cap = cv2.VideoCapture(new_path)
print(f"Piped video frame count: {int(cap.get(cv2.CAP_PROP_FRAME_COUNT)):d}")
cap.release()


    


    The error can also be observed using ffprobe with the following command: ffprobe -v error -count_frames -show_entries stream=duration,nb_read_frames,r_frame_rate .\output.avi. Note that the frame rate and number of frames counted by ffprobe do not match the duration from the metadata.
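
    The mismatch can also be checked programmatically. The following is a minimal sketch, assuming `-of json` is added to the ffprobe command above so that its output can be parsed; the helper name `metadata_is_consistent`, the tolerance, and the sample values are illustrative assumptions and not part of the original post. As a side note, 2^30 frames at 30 fps is about 35,791,394 seconds, or roughly 9942 hours, 3 minutes, and 14 seconds, which is consistent with the duration Windows Explorer displays.

```python
import json
from fractions import Fraction


def metadata_is_consistent(ffprobe_json: str, tolerance_s: float = 0.5) -> bool:
    """Compare the stream duration with the duration implied by the frames read.

    Expects the JSON printed by ffprobe when `-of json` is added to
    `-count_frames -show_entries stream=duration,nb_read_frames,r_frame_rate`.
    """
    stream = json.loads(ffprobe_json)["streams"][0]
    frame_rate = Fraction(stream["r_frame_rate"])   # e.g. "30/1"
    counted_frames = int(stream["nb_read_frames"])  # frames actually decoded
    reported = float(stream["duration"])            # duration from the metadata
    expected = float(counted_frames / frame_rate)
    return abs(reported - expected) <= tolerance_s


# Illustrative values only, not real ffprobe output
healthy = '{"streams": [{"r_frame_rate": "30/1", "duration": "6.666667", "nb_read_frames": "200"}]}'
broken = '{"streams": [{"r_frame_rate": "30/1", "duration": "35791394.133333", "nb_read_frames": "200"}]}'

print(metadata_is_consistent(healthy))
print(metadata_is_consistent(broken))
```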

    


    For completeness, here is the ffmpeg output:

    


ffmpeg version 2023-06-11-git-09621fd7d9-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      58. 13.100 / 58. 13.100
  libavcodec     60. 17.100 / 60. 17.100
  libavformat    60.  6.100 / 60.  6.100
  libavdevice    60.  2.100 / 60.  2.100
  libavfilter     9.  8.101 /  9.  8.101
  libswscale      7.  3.100 /  7.  3.100
  libswresample   4. 11.100 /  4. 11.100
  libpostproc    57.  2.100 / 57.  2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test_mp4.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2022-08-10T12:54:09.000000Z
  Duration: 00:00:06.67, start: 0.000000, bitrate: 567 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 384x264 [SAR 1:1 DAR 16:11], 563 kb/s, 30 fps, 30 tbr, 30k tbn (default)
    Metadata:
      creation_time   : 2022-08-10T12:54:09.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : AVC Coding
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 0000018c68c8b9c0] using SAR=1/1
[libx264 @ 0000018c68c8b9c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0000018c68c8b9c0] profile High, level 2.1, 4:2:0, 8-bit
Output #0, avi, to 'pipe:':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    ISFT            : Lavf60.6.100
  Stream #0:0(eng): Video: h264 (H264 / 0x34363248), yuv420p(progressive), 384x264 [SAR 1:1 DAR 16:11], q=2-31, 30 fps, 30 tbn (default)
    Metadata:
      creation_time   : 2022-08-10T12:54:09.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.17.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
[out#0/avi @ 0000018c687f47c0] video:82kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.631060%
frame=  200 fps=0.0 q=-1.0 Lsize=      85kB time=00:00:06.56 bitrate= 106.5kbits/s speed=76.2x    
[libx264 @ 0000018c68c8b9c0] frame I:1     Avg QP:16.12  size:  3659
[libx264 @ 0000018c68c8b9c0] frame P:80    Avg QP:21.31  size:   647
[libx264 @ 0000018c68c8b9c0] frame B:119   Avg QP:26.74  size:   243
[libx264 @ 0000018c68c8b9c0] consecutive B-frames:  3.0% 53.0%  0.0% 44.0%
[libx264 @ 0000018c68c8b9c0] mb I  I16..4: 17.6% 70.6% 11.8%
[libx264 @ 0000018c68c8b9c0] mb P  I16..4:  0.8%  1.7%  0.6%  P16..4: 17.6%  4.6%  3.3%  0.0%  0.0%    skip:71.4%
[libx264 @ 0000018c68c8b9c0] mb B  I16..4:  0.1%  0.3%  0.2%  B16..8: 11.7%  1.4%  0.4%  direct: 0.6%  skip:85.4%  L0:32.0% L1:59.7% BI: 8.3%
[libx264 @ 0000018c68c8b9c0] 8x8 transform intra:59.6% inter:62.4%
[libx264 @ 0000018c68c8b9c0] coded y,uvDC,uvAC intra: 48.5% 0.0% 0.0% inter: 3.5% 0.0% 0.0%
[libx264 @ 0000018c68c8b9c0] i16 v,h,dc,p: 19% 39% 25% 17%
[libx264 @ 0000018c68c8b9c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 25% 30%  3%  3%  4%  4%  4%  5%
[libx264 @ 0000018c68c8b9c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 20% 16%  6%  8%  8%  8%  5%  6%
[libx264 @ 0000018c68c8b9c0] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0000018c68c8b9c0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0000018c68c8b9c0] ref P L0: 76.2%  7.9% 11.2%  4.7%
[libx264 @ 0000018c68c8b9c0] ref B L0: 85.6% 12.9%  1.5%
[libx264 @ 0000018c68c8b9c0] ref B L1: 97.7%  2.3%
[libx264 @ 0000018c68c8b9c0] kb/s:101.19


    


    So the question is: why does this happen, and how can one avoid it?
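
    A likely explanation, which the post itself does not confirm, is that muxers such as AVI write placeholder header values first and seek back to fill in the real frame count once encoding ends; a pipe cannot be seeked, so the placeholders remain. One workaround under that assumption is to let ffmpeg write to a real temporary file and read the bytes back afterwards. The function names below are hypothetical, and ffmpeg is assumed to be on the PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Same options as the piped command, but targeting a seekable file."""
    return ['ffmpeg', '-y', '-i', src, '-f', 'avi', '-c:v', 'libx264', dst]


def transcode_to_bytes(src: str) -> bytes:
    """Encode src into a temporary AVI file and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        dst = str(Path(tmp_dir) / 'output.avi')
        subprocess.run(build_ffmpeg_command(src, dst), check=True)
        return Path(dst).read_bytes()


print(build_ffmpeg_command('test_mp4.mp4', 'output.avi'))
```

    Calling `transcode_to_bytes('test_mp4.mp4')` would then return the encoded bytes for use elsewhere, at the cost of one round trip through the file system.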

    


  • How to resize dimensions of video through ffmpeg-python?

    25 January, by kunambi

    I'm trying to resize a video file which a user has uploaded to Django, by using ffmpeg-python. The documentation isn't very easy to understand, so I've tried to cobble this together from various sources.

    


    This method is run in a celery container, in order to not slow the experience for the user. The problem I'm facing is that I can't seem to resize the video file. I've tried two different approaches:

    


import ffmpeg  # ffmpeg-python
from django.db import models
from io import BytesIO
from myapp.models import MediaModel


def resize_video(mypk: str) -> None:
    instance = MediaModel.objects.get(pk=mypk)
    media_instance: models.FileField = instance.media
    media_output = "test.mp4"
    buffer = BytesIO()

    for chunk in media_instance.chunks():
        buffer.write(chunk)

    stream_video = ffmpeg.input("pipe:").video.filter("scale", 720, -1)  # resize to 720px width
    stream_audio = ffmpeg.input("pipe:").audio
    process = (
        ffmpeg.output(stream_video, stream_audio, media_output, acodec="aac")
        .overwrite_output()
        .run_async(pipe_stdin=True, quiet=True)
    )
    buffer.seek(0)
    process_out, process_err = process.communicate(input=buffer.getbuffer())
    # (pdb) process_out
    # b''

    # attempting to use `.concat` instead
    process2 = (
        ffmpeg.concat(stream_video, stream_audio, v=1, a=1)
        .output(media_output)
        .overwrite_output()
        .run_async(pipe_stdin=True, quiet=True)
    )
    buffer.seek(0)
    process2_out, process2_err = process2.communicate(input=buffer.getbuffer())
    # (pdb) process2_out
    # b''


    


    As we can see, no matter which approach is chosen, the output is an empty byte string. Both process_err and process2_err contain the following message:

    


ffmpeg version N-111491-g31979127f8-20230717 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.1.0 (crosstool-NG 1.25.0.196_227d99d)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static
--pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64
--target-os=mingw32 --enable-gpl --enable-version3 --disable-debug
--disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2
--enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp
--enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl
--disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib
--enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth
--enable-chromaprint --enable-libdav1d --enable-libdavs2
--disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r
--enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray
--enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist
--enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp
--enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb
--enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg
--enable-libopenmpt --enable-librav1e --enable-librubberband
--enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt
--enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm
--disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc
--enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2
--enable-libxvid --enable-libzimg --enable-libzvbi
--extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags=
--extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp
--extra-version=20230717
  libavutil      58. 14.100 / 58. 14.100
  libavcodec     60. 22.100 / 60. 22.100
  libavformat    60. 10.100 / 60. 10.100
  libavdevice    60.  2.101 / 60.  2.101
  libavfilter     9.  8.102 /  9.  8.102
  libswscale      7.  3.100 /  7.  3.100
  libswresample   4. 11.100 /  4. 11.100
  libpostproc    57.  2.100 / 57.  2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pipe:':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2020-11-10T15:01:09.000000Z
  Duration: 00:00:04.16, start: 0.000000, bitrate: N/A
  Stream #0:0[0x1](eng): Video: h264 (Main) (avc1 / 0x31637661),
yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2649 kb/s, 25 fps, 25
tbr, 25k tbn (default)
    Metadata:
      creation_time   : 2020-11-10T15:01:09.000000Z
      handler_name    : ?Mainconcept Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : AVC Coding
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2020-11-10T15:01:09.000000Z
      handler_name    : #Mainconcept MP4 Sound Media Handler
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:0 (h264) -> scale:default (graph 0)
  scale:default (graph 0) -> Stream #0:0 (libx264)
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
[libx264 @ 00000243a23a1100] using SAR=1/1
[libx264 @ 00000243a23a1100] using cpu capabilities: MMX2 SSE2Fast SSSE3
SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 00000243a23a1100] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 00000243a23a1100] 264 - core 164 - H.264/MPEG-4 AVC codec -
Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1
ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00
mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1
sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0
constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1
weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40
intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0
qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'aa37f8d7685f4df9af85b1cdcd95997e.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    encoder         : Lavf60.10.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, progressive),
800x450 [SAR 1:1 DAR 16:9], q=2-31, 25 fps, 12800 tbn
    Metadata:
      encoder         : Lavc60.22.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
fltp, 128 kb/s (default)
    Metadata:
      creation_time   : 2020-11-10T15:01:09.000000Z
      handler_name    : #Mainconcept MP4 Sound Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.22.100 aac
frame=    0 fps=0.0 q=0.0 size=       0kB time=N/A bitrate=N/A speed=N/A
frame=   21 fps=0.0 q=28.0 size=       0kB time=00:00:02.75 bitrate=   0.1kbits/s speed=4.75x
[out#0/mp4 @ 00000243a230bd80] video:91kB audio:67kB subtitle:0kB other
streams:0kB global headers:0kB muxing overhead: 2.838559%
frame=  104 fps=101 q=-1.0 Lsize=     162kB time=00:00:04.13 bitrate=
320.6kbits/s speed=4.02x    
[libx264 @ 00000243a23a1100] frame I:1     Avg QP:18.56  size:  2456
[libx264 @ 00000243a23a1100] frame P:33    Avg QP:16.86  size:  1552
[libx264 @ 00000243a23a1100] frame B:70    Avg QP:17.55  size:   553
[libx264 @ 00000243a23a1100] consecutive B-frames:  4.8% 11.5% 14.4%
69.2%
[libx264 @ 00000243a23a1100] mb I  I16..4: 17.3% 82.1%  0.6%
[libx264 @ 00000243a23a1100] mb P  I16..4:  5.9% 15.2%  0.4%  P16..4: 18.3% 
0.9%  0.4%  0.0%  0.0%    skip:58.7%
[libx264 @ 00000243a23a1100] mb B  I16..4:  0.8%  0.3%  0.0%  B16..8: 15.4% 
1.0%  0.0%  direct: 3.6%  skip:78.9%  L0:34.2% L1:64.0% BI: 1.7%
[libx264 @ 00000243a23a1100] 8x8 transform intra:68.2% inter:82.3%
[libx264 @ 00000243a23a1100] coded y,uvDC,uvAC intra: 4.2% 18.4% 1.2% inter:
1.0% 6.9% 0.0%
[libx264 @ 00000243a23a1100] i16 v,h,dc,p: 53% 25%  8% 14%
[libx264 @ 00000243a23a1100] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19%  6% 70%  1% 
1%  1%  1%  0%  0%
[libx264 @ 00000243a23a1100] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 46% 21% 15%  2% 
5%  4%  3%  3%  1%
[libx264 @ 00000243a23a1100] i8c dc,h,v,p: 71% 15% 13%  1%
[libx264 @ 00000243a23a1100] Weighted P-Frames: Y:30.3% UV:15.2%
[libx264 @ 00000243a23a1100] ref P L0: 46.7%  7.5% 34.6%  7.3%  3.9%
[libx264 @ 00000243a23a1100] ref B L0: 88.0% 10.5%  1.5%
[libx264 @ 00000243a23a1100] ref B L1: 98.1%  1.9%
[libx264 @ 00000243a23a1100] kb/s:177.73
[aac @ 00000243a23a2e00] Qavg: 1353.589


    


    I'm at a loss right now, would love any feedback/solution.
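
    Two details in the log above may explain the symptom: ffmpeg reports writing `Output #0, mp4` to a named file and encoding 104 frames, so the job appears to have succeeded; and stdout only carries video data when the output target is `pipe:`, so an empty `process_out` would be expected here. The following is a minimal sketch that sidesteps pipes altogether, assuming the upload is first saved to disk. The function names and paths are hypothetical, and `-2` asks the scale filter to pick an even height that preserves the aspect ratio.

```python
import subprocess


def build_resize_command(src: str, dst: str, width: int = 720) -> list[str]:
    """Build an ffmpeg command that scales the video and re-encodes audio as AAC."""
    return [
        'ffmpeg', '-y',
        '-i', src,
        '-vf', f'scale={width}:-2',  # -2: keep aspect ratio, force an even height
        '-c:a', 'aac',
        dst,
    ]


def resize_video_file(src: str, dst: str, width: int = 720) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_resize_command(src, dst, width), check=True)


print(build_resize_command('upload.mp4', 'resized.mp4'))
```

    The resized result is then the file at `dst`, not anything returned on stdout.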

    


  • FFmpeg overlay positioning issue: Converting frontend center coordinates to FFmpeg top-left coordinates

    25 January, by tarun

    I'm building a web-based video editor where users can:

    - Add multiple videos
    - Add images
    - Add text overlays with background color

    The frontend sends coordinates where each element's (x, y) represents its center position. When the export button is clicked, I want all of the data exported as one final video. On click, I send the data to the backend like this:

    


  const exportAllVideos = async () => {
    try {
      const formData = new FormData();
        
      
      const normalizedVideos = videos.map(video => ({
          ...video,
          startTime: parseFloat(video.startTime),
          endTime: parseFloat(video.endTime),
          duration: parseFloat(video.duration)
      })).sort((a, b) => a.startTime - b.startTime);

      
      for (const video of normalizedVideos) {
          const response = await fetch(video.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
          formData.append("videos", file);
      }

      
      const normalizedImages = images.map(image => ({
          ...image,
          startTime: parseFloat(image.startTime),
          endTime: parseFloat(image.endTime),
          x: parseInt(image.x),
          y: parseInt(image.y),
          width: parseInt(image.width),
          height: parseInt(image.height),
          opacity: parseInt(image.opacity)
      }));

      
      for (const image of normalizedImages) {
          const response = await fetch(image.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
          formData.append("images", file);
      }

      
      const normalizedTexts = texts.map(text => ({
          ...text,
          startTime: parseFloat(text.startTime),
          endTime: parseFloat(text.endTime),
          x: parseInt(text.x),
          y: parseInt(text.y),
          fontSize: parseInt(text.fontSize),
          opacity: parseInt(text.opacity)
      }));

      
      formData.append("metadata", JSON.stringify({
          videos: normalizedVideos,
          images: normalizedImages,
          texts: normalizedTexts
      }));

      const response = await fetch("my_flask_endpoint", {
          method: "POST",
          body: formData
      });

      if (!response.ok) {
          // Stop here so the error response is not downloaded as a video
          throw new Error(`Export failed with status ${response.status}`);
      }

      const finalVideo = await response.blob();
      const url = URL.createObjectURL(finalVideo);
      const a = document.createElement("a");
      a.href = url;
      a.download = "final_video.mp4";
      a.click();
      URL.revokeObjectURL(url);

    } catch (e) {
      console.log(e, "err");
    }
  };


    


    On the frontend, each object (text, image, and video) is stored in an array of objects. Below is the data structure for each object:

    


  // Frontend data structure for each object type
  const newVideo = {
      id: uuidv4(),
      src: URL.createObjectURL(videoData.videoBlob),
      originalDuration: videoData.duration,
      duration: videoData.duration,
      startTime: 0,
      playbackOffset: 0,
      endTime: videoData.endTime || videoData.duration,
      isPlaying: false,
      isDragging: false,
      speed: 1,
      volume: 100,
      x: window.innerHeight / 2,
      y: window.innerHeight / 2,
      width: videoData.width,
      height: videoData.height,
    };
    const newTextObject = {
      id: uuidv4(),
      description: text,
      opacity: 100,
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      fontSize: 18,
      duration: 20,
      endTime: 20,
      startTime: 0,
      color: "#ffffff",
      backgroundColor: hasBG,
      padding: 8,
      fontWeight: "normal",
      width: 200,
      height: 40,
    };

    const newImage = {
      id: uuidv4(),
      src: URL.createObjectURL(imageData),
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      width: 200,
      height: 200,
      borderRadius: 0,
      startTime: 0,
      endTime: 20,
      duration: 20,
      opacity: 100,
    };



    


    Backend code:

    


import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})



UPLOAD_FOLDER = 'temp_uploads'
if not os.path.exists(UPLOAD_FOLDER):
    os.makedirs(UPLOAD_FOLDER)


@app.route('/')
def home():
    return 'Hello World'


OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080



@app.route('/process', methods=['POST'])
def process_video():
    work_dir = None
    try:
        work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
        os.makedirs(work_dir)
        print(f"Created working directory: {work_dir}")

        metadata = json.loads(request.form['metadata'])
        print("Received metadata:", json.dumps(metadata, indent=2))
        
        video_paths = []
        videos = request.files.getlist('videos')
        for idx, video in enumerate(videos):
            filename = f"video_{idx}.mp4"
            filepath = os.path.join(work_dir, filename)
            video.save(filepath)
            if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
                video_paths.append(filepath)
                print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
            else:
                raise Exception(f"Failed to save video {idx}")

        image_paths = []
        images = request.files.getlist('images')
        for idx, image in enumerate(images):
            filename = f"image_{idx}.png"
            filepath = os.path.join(work_dir, filename)
            image.save(filepath)
            if os.path.exists(filepath):
                image_paths.append(filepath)
                print(f"Saved image to: {filepath}")

        output_path = os.path.join(work_dir, 'output.mp4')

        filter_parts = []

        base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
        filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

        for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
            x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
            y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))
            
            filter_parts.extend([
                f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
                f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
            ])

            if idx == 0:
                filter_parts.append(
                    f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
                )
            else:
                filter_parts.append(
                    f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
                    f'[temp{idx}];'
                )

        last_video_temp = f'temp{len(video_paths)-1}'

        if video_paths:
            audio_mix_parts = []
            for idx in range(len(video_paths)):
                audio_mix_parts.append(f'[a{idx}]')
            filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

        
        if image_paths:
            for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
                input_idx = len(video_paths) + idx
                
                
                x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
                y_pos = int(img_meta["y"] - (img_meta["height"] / 2))
                
                filter_parts.extend([
                    f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
                    f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
                    f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
                ])
                last_video_temp = f'imgout{idx}'

        if metadata.get('texts'):
            for idx, text in enumerate(metadata['texts']):
                next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'
                
                escaped_text = text["description"].replace("'", "\\'")
                
                x_pos = int(text["x"] - (text["width"] / 2))
                y_pos = int(text["y"] - (text["height"] / 2))
                
                text_filter = (
                    f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
                    f'x={x_pos}:y={y_pos}:'
                    f'fontsize={text["fontSize"]}:'
                    f'fontcolor={text["color"]}'
                )
                
                if text.get('backgroundColor'):
                    text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'
                
                if text.get('fontWeight') == 'bold':
                    text_filter += ':font=Arial-Bold'
                
                text_filter += (
                    f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
                    f'[{next_output}];'
                )
                
                filter_parts.append(text_filter)
                last_video_temp = next_output
        else:
            filter_parts.append(f'[{last_video_temp}]null[vout];')

        
        filter_complex = ''.join(filter_parts)

        
        cmd = [
            'ffmpeg',
            *sum([['-i', path] for path in video_paths], []),
            *sum([['-i', path] for path in image_paths], []),
            '-filter_complex', filter_complex,
            '-map', '[vout]'
        ]
        
        
        if video_paths:
            cmd.extend(['-map', '[aout]'])
        
        cmd.extend(['-y', output_path])

        print(f"Running ffmpeg command: {' '.join(cmd)}")
        result = subprocess.run(cmd, capture_output=True, text=True)
        
        if result.returncode != 0:
            print(f"FFmpeg error output: {result.stderr}")
            raise Exception(f"FFmpeg processing failed: {result.stderr}")

        return send_file(
            output_path,
            mimetype='video/mp4',
            as_attachment=True,
            download_name='final_video.mp4'
        )

    except Exception as e:
        print(f"Error in video processing: {str(e)}")
        return {'error': str(e)}, 500
    
    finally:
        if work_dir and os.path.exists(work_dir):
            try:
                print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
                if not os.environ.get('FLASK_DEBUG'):
                    shutil.rmtree(work_dir)
                else:
                    print(f"Keeping directory for debugging: {work_dir}")
            except Exception as e:
                print(f"Cleanup error: {str(e)}")

                
if __name__ == '__main__':
    app.run(debug=True, port=8000)



    


    I'm also attaching what the result looks like in the frontend web view versus the downloaded video. As you can see, the downloaded video has all the coordinates and positions wrong, for the texts and images as well as the videos.

    [Image: downloaded video view]
    [Image: frontend web view]

    


    Can somebody please help me figure this out? :)
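
    The conversion named in the title can be sketched in isolation. The backend code above subtracts half the width and height, which turns center coordinates into top-left ones, but it does not rescale from the preview container to the 1920x1080 canvas; if the frontend container has a different size, positions would drift. The sketch below assumes the frontend also sends its container size (the `container_w` and `container_h` parameters are hypothetical, as is the function name). Note also that `newVideo` initialises `x` from `window.innerHeight`, which may be unintended.

```python
OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080


def center_to_top_left(x, y, width, height, container_w, container_h):
    """Map a center-anchored box in preview pixels to a top-left box in output pixels."""
    scale_x = OUTPUT_WIDTH / container_w
    scale_y = OUTPUT_HEIGHT / container_h
    out_w = round(width * scale_x)
    out_h = round(height * scale_y)
    left = round(x * scale_x - out_w / 2)
    top = round(y * scale_y - out_h / 2)
    return left, top, out_w, out_h


# A 200x200 image centered in a 960x540 preview
print(center_to_top_left(480, 270, 200, 200, 960, 540))  # (760, 340, 400, 400)
```

    For text, the rendered size is only known to ffmpeg, so a drawtext expression such as `x=960-text_w/2` may track the frontend center more closely than subtracting half of the frontend box width.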