
Other articles (40)
-
Creating farms of unique websites
13 April 2011
MediaSPIP platforms can be installed as a farm, with a single "core" hosted on a dedicated server and used by multiple websites.
This allows, among other things: implementation costs to be shared between several different projects or individuals; rapid deployment of multiple unique sites; creation of groups of like-minded sites, making it possible to browse media in a more controlled and selective environment than the major "open" (...)
-
Final creation of the channel
12 March 2010
Once your request has been approved, you can proceed with the actual creation of the channel. Each channel is a full-fledged site placed under your responsibility. The platform administrators have no access to it.
Upon approval, you receive an email inviting you to create your channel.
To do so, simply go to its address, in our example "http://votre_sous_domaine.mediaspip.net".
At that point a password is requested; you simply need to (...)
-
Regular Cron tasks of the farm
1 December 2010
Managing the farm relies on running several repetitive tasks, known as Cron tasks, at regular intervals.
The super Cron (gestion_mutu_super_cron)
This task, scheduled every minute, simply calls the Cron of every instance in the shared installation on a regular basis. Combined with a system Cron on the central site of the shared installation, this generates regular visits to the various sites and prevents the tasks of rarely visited sites from being too (...)
On other sites (4465)
-
ffmpeg piped output producing incorrect metadata frame count
8 December 2024, by Xorgon

The short version: using piped output from ffmpeg produces a file with incorrect metadata.

Run

ffmpeg -y -i .\test_mp4.mp4 -f avi -c:v libx264 - > output.avi

to make an AVI file using the pipe output, then inspect it with

ffprobe -v error -count_frames -show_entries stream=duration,nb_read_frames,r_frame_rate .\output.avi

The output will show that the metadata does not match the actual frames contained in the video.


Details below.



Using Python, I am attempting to use ffmpeg to compress videos and put them in a PowerPoint. This works great. However, the video files themselves have incorrect frame counts, which can cause issues when I read from those videos in other code.

Edit for clarification: by "frame count" I mean the metadata frame count. The actual number of frames contained in the video is correct, but querying the metadata gives an incorrect frame count.

Having eliminated the PowerPoint aspect of the code, I've narrowed this down to the following minimal reproducing example of saving an output from an ffmpeg pipe:


from subprocess import Popen, PIPE

video_path = 'test_mp4.mp4'

ffmpeg_pipe = Popen(['ffmpeg',
                     '-y',               # Overwrite files
                     '-i', video_path,   # Input from file
                     '-f', 'avi',        # Output format
                     '-c:v', 'libx264',  # Codec
                     '-'],               # Output to pipe
                    stdout=PIPE)

# Save everything ffmpeg wrote to the pipe into a file
new_path = "piped_video.avi"
with open(new_path, "wb") as vid_file:
    vid_file.write(ffmpeg_pipe.stdout.read())



I've tested several different videos. One small example video that I've tested can be found here.


I've tried a few different codecs with avi format and tried libvpx with webm format. For the avi outputs, the frame count usually reads as 1073741824 (2^30). Weirdly, for the webm format, the frame count read as -276701161105643264.

Edit: This issue can also be reproduced with just ffmpeg in command prompt using the following command:

ffmpeg -y -i .\test_mp4.mp4 -f avi -c:v libx264 - > output.avi


This is a snippet I used to read the frame count, but one could also see the error by opening the video details in Windows Explorer and seeing the total time as something like 9942 hours, 3 minutes, and 14 seconds.


import cv2

video_path = 'test_mp4.mp4'
new_path = "piped_video.webm"

cap = cv2.VideoCapture(video_path)
print(f"Original video frame count: = {int(cap.get(cv2.CAP_PROP_FRAME_COUNT)):d}")
cap.release()

cap = cv2.VideoCapture(new_path)
print(f"Piped video frame count: = {int(cap.get(cv2.CAP_PROP_FRAME_COUNT)):d}")
cap.release()



The error can also be observed using ffprobe with the following command:

ffprobe -v error -count_frames -show_entries stream=duration,nb_read_frames,r_frame_rate .\output.avi

Note that the frame rate and number of frames counted by ffprobe do not match the duration from the metadata.

For completeness, here is the ffmpeg output:
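
The implausible duration shown by Windows Explorer is consistent with the 2^30 frame count reported for the AVI file. As a quick arithmetic check (assuming the 30 fps frame rate shown in the ffmpeg log for this test video; the helper name `frames_to_hms` is made up for illustration):

```python
def frames_to_hms(frame_count: int, fps: float) -> tuple[int, int, int]:
    """Convert a frame count at a given frame rate to whole (hours, minutes, seconds)."""
    total_seconds = int(frame_count / fps)  # drop the fractional second
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


placeholder_frames = 2 ** 30  # 1073741824, the frame count read from the piped AVI
fps = 30                      # frame rate of the test video, per the ffmpeg log

print(frames_to_hms(placeholder_frames, fps))  # (9942, 3, 14)
```

That is the "9942 hours, 3 minutes, and 14 seconds" figure, which suggests the duration is derived from the same bogus frame count rather than being a separate problem.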


ffmpeg version 2023-06-11-git-09621fd7d9-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
 built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
 configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
 libavutil 58. 13.100 / 58. 13.100
 libavcodec 60. 17.100 / 60. 17.100
 libavformat 60. 6.100 / 60. 6.100
 libavdevice 60. 2.100 / 60. 2.100
 libavfilter 9. 8.101 / 9. 8.101
 libswscale 7. 3.100 / 7. 3.100
 libswresample 4. 11.100 / 4. 11.100
 libpostproc 57. 2.100 / 57. 2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test_mp4.mp4':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: isommp42
 creation_time : 2022-08-10T12:54:09.000000Z
 Duration: 00:00:06.67, start: 0.000000, bitrate: 567 kb/s
 Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 384x264 [SAR 1:1 DAR 16:11], 563 kb/s, 30 fps, 30 tbr, 30k tbn (default)
 Metadata:
 creation_time : 2022-08-10T12:54:09.000000Z
 handler_name : Mainconcept MP4 Video Media Handler
 vendor_id : [0][0][0][0]
 encoder : AVC Coding
Stream mapping:
 Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 0000018c68c8b9c0] using SAR=1/1
[libx264 @ 0000018c68c8b9c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0000018c68c8b9c0] profile High, level 2.1, 4:2:0, 8-bit
Output #0, avi, to 'pipe:':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: isommp42
 ISFT : Lavf60.6.100
 Stream #0:0(eng): Video: h264 (H264 / 0x34363248), yuv420p(progressive), 384x264 [SAR 1:1 DAR 16:11], q=2-31, 30 fps, 30 tbn (default)
 Metadata:
 creation_time : 2022-08-10T12:54:09.000000Z
 handler_name : Mainconcept MP4 Video Media Handler
 vendor_id : [0][0][0][0]
 encoder : Lavc60.17.100 libx264
 Side data:
 cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
[out#0/avi @ 0000018c687f47c0] video:82kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.631060%
frame= 200 fps=0.0 q=-1.0 Lsize= 85kB time=00:00:06.56 bitrate= 106.5kbits/s speed=76.2x 
[libx264 @ 0000018c68c8b9c0] frame I:1 Avg QP:16.12 size: 3659
[libx264 @ 0000018c68c8b9c0] frame P:80 Avg QP:21.31 size: 647
[libx264 @ 0000018c68c8b9c0] frame B:119 Avg QP:26.74 size: 243
[libx264 @ 0000018c68c8b9c0] consecutive B-frames: 3.0% 53.0% 0.0% 44.0%
[libx264 @ 0000018c68c8b9c0] mb I I16..4: 17.6% 70.6% 11.8%
[libx264 @ 0000018c68c8b9c0] mb P I16..4: 0.8% 1.7% 0.6% P16..4: 17.6% 4.6% 3.3% 0.0% 0.0% skip:71.4%
[libx264 @ 0000018c68c8b9c0] mb B I16..4: 0.1% 0.3% 0.2% B16..8: 11.7% 1.4% 0.4% direct: 0.6% skip:85.4% L0:32.0% L1:59.7% BI: 8.3%
[libx264 @ 0000018c68c8b9c0] 8x8 transform intra:59.6% inter:62.4%
[libx264 @ 0000018c68c8b9c0] coded y,uvDC,uvAC intra: 48.5% 0.0% 0.0% inter: 3.5% 0.0% 0.0%
[libx264 @ 0000018c68c8b9c0] i16 v,h,dc,p: 19% 39% 25% 17%
[libx264 @ 0000018c68c8b9c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 25% 30% 3% 3% 4% 4% 4% 5%
[libx264 @ 0000018c68c8b9c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 20% 16% 6% 8% 8% 8% 5% 6%
[libx264 @ 0000018c68c8b9c0] i8c dc,h,v,p: 100% 0% 0% 0%
[libx264 @ 0000018c68c8b9c0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0000018c68c8b9c0] ref P L0: 76.2% 7.9% 11.2% 4.7%
[libx264 @ 0000018c68c8b9c0] ref B L0: 85.6% 12.9% 1.5%
[libx264 @ 0000018c68c8b9c0] ref B L1: 97.7% 2.3%
[libx264 @ 0000018c68c8b9c0] kb/s:101.19



So the question is: why does this happen, and how can one avoid it?
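
A likely explanation, offered as an assumption rather than a confirmed diagnosis: muxers such as AVI write placeholder values into the file header and seek back to fill in the real frame count and duration once encoding finishes. A pipe cannot be seeked, so the placeholders may be left in place. One possible workaround is to let ffmpeg write to a real (seekable) temporary file and read the bytes back afterwards. The sketch below assumes `ffmpeg` is on the PATH; the helper names `build_ffmpeg_cmd` and `transcode_to_bytes` are hypothetical.

```python
import os
import subprocess
import tempfile


def build_ffmpeg_cmd(input_path: str, output_path: str) -> list[str]:
    """Build the same transcode as in the question, but targeting a file instead of '-'."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output if it exists
        "-i", input_path,   # input from file
        "-f", "avi",        # output format
        "-c:v", "libx264",  # codec
        output_path,        # seekable file, so the muxer can finalize its header
    ]


def transcode_to_bytes(input_path: str) -> bytes:
    """Run ffmpeg into a temporary file and return the finished file's bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_path = os.path.join(tmp_dir, "output.avi")
        subprocess.run(build_ffmpeg_cmd(input_path, output_path), check=True)
        with open(output_path, "rb") as f:
            return f.read()
```

Calling `transcode_to_bytes("test_mp4.mp4")` would then give the same kind of byte string that was previously read from the pipe, which can be handed on to the PowerPoint code.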


-
How to resize dimensions of video through ffmpeg-python?
25 January, by kunambi

I'm trying to resize a video file which a user has uploaded to Django, by using ffmpeg-python. The documentation isn't very easy to understand, so I've tried to cobble this together from various sources.

This method is run in a celery container, so as not to slow down the experience for the user. The problem I'm facing is that I can't seem to resize the video file. I've tried two different approaches:


import ffmpeg
from django.db import models
from io import BytesIO
from myapp.models import MediaModel


def resize_video(mypk: str) -> None:
    instance = MediaModel.objects.get(pk=mypk)
    media_instance: models.FileField = instance.media
    media_output = "test.mp4"
    buffer = BytesIO()

    # copy the uploaded file into an in-memory buffer
    for chunk in media_instance.chunks():
        buffer.write(chunk)

    stream_video = ffmpeg.input("pipe:").video.filter("scale", 720, -1)  # resize to 720px width
    stream_audio = ffmpeg.input("pipe:").audio
    process = (
        ffmpeg.output(stream_video, stream_audio, media_output, acodec="aac")
        .overwrite_output()
        .run_async(pipe_stdin=True, quiet=True)
    )
    buffer.seek(0)
    process_out, process_err = process.communicate(input=buffer.getbuffer())
    # (pdb) process_out
    # b''

    # attempting to use `.concat` instead
    process2 = (
        ffmpeg.concat(stream_video, stream_audio, v=1, a=1)
        .output(media_output)
        .overwrite_output()
        .run_async(pipe_stdin=True, quiet=True)
    )
    buffer.seek(0)
    process2_out, process2_err = process2.communicate(input=buffer.getbuffer())
    # (pdb) process2_out
    # b''



As we can see, no matter which approach is chosen, the output is an empty binary. Both process_err and process2_err contain the following message:

ffmpeg version N-111491-g31979127f8-20230717 Copyright (c) 2000-2023 the
FFmpeg developers
 built with gcc 13.1.0 (crosstool-NG 1.25.0.196_227d99d)
 configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static
--pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64
--target-os=mingw32 --enable-gpl --enable-version3 --disable-debug
--disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2
--enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp
--enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl
--disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib
--enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth
--enable-chromaprint --enable-libdav1d --enable-libdavs2
--disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r
--enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray
--enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist
--enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp
--enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb
--enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg
--enable-libopenmpt --enable-librav1e --enable-librubberband
--enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt
--enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm
--disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc
--enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2
--enable-libxvid --enable-libzimg --enable-libzvbi
--extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags=
--extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp
--extra-version=20230717
 libavutil 58. 14.100 / 58. 14.100
 libavcodec 60. 22.100 / 60. 22.100
 libavformat 60. 10.100 / 60. 10.100
 libavdevice 60. 2.101 / 60. 2.101
 libavfilter 9. 8.102 / 9. 8.102
 libswscale 7. 3.100 / 7. 3.100
 libswresample 4. 11.100 / 4. 11.100
 libpostproc 57. 2.100 / 57. 2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pipe:':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: mp42mp41
 creation_time : 2020-11-10T15:01:09.000000Z
 Duration: 00:00:04.16, start: 0.000000, bitrate: N/A
 Stream #0:0[0x1](eng): Video: h264 (Main) (avc1 / 0x31637661),
yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2649 kb/s, 25 fps, 25
tbr, 25k tbn (default)
 Metadata:
 creation_time : 2020-11-10T15:01:09.000000Z
 handler_name : ?Mainconcept Video Media Handler
 vendor_id : [0][0][0][0]
 encoder : AVC Coding
 Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
stereo, fltp, 317 kb/s (default)
 Metadata:
 creation_time : 2020-11-10T15:01:09.000000Z
 handler_name : #Mainconcept MP4 Sound Media Handler
 vendor_id : [0][0][0][0]
Stream mapping:
 Stream #0:0 (h264) -> scale:default (graph 0)
 scale:default (graph 0) -> Stream #0:0 (libx264)
 Stream #0:1 -> #0:1 (aac (native) -> aac (native))
[libx264 @ 00000243a23a1100] using SAR=1/1
[libx264 @ 00000243a23a1100] using cpu capabilities: MMX2 SSE2Fast SSSE3
SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 00000243a23a1100] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 00000243a23a1100] 264 - core 164 - H.264/MPEG-4 AVC codec -
Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1
ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00
mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1
sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0
constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1
weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40
intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0
qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'aa37f8d7685f4df9af85b1cdcd95997e.mp4':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: mp42mp41
 encoder : Lavf60.10.100
 Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, progressive),
800x450 [SAR 1:1 DAR 16:9], q=2-31, 25 fps, 12800 tbn
 Metadata:
 encoder : Lavc60.22.100 libx264
 Side data:
 cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
 Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
fltp, 128 kb/s (default)
 Metadata:
 creation_time : 2020-11-10T15:01:09.000000Z
 handler_name : #Mainconcept MP4 Sound Media Handler
 vendor_id : [0][0][0][0]
 encoder : Lavc60.22.100 aac
frame= 0 fps=0.0 q=0.0 size= 0kB time=N/A bitrate=N/A speed=N/A
frame= 21 fps=0.0 q=28.0 size= 0kB time=00:00:02.75 bitrate= 0.1kbits/s speed=4.75x
[out#0/mp4 @ 00000243a230bd80] video:91kB audio:67kB subtitle:0kB other
streams:0kB global headers:0kB muxing overhead: 2.838559%
frame= 104 fps=101 q=-1.0 Lsize= 162kB time=00:00:04.13 bitrate=
320.6kbits/s speed=4.02x 
[libx264 @ 00000243a23a1100] frame I:1 Avg QP:18.56 size: 2456
[libx264 @ 00000243a23a1100] frame P:33 Avg QP:16.86 size: 1552
[libx264 @ 00000243a23a1100] frame B:70 Avg QP:17.55 size: 553
[libx264 @ 00000243a23a1100] consecutive B-frames: 4.8% 11.5% 14.4%
69.2%
[libx264 @ 00000243a23a1100] mb I I16..4: 17.3% 82.1% 0.6%
[libx264 @ 00000243a23a1100] mb P I16..4: 5.9% 15.2% 0.4% P16..4: 18.3% 
0.9% 0.4% 0.0% 0.0% skip:58.7%
[libx264 @ 00000243a23a1100] mb B I16..4: 0.8% 0.3% 0.0% B16..8: 15.4% 
1.0% 0.0% direct: 3.6% skip:78.9% L0:34.2% L1:64.0% BI: 1.7%
[libx264 @ 00000243a23a1100] 8x8 transform intra:68.2% inter:82.3%
[libx264 @ 00000243a23a1100] coded y,uvDC,uvAC intra: 4.2% 18.4% 1.2% inter:
1.0% 6.9% 0.0%
[libx264 @ 00000243a23a1100] i16 v,h,dc,p: 53% 25% 8% 14%
[libx264 @ 00000243a23a1100] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 6% 70% 1% 
1% 1% 1% 0% 0%
[libx264 @ 00000243a23a1100] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 46% 21% 15% 2% 
5% 4% 3% 3% 1%
[libx264 @ 00000243a23a1100] i8c dc,h,v,p: 71% 15% 13% 1%
[libx264 @ 00000243a23a1100] Weighted P-Frames: Y:30.3% UV:15.2%
[libx264 @ 00000243a23a1100] ref P L0: 46.7% 7.5% 34.6% 7.3% 3.9%
[libx264 @ 00000243a23a1100] ref B L0: 88.0% 10.5% 1.5%
[libx264 @ 00000243a23a1100] ref B L1: 98.1% 1.9%
[libx264 @ 00000243a23a1100] kb/s:177.73
[aac @ 00000243a23a2e00] Qavg: 1353.589



I'm at a loss right now and would love any feedback or solution.
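
Note that the log above shows ffmpeg encoding 104 frames into an MP4 file, so the empty `process_out` may simply mean that the result went to the output file rather than to stdout (an assumption; the source does not confirm it). A minimal sketch of a file-based variant follows. It swaps ffmpeg-python for a plain `subprocess` call, assumes `ffmpeg` is on the PATH and that the upload has already been saved to disk, and uses hypothetical helper names. `scale=720:-2` asks the scale filter for a 720 px width with a proportional height rounded to an even number.

```python
import subprocess


def build_resize_cmd(input_path: str, output_path: str, width: int = 720) -> list[str]:
    """Build an ffmpeg command that scales the video to `width`, keeping the aspect ratio."""
    return [
        "ffmpeg",
        "-y",                        # overwrite the output if it exists
        "-i", input_path,            # read from a file rather than stdin
        "-vf", f"scale={width}:-2",  # -2: proportional height, kept divisible by 2
        "-c:a", "aac",               # re-encode audio as AAC, as in the question
        output_path,
    ]


def resize_video_file(input_path: str, output_path: str, width: int = 720) -> None:
    """Run the resize; the resized video ends up in `output_path`, not in stdout."""
    subprocess.run(
        build_resize_cmd(input_path, output_path, width),
        check=True,           # raise if ffmpeg exits with an error
        capture_output=True,  # keep ffmpeg's log out of the worker's console
    )
```

After `resize_video_file("upload.mp4", "resized.mp4")` returns, the resized bytes would be read from `resized.mp4` and saved back to the model field.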


-
FFmpeg overlay positioning issue: converting frontend center coordinates to FFmpeg top-left coordinates
25 January, by tarun

I'm building a web-based video editor where users can:


Add multiple videos
Add images
Add text overlays with background color


The frontend sends coordinates where each element's (x, y) represents its center position.
When the user clicks the export button, I want everything to be exported as one final video.
On click, I send the data to the backend like this:


const exportAllVideos = async () => {
  try {
    const formData = new FormData();

    // Normalize numeric fields and sort the videos by start time
    const normalizedVideos = videos.map(video => ({
      ...video,
      startTime: parseFloat(video.startTime),
      endTime: parseFloat(video.endTime),
      duration: parseFloat(video.duration)
    })).sort((a, b) => a.startTime - b.startTime);

    // Attach each video file to the form data
    for (const video of normalizedVideos) {
      const response = await fetch(video.src);
      const blobData = await response.blob();
      const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
      formData.append("videos", file);
    }

    const normalizedImages = images.map(image => ({
      ...image,
      startTime: parseFloat(image.startTime),
      endTime: parseFloat(image.endTime),
      x: parseInt(image.x),
      y: parseInt(image.y),
      width: parseInt(image.width),
      height: parseInt(image.height),
      opacity: parseInt(image.opacity)
    }));

    // Attach each image file to the form data
    for (const image of normalizedImages) {
      const response = await fetch(image.src);
      const blobData = await response.blob();
      const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
      formData.append("images", file);
    }

    const normalizedTexts = texts.map(text => ({
      ...text,
      startTime: parseFloat(text.startTime),
      endTime: parseFloat(text.endTime),
      x: parseInt(text.x),
      y: parseInt(text.y),
      fontSize: parseInt(text.fontSize),
      opacity: parseInt(text.opacity)
    }));

    // Positions, sizes and timings travel as one JSON field
    formData.append("metadata", JSON.stringify({
      videos: normalizedVideos,
      images: normalizedImages,
      texts: normalizedTexts
    }));

    const response = await fetch("my_flask_endpoint", {
      method: "POST",
      body: formData
    });

    if (!response.ok) {
      console.log('wtf', response);
    }

    // Download the rendered video
    const finalVideo = await response.blob();
    const url = URL.createObjectURL(finalVideo);
    const a = document.createElement("a");
    a.href = url;
    a.download = "final_video.mp4";
    a.click();
    URL.revokeObjectURL(url);

  } catch (e) {
    console.log(e, "err");
  }
};



On the frontend, each object (text, image, and video) is stored in an array of objects. Below is the data structure for each object:


// the frontend data for each
const newVideo = {
  id: uuidv4(),
  src: URL.createObjectURL(videoData.videoBlob),
  originalDuration: videoData.duration,
  duration: videoData.duration,
  startTime: 0,
  playbackOffset: 0,
  endTime: videoData.endTime || videoData.duration,
  isPlaying: false,
  isDragging: false,
  speed: 1,
  volume: 100,
  x: window.innerHeight / 2,
  y: window.innerHeight / 2,
  width: videoData.width,
  height: videoData.height,
};

const newTextObject = {
  id: uuidv4(),
  description: text,
  opacity: 100,
  x: containerWidth.width / 2,
  y: containerWidth.height / 2,
  fontSize: 18,
  duration: 20,
  endTime: 20,
  startTime: 0,
  color: "#ffffff",
  backgroundColor: hasBG,
  padding: 8,
  fontWeight: "normal",
  width: 200,
  height: 40,
};

const newImage = {
  id: uuidv4(),
  src: URL.createObjectURL(imageData),
  x: containerWidth.width / 2,
  y: containerWidth.height / 2,
  width: 200,
  height: 200,
  borderRadius: 0,
  startTime: 0,
  endTime: 20,
  duration: 20,
  opacity: 100,
};




Backend code:


import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})

UPLOAD_FOLDER = 'temp_uploads'
if not os.path.exists(UPLOAD_FOLDER):
    os.makedirs(UPLOAD_FOLDER)


@app.route('/')
def home():
    return 'Hello World'


OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080


@app.route('/process', methods=['POST'])
def process_video():
    work_dir = None
    try:
        work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
        os.makedirs(work_dir)
        print(f"Created working directory: {work_dir}")

        metadata = json.loads(request.form['metadata'])
        print("Received metadata:", json.dumps(metadata, indent=2))

        video_paths = []
        videos = request.files.getlist('videos')
        for idx, video in enumerate(videos):
            filename = f"video_{idx}.mp4"
            filepath = os.path.join(work_dir, filename)
            video.save(filepath)
            if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
                video_paths.append(filepath)
                print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
            else:
                raise Exception(f"Failed to save video {idx}")

        image_paths = []
        images = request.files.getlist('images')
        for idx, image in enumerate(images):
            filename = f"image_{idx}.png"
            filepath = os.path.join(work_dir, filename)
            image.save(filepath)
            if os.path.exists(filepath):
                image_paths.append(filepath)
                print(f"Saved image to: {filepath}")

        output_path = os.path.join(work_dir, 'output.mp4')

        filter_parts = []

        base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
        filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

        for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
            x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
            y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))

            filter_parts.extend([
                f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
                f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
            ])

            if idx == 0:
                filter_parts.append(
                    f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
                )
            else:
                filter_parts.append(
                    f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
                    f'[temp{idx}];'
                )

        last_video_temp = f'temp{len(video_paths)-1}'

        if video_paths:
            audio_mix_parts = []
            for idx in range(len(video_paths)):
                audio_mix_parts.append(f'[a{idx}]')
            filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

        if image_paths:
            for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
                input_idx = len(video_paths) + idx

                x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
                y_pos = int(img_meta["y"] - (img_meta["height"] / 2))

                filter_parts.extend([
                    f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
                    f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
                    f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
                ])
                last_video_temp = f'imgout{idx}'

        if metadata.get('texts'):
            for idx, text in enumerate(metadata['texts']):
                next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'

                escaped_text = text["description"].replace("'", "\\'")

                x_pos = int(text["x"] - (text["width"] / 2))
                y_pos = int(text["y"] - (text["height"] / 2))

                text_filter = (
                    f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
                    f'x={x_pos}:y={y_pos}:'
                    f'fontsize={text["fontSize"]}:'
                    f'fontcolor={text["color"]}'
                )

                if text.get('backgroundColor'):
                    text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'

                if text.get('fontWeight') == 'bold':
                    text_filter += ':font=Arial-Bold'

                text_filter += (
                    f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
                    f'[{next_output}];'
                )

                filter_parts.append(text_filter)
                last_video_temp = next_output
        else:
            filter_parts.append(f'[{last_video_temp}]null[vout];')

        filter_complex = ''.join(filter_parts)

        cmd = [
            'ffmpeg',
            *sum([['-i', path] for path in video_paths], []),
            *sum([['-i', path] for path in image_paths], []),
            '-filter_complex', filter_complex,
            '-map', '[vout]'
        ]

        if video_paths:
            cmd.extend(['-map', '[aout]'])

        cmd.extend(['-y', output_path])

        print(f"Running ffmpeg command: {' '.join(cmd)}")
        result = subprocess.run(cmd, capture_output=True, text=True)

        if result.returncode != 0:
            print(f"FFmpeg error output: {result.stderr}")
            raise Exception(f"FFmpeg processing failed: {result.stderr}")

        return send_file(
            output_path,
            mimetype='video/mp4',
            as_attachment=True,
            download_name='final_video.mp4'
        )

    except Exception as e:
        print(f"Error in video processing: {str(e)}")
        return {'error': str(e)}, 500

    finally:
        if work_dir and os.path.exists(work_dir):
            try:
                print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
                if not os.environ.get('FLASK_DEBUG'):
                    shutil.rmtree(work_dir)
                else:
                    print(f"Keeping directory for debugging: {work_dir}")
            except Exception as e:
                print(f"Cleanup error: {str(e)}")


if __name__ == '__main__':
    app.run(debug=True, port=8000)
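
One detail in the backend code is worth a closer look: image opacity is passed as `alpha=` on the overlay filter. In FFmpeg's overlay filter, the `alpha` option selects the alpha format (straight or premultiplied) rather than an opacity level, so this probably does not fade the image. A hedged sketch of one alternative is to scale the image's own alpha channel with `colorchannelmixer` before overlaying it. The function name and its parameters below are hypothetical, not part of the original code.

```python
def image_overlay_filters(input_idx: int, base_label: str, out_label: str,
                          x: int, y: int, width: int, height: int,
                          opacity_percent: float, start: float, end: float) -> list[str]:
    """Return filter_complex fragments that overlay an image with partial opacity."""
    # Clamp the frontend's 0-100 opacity into the 0.0-1.0 range
    alpha = max(0.0, min(1.0, opacity_percent / 100))
    img_label = f"img{input_idx}"
    return [
        # Give the image an alpha channel, then multiply that channel by `alpha`
        f"[{input_idx}:v]scale={width}:{height},format=rgba,"
        f"colorchannelmixer=aa={alpha}[{img_label}];",
        # Plain overlay; transparency now comes from the image itself
        f"[{base_label}][{img_label}]overlay=x={x}:y={y}:"
        f"enable='between(t,{start},{end})'[{out_label}];",
    ]


parts = image_overlay_filters(1, "temp0", "imgout0", 860, 440, 200, 200, 50, 0, 20)
print("".join(parts))
```

The returned fragments have the same shape as the strings appended to `filter_parts` in the image loop, so they could be dropped in there for comparison.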




I'm also attaching what the final result looks like on the frontend web page versus in the downloaded video.
As you can see, the downloaded video has all coordinates and positions messed up, for the texts, the images, and the videos alike.




Can somebody please help me figure this out? :)
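
One possible cause, stated as an assumption because the post does not give the size of the frontend preview container: the backend subtracts half the width and height to go from center to top-left, but it uses the frontend pixel values directly on a 1920x1080 canvas. If the preview is smaller than the output, both positions and sizes would need to be scaled first. A minimal sketch of that conversion, where `container_w` and `container_h` are hypothetical values the frontend would have to send along:

```python
def center_to_top_left(cx: float, cy: float, width: float, height: float,
                       container_w: float, container_h: float,
                       out_w: int = 1920, out_h: int = 1080) -> tuple[int, int, int, int]:
    """Map a center-anchored box in preview pixels to a top-left box in output pixels."""
    scale_x = out_w / container_w
    scale_y = out_h / container_h
    scaled_w = round(width * scale_x)
    scaled_h = round(height * scale_y)
    # Scale the center point, then step back by half the scaled size
    x = round(cx * scale_x - scaled_w / 2)
    y = round(cy * scale_y - scaled_h / 2)
    return x, y, scaled_w, scaled_h


# A 200x200 image centered in a 960x540 preview
print(center_to_top_left(480, 270, 200, 200, 960, 540))  # (760, 340, 400, 400)
```

The returned values would feed `overlay=x=...:y=...` and `scale=w:h`. For text, drawtext positions the top-left corner of the rendered text, and its `x` and `y` expressions can reference `text_w` and `text_h`, so centering on the rendered size (for example `x=cx-text_w/2`) may match the preview better than the fixed 200x40 box from the frontend.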