
Media (91)
-
Chuck D with Fine Arts Militia - No Meaning No
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
-
Paul Westerberg - Looking Up in Heaven
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
-
Le Tigre - Fake French
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
-
Thievery Corporation - DC 3000
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
-
Dan the Automator - Relaxation Spa Treatment
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
-
Gilberto Gil - Oslodum
15 September 2011, by
Updated: September 2011
Language: English
Type: Audio
Other articles (110)
-
MediaSPIP 0.1 Beta version
25 April 2011, by
MediaSPIP 0.1 beta is the first version of MediaSPIP proclaimed as "usable".
The zip file provided here only contains the sources of MediaSPIP in its standalone version.
To get a working installation, you must manually install all software dependencies on the server.
If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...) -
Multilang: improving the interface for multilingual blocks
18 February 2011, by
Multilang is an additional plugin that is not enabled by default when MediaSPIP is initialized.
Once it is activated, a preconfiguration is set up automatically by MediaSPIP init so that the new feature is operational right away; no configuration step is required for this. -
HTML5 audio and video support
13 April 2011, by
MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers, the Flowplayer Flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...)
On other sites (13,979)
-
FFmpeg for iOS - Disabling Log Levels
12 November 2013, by Veeru
I have compiled ffmpeg for iOS. The configuration for compiling I have borrowed from kxmovie's Rake file.
Everything works fine, but I would like to disable all the debug messages shown in the console by the decoder.
How can I achieve this? I believe it is down to how ffmpeg is compiled, but I am not sure how to go about it. Any suggestions would be highly appreciated.
configure command:

./configure --disable-ffmpeg --disable-ffplay --disable-ffserver
--disable-ffprobe --disable-doc --disable-bzlib --target-os=darwin --enable-cross-compile --enable-gpl --enable-version3 --assert-level=2 --disable-mmx --arch=i386 --cpu=i386 --extra-ldflags='-arch i386' --extra-cflags='-arch i386' --disable-asm --cc=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/gcc
--as='gas-preprocessor.pl /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/usr/bin/gcc'
--sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator6.1.sdk
--extra-ldflags=-L/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator6.1.sdk/usr/lib/system
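
(Note: the log verbosity can also be lowered at runtime, without touching the build. A minimal sketch using libavutil's logging API, called once before any decoding starts:)

extern "C" {
#include <libavutil/log.h>
}

int main()
{
 // Silence all libav console output; AV_LOG_ERROR would keep errors only.
 av_log_set_level(AV_LOG_QUIET);
 // ... set up and run the decoder as usual ...
 return 0;
}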
-
Encoding/Decoding H264 using libav in C++ [closed]
20 May, by gbock93
I want to build an application to:


- capture frames in YUYV 4:2:2 format
- encode them to H264
- send over network
- decode the received data
- display the video stream
To do so I wrote 2 classes, H264Encoder and H264Decoder.


I post only the .cpp contents; the .h files are trivial:


H264Encoder.cpp


#include "H264Encoder.h"

// Headers needed by this file (some may already come in via H264Encoder.h)
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/opt.h>
#include <libavutil/pixdesc.h>
#include <libswscale/swscale.h>
}
#include <opencv2/core.hpp>
#include <cassert>
#include <cstdio>

#include <stdexcept>
#include <iostream>

H264Encoder::H264Encoder(unsigned int width_, unsigned int height_, unsigned int fps_):
 m_width(width_),
 m_height(height_),
 m_fps(fps_),
 m_frame_index(0),
 m_context(nullptr),
 m_frame(nullptr),
 m_packet(nullptr),
 m_sws_ctx(nullptr)
{
 // Find the video codec
 AVCodec* codec;
 codec = avcodec_find_encoder(AV_CODEC_ID_H264);
 if (!codec)
 throw std::runtime_error("[Encoder]: Error: Codec not found");

 // Allocate codec
 m_context = avcodec_alloc_context3(codec);
 if (!m_context)
 throw std::runtime_error("[Encoder]: Error: Could not allocate codec context");

 // Configure codec
 av_opt_set(m_context->priv_data, "preset", "ultrafast", 0);
 av_opt_set(m_context->priv_data, "tune", "zerolatency", 0);
 av_opt_set(m_context->priv_data, "crf", "35", 0); // Range: [0, 51]; sane range: [18, 26]; lower -> higher quality, larger output

 m_context->width = (int)width_;
 m_context->height = (int)height_;
 m_context->time_base = {1, (int)fps_};
 m_context->framerate = {(int)fps_, 1};
 m_context->codec_id = AV_CODEC_ID_H264;
 m_context->pix_fmt = AV_PIX_FMT_YUV420P; // the H.264 encoder here is fed AV_PIX_FMT_YUV420P input
 m_context->bit_rate = 400000;
 m_context->gop_size = 10;
 m_context->max_b_frames = 1;

 // Open codec
 if (avcodec_open2(m_context, codec, nullptr) < 0)
 throw std::runtime_error("[Encoder]: Error: Could not open codec");

 // Allocate frame and its buffer
 m_frame = av_frame_alloc();
 if (!m_frame) 
 throw std::runtime_error("[Encoder]: Error: Could not allocate frame");

 m_frame->format = m_context->pix_fmt;
 m_frame->width = m_context->width;
 m_frame->height = m_context->height;

 if (av_frame_get_buffer(m_frame, 0) < 0)
 throw std::runtime_error("[Encoder]: Error: Cannot allocate frame buffer");
 
 // Allocate packet
 m_packet = av_packet_alloc();
 if (!m_packet) 
 throw std::runtime_error("[Encoder]: Error: Could not allocate packet");

 // Convert from YUYV422 to YUV420P
 m_sws_ctx = sws_getContext(
 width_, height_, AV_PIX_FMT_YUYV422,
 width_, height_, AV_PIX_FMT_YUV420P,
 SWS_BILINEAR, nullptr, nullptr, nullptr
 );
 if (!m_sws_ctx) 
 throw std::runtime_error("[Encoder]: Error: Could not allocate sws context");

 //
 printf("[Encoder]: H264Encoder ready.\n");
}

H264Encoder::~H264Encoder()
{
 sws_freeContext(m_sws_ctx);
 av_packet_free(&m_packet);
 av_frame_free(&m_frame);
 avcodec_free_context(&m_context);

 printf("[Encoder]: H264Encoder destroyed.\n");
}

std::vector<uint8_t> H264Encoder::encode(const cv::Mat& img_)
{
 /*
 - YUYV422 is a packed format. It has 3 components (av_pix_fmt_desc_get((AVPixelFormat)AV_PIX_FMT_YUYV422)->nb_components == 3) but
 data is stored in a single plane (av_pix_fmt_count_planes((AVPixelFormat)AV_PIX_FMT_YUYV422) == 1).
 - YUV420P is a planar format. It has 3 components (av_pix_fmt_desc_get((AVPixelFormat)AV_PIX_FMT_YUV420P)->nb_components == 3) and
 each component is stored in a separate plane (av_pix_fmt_count_planes((AVPixelFormat)AV_PIX_FMT_YUV420P) == 3) with its
 own stride.
 */
 std::cout << "[Encoder]" << std::endl;
 std::cout << "[Encoder]: Encoding img " << img_.cols << "x" << img_.rows << " | element size " << img_.elemSize() << std::endl;
 assert(img_.elemSize() == 2);

 uint8_t* input_data[1] = {(uint8_t*)img_.data};
 int input_linesize[1] = {2 * (int)m_width};
 
 if (av_frame_make_writable(m_frame) < 0)
 throw std::runtime_error("[Encoder]: Error: Cannot make frame data writable");

 // Convert from YUV422 image to YUV420 frame. Apply scaling if necessary
 sws_scale(
 m_sws_ctx,
 input_data, input_linesize, 0, m_height,
 m_frame->data, m_frame->linesize
 );
 m_frame->pts = m_frame_index;

 int n_planes = av_pix_fmt_count_planes((AVPixelFormat)m_frame->format);
 std::cout << "[Encoder]: Sending Frame " << m_frame_index << " with dimensions " << m_frame->width << "x" << m_frame->height << "x" << n_planes << std::endl;
 for (int i = 0; i < n_planes; i++)
 std::cout << "[Encoder]: Plane " << i << ": line size " << m_frame->linesize[i] << std::endl;

 // Send frame to codec
 int ret = avcodec_send_frame(m_context, m_frame);

 switch (ret)
 {
 case 0:
 std::cout << "[Encoder]: Sent frame to codec" << std::endl;
 m_frame_index += 1; // assumed: one time_base tick (1/fps) per frame; the exact original increment was garbled
 break;
 case AVERROR(EAGAIN):
 throw std::runtime_error("[Encoder]: avcodec_send_frame: EAGAIN");
 case AVERROR_EOF:
 throw std::runtime_error("[Encoder]: avcodec_send_frame: EOF");
 case AVERROR(EINVAL):
 throw std::runtime_error("[Encoder]: avcodec_send_frame: EINVAL");
 case AVERROR(ENOMEM):
 throw std::runtime_error("[Encoder]: avcodec_send_frame: ENOMEM");
 default:
 throw std::runtime_error("[Encoder]: avcodec_send_frame: UNKNOWN");
 }

 // Receive packet from codec
 std::vector<uint8_t> result;
 while(ret >= 0)
 {
 ret = avcodec_receive_packet(m_context, m_packet);

 switch (ret)
 {
 case 0:
 std::cout << "[Encoder]: Received packet from codec of size " << m_packet->size << " bytes " << std::endl;
 result.insert(result.end(), m_packet->data, m_packet->data + m_packet->size);
 av_packet_unref(m_packet);
 break;

 case AVERROR(EAGAIN):
 std::cout << "[Encoder]: avcodec_receive_packet: EAGAIN" << std::endl;
 break;
 case AVERROR_EOF:
 std::cout << "[Encoder]: avcodec_receive_packet: EOF" << std::endl;
 break;
 case AVERROR(EINVAL):
 throw std::runtime_error("[Encoder]: avcodec_receive_packet: EINVAL");
 default:
 throw std::runtime_error("[Encoder]: avcodec_receive_packet: UNKNOWN");
 }
 }

 std::cout << "[Encoder]: Encoding complete" << std::endl;
 return result;
}
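
(Note on usage: with max_b_frames = 1 the encoder buffers frames, so a given encode() call may legitimately return no bytes; the buffered packets only come out once the encoder is drained. A minimal sketch of a flush, assuming a hypothetical flush() method is added to the class:)

std::vector<uint8_t> H264Encoder::flush()
{
 // A null frame puts the encoder into draining mode: no more input is coming.
 avcodec_send_frame(m_context, nullptr);

 std::vector<uint8_t> result;
 while (avcodec_receive_packet(m_context, m_packet) == 0)
 {
 result.insert(result.end(), m_packet->data, m_packet->data + m_packet->size);
 av_packet_unref(m_packet);
 }
 // The loop ends with AVERROR_EOF once the encoder is fully drained.
 return result;
}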


H264Decoder.cpp


#include "H264Decoder.h"

// Headers needed by this file (some may already come in via H264Decoder.h)
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/pixdesc.h>
#include <libswscale/swscale.h>
}
#include <opencv2/core.hpp>
#include <cstdio>

#include <iostream>
#include <stdexcept>

H264Decoder::H264Decoder():
 m_context(nullptr),
 m_frame(nullptr),
 m_packet(nullptr)
{
 // Find the video codec
 AVCodec* codec;
 codec = avcodec_find_decoder(AV_CODEC_ID_H264);
 if (!codec)
 throw std::runtime_error("[Decoder]: Error: Codec not found");

 // Allocate codec
 m_context = avcodec_alloc_context3(codec);
 if (!m_context)
 throw std::runtime_error("[Decoder]: Error: Could not allocate codec context");

 // Open codec
 if (avcodec_open2(m_context, codec, nullptr) < 0)
 throw std::runtime_error("[Decoder]: Error: Could not open codec");

 // Allocate frame
 m_frame = av_frame_alloc();
 if (!m_frame)
 throw std::runtime_error("[Decoder]: Error: Could not allocate frame");

 // Allocate packet
 m_packet = av_packet_alloc();
 if (!m_packet) 
 throw std::runtime_error("[Decoder]: Error: Could not allocate packet");

 //
 printf("[Decoder]: H264Decoder ready.\n");
}

H264Decoder::~H264Decoder()
{
 av_packet_free(&m_packet);
 av_frame_free(&m_frame);
 avcodec_free_context(&m_context);

 printf("[Decoder]: H264Decoder destroyed.\n");
}

bool H264Decoder::decode(uint8_t* data_, size_t size_, cv::Mat& img_)
{
 std::cout << "[Decoder]" << std::endl;
 std::cout << "[Decoder]: decoding " << size_ << " bytes of data" << std::endl;

 // Fill packet
 m_packet->data = data_;
 m_packet->size = size_;

 if (size_ == 0)
 return false;

 // Send packet to codec
 int send_result = avcodec_send_packet(m_context, m_packet);

 switch (send_result)
 {
 case 0:
 std::cout << "[Decoder]: Sent packet to codec" << std::endl;
 break;
 case AVERROR(EAGAIN):
 throw std::runtime_error("[Decoder]: avcodec_send_packet: EAGAIN");
 case AVERROR_EOF:
 throw std::runtime_error("[Decoder]: avcodec_send_packet: EOF");
 case AVERROR(EINVAL):
 throw std::runtime_error("[Decoder]: avcodec_send_packet: EINVAL");
 case AVERROR(ENOMEM):
 throw std::runtime_error("[Decoder]: avcodec_send_packet: ENOMEM");
 default:
 throw std::runtime_error("[Decoder]: avcodec_send_packet: UNKNOWN");
 }

 // Receive frame from codec
 int n_planes;
 uint8_t* output_data[1];
 int output_line_size[1];

 int receive_result = avcodec_receive_frame(m_context, m_frame);

 switch (receive_result)
 {
 case 0:
 {
 n_planes = av_pix_fmt_count_planes((AVPixelFormat)m_frame->format);
 std::cout << "[Decoder]: Received Frame with dimensions " << m_frame->width << "x" << m_frame->height << "x" << n_planes << std::endl;
 for (int i = 0; i < n_planes; i++)
 std::cout << "[Decoder]: Plane " << i << ": line size " << m_frame->linesize[i] << std::endl;

 // Convert the decoded YUV420P frame to packed BGR so it can be shown with
 // cv::imshow (assumed reconstruction of the sws_scale call discussed below).
 SwsContext* sws_ctx = sws_getContext(
 m_frame->width, m_frame->height, (AVPixelFormat)m_frame->format,
 m_frame->width, m_frame->height, AV_PIX_FMT_BGR24,
 SWS_BILINEAR, nullptr, nullptr, nullptr
 );
 if (!sws_ctx)
 throw std::runtime_error("[Decoder]: Error: Could not allocate sws context");

 img_ = cv::Mat(m_frame->height, m_frame->width, CV_8UC3);
 output_data[0] = img_.data;
 output_line_size[0] = 3 * m_frame->width;
 sws_scale(
 sws_ctx,
 m_frame->data, m_frame->linesize, 0, m_frame->height,
 output_data, output_line_size
 );
 sws_freeContext(sws_ctx);
 break;
 }
 case AVERROR(EAGAIN):
 std::cout << "[Decoder]: avcodec_receive_frame: EAGAIN" << std::endl;
 return false;
 case AVERROR_EOF:
 std::cout << "[Decoder]: avcodec_receive_frame: EOF" << std::endl;
 return false;
 case AVERROR(EINVAL):
 throw std::runtime_error("[Decoder]: avcodec_receive_frame: EINVAL");
 default:
 throw std::runtime_error("[Decoder]: avcodec_receive_frame: UNKNOWN");
 }

 std::cout << "[Decoder]: Decoding complete" << std::endl;
 return true;
}
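
(One detail in decode() worth flagging: m_packet->data is pointed directly at the caller's buffer. libavcodec requires input buffers to be padded with AV_INPUT_BUFFER_PADDING_SIZE zeroed bytes, which a plain std::vector does not guarantee; reading past the end is undefined behaviour, and one plausible source of run-to-run randomness. A minimal sketch of a padded copy, assuming <cstring> is included:)

 // Sketch: copy the incoming data into a properly padded packet buffer.
 // av_new_packet allocates size_ bytes plus AV_INPUT_BUFFER_PADDING_SIZE
 // of zeroed padding, as avcodec_send_packet expects.
 if (av_new_packet(m_packet, (int)size_) < 0)
 throw std::runtime_error("[Decoder]: Error: Could not allocate packet buffer");
 memcpy(m_packet->data, data_, size_);
 // ... avcodec_send_packet(m_context, m_packet); ...
 av_packet_unref(m_packet); // release the buffer once the codec has consumed it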


To test the two classes I put together a main.cpp that grabs a frame, encodes/decodes it and displays the decoded frame (no network transmission in place):


main.cpp


while(...)
{
 // get frame from custom camera class. Format is YUYV 4:2:2
 camera.getFrame(camera_frame);
 // Construct a cv::Mat to represent the grabbed frame
 cv::Mat camera_frame_yuyv = cv::Mat(camera_frame.height, camera_frame.width, CV_8UC2, camera_frame.data.data());
 // Encode image
 std::vector<uint8_t> encoded_data = encoder.encode(camera_frame_yuyv);
 if (!encoded_data.empty())
 {
 // Decode image
 cv::Mat decoded_frame;
 if (decoder.decode(encoded_data.data(), encoded_data.size(), decoded_frame))
 {
 // Display image
 cv::imshow("Camera", decoded_frame);
 cv::waitKey(1);
 }
 }
}



Compiling and executing the code, I get random results between subsequent executions:


- Sometimes the whole loop runs without problems and I see the decoded image.
- Sometimes the program crashes at the sws_scale(...) call in the decoder with "Assertion desc failed at src/libswscale/swscale_internal.h:757".
- Sometimes the loop runs but I see a black image and the message "Slice parameters 0, 720 are invalid" is displayed when executing the sws_scale(...) call in the decoder.

Why is the behaviour so random? What am I doing wrong with the libav API?
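
(A possibly relevant detail when feeding raw H.264 back into a decoder: one encode() call can return zero, one, or several concatenated packets, so the byte stream may need to be re-split into packet-sized units before avcodec_send_packet; libavcodec ships a parser for exactly this. A minimal sketch, assuming data_/size_ hold concatenated Annex-B data:)

 // Sketch: split a raw H.264 byte stream into decodable packets with the
 // stock libavcodec parser before handing them to the decoder.
 AVCodecParserContext* parser = av_parser_init(AV_CODEC_ID_H264);
 uint8_t* p = data_;
 size_t remaining = size_;
 while (remaining > 0)
 {
 uint8_t* out_data = nullptr;
 int out_size = 0;
 int consumed = av_parser_parse2(parser, m_context,
 &out_data, &out_size,
 p, (int)remaining,
 AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
 p += consumed;
 remaining -= consumed;
 if (out_size > 0)
 {
 // out_data/out_size now hold exactly one packet's worth of data.
 m_packet->data = out_data;
 m_packet->size = out_size;
 avcodec_send_packet(m_context, m_packet);
 // ... drain frames with avcodec_receive_frame as above ...
 }
 }
 av_parser_close(parser);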


Some resources I found useful:


- This article on encoding
- This article on decoding






-
FFmpeg overlay positioning issue: Converting frontend center coordinates to FFmpeg top-left coordinates
25 January, by tarun
I'm building a web-based video editor where users can:


- Add multiple videos
- Add images
- Add text overlays with background color


The frontend sends coordinates where each element's (x, y) represents its center position.
On click of the export button I want all the data to be exported as one final video.
On click I send the data to the backend like this:


const exportAllVideos = async () => {
 try {
 const formData = new FormData();
 
 
 const normalizedVideos = videos.map(video => ({
 ...video,
 startTime: parseFloat(video.startTime),
 endTime: parseFloat(video.endTime),
 duration: parseFloat(video.duration)
 })).sort((a, b) => a.startTime - b.startTime);

 
 for (const video of normalizedVideos) {
 const response = await fetch(video.src);
 const blobData = await response.blob();
 const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
 formData.append("videos", file);
 }

 
 const normalizedImages = images.map(image => ({
 ...image,
 startTime: parseFloat(image.startTime),
 endTime: parseFloat(image.endTime),
 x: parseInt(image.x),
 y: parseInt(image.y),
 width: parseInt(image.width),
 height: parseInt(image.height),
 opacity: parseInt(image.opacity)
 }));

 
 for (const image of normalizedImages) {
 const response = await fetch(image.src);
 const blobData = await response.blob();
 const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
 formData.append("images", file);
 }

 
 const normalizedTexts = texts.map(text => ({
 ...text,
 startTime: parseFloat(text.startTime),
 endTime: parseFloat(text.endTime),
 x: parseInt(text.x),
 y: parseInt(text.y),
 fontSize: parseInt(text.fontSize),
 opacity: parseInt(text.opacity)
 }));

 
 formData.append("metadata", JSON.stringify({
 videos: normalizedVideos,
 images: normalizedImages,
 texts: normalizedTexts
 }));

 const response = await fetch("my_flask_endpoint", {
 method: "POST",
 body: formData
 });

 if (!response.ok) {
 
 console.log('wtf', response);
 
 }

 const finalVideo = await response.blob();
 const url = URL.createObjectURL(finalVideo);
 const a = document.createElement("a");
 a.href = url;
 a.download = "final_video.mp4";
 a.click();
 URL.revokeObjectURL(url);

 } catch (e) {
 console.log(e, "err");
 }
 };



The frontend data for each object (text, image and video) is stored as an array of objects. Below is the data structure for each object:


// the frontend data for each
 const newVideo = {
 id: uuidv4(),
 src: URL.createObjectURL(videoData.videoBlob),
 originalDuration: videoData.duration,
 duration: videoData.duration,
 startTime: 0,
 playbackOffset: 0,
 endTime: videoData.endTime || videoData.duration,
 isPlaying: false,
 isDragging: false,
 speed: 1,
 volume: 100,
 x: window.innerHeight / 2,
 y: window.innerHeight / 2,
 width: videoData.width,
 height: videoData.height,
 };
 const newTextObject = {
 id: uuidv4(),
 description: text,
 opacity: 100,
 x: containerWidth.width / 2,
 y: containerWidth.height / 2,
 fontSize: 18,
 duration: 20,
 endTime: 20,
 startTime: 0,
 color: "#ffffff",
 backgroundColor: hasBG,
 padding: 8,
 fontWeight: "normal",
 width: 200,
 height: 40,
 };

 const newImage = {
 id: uuidv4(),
 src: URL.createObjectURL(imageData),
 x: containerWidth.width / 2,
 y: containerWidth.height / 2,
 width: 200,
 height: 200,
 borderRadius: 0,
 startTime: 0,
 endTime: 20,
 duration: 20,
 opacity: 100,
 };




BACKEND CODE:


import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})



UPLOAD_FOLDER = 'temp_uploads'
if not os.path.exists(UPLOAD_FOLDER):
 os.makedirs(UPLOAD_FOLDER)


@app.route('/')
def home():
 return 'Hello World'


OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080



@app.route('/process', methods=['POST'])
def process_video():
 work_dir = None
 try:
 work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
 os.makedirs(work_dir)
 print(f"Created working directory: {work_dir}")

 metadata = json.loads(request.form['metadata'])
 print("Received metadata:", json.dumps(metadata, indent=2))
 
 video_paths = []
 videos = request.files.getlist('videos')
 for idx, video in enumerate(videos):
 filename = f"video_{idx}.mp4"
 filepath = os.path.join(work_dir, filename)
 video.save(filepath)
 if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
 video_paths.append(filepath)
 print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
 else:
 raise Exception(f"Failed to save video {idx}")

 image_paths = []
 images = request.files.getlist('images')
 for idx, image in enumerate(images):
 filename = f"image_{idx}.png"
 filepath = os.path.join(work_dir, filename)
 image.save(filepath)
 if os.path.exists(filepath):
 image_paths.append(filepath)
 print(f"Saved image to: {filepath}")

 output_path = os.path.join(work_dir, 'output.mp4')

 filter_parts = []

 base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
 filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

 for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
 x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
 y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))
 
 filter_parts.extend([
 f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
 f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
 ])

 if idx == 0:
 filter_parts.append(
 f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
 )
 else:
 filter_parts.append(
 f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
 f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
 f'[temp{idx}];'
 )

 last_video_temp = f'temp{len(video_paths)-1}'

 if video_paths:
 audio_mix_parts = []
 for idx in range(len(video_paths)):
 audio_mix_parts.append(f'[a{idx}]')
 filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

 
 if image_paths:
 for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
 input_idx = len(video_paths) + idx
 
 
 x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
 y_pos = int(img_meta["y"] - (img_meta["height"] / 2))
 
 filter_parts.extend([
 f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
 f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
 f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
 f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
 ])
 last_video_temp = f'imgout{idx}'

 if metadata.get('texts'):
 for idx, text in enumerate(metadata['texts']):
 next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'
 
 escaped_text = text["description"].replace("'", "\\'")
 
 x_pos = int(text["x"] - (text["width"] / 2))
 y_pos = int(text["y"] - (text["height"] / 2))
 
 text_filter = (
 f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
 f'x={x_pos}:y={y_pos}:'
 f'fontsize={text["fontSize"]}:'
 f'fontcolor={text["color"]}'
 )
 
 if text.get('backgroundColor'):
 text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'
 
 if text.get('fontWeight') == 'bold':
 text_filter += ':font=Arial-Bold'
 
 text_filter += (
 f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
 f'[{next_output}];'
 )
 
 filter_parts.append(text_filter)
 last_video_temp = next_output
 else:
 filter_parts.append(f'[{last_video_temp}]null[vout];')

 
 filter_complex = ''.join(filter_parts)

 
 cmd = [
 'ffmpeg',
 *sum([['-i', path] for path in video_paths], []),
 *sum([['-i', path] for path in image_paths], []),
 '-filter_complex', filter_complex,
 '-map', '[vout]'
 ]
 
 
 if video_paths:
 cmd.extend(['-map', '[aout]'])
 
 cmd.extend(['-y', output_path])

 print(f"Running ffmpeg command: {' '.join(cmd)}")
 result = subprocess.run(cmd, capture_output=True, text=True)
 
 if result.returncode != 0:
 print(f"FFmpeg error output: {result.stderr}")
 raise Exception(f"FFmpeg processing failed: {result.stderr}")

 return send_file(
 output_path,
 mimetype='video/mp4',
 as_attachment=True,
 download_name='final_video.mp4'
 )

 except Exception as e:
 print(f"Error in video processing: {str(e)}")
 return {'error': str(e)}, 500
 
 finally:
 if work_dir and os.path.exists(work_dir):
 try:
 print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
 if not os.environ.get('FLASK_DEBUG'):
 shutil.rmtree(work_dir)
 else:
 print(f"Keeping directory for debugging: {work_dir}")
 except Exception as e:
 print(f"Cleanup error: {str(e)}")

 
if __name__ == '__main__':
 app.run(debug=True, port=8000)
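
(To make the generated graph concrete: for one video plus one image, with hypothetical sizes, positions and times, the code above assembles a filter_complex roughly like the following; line breaks are added here for readability only, the code joins the parts without separators.)

color=c=black:s=1920x1080:d=10[canvas];
[0:v]setpts=PTS-STARTPTS,scale=640:480[v0];
[0:a]asetpts=PTS-STARTPTS[a0];
[canvas][v0]overlay=x=640:y=300:eval=init[temp0];
[a0]amix=inputs=1[aout];
[1:v]scale=200:200[img0];
[temp0][img0]overlay=x=860:y=440:enable='between(t,0,20)':alpha=1.0[imgout0];
[imgout0]null[vout];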




I'm also attaching what the final result looks like in the frontend web app vs. in the downloaded video.
As you can see, the downloaded video has all coordinates and positions messed up, for the texts and images as well as the videos.
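
(For reference, a minimal sketch of the center-to-top-left mapping, assuming the preview canvas and the 1920x1080 output differ in size; in that case positions and element sizes both have to be scaled by the output/preview ratio before subtracting half the size. preview_w and preview_h are hypothetical frontend container dimensions:)

// Sketch: map a center position given in preview-canvas pixels to the
// top-left position FFmpeg's overlay filter expects, in output pixels.
struct TopLeft { int x; int y; };

TopLeft toOverlayTopLeft(double cx, double cy, double w, double h,
                         double preview_w, double preview_h)
{
 const double sx = 1920.0 / preview_w; // horizontal preview -> output scale
 const double sy = 1080.0 / preview_h; // vertical preview -> output scale
 // Scale the center and the element size, then shift to the top-left origin.
 return { (int)(cx * sx - (w * sx) / 2.0),
          (int)(cy * sy - (h * sy) / 2.0) };
}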




Can somebody please help me figure this out? :)