Newest 'ffmpeg' Questions - Stack Overflow

http://stackoverflow.com/questions/tagged/ffmpeg

Articles published on the site

  • FFMPEG streams from Docker container application are not playing on other computers

    25 January, by tipitoe

    I'm building a Docker container app using dotnet core that captures and plays audio streams using ffmpeg.

    Since some of the URLs cannot be played directly in a web browser, the audio needs to be processed by ffmpeg first. The ffmpeg output is a local URL, http://127.0.0.1:8080, that is sent back to the browser in an Ajax call.

    The audio plays perfectly in Visual Studio during development; since the URL refers to localhost, there is no issue playing it there. Unfortunately, with the application installed as a Docker container, the audio is blocked. Initially my guess was that CORS was blocking the stream. I also considered that I might have to use the LAN IP address of the machine hosting the Docker container, but that did not solve the problem either.

    Here is my code so far in Startup.cs

      services.AddCors(options => options.AddPolicy("AllowAll",
      builder => builder.AllowAnyOrigin().AllowAnyHeader().AllowAnyMethod()));
    
      app.UseCors("AllowAll");
      app.UseRouting();
    

    Here is an Ajax call

      $.ajax({
          type: 'POST',
          url: '/live?handler=PlayStream',
          beforeSend: function (xhr) {
              xhr.setRequestHeader('XSRF-TOKEN',
                  $('input:hidden[name="__RequestVerificationToken"]').val());
          },
          dataType: 'json',
          data: { 'streamId': streamId },
          success: function (data) {
              var json = JSON.parse(data);
              source = new Audio(json.InputUri);
              source.play();
          },
          // Note: 'failure' is not a jQuery $.ajax option; the 'error' callback below is the one that fires.
          failure: function (response) {
              alert(response.responseText);
          },
          error: function (response) {
              alert(response.responseText);
          }
      });
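
    For completeness, here is a minimal sketch (in Python, with an illustrative helper name and codec choices) of the direction this setup seems to require: ffmpeg listening on all interfaces inside the container, the port published on the host (e.g. docker run -p 8080:8080 ...), and the URL returned to the browser using the Docker host's address rather than 127.0.0.1:

      import subprocess

      def start_stream(source_url: str) -> subprocess.Popen:
          # ffmpeg's HTTP protocol can act as a simple server via -listen 1;
          # binding to 0.0.0.0 instead of 127.0.0.1 makes the stream reachable
          # through the published container port.
          return subprocess.Popen([
              "ffmpeg", "-i", source_url,
              "-vn", "-c:a", "libmp3lame",
              "-f", "mp3", "-listen", "1",
              "http://0.0.0.0:8080/stream",
          ])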

    I would appreciate any suggestions on how to solve this problem. Thanks.

  • FFmpeg overlay positioning issue: Converting frontend center coordinates to FFmpeg top-left coordinates

    25 January, by tarun

    I'm building a web-based video editor where users can:

    • Add multiple videos
    • Add images
    • Add text overlays with background color

    The frontend sends coordinates where each element's (x, y) represents its center position. On click of the export button I want all the data to be exported as one final video, so on click I send the data to the backend like this -

     const exportAllVideos = async () => {
        try {
          const formData = new FormData();
            
          
          const normalizedVideos = videos.map(video => ({
              ...video,
              startTime: parseFloat(video.startTime),
              endTime: parseFloat(video.endTime),
              duration: parseFloat(video.duration)
          })).sort((a, b) => a.startTime - b.startTime);
    
          
          for (const video of normalizedVideos) {
              const response = await fetch(video.src);
              const blobData = await response.blob();
              const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
              formData.append("videos", file);
          }
    
          
          const normalizedImages = images.map(image => ({
              ...image,
              startTime: parseFloat(image.startTime),
              endTime: parseFloat(image.endTime),
              x: parseInt(image.x),
              y: parseInt(image.y),
              width: parseInt(image.width),
              height: parseInt(image.height),
              opacity: parseInt(image.opacity)
          }));
    
          
          for (const image of normalizedImages) {
              const response = await fetch(image.src);
              const blobData = await response.blob();
              const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
              formData.append("images", file);
          }
    
          
          const normalizedTexts = texts.map(text => ({
              ...text,
              startTime: parseFloat(text.startTime),
              endTime: parseFloat(text.endTime),
              x: parseInt(text.x),
              y: parseInt(text.y),
              fontSize: parseInt(text.fontSize),
              opacity: parseInt(text.opacity)
          }));
    
          
          formData.append("metadata", JSON.stringify({
              videos: normalizedVideos,
              images: normalizedImages,
              texts: normalizedTexts
          }));
    
          const response = await fetch("my_flask_endpoint", {
              method: "POST",
              body: formData
          });
    
          if (!response.ok) {
              console.log('wtf', response);
              return; // bail out: don't try to download an error response as a video
          }
    
          const finalVideo = await response.blob();
          const url = URL.createObjectURL(finalVideo);
          const a = document.createElement("a");
          a.href = url;
          a.download = "final_video.mp4";
          a.click();
          URL.revokeObjectURL(url);
    
        } catch (e) {
          console.log(e, "err");
        }
      };
    

    On the frontend, the data for each object type (text, image, and video) is stored as an array of objects. Below is the data structure for each object -

    // the frontend data structure for each object type
      const newVideo = {
          id: uuidv4(),
          src: URL.createObjectURL(videoData.videoBlob),
          originalDuration: videoData.duration,
          duration: videoData.duration,
          startTime: 0,
          playbackOffset: 0,
          endTime: videoData.endTime || videoData.duration,
          isPlaying: false,
          isDragging: false,
          speed: 1,
          volume: 100,
          x: window.innerHeight / 2, // note: innerHeight used for x (probably meant window.innerWidth)
          y: window.innerHeight / 2,
          width: videoData.width,
          height: videoData.height,
        };
        const newTextObject = {
          id: uuidv4(),
          description: text,
          opacity: 100,
          x: containerWidth.width / 2,
          y: containerWidth.height / 2,
          fontSize: 18,
          duration: 20,
          endTime: 20,
          startTime: 0,
          color: "#ffffff",
          backgroundColor: hasBG,
          padding: 8,
          fontWeight: "normal",
          width: 200,
          height: 40,
        };
    
        const newImage = {
          id: uuidv4(),
          src: URL.createObjectURL(imageData),
          x: containerWidth.width / 2,
          y: containerWidth.height / 2,
          width: 200,
          height: 200,
          borderRadius: 0,
          startTime: 0,
          endTime: 20,
          duration: 20,
          opacity: 100,
        };
    
    

    BACKEND CODE -

    import os
    import shutil
    import subprocess
    from flask import Flask, request, send_file
    import ffmpeg
    import json
    from werkzeug.utils import secure_filename
    import uuid
    from flask_cors import CORS
    
    
    app = Flask(__name__)
    CORS(app, resources={r"/*": {"origins": "*"}})
    
    
    
    UPLOAD_FOLDER = 'temp_uploads'
    if not os.path.exists(UPLOAD_FOLDER):
        os.makedirs(UPLOAD_FOLDER)
    
    
    @app.route('/')
    def home():
        return 'Hello World'
    
    
    OUTPUT_WIDTH = 1920
    OUTPUT_HEIGHT = 1080
    
    
    
    @app.route('/process', methods=['POST'])
    def process_video():
        work_dir = None
        try:
            work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
            os.makedirs(work_dir)
            print(f"Created working directory: {work_dir}")
    
            metadata = json.loads(request.form['metadata'])
            print("Received metadata:", json.dumps(metadata, indent=2))
            
            video_paths = []
            videos = request.files.getlist('videos')
            for idx, video in enumerate(videos):
                filename = f"video_{idx}.mp4"
                filepath = os.path.join(work_dir, filename)
                video.save(filepath)
                if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
                    video_paths.append(filepath)
                    print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
                else:
                    raise Exception(f"Failed to save video {idx}")
    
            image_paths = []
            images = request.files.getlist('images')
            for idx, image in enumerate(images):
                filename = f"image_{idx}.png"
                filepath = os.path.join(work_dir, filename)
                image.save(filepath)
                if os.path.exists(filepath):
                    image_paths.append(filepath)
                    print(f"Saved image to: {filepath}")
    
            output_path = os.path.join(work_dir, 'output.mp4')
    
            filter_parts = []
    
            base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
            filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')
    
            for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
                x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
                y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))
                
                filter_parts.extend([
                    f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
                    f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
                ])
    
                if idx == 0:
                    filter_parts.append(
                        f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
                    )
                else:
                    filter_parts.append(
                        f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
                        f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
                        f'[temp{idx}];'
                    )
    
            last_video_temp = f'temp{len(video_paths)-1}'
    
            if video_paths:
                audio_mix_parts = []
                for idx in range(len(video_paths)):
                    audio_mix_parts.append(f'[a{idx}]')
                filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')
    
            
            if image_paths:
                for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
                    input_idx = len(video_paths) + idx
                    
                    
                    x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
                    y_pos = int(img_meta["y"] - (img_meta["height"] / 2))
                    
                    filter_parts.extend([
                        f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
                        f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
                        f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
                        f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
                    ])
                    last_video_temp = f'imgout{idx}'
    
            if metadata.get('texts'):
                for idx, text in enumerate(metadata['texts']):
                    next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'
                    
                    escaped_text = text["description"].replace("'", "\\'")
                    
                    x_pos = int(text["x"] - (text["width"] / 2))
                    y_pos = int(text["y"] - (text["height"] / 2))
                    
                    text_filter = (
                        f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
                        f'x={x_pos}:y={y_pos}:'
                        f'fontsize={text["fontSize"]}:'
                        f'fontcolor={text["color"]}'
                    )
                    
                    if text.get('backgroundColor'):
                        text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'
                    
                    if text.get('fontWeight') == 'bold':
                        text_filter += ':font=Arial-Bold'
                    
                    text_filter += (
                        f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
                        f'[{next_output}];'
                    )
                    
                    filter_parts.append(text_filter)
                    last_video_temp = next_output
            else:
                filter_parts.append(f'[{last_video_temp}]null[vout];')
    
            
            filter_complex = ''.join(filter_parts)
    
            
            cmd = [
                'ffmpeg',
                *sum([['-i', path] for path in video_paths], []),
                *sum([['-i', path] for path in image_paths], []),
                '-filter_complex', filter_complex,
                '-map', '[vout]'
            ]
            
            
            if video_paths:
                cmd.extend(['-map', '[aout]'])
            
            cmd.extend(['-y', output_path])
    
            print(f"Running ffmpeg command: {' '.join(cmd)}")
            result = subprocess.run(cmd, capture_output=True, text=True)
            
            if result.returncode != 0:
                print(f"FFmpeg error output: {result.stderr}")
                raise Exception(f"FFmpeg processing failed: {result.stderr}")
    
            return send_file(
                output_path,
                mimetype='video/mp4',
                as_attachment=True,
                download_name='final_video.mp4'
            )
    
        except Exception as e:
            print(f"Error in video processing: {str(e)}")
            return {'error': str(e)}, 500
        
        finally:
            if work_dir and os.path.exists(work_dir):
                try:
                    print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
                    if not os.environ.get('FLASK_DEBUG'):
                        shutil.rmtree(work_dir)
                    else:
                        print(f"Keeping directory for debugging: {work_dir}")
                except Exception as e:
                    print(f"Cleanup error: {str(e)}")
    
                    
    if __name__ == '__main__':
        app.run(debug=True, port=8000)
    
    

    I'm also attaching what the final result looks like in the frontend web view vs. in the downloaded video, and as you can see the downloaded video has all the coordinates and positions messed up, be it the texts, images, or the videos. (Attached screenshots: downloaded video's view, frontend web view.)
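
    One candidate explanation, sketched below: the backend converts center coordinates to top-left in frontend-canvas pixels, but the canvas filter composes at 1920x1080, so positions and sizes would need to be scaled into output space first. The container_width/container_height parameters are hypothetical fields the frontend would have to send along with the metadata:

      # A minimal sketch, not the poster's code: scale frontend-canvas
      # coordinates into output space before the center-to-top-left conversion.
      def to_ffmpeg_top_left(meta, container_width, container_height,
                             out_w=1920, out_h=1080):
          sx = out_w / container_width   # horizontal scale factor
          sy = out_h / container_height  # vertical scale factor
          w = meta["width"] * sx         # element size in output pixels
          h = meta["height"] * sy
          x = meta["x"] * sx - w / 2     # center -> top-left, in output space
          y = meta["y"] * sy - h / 2
          return int(x), int(y), int(w), int(h)

    With these numbers, a 200x200 image centered at (640, 360) on a 1280x720 canvas lands at top-left (810, 390) with size 300x300 in the 1920x1080 output.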

    Can somebody please help me figure this out? :)

  • How to resize the dimensions of a video through ffmpeg-python?

    25 January, by kunambi

    I'm trying to resize a video file which a user has uploaded to Django, by using ffmpeg-python. The documentation isn't very easy to understand, so I've tried to cobble this together from various sources.

    This method is run in a Celery container, in order not to slow down the experience for the user. The problem I'm facing is that I can't seem to resize the video file. I've tried two different approaches:

    import ffmpeg
    from django.db import models
    from io import BytesIO
    from myapp.models import MediaModel
    
    
    def resize_video(mypk: str) -> None:
        instance = MediaModel.objects.get(pk=mypk)
        media_instance: models.FileField = instance.media
        media_output = "test.mp4"
        buffer = BytesIO()
    
        for chunk in media_instance.chunks():
            buffer.write(chunk)
    
        stream_video = ffmpeg.input("pipe:").video.filter("scale", 720, -1)  # resize to 720px width
        stream_audio = ffmpeg.input("pipe:").audio
        process = (
            ffmpeg.output(stream_video, stream_audio, media_output, acodec="aac")
            .overwrite_output()
            .run_async(pipe_stdin=True, quiet=True)
        )
        buffer.seek(0)
        process_out, process_err = process.communicate(input=buffer.getbuffer())
        # (pdb) process_out
        # b''
    
        # attempting to use `.concat` instead
        process2 = (
            ffmpeg.concat(stream_video, stream_audio, v=1, a=1)
            .output(media_output)
            .overwrite_output()
            .run_async(pipe_stdin=True, quiet=True)
        )
        buffer.seek(0)
        process2_out, process2_err = process2.communicate(input=buffer.getbuffer())
        # (pdb) process2_out
        # b''
    

    As we can see, no matter which approach is chosen, the output is empty. process_err and process2_err both contain the following message:

    ffmpeg version N-111491-g31979127f8-20230717 Copyright (c) 2000-2023 the
    FFmpeg developers
      built with gcc 13.1.0 (crosstool-NG 1.25.0.196_227d99d)
      configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static
    --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64
    --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug
    --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2
    --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp
    --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl
    --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib
    --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth
    --enable-chromaprint --enable-libdav1d --enable-libdavs2
    --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r
    --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray
    --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist
    --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp
    --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb
    --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg
    --enable-libopenmpt --enable-librav1e --enable-librubberband
    --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt
    --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm
    --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc
    --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2
    --enable-libxvid --enable-libzimg --enable-libzvbi
    --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags=
    --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp
    --extra-version=20230717
      libavutil      58. 14.100 / 58. 14.100
      libavcodec     60. 22.100 / 60. 22.100
      libavformat    60. 10.100 / 60. 10.100
      libavdevice    60.  2.101 / 60.  2.101
      libavfilter     9.  8.102 /  9.  8.102
      libswscale      7.  3.100 /  7.  3.100
      libswresample   4. 11.100 /  4. 11.100
      libpostproc    57.  2.100 / 57.  2.100
     "Input #0, mov,mp4,m4a,3gp,3g2,mj2, frompipe:':\r\n"
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        creation_time   : 2020-11-10T15:01:09.000000Z
      Duration: 00:00:04.16, start: 0.000000, bitrate: N/A
      Stream #0:0[0x1](eng): Video: h264 (Main) (avc1 / 0x31637661),
    yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2649 kb/s, 25 fps, 25
    tbr, 25k tbn (default)
        Metadata:
          creation_time   : 2020-11-10T15:01:09.000000Z
          handler_name    : ?Mainconcept Video Media Handler
          vendor_id       : [0][0][0][0]
          encoder         : AVC Coding
      Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz,
    stereo, fltp, 317 kb/s (default)
        Metadata:
          creation_time   : 2020-11-10T15:01:09.000000Z
          handler_name    : #Mainconcept MP4 Sound Media Handler
          vendor_id       : [0][0][0][0]
    Stream mapping:
      Stream #0:0 (h264) -> scale:default (graph 0)
      scale:default (graph 0) -> Stream #0:0 (libx264)
      Stream #0:1 -> #0:1 (aac (native) -> aac (native))
    [libx264 @ 00000243a23a1100] using SAR=1/1
    [libx264 @ 00000243a23a1100] using cpu capabilities: MMX2 SSE2Fast SSSE3
    SSE4.2 AVX FMA3 BMI2 AVX2
    [libx264 @ 00000243a23a1100] profile High, level 3.0, 4:2:0, 8-bit
    [libx264 @ 00000243a23a1100] 264 - core 164 - H.264/MPEG-4 AVC codec -
    Copyleft 2003-2023 - http://www.videolan.org/x264.html - options: cabac=1
    ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00
    mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11
    fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1
    sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0
    constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1
    weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40
    intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0
    qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
     "Output #0, mp4, toaa37f8d7685f4df9af85b1cdcd95997e.mp4':\r\n"
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: mp42mp41
        encoder         : Lavf60.10.100
      Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, progressive),
    800x450 [SAR 1:1 DAR 16:9], q=2-31, 25 fps, 12800 tbn
        Metadata:
          encoder         : Lavc60.22.100 libx264
        Side data:
          cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
      Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo,
    fltp, 128 kb/s (default)
        Metadata:
          creation_time   : 2020-11-10T15:01:09.000000Z
          handler_name    : #Mainconcept MP4 Sound Media Handler
          vendor_id       : [0][0][0][0]
          encoder         : Lavc60.22.100 aac
    frame=    0 fps=0.0 q=0.0 size=       0kB time=N/A bitrate=N/A speed=N/A
    frame=   21 fps=0.0 q=28.0 size=       0kB time=00:00:02.75 bitrate=   0.1kbits/s speed=4.75x
    [out#0/mp4 @ 00000243a230bd80] video:91kB audio:67kB subtitle:0kB other
    streams:0kB global headers:0kB muxing overhead: 2.838559%
    frame=  104 fps=101 q=-1.0 Lsize=     162kB time=00:00:04.13 bitrate=
    320.6kbits/s speed=4.02x    
    [libx264 @ 00000243a23a1100] frame I:1     Avg QP:18.56  size:  2456
    [libx264 @ 00000243a23a1100] frame P:33    Avg QP:16.86  size:  1552
    [libx264 @ 00000243a23a1100] frame B:70    Avg QP:17.55  size:   553
    [libx264 @ 00000243a23a1100] consecutive B-frames:  4.8% 11.5% 14.4%
    69.2%
    [libx264 @ 00000243a23a1100] mb I  I16..4: 17.3% 82.1%  0.6%
    [libx264 @ 00000243a23a1100] mb P  I16..4:  5.9% 15.2%  0.4%  P16..4: 18.3% 
    0.9%  0.4%  0.0%  0.0%    skip:58.7%
    [libx264 @ 00000243a23a1100] mb B  I16..4:  0.8%  0.3%  0.0%  B16..8: 15.4% 
    1.0%  0.0%  direct: 3.6%  skip:78.9%  L0:34.2% L1:64.0% BI: 1.7%
    [libx264 @ 00000243a23a1100] 8x8 transform intra:68.2% inter:82.3%
    [libx264 @ 00000243a23a1100] coded y,uvDC,uvAC intra: 4.2% 18.4% 1.2% inter:
    1.0% 6.9% 0.0%
    [libx264 @ 00000243a23a1100] i16 v,h,dc,p: 53% 25%  8% 14%
    [libx264 @ 00000243a23a1100] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19%  6% 70%  1% 
    1%  1%  1%  0%  0%
    [libx264 @ 00000243a23a1100] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 46% 21% 15%  2% 
    5%  4%  3%  3%  1%
    [libx264 @ 00000243a23a1100] i8c dc,h,v,p: 71% 15% 13%  1%
    [libx264 @ 00000243a23a1100] Weighted P-Frames: Y:30.3% UV:15.2%
    [libx264 @ 00000243a23a1100] ref P L0: 46.7%  7.5% 34.6%  7.3%  3.9%
    [libx264 @ 00000243a23a1100] ref B L0: 88.0% 10.5%  1.5%
    [libx264 @ 00000243a23a1100] ref B L1: 98.1%  1.9%
    [libx264 @ 00000243a23a1100] kb/s:177.73
    [aac @ 00000243a23a2e00] Qavg: 1353.589
    

    I'm at a loss right now, would love any feedback/solution.
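
    Judging by the log, the encode actually ran to completion and wrote the output file; process_out is b'' simply because the output goes to a file rather than to stdout. A minimal sketch of the pipe-in/file-out variant, assuming a single input should feed both video and audio (-2 rather than -1 keeps the scaled height even, which libx264 requires):

      import ffmpeg

      def resize_video_720(data: bytes, media_output: str = "test.mp4") -> None:
          # One input node, split into its video and audio streams.
          inp = ffmpeg.input("pipe:0")
          video = inp.video.filter("scale", 720, -2)  # width 720, even height
          audio = inp.audio
          (
              ffmpeg.output(video, audio, media_output, vcodec="libx264", acodec="aac")
              .overwrite_output()
              # run() with input= feeds stdin and waits, so there is no need for
              # run_async()/communicate(); stdout stays empty because the result
              # is written to media_output.
              .run(input=data, capture_stdout=True, capture_stderr=True)
          )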

  • FFMPEG - RADEON - VAAPI - Alpha channel overlay

    25 January, by Alexandre Leitão

    OK, I've hit my limit, so I decided to ask for help. I'm running a Python app in a Docker container. It works fine with VAAPI hardware acceleration (AMD Radeon GPU) until I use a filter complex. This is my command:

    ffmpeg -v error -y -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -stream_loop -1 -f concat -i video_list.txt -stream_loop -1 -f concat -i overlay_list1.txt -stream_loop -1 -f concat -i overlay_list.txt -filter_complex "[0:v][1:v]overlay=x=0:y=0:shortest=1[out];[out][2:v]overlay=x=0:y=0:shortest=1[out1]" -map "[out1]" -map 1:a -c:v h264_vaapi -c:a aac -b:a 192k -t 1800 text.mp4
    

    • video_list.txt contains 2 h264 mp4 videos (30 sec each)
    • overlay_list1.txt contains 2 mov videos (alpha-channel overlays, 2 minutes each)
    • overlay_list.txt contains 1 mov video (alpha-channel overlay, 25 minutes long)

    • The idea is to loop video_list.txt up to -t (1800 at the moment)
    • Loop overlay_list1.txt over it until -t
    • And loop overlay_list.txt above everything

    Audio only comes from the overlay_list1.txt videos. What I get is this output:

    Impossible to convert between the formats supported by the filter 'Parsed_overlay_1' and the filter 'auto_scale_2'
    [fc#0 @ 0x5fadddb56540] Error reinitializing filters!
    [fc#0 @ 0x5fadddb56540] Task finished with error code: -38 (Function not implemented)
    [fc#0 @ 0x5fadddb56540] Terminating thread with return code -38 (Function not implemented)
    [vost#0:0/h264_vaapi @ 0x5fadddb3f400] Could not open encoder before EOF
    [vost#0:0/h264_vaapi @ 0x5fadddb3f400] Task finished with error code: -22 (Invalid argument)
    [vost#0:0/h264_vaapi @ 0x5fadddb3f400] Terminating thread with return code -22 (Invalid argument)
    [out#0/mp4 @ 0x5fadddc3f140] Nothing was written into output file, because at least one of its streams received no packets.
    

    I tried everything and couldn't fix it... the last thing I read is that I'm supposed to use hwupload and hwdownload in the filter complex, but I couldn't understand how to do it.
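
    For what it's worth, here is a sketch of that direction (untested, phrased as a Python subprocess call since the app is Python): drop -hwaccel vaapi so decoding and both overlays stay in software, then convert and upload only at the end of the graph with format=nv12,hwupload, so h264_vaapi still receives hardware frames:

      import subprocess

      # A sketch, not a verified fix: software decode + software overlays,
      # with the GPU upload happening only just before the VAAPI encoder.
      cmd = [
          "ffmpeg", "-v", "error", "-y",
          "-vaapi_device", "/dev/dri/renderD128",
          "-stream_loop", "-1", "-f", "concat", "-i", "video_list.txt",
          "-stream_loop", "-1", "-f", "concat", "-i", "overlay_list1.txt",
          "-stream_loop", "-1", "-f", "concat", "-i", "overlay_list.txt",
          "-filter_complex",
          "[0:v][1:v]overlay=x=0:y=0:shortest=1[out];"
          "[out][2:v]overlay=x=0:y=0:shortest=1[out1];"
          "[out1]format=nv12,hwupload[hw]",
          "-map", "[hw]", "-map", "1:a",
          "-c:v", "h264_vaapi", "-c:a", "aac", "-b:a", "192k",
          "-t", "1800", "text.mp4",
      ]
      subprocess.run(cmd, check=True)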

    Any help is welcome, guys. Thank you, and happy new year to y'all!

  • Duplicated PTS value when using rtsp transport UDP (H264 FU-A)

    25 January, by Christoph

    I'm implementing a packet loss counter based on the PTS of each av_packet, and it works fine when using RTSP/TCP as the transport mode. However, when I switched to RTSP/UDP, two packets consistently share the same PTS. This puzzled me, because I assumed that av_read_frame would parse the stream and provide "valid" packets.

    In both cases the stream is FU-A H.264, and I expected FFmpeg to handle reassembly identically in both transport modes. My understanding was that if UDP packets were split, FFmpeg would reassemble them into a single av_packet, similar to how it reassembles TCP packets split due to MTU and FU-A.

    I could adapt my packet loss calculation by simply ignoring packets with the same PTS as the previous one, but I want to understand what's happening here.

    TCP

    packet pts: -9223372036854775808, dts: -9223372036854775808, size: 52672, key-frame: true, discard: false, corrupt: false
    packet pts: 3598, dts: 3598, size: 6034, key-frame: false, discard: false, corrupt: false
    packet pts: 7196, dts: 7196, size: 5730, key-frame: false, discard: false, corrupt: false
    packet pts: 10794, dts: 10794, size: 6153, key-frame: false, discard: false, corrupt: false
    packet pts: 14392, dts: 14392, size: 2269, key-frame: false, discard: false, corrupt: false
    packet pts: 17989, dts: 17989, size: 2656, key-frame: false, discard: false, corrupt: false
    packet pts: 21587, dts: 21587, size: 2659, key-frame: false, discard: false, corrupt: false
    

    UDP

    packet pts: -9223372036854775808, dts: -9223372036854775808, size: 1391, key-frame: true, discard: false, corrupt: false
    packet pts: 0, dts: 0, size: 109265, key-frame: true, discard: false, corrupt: false
    packet pts: 3598, dts: 3598, size: 878, key-frame: false, discard: false, corrupt: false
    packet pts: -> 3598, dts: 3598, size: 7728, key-frame: false, discard: false, corrupt: false
    packet pts: 7195, dts: 7195, size: 887, key-frame: false, discard: false, corrupt: false
    packet pts: -> 7195, dts: 7195, size: 7149, key-frame: false, discard: false, corrupt: false
    packet pts: 10793, dts: 10793, size: 795, key-frame: false, discard: false, corrupt: false
    packet pts: -> 10793, dts: 10793, size: 7777, key-frame: false, discard: false, corrupt: false
    packet pts: 14391, dts: 14391, size: 119, key-frame: false, discard: false, corrupt: false
    packet pts: -> 14391, dts: 14391, size: 2075, key-frame: false, discard: false, corrupt: false
    

    For reference here my code

    // PackageLossDetection detects possible packet loss based on PTS (Presentation Time Stamp) values.
    // It compares the PTS of the packet with the expected PTS, calculated using the stream's time base and average frame rate.
    // If the deviation between the expected and actual PTS exceeds a defined tolerance, the packet is counted as lost.
    //
    // Parameters:
    //   - pkt: incoming packet whose PTS is to be checked.
    //   - stream: the stream containing time base and average frame rate information.
    func (s *AvSource) PackageLossDetection(pkt *astiav.Packet, stream *astiav.Stream) {
    
        // When using UDP as the RTSP transport, packets arrive in pairs sharing the same PTS
        // TODO: maybe we should invest more time in finding a better solution
        if s.lastPts == pkt.Pts() {
            return
        }
    
        if pkt.Pts() > 0 {
    
            const tolerance = 4 // Allowable deviation in PTS steps
            if stream.AvgFrameRate().Num() == 0 {
                s.log.Warn().Str("stream", s.stream.Name).Msg("PackageLossDetection, no frame rate information available")
                return
            }
    
            var ptsBetween = stream.TimeBase().Den() * stream.TimeBase().Num() / stream.AvgFrameRate().Num()
            if math.Abs(float64(pkt.Pts()-(s.lastPts+int64(ptsBetween)))) > tolerance {
                s.log.Warn().Str("stream", s.stream.Name).Msgf("PackageLossDetection, PTS steps: %d, expected: %d, got: %d", int(ptsBetween), s.lastPts+int64(ptsBetween), pkt.Pts())
                utils.SafeIncrementInt64(&s.metrics.LossCount)
            }
    
            s.lastPts = pkt.Pts()
        }
    }
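
    For comparison outside Go, a minimal PyAV sketch that logs packets sharing a PTS (the RTSP URL is a placeholder; rtsp_transport selects the same UDP path as above). Note that in both logs the PTS advances by roughly 3598 per frame, consistent with a 90 kHz RTSP time base at about 25 fps (90000 / 25 = 3600), which is the step the Go code computes as ptsBetween:

      import av  # PyAV

      container = av.open("rtsp://camera.example/stream",
                          options={"rtsp_transport": "udp"})
      video = container.streams.video[0]

      last_pts = None
      for packet in container.demux(video):
          if packet.pts is None:
              continue  # e.g. the initial packet without a timestamp
          if packet.pts == last_pts:
              print(f"duplicate pts={packet.pts} size={packet.size}")
          last_pts = packet.pts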