Newest 'ffmpeg' Questions - Stack Overflow
Articles published on the site
-
AWS Lambda function execution for video
11 April, by Abinaya
I am trying to run my application as a Lambda function, but I am not getting a successful execution every time; the results are inconsistent across tests.
The application uses files from an S3 bucket as input. In one test I executed the Lambda with a simple 2-second mp4 video as input, using FFmpeg in my application for processing. The Lambda executed successfully 4 or 5 times (not every time), and I was able to see the FFmpeg commands and logs in the AWS log output. The CloudWatch logs from the last execution:
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
END RequestId: a7e397a7-ef1b-44c5-9877-e60a55665fc8
REPORT RequestId: a7e397a7-ef1b-44c5-9877-e60a55665fc8  Duration: 255111.29 ms  Billed Duration: 255000 ms  Memory Size: 1024 MB  Max Memory Used: 177 MB
In short, I have created a console application that uses a model (say, an ONNX model). When the application starts, the input is downloaded from S3; once any objects are detected in the input video, I render them into an output video with FFmpeg.
The generated output was then uploaded to the S3 bucket in .flv format. Sometimes the output is 0 KB. Since then I haven't had a single successful execution for my video case, even though nothing has been modified in the image used to run the Lambda. What could be the reason for such inconsistent behaviour throughout my testing?
I am expecting to get successful output again, consistently. I have run the same application in an Ubuntu environment inside a Docker container: it generated the output video and uploaded it to S3, working as expected. The same Docker image was pushed to ECR, and from Lambda I get no output. Is Lambda not compatible with FFmpeg? I have been struggling with this issue for days; any help would be appreciated.
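A minimal sketch (assuming a Python handler; the paths and timeout value are illustrative) of how the ffmpeg step can be made to fail loudly, so that a timeout or an ffmpeg error shows up in CloudWatch instead of surfacing only as an empty upload:

import subprocess

def handler(event, context):
    # /tmp is the only writable path inside a Lambda container; the input
    # is assumed to have been downloaded there from S3 already.
    src = "/tmp/input.mp4"
    dst = "/tmp/output.flv"
    # Keep this timeout below the function's configured timeout so a hung
    # ffmpeg raises here instead of Lambda killing the function mid-run.
    result = subprocess.run(
        ["ffmpeg", "-y", "-i", src, dst],
        capture_output=True, text=True, timeout=240,
    )
    print(result.stderr)        # ffmpeg writes its log to stderr
    result.check_returncode()   # raise instead of uploading a 0 KB file

Note that in the log above the billed duration (255000 ms) sits just under the measured duration, which is the usual signature of the function being cut off at its configured timeout rather than ffmpeg finishing.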
-
4K Screen Recording on 1080p Monitors [closed]
10 April, by Souhail Benlhachemi
I have created a basic Windows screen recording app (ffmpeg + GUI), but I noticed that the quality of the recording depends on the monitor used to record: the quality when recording on a full HD monitor is different from the quality when recording on a 4K monitor (which is obvious).
There is not much difference between the two when playing the recorded video at 100% scale, but at 150% zoom or more the difference between the two recordings (1920x1080 vs 4K) is clearly visible.
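For reference, the baseline capture path looks something like this (a sketch assuming a gdigrab-style desktop capture; the grabbed frame size is simply the current desktop resolution, which is why the monitor dictates recording quality):

ffmpeg -f gdigrab -framerate 30 -i desktop -c:v libx264 -pix_fmt yuv420p capture.mp4

Upscaling that output afterwards (for example with -vf scale=3840:2160) raises the stored resolution but cannot add detail, which is what the attempts below try to work around.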
I did some research on how to do screen recording at 4K quality on a full HD monitor, and here is what I found:
I played with the Windows Desktop Duplication API (the AcquireNextFrame function, which gives you the next frame on the swap chain). I successfully managed to convert the buffer to a PNG image and save it locally, but as you would expect the quality was the same as a normal screenshot, because AcquireNextFrame returns a frame after it has been rasterized.
Then I came across the graphics pipeline. I spent some time understanding the basics and concluded that I would need to somehow intercept the pre-rasterization data (the data that comes before the rasterizer stage: geometry shaders, etc.), duplicate it, and do an off-screen render to a new 4K render target. But the Windows API doesn't allow that; there is no way to do it. The only option in the docs is the Stream Output stage, but that is only useful if you want to render your own shaders, not the ones my display is using. (I tried to use MinHook to intercept the data, but no luck.)
After that, I tried a different approach: I managed to create a virtual display as an extended monitor with 4K resolution and record it using ffmpeg. But what I see on my main monitor is different from the virtual display (which shows only an empty desktop); I would have to drag app windows onto that screen manually with the mouse, which creates an obvious problem while recording: we are not seeing what we are recording.
I found some YouTube videos about DSR (Dynamic Super Resolution). I tried it in the NVIDIA Control Panel (manually, through the GUI) and it works: I fooled the system into thinking I have a 4K monitor, and the recording quality was crystal clear. But I couldn't find a way to do that programmatically with NVAPI, and there is no equivalent API on AMD.
Has anyone worked on a similar project, or does anyone know of a similar project I can use as a reference? Any suggestions?
-
Using -itsoffset with AAC Tracks in MP4: What Actually Happens? [closed]
10 April, by user27607034
If the frame duration of an AAC file is around 21 milliseconds, does the -itsoffset value in the following command actually get applied? And even if it does, will players like QuickTime or a smart TV actually respect it?
ffmpeg -itsoffset 0.009 -i input.mp4 \
       -i input.mp4 \
       -map 1:v:0 \
       -map 0:a:0 \
       -c copy \
       output.mp4

(Here 1:v:0 is the HEVC video from the second input and 0:a:0 is the AAC audio from the offset first input.)
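One way to check what actually landed in the file (a sketch; it assumes the output.mp4 produced above) is to dump the first few audio packet timestamps with ffprobe and see whether they start at 0.009 or at 0:

ffprobe -v error -select_streams a:0 \
        -show_entries packet=pts_time \
        -of csv=p=0 -read_intervals %+#5 output.mp4

The offset is applied to the input timestamps at the demuxer level, so it does not depend on the AAC frame duration; whether a given player honors the resulting start offset or edit list is a separate, player-dependent question.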
-
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
9 April, by azail765
This script works on a 30-second wav file but not on a 10-minute phone call, also in wav format. Any help would be appreciated.
I've downloaded ffmpeg.
# Import necessary libraries
from pydub import AudioSegment
import speech_recognition as sr
import os
import pydub

chunk_count = 0
directory = os.fsencode(r'C:\Users\zach.blair\Downloads\speechRecognition\New folder')

# Text file to write the recognized audio
fh = open("recognized.txt", "w+")

for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename.endswith(".wav"):
        chunk_count += 1
        # Input audio file to be sliced
        audio = AudioSegment.from_file(filename, format="wav")

        '''
        Step #1 - Slicing the audio file into smaller chunks.
        '''
        # Length of the audio file in milliseconds
        n = len(audio)

        # Variable to count the number of sliced chunks
        counter = 1

        # Interval length at which to slice the audio file.
        interval = 20 * 1000

        # Length of audio to overlap.
        overlap = 1 * 1000

        # Initialize start and end seconds to 0
        start = 0
        end = 0

        # Flag to keep track of end of file.
        # When audio reaches its end, flag is set to 1 and we break
        flag = 0

        # Iterate from 0 to end of the file,
        # with increment = interval
        for i in range(0, 2 * n, interval):
            # During first iteration,
            # start is 0, end is the interval
            if i == 0:
                start = 0
                end = interval
            # All other iterations,
            # start is the previous end - overlap
            # end becomes end + interval
            else:
                start = end - overlap
                end = start + interval

            # When end becomes greater than the file length,
            # end is set to the file length
            # flag is set to 1 to indicate break.
            if end >= n:
                end = n
                flag = 1

            # Storing audio file from the defined start to end
            chunk = audio[start:end]

            # Filename / Path to store the sliced audio
            filename = str(chunk_count) + 'chunk' + str(counter) + '.wav'

            # Store the sliced audio file to the defined path
            chunk.export(filename, format="wav")

            # Print information about the current chunk
            print(str(chunk_count) + str(counter) + ". Start = " + str(start) + " end = " + str(end))

            # Increment counter for the next chunk
            counter = counter + 1

            AUDIO_FILE = filename

            # Initialize the recognizer
            r = sr.Recognizer()

            # Traverse the audio file and listen to the audio
            with sr.AudioFile(AUDIO_FILE) as source:
                audio_listened = r.listen(source)

            # Try to recognize the listened audio
            # and catch exceptions.
            try:
                rec = r.recognize_google(audio_listened)
                # If recognized, write into the file.
                fh.write(rec + " ")
            # If Google could not understand the audio
            except sr.UnknownValueError:
                print("Empty Value")
            # If the results cannot be requested from Google.
            # Probably an internet connection error.
            except sr.RequestError as e:
                print("Could not request results.")

            # Check for flag.
            # If flag is 1, end of the whole audio is reached: break.
            if flag == 1:
                break

fh.close()
I get this error on audio = AudioSegment.from_file(filename, format="wav"):

Traceback (most recent call last):
  File "C:\Users\zach.blair\Downloads\speechRecognition\New folder\speechRecognition3.py", line 17, in <module>
    audio = AudioSegment.from_file(filename, format="wav")
  File "C:\Users\zach.blair\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pydub\audio_segment.py", line 704, in from_file
    p.returncode, p_err))
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:
ffmpeg version N-95027-g8c90bb8ebb Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 9.2.1 (GCC) 20190918
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
  libavutil      56. 35.100 / 56. 35.100
  libavcodec     58. 58.101 / 58. 58.101
  libavformat    58. 33.100 / 58. 33.100
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.102 /  7. 58.102
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '2a.wav.wav':
  Duration: 00:09:52.95, bitrate: 64 kb/s
    Stream #0:0: Audio: pcm_mulaw ([7][0][0][0] / 0x0007), 8000 Hz, mono, s16, 64 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_mulaw (native) -> pcm_s8 (native))
Press [q] to stop, [?] for help
[wav @ 0000024307974400] pcm_s8 codec not supported in WAVE format
Could not write header for output file #0 (incorrect codec parameters ?): Function not implemented
Error initializing output stream 0:0 -- Conversion failed!
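The log shows the actual failure: the source is 8 kHz mu-law (pcm_mulaw), and ffmpeg is being asked to write pcm_s8, which the WAV muxer does not support. A sketch of one workaround (the helper name is made up) is to re-encode the call recording to plain 16-bit PCM before handing it to pydub:

import subprocess
from pydub import AudioSegment

def load_mulaw_wav(path):
    # Hypothetical helper: re-encode a mu-law WAV to 16-bit PCM first,
    # so neither pydub nor ffmpeg has to pick an unsupported WAV codec.
    fixed = path + ".s16.wav"
    subprocess.run(
        ["ffmpeg", "-y", "-i", path, "-acodec", "pcm_s16le", fixed],
        check=True,
    )
    return AudioSegment.from_file(fixed, format="wav")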
-
How to dump ALL metadata from a media file, including cover image title? [closed]
9 April, by Unideal
I have an MP3 song:
# ffprobe -hide_banner -i filename.mp3
Input #0, mp3, from 'filename.mp3':
  Metadata:
    composer        : Music Author
    title           : Song Name
    artist          : Singer
    encoder         : Lavf61.7.100
    genre           : Rock
    date            : 2025
  Duration: 00:03:14.04, start: 0.023021, bitrate: 208 kb/s
  Stream #0:0: Audio: mp3 (mp3float), 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      encoder         : Lavc61.19
  Stream #0:1: Video: png, rgb24(pc, gbr/unknown/unknown), 600x600 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
The task is to save its metadata to a text file and restore it from that file later. Both steps should be accomplished with ffmpeg.
The simplest method is to run:
# ffmpeg -i filename.mp3 -f ffmetadata metadata.txt
After that, metadata.txt contains:

;FFMETADATA1
composer=Music Author
title=Song Name
artist=Singer
date=2025
genre=Rock
encoder=Lavf61.7.100
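For the restore half of the task, the documented counterpart (the output name here is illustrative) feeds the metadata file back in as a second input:

# ffmpeg -i filename.mp3 -i metadata.txt -map_metadata 1 -c copy restored.mp3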
This gives global metadata only; the stream-specific info (the cover image title and comment, in my case) is missing.
Google suggested a more complex form of the command above to extract all metadata fields without any exclusions:
# ffmpeg -y -i filename.mp3 -c copy -map_metadata 0 -map_metadata:s:v 0:s:v -map_metadata:s:a 0:s:a -f ffmetadata metadata.txt
But the output is exactly the same:
;FFMETADATA1
composer=Music Author
title=Song Name
artist=Singer
date=2025
genre=Rock
encoder=Lavf61.7.100
Again, no info about the attached image.
Please explain what I am doing wrong.
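For comparison, a dump that does include the per-stream tags (a sketch; it uses ffprobe rather than ffmpeg, so it may not meet the ffmpeg-only constraint) is:

# ffprobe -v quiet -print_format json -show_format -show_streams -i filename.mp3

This prints the cover stream's title and comment under that stream's "tags" object in the JSON output.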