Newest 'ffmpeg' Questions - Stack Overflow

http://stackoverflow.com/questions/tagged/ffmpeg

Articles published on the site

  • pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

    9 April, by azail765

    This script works on a 30-second WAV file, but not on a 10-minute phone call that is also in WAV format. Any help would be appreciated.

    I've downloaded ffmpeg.

    # Import necessary libraries 
    from pydub import AudioSegment 
    import speech_recognition as sr 
    import os
    import pydub
    
    
    chunk_count = 0
    directory = os.fsencode(r'C:\Users\zach.blair\Downloads\speechRecognition\New folder')
    # Text file to write the recognized audio 
    fh = open("recognized.txt", "w+")
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(".wav"):
            chunk_count += 1
            # Input audio file to be sliced 
            audio = AudioSegment.from_file(filename, format="wav") 
              
            ''' 
            Step #1 - Slicing the audio file into smaller chunks. 
            '''
            # Length of the audiofile in milliseconds 
            n = len(audio) 
              
            # Variable to count the number of sliced chunks 
            counter = 1
              
             
              
            # Interval length at which to slice the audio file. 
            interval = 20 * 1000
              
            # Length of audio to overlap.  
            overlap = 1 * 1000
              
            # Initialize start and end seconds to 0 
            start = 0
            end = 0
              
            # Flag to keep track of end of file. 
            # When audio reaches its end, flag is set to 1 and we break 
            flag = 0
              
            # Iterate from 0 to end of the file, 
            # with increment = interval 
            for i in range(0, 2 * n, interval): 
                  
                # During first iteration, 
                # start is 0, end is the interval 
                if i == 0: 
                    start = 0
                    end = interval 
              
                # All other iterations, 
                # start is the previous end - overlap 
                # end becomes end + interval 
                else: 
                    start = end - overlap 
                    end = start + interval  
              
                # When end becomes greater than the file length, 
                # end is set to the file length 
                # flag is set to 1 to indicate break. 
                if end >= n: 
                    end = n 
                    flag = 1
              
                # Storing audio file from the defined start to end 
                chunk = audio[start:end] 
              
                # Filename / Path to store the sliced audio 
                chunk_filename = str(chunk_count)+'chunk'+str(counter)+'.wav'
              
                # Store the sliced audio file to the defined path 
                chunk.export(chunk_filename, format="wav") 
                # Print information about the current chunk 
                print(str(chunk_count)+str(counter)+". Start = "
                                    +str(start)+" end = "+str(end)) 
              
                # Increment counter for the next chunk 
                counter = counter + 1
                  
              
                AUDIO_FILE = chunk_filename 
                
                # Initialize the recognizer 
                r = sr.Recognizer() 
              
                # Traverse the audio file and listen to the audio 
                with sr.AudioFile(AUDIO_FILE) as source: 
                    audio_listened = r.listen(source) 
              
                # Try to recognize the listened audio 
                # and catch exceptions. 
                try:     
                    rec = r.recognize_google(audio_listened) 
                      
                    # If recognized, write into the file. 
                    fh.write(rec+" ") 
                  
                # If google could not understand the audio 
                except sr.UnknownValueError: 
                    print("Empty Value") 
              
                # If the results cannot be requested from Google. 
                # Probably an internet connection error. 
                except sr.RequestError as e: 
                    print("Could not request results: {0}".format(e)) 
              
                # Check for flag. 
                # If flag is 1, the end of the whole audio file was reached, 
                # so break out of the chunk loop. 
                if flag == 1: 
                    break 

    fh.close()    
    

    I get this error on audio = AudioSegment.from_file(filename,format="wav"):

    Traceback (most recent call last):
      File "C:\Users\zach.blair\Downloads\speechRecognition\New folder\speechRecognition3.py", line 17, in 
        audio = AudioSegment.from_file(filename,format="wav")
      File "C:\Users\zach.blair\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pydub\audio_segment.py", line 704, in from_file
        p.returncode, p_err))
    pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
    

    Output from ffmpeg/avlib:

      ffmpeg version N-95027-g8c90bb8ebb Copyright (c) 2000-2019 the FFmpeg developers
      built with gcc 9.2.1 (GCC) 20190918
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
      libavutil      56. 35.100 / 56. 35.100
      libavcodec     58. 58.101 / 58. 58.101
      libavformat    58. 33.100 / 58. 33.100
      libavdevice    58.  9.100 / 58.  9.100
      libavfilter     7. 58.102 /  7. 58.102
      libswscale      5.  6.100 /  5.  6.100
      libswresample   3.  6.100 /  3.  6.100
      libpostproc    55.  6.100 / 55.  6.100
    Guessed Channel Layout for Input Stream #0.0 : mono
    Input #0, wav, from '2a.wav.wav':
      Duration: 00:09:52.95, bitrate: 64 kb/s
        Stream #0:0: Audio: pcm_mulaw ([7][0][0][0] / 0x0007), 8000 Hz, mono, s16, 64 kb/s
    Stream mapping:
      Stream #0:0 -> #0:0 (pcm_mulaw (native) -> pcm_s8 (native))
    Press [q] to stop, [?] for help
    [wav @ 0000024307974400] pcm_s8 codec not supported in WAVE format
    Could not write header for output file #0 (incorrect codec parameters ?): Function not implemented
    Error initializing output stream 0:0 -- 
    Conversion failed!
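
    The traceback comes from pydub, but the real failure is in the ffmpeg output above: pydub converts the file with ffmpeg when loading it, and that conversion picks pcm_s8 as the target, which the WAV muxer refuses ("pcm_s8 codec not supported in WAVE format"). The log shows the phone calls are 8 kHz pcm_mulaw, presumably unlike the 30-second files that work. One workaround sketch, using the file name from the log (2a_s16.wav is just an example output name): transcode the calls to plain 16-bit PCM once, then point pydub at the converted copy.

    ffmpeg -i 2a.wav.wav -acodec pcm_s16le 2a_s16.wav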
    
  • How to dump ALL metadata from a media file, including cover image title? [closed]

    9 April, by Unideal

    I have an MP3 song:

    # ffprobe -hide_banner -i filename.mp3
    Input #0, mp3, from 'filename.mp3':
      Metadata:
        composer        : Music Author
        title           : Song Name
        artist          : Singer
        encoder         : Lavf61.7.100
        genre           : Rock
        date            : 2025
      Duration: 00:03:14.04, start: 0.023021, bitrate: 208 kb/s
      Stream #0:0: Audio: mp3 (mp3float), 48000 Hz, stereo, fltp, 192 kb/s
          Metadata:
            encoder         : Lavc61.19
      Stream #0:1: Video: png, rgb24(pc, gbr/unknown/unknown), 600x600 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
          Metadata:
            title           : Cover
            comment         : Cover (front)
    

    The task is to save its metadata to a text file and restore it from that file later. Both goals should be accomplished with ffmpeg.

    The simplest method is to run:

    # ffmpeg -i filename.mp3 -f ffmetadata metadata.txt
    

    After that, metadata.txt contains:

    ;FFMETADATA1
    composer=Music Author
    title=Song Name
    artist=Singer
    date=2025
    genre=Rock
    encoder=Lavf61.7.100
    

    I got the global metadata only; the stream-specific info (the cover image title and comment in my case) is missing.

    Google suggested a more complex form of the command above to extract all metadata fields without any exclusions:

    # ffmpeg -y -i filename.mp3 -c copy -map_metadata 0 -map_metadata:s:v 0:s:v -map_metadata:s:a 0:s:a -f ffmetadata metadata.txt
    

    But the output is exactly the same:

    ;FFMETADATA1
    composer=Music Author
    title=Song Name
    artist=Singer
    date=2025
    genre=Rock
    encoder=Lavf61.7.100
    

    Again, no info about the attached image.

    Please explain what I am doing wrong.
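
    As far as I can tell, an ffmetadata dump maps no output streams at all, so the -map_metadata:s:v and -map_metadata:s:a options have no stream sections to fill in; only the global metadata ever gets written. A sketch of two possible ways around this (cover_meta.txt and metadata.json are example names): promote the attached picture's stream metadata into the global section of a second dump, using the documented stream-to-global form of -map_metadata, or dump every format- and stream-level tag with ffprobe for inspection.

    # ffmpeg -i filename.mp3 -map_metadata 0:s:1 -f ffmetadata cover_meta.txt
    # ffprobe -v quiet -print_format json -show_format -show_streams filename.mp3 > metadata.json

    For the restore step, the reverse mapping (e.g. -map_metadata:s:v 1 with cover_meta.txt as a second input while remuxing) should write those tags back onto the picture stream.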

  • some codecs (e.g. libx264) cannot be reused after draining

    7 April, by Neddie

    After draining an encoder (by sending it a null frame and then receiving packets until EOF), it stays in draining mode, in which avcodec_send_frame fails, returning AVERROR_EOF. You're supposed to call avcodec_flush_buffers, which among other things resets the internal draining flag to 0, allowing avcodec_send_frame to work again. Unfortunately it checks AV_CODEC_CAP_ENCODER_FLUSH first, and if that capability is not set it returns without resetting the draining flag.

    So the only way to reuse the codec is to close and reopen it, which is wasteful, plus avcodec_close is deprecated. Am I missing something?

    int attribute_align_arg avcodec_send_frame(AVCodecContext *avctx, const AVFrame *frame)
    {
        AVCodecInternal *avci = avctx->internal;
        int ret;
    
        if (!avcodec_is_open(avctx) || !av_codec_is_encoder(avctx->codec))
            return AVERROR(EINVAL);
    
        /* Once draining has started, new frames are rejected; only
         * avcodec_flush_buffers() clears avci->draining again. */
        if (avci->draining)
            return AVERROR_EOF;
    
        if (avci->buffer_frame->buf[0])
            return AVERROR(EAGAIN);
    
        if (!frame) {
            avci->draining = 1;
        } else {
            ret = encode_send_frame_internal(avctx, frame);
            if (ret < 0)
                return ret;
        }
    
        if (!avci->buffer_pkt->data && !avci->buffer_pkt->side_data) {
            ret = encode_receive_packet_internal(avctx, avci->buffer_pkt);
            if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
                return ret;
        }
    
        avctx->frame_num++;
    
        return 0;
    }
    
    void avcodec_flush_buffers(AVCodecContext *avctx)
    {
        AVCodecInternal *avci = avctx->internal;
    
        if (av_codec_is_encoder(avctx->codec)) {
            int caps = avctx->codec->capabilities;
    
            if (!(caps & AV_CODEC_CAP_ENCODER_FLUSH)) {
                // Only encoders that explicitly declare support for it can be
                // flushed. Otherwise, this is a no-op.
                av_log(avctx, AV_LOG_WARNING, "Ignoring attempt to flush encoder "
                       "that doesn't support it\n");
                return;
            }
            ff_encode_flush_buffers(avctx);
        } else
            ff_decode_flush_buffers(avctx);
    
        /* The draining flag is only ever reset here; an encoder without
         * AV_CODEC_CAP_ENCODER_FLUSH never reaches this point because of
         * the early return above. */
        avci->draining      = 0;
        avci->draining_done = 0;
        if (avci->buffer_frame)
            av_frame_unref(avci->buffer_frame);
        if (avci->buffer_pkt)
            av_packet_unref(avci->buffer_pkt);
    
        if (HAVE_THREADS && avctx->active_thread_type & FF_THREAD_FRAME &&
            !avci->is_frame_mt)
            ff_thread_flush(avctx);
        else if (ffcodec(avctx->codec)->flush)
            ffcodec(avctx->codec)->flush(avctx);
    }
    
  • How to automatically rotate video based on camera orientation while recording? [closed]

    7 April, by jestrabikr

    I am developing a Mediasoup SFU and a client web app; on the server I record the client's stream by sending it to FFmpeg as plain RTP. FFmpeg creates an HLS recording (.m3u8 and .ts files), because I need to be able to switch between the WebRTC live stream and the HLS recording before the live stream ends.

    My problem is that when I test the app and rotate my phone 90 degrees, the recording's aspect ratio stays the same but the image inside it is rotated (as shown in images 1.1, 1.2 and 1.3 below). I need the recording to change aspect ratio dynamically according to the camera orientation. How can I do that using FFmpeg?

    On the live stream it works perfectly fine (as shown in images 2.1 and 2.2 below): when the phone is rotated, the aspect ratio changes and the video is shown correctly. I think it works on the live stream because WebRTC apparently signals orientation changes somehow (but that signaling does not carry over into the recording).

    These are my ffmpeg command arguments for recording (version 6.1.1-3ubuntu5):

    let commandArgs = [
          "-loglevel", "info",
          "-protocol_whitelist", "pipe,udp,rtp",
          "-fflags", "+genpts+discardcorrupt",
          "-reinit_filter", "1",
          "-strict", "-2",
          "-f", "sdp",
          "-i", "pipe:0",
          "-map", "0:v:0",
          "-c:v", "libx264",
          "-b:v", "1500k",
          "-profile:v", "high",
          "-level:v", "4.1",
          "-pix_fmt", "yuv420p",
          "-g", "30",
          "-map", "0:a:0",
          "-c:a", "aac",
          "-b:a", "128k",
          "-movflags", "+frag_keyframe+empty_moov",
          "-f", "hls",
          "-hls_time", "4",
          "-hls_list_size", "0",
          "-hls_flags", "split_by_time",
          `${filePath}.m3u8`
        ];
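
    WebRTC senders typically do not rotate the pixels they encode; rotation travels as the urn:3gpp:video-orientation RTP header extension, which the receiving browser applies at render time but ffmpeg's RTP demuxer ignores, hence the sideways recording. A possible fix (an assumption based on how mediasoup negotiates capabilities, not something verified here) is to strip urn:3gpp:video-orientation from the router's rtpCapabilities so the phone has to encode already-rotated frames; the incoming video then genuinely changes resolution on rotation, and the recording side can absorb that by padding to a fixed square canvas. A sketch of the equivalent ffmpeg command, with abbreviated flags and an example output name:

    ffmpeg -protocol_whitelist pipe,udp,rtp -fflags +genpts+discardcorrupt \
      -reinit_filter 1 -f sdp -i pipe:0 \
      -map 0:v:0 -c:v libx264 -b:v 1500k -pix_fmt yuv420p \
      -vf "scale=1280:1280:force_original_aspect_ratio=decrease,pad=1280:1280:(ow-iw)/2:(oh-ih)/2" \
      -map 0:a:0 -c:a aac -b:a 128k \
      -f hls -hls_time 4 -hls_list_size 0 -hls_flags split_by_time recording.m3u8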
    
    • Image 1.1 - Portrait mode in recording

    • Image 1.2 - Landscape mode in recording (rotated 90deg to my left side - front camera is on my left side)

    • Image 1.3 - Landscape mode in recording (rotated 90deg to my right side)

    • Image 2.1 - Portrait mode in live stream (correct behavior)

    • Image 2.2 - Landscape mode in live stream (correct behavior)

  • ffmpeg: error while loading shared libraries: libnppig.so.12: cannot open shared object file on CentOS [closed]

    7 April, by Rahib Rasheed

    I'm trying to install ffmpeg on a CentOS server, but I'm running into a shared library error when I try to run it:

    ffmpeg: error while loading shared libraries: libnppig.so.12: cannot open shared object file: No such file or directory
    

    Running ldd shows some missing dependencies:

    ldd $(which ffmpeg) | grep libnpp
            libnppig.so.12 => not found
            libnppicc.so.12 => /lib64/libnppicc.so.12
            libnppidei.so.12 => /lib64/libnppidei.so.12
            libnppif.so.12 => not found
            libnppc.so.12 => /lib64/libnppc.so.12
    

    I’ve tried searching for the missing libraries but couldn’t find the correct packages for CentOS. What’s the best way to resolve this issue?
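
    The libnpp* libraries are NVIDIA Performance Primitives runtimes; they ship with the CUDA toolkit (ffmpeg uses them for the scale_npp family of filters), not with any CentOS base repository. A sketch of the usual fix, assuming NVIDIA's CUDA yum repository is enabled; the 12-4 version suffix is only an example and has to match the .so.12 major version:

    # install the NPP runtime from NVIDIA's CUDA repository
    sudo yum install -y libnpp-12-4
    # or, if a CUDA 12 toolkit is already installed under /usr/local/cuda:
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    ldd "$(which ffmpeg)" | grep libnpp    # verify nothing is "not found"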