Newest 'ffmpeg' Questions - Stack Overflow

http://stackoverflow.com/questions/tagged/ffmpeg

Articles published on the site

  • pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

    9 April, by azail765

    This script works on a 30-second WAV file, but not on a 10-minute phone call that is also in WAV format. Any help would be appreciated.

    I've downloaded ffmpeg.

    # Import necessary libraries 
    from pydub import AudioSegment 
    import speech_recognition as sr 
    import os
    import pydub
    
    
    chunk_count = 0
    directory = os.fsencode(r'C:\Users\zach.blair\Downloads\speechRecognition\New folder')
    # Text file to write the recognized audio 
    fh = open("recognized.txt", "w+")
    for file in os.listdir(directory):
        filename = os.fsdecode(file)
        if filename.endswith(".wav"):
            chunk_count += 1
            # Input audio file to be sliced 
            audio = AudioSegment.from_file(filename, format="wav") 
              
            ''' 
            Step #1 - Slicing the audio file into smaller chunks. 
            '''
            # Length of the audiofile in milliseconds 
            n = len(audio) 
              
            # Variable to count the number of sliced chunks 
            counter = 1
              
             
              
            # Interval length at which to slice the audio file. 
            interval = 20 * 1000
              
            # Length of audio to overlap.  
            overlap = 1 * 1000
              
            # Initialize start and end seconds to 0 
            start = 0
            end = 0
              
            # Flag to keep track of end of file. 
            # When audio reaches its end, flag is set to 1 and we break 
            flag = 0
              
            # Iterate from 0 to end of the file, 
            # with increment = interval 
            for i in range(0, 2 * n, interval): 
                  
                # During first iteration, 
                # start is 0, end is the interval 
                if i == 0: 
                    start = 0
                    end = interval 
              
                # All other iterations, 
                # start is the previous end - overlap 
                # end becomes end + interval 
                else: 
                    start = end - overlap 
                    end = start + interval  
              
                # When end becomes greater than the file length, 
                # end is set to the file length 
                # flag is set to 1 to indicate break. 
                if end >= n: 
                    end = n 
                    flag = 1
              
                # Storing audio file from the defined start to end 
                chunk = audio[start:end] 
              
                # Filename / Path to store the sliced audio 
                chunk_filename = str(chunk_count)+'chunk'+str(counter)+'.wav'
              
                # Store the sliced audio file to the defined path 
                chunk.export(chunk_filename, format="wav") 
                # Print information about the current chunk 
                print(str(chunk_count)+str(counter)+". Start = "
                                    +str(start)+" end = "+str(end)) 
              
                # Increment counter for the next chunk 
                counter = counter + 1
                  
              
                AUDIO_FILE = chunk_filename 
                
                # Initialize the recognizer 
                r = sr.Recognizer() 
              
                # Traverse the audio file and listen to the audio 
                with sr.AudioFile(AUDIO_FILE) as source: 
                    audio_listened = r.listen(source) 
              
                # Try to recognize the listened audio 
                # and catch exceptions. 
                try:     
                    rec = r.recognize_google(audio_listened) 
                      
                    # If recognized, write into the file. 
                    fh.write(rec+" ") 
                  
                # If google could not understand the audio 
                except sr.UnknownValueError: 
                    print("Empty Value") 
              
                # If the results cannot be requested from Google. 
                # Probably an internet connection error. 
                except sr.RequestError as e: 
                    print("Could not request results: {0}".format(e)) 
              
                # Check for flag. 
                # If flag is 1, the end of the whole audio file was reached, 
                # so break out of the chunk loop. 
                if flag == 1: 
                    break 

    fh.close()    
    

    I get this error on audio = AudioSegment.from_file(filename,format="wav"):

    Traceback (most recent call last):
      File "C:\Users\zach.blair\Downloads\speechRecognition\New folder\speechRecognition3.py", line 17, in 
        audio = AudioSegment.from_file(filename,format="wav")
      File "C:\Users\zach.blair\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pydub\audio_segment.py", line 704, in from_file
        p.returncode, p_err))
    pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
    

    Output from ffmpeg/avlib:

      ffmpeg version N-95027-g8c90bb8ebb Copyright (c) 2000-2019 the FFmpeg developers
      built with gcc 9.2.1 (GCC) 20190918
      configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
      libavutil      56. 35.100 / 56. 35.100
      libavcodec     58. 58.101 / 58. 58.101
      libavformat    58. 33.100 / 58. 33.100
      libavdevice    58.  9.100 / 58.  9.100
      libavfilter     7. 58.102 /  7. 58.102
      libswscale      5.  6.100 /  5.  6.100
      libswresample   3.  6.100 /  3.  6.100
      libpostproc    55.  6.100 / 55.  6.100
    Guessed Channel Layout for Input Stream #0.0 : mono
    Input #0, wav, from '2a.wav.wav':
      Duration: 00:09:52.95, bitrate: 64 kb/s
        Stream #0:0: Audio: pcm_mulaw ([7][0][0][0] / 0x0007), 8000 Hz, mono, s16, 64 kb/s
    Stream mapping:
      Stream #0:0 -> #0:0 (pcm_mulaw (native) -> pcm_s8 (native))
    Press [q] to stop, [?] for help
    [wav @ 0000024307974400] pcm_s8 codec not supported in WAVE format
    Could not write header for output file #0 (incorrect codec parameters ?): Function not implemented
    Error initializing output stream 0:0 -- 
    Conversion failed!
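
    The traceback comes from pydub, but the real failure is in the ffmpeg output above: pydub converts the file with ffmpeg when loading it, and that conversion picks pcm_s8 as the target, which the WAV muxer refuses ("pcm_s8 codec not supported in WAVE format"). The log shows the phone calls are 8 kHz pcm_mulaw, presumably unlike the 30-second files that work. One workaround sketch, using the file name from the log (2a_s16.wav is just an example output name): transcode the calls to plain 16-bit PCM once, then point pydub at the converted copy.

    ffmpeg -i 2a.wav.wav -acodec pcm_s16le 2a_s16.wav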
    
  • How to dump ALL metadata from a media file, including cover image title? [closed]

    9 April, by Unideal

    I have an MP3 song:

    # ffprobe -hide_banner -i filename.mp3
    Input #0, mp3, from 'filename.mp3':
      Metadata:
        composer        : Music Author
        title           : Song Name
        artist          : Singer
        encoder         : Lavf61.7.100
        genre           : Rock
        date            : 2025
      Duration: 00:03:14.04, start: 0.023021, bitrate: 208 kb/s
      Stream #0:0: Audio: mp3 (mp3float), 48000 Hz, stereo, fltp, 192 kb/s
          Metadata:
            encoder         : Lavc61.19
      Stream #0:1: Video: png, rgb24(pc, gbr/unknown/unknown), 600x600 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
          Metadata:
            title           : Cover
            comment         : Cover (front)
    

    The task is to save its metadata to a text file and restore it from that file later. Both goals should be accomplished with ffmpeg.

    The simplest method is to run:

    # ffmpeg -i filename.mp3 -f ffmetadata metadata.txt
    

    After that, metadata.txt contains:

    ;FFMETADATA1
    composer=Music Author
    title=Song Name
    artist=Singer
    date=2025
    genre=Rock
    encoder=Lavf61.7.100
    

    I got the global metadata only; the stream-specific info (the cover image title and comment in my case) is missing.

    Google suggested a more complex form of the command above to extract all metadata fields without any exclusions:

    # ffmpeg -y -i filename.mp3 -c copy -map_metadata 0 -map_metadata:s:v 0:s:v -map_metadata:s:a 0:s:a -f ffmetadata metadata.txt
    

    But the output is exactly the same:

    ;FFMETADATA1
    composer=Music Author
    title=Song Name
    artist=Singer
    date=2025
    genre=Rock
    encoder=Lavf61.7.100
    

    Again, no info about the attached image.

    Please explain what I am doing wrong.
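
    As far as I can tell, an ffmetadata dump maps no output streams at all, so the -map_metadata:s:v and -map_metadata:s:a options have no stream sections to fill in; only the global metadata ever gets written. A sketch of two possible ways around this (cover_meta.txt and metadata.json are example names): promote the attached picture's stream metadata into the global section of a second dump, using the documented stream-to-global form of -map_metadata, or dump every format- and stream-level tag with ffprobe for inspection.

    # ffmpeg -i filename.mp3 -map_metadata 0:s:1 -f ffmetadata cover_meta.txt
    # ffprobe -v quiet -print_format json -show_format -show_streams filename.mp3 > metadata.json

    For the restore step, the reverse mapping (e.g. -map_metadata:s:v 1 with cover_meta.txt as a second input while remuxing) should write those tags back onto the picture stream.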

  • some codecs (e.g. libx264) cannot be reused after draining

    7 April, by Neddie

    After draining an encoder (by sending it a null frame and then receiving packets until EOF), it stays in draining mode, in which avcodec_send_frame fails, returning AVERROR_EOF. You're supposed to call avcodec_flush_buffers, which among other things resets the internal draining flag to 0, allowing avcodec_send_frame to work again. Unfortunately it checks AV_CODEC_CAP_ENCODER_FLUSH first, and if that capability is not set it returns without resetting the draining flag.

    So the only way to reuse the codec is to close and reopen it, which is wasteful, plus avcodec_close is deprecated. Am I missing something?

    int attribute_align_arg avcodec_send_frame(AVCodecContext *avctx, const AVFrame *frame)
    {
        AVCodecInternal *avci = avctx->internal;
        int ret;
    
        if (!avcodec_is_open(avctx) || !av_codec_is_encoder(avctx->codec))
            return AVERROR(EINVAL);
    
        /* Once draining has started, new frames are rejected; only
         * avcodec_flush_buffers() clears avci->draining again. */
        if (avci->draining)
            return AVERROR_EOF;
    
        if (avci->buffer_frame->buf[0])
            return AVERROR(EAGAIN);
    
        if (!frame) {
            avci->draining = 1;
        } else {
            ret = encode_send_frame_internal(avctx, frame);
            if (ret < 0)
                return ret;
        }
    
        if (!avci->buffer_pkt->data && !avci->buffer_pkt->side_data) {
            ret = encode_receive_packet_internal(avctx, avci->buffer_pkt);
            if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
                return ret;
        }
    
        avctx->frame_num++;
    
        return 0;
    }
    
    void avcodec_flush_buffers(AVCodecContext *avctx)
    {
        AVCodecInternal *avci = avctx->internal;
    
        if (av_codec_is_encoder(avctx->codec)) {
            int caps = avctx->codec->capabilities;
    
            if (!(caps & AV_CODEC_CAP_ENCODER_FLUSH)) {
                // Only encoders that explicitly declare support for it can be
                // flushed. Otherwise, this is a no-op.
                av_log(avctx, AV_LOG_WARNING, "Ignoring attempt to flush encoder "
                       "that doesn't support it\n");
                return;
            }
            ff_encode_flush_buffers(avctx);
        } else
            ff_decode_flush_buffers(avctx);
    
        /* The draining flag is only ever reset here; an encoder without
         * AV_CODEC_CAP_ENCODER_FLUSH never reaches this point because of
         * the early return above. */
        avci->draining      = 0;
        avci->draining_done = 0;
        if (avci->buffer_frame)
            av_frame_unref(avci->buffer_frame);
        if (avci->buffer_pkt)
            av_packet_unref(avci->buffer_pkt);
    
        if (HAVE_THREADS && avctx->active_thread_type & FF_THREAD_FRAME &&
            !avci->is_frame_mt)
            ff_thread_flush(avctx);
        else if (ffcodec(avctx->codec)->flush)
            ffcodec(avctx->codec)->flush(avctx);
    }
    
  • How to automatically rotate video based on camera orientation while recording? [closed]

    7 April, by jestrabikr

    I am developing a Mediasoup SFU and a client web app; on the server I record the client's stream by sending it to FFmpeg as plain RTP. FFmpeg creates an HLS recording (.m3u8 and .ts files), because I need to be able to switch between the WebRTC live stream and the HLS recording before the live stream ends.

    My problem is that when I test the app and rotate my phone 90 degrees, the recording's aspect ratio stays the same but the image inside it is rotated (as shown in images 1.1, 1.2 and 1.3 below). I need the recording to change aspect ratio dynamically according to the camera orientation. How can I do that using FFmpeg?

    On the live stream it works perfectly fine (as shown in images 2.1 and 2.2 below): when the phone is rotated, the aspect ratio changes and the video is shown correctly. I think it works on the live stream because WebRTC apparently signals orientation changes somehow (but that signaling does not carry over into the recording).

    These are my ffmpeg command arguments for recording (version 6.1.1-3ubuntu5):

    let commandArgs = [
          "-loglevel", "info",
          "-protocol_whitelist", "pipe,udp,rtp",
          "-fflags", "+genpts+discardcorrupt",
          "-reinit_filter", "1",
          "-strict", "-2",
          "-f", "sdp",
          "-i", "pipe:0",
          "-map", "0:v:0",
          "-c:v", "libx264",
          "-b:v", "1500k",
          "-profile:v", "high",
          "-level:v", "4.1",
          "-pix_fmt", "yuv420p",
          "-g", "30",
          "-map", "0:a:0",
          "-c:a", "aac",
          "-b:a", "128k",
          "-movflags", "+frag_keyframe+empty_moov",
          "-f", "hls",
          "-hls_time", "4",
          "-hls_list_size", "0",
          "-hls_flags", "split_by_time",
          `${filePath}.m3u8`
        ];
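
    WebRTC senders typically do not rotate the pixels they encode; rotation travels as the urn:3gpp:video-orientation RTP header extension, which the receiving browser applies at render time but ffmpeg's RTP demuxer ignores, hence the sideways recording. A possible fix (an assumption based on how mediasoup negotiates capabilities, not something verified here) is to strip urn:3gpp:video-orientation from the router's rtpCapabilities so the phone has to encode already-rotated frames; the incoming video then genuinely changes resolution on rotation, and the recording side can absorb that by padding to a fixed square canvas. A sketch of the equivalent ffmpeg command, with abbreviated flags and an example output name:

    ffmpeg -protocol_whitelist pipe,udp,rtp -fflags +genpts+discardcorrupt \
      -reinit_filter 1 -f sdp -i pipe:0 \
      -map 0:v:0 -c:v libx264 -b:v 1500k -pix_fmt yuv420p \
      -vf "scale=1280:1280:force_original_aspect_ratio=decrease,pad=1280:1280:(ow-iw)/2:(oh-ih)/2" \
      -map 0:a:0 -c:a aac -b:a 128k \
      -f hls -hls_time 4 -hls_list_size 0 -hls_flags split_by_time recording.m3u8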
    
    • Image 1.1 - Portrait mode in recording

    • Image 1.2 - Landscape mode in recording (rotated 90deg to my left side - front camera is on my left side)

    • Image 1.3 - Landscape mode in recording (rotated 90deg to my right side)

    • Image 2.1 - Portrait mode in live stream (correct behavior)

    • Image 2.2 - Landscape mode in live stream (correct behavior)

  • ffmpeg: error while loading shared libraries: libnppig.so.12: cannot open shared object file on CentOS [closed]

    7 April, by Rahib Rasheed

    I'm trying to install ffmpeg on a CentOS server, but I'm running into a shared library error when I try to run it:

    ffmpeg: error while loading shared libraries: libnppig.so.12: cannot open shared object file: No such file or directory
    

    Running ldd shows some missing dependencies:

    ldd $(which ffmpeg) | grep libnpp
            libnppig.so.12 => not found
            libnppicc.so.12 => /lib64/libnppicc.so.12
            libnppidei.so.12 => /lib64/libnppidei.so.12
            libnppif.so.12 => not found
            libnppc.so.12 => /lib64/libnppc.so.12
    

    I’ve tried searching for the missing libraries but couldn’t find the correct packages for CentOS. What’s the best way to resolve this issue?
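
    The libnpp* libraries are NVIDIA Performance Primitives runtimes; they ship with the CUDA toolkit (ffmpeg uses them for the scale_npp family of filters), not with any CentOS base repository. A sketch of the usual fix, assuming NVIDIA's CUDA yum repository is enabled; the 12-4 version suffix is only an example and has to match the .so.12 major version:

    # install the NPP runtime from NVIDIA's CUDA repository
    sudo yum install -y libnpp-12-4
    # or, if a CUDA 12 toolkit is already installed under /usr/local/cuda:
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    ldd "$(which ffmpeg)" | grep libnpp    # verify nothing is "not found"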