On other sites (352)

  • Audio loading error when running the OpenAI Whisper model

    June 9, by John mick

    The problem I'm trying to solve is that I can't run the Whisper model on some audio files; it fails with an error related to audio decoding.

    


    payload.wav: Invalid data found when processing input.
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e


    


    I tried micro-machines.wav and it works fine, but when I use other audio it gives me an error.

    


    import whisper

    model = whisper.load_model("base")

    # This file transcribes without problems.
    text = model.transcribe('micro-machines.wav', fp16=False)
    print(text)

    # This file fails with "Failed to load audio".
    text = model.transcribe('payload.wav', fp16=False)
    print(text)


    


    The error I'm getting for payload.wav:

    


    d:\...\venv\lib\site-packages\whisper\transcribe.py:79: UserWarning: FP16 is not supported on CPU; using FP32 instead
      warnings.warn("FP16 is not supported on CPU; using FP32 instead")
    Traceback (most recent call last):
      File "d:\...\venv\lib\site-packages\whisper\audio.py", line 42, in load_audio
        ffmpeg.input(file, threads=0)
      File "d:\...\venv\lib\site-packages\ffmpeg\_run.py", line 325, in run
        raise Error('ffmpeg', out, err)
    ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "C:\....\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "C:\.....\Python\Python39\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "D:\...\venv\Scripts\whisper.exe\__main__.py", line 7, in <module>
      File "d:\...\venv\lib\site-packages\whisper\transcribe.py", line 314, in cli
        result = transcribe(model, audio_path, temperature=temperature, **args)
      File "d:\...\venv\lib\site-packages\whisper\transcribe.py", line 85, in transcribe
        mel = log_mel_spectrogram(audio)
      File "d:\...\venv\lib\site-packages\whisper\audio.py", line 111, in log_mel_spectrogram
        audio = load_audio(audio)
      File "d:\...\venv\lib\site-packages\whisper\audio.py", line 47, in load_audio
        raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
    RuntimeError: Failed to load audio: ffmpeg version 6.0-essentials_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
      built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
      configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
      libavutil      58.  2.100 / 58.  2.100
      libavcodec     60.  3.100 / 60.  3.100
      libavformat    60.  3.100 / 60.  3.100
      libavdevice    60.  1.100 / 60.  1.100
      libavfilter     9.  3.100 /  9.  3.100
      libswscale      7.  1.100 /  7.  1.100
      libswresample   4. 10.100 /  4. 10.100
      libpostproc    57.  1.100 / 57.  1.100
    payload.wav: Invalid data found when processing input
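
    The last line of the ffmpeg output, "payload.wav: Invalid data found when processing input", typically appears when ffmpeg cannot parse the input as a format it recognises. One quick check, sketched here under the assumption that payload.wav might not really be a WAV file despite its extension, is to look at the file's first bytes. The function names are hypothetical and are not part of Whisper or ffmpeg; the signatures tested are the usual magic numbers for RIFF/WAVE, ID3-tagged MP3, Ogg, FLAC and MP4-family containers.

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file.

    Hypothetical helper for illustration; it covers only a few common formats.
    """
    # A WAV file starts with "RIFF", a 4-byte size, then "WAVE".
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3 files often begin with an ID3v2 tag.
    if header[:3] == b"ID3":
        return "mp3 (ID3 tag)"
    if header[:4] == b"OggS":
        return "ogg"
    if header[:4] == b"fLaC":
        return "flac"
    # MP4/M4A files carry an "ftyp" box type at byte offset 4.
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4/m4a"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_audio_container(handle.read(12))
```

    Calling sniff_file("payload.wav") and getting anything other than "wav" would suggest the file is mislabelled, empty or truncated, which would be consistent with the error above but does not prove it.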


    I searched for solutions and found one that says the code apparently failed to load the audio file for some reason, and even failed to display that error because e.stderr did not contain a valid UTF-8 string.

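
    If the concern is that e.stderr.decode() itself can fail on bytes that are not valid UTF-8, a tolerant decode is one way to still see the message. This is a minimal sketch, not Whisper's own code; decode_stderr is a hypothetical helper name.

```python
def decode_stderr(raw: bytes) -> str:
    """Decode process output without raising on invalid UTF-8.

    Invalid byte sequences are replaced with U+FFFD instead of
    triggering UnicodeDecodeError, so the rest of the message survives.
    """
    return raw.decode("utf-8", errors="replace")
```

    A strict raw.decode() raises UnicodeDecodeError on such input, whereas this version returns the readable part of the text with a replacement character where the bad bytes were.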

  • When I use ffprobe to check a video stream, I get the error below [closed]

    May 21, by dongrixinyu

    I ran into a problem when running ffprobe on a video stream and when decoding it.

    Here is the log:


    ffprobe version 6.1.1 Copyright (c) 2007-2023 the FFmpeg developers
      built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
      configuration: --enable-gpl --enable-version3 --enable-shared --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-libsnappy --enable-zlib --enable-libsrt --enable-libssh --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libdavs2 --enable-libzvbi --enable-libwebp --enable-libx264 --enable-libx265 --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libmfx --enable-opencl --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libmysofa --enable-librubberband --enable-libsoxr
      libavutil      58. 29.100 / 58. 29.100
      libavcodec     60. 31.102 / 60. 31.102
      libavformat    60. 16.100 / 60. 16.100
      libavdevice    60.  3.100 / 60.  3.100
      libavfilter     9. 12.100 /  9. 12.100
      libswscale      7.  5.100 /  7.  5.100
      libswresample   4. 12.100 /  4. 12.100
      libpostproc    57.  3.100 / 57.  3.100
    [NULL @ 0x5595d3e72040] illegal reordering_of_pic_nums_idc 7
    [h264 @ 0x5595d3e72040] illegal modification_of_pic_nums_idc 7
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] reference overflow 66 > 15 or 0 > 15
        Last message repeated 1 times
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] chroma_log2_weight_denom 27 is out of range
        Last message repeated 1 times
    [h264 @ 0x5595d3e72040] Missing reference picture, default is 4
    [h264 @ 0x5595d3e72040] concealing 8144 DC, 8144 AC, 8144 MV errors in P frame
    [h264 @ 0x5595d3e72040] top block unavailable for requested intra mode
    [h264 @ 0x5595d3e72040] error while decoding MB 4 0, bytestream 12113
    [h264 @ 0x5595d3e72040] concealing 8160 DC, 8160 AC, 8160 MV errors in P frame
    [h264 @ 0x5595d3e72040] illegal short term buffer state detected
    [h264 @ 0x5595d3e72040] top block unavailable for requested intra mode -1
    [h264 @ 0x5595d3e72040] error while decoding MB 1 0, bytestream 9617
    [h264 @ 0x5595d3e72040] concealing 8160 DC, 8160 AC, 8160 MV errors in P frame
    [h264 @ 0x5595d3e72040] illegal short term buffer state detected
    [h264 @ 0x5595d3e72040] luma_log2_weight_denom 15 is out of range
        Last message repeated 1 times
    [h264 @ 0x5595d3e72040] top block unavailable for requested intra mode
    [h264 @ 0x5595d3e72040] error while decoding MB 4 0, bytestream 12323
    [h264 @ 0x5595d3e72040] concealing 8160 DC, 8160 AC, 8160 MV errors in P frame
    [h264 @ 0x5595d3e72040] illegal short term buffer state detected
    [h264 @ 0x5595d3e72040] top block unavailable for requested intra mode -1
    [h264 @ 0x5595d3e72040] error while decoding MB 27 0, bytestream 12229
    [h264 @ 0x5595d3e72040] concealing 8160 DC, 8160 AC, 8160 MV errors in P frame
    [h264 @ 0x5595d3e72040] illegal short term buffer state detected
    [h264 @ 0x5595d3e72040] illegal reordering_of_pic_nums_idc 15
    [h264 @ 0x5595d3e72040] illegal modification_of_pic_nums_idc 15
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] reference count 1 overflow
    [h264 @ 0x5595d3e72040] reference count overflow
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] luma_log2_weight_denom 31 is out of range
    [h264 @ 0x5595d3e72040] illegal memory management control operation 21
    [h264 @ 0x5595d3e72040] luma_log2_weight_denom 31 is out of range
    [h264 @ 0x5595d3e72040] illegal memory management control operation 21
    [h264 @ 0x5595d3e72040] deblocking filter parameters -7 0 out of range
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] Reference 6 >= 3
    [h264 @ 0x5595d3e72040] error while decoding MB 29 0, bytestream 8581
    [h264 @ 0x5595d3e72040] concealing 8160 DC, 8160 AC, 8160 MV errors in P frame
    [h264 @ 0x5595d3e72040] number of reference frames (0+4) exceeds max (3; probably corrupt input), discarding one
    [h264 @ 0x5595d3e72040] chroma_log2_weight_denom 15 is out of range
        Last message repeated 1 times
    [h264 @ 0x5595d3e72040] deblocking_filter_idc 13 out of range
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] deblocking_filter_idc 32 out of range
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] illegal reordering_of_pic_nums_idc 31
    [h264 @ 0x5595d3e72040] illegal modification_of_pic_nums_idc 31
    [h264 @ 0x5595d3e72040] decode_slice_header error
    [h264 @ 0x5595d3e72040] no frame!
    [h264 @ 0x5595d3e72040] illegal reordering_of_pic_nums_idc 6
    [h264 @ 0x5595d3e72040] illegal modification_of_pic_nums_idc 6
    [h264 @ 0x5595d3e72040] decode_slice_header error
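
    With a log this repetitive, it can help to see which messages dominate before digging into any single one. The following is a rough sketch of a tally over lines shaped like "[h264 @ 0x...] message". The helper name tally_messages is hypothetical, and standalone numbers are collapsed to "N" so that, for example, "idc 7" and "idc 15" count as the same kind of message.

```python
import re
from collections import Counter

# Matches lines such as "[h264 @ 0x5595d3e72040] no frame!"
LOG_LINE = re.compile(r"\[(\w+) @ 0x[0-9a-fA-F]+\]\s+(.+)")

# Matches standalone numbers only, so identifiers like "log2" stay intact.
STANDALONE_NUMBER = re.compile(r"(?<!\w)-?\d+(?!\w)")


def tally_messages(log_text: str) -> Counter:
    """Count decoder messages per (component, normalised message) pair."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            # Skips banner lines and "Last message repeated ..." lines.
            continue
        component, message = match.groups()
        counts[(component, STANDALONE_NUMBER.sub("N", message))] += 1
    return counts
```

    Passing the log text above to tally_messages and printing counts.most_common(5) gives a quick view of the most frequent messages.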

    I ran both ffprobe xxxx.mp4 and the avcodec receive-frame functions to decode one frame, and both reproduced the same error log.

    However, the file plays fine when I open it in VLC or OBS. So:

    • Did I make a mistake when configuring the ffmpeg options?
    • How can I fix this problem?

    I have uploaded a piece of the MP4 file here: mp4 link

  • Amplification of recorded audio in a Flutter app using FFmpeg is not working correctly

    May 20, by Noman khanbhai

    In my app I need to record audio and send it to a server. The server then sends the file to a hardware device over MQTT, and the device plays it. I am building the app with Flutter, using the record 5.0.5 package for recording and the ffmpeg_kit_flutter 6.0.3 package for amplification.

    The issue is that the amplitude barely seems to change: I tried different values for the amplification factor, but the audio sounds the same.

    Here is the code for amplification:


    Future<String>? amplifyAudio(
        String inputPath, String outputPath) async {

      // Build FFmpeg command to amplify audio
      outputPath = await modifyOutputPath(inputPath)!;
      String audioFilter = 'volume=${amplificationFactor}dB';
      //-c:a aac
      String command = '-i $inputPath -af $audioFilter $outputPath';

      // Execute FFmpeg command
      await FFmpegKit.executeAsync(command).then((session) async {
        debugPrint("After executeAsync session ${session.toString()}");
        debugPrint(
            "After executeAsync returncode ${await session.getReturnCode()}");
        debugPrint("After executeAsync command ${session.getCommand()}");
        log("After executeAsync alllogs ${await session.getAllLogs()}");
        log("After executeAsync alllogstring ${await session.getAllLogsAsString()}");
        log("After executeAsync failStackTrace ${await session.getFailStackTrace()}");
      }).onError((error, stackTrace) {
        debugPrint("After executeAsync error ${error.toString()}");
      });

      return outputPath;
    }
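
    For reference, the filter string 'volume=${amplificationFactor}dB' expresses the gain in decibels, so the factor is not a plain multiplier. Below is a minimal sketch of the standard conversion between decibels and amplitude ratio, written in Python purely for illustration; db_to_linear and linear_to_db are hypothetical helpers, not part of ffmpeg_kit_flutter.

```python
import math


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to an amplitude ratio: 10 ** (dB / 20)."""
    return 10.0 ** (gain_db / 20.0)


def linear_to_db(ratio: float) -> float:
    """Convert a positive amplitude ratio to decibels: 20 * log10(ratio)."""
    if ratio <= 0:
        raise ValueError("amplitude ratio must be positive")
    return 20.0 * math.log10(ratio)
```

    By this formula a gain of 10 dB corresponds to an amplitude ratio of about 3.16, and doubling the amplitude corresponds to about 6 dB.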

    These are the logs produced when the method above is executed:


    FFMpeg command -> `-i /data/user/0/com.orgname.flutter.appname/app_flutter/1716209206469.aac -af volume=10.0dB /storage/emulated/0/Download/1716209213238_amplified.aac`

    > Logs
    > After executeAsync alllogstring ffmpeg version n6.0 Copyright (c) 2000-2023 the FFmpeg developers
    > built with Android (7155654, based on r399163b1) clang version 11.0.5 (https://android.googlesource.com/toolchain/llvm-project 87f1315dfbea7c137aa2e6d362dbb457e388158d)
    > configuration: --cross-prefix=aarch64-linux-android- --sysroot=/Users/sue/Library/Android/sdk/ndk/22.1.7171670/toolchains/llvm/prebuilt/darwin-x86_64/sysroot --prefix=/Users/sue/Projects/arthenica/ffmpeg-kit/prebuilt/android-arm64/ffmpeg --pkg-config=/opt/homebrew/bin/pkg-config --enable-version3 --arch=aarch64 --cpu=armv8-a --target-os=android --enable-neon --enable-asm --enable-inline-asm --ar=aarch64-linux-android-ar --cc=aarch64-linux-android24-clang --cxx=aarch64-linux-android24-clang++ --ranlib=aarch64-linux-android-ranlib --strip=aarch64-linux-android-strip --nm=aarch64-linux-android-nm --extra-libs='-L/Users/sue/Projects/arthenica/ffmpeg-kit/prebuilt/android-arm64/cpu-features/lib -lndk_compat' --disable-autodetect --enable-cross-compile --enable-pic --enable-jni --enable-optimizations --enable-swscale --disable-static --enable-shared --enable-pthreads --enable-v4l2-m2m --disable-outdev=fbdev --disable-indev=fbdev --enable-small --disable-xmm-clobber-test --disable-debug --enable-lto --disable-neon-clobber-test --disable-programs --disable-postproc --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-sndio --disable-schannel --disable-securetransport --disable-xlib --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-videotoolbox --disable-audiotoolbox --disable-appkit --disable-alsa --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --enable-gmp --enable-gnutls --enable-iconv --disable-sdl2 --disable-openssl --enable-zlib --enable-mediacodec
    > libavutil      58.  2.100 / 58.  2.100
    > libavcodec     60.  3.100 / 60.  3.100
    > libavformat    60.  3.100 / 60.  3.100
    > libavdevice    60.  1.100 / 60.  1.100
    > libavfilter     9.  3.100 /  9.  3.100
    > libswscale      7.  1.100 /  7.  1.100
    > libswresample   4. 10.100 /  4. 10.100
    > Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/data/user/0/com.orgname.flutter.appname/app_flutter/1716209206469.aac':
    > Metadata:
    > major_brand     : mp42
    > minor_version   : 0
    > compatible_brands: isommp42
    > creation_time   : 2024-05-20T12:46:52.000000Z
    > com.android.version: 12
    > Duration: 00:00:04.76, start: 0.000000, bitrate: 131 kb/s
    > Stream #0:0[0x1](eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
    > Metadata:
    > creation_time   : 2024-05-20T12:46:52.000000Z
    > handler_name    : SoundHandle
    > vendor_id       : [0][0][0][0]
    > Stream mapping:
    > Stream #0:0 -> #0:0 (aac (native) -> aac (native))
    > Press [q] to stop, [?] for help

    Note: I also play the audio in the app after recording and before amplification, and save it to the Download folder, to make sure the recorded file is correct.

    The amplified file is saved as well, but there is almost no audible difference.
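
    To put a number on "almost no difference", one option is to decode both files to PCM and compare their RMS levels. The sketch below assumes the audio has already been decoded to signed 16-bit samples by some other means (decoding AAC is outside its scope); rms_dbfs is a hypothetical helper.

```python
import math
from typing import Sequence


def rms_dbfs(samples: Sequence[int], full_scale: int = 32768) -> float:
    """Return the RMS level of signed 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        # Digital silence has no finite level in dB.
        return float("-inf")
    return 10.0 * math.log10(mean_square / (full_scale * full_scale))
```

    If the filter took effect, rms_dbfs(amplified) - rms_dbfs(original) should land close to the requested gain in dB, provided the amplified signal does not clip.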

    I have also searched online and asked ChatGPT, but nothing has resolved the issue.