Media (91)

Other articles (51)

  • The SPIPmotion queue

    28 November 2010

    A queue stored in the database
    When it is installed, SPIPmotion creates a new table in the database named spip_spipmotion_attentes.
    This new table has the following fields: id_spipmotion_attente, the unique numeric identifier of the task to process; id_document, the numeric identifier of the original document to encode; id_objet, the unique identifier of the object to which the encoded document should be attached automatically; objet, the type of object to which (...)

  • MediaSPIP Init and Diogène: MediaSPIP publication types

    11 November 2010

    When a MediaSPIP site is installed, the MediaSPIP Init plugin performs several operations, chiefly creating four main sections in the site and five form templates for Diogène.
    These four main sections (also called sectors) are: Medias; Sites; Editos; Actualités.
    For each of these sections, a specific form template of the same name is created. For the "Medias" section, a second "catégorie" template is created, which allows adding (...)

  • Changing the graphic theme

    22 February 2011

    The graphic theme does not affect the actual layout of the elements on the page. It only changes their appearance.
    The placement of elements can indeed be changed, but the change is purely visual and does not affect the semantic representation of the page.
    Changing the graphic theme in use
    To change the graphic theme in use, the zen-garden plugin must be enabled on the site.
    Then simply go to the configuration area of the (...)

On other sites (4640)

  • FFmpeg-command freezes app on NET MAUI Android [closed]

    28 September 2024, by Christian Röder

    Cheers,

    I'm developing a .NET MAUI app where I want to create videos from a list of images.

    This is the command: "-y -f concat -safe 0 -analyzeduration 100M -probesize 50M -i {Path.Combine(basePath, "input.txt")} -vf \"scale=720:1280,setsar=1:1\" -vcodec mpeg4 -pix_fmt yuv420p {outputPath}"

    The problem is that my app freezes when the command is executed. I've tried many combinations of async/await and Task, but nothing changes. The video is created anyway.
    When I use a simple command like "-encoders", everything is fine.

    This is the log when creating a video out of three images:

    


    [ffmpeg-kit] ffmpeg version n6.0
[ffmpeg-kit]  Copyright (c) 2000-2023 the FFmpeg developers
[ffmpeg-kit] 
[ffmpeg-kit]   built with Android (7155654, based on r399163b1) clang version 11.0.5 (https://android.googlesource.com/toolchain/llvm-project 87f1315dfbea7c137aa2e6d362dbb457e388158d)
[ffmpeg-kit]   configuration: --cross-prefix=aarch64-linux-android- --sysroot=/Users/sue/Library/Android/sdk/ndk/22.1.7171670/toolchains/llvm/prebuilt/darwin-x86_64/sysroot --prefix=/Users/sue/Projects/arthenica/ffmpeg-kit/prebuilt/android-arm64/ffmpeg --pkg-config=/opt/homebrew/bin/pkg-config --enable-version3 --arch=aarch64 --cpu=armv8-a --target-os=android --enable-neon --enable-asm --enable-inline-asm --ar=aarch64-linux-android-ar --cc=aarch64-linux-android24-clang --cxx=aarch64-linux-android24-clang++ --ranlib=aarch64-linux-android-ranlib --strip=aarch64-linux-android-strip --nm=aarch64-linux-android-nm --extra-libs='-L/Users/sue/Projects/arthenica/ffmpeg-kit/prebuilt/android-arm64/cpu-features/lib -lndk_compat' --disable-autodetect --enable-cross-compile --enable-pic --enable-jni --enable-optimizations --enable-swscale --disable-static --enable-shared --enable-pthreads --enable-v4l2-m2m --disable-outdev=fbdev --disable-indev=fbdev --enable-small --disable-xmm-clobber-test --disable-debug --enable-lto --disable-neon-clobber-test --disable-programs --disable-postproc --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-sndio --disable-schannel --disable-securetransport --disable-xlib --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-videotoolbox --disable-audiotoolbox --disable-appkit --disable-alsa --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-gmp --enable-gnutls --enable-libmp3lame --enable-libass --enable-iconv --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libxml2 --enable-libopencore-amrnb --enable-libshine --enable-libspeex --enable-libdav1d --enable-libkvazaar --enable-libilbc --enable-libopus --enable-libsnappy --enable-libsoxr --enable-libtwolame --disable-sdl2 --enable-libvo-amrwbenc --enable-libzimg --disable-openssl 
--enable-zlib --enable-mediacodec
[ffmpeg-kit]   libavutil      58.  2.100 / 58.  2.100
[ffmpeg-kit]   libavcodec     60.  3.100 / 60.  3.100
[ffmpeg-kit]   libavformat    60.  3.100 / 60.  3.100
[ffmpeg-kit]   libavdevice    60.  1.100 / 60.  1.100
[ffmpeg-kit]   libavfilter     9.  3.100 /  9.  3.100
[ffmpeg-kit]   libswscale      7.  1.100 /  7.  1.100
[ffmpeg-kit]   libswresample   4. 10.100 /  4. 10.100
[ffmpeg-kit] Input #0, concat, from '/data/user/0/com.companyname.myapp/cache/input.txt':
[ffmpeg-kit]   Duration: 00:00:01.50, start: 0.000000, bitrate: 1 kb/s
[ffmpeg-kit]   Stream #0:0: Video: png, rgba(pc), 1080x1970, 25 fps, 25 tbr, 25 tbn
[ffmpeg-kit] Stream mapping:
[ffmpeg-kit]   Stream #0:0 -> #0:0 (png (native) -> mpeg4 (native))
[ffmpeg-kit] Press [q] to stop, [?] for help
[ffmpeg-kit] Output #0, mp4, to '/data/user/0/com.companyname.myapp/cache/myapp.mp4':
[ffmpeg-kit]   Metadata:
[ffmpeg-kit]     encoder         : Lavf60.3.100
[ffmpeg-kit]   Stream #0:0: Video: mpeg4 (mp4v / 0x7634706D), yuv420p(tv, unknown/bt709/iec61966-2-1, progressive), 720x1280 [SAR 1:1 DAR 9:16], q=2-31, 200 kb/s, 25 fps, 12800 tbn
[ffmpeg-kit]     Metadata:
[ffmpeg-kit]       encoder         : Lavc60.3.100 mpeg4
[ffmpeg-kit]     Side data:
[ffmpeg-kit]       cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
[ffmpeg-kit] frame=    0 fps=0.0 q=4.1 size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
[ffmpeg-kit] frame=   36 fps=0.0 q=2.7 Lsize=     268kB time=00:00:01.40 bitrate=1568.3kbits/s dup=33 drop=0 speed=7.18x
[ffmpeg-kit] video:267kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.382941%


    


    I've tried invoking the execute command in many different ways, such as Task.Run(() => ffmpegService.Execute(encoderCommand));

    


  • Twilio Real-Time Media Streaming to WebSocket Receives Only Noise Instead of Speech

    21 February, by dannym25

    I'm setting up a Twilio Voice call with real-time media streaming to a WebSocket server for speech-to-text processing using Google Cloud Speech-to-Text. The connection is established successfully, and I receive a continuous stream of audio data from Twilio. However, when I play back the received audio, all I hear is a rapid clicking/jackhammering noise instead of the actual speech spoken during the call.

    


    Setup:

    • Twilio sends inbound audio to my WebSocket server.
    • WebSocket receives and saves the raw mulaw-encoded audio data from Twilio.
    • The audio is processed via Google Speech-to-Text for transcription.
    • When I attempt to play back the audio, it sounds like machine-gun-like noise instead of spoken words.

    1. Confirmed WebSocket Receives Data

    • The WebSocket successfully logs incoming audio chunks from Twilio:

    🔊 Received 379 bytes of audio from Twilio
    🔊 Received 379 bytes of audio from Twilio

    • This suggests Twilio is sending audio data, but it's not being interpreted correctly.

    2. Saving and Playing Raw Audio

    • I save the incoming raw mulaw (8000Hz) audio from Twilio to a file:

    fs.appendFileSync('twilio-audio.raw', message);

    • Then, I convert it to a .wav file using FFmpeg:

    ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav

    Problem: When I play the audio using ffplay, it contains no speech, only rapid clicking sounds.

    3. Ensured Correct Audio Encoding

    • Twilio sends mulaw 8000Hz mono format.
    • Verified that my ffmpeg conversion is using the same settings.
    • Tried different conversion methods:

    ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw -c:a pcm_s16le twilio-audio-fixed.wav

    → Same issue.
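
    As a cross-check that does not depend on FFmpeg options, mulaw bytes can also be expanded to 16-bit PCM by hand. The sketch below applies the standard G.711 mu-law expansion; the function names are hypothetical and not part of the question's code. If bytes known to be valid mulaw decode to plausible speech samples here, the conversion settings are probably not the culprit.

```javascript
// Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample.
function mulawByteToPcm16(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;             // top bit: sign
  const exponent = (u >> 4) & 0x07;  // next three bits: segment number
  const mantissa = u & 0x0f;         // low four bits: step within the segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decode a whole Buffer (or byte array) of mu-law data into PCM samples.
function mulawToPcm16(buffer) {
  const pcm = new Int16Array(buffer.length);
  for (let i = 0; i < buffer.length; i++) {
    pcm[i] = mulawByteToPcm16(buffer[i]);
  }
  return pcm;
}
```

    In mulaw, the byte 0xFF encodes silence, so a file that is mostly far from 0xFF and 0x7F while nobody is speaking is a hint that the saved bytes are not raw mulaw at all.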

    


    4. Checked Google Speech-to-Text Input Format

    • Google STT requires proper encoding configuration:

    const request = {
        config: {
            encoding: 'MULAW',
            sampleRateHertz: 8000,
            languageCode: 'en-US',
        },
        interimResults: false,
    };

    • No errors from Google STT, but it never detects speech, likely because the input audio is just noise.

    5. Confirmed That Raw Audio is Not a WAV File

    • Since Twilio sends raw audio, I checked whether I needed to strip the header before processing.
    • Tried manually extracting raw bytes, but the issue persists.

    Current Theory:

    • The WebSocket server might be handling Twilio’s raw audio incorrectly before saving it.
    • There might be an additional header in the Twilio stream that needs to be removed before playback.
    • Twilio’s <Stream> tag expects a WebSocket connection starting with wss:// instead of https://, and switching to wss:// partially fixed some previous connection issues.

    Code Snippets:

    Twilio Setup in TwiML Response

    app.post('/voice-response', (req, res) => {
        console.log("📞 Incoming call from Twilio");

        const twiml = new twilio.twiml.VoiceResponse();
        twiml.say("Hello! Welcome to the service. How can I help you?");

        // Prevent Twilio from hanging up too early
        twiml.pause({ length: 5 });

        twiml.connect().stream({
            url: `wss://your-ngrok-url/ws`,
            track: "inbound_track"
        });

        console.log("🛠️ Twilio Stream URL:", `wss://your-ngrok-url/ws`);

        res.type('text/xml').send(twiml.toString());
    });

    WebSocket Server Handling Twilio Audio Stream

    wss.on('connection', (ws) => {
        console.log("🔗 WebSocket Connected! Waiting for audio input...");

        ws.on('message', (message) => {
            console.log(`🔊 Received ${message.length} bytes of audio from Twilio`);

            // Save raw audio data for debugging
            fs.appendFileSync('twilio-audio.raw', message);

            // Check if audio is non-empty but contains only noise
            if (message.length < 100) {
                console.warn("⚠️ Warning: Audio data from Twilio is very small. Might be silent.");
            }
        });

        ws.on('close', () => {
            console.log("❌ WebSocket Disconnected!");

            // Convert Twilio audio for debugging
            exec(`ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav`, (err) => {
                if (err) console.error("❌ FFmpeg Conversion Error:", err);
                else console.log("✅ Twilio Audio Saved as `twilio-audio.wav`");
            });
        });

        ws.on('error', (error) => console.error("⚠️ WebSocket Error:", error));
    });
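
    The handler above appends every WebSocket message to the file unchanged. Twilio Media Streams deliver JSON text messages rather than bare audio bytes: each message carries an `event` field, and for `media` events the audio sits in `media.payload` as base64-encoded mulaw. Saving the JSON text itself and decoding it as mulaw would plausibly produce the kind of noise described. The sketch below shows one way to pull out only the audio bytes; `extractMulawPayload` is a hypothetical helper name, and the sample message is synthetic, not captured Twilio output.

```javascript
// Hypothetical helper: returns the decoded audio bytes for a "media" event,
// or null for any other message (for example "connected", "start", "stop").
function extractMulawPayload(message) {
  let event;
  try {
    event = JSON.parse(message.toString('utf8'));
  } catch (err) {
    return null; // Not JSON, so not a Media Streams event as assumed here
  }
  if (!event || event.event !== 'media' || !event.media ||
      typeof event.media.payload !== 'string') {
    return null;
  }
  // Assumption: media.payload is base64-encoded 8 kHz mono mulaw audio
  return Buffer.from(event.media.payload, 'base64');
}

// Usage with a synthetic message (not real Twilio output):
const sampleMessage = JSON.stringify({
  event: 'media',
  media: { payload: Buffer.from([0xff, 0x7f, 0x00]).toString('base64') },
});
const audioBytes = extractMulawPayload(Buffer.from(sampleMessage));
console.log(audioBytes); // the three original bytes, not the JSON text
```

    Inside the `message` handler, this would mean appending the returned buffer to twilio-audio.raw only when it is not null, instead of appending `message` itself.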

    &#xA;

    Questions:

    • Why is the audio from Twilio being received as a clicking noise instead of actual speech?
    • Do I need to strip any additional metadata from the raw bytes before saving?
    • Is there a known issue with Twilio’s mulaw format when streaming audio over WebSockets?
    • How can I confirm that Google STT is receiving properly formatted audio?

    Additional Context:

    • Twilio <Stream> is connected and receiving data (confirmed by logs).
    • WebSocket successfully receives and saves audio, but it only plays noise.
    • Tried multiple ffmpeg conversions, Google STT configurations, and raw data inspection.
    • Still no recognizable speech in the audio output.

    Any help is greatly appreciated! 🙏

  • copy .wav audio file settings to new .wav file

    18 November 2020, by Jonas

    Currently I am working with a speech-to-text model that takes a .wav file and turns the audible speech in the audio into a text transcript. The model previously worked on .wav recordings that were recorded directly. Now I am trying to do the same with audio that originally came from a video.

    The steps are as follows:

    • retrieve a video file from a stream URL through ffmpeg
    • strip the .aac audio from the video
    • convert the .aac audio to .wav
    • save the .wav to S3 for later use

    The ffmpeg commands I use are listed below for reference:

    rm /tmp/jonas/*
    ffmpeg -i {stream_url} -c copy -bsf:a aac_adtstoasc /tmp/jonas/{filename}.aac
    ffmpeg -i /tmp/jonas/{filename}.aac /tmp/jonas/{filename}.wav
    aws s3 cp /tmp/jonas/{filename}.wav {s3_audio_save_location}

    The problem now is that my speech-to-text model no longer works on this audio. I use sox to convert the audio, but sox does not seem to pick up the audio, and the model does not work without sox either. This leads me to believe there is a difference in the .wav audio formatting. I would therefore like to know how I can either format the new .wav with the same settings as a .wav that does work, or compare the formatting of the two files and set the new .wav to the correct format manually through ffmpeg.

    I tried with PyPy exiftool and found the metadata of the two files:

    The metadata of the working .wav file is: [image]

    The metadata of the .wav file that does not work is: [image]

    So, as can be seen, the working .wav file has some different settings that I would like to mimic in the second .wav file; presumably that would make my model work again :)
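
    One way to compare the formatting of two .wav files directly is to read the fields of the `fmt ` chunk in each file's RIFF header. The sketch below is a minimal Node.js example under that assumption; `readWavFormat` is a hypothetical helper name, and it only reads the basic format fields (it does not interpret extensible or compressed formats).

```javascript
// Hypothetical helper: read the basic format fields from a WAV file's "fmt " chunk.
function readWavFormat(buf) {
  if (buf.length < 12 ||
      buf.toString('ascii', 0, 4) !== 'RIFF' ||
      buf.toString('ascii', 8, 12) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }
  let offset = 12;
  // Walk the chunk list until the "fmt " chunk is found
  while (offset + 8 <= buf.length) {
    const id = buf.toString('ascii', offset, offset + 4);
    const size = buf.readUInt32LE(offset + 4);
    if (id === 'fmt ') {
      if (offset + 24 > buf.length) throw new Error('Truncated fmt chunk');
      return {
        audioFormat: buf.readUInt16LE(offset + 8),   // 1 means linear PCM
        channels: buf.readUInt16LE(offset + 10),
        sampleRate: buf.readUInt32LE(offset + 12),
        byteRate: buf.readUInt32LE(offset + 16),
        blockAlign: buf.readUInt16LE(offset + 20),
        bitsPerSample: buf.readUInt16LE(offset + 22),
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error('No fmt chunk found');
}
```

    Calling it on the contents of each file (for example via `fs.readFileSync`) and comparing the two results shows which fields differ. ffmpeg's output options `-ar` (sample rate), `-ac` (channel count) and `-c:a` (audio codec, such as `pcm_s16le`) can then be set on the conversion command to match the working file; the concrete values depend on what the working file reports.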


    With kind regards,
    Jonas