
Other articles (61)

  • Publishing on MédiaSpip

    13 June 2013

    Can I post content from an iPad tablet?
    Yes, provided your MédiaSpip installation is at version 0.2 or later. If needed, contact the administrator of your MédiaSpip to find out.

  • Supporting all media types

    13 April 2011

    Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats: images (png, gif, jpg, bmp and more); audio (MP3, Ogg, Wav and more); video (AVI, MP4, OGV, mpg, mov, wmv and more); text, code and other data (OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth) and (...)

  • Media-specific libraries and software

    10 December 2010

    For correct and optimal operation, several things need to be taken into consideration.
    After installing apache2, mysql and php5, it is important to install some additional software, whose installation is described in the related links: a set of multimedia libraries (x264, libtheora, libvpx) used for encoding and decoding video and audio, in order to support as many file types as possible (cf. this tutorial); FFMpeg with the maximum number of decoders and (...)

On other sites (12348)

  • Apply multiple text filters at once and burn into video for captioning without re-encoding, and fix error [ffmpeg-python wrapper]

    13 May, by Baldi

    Is there any way to burn text into a video without re-encoding? I ask because re-encoding runs at around 0.1x speed on my device when writing to WEBM. Alternatively, a faster way to render high-quality video while still re-encoding would also be great. I vaguely remember someone writing to a temporary file to solve this problem.

    


    Also, there is a small error in the program; the code is attached.

    


    def processVideo(self):
        print("creating video")

        # File location management
        font_path = self.input_path / "CalSans-Regular.ttf"
        background_path = self.input_path / "new_video_background.webm"
        audio_path = self.sound_output_path
        video_output_path = self.parent_path / "new_result.webm"
        sound_input = ffmpeg.input(str(audio_path))
        video_input = ffmpeg.input(str(background_path))

        # Adding captions
        print(self.text_caption)
        previous_timepoint = 0
        for caption, timepoint in zip(self.text_caption, self.timepoints, strict=False):
            # text_caption and timepoints are lists where the end of the words in
            # text_caption corresponds to the timepoint with the same index
            video_input = video_input.drawtext(
                text=caption,
                fontfile=font_path,
                x='w-text_w/2',
                y='h-text_h/2',
                escape_text=True,
                fontsize=32,
                bordercolor="black",
                borderw=4,
                enable=f'between(t,{previous_timepoint},{timepoint["timeSeconds"]})'
            )
            previous_timepoint = timepoint["timeSeconds"]

        # Combining sound and video and writing output
        command = ffmpeg.output(sound_input, video_input, str(video_output_path), codec='copy').overwrite_output().global_args('-shortest')
        print("args =", command)
        print(command.get_args())
        command.run()
        print("done!")


    


      File "c:\Desktop\Projects\video_project\main.py", line 239, in <module>
        post_list[0].processVideo()
        ~~~~~~~~~~~~~~~~~~~~~~~~~^^
      File "c:\Desktop\Projects\video_project\main.py", line 223, in processVideo
        command.run()
        ~~~~~~~~~~~^^
      File "C:\Desktop\Projects\video_project\.venv\Lib\site-packages\ffmpeg\_run.py", line 313, in run
        process = run_async(
            stream_spec,
        ...<5 lines>...
            overwrite_output=overwrite_output,
        )
      File "C:\Desktop\Projects\video_project\.venv\Lib\site-packages\ffmpeg\_run.py", line 284, in run_async
        return subprocess.Popen(
               ~~~~~~~~~~~~~~~~^
            args, stdin=stdin_stream, stdout=stdout_stream, stderr=stderr_stream
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        )
        ^
      File "C:\AppData\Local\Programs\Python\Python313\Lib\subprocess.py", line 1038, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
        ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                            pass_fds, cwd, env,
                            ^^^^^^^^^^^^^^^^^^^
        ...<5 lines>...
                            gid, gids, uid, umask,
                            ^^^^^^^^^^^^^^^^^^^^^^
                            start_new_session, process_group)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\AppData\Local\Programs\Python\Python313\Lib\subprocess.py", line 1550, in _execute_child
        hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
                                 # no special security
                                 ^^^^^^^^^^^^^^^^^^^^^
        ...<4 lines>...
                                 cwd,
                                 ^^^^
                                 startupinfo)
                                 ^^^^^^^^^^^^
    FileNotFoundError: [WinError 206] The filename or extension is too long

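A note on the traceback above: WinError 206 is typically raised here because the generated ffmpeg argument list, with one drawtext filter per caption, exceeds Windows' command-line length limit. One workaround is ffmpeg's `-filter_complex_script` option, which reads the filter graph from a file so the argv stays short. A minimal sketch that only builds the command, not part of the asker's code; paths and the caption data are hypothetical:

```python
# Sketch: keep the argv short by writing the (potentially huge) drawtext chain
# to a filter-graph file and passing it with -filter_complex_script.
# All paths and the caption data structure here are hypothetical.
import tempfile

def build_caption_filter(captions):
    # captions: list of (text, end_time) pairs; each caption is shown from the
    # end of the previous one until its own end time, as in the loop above.
    parts, prev = [], 0
    for text, end in captions:
        parts.append(
            f"drawtext=text='{text}':fontsize=32:borderw=4:bordercolor=black:"
            f"x=(w-text_w)/2:y=(h-text_h)/2:enable='between(t,{prev},{end})'"
        )
        prev = end
    return ",".join(parts)

def build_command(video, audio, output, captions):
    # The filter graph lives in a file, so its size no longer counts against
    # Windows' command-line length limit (the cause of WinError 206).
    script = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
    script.write("[0:v]" + build_caption_filter(captions) + "[v]")
    script.close()
    return [
        "ffmpeg", "-y", "-i", video, "-i", audio,
        "-filter_complex_script", script.name,
        "-map", "[v]", "-map", "1:a", "-shortest", output,
    ], script.name

cmd, script_path = build_command(
    "new_video_background.webm", "sound.ogg", "new_result.webm",
    [("hello", 1.5), ("world", 3.0)],
)
```

Separately, note that drawtext always forces a re-encode: stream copy bypasses the filter graph entirely, so the `codec='copy'` in the snippet above conflicts with burning in captions.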

  • How to split audio file into equal-length segments with ffmpeg ?

    11 March, by GPWR

    I want to split an audio file into several equal-length segments using FFmpeg. I want to specify the general segment duration (no overlap), and I want FFmpeg to render as many segments as it takes to cover the whole audio file (in other words, the number of segments to be rendered is unspecified).
    Also, since I am not very experienced with FFmpeg (I only use it to make simple file conversions with few arguments), I would appreciate a description of the code to use, rather than just a piece of code that I won't necessarily understand, if possible.
    Thank you in advance.


    P.S. Here's the context for why I'm trying to do this: I would like to sample a song into single-bar loops automatically, instead of having to chop them manually in a DAW. All I want to do is align the first beat of the song to the beat grid in my DAW, then export that audio file and use it to generate one-bar loops with FFmpeg.


    In the future, I will try to do something like a batch command in which one can specify the tempo and key signature, and it will generate the loops using FFmpeg automatically (as long as the loop is aligned to the beat grid, as I've mentioned earlier). 😀

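The splitting described above maps directly onto ffmpeg's segment muxer: `-f segment -segment_time N` cuts the input into as many N-second pieces as it takes to cover the whole file. A small sketch that derives the segment length from a tempo (one 4/4 bar at a given bpm) and builds the command; the filenames and the 120 bpm figure are made-up examples:

```python
# Sketch: split an audio file into equal one-bar segments with ffmpeg's
# segment muxer. File names and the 120 bpm example are hypothetical.

def bar_seconds(bpm, beats_per_bar=4):
    # One beat lasts 60/bpm seconds; a 4/4 bar is four beats.
    return beats_per_bar * 60.0 / bpm

def segment_command(src, seg_len, pattern="loop_%03d.wav"):
    # -f segment writes a new output file every seg_len seconds until the
    # input ends. -c copy avoids re-encoding, but cuts then snap to packet
    # boundaries; drop it (re-encode) for sample-accurate loop points.
    return ["ffmpeg", "-i", src, "-f", "segment",
            "-segment_time", f"{seg_len:.6f}", "-c", "copy", pattern]

cmd = segment_command("song.wav", bar_seconds(120))  # 2-second bars at 120 bpm
```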

  • Twilio Real-Time Media Streaming to WebSocket Receives Only Noise Instead of Speech

    21 February, by dannym25

    I'm setting up a Twilio Voice call with real-time media streaming to a WebSocket server for speech-to-text processing using Google Cloud Speech-to-Text. The connection is established successfully, and I receive a continuous stream of audio data from Twilio. However, when I play back the received audio, all I hear is a rapid clicking/jackhammering noise instead of the actual speech spoken during the call.


    Setup:
    • Twilio sends inbound audio to my WebSocket server.
    • The WebSocket receives and saves the raw mulaw-encoded audio data from Twilio.
    • The audio is processed via Google Speech-to-Text for transcription.
    • When I attempt to play back the audio, it sounds like machine-gun noise instead of spoken words.

    1. Confirmed WebSocket Receives Data

    • The WebSocket successfully logs incoming audio chunks from Twilio:

    🔊 Received 379 bytes of audio from Twilio
    🔊 Received 379 bytes of audio from Twilio

    • This suggests Twilio is sending audio data, but it's not being interpreted correctly.


    2. Saving and Playing Raw Audio

    • I save the incoming raw mulaw (8000Hz) audio from Twilio to a file:

    fs.appendFileSync('twilio-audio.raw', message);


    • Then, I convert it to a .wav file using FFmpeg:

    ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav


    Problem: When I play the audio using ffplay, it contains no speech, only rapid clicking sounds.


    3. Ensured Correct Audio Encoding

    • Twilio sends mulaw 8000Hz mono format.
    • Verified that my ffmpeg conversion is using the same settings.
    • Tried different conversion methods:

    ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw -c:a pcm_s16le twilio-audio-fixed.wav


    → Same issue.
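One way to sanity-check the saved bytes independently of ffmpeg is to decode a few of them as G.711 mu-law by hand (the stdlib `audioop` module that used to do this was removed in Python 3.13). Real mu-law speech decodes to samples hovering near zero between words; ASCII or JSON text run through the same decoder yields large, jumpy samples, which plays back as exactly this kind of clicking. A minimal decoder sketch, not taken from the question:

```python
# Sketch: decode one G.711 mu-law byte to a signed 16-bit PCM sample.
def ulaw_to_pcm16(byte):
    u = ~byte & 0xFF                 # mu-law bytes are stored complemented
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample

# 0xFF encodes silence; 0x80 and 0x00 are the positive/negative extremes.
samples = [ulaw_to_pcm16(b) for b in (0xFF, 0x80, 0x00)]
```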


    4. Checked Google Speech-to-Text Input Format

    • Google STT requires proper encoding configuration:

    const request = {
        config: {
            encoding: 'MULAW',
            sampleRateHertz: 8000,
            languageCode: 'en-US',
        },
        interimResults: false,
    };


    • No errors from Google STT, but it never detects speech, likely because the input audio is just noise.


    5. Confirmed That Raw Audio is Not a WAV File


    • Since Twilio sends raw audio, I checked whether I needed to strip a header before processing.
    • Tried manually extracting raw bytes, but the issue persists.


    Current Theory:


    • The WebSocket server might be handling Twilio’s raw audio incorrectly before saving it.
    • There might be an additional header in the Twilio stream that needs to be removed before playback.
    • Twilio’s <Stream> tag expects a WebSocket URL starting with wss:// instead of https://, and switching to wss:// partially fixed some earlier connection issues.

    Code Snippets:


    Twilio Setup in TwiML Response


    app.post('/voice-response', (req, res) => {
        console.log("📞 Incoming call from Twilio");

        const twiml = new twilio.twiml.VoiceResponse();
        twiml.say("Hello! Welcome to the service. How can I help you?");

        // Prevent Twilio from hanging up too early
        twiml.pause({ length: 5 });

        twiml.connect().stream({
            url: `wss://your-ngrok-url/ws`,
            track: "inbound_track"
        });

        console.log("🛠️ Twilio Stream URL:", `wss://your-ngrok-url/ws`);

        res.type('text/xml').send(twiml.toString());
    });


    WebSocket Server Handling Twilio Audio Stream


    wss.on('connection', (ws) => {
        console.log("🔗 WebSocket Connected! Waiting for audio input...");

        ws.on('message', (message) => {
            console.log(`🔊 Received ${message.length} bytes of audio from Twilio`);

            // Save raw audio data for debugging
            fs.appendFileSync('twilio-audio.raw', message);

            // Check if audio is non-empty but contains only noise
            if (message.length < 100) {
                console.warn("⚠️ Warning: Audio data from Twilio is very small. Might be silent.");
            }
        });

        ws.on('close', () => {
            console.log("❌ WebSocket Disconnected!");

            // Convert Twilio audio for debugging
            exec(`ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav`, (err) => {
                if (err) console.error("❌ FFmpeg Conversion Error:", err);
                else console.log("✅ Twilio Audio Saved as `twilio-audio.wav`");
            });
        });

        ws.on('error', (error) => console.error("⚠️ WebSocket Error:", error));
    });
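One detail worth checking in the handler above, consistent with the clicking-noise symptom: Twilio Media Streams does not send bare audio bytes over the WebSocket. It sends JSON text frames, and for `media` events the mu-law audio is base64-encoded in `media.payload`. Appending whole messages to `twilio-audio.raw` therefore stores JSON-plus-base64 text, which plays back as noise. A minimal extraction sketch (in Python for brevity; the frame below is a made-up example in the documented shape):

```python
import base64
import json

def extract_audio(message):
    """Return raw mu-law bytes from one Twilio Media Streams frame, else None."""
    frame = json.loads(message)
    if frame.get("event") != "media":
        return None          # 'connected', 'start', 'stop', 'mark' carry no audio
    return base64.b64decode(frame["media"]["payload"])

# A made-up frame in the shape Twilio documents
frame = json.dumps({
    "event": "media",
    "media": {"track": "inbound",
              "payload": base64.b64encode(b"\xff\x7f\xff").decode()},
})
audio = extract_audio(frame)
```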


    Questions:

    • Why is the audio from Twilio received as clicking noise instead of actual speech?
    • Do I need to strip any additional metadata from the raw bytes before saving?
    • Is there a known issue with Twilio’s mulaw format when streaming audio over WebSockets?
    • How can I confirm that Google STT is receiving properly formatted audio?

    Additional Context:

    • Twilio <Stream> is connected and receiving data (confirmed by logs).
    • The WebSocket successfully receives and saves audio, but it only plays noise.
    • Tried multiple ffmpeg conversions, Google STT configurations, and raw-data inspection.
    • Still no recognizable speech in the audio output.

    Any help is greatly appreciated! 🙏
