
Media (29)
-
#7 Ambience
16 October 2011
Updated: June 2015
Language: English
Type: Audio
-
#6 Teaser Music
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#5 End Title
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#3 The Safest Place
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#4 Emo Creates
15 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#2 Typewriter Dance
15 October 2011
Updated: February 2013
Language: English
Type: Audio
Other articles (61)
-
Publishing on MédiaSpip
13 June 2013
Can I post content from an iPad tablet?
Yes, if your installed MédiaSpip is version 0.2 or later. If needed, contact your MédiaSpip administrator to find out.
-
Supporting all media types
13 April 2011
Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats: images: png, gif, jpg, bmp and more; audio: MP3, Ogg, Wav and more; video: AVI, MP4, OGV, mpg, mov, wmv and more; text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)
-
Media-specific libraries and software
10 December 2010
For correct and optimal operation, several things need to be taken into account.
After installing apache2, mysql and php5, it is important to install the other required software, whose installation is described in the related links. A set of multimedia libraries (x264, libtheora, libvpx) used for encoding and decoding video and audio, in order to support as many file types as possible. See: this tutorial; FFMpeg with the maximum number of decoders and (...)
On other sites (12348)
-
Apply multiple text filters at once and burn into video for captioning without re-encoding, and fix error [ffmpeg-python wrapper]
13 May, by Baldi
Is there any way to burn text into a video without re-encoding? I ask because the re-encoding process runs at around 0.1x speed on my device when writing to WEBM. Alternatively, if there is a faster way to render high-quality video while still re-encoding, that would be great. I vaguely remember someone writing to a temporary file to solve this problem.


There is also a small error in the program; the code and traceback are attached:


def processVideo(self):
    print("creating video")

    # File location management
    font_path = self.input_path / "CalSans-Regular.ttf"
    background_path = self.input_path / "new_video_background.webm"
    audio_path = self.sound_output_path
    video_ouput_path = self.parent_path / "new_result.webm"
    sound_input = ffmpeg.input(str(audio_path))
    video_input = ffmpeg.input(str(background_path))

    # Adding captions
    print(self.text_caption)
    previous_timepoint = 0
    for caption, timepoint in zip(self.text_caption, self.timepoints, strict=False):
        # text_caption and timepoints are lists: the end of each caption in text_caption
        # corresponds to the timepoint with the same index in timepoints
        video_input = video_input.drawtext(
            text=caption,
            fontfile=font_path,
            x='w-text_w/2',
            y='h-text_h/2',
            escape_text=True,
            fontsize=32,
            bordercolor="black",
            borderw=4,
            enable=f'between(t,{previous_timepoint},{timepoint["timeSeconds"]})'
        )
        previous_timepoint = timepoint["timeSeconds"]

    # Combining sound and video and writing output
    command = ffmpeg.output(sound_input, video_input, str(video_ouput_path), codec='copy').overwrite_output().global_args('-shortest')
    print("args =", command)
    print(command.get_args())
    command.run()
    print("done!")



File "c:\Desktop\Projects\video_project\main.py", line 239, in <module>
 post_list[0].processVideo()
 ~~~~~~~~~~~~~~~~~~~~~~~~~^^
 File "c:\Desktop\Projects\video_project\main.py", line 223, in processVideo
 command.run()
 ~~~~~~~~~~~^^
 File "C:\Desktop\Projects\video_project\.venv\Lib\site-packages\ffmpeg\_run.py", line 313, in run
 process = run_async(
 stream_spec,
 ...<5 lines>...
 overwrite_output=overwrite_output,
 )
 File "C:\Desktop\Projects\video_project\.venv\Lib\site-packages\ffmpeg\_run.py", line 284, in run_async
 return subprocess.Popen(
 ~~~~~~~~~~~~~~~~^
 args, stdin=stdin_stream, stdout=stdout_stream, stderr=stderr_stream
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 )
 ^
 File "C:\AppData\Local\Programs\Python\Python313\Lib\subprocess.py", line 1038, in __init__
 self._execute_child(args, executable, preexec_fn, close_fds,
 ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 pass_fds, cwd, env,
 ^^^^^^^^^^^^^^^^^^^
 ...<5 lines>...
 gid, gids, uid, umask,
 ^^^^^^^^^^^^^^^^^^^^^^
 start_new_session, process_group)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\AppData\Local\Programs\Python\Python313\Lib\subprocess.py", line 1550, in _execute_child
 hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
 ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
 # no special security
 ^^^^^^^^^^^^^^^^^^^^^
 ...<4 lines>...
 cwd,
 ^^^^
 startupinfo)
 ^^^^^^^^^^^^
FileNotFoundError: [WinError 206] The filename or extension is too long
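
A possible way around the error, sketched under assumptions: WinError 206 is what Windows reports when the command line handed to CreateProcess is too long, which is likely here because every caption adds another drawtext filter to the ffmpeg arguments. One option is to write the captions to a file that ffmpeg reads from disk, so the command itself stays short. The sketch below swaps drawtext for an SRT file plus ffmpeg's subtitles filter (this needs an ffmpeg build with libass); the helper names and the captions.srt file name are hypothetical. Any filter, drawtext or subtitles, requires re-encoding the video, so codec='copy' cannot be applied to the filtered video stream.

```python
from pathlib import Path


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(captions, timepoints) -> str:
    """Build SRT text; each caption ends at the timepoint with the same index."""
    blocks = []
    start = 0.0
    for index, (caption, timepoint) in enumerate(zip(captions, timepoints), start=1):
        end = float(timepoint["timeSeconds"])
        blocks.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{caption}\n"
        )
        start = end
    return "\n".join(blocks)


def build_ffmpeg_args(video: str, audio: str, srt_name: str, output: str) -> list:
    """Build a short ffmpeg command; the captions live in the SRT file, not in argv."""
    # Assumption: srt_name is a plain file name in the working directory,
    # which avoids escaping Windows drive-letter colons inside the filter string.
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", audio,
        "-vf", f"subtitles={srt_name}",
        "-shortest",
        output,
    ]


if __name__ == "__main__":
    # Hypothetical sample data in the same shape as text_caption / timepoints.
    captions = ["Hello", "world"]
    timepoints = [{"timeSeconds": 1.5}, {"timeSeconds": 3}]
    Path("captions.srt").write_text(build_srt(captions, timepoints), encoding="utf-8")
    print(build_ffmpeg_args("new_video_background.webm", "audio.mp3",
                            "captions.srt", "new_result.webm"))
```

The argument list can be passed to subprocess.run from the directory that contains captions.srt. Font and border styling would then be set through the subtitles filter rather than drawtext options.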


-
How to split audio file into equal-length segments with ffmpeg ?
11 March, by GPWR
I want to split an audio file into several equal-length segments using FFmpeg. I want to specify the general segment duration (no overlap), and I want FFmpeg to render as many segments as it takes to cover the whole audio file (in other words, the number of segments to be rendered is unspecified).
Also, since I am not very experienced with FFmpeg (I only use it to make simple file conversions with few arguments), I would like a description of the code you should use to do this, rather than just a piece of code that I won't necessarily understand, if possible.
Thank you in advance.


P.S. Here's the context for why I'm trying to do this :
I would like to sample a song into single-bar loops automatically, instead of having to chop them manually using a DAW. All I want to do is align the first beat of the song to the beat grid in my DAW, and then export that audio file and use it to generate one-bar loops in FFmpeg.


In the future, I will try to do something like a batch command in which one can specify the tempo and key signature, and it will generate the loops using FFmpeg automatically (as long as the loop is aligned to the beat grid, as I've mentioned earlier). 😀
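
As a rough illustration of the idea described above, the sketch below computes the length of one bar from a tempo and builds an FFmpeg command that uses the segment muxer (`-f segment` with `-segment_time`), which writes as many numbered files as the input requires. The function names, the 4-beats-per-bar default, and the file names are assumptions made for the example; it also assumes a WAV input, where `-c copy` can split without re-encoding.

```python
def bar_duration_seconds(bpm: float, beats_per_bar: int = 4) -> float:
    """Length of one bar in seconds: beats per bar times seconds per beat."""
    return beats_per_bar * 60.0 / bpm


def build_segment_command(source: str, segment_seconds: float,
                          pattern: str = "loop_%03d.wav") -> list:
    """Build an ffmpeg command that splits `source` into equal-length pieces."""
    return [
        "ffmpeg",
        "-i", source,                                # input file
        "-f", "segment",                             # use the segment muxer
        "-segment_time", f"{segment_seconds:.6f}",   # duration of each piece
        "-c", "copy",                                # copy audio, no re-encoding
        pattern,                                     # %03d becomes 000, 001, ...
    ]


if __name__ == "__main__":
    # Hypothetical example: a 120 BPM song in 4/4 gives 2-second bars.
    seconds = bar_duration_seconds(120, 4)
    print(" ".join(build_segment_command("song.wav", seconds)))
```

Each argument is commented so the command can be adapted by hand; the last segment is simply shorter if the file length is not an exact multiple of the bar length.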


-
Twilio Real-Time Media Streaming to WebSocket Receives Only Noise Instead of Speech
21 February, by dannym25
I'm setting up a Twilio Voice call with real-time media streaming to a WebSocket server for speech-to-text processing using Google Cloud Speech-to-Text. The connection is established successfully, and I receive a continuous stream of audio data from Twilio. However, when I play back the received audio, all I hear is a rapid clicking/jackhammering noise instead of the actual speech spoken during the call.


Setup:

- Twilio sends inbound audio to my WebSocket server.
- WebSocket receives and saves the raw mulaw-encoded audio data from Twilio.
- The audio is processed via Google Speech-to-Text for transcription.
- When I attempt to play back the audio, it sounds like machine-gun-like noise instead of spoken words.

1. Confirmed WebSocket Receives Data


• The WebSocket successfully logs incoming audio chunks from Twilio :


🔊 Received 379 bytes of audio from Twilio
🔊 Received 379 bytes of audio from Twilio



• This suggests Twilio is sending audio data, but it's not being interpreted correctly.


2. Saving and Playing Raw Audio


• I save the incoming raw mulaw (8000Hz) audio from Twilio to a file :


fs.appendFileSync('twilio-audio.raw', message);



• Then, I convert it to a .wav file using FFmpeg:

ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav



• Problem: When I play the audio using ffplay, it contains no speech, only rapid clicking sounds.

3. Ensured Correct Audio Encoding


• Twilio sends mulaw 8000Hz mono format.
• Verified that my ffmpeg conversion is using the same settings.
• Tried different conversion methods :

ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw -c:a pcm_s16le twilio-audio-fixed.wav



→ Same issue.
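
To check the conversion step independently of FFmpeg, the raw bytes can also be decoded by hand. The following is a minimal sketch of standard G.711 mu-law decoding written to a WAV container with Python's standard library; the function names are made up for the example. If audio decoded this way still sounds like clicking, the saved file probably does not contain pure mu-law samples.

```python
import io
import wave


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    inverted = ~byte & 0xFF              # mu-law bytes are stored bit-inverted
    sign = inverted & 0x80               # top bit: sign
    exponent = (inverted >> 4) & 0x07    # next three bits: segment number
    mantissa = inverted & 0x0F           # low four bits: step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav_bytes(ulaw: bytes, sample_rate: int = 8000) -> bytes:
    """Decode mu-law bytes and wrap them in a mono 16-bit PCM WAV file."""
    pcm = b"".join(
        ulaw_byte_to_pcm16(b).to_bytes(2, "little", signed=True) for b in ulaw
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


if __name__ == "__main__":
    # 0xFF is mu-law silence, so this writes 20 ms of silence at 8000 Hz.
    with open("silence.wav", "wb") as out:
        out.write(ulaw_to_wav_bytes(bytes([0xFF]) * 160))
```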


4. Checked Google Speech-to-Text Input Format


• Google STT requires proper encoding configuration :


const request = {
    config: {
        encoding: 'MULAW',
        sampleRateHertz: 8000,
        languageCode: 'en-US',
    },
    interimResults: false,
};



• No errors from Google STT, but it never detects speech, likely because the input audio is just noise.


5. Confirmed That Raw Audio is Not a WAV File


• Since Twilio sends raw audio, I checked whether I needed to strip the header before processing.
• Tried manually extracting raw bytes, but the issue persists.


Current Theory:

- The WebSocket server might be handling Twilio’s raw audio incorrectly before saving it.
- There might be an additional header in the Twilio stream that needs to be removed before playback.
- Twilio’s <Stream> tag expects a WebSocket connection starting with wss:// instead of https://, and switching to wss:// partially fixed some previous connection issues.

Code Snippets :


Twilio Setup in TwiML Response

app.post('/voice-response', (req, res) => {
    console.log("📞 Incoming call from Twilio");

    const twiml = new twilio.twiml.VoiceResponse();
    twiml.say("Hello! Welcome to the service. How can I help you?");

    // Prevent Twilio from hanging up too early
    twiml.pause({ length: 5 });

    twiml.connect().stream({
        url: `wss://your-ngrok-url/ws`,
        track: "inbound_track"
    });

    console.log("🛠️ Twilio Stream URL:", `wss://your-ngrok-url/ws`);

    res.type('text/xml').send(twiml.toString());
});



WebSocket Server Handling Twilio Audio Stream


wss.on('connection', (ws) => {
    console.log("🔗 WebSocket Connected! Waiting for audio input...");

    ws.on('message', (message) => {
        console.log(`🔊 Received ${message.length} bytes of audio from Twilio`);

        // Save the received message as-is for debugging
        fs.appendFileSync('twilio-audio.raw', message);

        // Warn when a message is suspiciously small (size check only)
        if (message.length < 100) {
            console.warn("⚠️ Warning: Audio data from Twilio is very small. Might be silent.");
        }
    });

    ws.on('close', () => {
        console.log("❌ WebSocket Disconnected!");

        // Convert Twilio audio for debugging
        exec(`ffmpeg -f mulaw -ar 8000 -ac 1 -i twilio-audio.raw twilio-audio.wav`, (err) => {
            if (err) console.error("❌ FFmpeg Conversion Error:", err);
            else console.log("✅ Twilio Audio Saved as `twilio-audio.wav`");
        });
    });

    ws.on('error', (error) => console.error("⚠️ WebSocket Error:", error));
});
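
One likely cause, going by Twilio's Media Streams documentation: each WebSocket message is a JSON text frame rather than raw audio, with an `event` field (`connected`, `start`, `media`, `stop`) and, for `media` events, the mu-law audio base64-encoded in `media.payload`. The handler above appends the whole message, JSON envelope included, to twilio-audio.raw, so FFmpeg would be decoding JSON text as mu-law samples. The 379-byte messages in the log would be consistent with a 160-byte (20 ms) mu-law chunk that has been base64-encoded and wrapped in JSON. The sketch below shows the decoding step in Python; the function name and sample message are hypothetical, and the same two steps apply in Node.js with `JSON.parse` and `Buffer.from(payload, 'base64')`.

```python
import base64
import json
from typing import Optional


def extract_media_payload(message: str) -> Optional[bytes]:
    """Return the raw mu-law bytes from a Twilio 'media' message, else None."""
    event = json.loads(message)
    if event.get("event") != "media":
        # 'connected', 'start' and 'stop' messages carry no audio.
        return None
    return base64.b64decode(event["media"]["payload"])


if __name__ == "__main__":
    # Hypothetical message shaped like a Twilio media event (160 bytes of silence).
    sample = json.dumps({
        "event": "media",
        "media": {"payload": base64.b64encode(bytes([0xFF]) * 160).decode("ascii")},
    })
    audio = extract_media_payload(sample)
    if audio is not None:
        with open("twilio-audio.raw", "ab") as raw_file:
            raw_file.write(audio)   # only decoded audio bytes reach the file
```

Only the decoded bytes would then be appended to the raw file, or sent on to Google Speech-to-Text with the MULAW configuration shown earlier.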



Questions:

- Why is the audio from Twilio being received as a clicking noise instead of actual speech?
- Do I need to strip any additional metadata from the raw bytes before saving?
- Is there a known issue with Twilio’s mulaw format when streaming audio over WebSockets?
- How can I confirm that Google STT is receiving properly formatted audio?

Additional Context:

- Twilio <Stream> is connected and receiving data (confirmed by logs).
- WebSocket successfully receives and saves audio, but it only plays noise.
- Tried multiple ffmpeg conversions, Google STT configurations, and raw data inspection.
- Still no recognizable speech in the audio output.

Any help is greatly appreciated! 🙏


- Twilio