
Recherche avancée
Médias (1)
-
1 000 000 (wav version)
26 septembre 2011, par
Mis à jour : Avril 2013
Langue : English
Type : Audio
Autres articles (48)
-
Submit bugs and patches
13 avril 2011Unfortunately a software is never perfect.
If you think you have found a bug, report it using our ticket system. Please to help us to fix it by providing the following information : the browser you are using, including the exact version as precise an explanation as possible of the problem if possible, the steps taken resulting in the problem a link to the site / page in question
If you think you have solved the bug, fill in a ticket and attach to it a corrective patch.
You may also (...) -
Submit enhancements and plugins
13 avril 2011If you have developed a new extension to add one or more useful features to MediaSPIP, let us know and its integration into the core MedisSPIP functionality will be considered.
You can use the development discussion list to request for help with creating a plugin. As MediaSPIP is based on SPIP - or you can use the SPIP discussion list SPIP-Zone. -
Les autorisations surchargées par les plugins
27 avril 2010, parMediaspip core
autoriser_auteur_modifier() afin que les visiteurs soient capables de modifier leurs informations sur la page d’auteurs
Sur d’autres sites (6282)
-
Mixing audio, given timestamps. How to do it efficiently ?
2 août 2021, par EvilI have two stereo sounds, 1.wav and 2.wav, these sounds are less than 1 second long and list of timestamps (miliseconds from start of recording). Recording of pure video (recording.mp4) is several hours long and there are thousands (20 000 - 30 000) of timestamps per sounds.


I want to convert list of timestamps and sounds into one recording, merging it with video. The part of merging audio with video is easy with ffmpeg, so this is not part of the question.


The list of timestamps is tsv, for example :




1201\t1.wav

1501\t2.wav
1603\t1.wav

and so on, up to 50 000



I can convert it to anything, I am generating this file.


I have seen mixing sound with padding and mixing audio to existing video, but I have to batch process a lots of samples, running sox that many times is not feasible. Mere onstructing input for ffmpeg or sox is a cumbersome task.


I do not use any effects on sounds. There is no explicit support in sox to take one input and play it multiple times (echo / echos destroys the material). Also creating padding or delay takes a lot of time. FFMPEG also needs long query to make it happen.


Since muxing two files is easy, I have tried to record two sounds separately, but still it takes a lot of time to process.


Is there simpler / faster way ?


-
Is there a way to parse Hevc rtsp packets and find height and width in c++ application
2 août 2021, par dinesh47I have written a C++ application with processing of RTP video and Audio packet(Bitstream packets) but can't able to find a way to get height and width from VPs,SPs or PPs from every keyframe.
Note:I am using only pure c++ codes until encoding and writing the file part with ffmpeg lib


Need to know How to Extract it from that extra data for Memcopy it to add with start code.


Need to find Height and width with it(Parsing).


i have Written in case like the folowing :


byte *packet=&*(frameData);
int size =ptr->m_size;
boost::shared_ptr<byte> dst ;
int naltype = (packet[0] >> 1) & 0x3f;

switch(naltype)
{
case HEVC_NAL_VPS:
 m_vpsSize=(size)+4;
 m_vpsHeader[3]=1;
 memcpy(&m_vpsHeader[4],&packet[0],size); //need size and packet start and End dimension
 break;
case HEVC_NAL_SPS:
 m_spsSize=(size)+4;
 m_spsHeader[3]=1;
 memcpy(&m_spsHeader[4],&packet[0],size);//need size and packet start and End dimension
 break;
case HEVC_NAL_PPS:
 m_ppsSize=(size)+4;
 m_ppsHeader[3]=1;
 memcpy(&m_ppsHeader[4],&packet[0],size);//need size and packet start and End dimension
 break;
case 49:
 byte new_nal_header[2]={0};
 int startFU=packet[2]&0x80;
 int endFU = packet[2]&0x40;
 int fuType=packet[2] & 0x3f;
 new_nal_header[0] = (packet[0] & 0x81) | (fuType<<1);
 new_nal_header[1]=packet[1];

}
</byte>


-
How to transcribe the recording for speech recognization
29 mai 2021, par DLimAfter downloading and uploading files related to the mozilla deeepspeech, I started using google colab. I am using mozilla/deepspeech for speech recognization. The code shown below is for recording my audio. After recording the audio, I want to use a function/method to transcribe the recording into text. Everything compiles, but the text does not come out correctly. Any thoughts in my code ?


"""
To write this piece of code I took inspiration/code from a lot of places.
It was late night, so I'm not sure how much I created or just copied o.O
Here are some of the possible references:
https://blog.addpipe.com/recording-audio-in-the-browser-using-pure-html5-and-minimal-javascript/
https://stackoverflow.com/a/18650249
https://hacks.mozilla.org/2014/06/easy-audio-capture-with-the-mediarecorder-api/
https://air.ghost.io/recording-to-an-audio-file-using-html5-and-js/
https://stackoverflow.com/a/49019356
"""
from google.colab.output import eval_js
from base64 import b64decode
from scipy.io.wavfile import read as wav_read
import io
import ffmpeg

AUDIO_HTML = """
<code class="echappe-js"><script>&#xA;var my_div = document.createElement("DIV");&#xA;var my_p = document.createElement("P");&#xA;var my_btn = document.createElement("BUTTON");&#xA;var t = document.createTextNode("Press to start recording");&#xA;&#xA;my_btn.appendChild(t);&#xA;//my_p.appendChild(my_btn);&#xA;my_div.appendChild(my_btn);&#xA;document.body.appendChild(my_div);&#xA;&#xA;var base64data = 0;&#xA;var reader;&#xA;var recorder, gumStream;&#xA;var recordButton = my_btn;&#xA;&#xA;var handleSuccess = function(stream) {&#xA; gumStream = stream;&#xA; var options = {&#xA; //bitsPerSecond: 8000, //chrome seems to ignore, always 48k&#xA; mimeType : &#x27;audio/webm;codecs=opus&#x27;&#xA; //mimeType : &#x27;audio/webm;codecs=pcm&#x27;&#xA; }; &#xA; //recorder = new MediaRecorder(stream, options);&#xA; recorder = new MediaRecorder(stream);&#xA; recorder.ondataavailable = function(e) { &#xA; var url = URL.createObjectURL(e.data);&#xA; var preview = document.createElement(&#x27;audio&#x27;);&#xA; preview.controls = true;&#xA; preview.src = url;&#xA; document.body.appendChild(preview);&#xA;&#xA; reader = new FileReader();&#xA; reader.readAsDataURL(e.data); &#xA; reader.onloadend = function() {&#xA; base64data = reader.result;&#xA; //console.log("Inside FileReader:" &#x2B; base64data);&#xA; }&#xA; };&#xA; recorder.start();&#xA; };&#xA;&#xA;recordButton.innerText = "Recording... press to stop";&#xA;&#xA;navigator.mediaDevices.getUserMedia({audio: true}).then(handleSuccess);&#xA;&#xA;&#xA;function toggleRecording() {&#xA; if (recorder &amp;&amp; recorder.state == "recording") {&#xA; recorder.stop();&#xA; gumStream.getAudioTracks()[0].stop();&#xA; recordButton.innerText = "Saving the recording... pls wait!"&#xA; }&#xA;}&#xA;&#xA;// https://stackoverflow.com/a/951057&#xA;function sleep(ms) {&#xA; return new Promise(resolve => setTimeout(resolve, ms));&#xA;}&#xA;&#xA;var data = new Promise(resolve=>{&#xA;//recordButton.addEventListener("click", toggleRecording);&#xA;recordButton.onclick = ()=>{&#xA;toggleRecording()&#xA;&#xA;sleep(2000).then(() => {&#xA; // wait 2000ms for the data to be available...&#xA; // ideally this should use something like await...&#xA; //console.log("Inside data:" &#x2B; base64data)&#xA; resolve(base64data.toString())&#xA;&#xA;});&#xA;&#xA;}&#xA;});&#xA; &#xA;</script>

"""

def get_audio() :
 display(HTML(AUDIO_HTML))
 data = eval_js("data")
 binary = b64decode(data.split(',')[1])
 
 process = (ffmpeg
 .input('pipe:0')
 .output('pipe:1', format='wav')
 .run_async(pipe_stdin=True, pipe_stdout=True, pipe_stderr=True, quiet=True, overwrite_output=True)
 )
 output, err = process.communicate(input=binary)
 
 riff_chunk_size = len(output) - 8
 # Break up the chunk size into four bytes, held in b.
 q = riff_chunk_size
 b = []
 for i in range(4) :
 q, r = divmod(q, 256)
 b.append(r)

 # Replace bytes 4:8 in proc.stdout with the actual size of the RIFF chunk.
 riff = output[:4] + bytes(b) + output[8 :]

 sr, audio = wav_read(io.BytesIO(riff))

 return audio, sr

audio, sr = get_audio()


def recordingTranscribe(audio):
 data16 = np.frombuffer(audio)
 return model.stt(data16)



recordingTranscribe(audio)