
Recherche avancée
Médias (29)
-
#7 Ambience
16 octobre 2011, par
Mis à jour : Juin 2015
Langue : English
Type : Audio
-
#6 Teaser Music
16 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Audio
-
#5 End Title
16 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Audio
-
#3 The Safest Place
16 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Audio
-
#4 Emo Creates
15 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Audio
-
#2 Typewriter Dance
15 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Audio
Autres articles (56)
-
Personnaliser les catégories
21 juin 2013, parFormulaire de création d’une catégorie
Pour ceux qui connaissent bien SPIP, une catégorie peut être assimilée à une rubrique.
Dans le cas d’un document de type catégorie, les champs proposés par défaut sont : Texte
On peut modifier ce formulaire dans la partie :
Administration > Configuration des masques de formulaire.
Dans le cas d’un document de type média, les champs non affichés par défaut sont : Descriptif rapide
Par ailleurs, c’est dans cette partie configuration qu’on peut indiquer le (...) -
Publier sur MédiaSpip
13 juin 2013Puis-je poster des contenus à partir d’une tablette Ipad ?
Oui, si votre Médiaspip installé est à la version 0.2 ou supérieure. Contacter au besoin l’administrateur de votre MédiaSpip pour le savoir -
HTML5 audio and video support
13 avril 2011, parMediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers the Flowplayer flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...)
Sur d’autres sites (7202)
-
How to transcribe the recording for speech recognization
29 mai 2021, par DLimAfter downloading and uploading files related to the mozilla deeepspeech, I started using google colab. I am using mozilla/deepspeech for speech recognization. The code shown below is for recording my audio. After recording the audio, I want to use a function/method to transcribe the recording into text. Everything compiles, but the text does not come out correctly. Any thoughts in my code ?


"""
To write this piece of code I took inspiration/code from a lot of places.
It was late night, so I'm not sure how much I created or just copied o.O
Here are some of the possible references:
https://blog.addpipe.com/recording-audio-in-the-browser-using-pure-html5-and-minimal-javascript/
https://stackoverflow.com/a/18650249
https://hacks.mozilla.org/2014/06/easy-audio-capture-with-the-mediarecorder-api/
https://air.ghost.io/recording-to-an-audio-file-using-html5-and-js/
https://stackoverflow.com/a/49019356
"""
from google.colab.output import eval_js
from base64 import b64decode
from scipy.io.wavfile import read as wav_read
import io
import ffmpeg

AUDIO_HTML = """
<code class="echappe-js"><script>&#xA;var my_div = document.createElement("DIV");&#xA;var my_p = document.createElement("P");&#xA;var my_btn = document.createElement("BUTTON");&#xA;var t = document.createTextNode("Press to start recording");&#xA;&#xA;my_btn.appendChild(t);&#xA;//my_p.appendChild(my_btn);&#xA;my_div.appendChild(my_btn);&#xA;document.body.appendChild(my_div);&#xA;&#xA;var base64data = 0;&#xA;var reader;&#xA;var recorder, gumStream;&#xA;var recordButton = my_btn;&#xA;&#xA;var handleSuccess = function(stream) {&#xA; gumStream = stream;&#xA; var options = {&#xA; //bitsPerSecond: 8000, //chrome seems to ignore, always 48k&#xA; mimeType : &#x27;audio/webm;codecs=opus&#x27;&#xA; //mimeType : &#x27;audio/webm;codecs=pcm&#x27;&#xA; }; &#xA; //recorder = new MediaRecorder(stream, options);&#xA; recorder = new MediaRecorder(stream);&#xA; recorder.ondataavailable = function(e) { &#xA; var url = URL.createObjectURL(e.data);&#xA; var preview = document.createElement(&#x27;audio&#x27;);&#xA; preview.controls = true;&#xA; preview.src = url;&#xA; document.body.appendChild(preview);&#xA;&#xA; reader = new FileReader();&#xA; reader.readAsDataURL(e.data); &#xA; reader.onloadend = function() {&#xA; base64data = reader.result;&#xA; //console.log("Inside FileReader:" &#x2B; base64data);&#xA; }&#xA; };&#xA; recorder.start();&#xA; };&#xA;&#xA;recordButton.innerText = "Recording... press to stop";&#xA;&#xA;navigator.mediaDevices.getUserMedia({audio: true}).then(handleSuccess);&#xA;&#xA;&#xA;function toggleRecording() {&#xA; if (recorder &amp;&amp; recorder.state == "recording") {&#xA; recorder.stop();&#xA; gumStream.getAudioTracks()[0].stop();&#xA; recordButton.innerText = "Saving the recording... pls wait!"&#xA; }&#xA;}&#xA;&#xA;// https://stackoverflow.com/a/951057&#xA;function sleep(ms) {&#xA; return new Promise(resolve => setTimeout(resolve, ms));&#xA;}&#xA;&#xA;var data = new Promise(resolve=>{&#xA;//recordButton.addEventListener("click", toggleRecording);&#xA;recordButton.onclick = ()=>{&#xA;toggleRecording()&#xA;&#xA;sleep(2000).then(() => {&#xA; // wait 2000ms for the data to be available...&#xA; // ideally this should use something like await...&#xA; //console.log("Inside data:" &#x2B; base64data)&#xA; resolve(base64data.toString())&#xA;&#xA;});&#xA;&#xA;}&#xA;});&#xA; &#xA;</script>

"""

def get_audio() :
 display(HTML(AUDIO_HTML))
 data = eval_js("data")
 binary = b64decode(data.split(',')[1])
 
 process = (ffmpeg
 .input('pipe:0')
 .output('pipe:1', format='wav')
 .run_async(pipe_stdin=True, pipe_stdout=True, pipe_stderr=True, quiet=True, overwrite_output=True)
 )
 output, err = process.communicate(input=binary)
 
 riff_chunk_size = len(output) - 8
 # Break up the chunk size into four bytes, held in b.
 q = riff_chunk_size
 b = []
 for i in range(4) :
 q, r = divmod(q, 256)
 b.append(r)

 # Replace bytes 4:8 in proc.stdout with the actual size of the RIFF chunk.
 riff = output[:4] + bytes(b) + output[8 :]

 sr, audio = wav_read(io.BytesIO(riff))

 return audio, sr

audio, sr = get_audio()


def recordingTranscribe(audio):
 data16 = np.frombuffer(audio)
 return model.stt(data16)



recordingTranscribe(audio)



-
ffmpeg API : handle frame loss in hevc encoding
8 janvier 2024, par MarioEverything works fine until the introduction of frame->pts increment due to frame loss.


Below is the regular progression without frame->pts increments :




frame->pts=8 pkt->pts=512 pkt->dts=-512 pkt->flags=1

frame->pts=9 pkt->pts=2560 pkt->dts=0 pkt->flags=0

frame->pts=10 pkt->pts=1536 pkt->dts=512 pkt->flags=0

frame->pts=11 pkt->pts=1024 pkt->dts=1024 pkt->flags=0

frame->pts=12 pkt->pts=2048 pkt->dts=1536 pkt->flags=0

frame->pts=13 pkt->pts=4608 pkt->dts=2048 pkt->flags=0

frame->pts=14 pkt->pts=3584 pkt->dts=2560 pkt->flags=0

frame->pts=15 pkt->pts=3072 pkt->dts=3072 pkt->flags=0

frame->pts=16 pkt->pts=4096 pkt->dts=3584 pkt->flags=0

frame->pts=17 pkt->pts=6656 pkt->dts=4096 pkt->flags=0

frame->pts=18 pkt->pts=5632 pkt->dts=4608 pkt->flags=0



When I introduce the frame->pts increment it happens :




frame->pts=15 pkt->pts=512 pkt->dts=-512 pkt->flags=1

frame->pts=17 pkt->pts=4608 pkt->dts=2048 pkt->flags=0

frame->pts=19 pkt->pts=2560 pkt->dts=1536 pkt->flags=0

[mp4 @ 0x7eff842222c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0 : 2048 >= 1536



So I wrote the following code as a "quick" solution (between av_packet_rescale_ts() and av_interleaved_write_frame()) :


av_packet_rescale_ts(pkt, c->time_base, st->time_base); 
 ...
 if (pkt->dts<=previous_dts) 
 { 
 if (pkt->pts<=previous_pts) 
 { 
 pkt->pts=previous_dts+1+pkt->pts-pkt->dts; 
 } 
 pkt->dts=previous_dts+1; 
 } 
 previous_dts=pkt->dts; 
 previous_pts=pkt->pts; 
 ...
 ret = av_interleaved_write_frame(fmt_ctx, pkt); 



Now I no longer have the error, but the values are :




frame->pts=15 pkt->pts=512 pkt->dts=-512 pkt->flags=1

changed frame->pts=15 pkt->pts=512 pkt->dts=1 pkt->flags=1

frame->pts=17 pkt->pts=4608 pkt->dts=2048 pkt->flags=0

frame->pts=19 pkt->pts=2560 pkt->dts=1536 pkt->flags=0

changed frame->pts=19 pkt->pts=3073 pkt->dts=2049 pkt->flags=0

frame->pts=21 pkt->pts=1536 pkt->dts=1536 pkt->flags=0

changed frame->pts=21 pkt->pts=2050 pkt->dts=2050 pkt->flags=0

frame->pts=23 pkt->pts=4096 pkt->dts=3584 pkt->flags=0

frame->pts=25 pkt->pts=8704 pkt->dts=6144 pkt->flags=0

frame->pts=27 pkt->pts=6656 pkt->dts=5632 pkt->flags=0

changed frame->pts=27 pkt->pts=7169 pkt->dts=6145 pkt->flags=0

frame->pts=29 pkt->pts=5632 pkt->dts=5632 pkt->flags=0

changed frame->pts=29 pkt->pts=6146 pkt->dts=6146 pkt->flags=0

frame->pts=31 pkt->pts=7680 pkt->dts=7168 pkt->flags=0

frame->pts=33 pkt->pts=12800 pkt->dts=10240 pkt->flags=0

frame->pts=35 pkt->pts=10752 pkt->dts=9728 pkt->flags=0

changed frame->pts=35 pkt->pts=11265 pkt->dts=10241 pkt->flags=0

frame->pts=37 pkt->pts=9728 pkt->dts=9728 pkt->flags=0

changed frame->pts=37 pkt->pts=10242 pkt->dts=10242 pkt->flags=0



What is the correct way to handle frame loss scenario ?
Is there a way to inform the encoder about frame loss ?


The encoder is "hevc_qsv" and the output format is mov (.mp4).


-
FFMPEG error when saving NDI stream to mp4
22 septembre 2020, par user1163234I am trying to record a NDI stream to a MP4 file(I want to stream the mp4 to rtmp endpoint after saving file). However I am getting this error when running this class. https://github.com/WalkerKnapp/devolay/blob/master/examples/src/main/java/com/walker/devolayexamples/recording/RecordingExample.java


Error :


Connecting to source: DESKTOP-GQNH46Q (Ari PC output)
[file @ 0x7fb4f2d48540] Setting default whitelist 'file,crypto'
x265 [info]: HEVC encoder version 0.0
x265 [info]: build info [Mac OS X][clang 8.1.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3.1 (Main tier)
x265 [info]: Thread pool created using 8 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 1 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 1 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing lslices=4 deblock sao
[SWR @ 0x7fb4f8893000] Using fltp internally between filters
[mp4 @ 0x7fb4f3820200] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 135 >= 107
Failed to write video flush packet, skipping: Invalid argument
configurationVersion: 1
general_profile_space: 0
general_tier_flag: 0
general_profile_idc: 1
general_profile_compatibility_flags: 0x60000000
general_constraint_indicator_flags: 0x900000000000
general_level_idc: 93
min_spatial_segmentation_idc: 0
parallelismType: 0
chromaFormat: 1
bitDepthLumaMinus8: 0
bitDepthChromaMinus8: 0
avgFrameRate: 0
constantFrameRate: 0
numTemporalLayers: 1
temporalIdNested: 1
lengthSizeMinusOne: 3
numOfArrays: 4
array_completeness[0]: 0
NAL_unit_type[0]: 32
numNalus[0]: 1
nalUnitLength[0][0]: 24
array_completeness[1]: 0
NAL_unit_type[1]: 33
numNalus[1]: 1
nalUnitLength[1][0]: 41
array_completeness[2]: 0
NAL_unit_type[2]: 34
numNalus[2]: 1
nalUnitLength[2][0]: 7
array_completeness[3]: 0
NAL_unit_type[3]: 39
numNalus[3]: 1
nalUnitLength[3][0]: 2050
[AVIOContext @ 0x7fb4f2d48640] Statistics: 2 seeks, 4 writeouts
x265 [info]: frame I: 1, Avg QP:14.03 kb/s: 16.33 
x265 [info]: frame P: 36, Avg QP:21.67 kb/s: 0.04 
x265 [info]: frame B: 73, Avg QP:24.22 kb/s: 0.03 
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 28.9% 13.2% 18.4% 2.6% 36.8% 

encoded 110 frames in 15.13s (7.27 fps), 0.18 kb/s, Avg QP:23.29
[aac @ 0x7fb4f6aa6200] Qavg: 65536.000