
Media (91)
-
Chuck D with Fine Arts Militia - No Meaning No
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Paul Westerberg - Looking Up in Heaven
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Le Tigre - Fake French
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Thievery Corporation - DC 3000
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Dan the Automator - Relaxation Spa Treatment
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Gilberto Gil - Oslodum
15 September 2011
Updated: September 2011
Language: English
Type: Audio
Other articles (51)
-
Contribute to a better visual interface
13 April 2011
MediaSPIP is based on a system of themes and templates. Templates define the placement of information on the page and can be adapted to a wide range of uses. Themes define the overall graphic appearance of the site.
Anyone can submit a new graphic theme or template and make it available to the MediaSPIP community.
-
Libraries and binaries specific to video and audio processing
31 January 2010
The following software and libraries are used by SPIPmotion in one way or another.
Required binaries: FFMpeg: the main encoder, able to transcode almost all types of video and audio files into formats readable on the Internet (see this tutorial for its installation); Oggz-tools: tools for inspecting ogg files; Mediainfo: retrieves information from most video and audio formats;
Complementary, optional binaries: flvtool2: (...)
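As a rough illustration of how these binaries are typically driven (a sketch, not taken from the article; the file names are hypothetical):
# FFMpeg: transcode a source video into web-readable formats
ffmpeg -i source.avi -c:v libx264 -c:a aac output.mp4
ffmpeg -i source.avi -c:v libtheora -c:a libvorbis output.ogv
# Mediainfo: read back stream information from the result
mediainfo output.mp4
# Oggz-tools: inspect the ogg variant
oggz-info output.ogv
-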
HTML5 audio and video support
10 April 2011
MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flowplayer Flash player is used instead.
The HTML5 player was created specifically for MediaSPIP: its appearance is fully customizable to match a chosen theme.
These technologies make it possible to deliver video and sound both to conventional computers (...)
On other sites (10229)
-
How to make an MPEG-DASH MPD which starts playback in the middle of the first segment?
18 September 2018, by ravin.wang
Here are the steps to reproduce:
-
Normalize an H.264 video stream
ffmpeg -i 2.h264 -c:v libx264 -intra -r 25 -vf scale=640x360,setdar=16:9 2@25fps@intra@640x360.h264
After that, I have an H.264 stream in which all pictures are IDR frames, the fps is 25, the resolution is 640x360, and the aspect ratio is 16:9.
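(As a quick sanity check, not part of the original steps: ffprobe can list the picture types, and every frame should report pict_type=I.)
ffprobe -show_frames -select_streams v 2@25fps@intra@640x360.h264 | grep pict_type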
-
Generate an MP4 file
MP4Box -add 2@25fps@intra@640x360.h264:timescale=1000 -fps 25 2@25fps@intra@640x360.mp4
-
Make fragmented DASH content, including the init MP4, the .m4s files, and one .mpd file
MP4Box -dash 5000 -frag 5000 -dash-scale 1000 -frag-rap -segment-name 'seg_second$Number$' -segment-timeline -profile live 2@25fps@intra@640x360.mp4
- Copy and publish all these files to a folder on an HTTP server
-
I want playback to start from 4 s into the first segment, without displaying any frames before 4 s, so I changed the .mpd file to modify the fields "SegmentTemplate@presentationTimeOffset" and "SegmentTimeline:S@d/t", as follows:
<?xml version="1.0"?>
<mpd xmlns="urn:mpeg:dash:schema:mpd:2011" minbuffertime="PT1.500S" type="static" mediapresentationduration="PT0H0M26.000S" maxsegmentduration="PT0H0M5.000S" profiles="urn:mpeg:dash:profile:isoff-live:2011">
<period duration="PT0H0M26.000S">
<adaptationset segmentalignment="true" maxwidth="640" maxheight="360" maxframerate="25" par="16:9" lang="und">
<segmenttemplate presentationtimeoffset="4000" media="seg_second$Number$.m4s" timescale="1000" startnumber="1" initialization="seg_secondinit.mp4">
<segmenttimeline>
<s d="1000" t="4000"></s>
<s d="5000" r="4"></s>
</segmenttimeline>
</segmenttemplate>
<representation mimetype="video/mp4" codecs="avc3.64101E" width="640" height="360" framerate="25" sar="1:1" startwithsap="1" bandwidth="2261831">
</representation>
</adaptationset>
</period>
</mpd> -
Playing the MPD URL in the VLC player or the Edge browser, playback always starts at the first frame of the first segment, and the frames between 0 s and 4 s are displayed unexpectedly.
What's wrong with my steps? Or are there other options for this?
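(A sketch of one alternative, added as an assumption rather than taken from the post: if the frames before 4 s must never be shown, they can be trimmed away before packaging, so that the first segment itself starts at 4 s. Stream copy should be safe here because every frame is an IDR frame; file names follow the steps above.)
ffmpeg -ss 4 -i 2@25fps@intra@640x360.h264 -c copy trimmed.h264
MP4Box -add trimmed.h264:timescale=1000 -fps 25 trimmed.mp4
MP4Box -dash 5000 -frag 5000 -dash-scale 1000 -frag-rap -segment-name 'seg_second$Number$' -segment-timeline -profile live trimmed.mp4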
-
MPD MPEG-DASH - Shows only one bitrate
13 August 2018, by Justin Rec
Help. It won't show the bitrates.
player.getBitrateInfoListFor("video");
Shows only one bitrate: 454948.
manifest.mpd generated by GPAC:
<Period duration="PT0H21M48.338S">
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="270" maxHeight="480" maxFrameRate="2070000/93437" par="270:480" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.640015" width="270" height="480" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="454948">
   <SegmentTemplate media="480_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="480_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="202" maxHeight="360" maxFrameRate="2070000/93437" par="202:360" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.64000D" width="202" height="360" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="281508">
   <SegmentTemplate media="360_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="360_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="134" maxHeight="240" maxFrameRate="2070000/93437" par="134:240" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.64000B" width="134" height="240" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="182832">
   <SegmentTemplate media="240_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="240_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="80" maxHeight="144" maxFrameRate="2070000/93437" par="80:144" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.640009" width="80" height="144" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="99667">
   <SegmentTemplate media="144_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="144_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" lang="und">
  <Representation mimeType="audio/mp4" codecs="mp4a.40.2" startWithSAP="1" bandwidth="66056">
   <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="1"/>
   <SegmentTemplate media="audio_bbb/segment__track2_$Number$.m4s" timescale="48000" startNumber="1" duration="192000" initialization="audio_bbb/segment__track2_init.mp4"/>
  </Representation>
 </AdaptationSet>
</Period>
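(An observation added as an assumption, not from the original post: dash.js reports bitrates for the AdaptationSet it is currently playing, and in this manifest every video Representation sits alone in its own AdaptationSet, so only one bitrate is visible at a time. A quick way to check how many Representations share the first AdaptationSet:)
xmllint --xpath 'count((//*[local-name()="AdaptationSet"])[1]/*[local-name()="Representation"])' manifest.mpd
-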
How to use Google's Cloud Speech-to-Text REST API to transcribe a video
24 July 2018, by mrb
I'd like to get a transcript of two people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API.
Approach:
I have a 56-minute video file containing a conversation between two people. I would like a transcript of that conversation, and I would like to use Google's Cloud Speech-to-Text API to get it.
To save a little on my Google Cloud Storage I first converted the video to audio using ffmpeg.
First I tried to figure out the audio codec with the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011
So I extracted the audio from the video with:
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
Details so far:
ffmpeg -i myaudio.aac
Output:
Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s
After that I converted it to Opus, because I'm told that Opus is better:
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far:
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s
At this point I uploaded myaudio.opus to Google Cloud Storage.
curl POST 1
I started the speech recognition by doing a POST with curl:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "123456789"}
(123456789 is not the actual value.)
curl GET 1
Now I wanted to get the results:
curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error:
Error: Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from OGG_OPUS to LINEAR16.
curl POST 2
Did the POST again:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response:
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}
The problem is that I don't get the actual transcription. According to the documentation there should be a response key in the response containing the data.
Since I'm kinda stuck here I'd like to know if I'm doing something completely wrong. I don't have any technical or resource limitations, so all suggestions are very welcome! I'm also happy to change my approach.
Thanks in advance! Cheers
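(A sketch of one thing worth trying, added as an assumption rather than taken from the post: declaring "LINEAR16" for a file that actually contains Ogg/Opus cannot match the bytes in the bucket, so either keep "OGG_OPUS" with "sampleRateHertz": 48000, or transcode to a lossless format the API also accepts, such as FLAC. File names follow the ones above.)
ffmpeg -i myaudio.aac -ac 1 -ar 16000 myaudio.flac
Then point the request at gs://{MY_BUCKET}/myaudio.flac with "encoding": "FLAC" and "sampleRateHertz": 16000.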