
Recherche avancée
Médias (1)
-
Rennes Emotion Map 2010-11
19 octobre 2011, par
Mis à jour : Juillet 2013
Langue : français
Type : Texte
Autres articles (40)
-
HTML5 audio and video support
13 avril 2011, parMediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers the Flowplayer flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...) -
Support audio et vidéo HTML5
10 avril 2011MediaSPIP utilise les balises HTML5 video et audio pour la lecture de documents multimedia en profitant des dernières innovations du W3C supportées par les navigateurs modernes.
Pour les navigateurs plus anciens, le lecteur flash Flowplayer est utilisé.
Le lecteur HTML5 utilisé a été spécifiquement créé pour MediaSPIP : il est complètement modifiable graphiquement pour correspondre à un thème choisi.
Ces technologies permettent de distribuer vidéo et son à la fois sur des ordinateurs conventionnels (...) -
De l’upload à la vidéo finale [version standalone]
31 janvier 2010, parLe chemin d’un document audio ou vidéo dans SPIPMotion est divisé en trois étapes distinctes.
Upload et récupération d’informations de la vidéo source
Dans un premier temps, il est nécessaire de créer un article SPIP et de lui joindre le document vidéo "source".
Au moment où ce document est joint à l’article, deux actions supplémentaires au comportement normal sont exécutées : La récupération des informations techniques des flux audio et video du fichier ; La génération d’une vignette : extraction d’une (...)
Sur d’autres sites (7984)
-
ffmpeg : Combine/merge multiple mp4 videos not working, output only contains the first video
17 août 2021, par dtbakerHere is the command I am using to combine multiple videos :



ffmpeg -i 75_540_38HQ2.mp4 -i 76_70_20.mp4 -i 76_173_80.mp4 -i 81_186_35.mp4 -vcodec copy -acodec copy Mux1.mp4



The resulting
Mux1.mp4
does not contain all videos. Only the first video (75_540_38HQ2.mp4
). The file size of the source and resulting video is below (as you can see, resulting video is slightly larger than first vid) :



$ ls -lh
-rw-r—r— 1 dbaker dbaker 42M 2011-03-24 11:59 75_540_38HQ2.mp4
-rw-r—r— 1 dbaker dbaker 236M 2011-03-24 12:09 76_173_80.mp4
-rw-r—r— 1 dbaker dbaker 26M 2011-03-24 12:05 76_70_20.mp4
-rw-r—r— 1 dbaker dbaker 54M 2011-03-24 12:15 81_186_35.mp4
-rw-r—r— 1 dbaker dbaker 44M 2011-03-24 14:48 Mux1.mp4




Here is the output of the
ffmpeg
command. To me it looks ok, showing the multiple source inputs and the single output.



FFmpeg version SVN-r26402, Copyright (c) 2000-2011 the FFmpeg developers
 built on Mar 21 2011 18:05:32 with gcc 4.4.5
 configuration : —enable-gpl —enable-version3 —enable-nonfree —enable-postproc —enable-libfaac —enable-libmp3lame —enable-libopencore-amrnb —enable-libopencore-amrwb —enable-libtheora —enable-libvorbis —enable-libvpx —enable-libx264 —enable-libxvid —enable-x11grab
 libavutil 50.36. 0 / 50.36. 0
 libavcore 0.16. 1 / 0.16. 1
 libavcodec 52.108. 0 / 52.108. 0
 libavformat 52.93. 0 / 52.93. 0
 libavdevice 52. 2. 3 / 52. 2. 3
 libavfilter 1.74. 0 / 1.74. 0
 libswscale 0.12. 0 / 0.12. 0
 libpostproc 51. 2. 0 / 51. 2. 0
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '75_540_38HQ2.mp4' :
 Metadata :
 major_brand : isom
 minor_version : 512
 compatible_brands : isomiso2avc1mp41
 creation_time : 1970-01-01 00:00:00
 encoder : Lavf52.93.0
 Duration : 00:00:29.99, start : 0.000000, bitrate : 11517 kb/s
 Stream #0.0(eng) : Video : h264, yuv420p, 1280x960 [PAR 1:1 DAR 4:3], 11575 kb/s, 29.94 fps, 29.97 tbr, 30k tbn, 59.94 tbc
 Metadata :
 creation_time : 1970-01-01 00:00:00
 Stream #0.1(eng) : Audio : aac, 48000 Hz, stereo, s16, 127 kb/s
 Metadata :
 creation_time : 1970-01-01 00:00:00
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '76_70_20.mp4' :
 Metadata :
 major_brand : isom
 minor_version : 512
 compatible_brands : isomiso2avc1mp41
 creation_time : 1970-01-01 00:00:00
 encoder : Lavf52.93.0
 Duration : 00:00:19.98, start : 0.000000, bitrate : 10901 kb/s
 Stream #1.0(eng) : Video : h264, yuv420p, 1280x960 [PAR 1:1 DAR 4:3], 10804 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc
 Metadata :
 creation_time : 1970-01-01 00:00:00
 Stream #1.1(eng) : Audio : aac, 48000 Hz, stereo, s16, 128 kb/s
 Metadata :
 creation_time : 1970-01-01 00:00:00
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '76_173_80.mp4' :
 Metadata :
 major_brand : isom
 minor_version : 512
 compatible_brands : isomiso2avc1mp41
 creation_time : 1970-01-01 00:00:00
 encoder : Lavf52.93.0
 Duration : 00:03:09.99, start : 0.000000, bitrate : 10393 kb/s
 Stream #2.0(eng) : Video : h264, yuv420p, 1280x960 [PAR 1:1 DAR 4:3], 10321 kb/s, 29.96 fps, 29.97 tbr, 30k tbn, 59.94 tbc
 Metadata :
 creation_time : 1970-01-01 00:00:00
 Stream #2.1(eng) : Audio : aac, 48000 Hz, stereo, s16, 128 kb/s
 Metadata :
 creation_time : 1970-01-01 00:00:00

Seems stream 0 codec frame rate differs from container frame rate : 119.88 (120000/1001) -> 30000.00 (30000/1)
Input #3, mov,mp4,m4a,3gp,3g2,mj2, from '81_186_35.mp4' :
 Metadata :
 major_brand : isom
 minor_version : 512
 compatible_brands : isomiso2avc1mp41
 creation_time : 1970-01-01 00:00:00
 encoder : Lavf52.93.0
 Duration : 00:00:35.00, start : 0.000000, bitrate : 12700 kb/s
 Stream #3.0(eng) : Video : h264, yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 12620 kb/s, 59.91 fps, 30k tbr, 60k tbn, 119.88 tbc
 Metadata :
 creation_time : 1970-01-01 00:00:00
 Stream #3.1(eng) : Audio : aac, 48000 Hz, stereo, s16, 128 kb/s
 Metadata :
 creation_time : 1970-01-01 00:00:00
Output #0, mp4, to 'Mux1.mp4' :
 Metadata :
 major_brand : isom
 minor_version : 512
 compatible_brands : isomiso2avc1mp41
 creation_time : 1970-01-01 00:00:00
 encoder : Lavf52.93.0
 Stream #0.0(eng) : Video : libx264, yuv420p, 1280x960 [PAR 1:1 DAR 4:3], q=2-31, 11575 kb/s, 30k tbn, 29.97 tbc
 Metadata :
 creation_time : 1970-01-01 00:00:00
 Stream #0.1(eng) : Audio : libfaac, 48000 Hz, stereo, 128 kb/s
 Metadata :
 creation_time : 1970-01-01 00:00:00
Stream mapping :
 Stream #0.0 -> #0.0
 Stream #2.1 -> #0.1
Press [q] to stop encoding
frame= 883 fps=632 q=-1.0 Lsize= 44730kB time=29.40 bitrate=12465.1kbits/s 
video:41678kB audio:2969kB global headers:0kB muxing overhead 0.184548%




Am I doing something blindingly stupid here ?



The source videos came from a video camera, and are small snippets taken with
ffmpeg -i bigfile.mp4 -ss 20 -t 10 -vcodec copy etc..



Thanks heaps !!
Dave





Edit : couldn't solve it so I just use avidemux GUI tool. It seemed to append the MP4's just fine.



Must be a problem with MP4's or just the ones that come off a gopro camera.


-
How to use Google's Cloud Speech-to-Text REST API to transcribe a video
24 juillet 2018, par mrbI’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API
Approach :
I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.
To save a little on my Google Cloud Storage I converted to video to audio first by using
mmpeg
.First I’d tried to figure out the audio codec by using the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011So I took that from the video by using :
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
Details so far :
ffmpeg -i myaudio.aac
Outputs :Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/sAfter that I converted it to opus because I’m told that opus is better
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far :
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/sI this point I uploaded the
myaudio.opus
to the Google Cloud Storage.curl POST 1
I started the speech recognition by doing a POST withcurl
:curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "123456789"}
123456789 was not the actual value.curl GET 1
Now I wanted to have the results :curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error :
Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from
OGG_OPUS
toLINEAR16
.curl POST 2
Did the post again :curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response :
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}The problem is that I don’t get the actual transcription. According the the documentation there should be a
response
key in the response containing the data.Since I’m kinda stuck here I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitation so all suggestions are very welcome ! Also happy to change my approach.
Thanks in advance ! Cheers
-
How to use Google's Cloud Speech-to-Text API to transcribe a video using the REST API
8 juin 2018, par mrbI’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API
Approach :
I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.
To save a little on my Google Cloud Storage I converted to video to audio first by using
mmpeg
.First I’d tried to figure out the audio codec by using the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011So I took that from the video by using :
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
Details so far :
ffmpeg -i myaudio.aac
Outputs :Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/sAfter that I converted it to opus because I’m told that opus is better
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far :
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/sI this point I uploaded the
myaudio.opus
to the Google Cloud Storage.curl POST 1
I started the speech recognition by doing a POST withcurl
:curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "123456789"}
123456789 was not the actual value.curl GET 1
Now I wanted to have the results :curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error :
Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from
OGG_OPUS
toLINEAR16
.curl POST 2
Did the post again :curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response :
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}The problem is that I don’t get the actual transcription. According the the documentation there should be a
response
key in the response containing the data.Since I’m kinda stuck here I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitation so all suggestions are very welcome ! Also happy to change my approach.
Thanks in advance ! Cheers