
Media (29)
-
#7 Ambience
16 October 2011
Updated: June 2015
Language: English
Type: Audio
-
#6 Teaser Music
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#5 End Title
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#3 The Safest Place
16 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#4 Emo Creates
15 October 2011
Updated: February 2013
Language: English
Type: Audio
-
#2 Typewriter Dance
15 October 2011
Updated: February 2013
Language: English
Type: Audio
Other articles (71)
-
Authorizations overridden by plugins
27 April 2010, by Mediaspip core
autoriser_auteur_modifier() so that visitors are able to edit their information on the authors page -
Apache-specific configuration
4 February 2011
Specific modules
For the Apache configuration, it is recommended to enable certain modules that are not specific to MediaSPIP but that improve performance: mod_deflate and mod_headers, so that Apache compresses pages automatically (see this tutorial); mod_expires, to handle the expiration of hits correctly (see this tutorial).
It is also recommended to add Apache support for the WebM MIME type, as described in this tutorial (a command sketch follows after this list).
Creating a (...) -
Supporting all media types
13 April 2011
Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats: images: png, gif, jpg, bmp and more; audio: MP3, Ogg, Wav and more; video: AVI, MP4, OGV, mpg, mov, wmv and more; text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)
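A minimal command sketch for the Apache article above, assuming a Debian/Ubuntu Apache layout (the a2enmod/a2enconf helpers and the mediaspip-webm.conf file name are assumptions, not from the article):
# Enable the performance modules mentioned in the article, declare the WebM
# MIME type, then reload Apache to pick up the changes.
a2enmod deflate headers expires
echo "AddType video/webm .webm" > /etc/apache2/conf-available/mediaspip-webm.conf
a2enconf mediaspip-webm
service apache2 reload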
On other sites (10612)
-
Converting a call center recording to something useful
9 August 2018, by Abhay
I have a call center recording (it sounds like gibberish when played back), for which mediainfo shows:
ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav
General
Complete name : 48401-3405-48403--18042018170000.wav
Format : Wave
File size : 327 KiB
Duration : 4mn 11s
Overall bit rate : 10.7 Kbps
Audio
Format : G.723.1
Codec ID : A100
Duration : 4mn 11s
Bit rate : 10.7 Kbps
Channel(s) : 2 channels
Sampling rate : 8 000 Hz
Stream size : 327 KiB (100%)
The ffmpeg info shows this as:
ion@aurora:~/Inbound$ ffmpeg -i 48401-3405-48403--18042018170000.wav
ffmpeg version N-91330-ga990184 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
configuration: --prefix=/home/ion/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ion/ffmpeg_build/include --extra-ldflags=-L/home/ion/ffmpeg_build/lib --extra-libs='-lpthread -lm' --bindir=/home/ion/bin --enable-gpl --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
libavutil 56. 18.102 / 56. 18.102
libavcodec 58. 20.103 / 58. 20.103
libavformat 58. 17.100 / 58. 17.100
libavdevice 58. 4.101 / 58. 4.101
libavfilter 7. 25.100 / 7. 25.100
libswscale 5. 2.100 / 5. 2.100
libswresample 3. 2.100 / 3. 2.100
libpostproc 55. 2.100 / 55. 2.100
Input #0, wav, from '48401-3405-48403--18042018170000.wav':
Duration: 00:04:11.37, bitrate: 10 kb/s
Stream #0:0: Audio: g723_1 ([0][161][0][0] / 0xA100), 8000 Hz, mono, s16, 10 kb/s
At least one output file must be specified
So I converted this file to PCM using:
ffmpeg -acodec g723_1 -i 48401-3405-48403--18042018170000.wav -acodec pcm_s16le -f wav outnew1.wav
But the audio still sounds like gibberish. I tried many variations; only GoldWave worked, but that runs on Windows with a GUI, not from the command line.
So how can I convert this file into something useful, so that at least I can listen to it? It feels like a challenge now.
Audio file: https://drive.google.com/open?id=1T54lKaI6IJmOqTPNOA_OkYRz89EQ5F2L
PS: Use VLC to play the audio file.
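One more thing worth trying (a sketch, not a verified fix; the 44-byte header offset is an assumption about this particular recorder): mediainfo reports 2 channels while ffmpeg sees mono, so the WAV header itself may be suspect. Strip it and decode the payload as raw G.723.1 instead:
# Drop the (canonical, 44-byte) WAV header; adjust skip= if yours is longer.
dd if=48401-3405-48403--18042018170000.wav of=payload.g723 bs=1 skip=44
# Force ffmpeg's raw G.723.1 demuxer on the bare payload and decode to PCM.
ffmpeg -f g723_1 -i payload.g723 -acodec pcm_s16le outraw.wav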
-
combine multiple mp4 videos and images
8 June 2018, by wdsfds
So I have a folder of images (1/20) named *.png and a folder of mp4's named *.mp4. I want to create a video in this order:
1.png for 3 sec
1.mp4
2.png for 3 sec
3.mp4
- etc
Is there a way I can display each png for 3 seconds and then show the respective mp4 using ffmpeg? I know I can convert each picture to a 3-second video individually using the command below, and the framerate differences will be a problem (1/3 vs 60), but I'm not very experienced with command-line video editing:
ffmpeg -r 1/3 -i 1.png -vcodec mpeg4 1_intro.mp4
ffprobe output:
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.71.100
Duration: 00:00:32.69, start: 0.000000, bitrate: 7039 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 7004 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 43 kb/s (default)
Metadata:
handler_name : SoundHandler
ffprobe output of out.mp4:
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:10:19.13, start: 0.000000, bitrate: 5689 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 2560x1440 [SAR 1:1 DAR 16:9], 5565 kb/s, 53.91 fps, 60 tbr, 90k tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 114 kb/s (default)
Metadata:
handler_name : SoundHandler
-
How to use Google's Cloud Speech-to-Text API to transcribe a video using the REST API
8 June 2018, by mrb
I'd like to have a transcript of the two people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API.
Approach:
I have a 56-minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google's Cloud Speech-to-Text API to get it.
To save a little on my Google Cloud Storage, I converted the video to audio first using ffmpeg.
First I tried to figure out the audio codec using the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011
So I took the audio from the video using:
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
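A quick aside (a sketch; the .m4a output name is my choice): stream-copying AAC into a bare ADTS .aac file drops the container timestamps, so ffmpeg has to estimate the duration from the bitrate, which is why the two durations below disagree (56:03.99 vs 56:47.49). Keeping an MP4-family container avoids that:
# Same stream copy, but into .m4a so the original timing metadata survives.
ffmpeg -i video.mp4 -vn -acodec copy myaudio.m4a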
Details so far:
ffmpeg -i myaudio.aac
Outputs:
Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s
After that I converted it to Opus, because I'm told that Opus is better:
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far:
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s
At this point I uploaded myaudio.opus to Google Cloud Storage.
curl POST 1
I started the speech recognition by doing a POST with curl:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "123456789"}
(123456789 was not the actual value.)
curl GET 1
Now I wanted to have the results:
curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error:
Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from OGG_OPUS to LINEAR16.
curl POST 2
Did the POST again:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response:
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}
The problem is that I don't get the actual transcription. According to the documentation there should be a response key in the response containing the data.
Since I'm kinda stuck here, I'd like to know if I'm doing something completely wrong. I don't have any technical or resource limitations, so all suggestions are very welcome! I'm also happy to change my approach.
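One thing I might try next (a sketch, assuming the empty result comes from the stereo channel layout; LINEAR16 only fits raw PCM/WAV, so keeping OGG_OPUS and fixing the channels seems more promising; mono.opus is a made-up name):
# Downmix to one channel, since the v1 API expects mono audio by default,
# then re-upload and repeat the POST with the matching OGG_OPUS config.
ffmpeg -i myaudio.aac -ac 1 -acodec libopus -b:a 64k mono.opus
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/mono.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'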
Thanks in advance! Cheers