
Recherche avancée
Médias (39)
-
Stereo master soundtrack
17 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Audio
-
ED-ME-5 1-DVD
11 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Audio
-
1,000,000
27 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
-
Demon Seed
26 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
-
The Four of Us are Dying
26 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
-
Corona Radiata
26 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
Autres articles (42)
-
La sauvegarde automatique de canaux SPIP
1er avril 2010, parDans le cadre de la mise en place d’une plateforme ouverte, il est important pour les hébergeurs de pouvoir disposer de sauvegardes assez régulières pour parer à tout problème éventuel.
Pour réaliser cette tâche on se base sur deux plugins SPIP : Saveauto qui permet une sauvegarde régulière de la base de donnée sous la forme d’un dump mysql (utilisable dans phpmyadmin) mes_fichiers_2 qui permet de réaliser une archive au format zip des données importantes du site (les documents, les éléments (...) -
Script d’installation automatique de MediaSPIP
25 avril 2011, parAfin de palier aux difficultés d’installation dues principalement aux dépendances logicielles coté serveur, un script d’installation "tout en un" en bash a été créé afin de faciliter cette étape sur un serveur doté d’une distribution Linux compatible.
Vous devez bénéficier d’un accès SSH à votre serveur et d’un compte "root" afin de l’utiliser, ce qui permettra d’installer les dépendances. Contactez votre hébergeur si vous ne disposez pas de cela.
La documentation de l’utilisation du script d’installation (...) -
Automated installation script of MediaSPIP
25 avril 2011, parTo overcome the difficulties mainly due to the installation of server side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
You must have access to your server via SSH and a root account to use it, which will install the dependencies. Contact your provider if you do not have that.
The documentation of the use of this installation script is available here.
The code of this (...)
Sur d’autres sites (4556)
-
Converting a call center recording to something useful
9 août 2018, par AbhayI have a call center recording (when played it sounds gibberish) for which the mediainfo shows info as
ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav
General
Complete name : 48401-3405-48403--18042018170000.wav
Format : Wave
File size : 327 KiB
Duration : 4mn 11s
Overall bit rate : 10.7 Kbps
Audio
Format : G.723.1
Codec ID : A100
Duration : 4mn 11s
Bit rate : 10.7 Kbps
Channel(s) : 2 channels
Sampling rate : 8 000 Hz
Stream size : 327 KiB (100%)The ffmpeg info shows this as
ion@aurora:~/Inbound$ ffmpeg -i 48401-3405-48403--18042018170000.wav
ffmpeg version N-91330-ga990184 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
configuration: --prefix=/home/ion/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ion/ffmpeg_build/include --extra-ldflags=-L/home/ion/ffmpeg_build/lib --extra-libs='-lpthread -lm' --bindir=/home/ion/bin --enable-gpl --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
libavutil 56. 18.102 / 56. 18.102
libavcodec 58. 20.103 / 58. 20.103
libavformat 58. 17.100 / 58. 17.100
libavdevice 58. 4.101 / 58. 4.101
libavfilter 7. 25.100 / 7. 25.100
libswscale 5. 2.100 / 5. 2.100
libswresample 3. 2.100 / 3. 2.100
libpostproc 55. 2.100 / 55. 2.100
Input #0, wav, from '48401-3405-48403--18042018170000.wav':
Duration: 00:04:11.37, bitrate: 10 kb/s
Stream #0:0: Audio: g723_1 ([0][161][0][0] / 0xA100), 8000 Hz, mono, s16, 10 kb/s
At least one output file must be specifiedSo I converted this file to PCM using
ffmpeg -acodec g723_1 -i 48401-3405-48403--18042018170000.wav -acodec pcm_s16le -f wav outnew1.wav
But the audio still sound gibberish , I tried many variation and only Goldwave worked but that works on windows and with GUI not cli.
So how can I convert this file to something useful so that atleast I can listen to it , It feels like a challenge now.
Audio file : https://drive.google.com/open?id=1T54lKaI6IJmOqTPNOA_OkYRz89EQ5F2L
PS : Use VLC to play audio file
-
ffmpeg combining mp4 videos incorrectly
9 juin 2018, par wdsfdsi have a list of 20 mp4 files that I need to combine into one video. They are all 60 FPS, but range in size. In between each mp4 file I need to show its ’intro’, a 1.5 second clip about the file.
Reference :
pics/i.png
- intro for the nth mp4 fileclips/i_intro.mp4
- mp4 intro generated from i.pngclips/i.mp4
- original mp4clips/i_resize.mp4
- 720p mp4
Here is what I’ve done :
Convert image to mp4 using this command (taken from this post) :
ffmpeg -y -i pics/{i}.png -f lavfi -i anullsrc -r 60 -pix_fmt yuv420p -ac 2 -ar 44100 -video_track_timescale 90k -t 1.5 clips/{i}_intro.mp4
Convert
i.mp4
to 720p using this command (taken from this post) :ffmpeg -i clips/{i}.mp4 -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k -vf scale=-2:720,format=yuv420p {i}_resize.mp4
Created a file like this :
file 1_intro.mp4
file 1_resize.mp4
file 2_intro.mp4
file 2_resize.mp4
...Combine all clips :
ffmpeg -f concat -i {video_list} -c copy out.mp4
This generates a file, but it is definitely messed up. When I combined the files I got a ton of warning messages like this :
[mp4 @ 0x7f88fe01de00] Non-monotonous DTS in output stream 0:0;
previous: 48966232, current: 8826932; changing to 48966233. This may
result in incorrect timestamps in the output file.FFprobe for
out.mp4
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:09:35.56, start: 0.000000, bitrate: 4119 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 4221 kb/s, 59.98 fps, 351.56 tbr, 90k tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 122 kb/s (default)
Metadata:
handler_name : SoundHandlerFFprobe for
1.mp4
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.71.100
Duration: 00:00:40.94, start: 0.000000, bitrate: 7613 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 7451 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 160 kb/s (default)
Metadata:
handler_name : SoundHandlerFFprobe for
1_resized.mp4
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1_resize.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:00:32.37, start: 0.000000, bitrate: 3266 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 3127 kb/s, 60 fps, 60 tbr, 15360 tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 129 kb/s (default)
Metadata:
handler_name : SoundHandlerFFprobe for
1_intro.mp4
ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1_intro.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:00:01.52, start: 0.000000, bitrate: 113 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 4667 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 2 kb/s (default)
Metadata:
handler_name : SoundHandlerI noticed that the length of the files are different, and the
tbn
in different, but idk what that means. Thanks for helping ! -
How to use Google's Cloud Speech-to-Text REST API to transcribe a video
24 juillet 2018, par mrbI’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API
Approach :
I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.
To save a little on my Google Cloud Storage I converted to video to audio first by using
mmpeg
.First I’d tried to figure out the audio codec by using the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011So I took that from the video by using :
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
Details so far :
ffmpeg -i myaudio.aac
Outputs :Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/sAfter that I converted it to opus because I’m told that opus is better
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far :
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/sI this point I uploaded the
myaudio.opus
to the Google Cloud Storage.curl POST 1
I started the speech recognition by doing a POST withcurl
:curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "123456789"}
123456789 was not the actual value.curl GET 1
Now I wanted to have the results :curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error :
Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from
OGG_OPUS
toLINEAR16
.curl POST 2
Did the post again :curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response :
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response :
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}The problem is that I don’t get the actual transcription. According the the documentation there should be a
response
key in the response containing the data.Since I’m kinda stuck here I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitation so all suggestions are very welcome ! Also happy to change my approach.
Thanks in advance ! Cheers