Recherche avancée

Médias (39)

Mot : - Tags -/audio

Autres articles (42)

  • La sauvegarde automatique de canaux SPIP

    1er avril 2010, par

    Dans le cadre de la mise en place d’une plateforme ouverte, il est important pour les hébergeurs de pouvoir disposer de sauvegardes assez régulières pour parer à tout problème éventuel.
    Pour réaliser cette tâche on se base sur deux plugins SPIP : Saveauto qui permet une sauvegarde régulière de la base de donnée sous la forme d’un dump mysql (utilisable dans phpmyadmin) mes_fichiers_2 qui permet de réaliser une archive au format zip des données importantes du site (les documents, les éléments (...)

  • Script d’installation automatique de MediaSPIP

    25 avril 2011, par

    Afin de palier aux difficultés d’installation dues principalement aux dépendances logicielles coté serveur, un script d’installation "tout en un" en bash a été créé afin de faciliter cette étape sur un serveur doté d’une distribution Linux compatible.
    Vous devez bénéficier d’un accès SSH à votre serveur et d’un compte "root" afin de l’utiliser, ce qui permettra d’installer les dépendances. Contactez votre hébergeur si vous ne disposez pas de cela.
    La documentation de l’utilisation du script d’installation (...)

  • Automated installation script of MediaSPIP

    25 avril 2011, par

    To overcome the difficulties mainly due to the installation of server side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
    You must have access to your server via SSH and a root account to use it, which will install the dependencies. Contact your provider if you do not have that.
    The documentation of the use of this installation script is available here.
    The code of this (...)

Sur d’autres sites (4556)

  • Converting a call center recording to something useful

    9 août 2018, par Abhay

    I have a call center recording (when played it sounds gibberish) for which the mediainfo shows info as

    ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav
    General
    Complete name                            : 48401-3405-48403--18042018170000.wav
    Format                                   : Wave
    File size                                : 327 KiB
    Duration                                 : 4mn 11s
    Overall bit rate                         : 10.7 Kbps

    Audio
    Format                                   : G.723.1
    Codec ID                                 : A100
    Duration                                 : 4mn 11s
    Bit rate                                 : 10.7 Kbps
    Channel(s)                               : 2 channels
    Sampling rate                            : 8 000 Hz
    Stream size                              : 327 KiB (100%)

    The ffmpeg info shows this as

    ion@aurora:~/Inbound$ ffmpeg -i 48401-3405-48403--18042018170000.wav
    ffmpeg version N-91330-ga990184 Copyright (c) 2000-2018 the FFmpeg developers
     built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
     configuration: --prefix=/home/ion/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ion/ffmpeg_build/include --extra-ldflags=-L/home/ion/ffmpeg_build/lib --extra-libs='-lpthread -lm' --bindir=/home/ion/bin --enable-gpl --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
     libavutil      56. 18.102 / 56. 18.102
     libavcodec     58. 20.103 / 58. 20.103
     libavformat    58. 17.100 / 58. 17.100
     libavdevice    58.  4.101 / 58.  4.101
     libavfilter     7. 25.100 /  7. 25.100
     libswscale      5.  2.100 /  5.  2.100
     libswresample   3.  2.100 /  3.  2.100
     libpostproc    55.  2.100 / 55.  2.100
    Input #0, wav, from '48401-3405-48403--18042018170000.wav':
     Duration: 00:04:11.37, bitrate: 10 kb/s
       Stream #0:0: Audio: g723_1 ([0][161][0][0] / 0xA100), 8000 Hz, mono, s16, 10 kb/s
    At least one output file must be specified

    So I converted this file to PCM using

    ffmpeg -acodec g723_1 -i 48401-3405-48403--18042018170000.wav -acodec pcm_s16le -f wav outnew1.wav

    But the audio still sound gibberish , I tried many variation and only Goldwave worked but that works on windows and with GUI not cli.

    So how can I convert this file to something useful so that atleast I can listen to it , It feels like a challenge now.

    Audio file : https://drive.google.com/open?id=1T54lKaI6IJmOqTPNOA_OkYRz89EQ5F2L

    PS : Use VLC to play audio file

  • ffmpeg combining mp4 videos incorrectly

    9 juin 2018, par wdsfds

    i have a list of 20 mp4 files that I need to combine into one video. They are all 60 FPS, but range in size. In between each mp4 file I need to show its ’intro’, a 1.5 second clip about the file.

    Reference :

    • pics/i.png - intro for the nth mp4 file
    • clips/i_intro.mp4 - mp4 intro generated from i.png
    • clips/i.mp4 - original mp4
    • clips/i_resize.mp4 - 720p mp4

    Here is what I’ve done :

    Convert image to mp4 using this command (taken from this post) :

    ffmpeg -y -i pics/{i}.png -f lavfi -i anullsrc -r 60 -pix_fmt yuv420p -ac 2 -ar 44100 -video_track_timescale 90k -t 1.5 clips/{i}_intro.mp4

    Convert i.mp4 to 720p using this command (taken from this post) :

    ffmpeg -i clips/{i}.mp4 -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k -vf scale=-2:720,format=yuv420p {i}_resize.mp4

    Created a file like this :

    file 1_intro.mp4
    file 1_resize.mp4
    file 2_intro.mp4
    file 2_resize.mp4
    ...

    Combine all clips :

    ffmpeg -f concat -i {video_list} -c copy out.mp4

    This generates a file, but it is definitely messed up. When I combined the files I got a ton of warning messages like this :

    [mp4 @ 0x7f88fe01de00] Non-monotonous DTS in output stream 0:0;
    previous: 48966232, current: 8826932; changing to 48966233. This may
    result in incorrect timestamps in the output file.

    FFprobe for out.mp4

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf58.12.100
     Duration: 00:09:35.56, start: 0.000000, bitrate: 4119 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 4221 kb/s, 59.98 fps, 351.56 tbr, 90k tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 122 kb/s (default)
       Metadata:
         handler_name    : SoundHandler

    FFprobe for 1.mp4

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf57.71.100
     Duration: 00:00:40.94, start: 0.000000, bitrate: 7613 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 7451 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 160 kb/s (default)
       Metadata:
         handler_name    : SoundHandler

    FFprobe for 1_resized.mp4

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1_resize.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf58.12.100
     Duration: 00:00:32.37, start: 0.000000, bitrate: 3266 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 3127 kb/s, 60 fps, 60 tbr, 15360 tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 129 kb/s (default)
       Metadata:
         handler_name    : SoundHandler

    FFprobe for 1_intro.mp4

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1_intro.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf58.12.100
     Duration: 00:00:01.52, start: 0.000000, bitrate: 113 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 4667 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 2 kb/s (default)
       Metadata:
         handler_name    : SoundHandler

    I noticed that the length of the files are different, and the tbn in different, but idk what that means. Thanks for helping !

  • How to use Google's Cloud Speech-to-Text REST API to transcribe a video

    24 juillet 2018, par mrb

    I’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API

    Approach :

    I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.

    To save a little on my Google Cloud Storage I converted to video to audio first by using mmpeg.

    First I’d tried to figure out the audio codec by using the command below, and it looks like AAC.
    ffmpeg -i video.mp4

    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
     Metadata:
       major_brand     : mp42
       minor_version   : 0
       compatible_brands: isommp42
       creation_time   : 2015-12-30T08:17:14.000000Z
     Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
       Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s,     29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
       Metadata:
         creation_time   : 2015-12-30T08:17:31.000000Z
         handler_name    : IsoMedia File Produced by Google, 5-11-2011    

    So I took that from the video by using :
    ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac

    Details so far :
    ffmpeg -i myaudio.aac
    Outputs :

    Input #0, aac, from 'myaudio.aac':
     Duration: 00:56:47.49, bitrate: 97 kb/s
       Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s

    After that I converted it to opus because I’m told that opus is better
    ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus

    Info so far :
    opusinfo myaudio.opus

    User comments section follows...
       encoder=Lavc58.18.100 libopus
    Opus stream 1:
       Pre-skip: 312
       Playback gain: 0 dB
       Channels: 2
       Original sample rate: 48000Hz
       Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
       Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
       Total data length: 29956714 bytes (overhead: 0.872%)
       Playback length: 56m:03.990s
       Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s

    I this point I uploaded the myaudio.opus to the Google Cloud Storage.

    curl POST 1
    I started the speech recognition by doing a POST with curl :

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response : {"name": "123456789"}
    123456789 was not the actual value.

    curl GET 1
    Now I wanted to have the results :

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    This gave me the error : Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.

    So I updated the encoding configuration from OGG_OPUS to LINEAR16.

    curl POST 2
    Did the post again :

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response : {"name": "987654321"}

    curl GET 2

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    Response :

    {
     "name": "987654321",
     "metadata": {
       "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
       "progressPercent": 100,
       "startTime": "2018-06-08T11:01:24.596504Z",
       "lastUpdateTime": "2018-06-08T11:01:51.825882Z"
     },
     "done": true
    }

    The problem is that I don’t get the actual transcription. According the the documentation there should be a response key in the response containing the data.

    Since I’m kinda stuck here I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitation so all suggestions are very welcome ! Also happy to change my approach.

    Thanks in advance ! Cheers