Recherche avancée

Médias (29)

Mot : - Tags -/Musique

Autres articles (71)

  • Les autorisations surchargées par les plugins

    27 avril 2010, par

    Mediaspip core
    autoriser_auteur_modifier() afin que les visiteurs soient capables de modifier leurs informations sur la page d’auteurs

  • Configuration spécifique d’Apache

    4 février 2011, par

    Modules spécifiques
    Pour la configuration d’Apache, il est conseillé d’activer certains modules non spécifiques à MediaSPIP, mais permettant d’améliorer les performances : mod_deflate et mod_headers pour compresser automatiquement via Apache les pages. Cf ce tutoriel ; mode_expires pour gérer correctement l’expiration des hits. Cf ce tutoriel ;
    Il est également conseillé d’ajouter la prise en charge par apache du mime-type pour les fichiers WebM comme indiqué dans ce tutoriel.
    Création d’un (...)

  • Supporting all media types

    13 avril 2011, par

    Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats : images : png, gif, jpg, bmp and more audio : MP3, Ogg, Wav and more video : AVI, MP4, OGV, mpg, mov, wmv and more text, code and other data : OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)

Sur d’autres sites (10612)

  • Converting a call center recording to something useful

    9 août 2018, par Abhay

    I have a call center recording (when played it sounds gibberish) for which the mediainfo shows info as

    ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav
    General
    Complete name                            : 48401-3405-48403--18042018170000.wav
    Format                                   : Wave
    File size                                : 327 KiB
    Duration                                 : 4mn 11s
    Overall bit rate                         : 10.7 Kbps

    Audio
    Format                                   : G.723.1
    Codec ID                                 : A100
    Duration                                 : 4mn 11s
    Bit rate                                 : 10.7 Kbps
    Channel(s)                               : 2 channels
    Sampling rate                            : 8 000 Hz
    Stream size                              : 327 KiB (100%)

    The ffmpeg info shows this as

    ion@aurora:~/Inbound$ ffmpeg -i 48401-3405-48403--18042018170000.wav
    ffmpeg version N-91330-ga990184 Copyright (c) 2000-2018 the FFmpeg developers
     built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
     configuration: --prefix=/home/ion/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ion/ffmpeg_build/include --extra-ldflags=-L/home/ion/ffmpeg_build/lib --extra-libs='-lpthread -lm' --bindir=/home/ion/bin --enable-gpl --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
     libavutil      56. 18.102 / 56. 18.102
     libavcodec     58. 20.103 / 58. 20.103
     libavformat    58. 17.100 / 58. 17.100
     libavdevice    58.  4.101 / 58.  4.101
     libavfilter     7. 25.100 /  7. 25.100
     libswscale      5.  2.100 /  5.  2.100
     libswresample   3.  2.100 /  3.  2.100
     libpostproc    55.  2.100 / 55.  2.100
    Input #0, wav, from '48401-3405-48403--18042018170000.wav':
     Duration: 00:04:11.37, bitrate: 10 kb/s
       Stream #0:0: Audio: g723_1 ([0][161][0][0] / 0xA100), 8000 Hz, mono, s16, 10 kb/s
    At least one output file must be specified

    So I converted this file to PCM using

    ffmpeg -acodec g723_1 -i 48401-3405-48403--18042018170000.wav -acodec pcm_s16le -f wav outnew1.wav

    But the audio still sound gibberish , I tried many variation and only Goldwave worked but that works on windows and with GUI not cli.

    So how can I convert this file to something useful so that atleast I can listen to it , It feels like a challenge now.

    Audio file : https://drive.google.com/open?id=1T54lKaI6IJmOqTPNOA_OkYRz89EQ5F2L

    PS : Use VLC to play audio file

  • combine multiple mp4 videos and images

    8 juin 2018, par wdsfds

    so I have a folder of images, 1/20 named *.png and a folder of mp4’s named *.mp4.

    I want to create a video in this order :

    • 1.png for 3 sec
    • 1.mp4
    • 2.png for 3 sec
    • 3.mp4
    • etc

    Is there a way I can display each png for 3 seconds and then show the respective mp4 using ffmpeg ? I know I can convert each picture to a 3 second video invididually using this command and the framerate differences will be a problem (1/3 vs 60), but I’m not very experienced with command line video editing :

    ffmpeg -r 1/3 -i 1.png -vcodec mpeg4 1_intro.mp4

    ffprobe output :

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf57.71.100
     Duration: 00:00:32.69, start: 0.000000, bitrate: 7039 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 7004 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 43 kb/s (default)
       Metadata:
         handler_name    : SoundHandler

    output of out.mp4

    ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
     built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
     configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
     libavutil      56. 14.100 / 56. 14.100
     libavcodec     58. 18.100 / 58. 18.100
     libavformat    58. 12.100 / 58. 12.100
     libavdevice    58.  3.100 / 58.  3.100
     libavfilter     7. 16.100 /  7. 16.100
     libavresample   4.  0.  0 /  4.  0.  0
     libswscale      5.  1.100 /  5.  1.100
     libswresample   3.  1.100 /  3.  1.100
     libpostproc    55.  1.100 / 55.  1.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf58.12.100
     Duration: 00:10:19.13, start: 0.000000, bitrate: 5689 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 2560x1440 [SAR 1:1 DAR 16:9], 5565 kb/s, 53.91 fps, 60 tbr, 90k tbn, 120 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 114 kb/s (default)
       Metadata:
         handler_name    : SoundHandler
  • How to use Google's Cloud Speech-to-Text API to transcribe a video using the REST API

    8 juin 2018, par mrb

    I’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API

    Approach :

    I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.

    To save a little on my Google Cloud Storage I converted to video to audio first by using mmpeg.

    First I’d tried to figure out the audio codec by using the command below, and it looks like AAC.
    ffmpeg -i video.mp4

    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
     Metadata:
       major_brand     : mp42
       minor_version   : 0
       compatible_brands: isommp42
       creation_time   : 2015-12-30T08:17:14.000000Z
     Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
       Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s,     29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
       Metadata:
         creation_time   : 2015-12-30T08:17:31.000000Z
         handler_name    : IsoMedia File Produced by Google, 5-11-2011    

    So I took that from the video by using :
    ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac

    Details so far :
    ffmpeg -i myaudio.aac
    Outputs :

    Input #0, aac, from 'myaudio.aac':
     Duration: 00:56:47.49, bitrate: 97 kb/s
       Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s

    After that I converted it to opus because I’m told that opus is better
    ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus

    Info so far :
    opusinfo myaudio.opus

    User comments section follows...
       encoder=Lavc58.18.100 libopus
    Opus stream 1:
       Pre-skip: 312
       Playback gain: 0 dB
       Channels: 2
       Original sample rate: 48000Hz
       Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
       Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
       Total data length: 29956714 bytes (overhead: 0.872%)
       Playback length: 56m:03.990s
       Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s

    I this point I uploaded the myaudio.opus to the Google Cloud Storage.

    curl POST 1
    I started the speech recognition by doing a POST with curl :

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response : {"name": "123456789"}
    123456789 was not the actual value.

    curl GET 1
    Now I wanted to have the results :

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    This gave me the error : Error : Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.

    So I updated the encoding configuration from OGG_OPUS to LINEAR16.

    curl POST 2
    Did the post again :

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response : {"name": "987654321"}

    curl GET 2

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    Response :

    {
     "name": "987654321",
     "metadata": {
       "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
       "progressPercent": 100,
       "startTime": "2018-06-08T11:01:24.596504Z",
       "lastUpdateTime": "2018-06-08T11:01:51.825882Z"
     },
     "done": true
    }

    The problem is that I don’t get the actual transcription. According the the documentation there should be a response key in the response containing the data.

    Since I’m kinda stuck here I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitation so all suggestions are very welcome ! Also happy to change my approach.

    Thanks in advance ! Cheers