Media (2)

Keyword: - Tags -/media

SPIP - plugins - embed code - Example

2 September 2013, by kent1

Updated: September 2013

Language: French

Type: Image

Tags: media, embed, code, intégration

Publishing an image simply

13 April 2011, by kent1, Webmaster - Bij de Brest

Updated: February 2012

Language: French

Type: Video

Tags: publier, publishing, media, image


Other articles (70)

No talk of markets, cloud, etc.

10 April 2011

The vocabulary used on this site tries to avoid any reference to the fashionable jargon that flourishes
on Web 2.0 and in the companies that live off it.
You are therefore invited to avoid terms such as "Brand", "Cloud", "Market", etc.
Our motivation is above all to create a simple tool, accessible to everyone, that encourages
the sharing of creations on the Internet and lets authors keep as much autonomy as possible.
No "Gold or Premium contract" is therefore planned, no (...)

Mediabox: open images in the maximum space available to the user

8 February 2011, by kent1

The display of images is constrained by the width allowed by the site design (which depends on the theme in use). They are therefore shown at a reduced size. To take advantage of all the space available on the user’s screen, it is possible to add a feature that displays the image in a multimedia box appearing above the rest of the content.
To do this, the "Mediabox" plugin must be installed.
Configuring the multimedia box
As soon as (...)

Enabling visitor registration

12 April 2011, by kent1

It is also possible to enable visitor registration, which lets anyone open an account themselves on the channel in question, for example in the context of open projects.
To do so, go to the site configuration area and choose the "User management" submenu. The first form displayed corresponds to this feature.
By default, MediaSPIP created, during its initialization, a menu item in the top menu of the page leading (...)


On other sites (11564)

How to use Google's Cloud Speech-to-Text API to transcribe a video using the REST API

8 June 2018, by mrb

I’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API.

Approach:

I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.

To save a little on Google Cloud Storage, I first converted the video to audio using ffmpeg.

First I tried to identify the audio codec with the command below; it looks like AAC.
ffmpeg -i video.mp4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2015-12-30T08:17:14.000000Z
  Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
    Metadata:
      creation_time   : 2015-12-30T08:17:31.000000Z
      handler_name    : IsoMedia File Produced by Google, 5-11-2011

So I extracted the audio stream from the video with:
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac

Details so far:
ffmpeg -i myaudio.aac
Outputs:

Input #0, aac, from 'myaudio.aac':
  Duration: 00:56:47.49, bitrate: 97 kb/s
    Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s

After that I converted it to Opus, because I was told that Opus is better:
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus

Info so far:
opusinfo myaudio.opus

User comments section follows...
    encoder=Lavc58.18.100 libopus
Opus stream 1:
    Pre-skip: 312
    Playback gain: 0 dB
    Channels: 2
    Original sample rate: 48000Hz
    Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
    Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
    Total data length: 29956714 bytes (overhead: 0.872%)
    Playback length: 56m:03.990s
    Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s

At this point I uploaded myaudio.opus to Google Cloud Storage.

curl POST 1
I started the speech recognition by doing a POST with curl:

curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

Response: {"name": "123456789"}
123456789 was not the actual value.
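
The inline JSON body in the POST above is easy to mistype when the encoding or sample rate is changed between attempts. As a minimal sketch, the hypothetical helper make_request_body (not part of the original post or of any Google tooling) assembles the same body from four arguments, so each attempt differs in one visible place:

```shell
# Hypothetical helper: print a JSON body for speech:longrunningrecognize.
# Arguments: GCS URI, encoding, sample rate in Hz, language code.
# Assumes the arguments contain no characters that need JSON escaping.
make_request_body() {
  printf '{"audio": {"uri": "%s"}, "config": {"encoding": "%s", "sampleRateHertz": %s, "languageCode": "%s"}}\n' \
    "$1" "$2" "$3" "$4"
}

# Same values as in the POST above; {MY_BUCKET} stays a placeholder.
make_request_body 'gs://{MY_BUCKET}/myaudio.opus' OGG_OPUS 48000 en-US
```

The printed body could then be handed to curl with --data "$(make_request_body ...)" in place of the hand-written string.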

curl GET 1
Now I wanted to have the results:

curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

This gave me the error: Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.

So I updated the encoding configuration from OGG_OPUS to LINEAR16.

curl POST 2
I did the POST again:

curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

Response: {"name": "987654321"}

curl GET 2

curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

Response:

{
  "name": "987654321",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
    "progressPercent": 100,
    "startTime": "2018-06-08T11:01:24.596504Z",
    "lastUpdateTime": "2018-06-08T11:01:51.825882Z"
  },
  "done": true
}

The problem is that I don’t get the actual transcription. According to the documentation, there should be a response key in the response containing the data.
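
To tell a finished but empty operation apart from one that is still running or has failed, the operation JSON can be checked for its top-level keys. The sketch below is only an illustration: operation_state is a hypothetical helper, and it relies on plain string matching rather than real JSON parsing, so a proper JSON tool would be more robust if one is available.

```shell
# Hypothetical helper: classify an operation JSON document read from stdin.
# Prints "error", "ok", "empty" (done, but no "response" key) or "pending".
# Plain substring matching is used here; this is an approximation, not a parser.
operation_state() {
  body=$(cat)
  case $body in
    *'"error"'*)                          echo error ;;
    *'"response"'*)                       echo ok ;;
    *'"done": true'* | *'"done":true'*)   echo empty ;;
    *)                                    echo pending ;;
  esac
}

# A shortened version of the reply shown above: done, but nothing transcribed.
operation_state <<'EOF'
{"name": "987654321", "metadata": {"progressPercent": 100}, "done": true}
EOF
# prints: empty
```

An "empty" result matches the situation described here: the operation completed, yet no results came back for the submitted configuration.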

Since I’m kind of stuck here, I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitations, so all suggestions are very welcome! I’m also happy to change my approach.

Thanks in advance! Cheers

combine multiple mp4 videos and images

8 June 2018, by wdsfds

So I have a folder of images, 1/20 named *.png, and a folder of MP4s named *.mp4.

I want to create a video in this order:

1.png for 3 sec
1.mp4
2.png for 3 sec
3.mp4
etc

Is there a way I can display each PNG for 3 seconds and then show the respective MP4 using ffmpeg? I know I can convert each picture to a 3-second video individually using the command below, and that the frame rate difference will be a problem (1/3 vs 60), but I’m not very experienced with command-line video editing:

ffmpeg -r 1/3 -i 1.png -vcodec mpeg4 1_intro.mp4
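
One common way to join clips in a fixed order is ffmpeg's concat demuxer, which reads a text file listing the inputs in playback order. As a minimal sketch, assuming each N.png has already been turned into an N_intro.mp4 clip whose codec, resolution, and frame rate match N.mp4 (an assumption; the command above does not guarantee this on its own), the hypothetical helper make_concat_list prints that list. It pairs each image with the MP4 of the same number, which is one reading of "the respective mp4".

```shell
# Hypothetical helper: print a concat-demuxer list that alternates
# N_intro.mp4 (the still-image clip) and N.mp4 for N = 1..$1.
make_concat_list() {
  i=1
  while [ "$i" -le "$1" ]; do
    printf "file '%s_intro.mp4'\nfile '%s.mp4'\n" "$i" "$i"
    i=$((i + 1))
  done
}

# List for the first two image/video pairs.
make_concat_list 2
```

Redirected to a file such as list.txt, the output can be passed to ffmpeg -f concat -i list.txt -c copy out.mp4. Stream copy only works when all listed clips share the same stream parameters; otherwise the clips would need re-encoding.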

ffprobe output:

ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
  built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.71.100
  Duration: 00:00:32.69, start: 0.000000, bitrate: 7039 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 7004 kb/s, 60 fps, 60 tbr, 90k tbn, 120 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 43 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Output for out.mp4:

ffprobe version 4.0 Copyright (c) 2007-2018 the FFmpeg developers
  built with Apple LLVM version 9.1.0 (clang-902.0.39.1)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:10:19.13, start: 0.000000, bitrate: 5689 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 2560x1440 [SAR 1:1 DAR 16:9], 5565 kb/s, 53.91 fps, 60 tbr, 90k tbn, 120 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 114 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Converting a call center recording to something useful

9 August 2018, by Abhay

I have a call center recording (it sounds like gibberish when played) for which mediainfo reports:

ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav
General
Complete name                            : 48401-3405-48403--18042018170000.wav
Format                                   : Wave
File size                                : 327 KiB
Duration                                 : 4mn 11s
Overall bit rate                         : 10.7 Kbps

Audio
Format                                   : G.723.1
Codec ID                                 : A100
Duration                                 : 4mn 11s
Bit rate                                 : 10.7 Kbps
Channel(s)                               : 2 channels
Sampling rate                            : 8 000 Hz
Stream size                              : 327 KiB (100%)

ffmpeg reports the following for the same file:

ion@aurora:~/Inbound$ ffmpeg -i 48401-3405-48403--18042018170000.wav
ffmpeg version N-91330-ga990184 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
  configuration: --prefix=/home/ion/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ion/ffmpeg_build/include --extra-ldflags=-L/home/ion/ffmpeg_build/lib --extra-libs='-lpthread -lm' --bindir=/home/ion/bin --enable-gpl --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
  libavutil      56. 18.102 / 56. 18.102
  libavcodec     58. 20.103 / 58. 20.103
  libavformat    58. 17.100 / 58. 17.100
  libavdevice    58.  4.101 / 58.  4.101
  libavfilter     7. 25.100 /  7. 25.100
  libswscale      5.  2.100 /  5.  2.100
  libswresample   3.  2.100 /  3.  2.100
  libpostproc    55.  2.100 / 55.  2.100
Input #0, wav, from '48401-3405-48403--18042018170000.wav':
  Duration: 00:04:11.37, bitrate: 10 kb/s
    Stream #0:0: Audio: g723_1 ([0][161][0][0] / 0xA100), 8000 Hz, mono, s16, 10 kb/s
At least one output file must be specified

So I converted this file to PCM using:

ffmpeg -acodec g723_1 -i 48401-3405-48403--18042018170000.wav -acodec pcm_s16le -f wav outnew1.wav

But the audio still sounds like gibberish. I tried many variations, and only GoldWave worked, but it runs on Windows with a GUI, not from the CLI.

So how can I convert this file into something useful, so that I can at least listen to it? It feels like a challenge now.

Audio file: https://drive.google.com/open?id=1T54lKaI6IJmOqTPNOA_OkYRz89EQ5F2L

PS: Use VLC to play the audio file.
