Media (91)

#3 The Safest Place

16 October 2011, by kent1

Updated: February 2013

Language: English

Type: Audio

Tags: creative commons, Music, mp3, Elephant dreams, soundtrack

#4 Emo Creates

15 October 2011, by kent1

Updated: February 2013

Language: English

Type: Audio

Tags: creative commons, Music, mp3, Elephant dreams, soundtrack

#2 Typewriter Dance

15 October 2011, by kent1

Updated: February 2013

Language: English

Type: Audio

Tags: creative commons, Music, mp3, Elephant dreams, soundtrack

#1 The Wires

11 October 2011, by kent1

Updated: February 2013

Language: English

Type: Audio

Tags: creative commons, Music, mp3, Elephant dreams, soundtrack

ED-ME-5 1-DVD

11 October 2011, by kent1

Updated: October 2011

Language: English

Type: Audio

Tags: opensource, audio, open film making, Elephant dreams, ac3, karaoke

Revolution of Open-source and film making towards open film making

6 October 2011, by kent1

Updated: July 2013

Language: English

Type: Text

Tags: creative commons, thesis, opensource, copyleft, open film making, lev manovitch, Elephant dreams, university


Other articles (112)

Customize by adding your logo, banner, or background image

5 September 2013, by kent1

Some themes support three customization elements: adding a logo; adding a banner; adding a background image.

Authorizations overridden by plugins

27 April 2010, by kent1

Mediaspip core
autoriser_auteur_modifier(), so that visitors are able to edit their information on the authors page

Customizing categories

21 June 2013, by etalarma

Category creation form
For those who know SPIP well, a category can be thought of as a section (rubrique).
For a document of type category, the fields offered by default are: Text
This form can be modified under:
Administration > Form mask configuration.
For a document of type media, the fields not displayed by default are: Short description
It is also in this configuration section that you can specify the (...)


On other sites (11168)

How to use Google's Cloud Speech-to-Text REST API to transcribe a video

24 July 2018, by mrb

I’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API

Approach:

I have a 56 minute video file containing a conversation between two people. I would like to have the transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get that.

To save a little on Google Cloud Storage, I first converted the video to audio using ffmpeg.

First I tried to figure out the audio codec using the command below; it looks like AAC.
ffmpeg -i video.mp4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2015-12-30T08:17:14.000000Z
  Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
    Metadata:
      creation_time   : 2015-12-30T08:17:31.000000Z
      handler_name    : IsoMedia File Produced by Google, 5-11-2011
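
The codec, sample rate, and channel layout can also be read out of the `Audio:` stream line programmatically. The following is only a minimal sketch: the helper name `parse_audio_stream` and the regular expression are illustrative assumptions, written against the line format shown in this output and not against any guaranteed ffmpeg format.

```python
import re

# Hypothetical helper: extracts codec, sample rate, and channel layout from an
# ffmpeg "Stream ... Audio:" line shaped like the one in the output above.
AUDIO_LINE = re.compile(
    r"Audio:\s*(?P<codec>\w+).*?(?P<rate>\d+) Hz,\s*(?P<layout>[^,]+)"
)


def parse_audio_stream(line):
    """Return (codec, sample_rate_hz, channel_layout), or None if no match."""
    match = AUDIO_LINE.search(line)
    if match is None:
        return None
    return match["codec"], int(match["rate"]), match["layout"].strip()


line = ("Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), "
        "44100 Hz, stereo, fltp, 96 kb/s (default)")
print(parse_audio_stream(line))
```

The channel layout is worth keeping an eye on, since the error reported later in the question mentions the channel config.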

So I extracted the audio stream from the video using:
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac

Details so far:
ffmpeg -i myaudio.aac
Outputs:

Input #0, aac, from 'myaudio.aac':
  Duration: 00:56:47.49, bitrate: 97 kb/s
    Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s

After that I converted it to Opus, because I was told that Opus is better:
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus

Info so far:
opusinfo myaudio.opus

User comments section follows...
    encoder=Lavc58.18.100 libopus
Opus stream 1:
    Pre-skip: 312
    Playback gain: 0 dB
    Channels: 2
    Original sample rate: 48000Hz
    Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
    Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
    Total data length: 29956714 bytes (overhead: 0.872%)
    Playback length: 56m:03.990s
    Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s

At this point I uploaded myaudio.opus to Google Cloud Storage.

curl POST 1
I started the speech recognition by doing a POST with curl:

curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

Response: {"name": "123456789"}
(123456789 was not the actual value.)
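
The same POST can be assembled programmatically. The following is a minimal Python sketch, using only the standard library, that builds the URL and JSON body without sending anything. The function name `build_recognize_request` and its parameters are illustrative assumptions; the bucket, object, and key values are the same placeholders used in the curl command.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://speech.googleapis.com/v1/speech:longrunningrecognize"


def build_recognize_request(bucket, object_name, api_key,
                            encoding="OGG_OPUS", sample_rate_hz=48000,
                            language_code="en-US"):
    """Return (url, body_bytes) mirroring the curl POST from the question."""
    # urlencode percent-escapes the commas and joins parameters with a plain "&".
    query = urlencode({
        "fields": "done,error,metadata,name,response",
        "key": api_key,
    })
    body = {
        "audio": {"uri": f"gs://{bucket}/{object_name}"},
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
    }
    return f"{ENDPOINT}?{query}", json.dumps(body).encode("utf-8")


# Placeholder values, as in the curl command.
url, body = build_recognize_request("MY_BUCKET", "myaudio.opus", "MY_API_KEY")
print(url)
print(body.decode("utf-8"))
```

Building the query string with `urlencode` avoids hand-escaping the `fields` list and the `&` separator.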

curl GET 1
Next I requested the results:

curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

This gave me the error: Error: Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.

So I updated the encoding configuration from OGG_OPUS to LINEAR16.

curl POST 2
I sent the POST again:

curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

Response: {"name": "987654321"}

curl GET 2

curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

Response:

{
  "name": "987654321",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
    "progressPercent": 100,
    "startTime": "2018-06-08T11:01:24.596504Z",
    "lastUpdateTime": "2018-06-08T11:01:51.825882Z"
  },
  "done": true
}

The problem is that I don’t get the actual transcription. According to the documentation, there should be a response key in the response containing the data.
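
To make the symptom concrete, a finished operation can be classified by which top-level keys it carries. This is a minimal sketch, assuming the v1 layout in which transcripts sit under `response.results[].alternatives[].transcript`; `summarize_operation` is a hypothetical helper name, not part of any Google client library.

```python
import json


def summarize_operation(operation):
    """Classify a long-running operation dict and collect any transcripts."""
    if not operation.get("done", False):
        return "running", []
    if "error" in operation:
        return "error", [operation["error"].get("message", "")]
    results = operation.get("response", {}).get("results", [])
    # Keep only the first (top-ranked) alternative of each result.
    transcripts = [
        alt["transcript"]
        for result in results
        for alt in result.get("alternatives", [])[:1]
        if "transcript" in alt
    ]
    return ("ok" if transcripts else "done_without_results"), transcripts


# Trimmed copy of the operation returned by the second GET in the question.
posted = json.loads('{"name": "987654321", "done": true}')
print(summarize_operation(posted))
```

Applied to the operation from the second GET, the sketch reports a finished operation with no results, which matches the empty response described here.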

Since I’m kind of stuck here, I’d like to know if I’m doing something completely wrong. I don’t have any technical or resource limitations, so all suggestions are very welcome! I’m also happy to change my approach.

Thanks in advance! Cheers

Bug #3592: Robots.txt.html and Google's mobile compatibility tests

25 November 2019, by b b

Not sure it's necessary to duplicate things by reopening this ticket, since the topic is being discussed over here: https://core.spip.net/issues/4103 :)

FFmpeg can't process videos with filenames containing emojis on Google Colab

25 November 2022, by athena

I mounted my Google Drive on Google Colab. Inside the Drive there's a video file with an emoji in its filename (example: 20221124 [우리의식탁 W TABLE] 직접 기른 허브로 만들면 더 맛있는 허브포카치아 🌿 (8m2hNIEoXEw).mkv).

!ffmpeg -i "/content/drive/MyDrive/DOWNLOAD/20221124 [우리의식탁 W TABLE] 직접 기른 허브로 만들면 더 맛있는 허브포카치아 🌿 (8m2hNIEoXEw).mkv"

Trying to run FFmpeg gives me this error:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
in <module>
      3 video = "/content/drive/MyDrive/DOWNLOAD/20221124 [\u110B\u116E\u1105\u1175\u110B\u1174\u1109\u1175\u11A8\u1110\u1161\u11A8 W TABLE] \u110C\u1175\u11A8\u110C\u1165\u11B8 \u1100\u1175\u1105\u1173\u11AB \u1112\u1165\u1107\u1173\u1105\u1169 \u1106\u1161\u11AB\u1103\u1173\u11AF\u1106\u1167\u11AB \u1103\u1165 \u1106\u1161\u11BA\u110B\u1175\u11BB\u1102\u1173\u11AB \u1112\u1165\u1107\u1173\u1111\u1169\u110F\u1161\u110E\u1175\u110B\u1161 \uD83C\uDF3F (8m2hNIEoXEw).mkv" #@param {type: "string"}
      4 
----> 5 get_ipython().system('ffmpeg -i "$video" #-hide_banner')

4 frames
/usr/local/lib/python3.7/dist-packages/google/colab/_shell.py in system(self, *args, **kwargs)
     93       kwargs.update({'also_return_output': True})
     94 
---> 95     output = _system_commands._system_compat(self, *args, **kwargs)  # pylint:disable=protected-access
     96 
     97     if pip_warn:

/usr/local/lib/python3.7/dist-packages/google/colab/_system_commands.py in _system_compat(shell, cmd, also_return_output)
    435   # stack.
    436   result = _run_command(
--> 437       shell.var_expand(cmd, depth=2), clear_streamed_output=False)
    438   shell.user_ns['_exit_code'] = result.returncode
    439   if -result.returncode in _INTERRUPTED_SIGNALS:

/usr/local/lib/python3.7/dist-packages/google/colab/_system_commands.py in _run_command(cmd, clear_streamed_output)
    189           stdin=stdin,
    190           stderr=child_pty,
--> 191           close_fds=True)
    192       # The child PTY is only needed by the spawned process.
    193       os.close(child_pty)

/usr/lib/python3.7/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    798                                 c2pread, c2pwrite,
    799                                 errread, errwrite,
--> 800                                 restore_signals, start_new_session)
    801         except:
    802             # Cleanup if the child failed starting.

/usr/lib/python3.7/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1480                             errread, errwrite,
   1481                             errpipe_read, errpipe_write,
-> 1482                             restore_signals, start_new_session, preexec_fn)
   1483                     self._child_created = True
   1484                 finally:

UnicodeEncodeError: 'utf-8' codec can't encode characters in position 131-132: surrogates not allowed
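
The traceback shows the emoji stored in the `video` string as the two escapes `\uD83C\uDF3F`, a UTF-16 surrogate pair, which UTF-8 refuses to encode. One possible workaround, sketched here under the assumption that the surrogates are correctly paired, is to round-trip the string through UTF-16 so the pair merges into a single code point. The helper name `merge_surrogates` and the shortened filename are hypothetical.

```python
def merge_surrogates(text):
    """Recombine UTF-16 surrogate pairs into single code points.

    Raises UnicodeDecodeError if the text contains an unpaired surrogate.
    """
    # "surrogatepass" lets the encoder emit the surrogates as raw UTF-16 code
    # units; the strict decoder then reads each pair as one character.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


# Shortened, hypothetical filename carrying the same escaped emoji as above.
broken = "herb focaccia \ud83c\udf3f (8m2hNIEoXEw).mkv"
fixed = merge_surrogates(broken)

print(len(broken) - len(fixed))   # the pair collapses into one code point
print(fixed.encode("utf-8"))      # now encodes without UnicodeEncodeError
```

The repaired string could then be handed to FFmpeg as a list argument, for example `subprocess.run(["ffmpeg", "-i", fixed])`, which also sidesteps shell quoting of the brackets and spaces in the filename.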

My Colab uses Python 3.7.15 and ffmpeg/ffprobe version N-109226-g2ad199ae31-20221125 (from https://github.com/BtbN/FFmpeg-Builds).

I searched for similar issues here, but most of the solutions are well beyond my knowledge, and I'm not sure how to apply them to my use case.

I'd appreciate your help, thank you!
