Other articles (112)

  • Customize by adding your logo, banner, or background image

    5 September 2013, by

    Some themes support three customization elements: adding a logo; adding a banner; adding a background image.

  • Authorizations overridden by plugins

    27 April 2010, by

    Mediaspip core
    autoriser_auteur_modifier(), so that visitors can edit their own information on the authors page

  • Customizing categories

    21 June 2013, by

    Category creation form
    For those who know SPIP well, a category can be thought of as a section (rubrique).
    For a document of type category, the fields offered by default are: Text
    This form can be modified under:
    Administration > Configuration des masques de formulaire (form mask configuration).
    For a document of type media, the fields not displayed by default are: Short description
    It is also in this configuration section that you can specify the (...)

On other sites (11168)

  • How to use Google's Cloud Speech-to-Text REST API to transcribe a video

    24 July 2018, by mrb

    I’d like to have the transcript of 2 people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API

    Approach:

    I have a 56-minute video file containing a conversation between two people. I would like a transcript of that conversation, and I would like to use Google’s Cloud Speech-to-Text API to get it.

    To save a little on Google Cloud Storage, I first converted the video to audio using ffmpeg.

    First I tried to determine the audio codec with the command below; it looks like AAC.
    ffmpeg -i video.mp4

    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
     Metadata:
       major_brand     : mp42
       minor_version   : 0
       compatible_brands: isommp42
       creation_time   : 2015-12-30T08:17:14.000000Z
     Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
       Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s,     29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
       Metadata:
         handler_name    : VideoHandler
       Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
       Metadata:
         creation_time   : 2015-12-30T08:17:31.000000Z
         handler_name    : IsoMedia File Produced by Google, 5-11-2011    

    So I extracted the audio stream from the video using:
    ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac

    Details so far:
    ffmpeg -i myaudio.aac
    Output:

    Input #0, aac, from 'myaudio.aac':
     Duration: 00:56:47.49, bitrate: 97 kb/s
       Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s

    After that I converted it to Opus, because I was told that Opus is better:
    ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus

    Info so far:
    opusinfo myaudio.opus

    User comments section follows...
       encoder=Lavc58.18.100 libopus
    Opus stream 1:
       Pre-skip: 312
       Playback gain: 0 dB
       Channels: 2
       Original sample rate: 48000Hz
       Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
       Page duration:   1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
       Total data length: 29956714 bytes (overhead: 0.872%)
       Playback length: 56m:03.990s
       Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s
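
    Both the ffmpeg and opusinfo output above report two channels. What follows is a minimal Python sketch of an alternative conversion that downmixes to a single channel. The FLAC codec, the 16000 Hz sample rate, the output filename, and the helper name mono_flac_command are assumptions for illustration; none of them come from the original post. The function only builds the argument list, so it can be inspected before being handed to subprocess.run.

```python
from typing import List


def mono_flac_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list that extracts audio as mono FLAC.

    Hypothetical helper: the codec and sample rate are assumptions,
    not settings taken from the original post.
    """
    return [
        "ffmpeg",
        "-i", src,                # input file (video or audio)
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to one channel
        "-ar", str(sample_rate),  # resample to the requested rate
        "-c:a", "flac",           # lossless audio codec
        dst,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is executed by accident.
    print(" ".join(mono_flac_command("video.mp4", "myaudio.flac")))
```

    To actually run the conversion, the list could be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed.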

    At this point I uploaded myaudio.opus to Google Cloud Storage.

    curl POST 1
    I started the speech recognition by doing a POST with curl:

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response: {"name": "123456789"}
    (123456789 is a placeholder, not the actual value.)

    curl GET 1
    Next I requested the results:

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    This gave me the error: Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.

    So I updated the encoding configuration from OGG_OPUS to LINEAR16.
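
    The error message points at the encoding or channel configuration, and the uploaded file is two-channel Opus. The sketch below builds a request body that keeps OGG_OPUS and declares the channel count through the audioChannelCount field of RecognitionConfig. Whether this resolves the error is not confirmed by the post; build_recognize_body is a hypothetical helper and the bucket name is a placeholder.

```python
import json
from typing import Any, Dict


def build_recognize_body(
    gcs_uri: str,
    encoding: str,
    sample_rate_hertz: int,
    channel_count: int = 1,
    language_code: str = "en-US",
) -> Dict[str, Any]:
    """Build the JSON body for a speech:longrunningrecognize request.

    Hypothetical helper. The field names follow the v1 RecognitionConfig
    message; the values passed in are the caller's responsibility.
    """
    config: Dict[str, Any] = {
        "encoding": encoding,
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,
    }
    if channel_count > 1:
        # Only declare the channel count for multi-channel audio.
        config["audioChannelCount"] = channel_count
    return {"audio": {"uri": gcs_uri}, "config": config}


if __name__ == "__main__":
    body = build_recognize_body(
        "gs://MY_BUCKET/myaudio.opus",  # placeholder bucket, as in the post
        encoding="OGG_OPUS",
        sample_rate_hertz=48000,
        channel_count=2,
    )
    print(json.dumps(body, indent=2))
```

    Note that declaring LINEAR16 for an Opus file asks the service to read compressed bytes as raw PCM, which may be consistent with a job that finishes without producing any results.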

    curl POST 2
    I did the POST again:

    curl --request POST  --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'

    Response: {"name": "987654321"}

    curl GET 2

    curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'

    Response:

    {
     "name": "987654321",
     "metadata": {
       "@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
       "progressPercent": 100,
       "startTime": "2018-06-08T11:01:24.596504Z",
       "lastUpdateTime": "2018-06-08T11:01:51.825882Z"
     },
     "done": true
    }

    The problem is that I don’t get the actual transcription. According to the documentation, there should be a response key in the response containing the data.
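
    For reference, here is a sketch of how the transcript could be read from a finished operation, assuming the documented shape response.results[].alternatives[].transcript. The helper name extract_transcripts is hypothetical. Applied to the operation shown above, it returns an empty list, because that JSON has no response key.

```python
from typing import Any, Dict, List


def extract_transcripts(operation: Dict[str, Any]) -> List[str]:
    """Return the top transcript of each result in a finished operation.

    Hypothetical helper. Raises if the operation is still running or
    reports an error; returns [] when no results are present.
    """
    if not operation.get("done"):
        raise ValueError("operation is not finished yet")
    if "error" in operation:
        raise RuntimeError(operation["error"].get("message", "unknown error"))
    results = operation.get("response", {}).get("results", [])
    return [
        result["alternatives"][0]["transcript"]
        for result in results
        if result.get("alternatives")
    ]


if __name__ == "__main__":
    # The operation returned in the post: done, but without a "response" key.
    op_from_post = {
        "name": "987654321",
        "metadata": {"progressPercent": 100},
        "done": True,
    }
    print(extract_transcripts(op_from_post))  # prints []
```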

    Since I’m stuck here, I’d like to know whether I’m doing something completely wrong. I don’t have any technical or resource limitations, so all suggestions are very welcome! I am also happy to change my approach.

    Thanks in advance! Cheers

  • Bug #3592: Robots.txt.html and Google's mobile compatibility tests

    25 November 2019, by b b

    Not sure it is necessary to create a duplicate by reopening this ticket, since the subject is being discussed here: https://core.spip.net/issues/4103 :)

  • FFmpeg can't process videos with filenames containing emojis on Google Colab

    25 November 2022, by athena

    I mounted my Google Drive on Google Colab. Inside the Drive, there's a video file with an emoji in its filename (example: 20221124 [우리의식탁 W TABLE] 직접 기른 허브로 만들면 더 맛있는 허브포카치아 🌿 (8m2hNIEoXEw).mkv).

    !ffmpeg -i "/content/drive/MyDrive/DOWNLOAD/20221124 [우리의식탁 W TABLE] 직접 기른 허브로 만들면 더 맛있는 허브포카치아 🌿 (8m2hNIEoXEw).mkv"

    Trying to run FFmpeg gives me this error:

    ---------------------------------------------------------------------------
    UnicodeEncodeError                        Traceback (most recent call last)
     in <module>
          3 video = "/content/drive/MyDrive/DOWNLOAD/20221124 [\u110B\u116E\u1105\u1175\u110B\u1174\u1109\u1175\u11A8\u1110\u1161\u11A8 W TABLE] \u110C\u1175\u11A8\u110C\u1165\u11B8 \u1100\u1175\u1105\u1173\u11AB \u1112\u1165\u1107\u1173\u1105\u1169 \u1106\u1161\u11AB\u1103\u1173\u11AF\u1106\u1167\u11AB \u1103\u1165 \u1106\u1161\u11BA\u110B\u1175\u11BB\u1102\u1173\u11AB \u1112\u1165\u1107\u1173\u1111\u1169\u110F\u1161\u110E\u1175\u110B\u1161 \uD83C\uDF3F (8m2hNIEoXEw).mkv" #@param {type: "string"}
          4
    ----> 5 get_ipython().system('ffmpeg -i "$video" #-hide_banner')

    4 frames
    /usr/local/lib/python3.7/dist-packages/google/colab/_shell.py in system(self, *args, **kwargs)
         93       kwargs.update({'also_return_output': True})
         94
    ---> 95     output = _system_commands._system_compat(self, *args, **kwargs)  # pylint:disable=protected-access
         96
         97     if pip_warn:

    /usr/local/lib/python3.7/dist-packages/google/colab/_system_commands.py in _system_compat(shell, cmd, also_return_output)
        435   # stack.
        436   result = _run_command(
    --> 437       shell.var_expand(cmd, depth=2), clear_streamed_output=False)
        438   shell.user_ns['_exit_code'] = result.returncode
        439   if -result.returncode in _INTERRUPTED_SIGNALS:

    /usr/local/lib/python3.7/dist-packages/google/colab/_system_commands.py in _run_command(cmd, clear_streamed_output)
        189           stdin=stdin,
        190           stderr=child_pty,
    --> 191           close_fds=True)
        192       # The child PTY is only needed by the spawned process.
        193       os.close(child_pty)

    /usr/lib/python3.7/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
        798                                 c2pread, c2pwrite,
        799                                 errread, errwrite,
    --> 800                                 restore_signals, start_new_session)
        801         except:
        802             # Cleanup if the child failed starting.

    /usr/lib/python3.7/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
       1480                             errread, errwrite,
       1481                             errpipe_read, errpipe_write,
    -> 1482                             restore_signals, start_new_session, preexec_fn)
       1483                     self._child_created = True
       1484                 finally:

    UnicodeEncodeError: 'utf-8' codec can't encode characters in position 131-132: surrogates not allowed

    My Colab uses Python 3.7.15 and ffmpeg/ffprobe version N-109226-g2ad199ae31-20221125 (from https://github.com/BtbN/FFmpeg-Builds).

    I tried searching here for issues similar to mine, but most of the solutions are beyond my knowledge, and I'm not sure how to apply them to my use case.

    I'd appreciate your help, thank you!
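
    The traceback shows the emoji stored in the video string as the two escapes \uD83C\uDF3F, a UTF-16 surrogate pair, which Python's UTF-8 encoder rejects. A minimal sketch of one way to merge such a pair back into a single code point before the path reaches a subprocess follows. The helper name join_surrogates and the short sample filename are hypothetical, and the sketch assumes every surrogate in the string is part of a valid pair.

```python
def join_surrogates(text: str) -> str:
    """Merge UTF-16 surrogate pairs in a str into real code points.

    Hypothetical helper. 'surrogatepass' lets the lone surrogates through
    the UTF-16 encoder; decoding then combines each pair.
    """
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


if __name__ == "__main__":
    # Shortened stand-in for the filename in the post (an assumption).
    broken = "herb \ud83c\udf3f (8m2hNIEoXEw).mkv"

    try:
        broken.encode("utf-8")
    except UnicodeEncodeError as exc:
        print("before:", exc.reason)  # prints "before: surrogates not allowed"

    fixed = join_surrogates(broken)
    # The repaired string now encodes cleanly and is one character shorter.
    print("after:", len(broken), "->", len(fixed), "characters")
    print(fixed.encode("utf-8"))
```

    In a notebook, the repaired string could then be used in place of the original variable in the ffmpeg call.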
