
Media (91)
-
Chuck D with Fine Arts Militia - No Meaning No
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Paul Westerberg - Looking Up in Heaven
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Le Tigre - Fake French
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Thievery Corporation - DC 3000
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Dan the Automator - Relaxation Spa Treatment
15 September 2011
Updated: September 2011
Language: English
Type: Audio
-
Gilberto Gil - Oslodum
15 September 2011
Updated: September 2011
Language: English
Type: Audio
Other articles (51)
-
Contribute to a better visual interface
13 April 2011
MediaSPIP is based on a system of themes and templates. Templates define the placement of information on the page and can be adapted to a wide range of uses. Themes define the overall graphic appearance of the site.
Anyone can submit a new graphic theme or template and make it available to the MediaSPIP community.
-
Libraries and binaries specific to video and audio processing
31 January 2010
The following software and libraries are used by SPIPmotion in one way or another.
Required binaries: FFMpeg: the main encoder, able to transcode almost all types of video and audio files into formats readable on the Internet (see this tutorial for its installation); Oggz-tools: tools for inspecting ogg files; Mediainfo: retrieves information from most video and audio formats;
Complementary, optional binaries: flvtool2: (...)
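As a rough illustration of how these binaries are typically driven (a sketch, not taken from the article; the file names are hypothetical):
# FFMpeg: transcode a source video into web-readable formats
ffmpeg -i source.avi -c:v libx264 -c:a aac output.mp4
ffmpeg -i source.avi -c:v libtheora -c:a libvorbis output.ogv
# Mediainfo: read back stream information from the result
mediainfo output.mp4
# Oggz-tools: inspect the ogg variant
oggz-info output.ogv
-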
HTML5 audio and video support
10 April 2011
MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flowplayer Flash player is used instead.
The HTML5 player was created specifically for MediaSPIP: its appearance is fully customizable to match a chosen theme.
These technologies make it possible to deliver video and sound both to conventional computers (...)
On other sites (10229)
-
How to make an MPEG-DASH MPD which starts playback in the middle of the first segment?
18 September 2018, by ravin.wang
Here are the steps to reproduce:
-
Normalize an H.264 video stream
ffmpeg -i 2.h264 -c:v libx264 -intra -r 25 -vf scale=640x360,setdar=16:9 2@25fps@intra@640x360.h264
After that, I have an H.264 stream in which all pictures are IDR frames, the fps is 25, the resolution is 640x360, and the aspect ratio is 16:9.
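(As a quick sanity check, not part of the original steps: ffprobe can list the picture types, and every frame should report pict_type=I.)
ffprobe -show_frames -select_streams v 2@25fps@intra@640x360.h264 | grep pict_type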
-
Generate an MP4 file
MP4Box -add 2@25fps@intra@640x360.h264:timescale=1000 -fps 25 2@25fps@intra@640x360.mp4
-
Make fragmented DASH content, including the init MP4, the .m4s files, and one .mpd file
MP4Box -dash 5000 -frag 5000 -dash-scale 1000 -frag-rap -segment-name 'seg_second$Number$' -segment-timeline -profile live 2@25fps@intra@640x360.mp4
- Copy and publish all these files to a folder on an HTTP server
-
I want playback to start from 4 s into the first segment, without displaying any frames before 4 s, so I changed the .mpd file to modify the fields "SegmentTemplate@presentationTimeOffset" and "SegmentTimeline:S@d/t", as follows:
<?xml version="1.0"?>
<mpd xmlns="urn:mpeg:dash:schema:mpd:2011" minbuffertime="PT1.500S" type="static" mediapresentationduration="PT0H0M26.000S" maxsegmentduration="PT0H0M5.000S" profiles="urn:mpeg:dash:profile:isoff-live:2011">
<period duration="PT0H0M26.000S">
<adaptationset segmentalignment="true" maxwidth="640" maxheight="360" maxframerate="25" par="16:9" lang="und">
<segmenttemplate presentationtimeoffset="4000" media="seg_second$Number$.m4s" timescale="1000" startnumber="1" initialization="seg_secondinit.mp4">
<segmenttimeline>
<s d="1000" t="4000"></s>
<s d="5000" r="4"></s>
</segmenttimeline>
</segmenttemplate>
<representation mimetype="video/mp4" codecs="avc3.64101E" width="640" height="360" framerate="25" sar="1:1" startwithsap="1" bandwidth="2261831">
</representation>
</adaptationset>
</period>
</mpd> -
Playing the MPD URL in the VLC player or the Edge browser, playback always starts at the first frame of the first segment, and the frames between 0 s and 4 s are displayed unexpectedly.
What's wrong with my steps? Or are there other options for this?
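(A sketch of one alternative, added as an assumption rather than taken from the post: if the frames before 4 s must never be shown, they can be trimmed away before packaging, so that the first segment itself starts at 4 s. Stream copy should be safe here because every frame is an IDR frame; file names follow the steps above.)
ffmpeg -ss 4 -i 2@25fps@intra@640x360.h264 -c copy trimmed.h264
MP4Box -add trimmed.h264:timescale=1000 -fps 25 trimmed.mp4
MP4Box -dash 5000 -frag 5000 -dash-scale 1000 -frag-rap -segment-name 'seg_second$Number$' -segment-timeline -profile live trimmed.mp4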
-
MPD MPEG-DASH - Shows only one bitrate
13 August 2018, by Justin Rec
Help. It won't show the bitrates.
player.getBitrateInfoListFor("video");
Shows only one bitrate: 454948.
manifest.mpd generated by GPAC:
<Period duration="PT0H21M48.338S">
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="270" maxHeight="480" maxFrameRate="2070000/93437" par="270:480" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.640015" width="270" height="480" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="454948">
   <SegmentTemplate media="480_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="480_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="202" maxHeight="360" maxFrameRate="2070000/93437" par="202:360" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.64000D" width="202" height="360" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="281508">
   <SegmentTemplate media="360_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="360_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="134" maxHeight="240" maxFrameRate="2070000/93437" par="134:240" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.64000B" width="134" height="240" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="182832">
   <SegmentTemplate media="240_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="240_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" group="1" maxWidth="80" maxHeight="144" maxFrameRate="2070000/93437" par="80:144" lang="und">
  <Representation mimeType="video/mp4" codecs="avc3.640009" width="80" height="144" frameRate="2070000/93437" sar="1:1" startWithSAP="1" bandwidth="99667">
   <SegmentTemplate media="144_bbb/segment__track1_$Number$.m4s" timescale="2070000" startNumber="1" duration="8280000" initialization="144_bbb/segment__track1_init.mp4"/>
  </Representation>
 </AdaptationSet>
 <AdaptationSet segmentAlignment="true" lang="und">
  <Representation mimeType="audio/mp4" codecs="mp4a.40.2" startWithSAP="1" bandwidth="66056">
   <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="1"/>
   <SegmentTemplate media="audio_bbb/segment__track2_$Number$.m4s" timescale="48000" startNumber="1" duration="192000" initialization="audio_bbb/segment__track2_init.mp4"/>
  </Representation>
 </AdaptationSet>
</Period>
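(An observation added as an assumption, not from the original post: dash.js reports bitrates for the AdaptationSet it is currently playing, and in this manifest every video Representation sits alone in its own AdaptationSet, so only one bitrate is visible at a time. A quick way to check how many Representations share the first AdaptationSet:)
xmllint --xpath 'count((//*[local-name()="AdaptationSet"])[1]/*[local-name()="Representation"])' manifest.mpd
-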
How to use Google's Cloud Speech-to-Text REST API to transcribe a video
24 July 2018, by mrb
I'd like to get a transcript of two people speaking in a video, but I get an empty response from the Cloud Speech-to-Text API.
Approach:
I have a 56-minute video file containing a conversation between two people. I would like a transcript of that conversation, and I would like to use Google's Cloud Speech-to-Text API to get it.
To save a little on my Google Cloud Storage I first converted the video to audio using ffmpeg.
First I tried to figure out the audio codec with the command below, and it looks like AAC.
ffmpeg -i video.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'videoplayback.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: isommp42
creation_time : 2015-12-30T08:17:14.000000Z
Duration: 00:56:03.99, start: 0.000000, bitrate: 362 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 490x360 [SAR 1:1 DAR 49:36], 264 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
Metadata:
creation_time : 2015-12-30T08:17:31.000000Z
handler_name : IsoMedia File Produced by Google, 5-11-2011
So I extracted the audio from the video with:
ffmpeg -i video.mp4 -vn -acodec copy myaudio.aac
Details so far:
ffmpeg -i myaudio.aac
Output:
Input #0, aac, from 'myaudio.aac':
Duration: 00:56:47.49, bitrate: 97 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 97 kb/s
After that I converted it to Opus, because I'm told that Opus is better:
ffmpeg -i myaudio.aac -acodec libopus -b:a 97k -vbr on -compression_level 10 myaudio.opus
Info so far:
opusinfo myaudio.opus
User comments section follows...
encoder=Lavc58.18.100 libopus
Opus stream 1:
Pre-skip: 312
Playback gain: 0 dB
Channels: 2
Original sample rate: 48000Hz
Packet duration: 20.0ms (max), 20.0ms (avg), 20.0ms (min)
Page duration: 1000.0ms (max), 1000.0ms (avg), 1000.0ms (min)
Total data length: 29956714 bytes (overhead: 0.872%)
Playback length: 56m:03.990s
Average bitrate: 71.24 kb/s, w/o overhead: 70.62 kb/s
At this point I uploaded myaudio.opus to Google Cloud Storage.
curl POST 1
I started the speech recognition by doing a POST with curl:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "OGG_OPUS", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "123456789"}
(123456789 is not the actual value.)
curl GET 1
Now I wanted to get the results:
curl --request GET --url 'https://speech.googleapis.com/v1/operations/123456789?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
This gave me the error:
Error: Unable to recognize speech, possible error in encoding or channel config. Please correct the config and retry the request.
So I updated the encoding configuration from OGG_OPUS to LINEAR16.
curl POST 2
Did the POST again:
curl --request POST --header "Content-Type: application/json" --url 'https://speech.googleapis.com/v1/speech:longrunningrecognize?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}' --data '{"audio": {"uri": "gs://{MY_BUCKET}/myaudio.opus"},"config": {"encoding": "LINEAR16", "sampleRateHertz": 48000, "languageCode": "en-US"}}'
Response:
{"name": "987654321"}
curl GET 2
curl --request GET --url 'https://speech.googleapis.com/v1/operations/987654321?fields=done%2Cerror%2Cmetadata%2Cname%2Cresponse&key={MY_API_KEY}'
Response:
{
"name": "987654321",
"metadata": {
"@type": "type.googleapis.com/google.cloud.speech.v1.LongRunningRecognizeMetadata",
"progressPercent": 100,
"startTime": "2018-06-08T11:01:24.596504Z",
"lastUpdateTime": "2018-06-08T11:01:51.825882Z"
},
"done": true
}
The problem is that I don't get the actual transcription. According to the documentation there should be a response key in the response containing the data.
Since I'm kinda stuck here I'd like to know if I'm doing something completely wrong. I don't have any technical or resource limitations, so all suggestions are very welcome! I'm also happy to change my approach.
Thanks in advance! Cheers
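(A sketch of one thing worth trying, added as an assumption rather than taken from the post: declaring "LINEAR16" for a file that actually contains Ogg/Opus cannot match the bytes in the bucket, so either keep "OGG_OPUS" with "sampleRateHertz": 48000, or transcode to a lossless format the API also accepts, such as FLAC. File names follow the ones above.)
ffmpeg -i myaudio.aac -ac 1 -ar 16000 myaudio.flac
Then point the request at gs://{MY_BUCKET}/myaudio.flac with "encoding": "FLAC" and "sampleRateHertz": 16000.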