Media (1)

Keyword: - Tags -/Rennes

Other articles (82)

The SPIPmotion queue

28 November 2010, by kent1

A queue stored in the database
When it is installed, SPIPmotion creates a new database table named spip_spipmotion_attentes.
This new table is made up of the following fields: id_spipmotion_attente, the unique numeric identifier of the task to process; id_document, the numeric identifier of the original document to encode; id_objet, the unique identifier of the object to which the encoded document should be attached automatically; objet, the type of object to which (...)
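
As a rough illustration of the table described above, a queue row can be pictured as a plain record. The sketch below uses only the four field names given in the summary. The sample values and the nextTask helper are hypothetical, and processing tasks in ascending identifier order is an assumption, not something the article states.

```javascript
// Illustrative rows using only the four fields named above; values are made up.
const queue = [
  { id_spipmotion_attente: 2, id_document: 15, id_objet: 7, objet: 'article' },
  { id_spipmotion_attente: 1, id_document: 12, id_objet: 7, objet: 'article' }
];

// Assumption: tasks are handled in ascending order of their numeric identifier.
// Returns the row with the smallest id_spipmotion_attente, or null if empty.
function nextTask(rows) {
  return rows.reduce(
    (best, row) =>
      best === null || row.id_spipmotion_attente < best.id_spipmotion_attente
        ? row
        : best,
    null
  );
}
```
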
MediaSPIP version 0.1 Beta

16 April 2011, by kent1

MediaSPIP 0.1 beta is the first version of MediaSPIP declared "usable".
The zip file provided here contains only the MediaSPIP sources, in the standalone version.
For a working installation, all software dependencies must be installed manually on the server.
If you want to use this archive for a farm-mode installation, you will also need to make other changes (...)
User profiles

12 April 2011, by kent1

Each user has a profile page for editing their personal information. In the default top-of-page menu, a menu item is created automatically when MediaSPIP is initialized; it is visible only if the visitor is logged in to the site.
The user can edit their profile from their author page; a "Modifier votre profil" (Edit your profile) link in the navigation is (...)

On other sites (10268)

IBM Watson Speech to Text Audio Conversion on Node.js Web Application

25 April 2016, by Raquel Hosein

The gist of the issue is that IBM Watson Speech to Text only accepts FLAC, WAV, and OGG files for upload and use with the API.

My solution is to convert the data before sending the file to Watson. The user uploads an mp3, the audio is converted to OGG using ffmpeg or sox, and the converted audio is then uploaded to Watson.

What I am unsure about is this: what exactly do I have to modify in the Node.js Watson code to allow the audio conversion to happen? Linked below is the Watson repo I am working from. I am sure that the file that has to be changed is fileupload.js, which I have linked, but I am uncertain where the changes should go.

I have looked through both SO and developerWorks (IBM's equivalent of SO) for answers to this issue, but I have not found any, which is why I am posting here. I would be happy to clarify my question if necessary.

Watson Speech to Text Repo
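
The conversion step described in the question could be sketched as follows. This is a minimal sketch under assumptions, not the repo's code: it assumes an ffmpeg binary on the PATH built with libopus, and the helper names oggPathFor, buildFfmpegArgs and convertToOgg are hypothetical. Where the call belongs inside fileupload.js is not shown, since that file's contents are not part of the question.

```javascript
const { spawn } = require('child_process');
const path = require('path');

// Hypothetical helper: derive the .ogg path next to the uploaded file.
function oggPathFor(inputPath) {
  const { dir, name } = path.parse(inputPath);
  return path.join(dir, name + '.ogg');
}

// Build the ffmpeg argument list: overwrite existing output (-y), drop any
// video stream (-vn), and encode the audio with Opus into an Ogg container.
function buildFfmpegArgs(inputPath, outputPath) {
  return ['-y', '-i', inputPath, '-vn', '-c:a', 'libopus', outputPath];
}

// Run ffmpeg and resolve with the output path once it exits cleanly.
// stdio is ignored so ffmpeg's log output cannot fill an unread pipe.
function convertToOgg(inputPath) {
  const outputPath = oggPathFor(inputPath);
  return new Promise((resolve, reject) => {
    const proc = spawn('ffmpeg', buildFfmpegArgs(inputPath, outputPath), {
      stdio: 'ignore'
    });
    proc.on('error', reject); // for example, ffmpeg is not installed
    proc.on('close', (code) => {
      if (code === 0) resolve(outputPath);
      else reject(new Error('ffmpeg exited with code ' + code));
    });
  });
}
```

In an upload handler, the promise returned by convertToOgg would be awaited before the resulting .ogg file is opened as a read stream and passed to the Watson client with a content type of audio/ogg.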

tools/python: add script to convert TensorFlow model (.pb) to native model (.model)

13 June 2019, by Guo, Yejun

For example, given the TensorFlow model file espcn.pb,
to generate the native model file espcn.model, just run:
python convert.py espcn.pb

In the current implementation, the native model file is generated for
a specific DNN network with hard-coded Python scripts maintained outside of ffmpeg.
For example, the srcnn network used by vf_sr is generated with
https://github.com/HighVoltageRocknRoll/sr/blob/master/generate_header_and_model.py#L85

In this patch, the script is designed as a general solution which
converts a general TensorFlow model .pb file into a .model file. The script
currently contains some tricks to stay compatible with the current
implementation; these will be refined step by step.

The script is also added to the ffmpeg source tree. Many more patches
are expected, and the community needs ownership of it.

Another technical direction is to do the conversion in C/C++ code within
the ffmpeg source tree. Since the .pb file is organized with protocol buffers,
it is not easy to do such work with a small amount of C/C++ code; see more
discussion at http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/244496.html.
So the Python script was chosen.

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>

Files changed:
.gitignore
tools/python/convert.py
tools/python/convert_from_tensorflow.py

Watson NarrowBand Speech to Text not accepting ogg file

19 January 2017, by Bob Dill

A NodeJS app uses ffmpeg to create ogg files from mp3 and mp4 sources. If the source file is broadband, Watson Speech to Text accepts the file with no issues. If the source file is narrowband, Watson Speech to Text fails to read the ogg file. I've tested the output from ffmpeg, and the narrowband ogg file has the same audio content as the mp3 file (I can listen to it and hear the same people). And yes, I am changing the call to Watson to specify the model and content_type accordingly. Code follows:

exports.createTranscript = function(req, res, next) {
  var _name = getNameBase(req.body.movie);
  var _type = getType(req.body.movie);
  var _voice = (_type == "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel";
  var _contentType = (_type == "mp4") ? "audio/ogg" : "audio/basic";
  var _audio = process.cwd() + "/HTML/movies/" + _name + 'ogg';
  var transcriptFile = process.cwd() + "/HTML/movies/" + _name + 'json';

  speech_to_text.createSession({model: _voice}, function(error, session) {
    if (error) {
      console.log('error:', error);
    } else {
      var params = {
        content_type: _contentType,
        continuous: true,
        audio: fs.createReadStream(_audio),
        session_id: session.session_id
      };
      speech_to_text.recognize(params, function(error, transcript) {
        if (error) {
          console.log('error:', error);
        } else {
          fs.writeFile(transcriptFile, JSON.stringify(transcript), function(err) {
            if (err) { console.log(err); }
          });
          res.send(transcript);
        }
      });
    }
  });
};

_type is either mp3 (narrowband, from a phone recording) or mp4 (broadband).
model: _voice has been traced to ensure it is set correctly.
content_type: _contentType has been traced to ensure it is set correctly.

Any ogg file submitted to Speech to Text with narrowband settings fails with "Error: No speech detected for 30s". I tested both with real narrowband files and by asking Watson to read a broadband ogg file (created from mp4) as narrowband; the error message is the same. What am I missing?
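
One thing that may be worth checking: in the code above, narrowband requests are labelled audio/basic even though the file being streamed is still an ogg file. Assuming content_type is meant to describe the encoding of the bytes sent (audio/basic denoting 8 kHz mu-law audio) rather than the bandwidth, that mismatch could plausibly produce a "no speech detected" error. The sketch below separates the two choices. The recognizeSettings helper is hypothetical and is a possible direction, not a confirmed fix.

```javascript
// Hypothetical helper: the model follows the bandwidth of the source, while
// the content type follows the format of the bytes actually sent (always Ogg
// in the question's setup, whatever the source type was).
function recognizeSettings(sourceType) {
  const isBroadband = sourceType === 'mp4';
  return {
    model: isBroadband ? 'en-US_BroadbandModel' : 'en-US_NarrowbandModel',
    content_type: 'audio/ogg'
  };
}
```

With this split, an mp3 source would select en-US_NarrowbandModel while still declaring audio/ogg for the converted file.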
