Recherche avancée

Médias (1)

Mot : - Tags -/Rennes

Autres articles (82)

  • La file d’attente de SPIPmotion

    28 novembre 2010, par

    Une file d’attente stockée dans la base de donnée
    Lors de son installation, SPIPmotion crée une nouvelle table dans la base de donnée intitulée spip_spipmotion_attentes.
    Cette nouvelle table est constituée des champs suivants : id_spipmotion_attente, l’identifiant numérique unique de la tâche à traiter ; id_document, l’identifiant numérique du document original à encoder ; id_objet l’identifiant unique de l’objet auquel le document encodé devra être attaché automatiquement ; objet, le type d’objet auquel (...)

  • MediaSPIP version 0.1 Beta

    16 avril 2011, par

    MediaSPIP 0.1 beta est la première version de MediaSPIP décrétée comme "utilisable".
    Le fichier zip ici présent contient uniquement les sources de MediaSPIP en version standalone.
    Pour avoir une installation fonctionnelle, il est nécessaire d’installer manuellement l’ensemble des dépendances logicielles sur le serveur.
    Si vous souhaitez utiliser cette archive pour une installation en mode ferme, il vous faudra également procéder à d’autres modifications (...)

  • Le profil des utilisateurs

    12 avril 2011, par

    Chaque utilisateur dispose d’une page de profil lui permettant de modifier ses informations personnelle. Dans le menu de haut de page par défaut, un élément de menu est automatiquement créé à l’initialisation de MediaSPIP, visible uniquement si le visiteur est identifié sur le site.
    L’utilisateur a accès à la modification de profil depuis sa page auteur, un lien dans la navigation "Modifier votre profil" est (...)

Sur d’autres sites (10268)

  • IBM Watson Speech to Text Audio Conversion on Node.js Web Application

    25 avril 2016, par Raquel Hosein

    The gist of the issue is that IBM Watson Speech to Text only allows for FLAC, WAV, and OGG file formats to be uploaded and used with the API.

    My solution to that would be that if the user uploads an mp3, BEFORE sending the file to Watson, a data conversion would take place. Essentially, the user uploads an mp3, then using ffmpeg or sox the audio would be converted to an OGG, after which the audio would then be uploaded to Watson.

    What I am unsure about is : What exactly do I have to modify in the Node.js Watson code to allow for the audio conversion to happen ? Linked below is the Watson repo which is what I am working through. I am sure that the file that will have to be changes is fileupload.js, which I have linked, but where the changes go is what I am uncertain about ?

    I have looked through both SO and developerWorks, the IBM SO for answers to this issue, but I have not seen any which is why I am posting here. I would be happy to clarify my question if that is necessary.

    Watson Speech to Text Repo

  • tools/python : add script to convert TensorFlow model (.pb) to native model (.model)

    13 juin 2019, par Guo, Yejun
    tools/python : add script to convert TensorFlow model (.pb) to native model (.model)
    

    For example, given TensorFlow model file espcn.pb,
    to generate native model file espcn.model, just run :
    python convert.py espcn.pb

    In current implementation, the native model file is generated for
    specific dnn network with hard-code python scripts maintained out of ffmpeg.
    For example, srcnn network used by vf_sr is generated with
    https://github.com/HighVoltageRocknRoll/sr/blob/master/generate_header_and_model.py#L85

    In this patch, the script is designed as a general solution which
    converts general TensorFlow model .pb file into .model file. The script
    now has some tricky to be compatible with current implemention, will
    be refined step by step.

    The script is also added into ffmpeg source tree. It is expected there
    will be many more patches and community needs the ownership of it.

    Another technical direction is to do the conversion in c/c++ code within
    ffmpeg source tree. While .pb file is organized with protocol buffers,
    it is not easy to do such work with tiny c/c++ code, see more discussion
    at http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/244496.html. So,
    choose the python script.

    Signed-off-by : Guo, Yejun <yejun.guo@intel.com>

    • [DH] .gitignore
    • [DH] tools/python/convert.py
    • [DH] tools/python/convert_from_tensorflow.py
  • Watson NarrowBand Speech to Text not accepting ogg file

    19 janvier 2017, par Bob Dill

    NodeJS app using ffmpeg to create ogg files from mp3 & mp4. If the source file is broadband, Watson Speech to Text accepts the file with no issues. If the source file is narrow band, Watson Speech to Text fails to read the ogg file. I’ve tested the output from ffmpeg and the narrowband ogg file has the same audio content (e.g. I can listen to it and hear the same people) as the mp3 file. Yes, in advance, I am changing the call to Watson to correctly specify the model and content_type. Code follows :

    exports.createTranscript = function(req, res, next)
    { var _name = getNameBase(req.body.movie);
     var _type = getType(req.body.movie);
     var _voice = (_type == "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel" ;
     var _contentType = (_type == "mp4") ? "audio/ogg" : "audio/basic" ;
     var _audio = process.cwd()+"/HTML/movies/"+_name+'ogg';
     var transcriptFile = process.cwd()+"/HTML/movies/"+_name+'json';

     speech_to_text.createSession({model: _voice}, function(error, session) {
       if (error) {console.log('error:', error);}
       else
         {
           var params = { content_type: _contentType, continuous: true,
            audio: fs.createReadStream(_audio),
             session_id: session.session_id
             };
             speech_to_text.recognize(params, function(error, transcript) {
               if (error) {console.log('error:', error);}
               else
                 { fs.writeFile(transcriptFile, JSON.stringify(transcript), function(err) {if (err) {console.log(err);}});
                   res.send(transcript);
                 }
             });
         }
     });
    }

    _type is either mp3 (narrowband from phone recording) or mp4 (broadband)
    model: _voice has been traced to ensure correct setting
    content_type: _contentType has been traced to ensure correct setting

    Any ogg file submitted to Speech to Text with narrowband settings fails with Error: No speech detected for 30s. Tested with both real narrowband files and asking Watson to read a broadband ogg file (created from mp4) as narrowband. Same error message. What am I missing ?