
Recherche avancée
Médias (1)
-
Rennes Emotion Map 2010-11
19 octobre 2011, par
Mis à jour : Juillet 2013
Langue : français
Type : Texte
Autres articles (82)
-
La file d’attente de SPIPmotion
28 novembre 2010, parUne file d’attente stockée dans la base de donnée
Lors de son installation, SPIPmotion crée une nouvelle table dans la base de donnée intitulée spip_spipmotion_attentes.
Cette nouvelle table est constituée des champs suivants : id_spipmotion_attente, l’identifiant numérique unique de la tâche à traiter ; id_document, l’identifiant numérique du document original à encoder ; id_objet l’identifiant unique de l’objet auquel le document encodé devra être attaché automatiquement ; objet, le type d’objet auquel (...) -
MediaSPIP version 0.1 Beta
16 avril 2011, parMediaSPIP 0.1 beta est la première version de MediaSPIP décrétée comme "utilisable".
Le fichier zip ici présent contient uniquement les sources de MediaSPIP en version standalone.
Pour avoir une installation fonctionnelle, il est nécessaire d’installer manuellement l’ensemble des dépendances logicielles sur le serveur.
Si vous souhaitez utiliser cette archive pour une installation en mode ferme, il vous faudra également procéder à d’autres modifications (...) -
Le profil des utilisateurs
12 avril 2011, parChaque utilisateur dispose d’une page de profil lui permettant de modifier ses informations personnelle. Dans le menu de haut de page par défaut, un élément de menu est automatiquement créé à l’initialisation de MediaSPIP, visible uniquement si le visiteur est identifié sur le site.
L’utilisateur a accès à la modification de profil depuis sa page auteur, un lien dans la navigation "Modifier votre profil" est (...)
Sur d’autres sites (10268)
-
IBM Watson Speech to Text Audio Conversion on Node.js Web Application
25 avril 2016, par Raquel HoseinThe gist of the issue is that IBM Watson Speech to Text only allows for FLAC, WAV, and OGG file formats to be uploaded and used with the API.
My solution to that would be that if the user uploads an mp3, BEFORE sending the file to Watson, a data conversion would take place. Essentially, the user uploads an mp3, then using ffmpeg or sox the audio would be converted to an OGG, after which the audio would then be uploaded to Watson.
What I am unsure about is : What exactly do I have to modify in the Node.js Watson code to allow for the audio conversion to happen ? Linked below is the Watson repo which is what I am working through. I am sure that the file that will have to be changes is fileupload.js, which I have linked, but where the changes go is what I am uncertain about ?
I have looked through both SO and developerWorks, the IBM SO for answers to this issue, but I have not seen any which is why I am posting here. I would be happy to clarify my question if that is necessary.
-
tools/python : add script to convert TensorFlow model (.pb) to native model (.model)
13 juin 2019, par Guo, Yejuntools/python : add script to convert TensorFlow model (.pb) to native model (.model)
For example, given TensorFlow model file espcn.pb,
to generate native model file espcn.model, just run :
python convert.py espcn.pbIn current implementation, the native model file is generated for
specific dnn network with hard-code python scripts maintained out of ffmpeg.
For example, srcnn network used by vf_sr is generated with
https://github.com/HighVoltageRocknRoll/sr/blob/master/generate_header_and_model.py#L85In this patch, the script is designed as a general solution which
converts general TensorFlow model .pb file into .model file. The script
now has some tricky to be compatible with current implemention, will
be refined step by step.The script is also added into ffmpeg source tree. It is expected there
will be many more patches and community needs the ownership of it.Another technical direction is to do the conversion in c/c++ code within
ffmpeg source tree. While .pb file is organized with protocol buffers,
it is not easy to do such work with tiny c/c++ code, see more discussion
at http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/244496.html. So,
choose the python script.Signed-off-by : Guo, Yejun <yejun.guo@intel.com>
-
Watson NarrowBand Speech to Text not accepting ogg file
19 janvier 2017, par Bob DillNodeJS app using ffmpeg to create ogg files from mp3 & mp4. If the source file is broadband, Watson Speech to Text accepts the file with no issues. If the source file is narrow band, Watson Speech to Text fails to read the ogg file. I’ve tested the output from ffmpeg and the narrowband ogg file has the same audio content (e.g. I can listen to it and hear the same people) as the mp3 file. Yes, in advance, I am changing the call to Watson to correctly specify the model and content_type. Code follows :
exports.createTranscript = function(req, res, next)
{ var _name = getNameBase(req.body.movie);
var _type = getType(req.body.movie);
var _voice = (_type == "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel" ;
var _contentType = (_type == "mp4") ? "audio/ogg" : "audio/basic" ;
var _audio = process.cwd()+"/HTML/movies/"+_name+'ogg';
var transcriptFile = process.cwd()+"/HTML/movies/"+_name+'json';
speech_to_text.createSession({model: _voice}, function(error, session) {
if (error) {console.log('error:', error);}
else
{
var params = { content_type: _contentType, continuous: true,
audio: fs.createReadStream(_audio),
session_id: session.session_id
};
speech_to_text.recognize(params, function(error, transcript) {
if (error) {console.log('error:', error);}
else
{ fs.writeFile(transcriptFile, JSON.stringify(transcript), function(err) {if (err) {console.log(err);}});
res.send(transcript);
}
});
}
});
}_type
is either mp3 (narrowband from phone recording) or mp4 (broadband)
model: _voice
has been traced to ensure correct setting
content_type: _contentType
has been traced to ensure correct settingAny ogg file submitted to Speech to Text with narrowband settings fails with
Error: No speech detected for 30s.
Tested with both real narrowband files and asking Watson to read a broadband ogg file (created from mp4) as narrowband. Same error message. What am I missing ?