-
Synchronize video subtitle with text-to-speech voice
8 December 2015, by Ahmad
I am trying to create a video of a text in which the text is narrated by text-to-speech.
To create the video file, I use the VideoFileWriter of AForge.NET as follows:
VideoWriter = new VideoFileWriter();
VideoWriter.Open(CurVideoFile, (int)(Properties.Settings.Default.VideoWidth),
    (int)(Properties.Settings.Default.VideoHeight), 25, VideoCodec.MPEG4, 800000);
To read the text aloud, I use the SpeechSynthesizer class and write the output to a wave stream:
AudioStream = new FileStream(CurAudioFile, FileMode.Create);
synth.SetOutputToWaveStream(AudioStream);
I want to highlight each word in the video as it is spoken, so I synchronize them using the SpeakProgress event:
void synth_SpeakProgress(object sender, SpeakProgressEventArgs e)
{
    curAuidoPosition = e.AudioPosition;
    using (Graphics g = Graphics.FromImage(Screen))
    {
        g.DrawString(e.Text,....);
    }
    VideoWriter.WriteVideoFrame(Screen, curAuidoPosition);
}
And finally, I merge the video and audio using ffmpeg:
using (Process process = new Process())
{
    process.StartInfo.FileName = exe_path;
    process.StartInfo.Arguments = string.Format(@"-i ""{0}"" -i ""{1}"" -y -acodec copy -vcodec copy ""{2}""",
        avi_path, mp3_path, output_file);
    ......
The problem is that for some voices, such as Microsoft Hazel, Zira and David, the video is not synchronized with the audio: the audio runs much faster than the displayed subtitle. On Windows 7, it works for Microsoft Sam.
How can I synchronize them so that it works for any text-to-speech voice?
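As an illustration of the timing relationship involved, the sketch below maps word positions in the audio to frame indices at the 25 fps used above, so that every frame between two word events shows the word being spoken. It is written in JavaScript rather than C#. The `{ text, audioPositionMs }` event shape and both function names are hypothetical stand-ins for the values delivered by `SpeakProgress`, and the sketch assumes the reported positions are accurate, which may not hold for every voice.

```javascript
// Convert an audio position in milliseconds to a frame index at the given rate.
// Multiplying before dividing keeps the arithmetic exact for integer inputs.
function frameIndexAt(audioPositionMs, fps) {
  return Math.floor((audioPositionMs * fps) / 1000);
}

// Build one subtitle entry per video frame.
// `events` is a hypothetical list of { text, audioPositionMs } objects.
function buildFrameSchedule(events, totalDurationMs, fps) {
  const totalFrames = Math.ceil((totalDurationMs * fps) / 1000);
  const schedule = new Array(totalFrames).fill('');
  const sorted = [...events].sort((a, b) => a.audioPositionMs - b.audioPositionMs);

  sorted.forEach((event, i) => {
    const start = frameIndexAt(event.audioPositionMs, fps);
    // A word stays on screen until the next word starts, or until the end.
    const end = i + 1 < sorted.length
      ? frameIndexAt(sorted[i + 1].audioPositionMs, fps)
      : totalFrames;
    for (let f = start; f < Math.min(end, totalFrames); f++) {
      schedule[f] = event.text;
    }
  });
  return schedule;
}

const schedule = buildFrameSchedule(
  [{ text: 'Hello', audioPositionMs: 0 }, { text: 'world', audioPositionMs: 400 }],
  1000,
  25
);
console.log(schedule.length, schedule[9], schedule[10]); // 25 Hello world
```

Filling every frame slot, rather than writing a frame only when an event fires, keeps the video duration tied to the audio duration instead of to the number of events received.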
-
m4a/mp3 files to wav for Bing Speech API
17 December 2018, by Waqas
The Bing Speech API only accepts wav files, so I have been trying to convert the m4a (Skype) and mp3 (Facebook) audio files I receive in my chatbot to wav format. I am using fluent-ffmpeg in Node.js.
For now, I download the audio file, convert it to wav and return the piped output for later use.
if (attachment.contentType === 'audio/x-m4a') {
    request.get(attachment.contentUrl).pipe(fs.createWriteStream('file.m4a'));
    var command = ffmpeg('file.m4a')
        .toFormat('wav')
        .on('error', function (err) {
            console.log('An error occurred: ' + err.message);
        })
        .on('progress', function (progress) {
            // console.log(JSON.stringify(progress));
            console.log('Processing: ' + progress.targetSize + ' KB converted');
        })
        .on('end', function () {
            console.log('Processing finished !');
        });
    return command.pipe();
}
Right now, the conversion works when I send the m4a file through the botframework-emulator on my PC. But when I specify my PC as the endpoint (through ngrok) and try to send the m4a file from the chat test in the Bot Framework developer portal, ffmpeg returns an error:
An error occurred: ffmpeg exited with code 1: file.m4a: Invalid data found when processing input
But when I play the downloaded m4a file, it plays alright.
The content URL is https in the second case if that matters.
Kindly help me with two things:
- Downloading, converting and returning without storing anything on my end
- Downloading/accessing m4a/mp3 files properly
I am new to streams, pipes and ffmpeg, and all of the above code comes from googling.
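One thing the snippet above does not do is wait for the download to finish: the conversion is started right after `pipe()` is called, so `file.m4a` may still be incomplete when ffmpeg opens it. Whether that is the cause of the error here is an assumption, but a minimal sketch of waiting for the write stream's `finish` event could look like this. The helper name `pipeAndWait` is hypothetical; it relies only on the standard `on` and `pipe` methods of Node.js streams.

```javascript
// Pipe `readable` into `writable` and resolve once everything has been flushed.
// Rejects if either side reports an error.
function pipeAndWait(readable, writable) {
  return new Promise((resolve, reject) => {
    readable.on('error', reject);   // e.g. the download failed
    writable.on('error', reject);   // e.g. the file could not be written
    writable.on('finish', resolve); // all data has been written out
    readable.pipe(writable);
  });
}

// Intended use with the calls from the question (not executed here):
//   pipeAndWait(request.get(attachment.contentUrl), fs.createWriteStream('file.m4a'))
//     .then(function () {
//       return ffmpeg('file.m4a').toFormat('wav').pipe();
//     });
```

Because the result is now a promise, the caller would need to consume the converted stream asynchronously rather than through a direct `return`.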
-
Google Speech API "Sample rate in request does not match FLAC header"
13 February 2017, by kjdion84
I’m trying to convert an mp4 video clip into a FLAC audio file and then have the Google Speech API transcribe the words from the video so that I can detect whether specific words were said.
I have everything working except that I am getting an error from the Speech API:
{
    "error": {
        "code": 400,
        "message": "Sample rate in request does not match FLAC header.",
        "status": "INVALID_ARGUMENT"
    }
}
I am using FFmpeg to convert the mp4 into a FLAC file. I specify in the command that the FLAC file be 16-bit, but when I right-click the FLAC file, Windows tells me it is 302 kbps.
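The two numbers describe different things: 16 bits is the sample depth (`-sample_fmt s16`), while 302 kbps is the average bitrate of the compressed file, so the Windows figure does not by itself indicate a problem. The error message concerns the sample rate, which FFmpeg keeps from the source audio unless told otherwise. As a hedged sketch (in JavaScript, with the hypothetical function name `readFlacStreamInfo`), the following reads the sample rate, channel count and bit depth from the STREAMINFO block at the start of a FLAC file so they can be compared with the `sampleRate` sent in the request.

```javascript
// Read basic stream parameters from the start of a FLAC file.
// `bytes` is a Buffer or Uint8Array holding at least the first 22 bytes.
function readFlacStreamInfo(bytes) {
  if (bytes.length < 22) {
    throw new Error('Need at least 22 bytes');
  }
  const marker = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  if (marker !== 'fLaC') {
    throw new Error('Not a FLAC stream');
  }
  // Low 7 bits of byte 4 are the metadata block type; STREAMINFO is type 0.
  if ((bytes[4] & 0x7f) !== 0) {
    throw new Error('First metadata block is not STREAMINFO');
  }
  // STREAMINFO body starts at offset 8: block sizes (4 bytes) and
  // frame sizes (6 bytes) come first, so the sample rate starts at offset 18.
  const sampleRate = (bytes[18] << 12) | (bytes[19] << 4) | (bytes[20] >> 4); // 20 bits
  const channels = ((bytes[20] >> 1) & 0x07) + 1;                             // 3 bits, stored minus 1
  const bitsPerSample = (((bytes[20] & 0x01) << 4) | (bytes[21] >> 4)) + 1;   // 5 bits, stored minus 1
  return { sampleRate, channels, bitsPerSample };
}

// Intended use (not executed here; 'test.flac' is the file from the question):
//   const info = readFlacStreamInfo(require('fs').readFileSync('test.flac'));
//   console.log(info.sampleRate);
```

If the reported rate is not 16000, one option is to send the actual value as `sampleRate`; another is to resample during conversion with FFmpeg's `-ar` option.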
Here is my PHP code:
// convert mp4 video to 16 bit flac audio file
$cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -c:a flac -sample_fmt s16 C:/wamp/www/test.flac';
exec($cmd, $output);
// convert flac to text so we can detect if certain words were said
$data = array(
    "config" => array(
        "encoding" => "FLAC",
        "sampleRate" => 16000,
        "languageCode" => "en-US"
    ),
    "audio" => array(
        "content" => base64_encode(file_get_contents("test.flac")),
    )
);
$json_data = json_encode($data);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=MY_API_KEY');
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $json_data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);