On other sites (233)

  • Synchronize video subtitle with text-to-speech voice

    8 December 2015, by Ahmad

    I am trying to create a video of a text in which the text is narrated by text-to-speech.

    To create the video file, I use the VideoFileWriter class from AForge.NET as follows:

    VideoWriter = new VideoFileWriter();

    VideoWriter.Open(CurVideoFile, (int)(Properties.Settings.Default.VideoWidth),
       (int)(Properties.Settings.Default.VideoHeight), 25, VideoCodec.MPEG4, 800000);

    To read the text aloud, I use the SpeechSynthesizer class and write the output to a wave stream:

    AudioStream = new FileStream(CurAudioFile, FileMode.Create);
    synth.SetOutputToWaveStream(AudioStream);

    I want to highlight each word in the video as it is spoken, so I synchronize them using the SpeakProgress event:

    void synth_SpeakProgress(object sender, SpeakProgressEventArgs e)
    {
        // Position in the audio stream of the word being spoken.
        curAuidoPosition = e.AudioPosition;
        using (Graphics g = Graphics.FromImage(Screen))
        {
            g.DrawString(e.Text,....);
        }
        // Write the frame at the timestamp of the current word.
        VideoWriter.WriteVideoFrame(Screen, curAuidoPosition);
    }

    Finally, I merge the video and audio using ffmpeg:

    using (Process process = new Process())
    {
        process.StartInfo.FileName = exe_path;
        process.StartInfo.Arguments = string.Format(@"-i ""{0}"" -i ""{1}"" -y -acodec copy -vcodec copy ""{2}""",
                                                    avi_path, mp3_path, output_file);
    ......

    The problem is that for some voices, such as Microsoft Hazel, Zira and David, the video is not synchronized with the audio: the audio runs much faster than the displayed subtitle. On Windows 7, it works for Microsoft Sam.

    How can I synchronize them so that it works for any text-to-speech voice?
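
    One possible approach, sketched here in JavaScript rather than the question's C#: at a fixed 25 fps, each audio position reported by SpeakProgress can be converted into a target frame index, and the previous image repeated until the video reaches that index. The function name `framesUntil` and the hold-the-last-frame strategy are assumptions made for illustration, not something the question confirms.

```javascript
// Hypothetical helper: how many copies of the current image must be written
// so that the video reaches the audio position reported for the next word.
function framesUntil(audioPositionMs, framesWritten, fps) {
  // Frame index that corresponds to the audio position at a constant frame rate.
  const targetFrame = Math.floor((audioPositionMs * fps) / 1000);
  // Never go backwards: if the video is already ahead, write nothing.
  return Math.max(0, targetFrame - framesWritten);
}

// Example: word events at 0 ms, 480 ms and 1000 ms in a 25 fps video.
let written = 0;
const plan = [0, 480, 1000].map((ms) => {
  const n = framesUntil(ms, written, 25);
  written += n;
  return n;
});
console.log(plan, written);
```

    Driving the frame count from the audio position, rather than from how often the event fires, keeps the video timeline independent of how a particular voice schedules its events.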

  • m4a/mp3 files to wav for Bing Speech API

    17 December 2018, by Waqas

    Bing Speech API only accepts wav files, so I have been trying to convert the m4a (Skype) and mp3 (Facebook) audio files I receive in my chatbot to wav format. I am using fluent-ffmpeg in Node.js.

    For now, I am downloading the audio file, converting it to wav and returning the piped output for later use.

    if (attachment.contentType === 'audio/x-m4a') {
        request.get(attachment.contentUrl).pipe(fs.createWriteStream('file.m4a'));
        var command = ffmpeg('file.m4a')
            .toFormat('wav')
            .on('error', function (err) {
                console.log('An error occurred: ' + err.message);
            })
            .on('progress', function (progress) {
                // console.log(JSON.stringify(progress));
                console.log('Processing: ' + progress.targetSize + ' KB converted');
            })
            .on('end', function () {
                console.log('Processing finished !');
            });

        return command.pipe();
    }

    Right now, the conversion works when I send the m4a file through the botframework-emulator on my PC. But when I specify my PC as the endpoint (through ngrok) and send the m4a file from the test chat on the Bot Framework developer side, ffmpeg returns an error:

    An error occurred: ffmpeg exited with code 1: file.m4a: Invalid data found when processing input

    But when I play the downloaded m4a file, it plays fine.

    The content URL is https in the second case, if that matters.
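
    One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: `pipe()` returns immediately, so ffmpeg can start reading `file.m4a` before the download has finished writing it. That would be consistent with the file playing fine afterwards. The sketch below uses only Node's built-in `stream.pipeline`; `saveThen` and its callback are hypothetical names, and the conversion step is left to the caller.

```javascript
const fs = require('fs');
const { pipeline } = require('stream');

// Hypothetical helper: copy a readable stream to disk and invoke `done`
// only once the write stream has finished (or an error occurred).
function saveThen(source, destPath, done) {
  pipeline(source, fs.createWriteStream(destPath), (err) => {
    // `err` is falsy on success; only then is the file safe to read.
    done(err, destPath);
  });
}

// Usage sketch: start the conversion inside the callback, not right after pipe().
// saveThen(request.get(attachment.contentUrl), 'file.m4a', (err, path) => {
//   if (err) return console.log('Download failed: ' + err.message);
//   ffmpeg(path).toFormat('wav') /* ...same handlers as in the question... */;
// });
```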

    Please help me with two things:

    1. Downloading, Converting and Returning without storing anything on my end
    2. Downloading/Accessing m4a/mp3 files properly

    I am new to streams, pipes and ffmpeg, and all of the above code is the result of googling.
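
    For the second point, it may help to inspect what was actually downloaded before handing it to ffmpeg. MP4-family files such as m4a typically carry the ASCII tag `ftyp` at byte offset 4, so a body that starts with something else (an HTML or JSON error page, for instance) would be one way to end up with "Invalid data found". This is only a heuristic, and `looksLikeMp4` is a hypothetical name.

```javascript
// Hypothetical diagnostic: does this buffer start like an MP4/M4A container?
// ISO base media files begin with a 4-byte box size followed by the type 'ftyp'.
function looksLikeMp4(buf) {
  return buf.length >= 8 && buf.toString('latin1', 4, 8) === 'ftyp';
}

// Synthetic inputs for illustration, not real downloads.
const good = Buffer.concat([Buffer.from([0, 0, 0, 0x20]), Buffer.from('ftypM4A ')]);
const bad = Buffer.from('<html><body>Unauthorized</body></html>');
console.log(looksLikeMp4(good), looksLikeMp4(bad));
```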

  • Google Speech API "Sample rate in request does not match FLAC header"

    13 February 2017, by kjdion84

    I’m trying to convert an mp4 video clip into a FLAC audio file and then have Google Speech transcribe the words from the video so that I can detect whether specific words were said.

    I have everything working, except that I am getting an error from the Speech API:

    {
     "error": {
       "code": 400,
       "message": "Sample rate in request does not match FLAC header.",
       "status": "INVALID_ARGUMENT"
     }
    }
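
    The message suggests that the `sampleRate` sent in the request differs from the rate stored in the file itself. A FLAC file records its sample rate in the STREAMINFO block right after the `fLaC` marker, so it can be read directly. The following is a minimal sketch in JavaScript; the function name `readFlacStreamInfo` is made up for illustration, and it assumes STREAMINFO is the first metadata block, as the FLAC format requires.

```javascript
// Hypothetical helper: read sample rate, channel count and bit depth from the
// STREAMINFO block at the start of a FLAC file.
function readFlacStreamInfo(buf) {
  if (buf.length < 22 || buf.toString('latin1', 0, 4) !== 'fLaC') {
    throw new Error('Not a FLAC stream');
  }
  // Byte 4 is the metadata block header; its low 7 bits are the block type (0 = STREAMINFO).
  if ((buf[4] & 0x7f) !== 0) {
    throw new Error('First metadata block is not STREAMINFO');
  }
  // STREAMINFO data starts at byte 8; the sample rate is the 20 bits starting at byte 18.
  const sampleRate = (buf[18] << 12) | (buf[19] << 4) | (buf[20] >> 4);
  const channels = ((buf[20] >> 1) & 0x07) + 1;
  const bitsPerSample = (((buf[20] & 0x01) << 4) | (buf[21] >> 4)) + 1;
  return { sampleRate, channels, bitsPerSample };
}

// Synthetic header for a 44.1 kHz, stereo, 16-bit stream (not a real file).
const header = Buffer.alloc(42);
header.write('fLaC', 0, 'latin1');
header[7] = 34; // STREAMINFO length in bytes
header[18] = 0x0a;
header[19] = 0xc4;
header[20] = 0x42;
header[21] = 0xf0;
console.log(readFlacStreamInfo(header));
```

    With a real file, passing the bytes of `test.flac` to this function would show whether the header says 16000 or some other rate.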

    I am using FFmpeg to convert the mp4 into a FLAC file. I am specifying in the command that the FLAC file be 16-bit, but when I right-click on the FLAC file, Windows tells me it is 302 kbps.
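
    Bit depth, sample rate and bitrate are three different quantities: `-sample_fmt s16` sets 16 bits per sample, `-ar` sets the sample rate in Hz, and the 302 kbps that Windows reports is the resulting compressed bitrate. Below is a hedged sketch, in JavaScript, of an argument list that resamples to the 16000 Hz declared in the request. The helper name `flacArgs` is hypothetical, and `-ac 1` (mono) is an assumption about what the API expects.

```javascript
// Hypothetical helper: build ffmpeg arguments that extract audio as FLAC
// with an explicit sample rate, so the file header matches the request.
function flacArgs(input, output, sampleRateHz) {
  return [
    '-i', input,
    '-vn',                        // drop the video stream
    '-ac', '1',                   // mono (assumption, see above)
    '-ar', String(sampleRateHz),  // sample rate written to the FLAC header
    '-c:a', 'flac',
    '-sample_fmt', 's16',         // 16 bits per sample
    output,
  ];
}

const sampleRate = 16000;
const args = flacArgs('test.mp4', 'test.flac', sampleRate);
// The same value goes into the recognition config.
const config = { encoding: 'FLAC', sampleRate: sampleRate, languageCode: 'en-US' };
console.log('ffmpeg ' + args.join(' '));
console.log(JSON.stringify(config));
```

    The alternative is to leave the file untouched and set `sampleRate` in the request to whatever rate the source audio actually has.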

    Here is my PHP code:

    // convert mp4 video to 16 bit flac audio file
    $cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -c:a flac -sample_fmt s16 C:/wamp/www/test.flac';
    exec($cmd, $output);

    // convert flac to text so we can detect if certain words were said
    $data = array(
       "config" => array(
           "encoding" => "FLAC",
           "sampleRate" => 16000,
           "languageCode" => "en-US"
       ),
       "audio" => array(
           "content" => base64_encode(file_get_contents("test.flac")),
       )
    );

    $json_data = json_encode($data);

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=MY_API_KEY');
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json"));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $json_data);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

    $result = curl_exec($ch);