On other sites (233)

  • ffmpeg get the audio stream from mp4 and send it to speech recognition

    12 July 2013, by user1896859

    I have a few .mp4 video files, each of which begins with a spoken word. I want to load these files, extract the audio, identify the spoken word, and rename each file accordingly.

    Currently I convert all the .mp4 files to .wav, send those to speech recognition, and then rename the files based on the result.

    Is there a way to skip the conversion to .wav and send the .mp4 audio stream directly to speech recognition?

    Thanks,
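
One possible approach, sketched in Python since the question does not name a language: ffmpeg can decode the audio track and write it to standard output, so no intermediate .wav file is written to disk. The sample rate, the mono downmix, and the function names below are assumptions for illustration; the recognizer itself is left out because the question does not say which one is used.

```python
import io
import subprocess
import wave

SAMPLE_RATE = 16000  # assumed; use whatever rate your recognizer expects


def ffmpeg_pcm_command(mp4_path, sample_rate=SAMPLE_RATE):
    """Build an ffmpeg command that writes mono 16-bit PCM to stdout."""
    return [
        "ffmpeg", "-v", "error",
        "-i", mp4_path,
        "-vn",                    # ignore the video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample to the target rate
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "pipe:1",                 # write to stdout instead of a file
    ]


def extract_pcm(mp4_path, sample_rate=SAMPLE_RATE):
    """Run ffmpeg and return the decoded audio as bytes (requires ffmpeg on PATH)."""
    result = subprocess.run(
        ffmpeg_pcm_command(mp4_path, sample_rate),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def pcm_to_wav_bytes(pcm, sample_rate=SAMPLE_RATE):
    """Wrap raw PCM in an in-memory WAV container for recognizers that expect WAV."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()
```

A caller would pass `pcm_to_wav_bytes(extract_pcm("clip.mp4"))` (the file name is hypothetical) to whatever recognizer is in use. The decoding step still happens, but only in memory.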

  • C# - Capture RTP Stream and send to speech recognition

    2 September 2017, by dgreenheck

    What I am trying to accomplish:

    • Capture RTP Stream in C#
    • Forward that stream to the System.Speech.SpeechRecognitionEngine

    I am creating a Linux-based robot that will take microphone input and send it to a Windows machine, which will process the audio using Microsoft Speech Recognition and send the response back to the robot. The robot might be hundreds of miles from the server, so I would like to do this over the Internet.

    What I have done so far:

    • Have the robot generate an RTP stream encoded in MP3 format (other formats available) using FFmpeg (the robot is running on a Raspberry Pi running Arch Linux)
    • Captured stream on the client computer using VLC ActiveX control
    • Found that the SpeechRecognitionEngine offers these input methods:
      1. recognizer.SetInputToWaveStream()
      2. recognizer.SetInputToAudioStream()
      3. recognizer.SetInputToDefaultAudioDevice()
    • Looked at using JACK to send the output of the app to line-in, but was completely confused by it.

    What I need help with:

    I’m stuck on how to actually send the stream from VLC to the SpeechRecognitionEngine. VLC doesn’t expose the stream at all. Is there a way I can capture the stream and pass that stream object to the SpeechRecognitionEngine? Or is RTP not the right solution here?

    Thanks in advance for your help.
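
One way to get a readable stream, sketched here in Python rather than the C# of the question: run ffmpeg as a child process on the receiving side instead of VLC, and have it decode the RTP input to raw PCM on standard output. The SDP file name, the 16 kHz mono format, and all function names are assumptions for illustration. The same idea should carry over to C#, where a child process's standard output is a stream that could be handed to `SetInputToAudioStream()` together with a matching audio format description, but that part is untested here.

```python
import subprocess


def ffmpeg_rtp_command(sdp_path, sample_rate=16000):
    """Build an ffmpeg command that decodes an RTP session to raw PCM on stdout."""
    return [
        "ffmpeg", "-v", "error",
        "-protocol_whitelist", "file,udp,rtp",  # allow the SDP file to open RTP/UDP
        "-i", sdp_path,                          # SDP file describing the RTP session
        "-vn", "-ac", "1", "-ar", str(sample_rate),
        "-f", "s16le",                           # raw signed 16-bit little-endian
        "pipe:1",                                # write to stdout
    ]


def read_chunks(stream, chunk_bytes=3200):
    """Yield fixed-size blocks until EOF; the final block may be shorter.

    3200 bytes is 100 ms of 16 kHz mono 16-bit audio.
    """
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        yield chunk


def stream_rtp_audio(sdp_path, handle_chunk):
    """Feed decoded audio to handle_chunk as it arrives (requires ffmpeg on PATH)."""
    proc = subprocess.Popen(ffmpeg_rtp_command(sdp_path), stdout=subprocess.PIPE)
    try:
        for chunk in read_chunks(proc.stdout):
            handle_chunk(chunk)
    finally:
        proc.terminate()
        proc.wait()
```

Here `handle_chunk` stands in for whatever pushes audio into the recognizer. Decoding to plain PCM on the receiver also means the recognizer never has to understand MP3 or RTP.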
