Other articles (1)

  • Submit bugs and patches

    13 April 2011

    Unfortunately, software is never perfect.
    If you think you have found a bug, report it using our ticket system. Please help us fix it by providing the following information: the browser you are using, including the exact version; as precise an explanation of the problem as possible; if possible, the steps that led to the problem; and a link to the site or page in question.
    If you think you have fixed the bug, open a ticket and attach a corrective patch to it.
    You may also (...)

On other sites (235)

  • Google Speech to Text on WAV file gives

    7 January 2021, by Darth.Vader

    I am using the Google Speech to Text API to convert a WAV file to text. The WAV file plays fine, but when I run it through the Google Speech to Text API I get this error:

    WAV header indicates an unsupported format.

    When I try to analyze the file with the ffmpeg tool, I get the following output:

    Output #0, wav, to '/home/shubham/workspace/intent-service/scripts/audio2.tmp.wav':
    Metadata:
      ISFT            : Lavf57.83.100
      Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, mono, s16, 128 kb/s
      Metadata:
        encoder         : Lavc57.107.100 pcm_s16le
    [gsm_ms @ 0x55d4c255cd20] Packet is too small
    Error while decoding stream #0:0: Invalid data found when processing input
    size=7924kB time=00:08:27.16 bitrate= 128.0kbits/s speed=3.72e+03x
    video:0kB audio:7924kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000961%

    What am I missing?
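
    The [gsm_ms @ ...] line in the log suggests that the input WAV holds GSM-compressed audio rather than uncompressed PCM, which would be consistent with the "unsupported format" error. That is an inference from the log, not something the post confirms. One way to check is to read the format tag stored in the file's 'fmt ' chunk. The following is a minimal Python sketch of that check; the helper names wav_format_tag and describe, and the short tag table, are illustrative assumptions rather than part of any library.

```python
import struct

# A few WAVE format tags (wFormatTag values); this table is not exhaustive.
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0031: "GSM 6.10",
    0xFFFE: "extensible",
}


def wav_format_tag(data: bytes) -> int:
    """Return wFormatTag from the 'fmt ' chunk of RIFF/WAVE bytes."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            if pos + 10 > len(data):
                raise ValueError("truncated 'fmt ' chunk")
            # The format tag is the first 16-bit field of the chunk body.
            (tag,) = struct.unpack_from("<H", data, pos + 8)
            return tag
        # Skip this chunk; chunk bodies are padded to an even length.
        pos += 8 + size + (size & 1)
    raise ValueError("no 'fmt ' chunk found")


def describe(tag: int) -> str:
    """Map a format tag to a readable name, falling back to its hex value."""
    return FORMAT_TAGS.get(tag, f"unknown (0x{tag:04X})")


# Hypothetical usage; "audio.wav" is a placeholder path:
# with open("audio.wav", "rb") as f:
#     print(describe(wav_format_tag(f.read())))
```

    If the tag turns out not to be 0x0001 (PCM), a common workaround is to re-encode the audio to 16-bit PCM before sending it, for example with ffmpeg -i input.wav -c:a pcm_s16le output.wav (file names are placeholders).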

    


  • Google Speech API doesn't give correct result when audio is sent in file

    4 August 2012, by Cupidvogel

    I chanced upon the article at Google Speech API, which suggests a mechanism for extracting text from an audio file with Perl. I have recorded an audio file, which you will find at http://vocaroo.com/i/s0lPN5d3YQJj. It is a simple piece of audio, reading "I love you". When I go to the Google Speech API in Chrome and speak those words, I get the right result. When I try the code from the above-mentioned link with my audio file, it returns strange results, like "logan". How can I make it more accurate? This is just a sample; what I am generally doing is extracting the audio from a video file with FFmpeg using something like ffmpeg -i input.avi -vn -ar 44100 -ac 2 -ab 192 -f mp3 output.mp3, followed by ffmpeg -i input.mp3 output.flac.
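
    Two details of the pipeline above may be worth a second look: -ab 192 without a k suffix asks for 192 bit/s rather than 192 kbit/s, and the intermediate MP3 step is lossy before the audio ever reaches FLAC. A minimal Python sketch of a single-step extraction follows. It only builds the ffmpeg argument list; the function name flac_extract_command is hypothetical, and the choice of mono audio at 16 kHz is an assumption about what the speech endpoint expects, not something stated in the post.

```python
import shlex


def flac_extract_command(src: str, dst: str, rate: int = 16000) -> list:
    """Build an ffmpeg command that extracts mono FLAC audio in one step."""
    return [
        "ffmpeg",
        "-i", src,        # input video or audio file
        "-vn",            # drop the video stream
        "-ac", "1",       # downmix to mono (assumed requirement)
        "-ar", str(rate), # resample (16 kHz is an assumed target)
        "-c:a", "flac",   # encode losslessly, with no MP3 step in between
        dst,
    ]


cmd = flac_extract_command("input.avi", "output.flac")
print(shlex.join(cmd))
# prints: ffmpeg -i input.avi -vn -ac 1 -ar 16000 -c:a flac output.flac
```

    The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed.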

  • Google Speech API - Is there a way to determine if the audio has human voice or not?

    20 December 2019, by stupid_sma

    I am making an audio filtering application at work that reads through hundreds of audio files and filters them. If an audio file has a human voice in it, the application accepts it; if it does not, the application deletes the file.

    I am using ffmpeg to get the details of the audio and to apply other filters such as size, duration and silence (though it is not very accurate at detecting silence in all audio files).

    My company asked me to try the Google Cloud Speech API to detect if the audio has any human voice in it.

    With this code, some audio files return a transcript of the words spoken in the file, but what I need is to determine whether a human is speaking or not.

    I have considered using hark.js for this, but there does not seem to be enough documentation and I am short on time!

    P.S. I am an intern and I’m just starting out with programming. I apologize if my question does not make sense or sounds dumb.

      <?php
      # Includes the autoloader for libraries installed with composer
      require __DIR__ . '/vendor/autoload.php';

      # Imports the Google Cloud client library
      use Google\Cloud\Speech\V1\SpeechClient;
      use Google\Cloud\Speech\V1\RecognitionAudio;
      use Google\Cloud\Speech\V1\RecognitionConfig;
      use Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding;

      putenv('GOOGLE_APPLICATION_CREDENTIALS=../../credentials.json');



      echo getcwd() . "<br />";
      chdir('test-sounds');
      echo getcwd() . "<br />";
      echo shell_exec('ls -lr');

      $fileList = glob('*');
      foreach($fileList as $filename){
      //echo $filename, '<br />';

      # The name of the audio file to transcribe
      $audioFile = __DIR__ . '/' . $filename;

      # get contents of a file into a string
      $content = file_get_contents($audioFile);

      # set string as audio content
      $audio = (new RecognitionAudio())
          ->setContent($content);

      # The audio file's encoding and language (no sample rate is set here)
      $config = new RecognitionConfig([
          'encoding' => AudioEncoding::LINEAR16,
          'language_code' => 'ja-JP'
      ]);

      # Instantiates a client
      $client = new SpeechClient();

      # Detects speech in the audio file
      $response = $client->recognize($config, $audio);

      # Print most likely transcription
      foreach ($response->getResults() as $result) {
          $alternatives = $result->getAlternatives();
          $mostLikely = $alternatives[0];
          $transcript = $mostLikely->getTranscript();
          printf('<br />Transcript: %s' . PHP_EOL, $transcript . '<br />');

      }

      $client->close();

      }

      ?>
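
    One rough way to turn the transcription loop above into a yes/no filter is to treat a file as containing speech only when at least one recognition alternative has a non-empty transcript with reasonable confidence. This is a heuristic built on top of transcription, not true voice activity detection, and it will miss speech in languages other than the configured one. The sketch below is written in Python for brevity and works on plain (transcript, confidence) pairs; the function name has_speech and the 0.5 threshold are assumptions, and mapping it back to the PHP loop assumes each alternative exposes getConfidence() alongside getTranscript().

```python
from typing import Iterable, Tuple


def has_speech(
    alternatives: Iterable[Tuple[str, float]],
    min_confidence: float = 0.5,  # assumed threshold; tune on real data
) -> bool:
    """Return True if any alternative has text and enough confidence."""
    for transcript, confidence in alternatives:
        if transcript.strip() and confidence >= min_confidence:
            return True
    # No results at all, or only blank / low-confidence ones.
    return False


# Hypothetical inputs standing in for values read from a response:
print(has_speech([]))                     # no results for this file
print(has_speech([("こんにちは", 0.92)]))  # confident transcript
print(has_speech([("えー", 0.20)]))        # low-confidence transcript
```

    Files for which the check returns False would be the candidates for deletion; keeping them in a review folder at first, rather than deleting them outright, makes it easier to tune the threshold.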