
Media (1)
-
Collections - Quick creation form
19 February 2013
Updated: February 2013
Language: French
Type: Image
Other articles (63)
-
Websites made with MediaSPIP
2 May 2011
This page lists some websites based on MediaSPIP.
-
Contribute to a better visual interface
13 April 2011
MediaSPIP is based on a system of themes and templates. Templates define the placement of information on the page, and can be adapted to a wide range of uses. Themes define the overall graphic appearance of the site.
Anyone can submit a new graphic theme or template and make it available to the MediaSPIP community.
-
Submit enhancements and plugins
13 April 2011
If you have developed a new extension to add one or more useful features to MediaSPIP, let us know and its integration into the core MediaSPIP functionality will be considered.
You can use the development discussion list to request help with creating a plugin. As MediaSPIP is based on SPIP, you can also use the SPIP discussion list, SPIP-Zone.
On other sites (10050)
-
I get an ffmpeg error at run time in my project
3 July 2023, by Jesy J
Runtime error: can't load audio from file: 'ffmpeg' not found. Please install 'ffmpeg' in your system to use non-wav audio file format and make sure 'ffprobe' is in your path



I have configured ffmpeg on my system, but I still get this error.


This is my code:


!pip install gradio
!pip install SpeechRecognition
!pip install pydub
!pip install openai

import gradio as gr
import speech_recognition as sr
from pydub import AudioSegment
import openai

# Set up OpenAI API
openai.api_key = [MASKED]

# Function to convert text to speech using OpenAI's API
def text_to_speech(text, language):
    response = openai.Completion.create(
        engine="davinci",
        prompt=f"Translate the following English text into {language}: \"{text}\"",
        max_tokens=100,
        temperature=0.8,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        stop=None,
        n=1,
        log_level="info"
    )
    return response.choices[0].text.strip()

# Function to recognize speech from audio
def speech_to_text(audio):
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio) as source:
        audio_data = recognizer.record(source)
    return recognizer.recognize_google(audio_data)

# Function to convert audio to desired language
def convert_language(audio, target_language):
    recognized_text = speech_to_text(audio)
    translated_text = text_to_speech(recognized_text, target_language)
    return translated_text

# Function to process user input and generate output
def process_audio(input_audio, target_language):
    converted_text = convert_language(input_audio.name, target_language)
    return gr.outputs.Audio(converted_text, type="filepath")

# Set up Gradio interface
audio_input = gr.inputs.Audio(source="microphone")

language_input = gr.inputs.Dropdown(choices=["English", "French", "German"]) # Add more languages as needed

output_audio = gr.outputs.Audio(type="filepath", label="Output Audio")

title = "Multilingual AI Voice Assistant"

description = "Upload an audio file and select the target language for translation."

gr.Interface(fn=process_audio, inputs=[audio_input, language_input], outputs=output_audio, title=title, description=description).launch()
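
A quick way to narrow down this kind of error is to ask the Python process itself whether it can see the two executables named in the message. The `!pip install` lines suggest the code runs in a notebook, where the kernel's PATH may differ from the shell in which ffmpeg was configured; that is an assumption, not something the post confirms. The helper names `find_audio_tools` and `missing_tools` below are hypothetical and only illustrate the check, using the standard library's `shutil.which`.

```python
import shutil


def find_audio_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools that could not be resolved."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_audio_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
    if missing_tools(found):
        print("The Python process cannot see:", ", ".join(missing_tools(found)))
```

If either tool is reported as missing here, the problem is likely with the environment the code runs in rather than with the code itself, and ffmpeg would need to be installed or added to PATH for that environment.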



-
Transcription via OpenAI's Whisper: AssertionError: incorrect audio shape
1 April 2024, by muratowski
I'm trying to use OpenAI's open source Whisper library to transcribe audio files.


Here is my script's source code:


import whisper

model = whisper.load_model("large-v2")

# load the entire audio file
audio = whisper.load_audio("/content/file.mp3")
# When I add `audio = whisper.pad_or_trim(audio)` here, the first 30 seconds are transcribed without any problem.

# make log-Mel spectrogram and move to the same device as the model
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

# decode the audio
options = whisper.DecodingOptions(fp16=False)
result = whisper.decode(model, mel, options)

# print the recognized text if available
try:
    if hasattr(result, "text"):
        print(result.text)
except Exception as e:
    print(f"Error while printing transcription: {e}")

# write the recognized text to a file
try:
    with open("output_of_file.txt", "w") as f:
        f.write(result.text)
    print("Transcription saved to file.")
except Exception as e:
    print(f"Error while saving transcription: {e}")



In this part:


# load the entire audio file
audio = whisper.load_audio("/content/file.mp3")



when I add "audio = whisper.pad_or_trim(audio)" below it, the first 30 seconds of the audio file are transcribed without any problem, and language detection works as well,


but when I delete that line and try to transcribe the whole file, I get the following error:




AssertionError: incorrect audio shape




What should I do? Should I change the structure of the audio file? If so, which library should I use, and what kind of script should I write?
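
The behaviour described above is consistent with Whisper's low-level `decode` call expecting a spectrogram of exactly one 30-second window, which is what `pad_or_trim` produces. For a whole file, the library's higher-level `model.transcribe(path)` is usually the simpler route, since it steps through long audio itself. The sketch below shows the underlying idea by hand: cut the audio into 30-second windows and decode each one. The helper names `window_bounds` and `transcribe_in_windows` are hypothetical, and the approach is deliberately naive (words that straddle a window boundary may be cut).

```python
SAMPLE_RATE = 16000        # Whisper's load_audio resamples to 16 kHz
WINDOW_SECONDS = 30        # the low-level decode path works on 30 s windows
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def window_bounds(n_samples, window=WINDOW_SAMPLES):
    """Return (start, end) sample indices that cover the audio in fixed windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [(start, min(start + window, n_samples))
            for start in range(0, n_samples, window)]


def transcribe_in_windows(model, audio):
    """Decode each window separately and join the text (naive sketch)."""
    import whisper  # imported lazily so window_bounds works without it

    options = whisper.DecodingOptions(fp16=False)
    pieces = []
    for start, end in window_bounds(len(audio)):
        # pad_or_trim pads the final, shorter window up to 30 s
        chunk = whisper.pad_or_trim(audio[start:end])
        mel = whisper.log_mel_spectrogram(chunk).to(model.device)
        pieces.append(whisper.decode(model, mel, options).text)
    return " ".join(pieces)
```

With the script from the question, this would be called as `transcribe_in_windows(model, whisper.load_audio("/content/file.mp3"))`. No change to the audio file's structure should be needed for this approach.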


-
Changes to the WebM Open Source License
5 June 2010, by noreply@blogger.com (John Luther)
You’ll see on the WebM license page and in our source code repositories that we’ve made a small change to our open source license. There were a couple of issues that popped up after we released WebM at Google I/O a couple of weeks ago, specifically around how the patent clause was written.
As it was originally written, if a patent action was brought against Google, the patent license terminated. This provision itself is not unusual in an OSS license, and similar provisions exist in version 2 of the Apache License and in version 3 of the GPL. The twist was that ours terminated "any" rights and not just rights to the patents, which made our license incompatible with GPLv3 and GPLv2. Also, in doing this, we effectively created a potentially new open source copyright license, something we are loath to do.
Using patent language borrowed from both the Apache and GPLv3 patent clauses, in this new iteration of the patent clause we’ve decoupled patents from copyright, thus preserving the pure BSD nature of the copyright license. This means we are no longer creating a new open source copyright license, and the patent grant can exist on its own. Additionally, we have updated the patent grant language to make it clearer that the grant includes the right to modify the code and give it to others. (We’ve updated the licensing FAQ to reflect these changes as well.)
We’ve also added a definition for the "this implementation" language, to make that clearer.
Thanks for your patience as we worked through this, and we hope you like, enjoy and (most importantly) use WebM and join with us in creating more freedom online. We had a lot of help on these changes, so thanks to our friends in open source and free software who traded many emails, often at odd hours, with us.
Chris DiBona is the Open Source Programs Manager at Google.