Media (0)

Tag: performance

No media matching your criteria is available on the site.

Other articles (34)

Retrieving information from the master site when installing an instance

26 November 2010, by kent1

Purpose
On the main site, a mutualisation (shared-hosting) instance is defined by several things: the data in the spip_mutus table; its logo; and its main author (id_admin in the spip_mutus table, corresponding to an id_auteur in the spip_auteurs table), who will be the only one able to create the mutualisation instance definitively.
It can therefore be quite sensible to retrieve some of this information in order to complete the installation of an instance, for example to: retrieve the (...)
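
The relationship described above (id_admin in spip_mutus pointing at id_auteur in spip_auteurs) can be illustrated with a small, self-contained sketch. This is not MediaSPIP code: SQLite stands in for the real database, and every column other than id_admin and id_auteur (here id_mutu, nom_site and nom) is a hypothetical name chosen for the example.

```python
import sqlite3

# Hypothetical, minimal stand-ins for the two tables named in the excerpt.
# Only spip_mutus.id_admin and spip_auteurs.id_auteur come from the article;
# id_mutu, nom_site and nom are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spip_auteurs (id_auteur INTEGER PRIMARY KEY, nom TEXT);
    CREATE TABLE spip_mutus (id_mutu INTEGER PRIMARY KEY, nom_site TEXT, id_admin INTEGER);
    INSERT INTO spip_auteurs VALUES (7, 'alice');
    INSERT INTO spip_mutus VALUES (1, 'demo', 7);
""")


def instance_admin(conn, id_mutu):
    """Return (nom_site, admin name) for one instance, or None if it does not exist."""
    return conn.execute(
        "SELECT m.nom_site, a.nom FROM spip_mutus AS m "
        "JOIN spip_auteurs AS a ON a.id_auteur = m.id_admin "
        "WHERE m.id_mutu = ?",
        (id_mutu,),
    ).fetchone()


print(instance_admin(conn, 1))
```
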
No talk of markets, cloud, etc.

10 April 2011

The vocabulary used on this site tries to avoid any reference to the fashionable jargon that flourishes on Web 2.0 and in the companies that make a living from it.
You are therefore invited to avoid terms such as "Brand", "Cloud", "Market", etc.
Our motivation is above all to create a simple tool, accessible to everyone, that encourages the sharing of creations on the Internet and lets authors keep as much autonomy as possible.
No "Gold or Premium contract" is therefore planned, no (...)
HTML5 audio and video support

13 April 2011, by kent1

MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The player was created specifically for MediaSPIP and can easily be adapted to fit a specific theme.
For older browsers, the Flowplayer Flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...)


On other sites (4435)

Can I make calls to APIs such as youtube-dl and ffmpeg from a Chrome app?

8 January 2015, by ErickR

First of all, I haven’t started implementing the system I’m about to describe, as I didn’t want to commit to building something I wasn’t sure was possible.

So, what I’m trying to achieve is to build a Chrome app that downloads the audio from certain websites (e.g. YouTube and SoundCloud) using youtube-dl, post-processes it using ffmpeg and then uploads it to a cloud service via some API. The reason I want to do it via a Chrome app is that I could do all the work on the client side (no need for servers) and I’d be able to insert JavaScript into the pages using content scripts, which would make the app pretty simple to use (I could create buttons such as ’download song’ and the like).
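
To make the intended pipeline concrete, here is a minimal sketch of the two command lines involved, built in Python but not executed. It assumes youtube-dl and ffmpeg are installed and on the PATH of a regular process outside the browser; the helper names, the output template, the MP3 format and the 44.1 kHz stereo target are illustrative choices, not part of the question. It says nothing about whether a Chrome app is allowed to launch such commands.

```python
from typing import List


def youtube_dl_audio_cmd(url: str, out_template: str = "%(id)s.%(ext)s") -> List[str]:
    """Build a youtube-dl command that extracts the audio track of `url` as MP3."""
    # -x / --extract-audio, --audio-format and -o are standard youtube-dl options.
    return ["youtube-dl", "-x", "--audio-format", "mp3", "-o", out_template, url]


def ffmpeg_resample_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that re-encodes `src` to 44.1 kHz stereo in `dst`."""
    # -y overwrites the output file, -ar sets the sample rate, -ac the channel count.
    return ["ffmpeg", "-y", "-i", src, "-ar", "44100", "-ac", "2", dst]


# The lists can be passed to subprocess.run(cmd, check=True) by a native process.
print(youtube_dl_audio_cmd("https://example.com/watch?v=abc"))
print(ffmpeg_resample_cmd("abc.mp3", "abc_processed.mp3"))
```
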

Although I have already read the NaCl Technical Overview and some of the Application Structure documentation, I am still not sure whether I would be able to make these calls via some C/C++ module or whether they would be denied for security reasons.

To summarize: assuming the user has the required dependencies on their system (youtube-dl, Python, ffmpeg, etc.), is it possible to call third-party tools such as the ones described above from a Chrome app using NaCl?

Thank you all in advance.

How to Stream Audio from Google Cloud Storage in Chunks and Convert Each Chunk to WAV for Whisper Transcription

14 November 2024, by Douglas Landvik

I'm working on a project where I need to transcribe audio stored in a Google Cloud Storage bucket using OpenAI's Whisper model. The audio is stored in WebM format with Opus encoding, and due to the file size, I'm streaming the audio in 30-second chunks.

To convert each chunk to WAV (16 kHz, mono, 16-bit PCM) compatible with Whisper, I'm using FFmpeg. The first chunk converts successfully, but subsequent chunks fail to convert. I suspect this is because each chunk lacks the WebM container's header, which FFmpeg needs to interpret the Opus codec correctly.
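
One quick way to test that suspicion is to check whether a chunk begins with the EBML header that opens every WebM/Matroska file. The sketch below is only a diagnostic under that assumption: the function name is made up for this example, and a chunk without the magic bytes is not necessarily undecodable by other means.

```python
# The 4-byte EBML header element ID that opens a WebM/Matroska file.
EBML_MAGIC = b"\x1a\x45\xdf\xa3"


def starts_with_ebml_header(chunk: bytes) -> bool:
    """Return True if `chunk` begins with the EBML magic bytes."""
    return chunk[:4] == EBML_MAGIC


# Only the first byte range of the file is expected to pass this check.
first_chunk = EBML_MAGIC + b"\x00" * 16      # stand-in for the start of the file
later_chunk = b"\xa3\x41\x02\x81" + b"\x00" * 16  # stand-in for a mid-file byte range
print(starts_with_ebml_header(first_chunk), starts_with_ebml_header(later_chunk))
```
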

Here’s a simplified version of my approach:

Download chunk: I download each chunk from GCS as bytes.
Convert with FFmpeg: I pass the bytes to FFmpeg to convert each chunk from WebM/Opus to WAV.
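
The download step boils down to byte-range arithmetic, isolated here as a small sketch. The function name is hypothetical; the inclusive start/end convention mirrors the one used in the code in this post. Note that a chunk size of 30 * 16000 * 2 bytes corresponds to 30 seconds only for 16 kHz, 16-bit mono PCM, so on a compressed WebM/Opus file the same byte count covers a different and variable duration.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets covering `total_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < total_size:
        # The last range is clipped to the final byte of the object.
        end = min(start + chunk_size - 1, total_size - 1)
        yield start, end
        start += chunk_size


print(list(chunk_ranges(10, 4)))
```
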

import asyncio
import io
import json
import os
import subprocess

from google.cloud import storage
from google.oauth2 import service_account

# Project-specific names (logger, ConsultationService, ConsultationProcessor, Consultation,
# TemplateService, notification_manager, send_discord_alert, detect_audio_format) are defined
# elsewhere in the project.


async def handle_transcription_and_notify(
    consultation_service: ConsultationService,
    consultation_processor: ConsultationProcessor,
    consultation: Consultation,
    language: str,
    notes: str,
    clinic_id: str,
    vet_email: str,
    trace_id: str,
    blob_path: str,
    max_retries: int = 3,
    retry_delay: int = 5,
    max_concurrent_tasks: int = 3
):
    """
    Handles the transcription process by streaming the file from GCS, converting to a compatible format,
    and notifying the client via WebSocket.
    """
    chunk_duration_sec = 30  # 30 seconds per chunk
    logger.info(f"Starting transcription process for consultation {consultation.consultation_id}",
                extra={'trace_id': trace_id})

    # Initialize GCS client
    service_account_key = os.environ.get('SERVICE_ACCOUNT_KEY_BACKEND')
    if not service_account_key:
        logger.error("Service account key not found in environment variables", extra={'trace_id': trace_id})
        await send_discord_alert(
            f"Service account key not found for consultation {consultation.consultation_id}.\nTrace ID: {trace_id}"
        )
        return

    try:
        service_account_info = json.loads(service_account_key)
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
    except Exception as e:
        logger.error(f"Error loading service account credentials: {str(e)}", extra={'trace_id': trace_id})
        await send_discord_alert(
            f"Error loading service account credentials for consultation {consultation.consultation_id}.\nError: {str(e)}\nTrace ID: {trace_id}"
        )
        return

    storage_client = storage.Client(credentials=credentials)
    bucket_name = 'vetz_consultations'
    blob = storage_client.bucket(bucket_name).get_blob(blob_path)
    bytes_per_second = 16000 * 2  # 32,000 bytes per second
    chunk_size_bytes = 30 * bytes_per_second
    size = blob.size

    async def stream_blob_in_chunks(blob, chunk_size):
        loop = asyncio.get_running_loop()
        start = 0
        size = blob.size
        while start < size:
            end = min(start + chunk_size - 1, size - 1)
            try:
                logger.info(f"Requesting chunk from {start} to {end}", extra={'trace_id': trace_id})
                chunk = await loop.run_in_executor(
                    None, lambda: blob.download_as_bytes(start=start, end=end)
                )
                if not chunk:
                    break
                logger.info(f"Yielding chunk from {start} to {end}, size: {len(chunk)} bytes",
                            extra={'trace_id': trace_id})
                yield chunk
                start += chunk_size
            except Exception as e:
                logger.error(f"Error downloading chunk from {start} to {end}: {str(e)}", exc_info=True,
                             extra={'trace_id': trace_id})
                raise e

    async def convert_to_wav(chunk_bytes, chunk_idx):
        """
        Convert audio chunk to WAV format compatible with Whisper, ensuring it's 16 kHz, mono, and 16-bit PCM.
        """
        try:
            logger.debug(f"Processing chunk {chunk_idx}: size = {len(chunk_bytes)} bytes")

            detected_format = await detect_audio_format(chunk_bytes)
            logger.info(f"Detected audio format for chunk {chunk_idx}: {detected_format}")
            input_io = io.BytesIO(chunk_bytes)
            output_io = io.BytesIO()

            # ffmpeg command to convert webm/opus to WAV with 16 kHz, mono, and 16-bit PCM

            # ffmpeg command with debug information
            ffmpeg_command = [
                "ffmpeg",
                "-loglevel", "debug",
                "-f", "s16le",            # Treat input as raw PCM data
                "-ar", "48000",           # Set input sample rate
                "-ac", "1",               # Set input to mono
                "-i", "pipe:0",
                "-ar", "16000",           # Set output sample rate to 16 kHz
                "-ac", "1",               # Ensure mono output
                "-sample_fmt", "s16",     # Set output format to 16-bit PCM
                "-f", "wav",              # Output as WAV format
                "pipe:1"
            ]

            process = subprocess.Popen(
                ffmpeg_command,
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE
            )

            stdout, stderr = process.communicate(input=input_io.read())

            if process.returncode == 0:
                logger.info(f"FFmpeg conversion completed successfully for chunk {chunk_idx}")
                output_io.write(stdout)
                output_io.seek(0)

                # Save the WAV file locally for listening
                output_dir = "converted_chunks"
                os.makedirs(output_dir, exist_ok=True)
                file_path = os.path.join(output_dir, f"chunk_{chunk_idx}.wav")

                with open(file_path, "wb") as f:
                    f.write(stdout)
                logger.info(f"Chunk {chunk_idx} saved to {file_path}")

                return output_io
            else:
                logger.error(f"FFmpeg failed for chunk {chunk_idx} with return code {process.returncode}")
                logger.error(f"Chunk {chunk_idx} - FFmpeg stderr: {stderr.decode()}")
                return None

        except Exception as e:
            logger.error(f"Unexpected error in FFmpeg conversion for chunk {chunk_idx}: {str(e)}")
            return None

    async def transcribe_chunk(idx, chunk_bytes):
        for attempt in range(1, max_retries + 1):
            try:
                logger.info(f"Transcribing chunk {idx + 1} (attempt {attempt}).", extra={'trace_id': trace_id})

                # Convert to WAV format
                wav_io = await convert_to_wav(chunk_bytes, idx)
                if not wav_io:
                    logger.error(f"Failed to convert chunk {idx + 1} to WAV format.")
                    return ""

                wav_io.name = "chunk.wav"
                chunk_transcription = await consultation_processor.transcribe_audio_whisper(wav_io)
                logger.info(f"Chunk {idx + 1} transcribed successfully.", extra={'trace_id': trace_id})
                return chunk_transcription
            except Exception as e:
                logger.error(f"Error transcribing chunk {idx + 1} (attempt {attempt}): {str(e)}", exc_info=True,
                             extra={'trace_id': trace_id})
                if attempt < max_retries:
                    await asyncio.sleep(retry_delay)
                else:
                    await send_discord_alert(
                        f"Max retries reached for chunk {idx + 1} in consultation {consultation.consultation_id}.\nError: {str(e)}\nTrace ID: {trace_id}"
                    )
                    return ""  # Return empty string for failed chunk

    await notification_manager.send_personal_message(
        f"Consultation {consultation.consultation_id} is being transcribed.", vet_email
    )

    try:
        idx = 0
        full_transcription = []
        async for chunk in stream_blob_in_chunks(blob, chunk_size_bytes):
            transcription = await transcribe_chunk(idx, chunk)
            if transcription:
                full_transcription.append(transcription)
            idx += 1

        combined_transcription = " ".join(full_transcription)
        consultation.full_transcript = (consultation.full_transcript or "") + " " + combined_transcription
        consultation_service.save_consultation(clinic_id, vet_email, consultation)
        logger.info(f"Transcription saved for consultation {consultation.consultation_id}.",
                    extra={'trace_id': trace_id})

    except Exception as e:
        logger.error(f"Error during transcription process: {str(e)}", exc_info=True, extra={'trace_id': trace_id})
        await send_discord_alert(
            f"Error during transcription process for consultation {consultation.consultation_id}.\nError: {str(e)}\nTrace ID: {trace_id}"
        )
        return

    await notification_manager.send_personal_message(
        f"Consultation {consultation.consultation_id} has been transcribed.", vet_email
    )

    try:
        template_service = TemplateService()
        medical_record_template = template_service.get_template_by_name(
            consultation.medical_record_template_id).sections

        sections = await consultation_processor.extract_structured_sections(
            transcription=consultation.full_transcript,
            notes=notes,
            language=language,
            template=medical_record_template,
        )
        consultation.sections = sections
        consultation_service.save_consultation(clinic_id, vet_email, consultation)
        logger.info(f"Sections processed for consultation {consultation.consultation_id}.",
                    extra={'trace_id': trace_id})
    except Exception as e:
        logger.error(f"Error processing sections for consultation {consultation.consultation_id}: {str(e)}",
                     exc_info=True, extra={'trace_id': trace_id})
        await send_discord_alert(
            f"Error processing sections for consultation {consultation.consultation_id}.\nError: {str(e)}\nTrace ID: {trace_id}"
        )
        raise e

    await notification_manager.send_personal_message(
        f"Consultation {consultation.consultation_id} is fully processed.", vet_email
    )
    logger.info(f"Successfully processed consultation {consultation.consultation_id}.",
                extra={'trace_id': trace_id})
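
One direction that might avoid the missing-header problem, sketched here as an untested assumption rather than a confirmed fix, is to feed the whole WebM stream to a single ffmpeg process and let ffmpeg's segment muxer cut the decoded audio into 30-second WAV files. The helper name and the output pattern are made up for this example; the ffmpeg options used (-i, -ar, -ac, -c:a, -f segment, -segment_time) are standard ones.

```python
from typing import List


def ffmpeg_segment_cmd(out_pattern: str = "chunk_%03d.wav", segment_sec: int = 30) -> List[str]:
    """Build an ffmpeg command that reads a container from stdin and writes WAV segments."""
    return [
        "ffmpeg",
        "-i", "pipe:0",                      # no input format forced: ffmpeg probes the container
        "-ar", "16000",                      # output sample rate: 16 kHz
        "-ac", "1",                          # mono output
        "-c:a", "pcm_s16le",                 # 16-bit PCM samples
        "-f", "segment",                     # segment muxer splits the output by duration
        "-segment_time", str(segment_sec),
        out_pattern,                         # segment format is inferred from the .wav extension
    ]


# The downloaded bytes would be written, in order, to the stdin of one long-lived process.
print(" ".join(ffmpeg_segment_cmd()))
```
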

Upload part of a video file to the server

10 September 2014, by Peter Lur

I want to make a webpage where you can upload a video to the server (say, using an HTML file input). I would like to know whether it is possible, in any language, to upload only small parts of the video file, rather than uploading the whole file first and then splitting it.

Let me give you an example of what I am trying to achieve. I am trying to create a service where a user uploads a long, large video file and gets back 10 sample segments of it.

Let's say I have a 1-hour, 1 GB video (AVI) file.

I would like 10 one-minute segments, spread across the entire file, to be uploaded to the server.

I know that I could upload the whole file and then split it, but uploading a 1 GB file would take far too long for the user.

I want to know whether there is a way to choose up front which parts of the file need to be uploaded, so that I could upload only specific segments and then reconstruct them into readable files.
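
As a rough illustration of the arithmetic involved, the sketch below computes byte ranges for evenly spaced samples. It rests on a strong assumption that the file has a constant bitrate, which real AVI files do not guarantee, and the function name and parameters are hypothetical. Raw byte slices taken this way would still lack the container header and index, so they would not be playable on their own; in a browser, the slicing itself could be done with `Blob.slice()` before upload.

```python
from typing import List, Tuple


def sample_byte_ranges(file_size: int, duration_sec: float,
                       n_segments: int = 10, segment_sec: float = 60) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges for evenly spaced samples.

    Assumes a constant bitrate, i.e. bytes map linearly to seconds.
    """
    if duration_sec <= 0 or n_segments <= 0 or n_segments * segment_sec > duration_sec:
        raise ValueError("segments do not fit in the given duration")
    bytes_per_sec = file_size / duration_sec
    stride_sec = duration_sec / n_segments   # distance between segment start times
    ranges = []
    for i in range(n_segments):
        start_sec = i * stride_sec
        start = int(start_sec * bytes_per_sec)
        end = min(int((start_sec + segment_sec) * bytes_per_sec), file_size) - 1
        ranges.append((start, end))
    return ranges


# Toy numbers: a 3,600,000-byte file lasting one hour (1000 bytes per second).
for start, end in sample_byte_ranges(3_600_000, 3600):
    print(start, end)
```
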

I know this is a somewhat subjective question and may not be well suited to Stack Overflow, but I really need to figure out how to do this.

