Advanced search

Media (0)

Keyword: - Tags -/protocoles

No media matching your criteria is available on this site.

Other articles (111)

  • Automatic installation script for MediaSPIP

    25 April 2011

    To work around installation difficulties caused mainly by server-side software dependencies, an "all-in-one" installation script written in bash was created to make this step easier on a server running a compatible Linux distribution.
    You need SSH access to your server and a "root" account in order to use it, which is what allows the dependencies to be installed. Contact your hosting provider if you do not have these.
    The documentation on how to use the installation script (...)

  • What exactly does this script do?

    18 January 2011

    This script is written in bash. It is therefore easy to use on any server.
    It is only compatible with a specific list of distributions (see the list of compatible distributions).
    Installing MediaSPIP's dependencies
    Its main role is to install all of the software dependencies required on the server side, namely:
    The basic tools needed to install the rest of the dependencies; the development tools: build-essential (via APT from the official repositories); (...)

  • Automated installation script of MediaSPIP

    25 April 2011

    To overcome the difficulties caused mainly by server-side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
    You must have SSH access to your server and a root account in order to use it; this is what allows the dependencies to be installed. Contact your provider if you do not have these.
    The documentation on how to use this installation script is available here.
    The code of this (...)
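
    The real installer is a bash script; as a rough, hypothetical sketch only (in Python, for consistency with the code elsewhere on this page) of the kind of bootstrap step these articles describe: a root check followed by dependency installation via APT. The package list below is illustrative and not taken from the actual script.

import os
import subprocess
import sys

def install_server_dependencies():
    # The installer needs root, since it installs system packages.
    if os.geteuid() != 0:
        sys.exit("This script must be run as root.")

    # Illustrative subset; the real script installs MediaSPIP's full
    # server-side dependency list.
    packages = ["build-essential"]

    # Install via APT from the official repositories.
    subprocess.run(["apt-get", "update"], check=True)
    subprocess.run(["apt-get", "install", "-y"] + packages, check=True)

if __name__ == "__main__":
    install_server_dependencies()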

On other sites (10104)

  • How to convert a lump set of PPM files to a single .mp4 video using ffmpeg

    4 August 2021, by Nash G-L

    I'm trying to convert some .ppm files into a single video using ffmpeg. The images are named image10.ppm, image20.ppm, all the way up to image2000.ppm.

    Essentially, there are 200 images following the format of image(1-200)0.ppm in a single folder.

    I've navigated to the correct folder containing the images using cd and the terminal displays this folder as the directory I'm in.

    Inputting the following command into the terminal:

    ffmpeg -r 10 -f image2 -s 500x500 -i image%04d.ppm -vcodec libx264 -crf 25  -pix_fmt yuv420p presentation_video.mp4

    I get the error that says:

    [image2 @ 0x7ff36180ba00] Could find no file with path 'image%04d.ppm' and index in the range 0-4
image%04d.ppm: No such file or directory

    What am I missing? I've just downloaded ffmpeg and am fairly new to the command-line interface in general.
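
    For what it's worth, the image2 demuxer's %04d pattern only matches four-digit, zero-padded, consecutively numbered names such as image0001.ppm, and by default it only probes starting indices 0 through 4, which is exactly what the error message reports. Since the files here run image10.ppm through image2000.ppm in steps of ten, one hedged fix (my own sketch, not from the original post; the frame prefix is arbitrary) is to rename them into a consecutive zero-padded sequence and then run the same ffmpeg command on the new pattern:

import os
import subprocess

# The files are named image10.ppm, image20.ppm, ..., image2000.ppm.
# Rename them to a consecutive, zero-padded sequence that the image2
# demuxer can enumerate: frame0001.ppm ... frame0200.ppm.
for index, number in enumerate(range(10, 2001, 10), start=1):
    os.rename(f"image{number}.ppm", f"frame{index:04d}.ppm")

# Same options as the question's command, pointed at the new pattern.
subprocess.run([
    "ffmpeg", "-r", "10", "-f", "image2", "-s", "500x500",
    "-start_number", "1", "-i", "frame%04d.ppm",
    "-vcodec", "libx264", "-crf", "25", "-pix_fmt", "yuv420p",
    "presentation_video.mp4",
], check=True)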

    


  • Google’s YouTube Uses FFmpeg

    9 February 2011, by Multimedia Mike (General)

    Controversy arose last week when Google accused Microsoft of stealing search engine results for their Bing search engine. It was a pretty novel sting operation and Google did a good job of visually illustrating their side of the story on their official blog.

    This reminds me of the fact that Google’s YouTube video hosting site uses FFmpeg for converting videos. Not that this is in the same league as the search engine shenanigans (it’s perfectly legit to use FFmpeg in this capacity, but to my knowledge, Google/YouTube has never confirmed FFmpeg usage), but I thought I would revisit this item and illustrate it with screenshots. This is not new information— I first empirically tested this fact 4 years ago. However, a lot of people wonder how exactly I can identify FFmpeg on the backend when I claim that I’ve written code that helps power YouTube.

    Short Answer
    How do I know YouTube uses FFmpeg to convert multimedia? Because:

    1. FFmpeg can decode a number of impossibly obscure multimedia formats using code I wrote
    2. YouTube can transcode many of the same formats
    3. I screwed up when I wrote the code to support some of these weird formats
    4. My mistakes are still present when YouTube transcodes certain fringe formats

    Longer Answer (With Pictures!)
    Let’s take a video format named RoQ, developed by noted game designer Graeme Devine. Created for use in the FMV-heavy game The 11th Hour, the format eventually found its way into the Quake 3 engine as well as many games derived from the same technology.

    Dr. Tim Ferguson reverse engineered the format (though it would later be open sourced along with the rest of the Q3 engine). I wrote a RoQ playback system for FFmpeg, and I messed up in doing so. I believe my coding error helps demonstrate the case I’m trying to make here.

    Observe what happened when I pushed the jk02.roq sample through YouTube in my original experiment 4 years ago:

    [screenshot: the YouTube-transcoded jk02.roq video, with the canyon walls bleeding into the sky]

    Do you see how the canyon walls bleed into the sky? That’s not supposed to happen. FFmpeg doesn’t do that anymore, but I was able to go back into the source code history to find when it did do that:

    [screenshot: output from the old FFmpeg revision, reproducing the same color bleeding]

    Academic Answer
    FFmpeg fixed this bug in June of 2007 (thanks to Eric Lasota). The problem had to do with premature colorspace conversion in my original decoder.
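
    To make that last sentence concrete, here is a toy illustration (my own, and emphatically not the actual RoQ or FFmpeg code) of why converting colorspace too early breaks a predictive decoder: YCbCr-to-RGB conversion rounds and clamps, so a decoder that applies the encoder's residuals to already-converted pixels drifts away from the encoder's reconstruction, which can show up as color errors like the bleeding above.

def to_red(y, cr):
    # Red channel of a BT.601-style YCbCr->RGB conversion, clamped to 8 bits.
    return max(0, min(255, round(y + 1.402 * (cr - 128))))

y0, cr = 30, 40    # a dark pixel with strongly negative red chroma
delta_y = 120      # luma residual the encoder sends for the next frame

# Correct order: apply the residual in the native YCbCr space and
# convert to RGB once, at display time.
correct = to_red(y0 + delta_y, cr)                      # -> 27

# Premature order: convert to RGB first (the clamp destroys
# information), then apply the residual to the converted value.
premature = max(0, min(255, to_red(y0, cr) + delta_y))  # -> 120

print(correct, premature)   # 27 vs 120: the reconstructions disagree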

    Leftovers
    I tried uploading the video again to see if the problem persists in YouTube’s transcoder. First bit of trivia: YouTube detects when you have uploaded the same video twice and rejects the subsequent attempts. So I created a double concatenation of the video and uploaded it. The problem is gone, illustrating that the backend is actually using a newer version of FFmpeg. This surprises me for somewhat esoteric reasons.

    Here’s another interesting bit of trivia for those who don’t do a lot of YouTube uploading: YouTube reports format details when you upload a video:

    [screenshot: YouTube's upload form reporting the detected format details, including the RoQ format]

    So, yep, RoQ format. And you can wager that this will prompt me to go back through the litany of unusual formats that FFmpeg supports to see how YouTube responds.

  • Computer crashing when using python tools in same script

    5 February 2023, by SL1997

    I am attempting to use the speech recognition toolkit VOSK and the speech diarization package Resemblyzer to transcribe audio and then identify the speakers in the audio.

    Tools:

    https://github.com/alphacep/vosk-api
    https://github.com/resemble-ai/Resemblyzer

    I can do both things individually, but I run into issues when trying to do both in a single Python script.

    I used the following guide when setting up the diarization system:

    https://medium.com/saarthi-ai/who-spoke-when-build-your-own-speaker-diarization-module-from-scratch-e7d725ee279

    Computer specs are as follows:

    Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz, 3912 MHz, 2 Core(s), 4 Logical Processor(s)
    32GB RAM

    The following is my code. I am not too sure whether using threading is appropriate here, or whether I even implemented it correctly. How can I best optimize this code so that it achieves the results I am looking for without crashing?

    from vosk import Model, KaldiRecognizer
from pydub import AudioSegment
import json
import sys
import os
import subprocess
import datetime
from resemblyzer import preprocess_wav, VoiceEncoder
from pathlib import Path
from resemblyzer.hparams import sampling_rate
from spectralcluster import SpectralClusterer
import threading
import queue
import gc



def recognition(queue, audio, FRAME_RATE):

    model = Model("Vosk_Models/vosk-model-small-en-us-0.15")

    rec = KaldiRecognizer(model, FRAME_RATE)
    rec.SetWords(True)

    rec.AcceptWaveform(audio.raw_data)
    result = rec.Result()

    transcript = json.loads(result)#["text"]

    #return transcript
    queue.put(transcript)



def diarization(queue, audio):

    wav = preprocess_wav(audio)
    encoder = VoiceEncoder("cpu")
    _, cont_embeds, wav_splits = encoder.embed_utterance(wav, return_partials=True, rate=16)
    print(cont_embeds.shape)

    clusterer = SpectralClusterer(
        min_clusters=2,
        max_clusters=100,
        p_percentile=0.90,
        gaussian_blur_sigma=1)

    labels = clusterer.predict(cont_embeds)

    def create_labelling(labels, wav_splits):

        times = [((s.start + s.stop) / 2) / sampling_rate for s in wav_splits]
        labelling = []
        start_time = 0

        for i, time in enumerate(times):
            if i > 0 and labels[i] != labels[i - 1]:
                temp = [str(labels[i - 1]), start_time, time]
                labelling.append(tuple(temp))
                start_time = time
            if i == len(times) - 1:
                temp = [str(labels[i]), start_time, time]
                labelling.append(tuple(temp))

        return labelling

    #return
    labelling = create_labelling(labels, wav_splits)
    queue.put(labelling)



def identify_speaker(queue1, queue2):

    transcript = queue1.get()
    labelling = queue2.get()

    for speaker in labelling:

        speakerID = speaker[0]
        speakerStart = speaker[1]
        speakerEnd = speaker[2]

        result = transcript['result']
        words = [r['word'] for r in result if speakerStart < r['start'] < speakerEnd]
        #return
        print("Speaker",speakerID,":",' '.join(words), "\n")





def main():

    queue1 = queue.Queue()
    queue2 = queue.Queue()

    FRAME_RATE = 16000
    CHANNELS = 1

    podcast = AudioSegment.from_mp3("Podcast_Audio/Film-Release-Clip.mp3")
    podcast = podcast.set_channels(CHANNELS)
    podcast = podcast.set_frame_rate(FRAME_RATE)

    first_thread = threading.Thread(target=recognition, args=(queue1, podcast, FRAME_RATE))
    second_thread = threading.Thread(target=diarization, args=(queue2, podcast))
    third_thread = threading.Thread(target=identify_speaker, args=(queue1, queue2))

    first_thread.start()
    first_thread.join()
    gc.collect()

    second_thread.start()
    second_thread.join()
    gc.collect()

    third_thread.start()
    third_thread.join()
    gc.collect()

    # transcript = recognition(podcast,FRAME_RATE)
    #
    # labelling = diarization(podcast)
    #
    # print(identify_speaker(transcript, labelling))


if __name__ == '__main__':
    main()

    When I say crash, I mean everything freezes and I have to hold down the power button on the desktop and turn it back on again. There is no blue/blank screen; it is just frozen in my IDE, looking at my code. Any help in resolving this issue would be greatly appreciated.
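
    One hedged sketch of a mitigation, not a definitive answer. Each thread above is started and then immediately joined, so the threads add no parallelism, and the memory a finished thread allocated stays inside the one Python process; a hard system freeze with heavy models loaded is consistent with the machine running out of memory, though that diagnosis is an assumption. Running each heavy stage in its own process returns its memory to the OS when the stage exits. The sketch below reuses recognition, diarization, identify_speaker and AudioSegment from the script above, and assumes the AudioSegment pickles cleanly for the child processes:

import multiprocessing as mp
import queue

def run_stage(target, *args):
    # Run one heavy stage in a child process; everything it allocated
    # (Vosk model, VoiceEncoder, embeddings) is freed when it exits.
    q = mp.Queue()
    p = mp.Process(target=target, args=(q, *args))
    p.start()
    result = q.get()   # read the result before join() to avoid a deadlock
    p.join()
    return result

def main():
    FRAME_RATE = 16000
    podcast = AudioSegment.from_mp3("Podcast_Audio/Film-Release-Clip.mp3")
    podcast = podcast.set_channels(1).set_frame_rate(FRAME_RATE)

    # Run the two memory-hungry stages one after the other, each in
    # its own short-lived process.
    transcript = run_stage(recognition, podcast, FRAME_RATE)
    labelling = run_stage(diarization, podcast)

    # identify_speaker expects queues, so hand it the two results that way.
    q1, q2 = queue.Queue(), queue.Queue()
    q1.put(transcript)
    q2.put(labelling)
    identify_speaker(q1, q2)

if __name__ == '__main__':
    main()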