
Media (1)

Keyword: - Tags -/wave

Other articles (41)

Authorizations overridden by plugins

27 April 2010, by kent1

MediaSPIP core
autoriser_auteur_modifier(), so that visitors can edit their own information on the authors page
Supporting all media types

13 April 2011, by kent1

Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats:
images: png, gif, jpg, bmp and more
audio: MP3, Ogg, Wav and more
video: AVI, MP4, OGV, mpg, mov, wmv and more
text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)
Encoding and processing into web-friendly formats

13 April 2011, by kent1

MediaSPIP automatically converts uploaded files to internet-compatible formats.
Video files are encoded in MP4, Ogv and WebM (supported by HTML5) and MP4 (supported by Flash).
Audio files are encoded in MP3 and Ogg (supported by HTML5) and MP3 (supported by Flash).
Where possible, text is analyzed in order to retrieve the data needed for search engine detection, and then exported as a series of image files.
All uploaded files are stored online in their original format, so you can (...)


On other sites (6942)

Start and end time of MoviePy's VideoClip not working

21 March 2024, by ernesto casco velazquez

I'm trying to add captions to a video. The desired outcome is to show each word at the exact moment it is being said.

I have a method that gives me the accurate start and end time for each word:

import whisper


def get_words_per_time(audio_speech_file):
    # Transcribe with word-level timestamps, then flatten all segments
    # into a single list of word dictionaries.
    model = whisper.load_model("base")
    transcribe = model.transcribe(
        audio=audio_speech_file, fp16=False, word_timestamps=True
    )
    segments = transcribe["segments"]
    words = []

    for seg in segments:
        for word in seg["words"]:
            words.append(
                {
                    "word": word["word"],
                    "start": word["start"],
                    "end": word["end"],
                    "prob": round(word["probability"], 4),
                }
            )
    return words

Then I have code that uses MoviePy to create a TextClip and assign a given start and end time to each pair of words (I know there are redundant statements, sorry):

from moviepy.editor import TextClip
from rich.progress import track  # assumed source of track(); not stated in the post


def generate_captions(
    words,
    font="Komika",
    fontsize=32,
    color="White",
    align="center",
    stroke_width=3,
    stroke_color="black",
):
    text_comp = []
    # Walk the word list two at a time; each clip shows one pair of words.
    for i in track(range(0, len(words), 2), description="Creating captions..."):
        word1 = words[i]
        if i + 1 < len(words):
            word2 = words[i + 1]
        text_clip = TextClip(
            f"{word1['word']} {word2['word'] if i + 1 < len(words) else ''}",
            font=font,  # Change Font if not found
            fontsize=fontsize,
            color=color,
            align=align,
            method="caption",
            size=(660, None),
            stroke_width=stroke_width,
            stroke_color=stroke_color,
        )
        text_clip = text_clip.set_start(word1["start"])
        text_clip = text_clip.set_end(
            word2["end"] if i + 1 < len(words) else word1["end"]
        )
        text_comp.append(text_clip)
    return text_comp

Finally, I concatenate the words into a single video:

vid_clip = CompositeVideoClip(
    [vid_clip, concatenate_videoclips(text_comp).set_position(("center", 860))]
)

The output is this, but you can clearly see the words are not flowing with the speech. They somehow move faster, as if the start/end time did not matter. Here's the video

The words with their respective start/end times look like this:

[
    {'word': 'This', 'start': 0.0, 'end': 0.22, 'prob': 0.805},
    {'word': 'is', 'start': 0.22, 'end': 0.42, 'prob': 0.9991},
    {'word': 'a', 'start': 0.42, 'end': 0.6, 'prob': 0.999},
    {'word': 'test,', 'start': 0.6, 'end': 1.04, 'prob': 0.9939},
    {'word': 'to', 'start': 1.18, 'end': 1.3, 'prob': 0.9847},
    {'word': 'show', 'start': 1.3, 'end': 1.54, 'prob': 0.9971},
    {'word': 'words', 'start': 1.54, 'end': 1.9, 'prob': 0.995},
    {'word': 'does', 'start': 1.9, 'end': 2.16, 'prob': 0.997},
    {'word': 'not', 'start': 2.16, 'end': 2.4, 'prob': 0.9978},
    {'word': 'appear.', 'start': 2.4, 'end': 2.82, 'prob': 0.9984},
    {'word': 'At', 'start': 3.46, 'end': 3.6, 'prob': 0.9793},
    {'word': 'their', 'start': 3.6, 'end': 3.8, 'prob': 0.9984},
    {'word': 'proper', 'start': 3.8, 'end': 4.22, 'prob': 0.9976},
    {'word': 'time.', 'start': 4.22, 'end': 4.72, 'prob': 0.999},
    {'word': 'Thanks', 'start': 5.04, 'end': 5.4, 'prob': 0.9662},
    {'word': 'for,', 'start': 5.4, 'end': 5.66, 'prob': 0.9941},
    {'word': 'watching.', 'start': 5.94, 'end': 6.36, 'prob': 0.7701}
]

What could be causing this?
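One possible explanation, offered as an assumption and not a confirmed diagnosis: concatenating clips plays them back to back, so only each clip's duration counts and the pauses between words are dropped. The pure-Python sketch below needs no MoviePy. It uses the first six words from the list above to compare the start times that back-to-back playback would produce with the start times Whisper reported. The helper names pair_words and chained_starts are hypothetical.

```python
# Illustration only: models back-to-back playback with plain arithmetic.
# pair_words and chained_starts are hypothetical helper names.

def pair_words(words):
    """Group words two at a time, keeping the first start and the last end."""
    pairs = []
    for i in range(0, len(words), 2):
        group = words[i:i + 2]
        pairs.append(
            {
                "text": " ".join(w["word"].strip() for w in group),
                "start": group[0]["start"],
                "end": group[-1]["end"],
            }
        )
    return pairs


def chained_starts(pairs):
    """Start times when clips play back to back (only durations matter)."""
    starts, t = [], 0.0
    for p in pairs:
        starts.append(round(t, 2))
        t += p["end"] - p["start"]
    return starts


words = [
    {"word": "This", "start": 0.0, "end": 0.22},
    {"word": "is", "start": 0.22, "end": 0.42},
    {"word": "a", "start": 0.42, "end": 0.6},
    {"word": "test,", "start": 0.6, "end": 1.04},
    {"word": "to", "start": 1.18, "end": 1.3},
    {"word": "show", "start": 1.3, "end": 1.54},
]

pairs = pair_words(words)
intended = [p["start"] for p in pairs]  # start times reported by Whisper
chained = chained_starts(pairs)         # start times after chaining

for p, want, got in zip(pairs, intended, chained):
    print(f"{p['text']!r}: intended {want}s, chained {got}s")
```

The third caption starts at 1.04s here when it should start at 1.18s, and every later pause adds to that lead. If this is indeed the cause, layering the caption clips directly, for example CompositeVideoClip([vid_clip] + [c.set_position(("center", 860)) for c in text_comp]), should let each clip keep the start and end set on it.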

ffmpeg silenceremove - hear what bits are removed

7 April 2020, by jimo

ffmpeg silenceremove is pretty cool. I'm loving it. I can trim 3-second silences to 2 seconds and cut a 1.5-hour file of spoken audio down by 3 or 4 minutes (depending on the speaker).

Once in a while I do hear that my choice of stop_threshold (i.e. -40dB on an audio-only analog file) causes the end of a word to be clipped, just here and there when the speaker trails off softly at the end of the word.

Is there any way to output what is trimmed to a file, so I can listen to it and get an idea of just how often this word clipping happens?

Thanks!
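One possible approach, offered as an assumption and not as a feature of silenceremove itself, is to run ffmpeg's separate silencedetect filter with a similar threshold, for example ffmpeg -i input.wav -af silencedetect=noise=-40dB:d=2 -f null -, and read the silence_start / silence_end lines it logs. The sketch below parses such a log into intervals. The function name parse_silences, the file name input.wav and the sample log values are hypothetical, and silencedetect may not mark exactly the same spans that silenceremove cuts.

```python
import re

# Patterns for the interval markers that silencedetect writes to its log.
START_RE = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
END_RE = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log_text):
    """Return (start, end) pairs, in seconds, found in silencedetect output."""
    intervals = []
    start = None
    for line in log_text.splitlines():
        match = START_RE.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = END_RE.search(line)
        if match and start is not None:
            intervals.append((start, float(match.group(1))))
            start = None
    return intervals


# Made-up sample log, shaped like silencedetect output.
sample_log = """\
[silencedetect @ 0x1] silence_start: 12.5
[silencedetect @ 0x1] silence_end: 15.75 | silence_duration: 3.25
[silencedetect @ 0x1] silence_start: 40
[silencedetect @ 0x1] silence_end: 42.5 | silence_duration: 2.5
"""

for begin, end in parse_silences(sample_log):
    print(f"silence from {begin}s to {end}s")
```

Each interval could then be cut from the original file for listening, for instance with ffmpeg -i input.wav -ss 12.5 -to 15.75 silence_1.wav, to check whether word endings fall inside the detected silences.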



Bug #2244: zip file association - long file name - private CSS

23 March 2012, by cedric -

#grml... is it that word-wrap:break-word is not being applied on your end, or is not being taken into account, or does nothing in this case? Or maybe a max-width is missing? Because, well, it worked at the time of the patch... :(


Media (1)

1 000 000 (wav version)
