
FFMPEG, DrawText Issue in Live Stream
7 December 2022, by Kenneth

I am using the following command to create an H.264 stream with text data from a text file. The example data is fake. I am sending this to an RTSP server that then allows clients to connect, and I am connecting from VLC to view the stream.


See the update below: this only happens for the live stream. If I output to a file, the text looks correct.


OS: Windows 10


ffmpeg -f lavfi -re -i color=size=1280x720:rate=1:color=black ^
 -vf drawtext="fontsize=16:fontfile=C\\:/Windows/fonts/consola.ttf:fontcolor=white:textfile='livetext.txt':x=50:y=50: reload=1" ^
 -c:v libx264 -preset ultrafast -tune zerolatency -x264-params keyint=10:min-keyint=10 ^
 -f rtsp rtsp://127.0.0.1:60000/sorting



The issue I am having is that the text shown in the video seems to be limited to 10 rows. On a fresh restart, I get even fewer. I don't see anything in the documentation about a limit on the text length.


I have tried different -preset and -tune options. Nothing improves this issue.

Are there settings I should adjust to help with this?
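
One thing worth ruling out, offered as a sketch rather than a diagnosis: the FFmpeg drawtext documentation notes that when reload is enabled the text file should be updated atomically, otherwise it may be read partially. Assuming the program that produces livetext.txt is under your control, it could write the new contents to a temporary file and then swap that file into place. The helper name write_text_atomically is hypothetical.

```python
import os
import tempfile


def write_text_atomically(path: str, text: str) -> None:
    """Replace `path` with `text` so a reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the swap stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as tmp:
            tmp.write(text)
        os.replace(tmp_path, path)  # swaps the finished file into place in one step
    except BaseException:
        # Remove the temporary file if anything failed before the swap completed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On Windows the swap can raise PermissionError if another process has the target open at that instant, so a caller may want to retry after a short pause.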




Console output:


..\ffmpeg\ffmpeg -f lavfi -re -i color=size=1280x720:rate=5:color=black -vf drawtext="fontsize=20:fontfile=C\\:/Windows/fonts/consola.ttf:fontcolor=white:textfile='livetext.txt':x=50:y=50: reload=5" -c:v libx264 -preset ultrafast -tune zerolatency -x264-params keyint=10:min-keyint=10 -f rtsp rtsp://127.0.0.1:60000/sorting
ffmpeg version 2022-12-04-git-6c814093d8-essentials_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers
 built with gcc 12.1.0 (Rev2, Built by MSYS2 project)
 configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
 libavutil 57. 43.100 / 57. 43.100
 libavcodec 59. 54.100 / 59. 54.100
 libavformat 59. 34.102 / 59. 34.102
 libavdevice 59. 8.101 / 59. 8.101
 libavfilter 8. 51.100 / 8. 51.100
 libswscale 6. 8.112 / 6. 8.112
 libswresample 4. 9.100 / 4. 9.100
 libpostproc 56. 7.100 / 56. 7.100
Input #0, lavfi, from 'color=size=1280x720:rate=5:color=black':
 Duration: N/A, start: 0.000000, bitrate: N/A
 Stream #0:0: Video: wrapped_avframe, yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 5 fps, 5 tbr, 5 tbn
Stream mapping:
 Stream #0:0 -> #0:0 (wrapped_avframe (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 000002402dcca140] using SAR=1/1
[libx264 @ 000002402dcca140] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000002402dcca140] profile Constrained Baseline, level 3.1, 4:2:0, 8-bit
[libx264 @ 000002402dcca140] 264 - core 164 r3101 b093bbe - H.264/MPEG-4 AVC codec - Copyleft 2003-2022 - http://www.videolan.org/x264.html - options: cabac=0 ref=1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=11 lookahead_threads=11 sliced_threads=1 slices=11 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=0 keyint=10 keyint_min=6 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
Output #0, rtsp, to 'rtsp://127.0.0.1:60000/sorting':
 Metadata:
 encoder : Lavf59.34.102
 Stream #0:0: Video: h264, yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 5 fps, 90k tbn
 Metadata:
 encoder : Lavc59.54.100 libx264
 Side data:
 cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 485 fps=5.0 q=11.0 size=N/A time=00:01:36.80 bitrate=N/A speed= 1x 0x



Update 1:


If I output to a file, the text is shown correctly. If I remove the -re argument, which makes the output match the input rate, I get 150 fps, so processing power does not seem to be the issue.
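
Because the file output is correct, the difference is more likely in delivery than in drawtext itself. One experiment, stated as an assumption and not a confirmed fix: FFmpeg's RTSP muxer accepts `-rtsp_transport tcp`, which rules out UDP packet loss between FFmpeg and the server. The sketch below only assembles the argument list so the two transports can be compared; the function name build_stream_command is hypothetical, and the filter string and URL are copied from the command above.

```python
from typing import List, Optional

# Filter string as ffmpeg receives it from the original cmd.exe command line.
DRAWTEXT = (
    r"drawtext=fontsize=16:fontfile=C\\:/Windows/fonts/consola.ttf:"
    r"fontcolor=white:textfile='livetext.txt':x=50:y=50: reload=1"
)


def build_stream_command(url: str, transport: Optional[str] = None) -> List[str]:
    """Return the ffmpeg argument list; `transport` is e.g. "tcp", or None for the default."""
    cmd = [
        "ffmpeg", "-f", "lavfi", "-re",
        "-i", "color=size=1280x720:rate=1:color=black",
        "-vf", DRAWTEXT,
        "-c:v", "libx264", "-preset", "ultrafast", "-tune", "zerolatency",
        "-x264-params", "keyint=10:min-keyint=10",
        "-f", "rtsp",
    ]
    if transport is not None:
        # Output option: it has to come before the output URL it applies to.
        cmd += ["-rtsp_transport", transport]
    cmd.append(url)
    return cmd


# To run it (not executed here):
#   import subprocess
#   subprocess.run(build_stream_command("rtsp://127.0.0.1:60000/sorting", "tcp"), check=True)
```

If the text is complete over TCP, lost packets would be the likely cause; if not, the transport can be ruled out.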




-
How do I add the ExoPlayer FFmpeg extension to my Android project?
7 February 2023, by Malo

I have an Android application containing an ExoPlayer instance. Some UDP video streams play without sound, so I want to add the FFmpeg extension to my project. I am working on Windows and need to follow the instructions below:


https://github.com/google/ExoPlayer/blob/release-v2/extensions/ffmpeg/README.md



So the first step is to set the following shell variable:
cd ""
FFMPEG_MODULE_PATH="$(pwd)/extensions/ffmpeg/src/main"


I downloaded Git to use as a shell, so what is pwd?
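
For context, `pwd` is the shell command that prints the current working directory, and `$(pwd)` substitutes that path into the assignment. It is available in Git Bash, the shell installed by Git for Windows. As a rough illustration in Python (the function name is hypothetical), the assignment computes the following:

```python
from pathlib import Path


def ffmpeg_module_path(checkout: Path) -> Path:
    """Equivalent of FFMPEG_MODULE_PATH="$(pwd)/extensions/ffmpeg/src/main"
    when the shell's working directory is `checkout`."""
    return checkout / "extensions" / "ffmpeg" / "src" / "main"


# Path.cwd() plays the role of $(pwd): the directory the command is run from.
print(ffmpeg_module_path(Path.cwd()))
```

So the variable only has the intended value if the preceding `cd` has already moved the shell into the project checkout.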


Also:
Set the host platform (use "darwin-x86_64" for Mac OS X):
HOST_PLATFORM="linux-x86_64"

What should this variable be on Windows?
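
As far as I can tell, HOST_PLATFORM names the machine the build runs on and is used to pick the prebuilt toolchain directory inside the Android NDK. The NDK names those directories linux-x86_64, darwin-x86_64 and windows-x86_64, so the Windows value would presumably be windows-x86_64. Whether the rest of the build script runs on Windows is a separate question and is not verified here. A small sketch of the mapping, with hypothetical names:

```python
import platform

# NDK prebuilt toolchain directory names, keyed by platform.system() values.
NDK_HOST_TAGS = {
    "Linux": "linux-x86_64",
    "Darwin": "darwin-x86_64",
    "Windows": "windows-x86_64",
}


def ndk_host_tag(system: str = "") -> str:
    """Return the NDK host tag for `system`, defaulting to the current machine."""
    name = system or platform.system()
    try:
        return NDK_HOST_TAGS[name]
    except KeyError:
        raise ValueError(f"no known NDK host tag for {name!r}") from None
```

A quick way to confirm the value is to look at the folder names under toolchains/llvm/prebuilt in the installed NDK.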


Please help: I am confused about how to build this library manually on Windows, and it is not straightforward at all.


-
Computer crashing when using python tools in same script
5 February 2023, by SL1997

I am attempting to use the speech recognition toolkit VOSK and the speaker diarization package Resemblyzer to transcribe audio and then identify the speakers in the audio.


Tools:


https://github.com/alphacep/vosk-api

https://github.com/resemble-ai/Resemblyzer

I can do both things individually, but I run into issues when trying to do both in a single Python script.


I used the following guide when setting up the diarization system:




Computer specs are as follows:


Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz, 3912 MHz, 2 Core(s), 4 Logical Processor(s)

32GB RAM

The following is my code. I am not too sure whether threading is appropriate here, or whether I even implemented it correctly. How can I best optimize this code to achieve the results I am looking for without crashing?


from vosk import Model, KaldiRecognizer
from pydub import AudioSegment
import json
import sys
import os
import subprocess
import datetime
from resemblyzer import preprocess_wav, VoiceEncoder
from pathlib import Path
from resemblyzer.hparams import sampling_rate
from spectralcluster import SpectralClusterer
import threading
import queue
import gc



def recognition(queue, audio, FRAME_RATE):

    model = Model("Vosk_Models/vosk-model-small-en-us-0.15")

    rec = KaldiRecognizer(model, FRAME_RATE)
    rec.SetWords(True)

    rec.AcceptWaveform(audio.raw_data)
    result = rec.Result()

    transcript = json.loads(result)#["text"]

    #return transcript
    queue.put(transcript)



def diarization(queue, audio):

    wav = preprocess_wav(audio)
    encoder = VoiceEncoder("cpu")
    _, cont_embeds, wav_splits = encoder.embed_utterance(wav, return_partials=True, rate=16)
    print(cont_embeds.shape)

    clusterer = SpectralClusterer(
        min_clusters=2,
        max_clusters=100,
        p_percentile=0.90,
        gaussian_blur_sigma=1)

    labels = clusterer.predict(cont_embeds)

    def create_labelling(labels, wav_splits):

        times = [((s.start + s.stop) / 2) / sampling_rate for s in wav_splits]
        labelling = []
        start_time = 0

        for i, time in enumerate(times):
            if i > 0 and labels[i] != labels[i - 1]:
                temp = [str(labels[i - 1]), start_time, time]
                labelling.append(tuple(temp))
                start_time = time
            if i == len(times) - 1:
                temp = [str(labels[i]), start_time, time]
                labelling.append(tuple(temp))

        return labelling

    #return
    labelling = create_labelling(labels, wav_splits)
    queue.put(labelling)



def identify_speaker(queue1, queue2):

    transcript = queue1.get()
    labelling = queue2.get()

    for speaker in labelling:

        speakerID = speaker[0]
        speakerStart = speaker[1]
        speakerEnd = speaker[2]

        result = transcript['result']
        words = [r['word'] for r in result if speakerStart < r['start'] < speakerEnd]
        #return
        print("Speaker",speakerID,":",' '.join(words), "\n")





def main():

    queue1 = queue.Queue()
    queue2 = queue.Queue()

    FRAME_RATE = 16000
    CHANNELS = 1

    podcast = AudioSegment.from_mp3("Podcast_Audio/Film-Release-Clip.mp3")
    podcast = podcast.set_channels(CHANNELS)
    podcast = podcast.set_frame_rate(FRAME_RATE)

    first_thread = threading.Thread(target=recognition, args=(queue1, podcast, FRAME_RATE))
    second_thread = threading.Thread(target=diarization, args=(queue2, podcast))
    third_thread = threading.Thread(target=identify_speaker, args=(queue1, queue2))

    first_thread.start()
    first_thread.join()
    gc.collect()

    second_thread.start()
    second_thread.join()
    gc.collect()

    third_thread.start()
    third_thread.join()
    gc.collect()

    # transcript = recognition(podcast,FRAME_RATE)
    #
    # labelling = diarization(podcast)
    #
    # print(identify_speaker(transcript, labelling))


if __name__ == '__main__':
    main()
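
A side note on identify_speaker: the comparison `speakerStart < r['start'] < speakerEnd` excludes any word that starts exactly on a segment boundary, including one at 0.0 seconds. Below is a minimal sketch of the same grouping as a pure function with half-open intervals. The function name and the sample data are made up for illustration; the input shapes mirror what the code above puts on the two queues.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, float, float]  # (speaker label, start seconds, end seconds)


def words_by_segment(words: List[Dict], segments: List[Segment]) -> List[Tuple[str, str]]:
    """Pair each speaker segment with the words whose start time falls inside it."""
    grouped = []
    for speaker, start, end in segments:
        # Half-open interval: a word starting exactly at `start` belongs to this segment.
        spoken = [w["word"] for w in words if start <= w["start"] < end]
        grouped.append((speaker, " ".join(spoken)))
    return grouped


# Illustrative data only, shaped like Vosk's "result" list and the labelling tuples.
sample_words = [
    {"word": "hello", "start": 0.0, "end": 0.4},
    {"word": "there", "start": 0.5, "end": 0.9},
    {"word": "hi", "start": 1.2, "end": 1.4},
]
sample_segments = [("0", 0.0, 1.0), ("1", 1.0, 2.0)]
print(words_by_segment(sample_words, sample_segments))
```

Returning the pairs instead of printing them also makes the step easy to check without loading either model.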



When I say crash, I mean everything freezes: I have to hold down the power button on the desktop and turn it back on again. There is no blue or blank screen; the machine just freezes while I am in my IDE looking at my code. Any help in resolving this issue would be greatly appreciated.
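
One possible contributor, offered as a guess rather than a diagnosis: `rec.AcceptWaveform(audio.raw_data)` hands the entire recording to the recognizer in a single call. Vosk's own Python examples feed audio in small blocks, collecting `rec.Result()` whenever `AcceptWaveform` returns true and `rec.FinalResult()` at the end. The helper below (hypothetical name, standard library only) shows the chunking part; the Vosk calls appear only in comments because they are not run here.

```python
from typing import Iterator


def iter_pcm_chunks(raw: bytes, frames_per_chunk: int = 4000, sample_width: int = 2) -> Iterator[bytes]:
    """Yield consecutive blocks of mono PCM, each holding at most `frames_per_chunk` frames."""
    step = frames_per_chunk * sample_width  # bytes per block, so blocks end on frame boundaries
    for offset in range(0, len(raw), step):
        yield raw[offset:offset + step]


# Intended use with the objects from the question (not executed here):
#   results = []
#   for chunk in iter_pcm_chunks(podcast.raw_data):
#       if rec.AcceptWaveform(chunk):
#           results.append(json.loads(rec.Result()))
#   results.append(json.loads(rec.FinalResult()))
```

Each entry in `results` would then carry its own list of words, so the lists need to be concatenated before matching them to speakers. Separately, because each thread in main() is joined before the next one starts, the three stages already run one after another; calling the functions directly would behave the same and make a failure easier to trace.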