
Google cloud speech to text not giving output for OGG & MP3 files
27 April 2021, by Vedant Jumle
I am trying to perform speech-to-text on a bunch of audio files that are each over 10 minutes long. I don't want to waste storage in the cloud bucket by uploading WAV files directly, so I am using ffmpeg to convert the files to either ogg or mp3, like this:
ffmpeg -y -i audio.wav -ar 12000 -r 16000 audio.mp3


ffmpeg -y -i audio.wav -ar 12000 -r 16000 audio.ogg


For testing purposes I ran the speech-to-text service on a dummy wav file and it seemed to work: I got the text as expected. But for some reason it isn't detecting any speech when I use the ogg or mp3 file. I could not get amr files to work either.


My code :


from google.cloud import speech


def transcribe_gcs(gcs_uri):
    """Transcribe the audio file at the given Cloud Storage URI."""
    client = speech.SpeechClient()

    audio = speech.RecognitionAudio(uri=gcs_uri)
    config = speech.RecognitionConfig(
        # Use "LINEAR16" for wav, "OGG_OPUS" for ogg, "AMR" for amr.
        encoding="OGG_OPUS",
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    print("starting operation")
    # Asynchronous request, suited to long audio files.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result()
    print(response)



I have set up the authentication properly, so that is not a problem.


When I run the speech-to-text service on the same audio in ogg or mp3 format (for mp3 I just comment out the encoding setting in the config), it gives no response: it just prints a line break and finishes.


What can I do to fix this?
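
One possible cause, offered as a guess rather than a confirmed diagnosis: in the ffmpeg commands above, -ar 12000 sets the audio sample rate to 12 kHz while the recognition config declares 16000, -r is a video frame-rate option that does not affect the audio rate, and for an .ogg output ffmpeg typically picks Vorbis rather than Opus unless the codec is named explicitly. The sketch below builds a conversion command that keeps the file and the config in agreement. The helper name build_opus_command is hypothetical, and the mono downmix (-ac 1) is an assumption about what the service handles most simply.

```python
# Sample rates that the Opus codec (and the OGG_OPUS encoding setting) supports.
OPUS_SAMPLE_RATES = {8000, 12000, 16000, 24000, 48000}


def build_opus_command(src, dst, sample_rate_hertz=16000):
    """Return an ffmpeg argument list that writes Ogg/Opus audio at a known rate.

    Hypothetical helper: pass the same sample_rate_hertz to RecognitionConfig
    so the uploaded file and the request describe the same audio.
    """
    if sample_rate_hertz not in OPUS_SAMPLE_RATES:
        raise ValueError(f"Opus does not support {sample_rate_hertz} Hz")
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ac", "1",                      # assumption: downmix to mono
        "-ar", str(sample_rate_hertz),   # audio sample rate (not -r, which is a frame rate)
        "-c:a", "libopus",               # name the codec instead of relying on the .ogg default
        dst,
    ]


cmd = build_opus_command("audio.wav", "audio.ogg")
print(" ".join(cmd))
```

The printed command can be run as-is in a shell, or passed to subprocess.run(cmd, check=True) if ffmpeg is on the PATH.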


-
avformat/av1: add color config values to AV1SequenceParameters
30 July 2019, by James Almer
-
FFmpeg: What re-encoding settings can be used to achieve results similar to Google Drive's video processing?
4 August 2023, by Mycroft_47
Context:


I have a large collection of videos recorded with my phone's camera, and it is taking up a significant amount of space. Recently, I noticed that when I uploaded a video to Google Drive and then downloaded it again using IDM (by clicking the pop-up that IDM displays when it detects something that can be downloaded), the downloaded video retained the same visual quality but occupied much less space. Upon further research, I discovered that Google re-encodes uploaded videos using H.264 video encoding, and I believe I can achieve similar compression using FFmpeg.


Problem:


Despite experimenting with various FFmpeg commands, I haven't been able to replicate Google Drive's compression. Every attempt using the -codec:v libx264 option alone resulted in videos larger than the original files.

While raising the -crf value and choosing a faster -preset did yield smaller files, it came at the cost of a noticeable degradation in visual quality and some visible artifacts in the video.

Google Drive's processing, on the other hand, strikes a good balance, achieving a satisfactory file size without compromising visual clarity. (Upon zooming in on this video, I did observe some minor blurring, but it was acceptable to me.)


Note:


I'm aware that using the H.265 video encoder instead of H.264 may give better results. However, to keep the comparison fair, I think the best approach is to first find the best command using the H.264 encoder. Once identified, I can then replace -codec:v libx264 with -codec:v libx265. This way the chosen command is really the best that FFmpeg can achieve, rather than merely benefiting from the superior performance of H.265 from the outset.

Here's the FFmpeg command I am currently using:


ffmpeg -hide_banner -loglevel verbose ^
 -i input.mp4 ^
 -codec:v libx264 ^
 -crf 36 -preset ultrafast ^
 -codec:a libopus -b:a 112k ^
 -movflags use_metadata_tags+faststart -map_metadata 0 ^
 output.mp4








Video file                     Size (bytes)   Bit rate (bps)   Encoder        FFprobe JSON
Original (named 'raw 1.mp4')   31,666,777     10,314,710       !!!            link
Without crf                    36,251,852     11,805,216       Lavf60.3.100   link
With crf                       10,179,113     3,314,772        Lavf60.3.100   link
Gdrive                         6,726,189      2,190,342        Google         link









Those files can be found here.
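
To put the table in relative terms, here is a small sketch that turns the listed file sizes into a size reduction relative to the original. The sizes are copied from the table; the function name and the sign convention (a negative value means the file grew) are choices made for this illustration.

```python
# File sizes in bytes, copied from the table above.
SIZES_BYTES = {
    "Original": 31_666_777,
    "Without crf": 36_251_852,
    "With crf": 10_179_113,
    "Gdrive": 6_726_189,
}


def size_reduction_percent(original_bytes, encoded_bytes):
    """Percentage by which the encoded file is smaller than the original.

    A negative result means the encoded file is larger than the original.
    """
    return (1 - encoded_bytes / original_bytes) * 100


reductions = {
    name: size_reduction_percent(SIZES_BYTES["Original"], size)
    for name, size in SIZES_BYTES.items()
    if name != "Original"
}

for name, percent in reductions.items():
    print(f"{name}: {percent:+.1f}%")
```

On these numbers the Google Drive file is roughly a fifth of the original size, the -crf 36 encode roughly a third, and the encode without -crf is larger than the source.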


Update:


I continued my experiments with the video "raw_1.mp4" and found some interesting results that resemble those shown in this blog post (I recommend consulting this answer).


In the following figure, I observed that setting -preset to veryfast gave the best results, striking the best balance between compression ratio and compression time. (Note that a negative percentage in the compression variable indicates an increase in file size after processing.)


In this figure, I used the H.264 encoder and compared the compression ratio of the output files produced with seven different values of the -crf parameter (CRF values used: 25, 27, 29, 31, 33, 35, 37):


For this figure, I switched the encoder to H.265 while keeping the same CRF values used in the previous figure:



Based on these results, -preset veryfast and a -crf value of 31 are my current preferred settings for FFmpeg, until they are proven to be suboptimal.
As a result, the FFmpeg command I'll use is as follows:

ffmpeg -hide_banner -loglevel verbose ^
 -i input.mp4 ^
 -codec:v libx264 ^
 -crf 31 -preset veryfast ^
 -codec:a libopus -b:a 112k ^
 -movflags use_metadata_tags+faststart -map_metadata 0 ^
 output.mp4



Note that these choices are based solely on the compression results obtained so far, and they do not take into account the visual quality of the outputted files.
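
For anyone wanting to repeat the CRF comparison described above, the following is a minimal sketch of how such a sweep could be scripted. It is not the script used to produce the figures: the function names, the output file naming, and the choice to drop the metadata flags are all assumptions made for illustration, and run_sweep requires ffmpeg on the PATH.

```python
import subprocess
import time
from pathlib import Path

# CRF values listed in the question.
CRF_VALUES = [25, 27, 29, 31, 33, 35, 37]


def build_command(src, dst, crf, preset="veryfast", codec="libx264"):
    """Return the ffmpeg argument list for one encode (hypothetical helper)."""
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-i", src,
        "-codec:v", codec,
        "-crf", str(crf), "-preset", preset,
        "-codec:a", "libopus", "-b:a", "112k",
        dst,
    ]


def run_sweep(src, out_dir, crf_values=CRF_VALUES, **encode_options):
    """Encode src once per CRF value.

    Returns a list of (crf, size reduction in percent, seconds) tuples.
    A negative reduction means the output is larger than the input.
    """
    original_bytes = Path(src).stat().st_size
    results = []
    for crf in crf_values:
        dst = str(Path(out_dir) / f"crf_{crf}.mp4")
        start = time.perf_counter()
        subprocess.run(build_command(src, dst, crf, **encode_options), check=True)
        elapsed = time.perf_counter() - start
        encoded_bytes = Path(dst).stat().st_size
        results.append((crf, (1 - encoded_bytes / original_bytes) * 100, elapsed))
    return results
```

Calling run_sweep("input.mp4", "out") would reproduce the H.264 series, and run_sweep("input.mp4", "out", codec="libx265") the H.265 series. Since the sweep only measures size and time, visual quality still has to be judged separately.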