Media (1)

Keyword: - Tags -/blender

Other articles (103)

MediaSPIP 0.1 Beta version

25 April 2011, by kent1

MediaSPIP 0.1 beta is the first version of MediaSPIP proclaimed as "usable".
The zip file provided here only contains the sources of MediaSPIP in its standalone version.
To get a working installation, you must manually install all software dependencies on the server.
If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...)

Multilang: improve the interface for multilingual blocks

18 February 2011, by kent1

Multilang is an additional plugin that is not enabled by default when MediaSPIP is initialized.
Once it is activated, MediaSPIP init automatically applies a preconfiguration so that the new feature works immediately. No separate configuration step is therefore required.

Customize by adding your logo, banner, or background image

5 September 2013, by kent1

Some themes support three customization elements: adding a logo; adding a banner; adding a background image.


On other sites (12217)

ffmpeg to count words in audio text

17 July 2020, by Joel Parker

I am new to signal processing but wanted to take an audio file and determine how many words are spoken in one minute. I was thinking I could use the top of the loudness peaks to count the words but do not quite understand how to achieve this.

First I used ffmpeg to extract the audio from the mp4 file I am using:

ffmpeg -i courtcase.mp4 audiofile.mp4

Then I tried to detect the loudness:

ffmpeg -t 10 -i audiofile.mp4 -af "volumedetect" -f null /dev/null

This produced some statistical information:

video:157kB audio:1723kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] n_samples: 882000
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] mean_volume: -20.6 dB
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] max_volume: -4.0 dB
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] histogram_4db: 64
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] histogram_5db: 88
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] histogram_6db: 220
[Parsed_volumedetect_0 @ 0x7fa6b26068c0] histogram_7db: 843

I am not sure why it still shows 157kB of video; maybe my first command is wrong?
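
One plausible explanation, offered as a sketch rather than a confirmed diagnosis: the first command names no stream-selection option, so ffmpeg also writes a video stream into audiofile.mp4. The `-vn` option tells ffmpeg to leave video out of the output. The helper below only builds the argument list and does not run ffmpeg; the function name and the output name `audio_only.m4a` are hypothetical.

```python
import shlex
from typing import List


def build_extract_audio_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg argument list that writes no video stream to dst."""
    return [
        "ffmpeg",
        "-i", src,  # input file, as in the original command
        "-vn",      # disable video in the output
        dst,        # hypothetical output name chosen by the caller
    ]


cmd = build_extract_audio_cmd("courtcase.mp4", "audio_only.m4a")
print(shlex.join(cmd))
```

If the summary line of a later run still reports a non-zero `video:` size, the cause lies elsewhere and this sketch does not apply.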

Anyway, assuming the file is just audio, I found this command, which I believe shows dB slices for 10 seconds:

ffmpeg -i audiofile.mp4 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level:file=- -f null -

and it produced a bunch of output:

video:5782kB audio:63504kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_astats_0 @ 0x7ff74c004bc0] Channel: 1
[Parsed_astats_0 @ 0x7ff74c004bc0] DC offset: 0.000240
[Parsed_astats_0 @ 0x7ff74c004bc0] Min level: -0.166239
[Parsed_astats_0 @ 0x7ff74c004bc0] Max level: 0.127112
[Parsed_astats_0 @ 0x7ff74c004bc0] Min difference: 0.000003
[Parsed_astats_0 @ 0x7ff74c004bc0] Max difference: 0.025335
[Parsed_astats_0 @ 0x7ff74c004bc0] Mean difference: 0.004455
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS difference: 0.006165
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak level dB: -15.585332
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS level dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS peak dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS trough dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] Crest factor: 3.414311
[Parsed_astats_0 @ 0x7ff74c004bc0] Flat factor: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak count: 2
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor dB: nan
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor count: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Bit depth: 32/32
[Parsed_astats_0 @ 0x7ff74c004bc0] Dynamic range: 72.297593
[Parsed_astats_0 @ 0x7ff74c004bc0] Zero crossings: 74
[Parsed_astats_0 @ 0x7ff74c004bc0] Zero crossings rate: 0.072266
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of NaNs: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of Infs: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of denormals: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Channel: 2
[Parsed_astats_0 @ 0x7ff74c004bc0] DC offset: 0.000240
[Parsed_astats_0 @ 0x7ff74c004bc0] Min level: -0.166239
[Parsed_astats_0 @ 0x7ff74c004bc0] Max level: 0.127112
[Parsed_astats_0 @ 0x7ff74c004bc0] Min difference: 0.000003
[Parsed_astats_0 @ 0x7ff74c004bc0] Max difference: 0.025335
[Parsed_astats_0 @ 0x7ff74c004bc0] Mean difference: 0.004455
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS difference: 0.006165
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak level dB: -15.585332
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS level dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS peak dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS trough dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] Crest factor: 3.414311
[Parsed_astats_0 @ 0x7ff74c004bc0] Flat factor: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak count: 2
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor dB: nan
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor count: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Bit depth: 32/32
[Parsed_astats_0 @ 0x7ff74c004bc0] Dynamic range: 72.297593
[Parsed_astats_0 @ 0x7ff74c004bc0] Zero crossings: 74
[Parsed_astats_0 @ 0x7ff74c004bc0] Zero crossings rate: 0.072266
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of NaNs: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of Infs: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of denormals: 0
[Parsed_astats_0 @ 0x7ff74c004bc0] Overall
[Parsed_astats_0 @ 0x7ff74c004bc0] DC offset: 0.000240
[Parsed_astats_0 @ 0x7ff74c004bc0] Min level: -0.166239
[Parsed_astats_0 @ 0x7ff74c004bc0] Max level: 0.127112
[Parsed_astats_0 @ 0x7ff74c004bc0] Min difference: 0.000003
[Parsed_astats_0 @ 0x7ff74c004bc0] Max difference: 0.025335
[Parsed_astats_0 @ 0x7ff74c004bc0] Mean difference: 0.004455
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS difference: 0.006165
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak level dB: -15.585332
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS level dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS peak dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] RMS trough dB: -26.251394
[Parsed_astats_0 @ 0x7ff74c004bc0] Flat factor: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Peak count: 2.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor dB: nan
[Parsed_astats_0 @ 0x7ff74c004bc0] Noise floor count: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Bit depth: 32/32
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of samples: 1024
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of NaNs: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of Infs: 0.000000
[Parsed_astats_0 @ 0x7ff74c004bc0] Number of denormals: 0.000000
ts_time:368.268
lavfi.astats.Overall.RMS_level=-29.670653
frame:15861 pts:16241664 pts_time:368.292
lavfi.astats.Overall.RMS_level=-30.851195
frame:15862 pts:16242688 pts_time:368.315
lavfi.astats.Overall.RMS_level=-30.700943
frame:15863 pts:16243712 pts_time:368.338
lavfi.astats.Overall.RMS_level=-33.638604
frame:15864 pts:16244736 pts_time:368.361
lavfi.astats.Overall.RMS_level=-21.873170
frame:15865 pts:16245760 pts_time:368.385
lavfi.astats.Overall.RMS_level=-20.001936
frame:15866 pts:16246784 pts_time:368.408
lavfi.astats.Overall.RMS_level=-18.571318
frame:15867 pts:16247808 pts_time:368.431
lavfi.astats.Overall.RMS_level=-18.470749
frame:15868 pts:16248832 pts_time:368.454
lavfi.astats.Overall.RMS_level=-19.506688
frame:15869 pts:16249856 pts_time:368.477
lavfi.astats.Overall.RMS_level=-21.270579
frame:15870 pts:16250880 pts_time:368.501
lavfi.astats.Overall.RMS_level=-25.007862
frame:15871 pts:16251904 pts_time:368.524
lavfi.astats.Overall.RMS_level=-25.654372
frame:15872 pts:16252928 pts_time:368.547
lavfi.astats.Overall.RMS_level=-24.948357
frame:15873 pts:16253952 pts_time:368.57
lavfi.astats.Overall.RMS_level=-30.523540
frame:15874 pts:16254976 pts_time:368.594
....

This is where I'm stuck. I think I have the information I need to determine the number of words spoken in a minute, but I don't know how to put it all together. Also, the last command just measures 10s slices; would I need to change that to 60s? Does anyone know how to do this, or is there a better approach?
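
One way to put the pieces together, as a rough sketch only: parse the `pts_time` / `RMS_level` pairs printed by `ametadata` and count how often the level rises from quiet to loud. Judging by the output above (1024 samples per frame, `pts_time` steps of about 0.023 s), `reset=1` appears to yield one RMS value per audio frame rather than per 10 s slice, so the window would not need changing. The function names and both thresholds (-22 dB on, -28 dB off) are assumptions picked for illustration, and a loudness burst is closer to a syllable or phrase than to a word, so the result is at best a crude proxy.

```python
import re
from typing import List, Tuple

FRAME_RE = re.compile(r"pts_time:([0-9.]+)")
RMS_RE = re.compile(r"lavfi\.astats\.Overall\.RMS_level=(-?inf|nan|-?[0-9.]+)")


def parse_rms(text: str) -> List[Tuple[float, float]]:
    """Return (time_seconds, rms_db) pairs from ametadata=print output."""
    samples: List[Tuple[float, float]] = []
    current_time = None
    for line in text.splitlines():
        frame = FRAME_RE.search(line)
        if frame:
            current_time = float(frame.group(1))
            continue
        rms = RMS_RE.search(line)
        if rms and current_time is not None:
            samples.append((current_time, float(rms.group(1))))
            current_time = None
    return samples


def count_bursts(samples, on_db: float = -22.0, off_db: float = -28.0) -> int:
    """Count quiet-to-loud transitions, with hysteresis to avoid double counts."""
    bursts = 0
    loud = False
    for _, level in samples:
        if not loud and level >= on_db:
            loud = True
            bursts += 1
        elif loud and level <= off_db:
            loud = False
    return bursts


def bursts_per_minute(samples, on_db: float = -22.0, off_db: float = -28.0) -> float:
    """Scale the burst count to a per-minute rate over the covered time span."""
    if len(samples) < 2:
        return 0.0
    duration = samples[-1][0] - samples[0][0]
    if duration <= 0:
        return 0.0
    return count_bursts(samples, on_db, off_db) * 60.0 / duration


# Excerpt copied from the output shown in the question.
EXCERPT = """\
frame:15861 pts:16241664 pts_time:368.292
lavfi.astats.Overall.RMS_level=-30.851195
frame:15862 pts:16242688 pts_time:368.315
lavfi.astats.Overall.RMS_level=-30.700943
frame:15863 pts:16243712 pts_time:368.338
lavfi.astats.Overall.RMS_level=-33.638604
frame:15864 pts:16244736 pts_time:368.361
lavfi.astats.Overall.RMS_level=-21.873170
frame:15865 pts:16245760 pts_time:368.385
lavfi.astats.Overall.RMS_level=-20.001936
"""

samples = parse_rms(EXCERPT)
print(len(samples), count_bursts(samples))  # prints "5 1"
```

For actual word counts, a speech-to-text tool would likely be more reliable than loudness alone; the thresholds here would need tuning per recording.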

How to view the stream generated from mkvserver via ffplay?

18 June 2019, by Chaitanya Bhardwaj

I want to live stream from one source (ffmpeg) to multiple clients, for which I’m using mkvserver.
I’m able to live stream a webcam from ffmpeg (client) to mkvserver (server) as follows:

On the server:
```
nc -l  | ./server
```
On the client:
```
ffmpeg -f avfoundation -framerate 30 -i 0 -b 900k -f matroska -r 20 tcp://:
```
To view the generated stream on the server, I used ffplay as follows:
```
ffplay tcp://:<port>
```
but I got a "Connection timed out" error. Please suggest a way to view the generated stream on the server via ffplay. Thanks!
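
A timeout generally means the TCP connection itself was never established, so a first diagnostic step could be to check whether the address given to ffplay accepts connections at all. The sketch below is a generic reachability check and says nothing specific about mkvserver; which host and port it serves clients on is not stated in the question. The function name `can_connect` is hypothetical, and the demo uses a throwaway local listener so that it runs anywhere.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Demo against a temporary local listener on an OS-chosen free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(can_connect("127.0.0.1", demo_port))  # prints "True"
listener.close()
```

Avoid pointing this at the port that `nc -l` listens on while ffmpeg is about to send: a plain `nc -l` typically accepts one connection, so the probe could take the slot intended for the sender.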

Remuxing audio and videos (screen and presenter) captured at the same time does not synchronize

22 September 2014, by user28163

I am trying to merge a screencast with a video (without sound) and a sound stream that was captured separately, using ffmpeg from a bash command. All the stream captures were started at the same time, and all ffmpeg processes were killed at the same time (pkill). But when I remux them together, the screencast and the video do not match, and thus the sound does not synchronize either.

Where did I go wrong? Any input from the ffmpeg experts here is appreciated. Thanks in advance.

Please find the ffmpeg output as follows:
1. The ffmpeg log of muxing the two videos (http://pastebin.com/XwnDSf5i)
2. The ffmpeg log of remuxing the sound with the side-by-side video from 1 above (cannot paste, as the pastebin limit was exceeded :( ).

UPDATE:

After checking the length of the screencast, I figured out that the screen capture (though started and stopped at the same time as the video and sound by a bash script) is shorter by 1m54s than the video and audio (the former 34:22 vs. the latter 36:16). The video was captured as H.264 in an MP4 wrapper at -r 30. So was the screen capture, but lossless:

%ffmpeg -report -f x11grab -r 30 -s 1920x1080 -i :0.0 -qscale 0 -vcodec libx264 -threads 4 screen.m4v

Could that be the reason for the delay? Is there any way to extend the screencast against the videos? Thanks!
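
The arithmetic behind the gap can be worked through as follows. This is a sketch under a strong assumption: that the screen capture lost frames evenly over the whole recording, so a single uniform stretch would realign it. If frames were dropped in bursts, a uniform stretch would still drift. The helper name is hypothetical; `setpts` is ffmpeg's video timestamp filter, and the factor is simply the ratio of the two durations quoted above.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' duration string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


screen = to_seconds("34:22")  # screencast length from the question
camera = to_seconds("36:16")  # video and audio length from the question

gap = camera - screen     # 114 s, i.e. the 1m54s reported
factor = camera / screen  # how much the screencast would need stretching

# Filter string that could be passed to ffmpeg via -vf (assumes uniform drift).
video_filter = f"setpts={factor:.6f}*PTS"
print(gap, video_filter)
```

Applying the filter means re-encoding the screencast, so it is worth checking on a short excerpt first whether the stretched video actually lines up with the audio.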

