
Media (91)
-
Head down (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
-
Echoplex (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
-
Discipline (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
-
Letting you (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
-
1 000 000 (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
-
999 999 (wav version)
26 September 2011
Updated: April 2013
Language: English
Type: Audio
Other articles (95)
-
Websites made with MediaSPIP
2 May 2011
This page lists some websites based on MediaSPIP.
-
Improvements to the base version
13 September 2013
Nicer multiple selection
The Chosen plugin improves the usability of multiple-select fields. See the two following images to compare.
To do this, simply enable the Chosen plugin (Site general configuration > Plugin management), then configure it (Templates > Chosen) by enabling Chosen on the public site and specifying which form elements to enhance, for example select[multiple] for multiple-select lists (...)
-
Emballe médias: what is it for?
4 February 2011
This plugin is designed to manage sites that publish documents of all types.
It creates "media", meaning: a "media" is an article in the SPIP sense, created automatically when a document is uploaded, whether audio, video, image or text; only one document can be linked to a so-called "media" article;
On other sites (15016)
-
How to Adjust Google TTS SSML to Match Original SRT Timing?
2 April, by Alexandre Silkin
I have an .srt file where each speech segment is supposed to last a specific duration (e.g., 4 seconds). However, when I generate the speech using Google Text-to-Speech (TTS) with SSML, the resulting audio plays the same segment in a shorter time (e.g., 3 seconds).


I want to adjust the speech rate dynamically in SSML so that each segment matches its original timing. My idea is to use ffmpeg to extract the actual duration of each generated speech segment, then calculate the speech rate percentage as:

speech rate = generated duration / original duration


This percentage would then be applied in SSML using the <prosody> tag, like:

<prosody rate="75%">Text to be spoken</prosody>


How can I accurately measure the duration of each segment using ffmpeg, and what is the best way to apply the correct speech rate in SSML to match the original .srt timing?
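
A minimal sketch of the measurement half, assuming each synthesized segment is saved to its own file and that ffprobe (bundled with ffmpeg) is on PATH; the file name and the 4-second target below are placeholders:

import subprocess

def media_duration_seconds(path):
    # Query only the container duration of the given file via ffprobe.
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True)
    return float(out.stdout.strip())

generated = media_duration_seconds("segment_001.mp3")  # hypothetical per-segment TTS output
original = 4.0                                          # target duration from the .srt, in seconds
rate_percent = round(generated / original * 100)        # e.g. 3.0 / 4.0 -> 75 (%)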


I tried break durations, and my SSML should look like this:


f.write(f'\t<p>{break_until_start}{text}<break time="{value["duration_ms"]}ms"></break></p>\n')  # the value[...] key was cut off; "duration_ms" is only a placeholder



Code writing the SSML:


text = value['text']
start_time_ms = int(value['start_ms'])  # Start time in milliseconds
previous_end_ms = int(subsDict.get(str(int(key) - 1), {}).get('end_ms', 0))  # Get the previous end time
gap_to_fill = max(0, start_time_ms - previous_end_ms)

# Escape XML special characters in the subtitle text
text = (text.replace("&", "&amp;").replace('"', "&quot;").replace("'", "&apos;")
        .replace("<", "&lt;").replace(">", "&gt;"))

break_until_start = f'<break time="{gap_to_fill}ms"></break>' if gap_to_fill > 0 else ''

# The key inside value[...] was cut off in the post; "duration_ms" below is only a placeholder.
f.write(f'\t<p>{break_until_start}{text}<break time="{value["duration_ms"]}ms"></break></p>\n')

f.write('\n')
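
To actually stretch each segment to its .srt duration, one option (an assumption, not the poster's code) is to wrap the escaped text in a prosody element whose rate is the generated/original percentage computed earlier:

# Hypothetical variant of the write line above; 75 is just the example value (3 s / 4 s).
rate_percent = 75
f.write(f'\t<p>{break_until_start}<prosody rate="{rate_percent}%">{text}</prosody></p>\n')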



-
lavu/tx: require output argument to match input for inplace transforms
26 February 2021, by Lynne
-
How to re-encode an audio file to match another one, to avoid re-encoding the whole audio
21 March 2024, by Bernard Wiesner
I have an audio editor in the browser using ffmpeg (WebAssembly), and I want to insert new audio into the existing audio without having to re-encode everything. Re-encoding everything takes a long time, especially in the browser, so I would like to re-encode only the inserted file, match it to the original one, and concatenate them using the copy command.

The ffmpeg concatenate docs say:




All files must have the same streams (same codecs, same time base, etc.)




But it is not clear what is meant by time base. So far I have observed that I need to match:


- codec
- bit rate
- sample rate
- channels (mono, stereo)

Is there anything else I need to match so that the resulting audio is not corrupt/broken when concatenating?
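
For reference, the stream parameters listed above can be read with ffprobe; a small sketch (the field names are standard ffprobe keys, the file name is the question's):

import json
import subprocess

def audio_stream_info(path):
    # Probe the first audio stream's codec, sample rate, channel count and bit rate.
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a:0",
         "-show_entries", "stream=codec_name,sample_rate,channels,bit_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True)
    return json.loads(out.stdout)["streams"][0]

print(audio_stream_info("original.mp3"))
# e.g. {'codec_name': 'mp3', 'sample_rate': '44100', 'channels': 2, 'bit_rate': '128000'}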


I have observed that mp3, for example, has VBR, CBR, and ABR. If the original audio has a bit rate of 128 kb/s, I assume it is CBR, so I match it with:


ffmpeg -i original.mp3
# > Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s

ffmpeg -i input.mp3 -b:a 128k -ar 44100 -ac 2 re_encoded.mp3

# then merge
# concat_list.txt contains the original audio and the re_encoded.mp3

# note: -safe 0 is an input option of the concat demuxer, so it belongs before -i
ffmpeg -f concat -safe 0 -i concat_list.txt -c copy merged.mp3
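
For reference, concat_list.txt for the concat demuxer is just a plain text file with one file directive per input, in playback order, e.g.:

file 'original.mp3'
file 're_encoded.mp3'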



And that works fine for CBR bitrates such as 8, 16, 24, 32, 40, 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, or 320 kb/s (docs), as far as I have tested.


The issue is when the original.mp3 has a VBR (variable bit rate) or ABR, such as 150 kb/s.


If I try to match it as below:


ffmpeg -i input.mp3 -b:a 150k -ar 44100 -ac 2 re_encoded.mp3
ffmpeg -i re_encoded.mp3
# Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 160 kb/s



The resulting bitrate is rounded to the nearest CBR value, which is 160.


I can solve this with mp3 by using -abr 1:

ffmpeg -i input.mp3 -abr 1 -b:a 150k -ar 44100 -ac 2 re_encoded.mp3
ffmpeg -i re_encoded.mp3
# Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 150 kb/s



Now the bitrate matches the original audio. However, I am not sure this is correct, since I am encoding the new audio as ABR and concatenating it with a VBR file. I am not even sure how to check with ffmpeg whether the audio is VBR, CBR or ABR, or if that even matters when concatenating.
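
One rough heuristic for checking (an assumption, not an authoritative test) is to dump the audio packet sizes with ffprobe: near-constant sizes suggest CBR, widely varying sizes suggest VBR or ABR.

import json
import subprocess

def looks_like_cbr(path):
    # Collect the distinct packet sizes of the first audio stream.
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a:0",
         "-show_packets", "-of", "json", path],
        capture_output=True, text=True, check=True)
    sizes = {int(p["size"]) for p in json.loads(out.stdout)["packets"]}
    # CBR mp3 frames can still alternate slightly in size because of padding, hence <= 2.
    return len(sizes) <= 2

print(looks_like_cbr("original.mp3"))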


Another issue happens with aac files: when I try to match the original audio's bitrate, I can't.


ffmpeg -i input.mp3 -b:a 128k -ar 44100 -ac 2 re_encoded.aac
ffmpeg -i re_encoded.aac
# Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 135 kb/s



The resulting bitrate always seems to be variable (135 in this case), and hence I can't match it to the original one.


So my question is: what conditions need to be met when concatenating audio files with different streams, and how can I re-encode only one audio file to match the other? If there is some package that can do this, that would also be a great help.
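
As a rough end-to-end sketch of the CBR case that already works above (not a definitive answer for VBR/ABR sources), the probe, match and concatenate flow could be scripted like this; the file names are the question's, and libmp3lame plus the probed fields are assumptions:

import json
import subprocess

# Probe the original's audio stream so the inserted clip can be encoded with matching parameters.
probe = subprocess.run(
    ["ffprobe", "-v", "error", "-select_streams", "a:0",
     "-show_entries", "stream=codec_name,sample_rate,channels,bit_rate",
     "-of", "json", "original.mp3"],
    capture_output=True, text=True, check=True)
stream = json.loads(probe.stdout)["streams"][0]

# Re-encode only the inserted clip to the original's bitrate, sample rate and channel count.
subprocess.run(
    ["ffmpeg", "-y", "-i", "input.mp3",
     "-c:a", "libmp3lame", "-b:a", f'{int(stream["bit_rate"]) // 1000}k',
     "-ar", stream["sample_rate"], "-ac", str(stream["channels"]),
     "re_encoded.mp3"],
    check=True)

# Concatenate with stream copy so the original audio is never re-encoded.
with open("concat_list.txt", "w") as f:
    f.write("file 'original.mp3'\nfile 're_encoded.mp3'\n")
subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", "concat_list.txt",
     "-c", "copy", "merged.mp3"],
    check=True)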