
Recherche avancée
Médias (91)
-
Valkaama DVD Cover Outside
4 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Image
-
Valkaama DVD Label
4 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Image
-
Valkaama DVD Cover Inside
4 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Image
-
1,000,000
27 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
-
Demon Seed
26 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
-
The Four of Us are Dying
26 septembre 2011, par
Mis à jour : Septembre 2011
Langue : English
Type : Audio
Autres articles (63)
-
MediaSPIP v0.2
21 juin 2013, parMediaSPIP 0.2 est la première version de MediaSPIP stable.
Sa date de sortie officielle est le 21 juin 2013 et est annoncée ici.
Le fichier zip ici présent contient uniquement les sources de MediaSPIP en version standalone.
Comme pour la version précédente, il est nécessaire d’installer manuellement l’ensemble des dépendances logicielles sur le serveur.
Si vous souhaitez utiliser cette archive pour une installation en mode ferme, il vous faudra également procéder à d’autres modifications (...) -
MediaSPIP version 0.1 Beta
16 avril 2011, parMediaSPIP 0.1 beta est la première version de MediaSPIP décrétée comme "utilisable".
Le fichier zip ici présent contient uniquement les sources de MediaSPIP en version standalone.
Pour avoir une installation fonctionnelle, il est nécessaire d’installer manuellement l’ensemble des dépendances logicielles sur le serveur.
Si vous souhaitez utiliser cette archive pour une installation en mode ferme, il vous faudra également procéder à d’autres modifications (...) -
Personnaliser les catégories
21 juin 2013, parFormulaire de création d’une catégorie
Pour ceux qui connaissent bien SPIP, une catégorie peut être assimilée à une rubrique.
Dans le cas d’un document de type catégorie, les champs proposés par défaut sont : Texte
On peut modifier ce formulaire dans la partie :
Administration > Configuration des masques de formulaire.
Dans le cas d’un document de type média, les champs non affichés par défaut sont : Descriptif rapide
Par ailleurs, c’est dans cette partie configuration qu’on peut indiquer le (...)
Sur d’autres sites (8546)
-
Batch splitting large audio files into small fixed-length audio files in moments of silence
26 juillet 2023, par Haldjärvito train the SO-VITS-SVC neural network, we need 10-14 second voice files. As a material, let's say I use phrases from some game. I have already made a batch script for decoding different files into one working format, another batch script for removing silence, as well as a batch script for combining small audio files into files of 13-14 seconds (I used Python, pydub and FFmpeg). To successfully automatically create a training dataset, it remains only to make one batch script - Cutting audio files lasting more than 14 seconds into separate files lasting 10-14 seconds, cutting in places of silence or close to silence is highly preferable.


So, it is necessary to batch cut large audio files (20 seconds, 70 seconds, possibly several hundred seconds) into segments of approximately 10-14 seconds, however, the main task is to look for the quietest place in the cut areas so as not to cut phrases in the middle of a word (this is not very good for model training). So, is it really possible to do this in a very optimal way, so that the processing of a 30-second file does not take 15 seconds, but is fast ? Quiet zone detection is required only in the area of cuts, that is, 10-14 seconds, if counted from the very beginning of the file.


I would be very grateful for any help.


I tried to write a script together with ChatGPT, but all options gave completely unpredictable results and were not even close to what I needed... I had to stop at the option with a sharp cut of files for exactly 14000 milliseconds. However, I hope there is a chance to make a variant with cutting exactly in quiet areas.


import os
from pydub import AudioSegment

input_directory = ".../RemSilence/"
output_directory = ".../Split/"
max_duration = 14000

def split_audio_by_duration(input_file, duration):
 audio = AudioSegment.from_file(input_file)
 segments = []
 for i in range(0, len(audio), duration):
 segment = audio[i:i + duration]
 segments.append(segment)
 return segments

if __name__ == "__main__":
 os.makedirs(output_directory, exist_ok=True)
 audio_files = [os.path.join(input_directory, file) for file in os.listdir(input_directory) if file.endswith(".wav")]
 audio_files.sort(key=lambda file: len(AudioSegment.from_file(file)))
 for file in audio_files:
 audio = AudioSegment.from_file(file)
 if len(audio) > max_duration:
 segments = split_audio_by_duration(file, max_duration)
 for i, segment in enumerate(segments):
 output_filename = f"output_{len(os.listdir(output_directory))+1}.wav"
 output_file_path = os.path.join(output_directory, output_filename)
 segment.export(output_file_path, format="wav")
 else:
 output_filename = f"output_{len(os.listdir(output_directory))+1}.wav"
 output_file_path = os.path.join(output_directory, output_filename)
 audio.export(output_file_path, format="wav")



-
How to set pts, dts and duration in ffmpeg library ?
24 mars, par hsleeI want to pack some compressed video packets(h.264) to ".mp4" container.
One word, Muxing, no decoding and no encoding.
And I have no idea how to set pts, dts and duration.



- 

- I get the packets with "pcap" library.
- I removed headers before compressed video data show up. e.g. Ethernet, VLAN.
- I collected data until one frame and decoded it for getting information of data. e.g. width, height. (I am not sure that it is necessary)
- I initialized output context, stream and codec context.
- I started to receive packets with "pcap" library again. (now for muxing)
- I made one frame and put that data in AVPacket structure.
- I try to set PTS, DTS and duration. (I think here is wrong part, not sure though)

















*7-1. At the first frame, I saved time(msec) with packet header structure.



*7-2. whenever I made one frame, I set parameters like this : PTS(current time - start time), DTS(same PTS value), duration(current PTS - before PTS)



I think it has some error because :



- 

-
I don't know how far is suitable long for dts from pts.
-
At least, I think duration means how long time show this frame from now to next frame, so It should have value(next PTS - current PTS), but I can not know the value next PTS at that time.







It has I-frame only.



// make input context for decoding

AVFormatContext *&ic = gInputContext;

ic = avformat_alloc_context();

AVCodec *cd = avcodec_find_decoder(AV_CODEC_ID_H264);

AVStream *st = avformat_new_stream(ic, cd);

AVCodecContext *cc = st->codec;

avcodec_open2(cc, cd, NULL);

// make packet and decode it after collect packets is be one frame

gPacket.stream_index = 0;

gPacket.size = gPacketLength[0];

gPacket.data = gPacketData[0];

gPacket.pts = AV_NOPTS_VALUE;

gPacket.dts = AV_NOPTS_VALUE;

gPacket.flags = AV_PKT_FLAG_KEY;

avcodec_decode_video2(cc, gFrame, &got_picture, &gPacket);

// I checked automatically it initialized after "avcodec_decode_video2"

// put some info that I know that not initialized

cc->time_base.den = 90000;

cc->time_base.num = 1;

cc->bit_rate = 2500000;

cc->gop_size = 1;

// make output context with input context

AVFormatContext *&oc = gOutputContext;

avformat_alloc_output_context2(&oc, NULL, NULL, filename);

AVFormatContext *&ic = gInputContext;

AVStream *ist = ic->streams[0];

AVCodecContext *&icc = ist->codec;

AVStream *ost = avformat_new_stream(oc, icc->codec);

AVCodecContext *occ = ost->codec;

avcodec_copy_context(occ, icc);

occ->flags |= CODEC_FLAG_GLOBAL_HEADER;

avio_open(&(oc->pb), filename, AVIO_FLAG_WRITE);

// repeated part for muxing

AVRational Millisecond = { 1, 1000 };

gPacket.stream_index = 0;

gPacket.data = gPacketData[0];

gPacket.size = gPacketLength[0];

gPacket.pts = av_rescale_rnd(pkthdr->ts.tv_sec * 1000 /

 + pkthdr->ts.tv_usec / 1000 /

 - gStartTime, Millisecond.den, ost->time_base.den, /

 (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));

gPacket.dts = gPacket.pts;

gPacket.duration = gPacket.pts - gPrev;

gPacket.flags = AV_PKT_FLAG_KEY;

gPrev = gPacket.pts;

av_interleaved_write_frame(gOutputContext, &gPacket);




Expected and actual results is a .mp4 video file that can play.


-
Android + ffmpeg + AudioTrack produces bad audio output
12 septembre 2014, par Goddchenhere is what I am trying to do : use an
AudioRecord
and "pipe" the output ofAudioRecord.read(byte[],...)
to an ffmpeg process’ stdin that will convert to a 3gp (AAC) file.The ffmpeg call is as follows :
ProcessBuilder processBuilder = new ProcessBuilder(BINARY.getAbsolutePath(),
"-y",
"-ar", "44100", "-c:a", "pcm_s16le", "-ac", "1","-f","s16le",
"-i", "-",
"-strict", "-2", "-c:a", "aac",
outFile.getAbsolutePath());The AudioRecord is setup as follows :
AudioRecord record = new AudioRecord(/*AudioSource.VOICE_RECOGNITION,*/ AudioSource.MIC,
SAMPLING_RATE,
AudioFormat.CHANNEL_IN_MONO,
AudioFormat.ENCODING_PCM_16BIT,
bufferSize);SAMPLING_RATE = 44100
andbufferSize
is the one returned byAudioRecord.getMinBufferSize(...)
I am writing the data to ffmpeg like this :
try {
IOUtils.write(data, getFFmpegHelper().getCurrentProcessOutputStream());
} catch (Exception e) {
Log.e(Application.LOG_TAG, "Error writing data to ffmpeg process", e);
//TODO notify user, stop the recording, etc...
}So far so good, the ffmpeg runs and created a proper 3gp file. But the audio in the file is totally off. It seems "choppy" (not sure if this is the correct english word ;) ) and also the pace is wrong, is plays too fast.
Check out this sample : http://goddchen.de/android/tmp/tmp.3gp
This is the output of the ffmpeg process :
[s16le @ 0x23634d0] Estimating duration from bitrate, this may be inaccurate
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, s16le, from 'pipe:':
Duration: N/A, start: 0.000000, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
[aformat @ 0x2363100] auto-inserting filter 'auto-inserted resampler 0' between the filter 'src' and the filter 'aformat'
[aresample @ 0x235b0a0] chl:mono fmt:s16 r:44100Hz -> chl:mono fmt:flt r:44100Hz
Output #0, 3gp, to '/data/data/com.test.audio/files/tmp.3gp':
Metadata:
encoder : Lavf54.6.100
Stream #0:0: Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, flt, 128 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le -> aac)
size= 3kB time=00:00:00.18 bitrate= 132.5kbits/s
size= 8kB time=00:00:00.55 bitrate= 120.9kbits/s
size= 12kB time=00:00:00.83 bitrate= 121.8kbits/s
size= 16kB time=00:00:01.04 bitrate= 122.8kbits/s
size= 20kB time=00:00:01.32 bitrate= 122.5kbits/s
size= 23kB time=00:00:01.53 bitrate= 121.6kbits/s
size= 27kB time=00:00:01.81 bitrate= 121.0kbits/s
size= 31kB time=00:00:02.11 bitrate= 120.7kbits/s
size= 35kB time=00:00:02.32 bitrate= 123.4kbits/s
video:0kB audio:34kB global headers:0kB muxing overhead 3.031610%