Media (91)

Other articles (13)

  • Accepted formats

    28 January 2010, by

    The following commands report which formats and codecs the local ffmpeg installation supports:
    ffmpeg -codecs
    ffmpeg -formats
    Accepted input video formats
    This list is not exhaustive; it highlights the main formats in use:
    h264: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
    m4v: raw MPEG-4 video format
    flv: Flash Video (FLV) / Sorenson Spark / Sorenson H.263
    Theora
    wmv:
    Possible output video formats
    As a first step, we (...)

  • Supporting all media types

    13 April 2011, by

    Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats:
    images: png, gif, jpg, bmp and more
    audio: MP3, Ogg, Wav and more
    video: AVI, MP4, OGV, mpg, mov, wmv and more
    text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)

  • Adding notes and captions to images

    7 February 2011, by

    To add notes and captions to images, the first step is to install the "Légendes" plugin.
    Once the plugin is activated, you can configure it in the configuration area to change who may create, modify and delete notes. By default, only site administrators can add notes to images.
    Changes when adding a media item
    When adding a media item of type "image", a new button appears above the preview (...)

On other sites (3346)

  • Android + ffmpeg + AudioTrack produces bad audio output

    12 September 2014, by Goddchen

    Here is what I am trying to do: use an AudioRecord and "pipe" the output of AudioRecord.read(byte[],...) to the stdin of an ffmpeg process, which converts it to a 3gp (AAC) file.

    The ffmpeg call is as follows:

    ProcessBuilder processBuilder = new ProcessBuilder(BINARY.getAbsolutePath(),
            "-y",
            "-ar", "44100", "-c:a", "pcm_s16le", "-ac", "1", "-f", "s16le",
            "-i", "-",
            "-strict", "-2", "-c:a", "aac",
            outFile.getAbsolutePath());

    The AudioRecord is set up as follows:

    AudioRecord record = new AudioRecord(/*AudioSource.VOICE_RECOGNITION,*/ AudioSource.MIC,
            SAMPLING_RATE,
            AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_16BIT,
            bufferSize);

    SAMPLING_RATE = 44100 and bufferSize is the one returned by AudioRecord.getMinBufferSize(...)

    I am writing the data to ffmpeg like this:

    try {
        IOUtils.write(data, getFFmpegHelper().getCurrentProcessOutputStream());
    } catch (Exception e) {
        Log.e(Application.LOG_TAG, "Error writing data to ffmpeg process", e);
        // TODO notify user, stop the recording, etc.
    }

    So far so good: ffmpeg runs and creates a proper 3gp file. But the audio in the file is totally off. It sounds "choppy" (not sure if this is the correct English word ;) ) and the pace is also wrong: it plays too fast.

    Check out this sample: http://goddchen.de/android/tmp/tmp.3gp

    This is the output of the ffmpeg process:

       [s16le @ 0x23634d0] Estimating duration from bitrate, this may be inaccurate
       Guessed Channel Layout for  Input Stream #0.0 : mono
       Input #0, s16le, from 'pipe:':
       Duration: N/A, start: 0.000000, bitrate: 705 kb/s
       Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
       [aformat @ 0x2363100] auto-inserting filter 'auto-inserted resampler 0' between the filter 'src' and the filter 'aformat'
       [aresample @ 0x235b0a0] chl:mono fmt:s16 r:44100Hz -> chl:mono fmt:flt r:44100Hz
       Output #0, 3gp, to '/data/data/com.test.audio/files/tmp.3gp':
       Metadata:
       encoder         : Lavf54.6.100
       Stream #0:0: Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, flt, 128 kb/s
       Stream mapping:
       Stream #0:0 -> #0:0 (pcm_s16le -> aac)
       size=       3kB time=00:00:00.18 bitrate= 132.5kbits/s
       size=       8kB time=00:00:00.55 bitrate= 120.9kbits/s
       size=      12kB time=00:00:00.83 bitrate= 121.8kbits/s
       size=      16kB time=00:00:01.04 bitrate= 122.8kbits/s
       size=      20kB time=00:00:01.32 bitrate= 122.5kbits/s
       size=      23kB time=00:00:01.53 bitrate= 121.6kbits/s
       size=      27kB time=00:00:01.81 bitrate= 121.0kbits/s
       size=      31kB time=00:00:02.11 bitrate= 120.7kbits/s
       size=      35kB time=00:00:02.32 bitrate= 123.4kbits/s
       video:0kB audio:34kB global headers:0kB muxing overhead 3.031610%

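
    One way to narrow this down is to compare how many bytes actually reach ffmpeg with how many the recording time implies. Below is a minimal Python sketch of that arithmetic, using the parameters from the question (44100 Hz, mono, 16-bit). The function names and the 10-second example are illustrative assumptions, not part of the original code.

```python
def pcm_byte_rate(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Bytes of raw PCM produced per second of audio."""
    return sample_rate_hz * channels * bits_per_sample // 8


def pcm_duration_seconds(num_bytes: int, sample_rate_hz: int = 44100,
                         channels: int = 1, bits_per_sample: int = 16) -> float:
    """Playback time represented by num_bytes of raw PCM."""
    return num_bytes / pcm_byte_rate(sample_rate_hz, channels, bits_per_sample)


rate = pcm_byte_rate(44100, 1, 16)
print(rate)             # 88200 bytes per second
print(rate * 8 / 1000)  # 705.6 kb/s, which ffmpeg logs as "705 kb/s"

# Hypothetical check: 10 s of wall-clock recording should deliver 882000 bytes.
# If only 441000 bytes were written, the file holds just 5 s of audio.
print(pcm_duration_seconds(441000))  # 5.0
```

    If the byte count written to the pipe falls short of what the elapsed recording time implies, samples are being lost before they reach ffmpeg, which would be consistent with audio that sounds choppy and too fast. This is only a diagnostic; it does not identify where the loss happens.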
  • How to set pts, dts and duration in ffmpeg library?

    24 March, by hslee

    I want to pack some compressed video packets (H.264) into an ".mp4" container.
    In one word: muxing, with no decoding and no encoding.
    I have no idea how to set pts, dts and duration.

    1. I get the packets with the "pcap" library.
    2. I remove the headers that precede the compressed video data, e.g. Ethernet, VLAN.
    3. I collect data until I have one frame and decode it to get information about the data, e.g. width, height. (I am not sure this is necessary.)
    4. I initialize the output context, stream and codec context.
    5. I start receiving packets with the "pcap" library again (now for muxing).
    6. I assemble one frame and put that data in an AVPacket structure.
    7. I try to set PTS, DTS and duration. (I think this is the wrong part, though I am not sure.)

    *7-1. At the first frame, I save the time (msec) from the packet header structure.

    *7-2. Whenever I assemble a frame, I set the parameters like this: PTS (current time - start time), DTS (same value as PTS), duration (current PTS - previous PTS).

    I think this has some errors because:

    1. I don't know how far DTS should be from PTS.
    2. I think duration means how long this frame is shown, from now until the next frame, so it should be (next PTS - current PTS), but I cannot know the next PTS at that time.

    The stream contains I-frames only.

    // make input context for decoding
    AVFormatContext *&ic = gInputContext;
    ic = avformat_alloc_context();
    AVCodec *cd = avcodec_find_decoder(AV_CODEC_ID_H264);
    AVStream *st = avformat_new_stream(ic, cd);
    AVCodecContext *cc = st->codec;
    avcodec_open2(cc, cd, NULL);

    // build a packet once the collected data forms one frame, then decode it
    gPacket.stream_index = 0;
    gPacket.size    = gPacketLength[0];
    gPacket.data    = gPacketData[0];
    gPacket.pts     = AV_NOPTS_VALUE;
    gPacket.dts     = AV_NOPTS_VALUE;
    gPacket.flags   = AV_PKT_FLAG_KEY;
    avcodec_decode_video2(cc, gFrame, &got_picture, &gPacket);

    // I checked that most fields are initialized automatically by avcodec_decode_video2;
    // set the ones I know of that are still uninitialized
    cc->time_base.den   = 90000;
    cc->time_base.num   = 1;
    cc->bit_rate    = 2500000;
    cc->gop_size    = 1;

    // make output context from the input context
    AVFormatContext *&oc = gOutputContext;
    avformat_alloc_output_context2(&oc, NULL, NULL, filename);
    AVFormatContext *&ic = gInputContext;
    AVStream *ist = ic->streams[0];
    AVCodecContext *&icc = ist->codec;
    AVStream *ost = avformat_new_stream(oc, icc->codec);
    AVCodecContext *occ = ost->codec;
    avcodec_copy_context(occ, icc);
    occ->flags |= CODEC_FLAG_GLOBAL_HEADER;
    avio_open(&(oc->pb), filename, AVIO_FLAG_WRITE);

    // repeated for every frame while muxing
    AVRational Millisecond = { 1, 1000 };
    gPacket.stream_index = 0;
    gPacket.data = gPacketData[0];
    gPacket.size = gPacketLength[0];
    // av_rescale_rnd(a, b, c, rnd) computes a * b / c, so this converts
    // milliseconds to stream ticks (assuming ost->time_base.num == 1)
    gPacket.pts = av_rescale_rnd(pkthdr->ts.tv_sec * 1000
        + pkthdr->ts.tv_usec / 1000
        - gStartTime, ost->time_base.den, Millisecond.den,
        (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
    gPacket.dts = gPacket.pts;
    // note: this is the time elapsed since the previous frame
    gPacket.duration = gPacket.pts - gPrev;
    gPacket.flags = AV_PKT_FLAG_KEY;
    gPrev = gPacket.pts;
    av_interleaved_write_frame(gOutputContext, &gPacket);

    The expected result is an .mp4 video file that plays correctly.
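
    The two open points in this question (how far DTS should be from PTS, and not knowing the next PTS when a frame is written) can be illustrated with a small Python sketch. It is not libav code: `rescale` only mimics what a time-base conversion does, the 1/90000 stream time base and the 40 ms arrival spacing are assumptions, and `stamp_packets` is a hypothetical helper. It relies on two ideas: with I-frames only there is no frame reordering, so DTS can simply equal PTS, and holding each packet back until the next one arrives makes its duration known.

```python
import math
from fractions import Fraction

MILLISECONDS = Fraction(1, 1000)  # capture clock: one tick is 1 ms
STREAM_TB = Fraction(1, 90000)    # assumed output stream time base


def rescale(value: int, src_tb: Fraction, dst_tb: Fraction) -> int:
    """Convert a non-negative timestamp between time bases, rounding to nearest."""
    return math.floor(value * src_tb / dst_tb + Fraction(1, 2))


def stamp_packets(arrivals_ms):
    """Yield (pts, dts, duration) in STREAM_TB ticks for each arrival time.

    Each packet is held back until the next one arrives, so its duration
    is next_pts - pts. The final packet reuses the previous duration,
    which is an assumption; any reasonable estimate would do.
    """
    start = None
    prev_pts = None
    last_duration = 0
    for t in arrivals_ms:
        if start is None:
            start = t
        pts = rescale(t - start, MILLISECONDS, STREAM_TB)
        if prev_pts is not None:
            last_duration = pts - prev_pts
            # I-frames only: no reordering, so dts == pts.
            yield prev_pts, prev_pts, last_duration
        prev_pts = pts
    if prev_pts is not None:
        yield prev_pts, prev_pts, last_duration


for stamp in stamp_packets([1000, 1040, 1080, 1120]):
    print(stamp)
# (0, 0, 3600)
# (3600, 3600, 3600)
# (7200, 7200, 3600)
# (10800, 10800, 3600)
```

    The cost of this design is one frame of latency in the muxing loop. Whether the muxer needs an exact duration for every packet is not established here.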

    


  • Batch splitting large audio files into small fixed-length audio files in moments of silence

    26 July 2023, by Haldjärvi

    To train the SO-VITS-SVC neural network, we need voice files of 10-14 seconds. As source material, let's say I use phrases from some game. I have already made a batch script that decodes different files into one working format, another that removes silence, and a third that combines small audio files into files of 13-14 seconds (I used Python, pydub and FFmpeg). To create a training dataset automatically, only one batch script remains: cutting audio files longer than 14 seconds into separate files of 10-14 seconds, preferably at points of silence or near-silence.

    So I need to batch-cut large audio files (20 seconds, 70 seconds, possibly several hundred seconds) into segments of approximately 10-14 seconds. The main task is to find the quietest place within each cut area, so that phrases are not cut in the middle of a word (which is bad for model training). Is it possible to do this efficiently, so that processing a 30-second file does not take 15 seconds? Quiet-zone detection is needed only in the cut area, that is, between 10 and 14 seconds counted from the very beginning of the file.

    I would be very grateful for any help.

    I tried to write a script together with ChatGPT, but every variant gave completely unpredictable results that were not even close to what I needed. I had to settle for a hard cut at exactly 14000 milliseconds. However, I hope there is a way to cut exactly in quiet areas.

    import os
    from pydub import AudioSegment

    input_directory = ".../RemSilence/"
    output_directory = ".../Split/"
    max_duration = 14000  # maximum segment length in milliseconds

    def split_audio_by_duration(audio, duration):
        """Cut an AudioSegment into consecutive slices of at most `duration` ms."""
        return [audio[i:i + duration] for i in range(0, len(audio), duration)]

    if __name__ == "__main__":
        os.makedirs(output_directory, exist_ok=True)
        audio_files = [os.path.join(input_directory, file)
                       for file in os.listdir(input_directory)
                       if file.endswith(".wav")]
        # Process the shortest files first (len() of an AudioSegment is its duration in ms).
        audio_files.sort(key=lambda file: len(AudioSegment.from_file(file)))
        # Continue numbering after any files already present in the output directory.
        counter = len(os.listdir(output_directory))
        for file in audio_files:
            audio = AudioSegment.from_file(file)
            # A file no longer than max_duration yields a single slice.
            for segment in split_audio_by_duration(audio, max_duration):
                counter += 1
                output_file_path = os.path.join(output_directory, f"output_{counter}.wav")
                segment.export(output_file_path, format="wav")
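
    The quiet-point cut described in the question could be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not a tuned implementation: it works on a plain sequence of mono sample amplitudes, measures loudness as the sum of squared amplitudes over a short window (50 ms by default), and scans only the stretch between 10 and 14 seconds after the previous cut. The function names and the synthetic test signal are hypothetical.

```python
def window_energy(samples, start, length):
    """Sum of squared amplitudes over samples[start:start + length]."""
    return sum(s * s for s in samples[start:start + length])


def quietest_cut(samples, rate, start=0, min_s=10.0, max_s=14.0, window_s=0.05):
    """Return the sample index of the quietest spot min_s..max_s after `start`.

    Only that stretch is scanned, in steps of one window. Assumes the
    stretch is at least one window long.
    """
    window = max(1, round(window_s * rate))
    lo = start + int(min_s * rate)
    hi = min(start + int(max_s * rate), len(samples)) - window
    best = min(range(lo, hi + 1, window),
               key=lambda i: window_energy(samples, i, window))
    return best + window // 2  # cut in the middle of the quietest window


def split_on_quiet(samples, rate, min_s=10.0, max_s=14.0, window_s=0.05):
    """Split samples into chunks of min_s..max_s seconds, cutting at quiet spots."""
    chunks, pos = [], 0
    while len(samples) - pos > max_s * rate:
        cut = quietest_cut(samples, rate, pos, min_s, max_s, window_s)
        chunks.append(samples[pos:cut])
        pos = cut
    chunks.append(samples[pos:])  # remainder, at most max_s seconds long
    return chunks


# Synthetic 30 s signal at 100 samples/s: constant loudness, silent from 12.0 s to 12.5 s.
rate = 100
signal = [1000] * (30 * rate)
signal[1200:1250] = [0] * 50
print([len(chunk) for chunk in split_on_quiet(signal, rate)])  # [1202, 1002, 796]
```

    With pydub, the samples of a mono file could be obtained from `AudioSegment.get_array_of_samples()` and the rate from `frame_rate`, and a cut index converted to milliseconds before slicing. The last chunk can be shorter than 10 seconds, so it may need to go through the existing script that combines short files. Because each cut scans only a 4-second stretch, the work per cut stays bounded regardless of file length; a vectorized energy computation would likely be faster still.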