
Recherche avancée
Autres articles (61)
-
Gestion des droits de création et d’édition des objets
8 février 2011, parPar défaut, beaucoup de fonctionnalités sont limitées aux administrateurs mais restent configurables indépendamment pour modifier leur statut minimal d’utilisation notamment : la rédaction de contenus sur le site modifiables dans la gestion des templates de formulaires ; l’ajout de notes aux articles ; l’ajout de légendes et d’annotations sur les images ;
-
Dépôt de média et thèmes par FTP
31 mai 2013, parL’outil MédiaSPIP traite aussi les média transférés par la voie FTP. Si vous préférez déposer par cette voie, récupérez les identifiants d’accès vers votre site MédiaSPIP et utilisez votre client FTP favori.
Vous trouverez dès le départ les dossiers suivants dans votre espace FTP : config/ : dossier de configuration du site IMG/ : dossier des média déjà traités et en ligne sur le site local/ : répertoire cache du site web themes/ : les thèmes ou les feuilles de style personnalisées tmp/ : dossier de travail (...) -
Keeping control of your media in your hands
13 avril 2011, parThe vocabulary used on this site and around MediaSPIP in general, aims to avoid reference to Web 2.0 and the companies that profit from media-sharing.
While using MediaSPIP, you are invited to avoid using words like "Brand", "Cloud" and "Market".
MediaSPIP is designed to facilitate the sharing of creative media online, while allowing authors to retain complete control of their work.
MediaSPIP aims to be accessible to as many people as possible and development is based on expanding the (...)
Sur d’autres sites (7101)
-
FFMPEG loudnorm filter does not work in combination with silenceremove filter
12 mai 2021, par MareikePI want to consistently normalize audio files for TTS model training. The output audio files should meet the following criteria :


- 

- mono channel
- sample rate of 22050 Hz
- wav format
- no silence at beginning and end of audio clip
- volume of -24 dB












I have already fulfilled the first 4 criteria. So far, it works properly.


Normalizing the volume basically works as well with this ffmpeg command
-af loudnorm=I=-24:LRA=11:TP=-1.5
, but not in combination with the silence removal : As soon as I remove silence with this ffmpeg commandagate=threshold=0.045:attack=0.5:release=500:ratio=5000,silenceremove=start_periods=1:start_threshold=0.0075,areverse,silenceremove=start_periods=1:start_threshold=0.0075,areverse
, the loudness normalization does not work any longer : the output volume now varies between -25dB and -32dB instead of the desired -24 dB.

This is the complete ffmpeg command I used :


ffmpeg -i filename.flac -ac 1 -af agate=threshold=0.045:attack=0.5:release=500:ratio=5000,silenceremove=start_periods=1:start_threshold=0.0075,areverse,silenceremove=start_periods=1:start_threshold=0.0075,areverse,loudnorm=I=-24:LRA=11:TP=-1.5,aresample=22050 -y -hide_banner filename.wav



And this is the piece of code that I'm using to run it :


import os

INPUT_DIR = '/home/username/all_data'
OUTPUT_DIR = '/home/username/normalized_data'
for filename in os.listdir(INPUT_DIR):
 wav_filename = filename[:-5] + '.wav'
 command = (f'ffmpeg -i {INPUT_DIR}/{filename} -ac 1 -af agate='
 f'threshold=0.045:attack=0.5:release=500:ratio=5000,'
 f'silenceremove=start_periods=1:start_threshold=0.0075,'
 f'areverse,silenceremove=start_periods=1:start_threshold='
 f'0.0075,areverse,loudnorm=I=-24:LRA=11:TP=-1.5,aresample'
 f'=22050 -y -hide_banner {OUTPUT_DIR}/{wav_filename}')
 os.system(command)



EDIT :


A complete log from the ffmpeg command can be seen here :


username@pop-os:~$ ffmpeg -i /home/username/audios/filename.flac -ac 1 -af agate=threshold=0.045:attack=0.5:release=500:ratio=5000,silenceremove=start_periods=1:start_threshold=0.0075,areverse,silenceremove=start_periods=1:start_threshold=0.0075,areverse,loudnorm=I=-24:LRA=11:TP=-1.5,aresample=22050 /home/username/result.wav
ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
 built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
 configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
 libavutil 56. 31.100 / 56. 31.100
 libavcodec 58. 54.100 / 58. 54.100
 libavformat 58. 29.100 / 58. 29.100
 libavdevice 58. 8.100 / 58. 8.100
 libavfilter 7. 57.100 / 7. 57.100
 libavresample 4. 0. 0 / 4. 0. 0
 libswscale 5. 5.100 / 5. 5.100
 libswresample 3. 5.100 / 3. 5.100
 libpostproc 55. 5.100 / 55. 5.100
Input #0, flac, from '/home/mareike/tts_data/save/audios_flac/0a6c8520-7536-11eb-8338-b7015f354987.flac':
 Duration: 00:00:04.64, start: 0.000000, bitrate: 1090 kb/s
 Stream #0:0: Audio: flac, 44100 Hz, stereo, s32 (24 bit)
Stream mapping:
 Stream #0:0 -> #0:0 (flac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/home/mareike/result_0a6c8520-7536-11eb-8338-b7015f354987.wav':
 Metadata:
 ISFT : Lavf58.29.100
 Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
 Metadata:
 encoder : Lavc58.54.100 pcm_s16le
size= 138kB time=00:00:03.19 bitrate= 353.0kbits/s speed=14.3x 
video:0kB audio:138kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.055375%



Can anyone tell me what I'm doing wrong and how I can finally get the volume normalized to -24 dB (in combination with silence removal) ? Any help is appreciated, thank you very much !


-
ffmpeg error on decode
25 octobre 2013, par ademar111190I'm developing an android app with the libav and I'm trying decode a 3gp with code below :
#define simbiLog(...) __android_log_print(ANDROID_LOG_DEBUG, "simbiose", __VA_ARGS__)
...
AVCodec *codec;
AVCodecContext *c = NULL;
int len;
FILE *infile, *outfile;
uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
AVPacket avpkt;
AVFrame *decoded_frame = NULL;
simbiLog("inbuf size: %d", sizeof(inbuf) / sizeof(inbuf[0]));
av_register_all();
av_init_packet(&avpkt);
codec = avcodec_find_decoder(AV_CODEC_ID_AMR_NB);
if (!codec) {
simbiLog("codec not found");
return ERROR;
}
c = avcodec_alloc_context3(codec);
if (!c) {
simbiLog("Could not allocate audio codec context");
return ERROR;
}
int open = avcodec_open2(c, codec, NULL);
if (open < 0) {
simbiLog("could not open codec %d", open);
return ERROR;
}
infile = fopen(inputPath, "rb");
if (!infile) {
simbiLog("could not open %s", inputPath);
return ERROR;
}
outfile = fopen(outputPath, "wb");
if (!outfile) {
simbiLog("could not open %s", outputPath);
return ERROR;
}
avpkt.data = inbuf;
avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, infile);
int iterations = 0;
while (avpkt.size > 0) {
simbiLog("iteration %d", (++iterations));
simbiLog("avpkt.size %d avpkt.data %X", avpkt.size, avpkt.data);
int got_frame = 0;
if (!decoded_frame) {
if (!(decoded_frame = avcodec_alloc_frame())) {
simbiLog("out of memory");
return ERROR;
}
} else {
avcodec_get_frame_defaults(decoded_frame);
}
//below the error, but it isn't occur on first time, only in 4th loop interation
len = avcodec_decode_audio4(c, decoded_frame, &got_frame, &avpkt);
if (len < 0) {
simbiLog("Error while decoding error %d frame %d duration %d", len, got_frame, avpkt.duration);
return ERROR;
} else {
simbiLog("Decoding length %d frame %d duration %d", len, got_frame, avpkt.duration);
}
if (got_frame) {
int data_size = av_samples_get_buffer_size(NULL, c->channels, decoded_frame->nb_samples, c->sample_fmt, 1);
size_t* fwrite_size = fwrite(decoded_frame->data[0], 1, data_size, outfile);
simbiLog("fwrite returned %d", fwrite_size);
}
avpkt.size -= len;
avpkt.data += len;
if (avpkt.size < AUDIO_REFILL_THRESH) {
memmove(inbuf, avpkt.data, avpkt.size);
avpkt.data = inbuf;
len = fread(avpkt.data + avpkt.size, 1, AUDIO_INBUF_SIZE - avpkt.size, infile);
if (len > 0)
avpkt.size += len;
simbiLog("fread returned %d", len);
}
}
fclose(outfile);
fclose(infile);
avcodec_close(c);
av_free(c);
av_free(decoded_frame);but I'm getting the follow log and error :
inbuf size: 20488
iteration 1
avpkt.size 3305 avpkt.data BEEED40C
Decoding length 13 frame 1 duration 0
fwrite returned 640
fread returned 0
iteration 2
avpkt.size 3292 avpkt.data BEEED40C
Decoding length 13 frame 1 duration 0
fwrite returned 640
fread returned 0
iteration 3
avpkt.size 3279 avpkt.data BEEED40C
Decoding length 14 frame 1 duration 0
fwrite returned 640
fread returned 0
iteration 4
avpkt.size 3265 avpkt.data BEEED40C
Error while decoding error -1052488119 frame 0 duration 0the audio file I'm trying decode :
$ avprobe blue.3gp
avprobe version 0.8.6-6:0.8.6-1ubuntu2, Copyright (c) 2007-2013 the Libav developers
built on Mar 30 2013 22:23:21 with gcc 4.7.2
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'blue.3gp':
Metadata:
major_brand : 3gp4
minor_version : 0
compatible_brands: isom3gp4
creation_time : 2013-09-19 18:53:38
Duration: 00:00:01.52, start: 0.000000, bitrate: 17 kb/s
Stream #0.0(eng): Audio: amrnb, 8000 Hz, 1 channels, flt, 12 kb/s
Metadata:
creation_time : 2013-09-19 18:53:38thanks a lot !
EDITED
I read on ffmper documentation about the method
avcodec_decode_audio4
the follow :@warning The input buffer, avpkt->data must be FF_INPUT_BUFFER_PADDING_SIZE larger than the actual read bytes because some optimized bitstream readers read 32 or 64 bits at once and could read over the end.
@note You might have to align the input buffer. The alignment requirements depend on the CPU and the decoder.and I see here a solution using
posix_memalign
, to android i founded a similar method calledmemalign
, so i did the change :removed :
uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
inserted :
int inbufSize = sizeof(uint8_t) * (AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE);
uint8_t *inbuf = memalign(FF_INPUT_BUFFER_PADDING_SIZE, inbufSize);
simbiLog("inbuf size: %d", inbufSize);
for (; inbufSize >= 0; inbufSize--)
simbiLog("inbuf position: %d index: %p", inbufSize, &inbuf[inbufSize]);I'm getting the correct memory sequence position, but the error not changed.
A piece of outpout :
inbuf position: 37 index: 0x4e43d745
inbuf position: 36 index: 0x4e43d744
inbuf position: 35 index: 0x4e43d743
inbuf position: 34 index: 0x4e43d742
inbuf position: 33 index: 0x4e43d741
inbuf position: 32 index: 0x4e43d740
inbuf position: 31 index: 0x4e43d73f
inbuf position: 30 index: 0x4e43d73e
inbuf position: 29 index: 0x4e43d73d
inbuf position: 28 index: 0x4e43d73c
inbuf position: 27 index: 0x4e43d73b
inbuf position: 26 index: 0x4e43d73a
inbuf position: 25 index: 0x4e43d739
inbuf position: 24 index: 0x4e43d738
inbuf position: 23 index: 0x4e43d737
inbuf position: 22 index: 0x4e43d736
inbuf position: 21 index: 0x4e43d735
inbuf position: 20 index: 0x4e43d734
inbuf position: 19 index: 0x4e43d733 -
ffmpeg avcodec_send_packet fail after several frames
8 janvier 2024, par XingdiI am trying to use ffmpeg to decode a video stream and convert to Opencv
cv::Mat
.

I found this piece of code and made it compiled. Howerver, the decoding process will always fail after several frames. The call to
avcodec_send_packet
return error code-11
.

I do now know what the problem is. I am sure the video stream and my FFmpeg is ok. I can transcode the stream video using ffmpeg command line.


The error always happen on


error = avcodec_send_packet(videoCodecContext, &packet);



#include <iostream>
#include <string>
#include <vector>

extern "C" {
#include <libavformat></libavformat>avformat.h>
#include <libavutil></libavutil>imgutils.h>
#include <libswscale></libswscale>swscale.h>
#include <libavcodec></libavcodec>avcodec.h>
}


#include <opencv2></opencv2>opencv.hpp>

// helper function to check for FFmpeg errors
inline void checkError(int error, const std::string& message) {
 if (error < 0) {
 //std::cerr << message << ": " << av_err2str(error) << std::endl; //error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conv>
 std::cerr << message << ": " << std::to_string(error) << std::endl;
 exit(EXIT_FAILURE);
 }
}

int main() {
 // initialize FFmpeg
 av_log_set_level(AV_LOG_ERROR);
 avformat_network_init();

 // open the input file
 std::string fileName = "rtsp://xyz@192.168.15.112:554//Streaming/Channels/101";
 AVFormatContext* formatContext = nullptr;
 int error = avformat_open_input(&formatContext, fileName.c_str(), nullptr, nullptr);
 checkError(error, "Error opening input file");

 //Read packets of a media file to get stream information.
 ////////////////////////////////////////////////////////////////////////////
 error = avformat_find_stream_info(formatContext, nullptr);
 checkError(error, "Error avformat find stream info");
 ////////////////////////////////////////////////////////////////////////////


 // find the video stream
 AVStream* videoStream = nullptr;
 for (unsigned int i = 0; i < formatContext->nb_streams; i++) {
 if (formatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO && !videoStream) {
 videoStream = formatContext->streams[i];
 }
 }
 if (!videoStream) {
 std::cerr << "Error: input file does not contain a video stream" << std::endl;
 exit(EXIT_FAILURE);
 }

 // create the video codec context
 const AVCodec* videoCodec = avcodec_find_decoder(videoStream->codecpar->codec_id);
 AVCodecContext* videoCodecContext = avcodec_alloc_context3(videoCodec);
 if (!videoCodecContext) {
 std::cerr << "Error allocating video codec context" << std::endl;
 exit(EXIT_FAILURE);
 }
 error = avcodec_parameters_to_context(videoCodecContext, videoStream->codecpar);
 checkError(error, "Error setting video codec context parameters");
 error = avcodec_open2(videoCodecContext, videoCodec, nullptr);
 checkError(error, "Error opening video codec");

 // create the frame scaler
 int width = videoCodecContext->width;
 int height = videoCodecContext->height;

 std::cout<<"frame width:" << width << std::endl;
 std::cout<<"frame height:" << height << std::endl;
 std::cout<<"frame pix fmt:" << videoCodecContext->pix_fmt << std::endl;

 struct SwsContext* frameScaler = sws_getContext(width, height, videoCodecContext->pix_fmt, width, height, AV_PIX_FMT_BGR24, SWS_BICUBIC, nullptr, nullptr, nullptr);

 // read the packets and decode the video frames
 std::vector videoFrames;
 AVPacket packet;
 int i=0;
 while (av_read_frame(formatContext, &packet) == 0) {
 std::cout<<"frame number:"<< i++ <index) {
 // decode the video frame
 AVFrame* frame = av_frame_alloc();
 int gotFrame = 0;
 error = avcodec_send_packet(videoCodecContext, &packet);
 checkError(error, "Error sending packet to video codec");
 if (error<0) {
 continue;
 }
 error = avcodec_receive_frame(videoCodecContext, frame);

 //There is not enough data for decoding the frame, have to free and get more data
 ////////////////////////////////////////////////////////////////////////////
 if (error == AVERROR(EAGAIN))
 {
 av_frame_unref(frame);
 av_freep(frame);
 continue;
 }

 if (error == AVERROR_EOF)
 {
 std::cerr << "AVERROR_EOF" << std::endl;
 break;
 }
 ////////////////////////////////////////////////////////////////////////////

 checkError(error, "Error receiving frame from video codec");


 if (error == 0) {
 gotFrame = 1;
 }
 if (gotFrame) {
 // scale the frame to the desired format
 AVFrame* scaledFrame = av_frame_alloc();
 av_image_alloc(scaledFrame->data, scaledFrame->linesize, width, height, AV_PIX_FMT_BGR24, 32);
 sws_scale(frameScaler, frame->data, frame->linesize, 0, height, scaledFrame->data, scaledFrame->linesize);
 // copy the frame data to a cv::Mat object
 cv::Mat mat(height, width, CV_8UC3, scaledFrame->data[0], scaledFrame->linesize[0]);

 //Show mat image for testing
 ////////////////////////////////////////////////////////////////////////////
 //cv::imshow("mat", mat);
 //cv::waitKey(100); //Wait 100msec (relativly long time - for testing).
 ////////////////////////////////////////////////////////////////////////////


 videoFrames.push_back(mat.clone());

 // clean up
 av_freep(&scaledFrame->data[0]);
 av_frame_free(&scaledFrame);
 }
 av_frame_free(&frame);
 }
 av_packet_unref(&packet);
 }

 // clean up
 sws_freeContext(frameScaler);
 avcodec_free_context(&videoCodecContext);
 avformat_close_input(&formatContext);

 return 0;
}
</vector></string></iostream>