Advanced search

Media (0)


No media on this site matches your criteria.

Other articles (61)

  • Websites made with MediaSPIP

    2 May 2011

    This page lists some websites based on MediaSPIP.

  • Creating farms of unique websites

    13 April 2011

    MediaSPIP platforms can be installed as a farm, with a single "core" hosted on a dedicated server and used by multiple websites.
    This allows (among other things):
      • implementation costs to be shared between several different projects/individuals
      • rapid deployment of multiple unique sites
      • creation of groups of like-minded sites, making it possible to browse media in a more controlled and selective environment than the major "open" (...)

  • Submit enhancements and plugins

    13 April 2011

    If you have developed a new extension that adds one or more useful features to MediaSPIP, let us know and its integration into the MediaSPIP core will be considered.
    You can use the development discussion list to request help with creating a plugin. As MediaSPIP is based on SPIP, you can also use the SPIP discussion list, SPIP-Zone.

On other sites (6986)

  • Resampling audio using FFmpeg API

    13 October 2020, by bbdd

    I have a task: decode audio data, re-encode it to another format, and save the encoded data to a buffer. The encoded data that I need to save to the buffer is in AVPacket::data. I save it after the following procedure:

    1. Receive a packet from the input stream
    2. Send the packet to the decoder
    3. Get the decoded frame
    4. Send it to the encoder
    5. Get the encoded packet
    6. Save the encoded data to the buffer

    All procedures work. But here’s the problem: I need to add resampling between steps 3 and 4. Before the data is sent to the encoder, it must be resampled if required. For example, I receive audio data in PCM_ALAW format with 1 audio channel and an 8000 Hz sample rate, and on output I want PCM_S32LE with 2 channels and a 44100 Hz sample rate. Converting the sample format from PCM_ALAW to PCM_S32LE works, but I do not know how to implement the resampling.
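
    For reference, a libswresample context for exactly this conversion could be created as in the sketch below. This is only a sketch, assuming the decoded PCM_ALAW frames arrive as AV_SAMPLE_FMT_S16 (the output format of FFmpeg’s pcm_alaw decoder) and omitting most error handling:

// Sketch: resampler for PCM_ALAW mono 8000 Hz -> PCM_S32LE stereo 44100 Hz.
SwrContext *swr = swr_alloc_set_opts(
    nullptr,
    AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_S32, 44100,  // output: 2 channels, S32, 44100 Hz
    AV_CH_LAYOUT_MONO,   AV_SAMPLE_FMT_S16, 8000,   // input: 1 channel, S16, 8000 Hz
    0, nullptr);
if (!swr || swr_init(swr) < 0) {
    // handle allocation / initialization failure
}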

    I have an incomplete implementation of the resampling functions, but I do not know how to put it all together. I was advised this and this example, but I couldn’t make it work.

    The full code follows.

    audiodecoder.h

    class AudioDecoder
{
public:

    AudioDecoder(const AudioDecoderSettings& settings);
    AudioDecoder& operator=(const AudioDecoder& other) = delete;
    AudioDecoder& operator=(AudioDecoder&& other)      = delete;
    AudioDecoder(const AudioDecoder& other)            = delete;
    AudioDecoder(AudioDecoder&& other)                 = delete;
    virtual ~AudioDecoder(void);

    virtual qint32 init(void) noexcept;
    //virtual QByteArray getData(const quint32 &size) noexcept;
    virtual QByteArray get() noexcept;
    virtual qint32 term(void) noexcept;

protected:

    virtual qint32 openInputStream (void) noexcept;
    virtual qint32 openEncoderForStream(void) noexcept;
    virtual qint32 decodeAudioFrame(AVFrame *frame);
    virtual qint32 encodeAudioFrame(AVFrame *frame);
    virtual qint32 initResampler(void);
    virtual qint32 initConvertedSamples(uint8_t ***converted_input_samples, int frame_size);

    class Deleter
    {
    public:
        static void cleanup(AVFormatContext* p);
        static void cleanup(AVCodecContext* p);
        static void cleanup(AudioDecoderSettings* p);
    };

protected:

    bool   m_edf;
    bool   m_initialized{ false };
    qint32 m_streamIndex{ 0 };
    QByteArray                                     m_buffer;
    QScopedPointer<AVFormatContext, Deleter>      p_frmCtx{ nullptr };
    QScopedPointer<AVCodecContext, Deleter>       p_iCdcCtx{ nullptr };
    QScopedPointer<AVCodecContext, Deleter>       p_oCdcCtx{ nullptr };
    QScopedPointer<AudioDecoderSettings, Deleter> p_settings{ nullptr };
    SwrContext *swrCtx;
};

    audiodecoder.cpp

    static void initPacket(AVPacket *packet)
{
    av_init_packet(packet);
    // Set the packet data and size so that
    // it is recognized as being empty.
    packet->data = nullptr;
    packet->size = 0;
}

static QString error2string(const qint32& code)
{
    if (code < 0) {
        char errorBuffer[255]{ 0 };
        av_strerror(code, errorBuffer, sizeof(errorBuffer));
        return QString(errorBuffer);
    }
    return QString();
}

static void printErrorMessage(const QString &message, const qint32 &code = 0)
{
    qDebug() << "AudioDecoder: " << message << error2string(code);
}

static qint32 initInputFrame(AVFrame **frame)
{
    if (!(*frame = av_frame_alloc())) {
        printErrorMessage(QString("Could not allocate input frame"));
        return AVERROR(ENOMEM);
    }
    return 0;
}

void AudioDecoder::Deleter::cleanup(AVFormatContext* p)
{
    if (p) {
        avformat_close_input(&p);
    }
}

void AudioDecoder::Deleter::cleanup(AVCodecContext* p)
{
    if (p) {
        avcodec_free_context(&p);
    }
}

void AudioDecoder::Deleter::cleanup(AudioDecoderSettings* p)
{
    if (p) {
        delete p;
    }
}

AudioDecoder::AudioDecoder(const AudioDecoderSettings& settings)
    : m_edf(false),
      m_initialized(false)
    , m_streamIndex(0)
    , p_frmCtx(nullptr)
    , p_iCdcCtx(nullptr)
    , p_oCdcCtx(nullptr)
    , p_settings(new AudioDecoderSettings(settings))
{
    av_register_all();
    avcodec_register_all();
}

qint32 AudioDecoder::openInputStream(void) noexcept
{
    qint32           error  = -1;
    AVCodecContext  *avctx  = nullptr;
    AVFormatContext *frmCtx = nullptr;

    // Open the input file to read from it.
    if ((error = avformat_open_input(&frmCtx,
            p_settings->inputFile().toStdString().c_str(), nullptr, nullptr)) < 0) {
        frmCtx = nullptr;
        printErrorMessage(QString("Could not open input file '%1' (error '%2')")
                          .arg(p_settings->inputFile())
                          .arg(error2string(error)));
        return error;
    }

    // Get information on the input file (number of streams etc.).
    if ((error = avformat_find_stream_info(frmCtx, nullptr)) < 0) {
        printErrorMessage(QString("Could not find stream info (error '%1')")
                          .arg(error2string(error)));
        avformat_close_input(&frmCtx);
        return error;
    }

    // Find audio stream index
    auto getAudioStreamIndex = [](AVFormatContext *frmCtx) -> qint32
    {
        if (frmCtx->nb_streams != 1) {
            for (quint32 i = 0; i < frmCtx->nb_streams; ++i) {
                if (frmCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
                    return i;
                }
            }
        }
        return 0;
    };

    if (frmCtx->streams[m_streamIndex =
            getAudioStreamIndex(frmCtx)]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO) {
        avformat_close_input(&frmCtx);
        printErrorMessage(QString("The audio stream was not found"));
        return -1;
    }

    AVCodec *codec = nullptr;
    // Find a decoder for the audio stream.
    if (!(codec = avcodec_find_decoder(
              frmCtx->streams[m_streamIndex]->codecpar->codec_id))) {
        printErrorMessage(QString("Could not find input codec"));
        avformat_close_input(&frmCtx);
        return -1;
    }

    // Allocate a new decoding context.
    avctx = avcodec_alloc_context3(codec);
    if (!avctx) {
        printErrorMessage(QString("Could not allocate a decoding context"));
        avformat_close_input(&frmCtx);
        return AVERROR(ENOMEM);
    }

    // Initialize the stream parameters with demuxer information.
    error = avcodec_parameters_to_context(
                avctx, frmCtx->streams[m_streamIndex]->codecpar);
    if (error < 0) {
        avformat_close_input(&frmCtx);
        avcodec_free_context(&avctx);
        return error;
    }

    // Open the decoder for the audio stream to use it later.
    if ((error = avcodec_open2(avctx, codec, nullptr)) < 0) {
        printErrorMessage(QString("Could not open input codec: "), error);
        avcodec_free_context(&avctx);
        avformat_close_input(&frmCtx);
        return error;
    }

    // Save the decoder context for easier access later.
    p_iCdcCtx.reset(avctx);
    p_frmCtx.reset(frmCtx);

    // Print detailed information about the input format
    av_dump_format(p_frmCtx.data(), 0,
                   p_settings->inputFile().toStdString().c_str(), 0);
    return 0;
}

AudioDecoder::~AudioDecoder(void)
{
    term();
}

qint32 AudioDecoder::term(void) noexcept
{
    if (!m_initialized) {
        return -1;
    }
    if (p_frmCtx   != nullptr) {
        p_frmCtx.reset();
    }
    if (p_iCdcCtx  != nullptr) {
        p_iCdcCtx.reset();
    }
    if (p_oCdcCtx  != nullptr) {
        p_oCdcCtx.reset();
    }
    if (p_settings != nullptr) {
        p_settings.reset();
    }
    m_initialized = false;
    return (p_frmCtx && p_iCdcCtx && p_oCdcCtx && p_settings) ? -1 : 0;
}

qint32 AudioDecoder::init(void) noexcept
{
    if (m_initialized) {
        return 0;
    }
    if (p_settings->inputFile().isEmpty()) {
        return -1;
    }
    if (p_settings->audioCodec().isEmpty()) {
        return -1;
    }
    if (openInputStream() < 0) {
        return -1;
    }
    if (openEncoderForStream() < 0) {
        return -1;
    }
    if (initResampler() < 0) {
        return -1;
    }

    m_initialized = true;
    return 0;
}

qint32 AudioDecoder::openEncoderForStream(void) noexcept
{
    AVCodecContext *avctx = nullptr;
    AVCodec        *codec = nullptr;
    qint32          error = 0;

    // Set the basic encoder parameters.
    const quint32 sampleRate   = p_settings->sampleRate()   > 0
            ? p_settings->sampleRate()   : p_iCdcCtx->sample_rate;
    const quint16 channelCount = p_settings->channelCount() > 0
            ? p_settings->channelCount() : p_iCdcCtx->channels;
    const quint32 constBitRate = p_settings->constBitRate() > 0
            ? p_settings->constBitRate() : p_iCdcCtx->bit_rate;
    const QString encodeName   = p_settings->audioCodec() == "copy"
            ? QString(p_iCdcCtx->codec->name) : p_settings->audioCodec();

    if (!(codec = avcodec_find_encoder_by_name(
              encodeName.toStdString().c_str()))) {
        printErrorMessage(QString(
            "Could not find an %1 encoder").arg(p_settings->audioCodec()));
        return -1;
    }

    avctx = avcodec_alloc_context3(codec);
    if (!avctx) {
        printErrorMessage(QString("Could not allocate an encoding context"));
        avcodec_free_context(&avctx);
        return -1;
    }

    if (!codec->sample_fmts) {
        avcodec_free_context(&avctx);
        return -1;
    }

    avctx->channels              = channelCount;
    avctx->channel_layout        = av_get_default_channel_layout(channelCount);
    avctx->sample_rate           = sampleRate;
    avctx->bit_rate              = constBitRate;
    avctx->sample_fmt            = codec->sample_fmts[0];
    // Set the sample rate for the container.
    avctx->time_base.den         = sampleRate;
    avctx->time_base.num         = 1;
    // Allow the use of the experimental encoder.
    avctx->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;

    // Open the encoder for the audio stream to use it later.
    if ((error = avcodec_open2(avctx, codec, nullptr)) < 0) {
        printErrorMessage(QString("Could not open output codec (error '%1')")
                          .arg(error2string(error)));
        avcodec_free_context(&avctx);
        return -1;
    }

    p_oCdcCtx.reset(avctx);
    return 0;
}

qint32 AudioDecoder::decodeAudioFrame(AVFrame *frame)
{
    // Packet used for temporary storage.
    AVPacket input_packet;
    qint32 error = 0;
    initPacket(&input_packet);

    // Read one audio frame from the input file into a temporary packet.
    if ((error = av_read_frame(p_frmCtx.data(), &input_packet)) < 0) {
        // If we are at the end of the file, flush the decoder below.
        if (error == AVERROR_EOF) {
            m_edf = true;
            return 0;
        }
        else {
            printErrorMessage(QString("Could not read frame (error '%1')")
                              .arg(error2string(error)));
            return error;
        }
    }

    if (input_packet.stream_index != m_streamIndex) {
        av_packet_unref(&input_packet);
        return -1;
    }

    // Send the audio frame stored in the temporary packet to the decoder.
    // The input audio stream decoder is used to do this.
    if ((error = avcodec_send_packet(p_iCdcCtx.data(), &input_packet)) < 0) {
        printErrorMessage(QString("Could not send packet for decoding (error '%1')")
                          .arg(error2string(error)));
        return error;
    }

    // Receive one frame from the decoder.
    error = avcodec_receive_frame(p_iCdcCtx.data(), frame);
    // If the decoder asks for more data to be able to decode a frame,
    // return indicating that no data is present.

    if (error == AVERROR(EAGAIN)) {
        error = 0;
    // If the end of the input file is reached, stop decoding.
    } else if (error == AVERROR_EOF) {
        m_edf = true;
        error = 0;
    } else if (error < 0) {
        printErrorMessage(QString("Could not decode frame (error '%1')")
                          .arg(error2string(error)));
    } else {
        error = 0;
    }
    av_packet_unref(&input_packet);
    return error;
}

qint32 AudioDecoder::encodeAudioFrame(AVFrame *frame)
{
    /* Packet used for temporary storage. */
    AVPacket output_packet;
    int error;
    initPacket(&output_packet);
    // Send the audio frame stored in the temporary packet to the encoder.
    // The output audio stream encoder is used to do this.
    error = avcodec_send_frame(p_oCdcCtx.data(), frame);
    // The encoder signals that it has nothing more to encode.
    if (error == AVERROR_EOF) {
        error = 0;
    } else if (error < 0) {
        printErrorMessage(QString("Could not send packet for encoding (error '%1')")
                          .arg(error2string(error)));
    }
    else {

        // Receive one encoded frame from the encoder.
        error = avcodec_receive_packet(p_oCdcCtx.data(), &output_packet);
        // If the encoder asks for more data to be able to provide an
        // encoded frame, return indicating that no data is present.
        if (error == AVERROR(EAGAIN)) {
            error = 0;
        /* If the last frame has been encoded, stop encoding. */
        } else if (error == AVERROR_EOF) {
            error = 0;
        } else if (error < 0) {
            printErrorMessage(QString("Could not encode frame (error '%1')")
                              .arg(error2string(error)));
        } else {

            // Copy packet
            // output_packet.pts      = av_rescale_q_rnd(output_packet.pts,  p_iCdcCtx->time_base, p_oCdcCtx->time_base, (enum AVRounding) (AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX) );
            // output_packet.dts      = av_rescale_q_rnd(output_packet.dts,  p_iCdcCtx->time_base, p_oCdcCtx->time_base, (enum AVRounding) (AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX) );
            // output_packet.duration = av_rescale_q(output_packet.duration, p_iCdcCtx->time_base, p_oCdcCtx->time_base);
            // output_packet.pos      = -1;

            // Save decoded - encoded audio data
            for (int i = 0; i < output_packet.size; ++i) {
                m_buffer.push_back(output_packet.data[i]);
            }
        }
    }
    av_packet_unref(&output_packet);
    return error;
}

QByteArray AudioDecoder::get() noexcept
{
    AVFrame *frame = nullptr;
    if (initInputFrame(&frame) < 0) {
        return m_buffer;
    }

    while (!m_edf) {
        if (decodeAudioFrame(frame) < 0) {
            av_frame_free(&frame);
            return m_buffer;
        }

        // Incomplete attempt: buffers for converted samples are allocated here
        // but never filled or used yet.
        uint8_t **converted_input_samples = nullptr;
        if (initConvertedSamples(&converted_input_samples, frame->nb_samples) < 0) {
            if (converted_input_samples) {
                av_freep(&converted_input_samples[0]);
                free(converted_input_samples);
            }
            av_frame_free(&frame);
            return {};
        }



        if (encodeAudioFrame(frame) < 0) {
            av_frame_free(&frame);
            return m_buffer;
        }
        av_frame_unref(frame);
    }
    av_frame_free(&frame);
    return m_buffer;
}

qint32 AudioDecoder::initResampler(void)
{
    qint32 error = 0;
    // Create a resampler context for the conversion.
    // Set the conversion parameters. Default channel layouts based on the number of channels
    // are assumed for simplicity (they are sometimes not detected properly by the demuxer and/or decoder).
    swrCtx = swr_alloc_set_opts(
                nullptr,
                av_get_default_channel_layout(p_oCdcCtx->channels), p_oCdcCtx->sample_fmt, p_oCdcCtx->sample_rate,
                av_get_default_channel_layout(p_iCdcCtx->channels), p_iCdcCtx->sample_fmt, p_iCdcCtx->sample_rate,  0,
                nullptr);
    if (!swrCtx) {
        printErrorMessage(QString("Could not allocate resample context"));
        return AVERROR(ENOMEM);
    }

    // Perform a sanity check so that the number of converted samples is
    // not greater than the number of samples to be converted.
    // If the sample rates differ, this case has to be handled differently
    av_assert0(p_oCdcCtx->sample_rate == p_iCdcCtx->sample_rate);
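    // NOTE: this assert rules out the 8000 -> 44100 Hz case described above.
    // When the sample rates differ, the output sample count must instead be
    // derived per input frame, e.g.:
    //   av_rescale_rnd(swr_get_delay(swrCtx, in_rate) + frame->nb_samples,
    //                  out_rate, in_rate, AV_ROUND_UP);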

    // Open the resampler with the specified parameters.
    if ((error = swr_init(swrCtx)) < 0) {
        printErrorMessage(QString("Could not open resample context"));
        swr_free(&swrCtx);
        return error;
    }
    return 0;
}

qint32 AudioDecoder::initConvertedSamples(uint8_t ***converted_input_samples, int frame_size)
{
    qint32 error = 0;
    // Allocate as many pointers as there are audio channels.
    // Each pointer will later point to the audio samples of the corresponding
    // channels (although it may be NULL for interleaved formats).
    if (!(*converted_input_samples =
            (uint8_t **) calloc(p_oCdcCtx->channels, sizeof(**converted_input_samples)))) {
        printErrorMessage("Could not allocate converted input sample pointers");
        return AVERROR(ENOMEM);
    }
    
    // Allocate memory for the samples of all channels in one consecutive
    // block for convenience
    if ((error = av_samples_alloc(
             *converted_input_samples,
             nullptr,
             p_oCdcCtx->channels,
             frame_size,
             p_oCdcCtx->sample_fmt,
             0)) < 0) {

        printErrorMessage(QString("Could not allocate converted input samples (error '%1')")
                          .arg(error2string(error)));
        av_freep(&(*converted_input_samples)[0]);
        free(*converted_input_samples);
        return error;
    }
    return 0;
}

    This is where I have to implement the resampling before sending data to the encoder:

    QByteArray AudioDecoder::get() noexcept
{
    AVFrame *frame = nullptr;
    if (initInputFrame(&frame) < 0) {
        return m_buffer;
    }

    while (!m_edf) {
        if (decodeAudioFrame(frame) < 0) {
            av_frame_free(&frame);
            return m_buffer;
        }

        // ???
        // ???
        // ???
        // This is where I have to implement data 
        // resampling before sending it to the encoder
        // ???
        // ???
        // ???

        if (encodeAudioFrame(frame) < 0) {
            av_frame_free(&frame);
            return m_buffer;
        }
        av_frame_unref(frame);
    }
    av_frame_free(&frame);
    return m_buffer;
}
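
    A minimal sketch of that missing step, assuming swrCtx was initialized in initResampler() with the equal-sample-rate assert removed, and using the same FFmpeg 4.x API as the rest of this code: swr_convert_frame() converts sample format, channel layout and sample rate in one call, and allocates the output frame’s buffers itself when it has none.

// Sketch: resample the decoded frame to the encoder's format before encoding.
AVFrame *out = av_frame_alloc();
out->channel_layout = p_oCdcCtx->channel_layout;
out->format         = p_oCdcCtx->sample_fmt;
out->sample_rate    = p_oCdcCtx->sample_rate;
int err = swr_convert_frame(swrCtx, out, frame);
if (err >= 0 && out->nb_samples > 0) {
    err = encodeAudioFrame(out);
}
av_frame_free(&out);

    Note that when the rates differ, the resampler buffers a few samples internally; they can be flushed at end of stream with swr_convert_frame(swrCtx, out, nullptr). A compressed output codec with a fixed frame_size (AAC, for example) would additionally need an AVAudioFifo, as in FFmpeg’s transcode_aac example; a PCM encoder accepts frames of any size.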

  • Formatting ffmpeg arguments correctly in Swift

    22 October 2019, by NCrusher

    I deleted my previous post about this because I’ve gone through a lot more trial and error on it and I needed to be sure I was giving current and relevant information.

    I’m trying to create a very simple audio conversion app for MacOS using ffmpeg. Because it’s geared at audiobooks, the audio options are pretty basic.

    func conversionSelection() {
       if inputFileUrl != nil {
           let conversionChoice = conversionOptionsPopup.indexOfSelectedItem
           switch conversionChoice {
               case 1 :
                   outputExtension = ".mp3"
                   ffmpegFilters = ["-c:a libmp3lame", "-ac 1", "-ar 22050", "-q:a 9"]
               case 2 :
                   outputExtension = ".mp3"
                   ffmpegFilters = ["-c:a libmp3lame", "-ac 2", "-ar 44100", "-q:a 5"]
               case 3 :
                   outputExtension = ".mp3"
                   ffmpegFilters = ["-c:a libmp3lame", "-ac 1", "-ar 22050", "-b:a 32k"]
               case 4:
                   outputExtension = ".flac"
                   ffmpegFilters = ["-c:a flac"]
               default :
                   outputExtension = ".m4b"
                   ffmpegFilters = ["-c copy"]
           }
       }
    }

    I don’t want to inundate this post with code, but I’m not sure how much is necessary for people to help me troubleshoot this problem. This is the code that sets up my input and output paths:

    func updateOutputText(outputString: String, outputField: NSTextField) {
       if inputFileUrl != nil && outputDirectoryUrl == nil {
           // derive output path and filename from input and tack on a new extension if a different conversion format is chosen
           let outputFileUrl = inputFileUrl!.deletingPathExtension()
           let outputPath = outputFileUrl.path
           outputFilePath = outputPath + "\(outputExtension)"
       } else if inputFileUrl != nil && outputDirectoryUrl != nil {
           // derive output directory from outputBrowseButton action, derive filename from input file, and derive output format from conversionSelection
           let outputFile = inputFileUrl!.deletingPathExtension()
           let outputFilename = outputFile.lastPathComponent
           let outputDirectory = outputDirectoryUrl!.path
           outputFilePath = outputDirectory + "/" + outputFilename + "\(outputExtension)"
       }
       outputTextDisplay.stringValue = outputFilePath
    }

    // update input and output text fields
    func updateInputText(inputString: String, inputField: NSTextField) {
       conversionSelection()
       inputTextDisplay.stringValue = inputFilePath
    }

    Everything on the Swift side appears to be working. The input and output Browse buttons work fine. The input and output file paths are written to text fields exactly as they should be. When I select a different conversion option, it updates the file extension for my output file.

    Here are my methods for actually launching ffmpeg:

    func ffmpegConvert(inputPath: String, filters: String, outputPath: String) {
       guard let launchPath = Bundle.main.path(forResource: "ffmpeg", ofType: "") else { return }
       do {
           let convertTask: Process = Process()
           convertTask.launchPath = launchPath
           convertTask.arguments = [
               "-i", inputPath,
               filters,
               outputPath
           ]
           convertTask.standardInput = FileHandle.nullDevice
           convertTask.launch()
           convertTask.waitUntilExit()
       }
    }
    @IBAction func startConversionClicked(_ sender: Any) {
       ffmpegConvert(inputPath: inputFilePath, filters: ffmpegFilters.joined(), outputPath: "outputFilePath")
    }

    The errors are coming from ffmpeg, but I’m fairly certain the problem is that I haven’t figured out how to format the arguments so that Swift will pass them on to ffmpeg properly.

    If I use the argument arrays exactly as they are formatted above, these are my results :

    When I choose the default option (.m4b, "-c copy"), I get this error:

    Unrecognized option ’c copy’.
    Error splitting the argument list: Option not found

    If I choose any of the other options (mp3 or flac), I first get a warning that reads

    Trailing options were found on the commandline.

    Then it will actually read the metadata of my input file (so I know my input path works, at least) and then tell me:

    At least one output file must be specified

    So while it’s reading my input file correctly, perhaps it’s not reading my output file path?

    Moving on.

    I then put double hyphens in the argument strings, thinking that perhaps the reason ffmpeg was reading "-c copy" as "c copy" is because the hyphen is a mathematical operator. My results are as follows:

    (Still doesn’t read metadata)
    Unrecognized option ’-c copy’.
    Error splitting the argument list: Option not found

    (choosing .mp3 output options)
    (doesn’t read metadata this time)
    Unrecognized option ’-c:a libmp3lame--ac 1--ar 22050--q:a 9’.
    Error splitting the argument list: Option not found

    (choosing .flac output option)
    (doesn’t read metadata this time)
    Unrecognized option ’-c:a flac’.
    Error splitting the argument list: Option not found

    So. No help there. Time for a different approach.

    This time, I added whitespace in front of my arguments. Results:

    (reads metadata)
    [NULL @ 0x106800000] Unable to find a suitable output format for ’ -c copy’
     -c copy: Invalid argument

    (reads metadata)
    [NULL @ 0x10b004400] Unable to find a suitable output format for ’ -c:a libmp3lame -ac 1 -ar 22050 -q:a 9’
     -c:a libmp3lame -ac 1 -ar 22050 -q:a 9: Invalid argument

    (reads metadata)
    [NULL @ 0x10580c400] Unable to find a suitable output format for ’ -c:a flac’
     -c:a flac: Invalid argument

    So, again, I have fairly decent confirmation that it’s reading my input file correctly, but possibly not my output file. That might make sense, in that stringing together the output file path is more complex than for the input file, but I have visual confirmation that the output path is accurate: my outputTextDisplay box displays outputFilePath perfectly.

    I really have no idea what isn’t working here. I’m sorry for the long post, but I’m not able to narrow the pertinent information down any further.
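
    For what it’s worth, Process passes each element of its arguments array to ffmpeg as exactly one argv entry, so a joined string such as "-c:a libmp3lame" (or several filters concatenated by joined()) reaches ffmpeg as a single, unrecognizable option. A sketch of the tokenized form, reusing the names above (note it also passes the outputFilePath variable rather than the literal string "outputFilePath"):

    // Each flag and each value is its own array element.
    ffmpegFilters = ["-c:a", "libmp3lame", "-ac", "1", "-ar", "22050", "-q:a", "9"]

    func ffmpegConvert(inputPath: String, filters: [String], outputPath: String) {
       guard let launchPath = Bundle.main.path(forResource: "ffmpeg", ofType: "") else { return }
       let convertTask = Process()
       convertTask.launchPath = launchPath
       // Each element becomes exactly one argv entry for ffmpeg.
       convertTask.arguments = ["-i", inputPath] + filters + [outputPath]
       convertTask.standardInput = FileHandle.nullDevice
       convertTask.launch()
       convertTask.waitUntilExit()
    }

    // ffmpegConvert(inputPath: inputFilePath, filters: ffmpegFilters, outputPath: outputFilePath)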

  • ffmpeg video encoder skips first frames [duplicate]

    19 October 2022, by Eduard Barnoviciu

    I am new to ffmpeg. I am trying to run this simple video encoding example:

#include <iostream>
#include <vector>
// FFmpeg
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libavutil/imgutils.h>
#include <libswscale/swscale.h>
}
// OpenCV
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>


int main(int argc, char* argv[])
{
    if (argc < 2) {
        std::cout << "Usage: cv2ff <outfile>" << std::endl;
        return 1;
    }
    const char* outfile = argv[1];

 // av_log_set_level(AV_LOG_DEBUG);
    int ret;

    const int dst_width = 640;
    const int dst_height = 480;
    const AVRational dst_fps = {30, 1};

    // initialize OpenCV capture as input frame generator
    cv::VideoCapture cvcap(0);
    if (!cvcap.isOpened()) {
        std::cerr << "fail to open cv::VideoCapture";
        return 2;
    }
    cvcap.set(cv::CAP_PROP_FRAME_WIDTH, dst_width);
    cvcap.set(cv::CAP_PROP_FRAME_HEIGHT, dst_height);
    cvcap.set(cv::CAP_PROP_FPS, dst_fps.num);
    // some device ignore above parameters for capturing image,
    // so we query actual parameters for image rescaler.
    const int cv_width = cvcap.get(cv::CAP_PROP_FRAME_WIDTH);
    const int cv_height = cvcap.get(cv::CAP_PROP_FRAME_HEIGHT);
    const int cv_fps = cvcap.get(cv::CAP_PROP_FPS);

    // open output format context
    AVFormatContext* outctx = nullptr;
    ret = avformat_alloc_output_context2(&outctx, nullptr, nullptr, outfile);
    if (ret < 0) {
        std::cerr << "fail to avformat_alloc_output_context2(" << outfile << "): ret=" << ret;
        return 2;
    }

    // create new video stream
    AVCodec* vcodec = avcodec_find_encoder(outctx->oformat->video_codec);
    AVStream* vstrm = avformat_new_stream(outctx, vcodec);
    if (!vstrm) {
        std::cerr << "fail to avformat_new_stream";
        return 2;
    }

    // open video encoder
    AVCodecContext* cctx = avcodec_alloc_context3(vcodec);
    if (!vstrm) {
        std::cerr << "fail to avcodec_alloc_context3";
        return 2;
    }
    cctx->width = dst_width;
    cctx->height = dst_height;
    cctx->pix_fmt = vcodec->pix_fmts[0];
    cctx->time_base = av_inv_q(dst_fps);
    cctx->framerate = dst_fps;
    if (outctx->oformat->flags & AVFMT_GLOBALHEADER)
        cctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
    ret = avcodec_open2(cctx, vcodec, nullptr);
    if (ret < 0) {
        std::cerr << "fail to avcodec_open2: ret=" << ret;
        return 2;
    }
    avcodec_parameters_from_context(vstrm->codecpar, cctx);

    // initialize sample scaler
    SwsContext* swsctx = sws_getContext(
        cv_width, cv_height, AV_PIX_FMT_BGR24,
        dst_width, dst_height, cctx->pix_fmt,
        SWS_BILINEAR, nullptr, nullptr, nullptr);
    if (!swsctx) {
        std::cerr << "fail to sws_getContext";
        return 2;
    }

    // allocate frame buffer for encoding
    AVFrame* frame = av_frame_alloc();
    frame->width = dst_width;
    frame->height = dst_height;
    frame->format = static_cast<int>(cctx->pix_fmt);
    ret = av_frame_get_buffer(frame, 32);
    if (ret < 0) {
        std::cerr << "fail to av_frame_get_buffer: ret=" << ret;
        return 2;
    }

    // allocate packet to retrive encoded frame
    AVPacket* pkt = av_packet_alloc();

    // open output IO context
    ret = avio_open2(&outctx->pb, outfile, AVIO_FLAG_WRITE, nullptr, nullptr);
    if (ret < 0) {
        std::cerr << "fail to avio_open2: ret=" << ret;
        return 2;
    }

    std::cout
        << "camera:  " << cv_width << 'x' << cv_height << '@' << cv_fps << "\n"
        << "outfile: " << outfile << "\n"
        << "format:  " << outctx->oformat->name << "\n"
        << "vcodec:  " << vcodec->name << "\n"
        << "size:    " << dst_width << 'x' << dst_height << "\n"
        << "fps:     " << av_q2d(cctx->framerate) << "\n"
        << "pixfmt:  " << av_get_pix_fmt_name(cctx->pix_fmt) << "\n"
        << std::flush;

    // write media container header (if any)
    ret = avformat_write_header(outctx, nullptr);
    if (ret < 0) {
        std::cerr << "fail to avformat_write_header: ret=" << ret;
        return 2;
    }

    cv::Mat image;

    // encoding loop
    int64_t frame_pts = 0;
    unsigned nb_frames = 0;
    bool end_of_stream = false;
    for (;;) {
        if (!end_of_stream) {
            // retrieve source image
            cvcap >> image;
            cv::imshow("press ESC to exit", image);
            if (cv::waitKey(33) == 0x1b) {
                // flush encoder
                avcodec_send_frame(cctx, nullptr);
                end_of_stream = true;
            }
        }
        if (!end_of_stream) {
            // convert cv::Mat(OpenCV) to AVFrame(FFmpeg)
            const int stride[4] = { static_cast<int>(image.step[0]) };
            sws_scale(swsctx, &image.data, stride, 0, image.rows, frame->data, frame->linesize);
            frame->pts = frame_pts++;
            // encode video frame
            ret = avcodec_send_frame(cctx, frame);
            if (ret < 0) {
                std::cerr << "fail to avcodec_send_frame: ret=" << ret << "\n";
                break;
            }
        }
        while ((ret = avcodec_receive_packet(cctx, pkt)) >= 0) {
            // rescale packet timestamp
            pkt->duration = 1;
            av_packet_rescale_ts(pkt, cctx->time_base, vstrm->time_base);
            // write encoded packet
            av_write_frame(outctx, pkt);
            av_packet_unref(pkt);
            std::cout << nb_frames << '\r' << std::flush;  // dump progress
            ++nb_frames;
        }
        if (ret == AVERROR_EOF)
            break;
    };
    std::cout << nb_frames << " frames encoded" << std::endl;

    // write trailer and close file
    av_write_trailer(outctx);
    avio_close(outctx->pb);

    av_packet_free(&pkt);
    av_frame_free(&frame);
    sws_freeContext(swsctx);
    avcodec_free_context(&cctx);
    avformat_free_context(outctx);
    return 0;
}

    The problem is that with codecs such as HEVC/H.265 or VP9, the encoder always drops the first 27 frames.

    More exactly, at line 163:

      while ((ret = avcodec_receive_packet(cctx, pkt)) >= 0) {

    ret is equal to -11 (AVERROR(EAGAIN)) and execution never enters the while loop. From that point onward it’s always equal to 0 and no issues are found.

    If I use MPEG4 for example, ret is 0 from the start and no frames are dropped.
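
    For context, AVERROR(EAGAIN) from avcodec_receive_packet() means the encoder simply has no packet ready yet: lookahead encoders such as x265 or libvpx buffer a number of input frames before emitting their first packet, so those frames come out later rather than being dropped, provided the encoder is drained at end of stream. A minimal sketch of that drain (the code above already triggers it by sending a nullptr frame on ESC):

// Signal end of stream once, then drain every packet the encoder buffered.
avcodec_send_frame(cctx, nullptr);
while (avcodec_receive_packet(cctx, pkt) == 0) {
    pkt->duration = 1;
    av_packet_rescale_ts(pkt, cctx->time_base, vstrm->time_base);
    av_write_frame(outctx, pkt);
    av_packet_unref(pkt);
}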
