  • H.264 muxed to MP4 using libavformat not playing back

    14 May 2015, by Brad Mitchell

    I am trying to mux H.264 data into an MP4 file. There appear to be no errors in saving this H.264 Annex B data out to an MP4 file, but the file fails to play back.

    I’ve done a binary comparison on the files and the issue seems to be somewhere in what is being written to the footer (trailer) of the MP4 file.

    I suspect the problem lies in the way the stream is being created.

    Init:

    AVOutputFormat* fmt = av_guess_format( 0, "out.mp4", 0 );
    oc = avformat_alloc_context();
    oc->oformat = fmt;
    strcpy(oc->filename, filename);

    Part of this prototype app creates a PNG file for each I-frame. So when the first I-frame is encountered, I create the video stream, write the AV header, etc.:

    void addVideoStream(AVCodecContext* decoder)
    {
       videoStream = av_new_stream(oc, 0);
       if (!videoStream)
       {
            cout << "ERROR creating video stream" << endl;
            return;        
       }
       vi = videoStream->index;    
       videoContext = videoStream->codec;      
       videoContext->codec_type = AVMEDIA_TYPE_VIDEO;
       videoContext->codec_id = decoder->codec_id;
       videoContext->bit_rate = 512000;
       videoContext->width = decoder->width;
       videoContext->height = decoder->height;
       videoContext->time_base.den = 25;
       videoContext->time_base.num = 1;    
       videoContext->gop_size = decoder->gop_size;
       videoContext->pix_fmt = decoder->pix_fmt;      

       if (oc->oformat->flags & AVFMT_GLOBALHEADER)
           videoContext->flags |= CODEC_FLAG_GLOBAL_HEADER;

       av_dump_format(oc, 0, filename, 1);

       if (!(oc->oformat->flags & AVFMT_NOFILE))
       {
           if (avio_open(&oc->pb, filename, AVIO_FLAG_WRITE) < 0) {
               cout << "Error opening file" << endl;
           }
       }
       avformat_write_header(oc, NULL);
    }

    I write packets out:

    unsigned char* data = block->getData();
    unsigned char videoFrameType = data[4];
    int dataLen = block->getDataLen();

    // store pps
    if (videoFrameType == 0x68)
    {
       if (ppsFrame != NULL)
       {
           delete[] ppsFrame; ppsFrameLength = 0; ppsFrame = NULL;
       }
       ppsFrameLength = block->getDataLen();
       ppsFrame = new unsigned char[ppsFrameLength];
       memcpy(ppsFrame, block->getData(), ppsFrameLength);
    }
    else if (videoFrameType == 0x67)
    {
       // sps
       if (spsFrame != NULL)
       {
           delete[] spsFrame; spsFrameLength = 0; spsFrame = NULL;
       }
       spsFrameLength = block->getDataLen();
       spsFrame = new unsigned char[spsFrameLength];
       memcpy(spsFrame, block->getData(), spsFrameLength);                
    }                                          

    if (videoFrameType == 0x65 || videoFrameType == 0x41)
    {
       videoFrameNumber++;
    }
    if (videoFrameType == 0x65)
    {
       decodeIFrame(videoFrameNumber, spsFrame, spsFrameLength, ppsFrame, ppsFrameLength, data, dataLen);
    }

    if (videoStream != NULL)
    {
       AVPacket pkt = { 0 };
       av_init_packet(&pkt);
       pkt.stream_index = vi;
       pkt.flags = 0;                      
       pkt.pts = pkt.dts = 0;                                  

       if (videoFrameType == 0x65)
       {
           // combine the SPS PPS & I frames together
           pkt.flags |= AV_PKT_FLAG_KEY;                                                  
           unsigned char* videoFrame = new unsigned char[spsFrameLength+ppsFrameLength+dataLen];
           memcpy(videoFrame, spsFrame, spsFrameLength);
           memcpy(&videoFrame[spsFrameLength], ppsFrame, ppsFrameLength);
           memcpy(&videoFrame[spsFrameLength+ppsFrameLength], data, dataLen);

           // overwrite the start code (00 00 00 01 with a 32-bit length)
           setLength(videoFrame, spsFrameLength-4);
           setLength(&videoFrame[spsFrameLength], ppsFrameLength-4);
           setLength(&videoFrame[spsFrameLength+ppsFrameLength], dataLen-4);
           pkt.size = dataLen + spsFrameLength + ppsFrameLength;
           pkt.data = videoFrame;
           av_interleaved_write_frame(oc, &pkt);
           delete[] videoFrame; videoFrame = NULL;
       }
       else if (videoFrameType != 0x67 && videoFrameType != 0x68)
       {
           // Send other frames except pps & sps which are caught and stored
           pkt.size = dataLen;
           pkt.data = data;
           setLength(data, dataLen-4);
           av_interleaved_write_frame(oc, &pkt);
       }
    }
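
    A note on the frame-type checks above: the values 0x67, 0x68, 0x65 and 0x41 are whole NAL header bytes, so they also encode nal_ref_idc. The nal_unit_type itself is only the low five bits of that byte. The sketch below classifies NAL units by type alone, which would also match a P slice sent with header 0x61 or 0x21. The helper name nalUnitType and the NalType enum are hypothetical and not part of the original code, and the sketch assumes each block begins with a 4-byte start code, as the code above does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// nal_unit_type values from the H.264 specification (only the ones used here).
enum NalType : uint8_t {
    NAL_SLICE_NON_IDR = 1,  // e.g. header byte 0x41, 0x61 or 0x21
    NAL_SLICE_IDR     = 5,  // e.g. header byte 0x65
    NAL_SPS           = 7,  // e.g. header byte 0x67
    NAL_PPS           = 8   // e.g. header byte 0x68
};

// Hypothetical helper: returns the nal_unit_type of a buffer that starts
// with a 4-byte Annex B start code (00 00 00 01).
inline uint8_t nalUnitType(const unsigned char* data, size_t len)
{
    assert(data != nullptr && len > 4);
    // Low 5 bits are nal_unit_type; the top 3 bits are
    // forbidden_zero_bit (1 bit) and nal_ref_idc (2 bits).
    return data[4] & 0x1F;
}
```

    With this, a check such as videoFrameType == 0x65 could be written as nalUnitType(data, dataLen) == NAL_SLICE_IDR.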

    Finally, to close the file off:

    av_write_trailer(oc);
    int i = 0;
    for (i = 0; i < oc->nb_streams; i++)
    {
       av_freep(&oc->streams[i]->codec);
       av_freep(&oc->streams[i]);      
    }

    if (!(oc->oformat->flags & AVFMT_NOFILE))
    {
       avio_close(oc->pb);
    }
    av_free(oc);

    If I take the H.264 data alone and convert it:

    ffmpeg -i recording.h264 -vcodec copy recording.mp4

    All but the "footer" of the files are the same.

    Output from my program:
    readrec recording.tcp out.mp4
    ** START * 01-03-2013 14:26:01 180000
    Output #0, mp4, to 'out.mp4':
    Stream #0:0: Video: h264, yuv420p, 352x288, q=2-31, 512 kb/s, 90k tbn, 25 tbc
    * END ** 01-03-2013 14:27:01 102000
    Wrote 1499 video frames.

    If I try to use ffmpeg to convert the MP4 file created by my code:

    ffmpeg -i out.mp4 -vcodec copy out2.mp4
    ffmpeg version 0.11.1 Copyright (c) 2000-2012 the FFmpeg developers
         built on Mar  7 2013 12:49:22 with suncc 0x5110
         configuration: --extra-cflags=-KPIC -g --disable-mmx
         --disable-protocol=udp --disable-encoder=nellymoser --cc=cc --cxx=CC
    libavutil      51. 54.100 / 51. 54.100
    libavcodec     54. 23.100 / 54. 23.100
    libavformat    54.  6.100 / 54.  6.100
    libavdevice    54.  0.100 / 54.  0.100
    libavfilter     2. 77.100 /  2. 77.100
    libswscale      2.  1.100 /  2.  1.100
    libswresample   0. 15.100 /  0. 15.100
    [h264 @ 12eaac0] no frame!
       Last message repeated 1 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 23 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 74 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 64 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 34 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 49 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 24 times
    [h264 @ 12eaac0] Partitioned H.264 support is incomplete
    [h264 @ 12eaac0] no frame!
       Last message repeated 23 times
    [h264 @ 12eaac0] sps_id out of range
    [h264 @ 12eaac0] no frame!
       Last message repeated 148 times
    [h264 @ 12eaac0] sps_id (32) out of range
       Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
       Last message repeated 33 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 128 times
    [h264 @ 12eaac0] sps_id (32) out of range
       Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
       Last message repeated 3 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 3 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
       Last message repeated 309 times
    [h264 @ 12eaac0] sps_id (32) out of range
       Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
       Last message repeated 192 times
    [h264 @ 12eaac0] Partitioned H.264 support is incomplete
    [h264 @ 12eaac0] no frame!
       Last message repeated 73 times
    [h264 @ 12eaac0] sps_id (32) out of range
       Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
       Last message repeated 99 times
    [h264 @ 12eaac0] sps_id (32) out of range
       Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
       Last message repeated 197 times
    [mov,mp4,m4a,3gp,3g2,mj2 @ 12e3100] decoding for stream 0 failed
    [mov,mp4,m4a,3gp,3g2,mj2 @ 12e3100] Could not find codec parameters
    (Video: h264 (avc1 / 0x31637661), 393539 kb/s)
    out.mp4: could not find codec parameters

    I really do not know where the issue is, except that it must have something to do with the way the streams are being set up. I’ve looked at code from other people doing a similar thing and tried to follow their approach in setting up the streams, but to no avail!


    The final code which gave me an H.264/AAC muxed (synced) file is as follows. First, a bit of background information. The data is coming from an IP camera and is presented via a 3rd party API as video/audio packets. The video packets are presented as the RTP payload data (no header) and consist of NALUs that are reconstructed and converted to H.264 video in Annex B format. AAC audio is presented as raw AAC and is converted to ADTS format to enable playback. These packets have been put into a bitstream format that allows the transmission of the timestamp (64-bit milliseconds since Jan 1 1970) along with a few other things.

    This is more or less a prototype and is not clean in any respect. It probably leaks badly. I do, however, hope this helps anyone else trying to achieve something similar.

    Globals:

    AVFormatContext* oc = NULL;
    AVCodecContext* videoContext = NULL;
    AVStream* videoStream = NULL;
    AVCodecContext* audioContext = NULL;
    AVStream* audioStream = NULL;
    AVCodec* videoCodec = NULL;
    AVCodec* audioCodec = NULL;
    int vi = 0;  // Video stream
    int ai = 1;  // Audio stream

    uint64_t firstVideoTimeStamp = 0;
    uint64_t firstAudioTimeStamp = 0;
    int audioStartOffset = 0;

    char* filename = NULL;

    Boolean first = TRUE;

    int videoFrameNumber = 0;
    int audioFrameNumber = 0;

    Main:

    int main(int argc, char* argv[])
    {
       if (argc != 3)
       {  
           cout << argv[0] << " <stream playback file> <output mp4 file>" << endl;
           return 0;
       }
       char* input_stream_file = argv[1];
       filename = argv[2];

       av_register_all();    

       fstream inFile;
       inFile.open(input_stream_file, ios::in);

       // Used to store the latest pps & sps frames
       unsigned char* ppsFrame = NULL;
       int ppsFrameLength = 0;
       unsigned char* spsFrame = NULL;
       int spsFrameLength = 0;

       // Setup MP4 output file
       AVOutputFormat* fmt = av_guess_format( 0, filename, 0 );
       oc = avformat_alloc_context();
       oc->oformat = fmt;
       strcpy(oc->filename, filename);

       // Setup the bitstream filter for AAC in adts format.  Could probably also achieve
       // this by stripping the first 7 bytes!
       AVBitStreamFilterContext* bsfc = av_bitstream_filter_init("aac_adtstoasc");
       if (!bsfc)
       {      
           cout << "Error creating adtstoasc filter" << endl;
           return -1;
       }

       while (inFile.good())
       {
           TcpAVDataBlock* block = new TcpAVDataBlock();
           block->readStruct(inFile);
           DateTime dt = block->getTimestampAsDateTime();
           switch (block->getPacketType())
           {
               case TCP_PACKET_H264:
               {      
                   if (firstVideoTimeStamp == 0)
                       firstVideoTimeStamp = block->getTimeStamp();
                   unsigned char* data = block->getData();
                   unsigned char videoFrameType = data[4];
                   int dataLen = block->getDataLen();

                   // pps
                   if (videoFrameType == 0x68)
                   {
                       if (ppsFrame != NULL)
                       {
                           delete[] ppsFrame; ppsFrameLength = 0;
                           ppsFrame = NULL;
                       }
                       ppsFrameLength = block->getDataLen();
                       ppsFrame = new unsigned char[ppsFrameLength];
                       memcpy(ppsFrame, block->getData(), ppsFrameLength);
                   }
                   else if (videoFrameType == 0x67)
                   {
                       // sps
                       if (spsFrame != NULL)
                       {
                           delete[] spsFrame; spsFrameLength = 0;
                           spsFrame = NULL;
                       }
                       spsFrameLength = block->getDataLen();
                       spsFrame = new unsigned char[spsFrameLength];
                       memcpy(spsFrame, block->getData(), spsFrameLength);                  
                   }                                          

                   if (videoFrameType == 0x65 || videoFrameType == 0x41)
                   {
                       videoFrameNumber++;
                   }
                   // Extract a thumbnail for each I-Frame
                   if (videoFrameType == 0x65)
                   {
                       decodeIFrame(h264, spsFrame, spsFrameLength, ppsFrame, ppsFrameLength, data, dataLen);
                   }
                   if (videoStream != NULL)
                   {
                       AVPacket pkt = { 0 };
                       av_init_packet(&pkt);
                       pkt.stream_index = vi;
                       pkt.flags = 0;          
                       pkt.pts = videoFrameNumber;
                       pkt.dts = videoFrameNumber;          
                       if (videoFrameType == 0x65)
                       {
                           pkt.flags = 1;                          

                           unsigned char* videoFrame = new unsigned char[spsFrameLength+ppsFrameLength+dataLen];
                           memcpy(videoFrame, spsFrame, spsFrameLength);
                           memcpy(&videoFrame[spsFrameLength], ppsFrame, ppsFrameLength);

                           memcpy(&videoFrame[spsFrameLength+ppsFrameLength], data, dataLen);
                           // size of the combined SPS + PPS + I-frame buffer, as in the earlier version
                           pkt.size = spsFrameLength + ppsFrameLength + dataLen;
                           pkt.data = videoFrame;
                           av_interleaved_write_frame(oc, &pkt);
                           delete[] videoFrame; videoFrame = NULL;
                       }
                       else if (videoFrameType != 0x67 && videoFrameType != 0x68)
                       {                      
                           pkt.size = dataLen;
                           pkt.data = data;
                           av_interleaved_write_frame(oc, &pkt);
                       }                      
                   }
                   break;
               }

           case TCP_PACKET_AAC:

               if (firstAudioTimeStamp == 0)
               {
                   firstAudioTimeStamp = block->getTimeStamp();
                   uint64_t millseconds_difference = firstAudioTimeStamp - firstVideoTimeStamp;
                   audioStartOffset = millseconds_difference * 16000 / 1000;
                   cout << "audio offset: " << audioStartOffset << endl;
               }

               if (audioStream != NULL)
               {
                   AVPacket pkt = { 0 };
                   av_init_packet(&pkt);
                   pkt.stream_index = ai;
                   pkt.flags = 1;          
                   pkt.pts = audioFrameNumber*1024;
                   pkt.dts = audioFrameNumber*1024;
                   pkt.data = block->getData();
                   pkt.size = block->getDataLen();
                   pkt.duration = 1024;

                   AVPacket newpacket = pkt;                      
                   int rc = av_bitstream_filter_filter(bsfc, audioContext,
                       NULL,
                       &newpacket.data, &newpacket.size,
                       pkt.data, pkt.size,
                       pkt.flags & AV_PKT_FLAG_KEY);

                   if (rc >= 0)
                   {
                       //cout << "Write audio frame" << endl;
                       newpacket.pts = audioFrameNumber*1024;
                       newpacket.dts = audioFrameNumber*1024;
                       audioFrameNumber++;
                       newpacket.duration = 1024;

                       av_interleaved_write_frame(oc, &newpacket);
                       av_free_packet(&newpacket);
                   }
                   else
                   {
                       cout << "Error filtering aac packet" << endl;

                   }
               }
               break;

           case TCP_PACKET_START:
               break;

           case TCP_PACKET_END:
               break;
           }
           delete block;
       }
       inFile.close();

       av_write_trailer(oc);
       int i = 0;
       for (i = 0; i < oc->nb_streams; i++)
       {
           av_freep(&oc->streams[i]->codec);
           av_freep(&oc->streams[i]);
       }

       if (!(oc->oformat->flags & AVFMT_NOFILE))
       {
           avio_close(oc->pb);
       }

       av_free(oc);

       delete[] spsFrame; spsFrame = NULL;
       delete[] ppsFrame; ppsFrame = NULL;

       cout << "Wrote " << videoFrameNumber << " video frames." << endl;

       return 0;
    }
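
    The comment in main() about stripping the first 7 bytes refers to the ADTS header that precedes each AAC frame. As a rough sketch of what that would involve, the helpers below (adtsHeaderSize and adtsFrameLength are hypothetical names, not part of the original code) check for the ADTS syncword and report the header size, which is 7 bytes only when no CRC is present and 9 bytes otherwise. Note that, as far as I know, the aac_adtstoasc filter also derives the AudioSpecificConfig for the muxer, which plain stripping would not do.

```cpp
#include <cassert>
#include <cstddef>

// Returns the ADTS header size in bytes (7, or 9 when a CRC follows),
// or 0 if the buffer does not start with an ADTS syncword.
inline size_t adtsHeaderSize(const unsigned char* data, size_t len)
{
    if (data == nullptr || len < 7) return 0;
    // The syncword is 12 bits, all set: 0xFFF.
    if (data[0] != 0xFF || (data[1] & 0xF0) != 0xF0) return 0;
    // The lowest bit of byte 1 is protection_absent (1 means no CRC).
    const bool protectionAbsent = (data[1] & 0x01) != 0;
    return protectionAbsent ? 7 : 9;
}

// Returns the 13-bit aac_frame_length field: header plus payload, in bytes.
inline size_t adtsFrameLength(const unsigned char* data, size_t len)
{
    assert(adtsHeaderSize(data, len) != 0);
    return ((data[3] & 0x03u) << 11) | (data[4] << 3) | (data[5] >> 5);
}
```

    The raw AAC payload of a frame then starts at data + adtsHeaderSize(data, len).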

    The streams/codecs are added and the header is created in a function called addVideoAndAudioStream(). This function is called from decodeIFrame(), so there are a few assumptions (which aren’t necessarily good):
    1. A video packet comes first
    2. AAC is present

    decodeIFrame was a separate prototype in which I was creating a thumbnail for each I-frame. The code to generate thumbnails was from: https://gnunet.org/svn/Extractor/src/plugins/thumbnailffmpeg_extractor.c

    The decodeIFrame function passes an AVCodecContext into addVideoAndAudioStream:

    void addVideoAndAudioStream(AVCodecContext* decoder = NULL)
    {
       videoStream = av_new_stream(oc, 0);
       if (!videoStream)
       {
           cout << "ERROR creating video stream" << endl;
           return;      
       }
       vi = videoStream->index;  
       videoContext = videoStream->codec;      
       videoContext->codec_type = AVMEDIA_TYPE_VIDEO;
       videoContext->codec_id = decoder->codec_id;
       videoContext->bit_rate = 512000;
       videoContext->width = decoder->width;
       videoContext->height = decoder->height;
       videoContext->time_base.den = 25;
       videoContext->time_base.num = 1;
       videoContext->gop_size = decoder->gop_size;
       videoContext->pix_fmt = decoder->pix_fmt;      

       audioStream = av_new_stream(oc, 1);
       if (!audioStream)
       {
           cout << "ERROR creating audio stream" << endl;
           return;
       }
       ai = audioStream->index;
       audioContext = audioStream->codec;
       audioContext->codec_type = AVMEDIA_TYPE_AUDIO;
       audioContext->codec_id = CODEC_ID_AAC;
       audioContext->bit_rate = 64000;
       audioContext->sample_rate = 16000;
       audioContext->channels = 1;

       if (oc->oformat->flags & AVFMT_GLOBALHEADER)
       {
           videoContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
           audioContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
       }

       av_dump_format(oc, 0, filename, 1);

       if (!(oc->oformat->flags & AVFMT_NOFILE))
       {
           if (avio_open(&oc->pb, filename, AVIO_FLAG_WRITE) < 0) {
               cout << "Error opening file" << endl;
           }
       }

       avformat_write_header(oc, NULL);
    }

    As far as I can tell, a number of assumptions didn’t seem to matter, for example:
    1. Bit rate. The actual video bit rate was 262k whereas I specified 512 kbit
    2. AAC channels. I specified mono, although from memory the actual output was stereo

    You would still need to know the frame rate (time base) for the video & audio.

    Contrary to a lot of other examples, the file was not playable when I set pts & dts on the video packets as those examples do. I needed to know the time base (25 fps) and then set the pts & dts according to that time base, i.e. first frame = 0 (PPS, SPS, I), second frame = 1 (intermediate frame, whatever it’s called ;)).

    For AAC I also had to assume that it was 16000 Hz, with 1024 samples per AAC packet (you can also have AAC at 960 samples, I think), to determine the audio "offset". I added this to the pts & dts, so the pts/dts are the sample number at which the packet is to be played back. You also need to make sure that the duration of 1024 is set in the packet before writing.
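
    The audio timing arithmetic described above can be sketched as follows. This is only an illustration under the same assumptions as the post (16000 Hz, 1024 samples per AAC packet, millisecond timestamps); the function and constant names are hypothetical. It follows the prose in adding the offset to every pts, whereas the main() listing computes audioStartOffset but writes audioFrameNumber*1024 without it.

```cpp
#include <cassert>
#include <cstdint>

const int kSampleRate      = 16000;  // assumed sample rate, as in the post
const int kSamplesPerFrame = 1024;   // assumed samples per AAC packet

// Converts the gap between the first video and first audio timestamps
// (both in milliseconds) into a number of audio samples.
inline int64_t audioStartOffsetSamples(uint64_t firstVideoMs, uint64_t firstAudioMs)
{
    const int64_t diffMs = static_cast<int64_t>(firstAudioMs)
                         - static_cast<int64_t>(firstVideoMs);
    return diffMs * kSampleRate / 1000;
}

// pts (and dts) of the n-th audio packet, expressed in samples.
inline int64_t audioPts(int64_t frameIndex, int64_t startOffsetSamples)
{
    assert(frameIndex >= 0);
    return startOffsetSamples + frameIndex * kSamplesPerFrame;
}
```

    For example, audio that starts 250 ms after the video gives an offset of 4000 samples, so the first three audio packets would get pts 4000, 5024 and 6048, each with a duration of 1024.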

    I have found additionally today that Annex B isn’t really compatible with any other player so AVCC format should really be used.

    These URLs helped:
    Problem to Decode H264 video over RTP with ffmpeg (libavcodec)
    http://aviadr1.blogspot.com.au/2010/05/h264-extradata-partially-explained-for.html

    When constructing the video stream, I filled out the extradata & extradata_size:

    // Extradata contains PPS & SPS for AVCC format
    int extradata_len = 8 + spsFrameLen-4 + 1 + 2 + ppsFrameLen-4;
    videoContext->extradata = (uint8_t*)av_mallocz(extradata_len);
    videoContext->extradata_size = extradata_len;
    videoContext->extradata[0] = 0x01;            // configuration version
    videoContext->extradata[1] = spsFrame[4+1];   // profile
    videoContext->extradata[2] = spsFrame[4+2];   // profile compatibility
    videoContext->extradata[3] = spsFrame[4+3];   // level
    videoContext->extradata[4] = 0xFC | 3;        // NAL length size - 1 (4-byte lengths)
    videoContext->extradata[5] = 0xE0 | 1;        // number of SPS (1)
    int tmp = spsFrameLen - 4;                    // SPS size without the start code
    videoContext->extradata[6] = (tmp >> 8) & 0x00ff;
    videoContext->extradata[7] = tmp & 0x00ff;
    int i = 0;
    for (i=0; i<tmp; i++) videoContext->extradata[8+i] = spsFrame[4+i];
    videoContext->extradata[8+tmp] = 0x01;        // number of PPS (1)
    int tmp2 = ppsFrameLen-4;                     // PPS size without the start code
    videoContext->extradata[8+tmp+1] = (tmp2 >> 8) & 0x00ff;
    videoContext->extradata[8+tmp+2] = tmp2 & 0x00ff;
    for (i=0; i<tmp2; i++) videoContext->extradata[8+tmp+3+i] = ppsFrame[4+i];
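
    The same byte layout can be written as a standalone function, which makes it easier to check in isolation. This is a minimal sketch, not the code I actually ran: buildAvcC is a hypothetical name, it returns a std::vector instead of writing into videoContext->extradata, and it assumes exactly one SPS and one PPS, each still carrying its 4-byte start code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds an avcC-style configuration record from one SPS and one PPS.
// Both inputs still include their 4-byte Annex B start code.
inline std::vector<uint8_t> buildAvcC(const unsigned char* sps, size_t spsLen,
                                      const unsigned char* pps, size_t ppsLen)
{
    assert(sps != nullptr && pps != nullptr);
    assert(spsLen >= 8 && ppsLen > 4);   // SPS must hold profile/compat/level
    const size_t spsSize = spsLen - 4;   // sizes without the start code
    const size_t ppsSize = ppsLen - 4;
    assert(spsSize <= 0xFFFF && ppsSize <= 0xFFFF);

    std::vector<uint8_t> out;
    out.reserve(8 + spsSize + 3 + ppsSize);
    out.push_back(0x01);                 // configuration version
    out.push_back(sps[5]);               // profile
    out.push_back(sps[6]);               // profile compatibility
    out.push_back(sps[7]);               // level
    out.push_back(0xFC | 3);             // NAL length size - 1 (4-byte lengths)
    out.push_back(0xE0 | 1);             // number of SPS (1)
    out.push_back(static_cast<uint8_t>((spsSize >> 8) & 0xFF));
    out.push_back(static_cast<uint8_t>(spsSize & 0xFF));
    out.insert(out.end(), sps + 4, sps + spsLen);
    out.push_back(0x01);                 // number of PPS (1)
    out.push_back(static_cast<uint8_t>((ppsSize >> 8) & 0xFF));
    out.push_back(static_cast<uint8_t>(ppsSize & 0xFF));
    out.insert(out.end(), pps + 4, pps + ppsLen);
    return out;
}
```

    The resulting bytes would then be copied into a buffer allocated with av_mallocz and assigned to extradata, with extradata_size set to the vector's size.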

    When writing out the frames, don’t prepend the SPS & PPS frames, just write out the I Frame & P frames. In addition, replace the Annex B start code contained in the first 4 bytes (0x00 0x00 0x00 0x01) with the size of the I/P frame.
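
    The start code replacement can be sketched like this. The setLength function used in the earlier listing is not shown in this post, so startCodeToLength below is a hypothetical stand-in; it assumes the buffer holds a single NAL unit preceded by a 4-byte start code (3-byte start codes are not handled).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Overwrites the 4-byte Annex B start code (00 00 00 01) at the front of
// the buffer with the payload length as a 32-bit big-endian integer.
inline void startCodeToLength(unsigned char* nal, size_t len)
{
    assert(nal != nullptr && len > 4);
    assert(nal[0] == 0 && nal[1] == 0 && nal[2] == 0 && nal[3] == 1);
    const uint32_t payload = static_cast<uint32_t>(len - 4);
    nal[0] = static_cast<unsigned char>((payload >> 24) & 0xFF);
    nal[1] = static_cast<unsigned char>((payload >> 16) & 0xFF);
    nal[2] = static_cast<unsigned char>((payload >> 8) & 0xFF);
    nal[3] = static_cast<unsigned char>(payload & 0xFF);
}
```

    It would be called on each I/P frame buffer just before the buffer is handed to av_interleaved_write_frame.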

  • Cuda Memory Management: re-using device memory from C calls (multithreaded, ffmpeg), but failing on cudaMemcpy

    4 March 2013, by Nuke Stollak

    I'm trying to CUDA-fy my ffmpeg filter that was taking over 90% of the CPU time, according to gprof. I first went from one core to OpenMP on 4 cores and got a 3.8x increase in frames encoded per second, but it's still too slow. CUDA seemed like the next natural step.

    I've gotten a modest (20%?) increase by replacing one of my filter's functions with a CUDA kernel call, and just to get things up and running, I was cudaMalloc'ing and cudaMemcpy'ing on each frame. I suspected I would get better results if I weren't doing this each frame, so before I go ahead and move the rest of my code to CUDA, I wanted to fix this by allocating the memory before my filter is called and freeing it afterwards, but the device memory isn't having it. I'm only storing the device memory locations outside of code that knows about CUDA; I'm not trying to use the data there, just save it for the next time I call a CUDA-aware function that needs it.

    Here's where I am so far:

    Environment: the latest AMI Linux on EC2's GPU Cluster, latest updates installed. Everything is fairly standard.

    My filter is split into two files: vf_myfilter.c (compiled by gcc, like almost every other file in ffmpeg) and vf_myfilter_cu.cu (compiled by nvcc). My Makefile's link step includes -lcudart and both .o files. I build vf_myfilter_cu.o using (as one line)

    nvcc -I. -I./ -I/opt/nvidia/cuda/include $(CPPFLAGS)
        -Xcompiler "$(CFLAGS)"
         -c -o libavfilter/vf_myfilter_cu.o libavfilter/vf_myfilter_cu.cu

    When the variables (set by configure) are expanded, here's what I get, again all in one line but split up here for easier reading. I just noticed the duplicate include path directives, but it shouldn't hurt.

    nvcc -I. -I./ -I/opt/nvidia/cuda/include -I. -I./ -D_ISOC99_SOURCE
       -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_POSIX_C_SOURCE=200112
       -D_XOPEN_SOURCE=600 -DHAVE_AV_CONFIG_H
       -Xcompiler "-fopenmp -std=c99 -fomit-frame-pointer -pthread -g
                   -Wdeclaration-after-statement -Wall -Wno-parentheses
                   -Wno-switch -Wno-format-zero-length -Wdisabled-optimization
                   -Wpointer-arith -Wredundant-decls -Wno-pointer-sign
                   -Wwrite-strings -Wtype-limits -Wundef -Wmissing-prototypes
                   -Wno-pointer-to-int-cast -Wstrict-prototypes -O3 -fno-math-errno
                   -fno-signed-zeros -fno-tree-vectorize
                   -Werror=implicit-function-declaration -Werror=missing-prototypes
                   -Werror=vla "
       -c -o libavfilter/vf_myfilter_cu.o libavfilter/vf_myfilter_cu.cu

    vf_myfilter.c calls three functions from the vf_myfilter_cu.cu file, which handle memory and call the CUDA kernel code. I thought I would be able to save the device pointers from my memory initialization, which runs once per ffmpeg run, and re-use that space each time I called the wrapper for my kernel function, but when I cudaMemcpy from my host memory to the device memory that I stored, it fails with cudaErrorInvalidValue. If I cudaMalloc my device memory on every frame, I'm fine.

    I plan on using pinned host memory, once I have everything up in CUDA code and have minimized the number of times I need to return to the main ffmpeg code.

    Steps taken:

    First sign of trouble: search the web. I found "Passing a pointer to device memory between classes in CUDA" and printed out the pointers at various places in my execution to ensure that the device memory values were the same everywhere, and they are. FWIW, they seem to start around 0x90010000.

    ffmpeg's configure gave me -pthreads, so I checked whether my filter was being called from multiple threads, following "how can I tell if pthread_self is the main (first) thread in the process?" and checking syscall(SYS_gettid) == getpid() to ensure that I'm not calling CUDA from different threads. According to those two funcs, I'm indeed in the primary thread at every step. I am still using OpenMP later around some for loops in the main .c filter function, but the calls to CUDA don't occur in those loops.
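
    For reference, the thread check described above can be wrapped in a small helper. This is a Linux-only sketch (SYS_gettid is Linux-specific), and the names isMainThread and requireMainThread are hypothetical; on Linux the main thread's kernel thread id equals the process id, which is what the comparison relies on.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// True when called from the process's main (first) thread on Linux:
// only that thread has a kernel thread id equal to the process id.
inline bool isMainThread()
{
    return static_cast<pid_t>(syscall(SYS_gettid)) == getpid();
}

// Debug-build guard for code paths that are expected to stay on the main thread.
inline void requireMainThread()
{
    assert(isMainThread());
}
```

    Note that the comparison is true only on the main thread, so a log call guarded by it runs on the main thread and stays silent on any other thread.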

    Code Overview:

    ffmpeg provides me a MyFilterContext structure pointer on each frame, as well as on the filter's config_input and uninit routines (called once per file), so I added some *host_var and *dev_var variables (a few of each, float and unsigned char).

    There is a whole lot of code I skipped for this post, but most of it has to do with my algorithm and details involved in writing an ffmpeg filter. I'm actually using about 6 host variables and 7 device variables right now, but for demonstration I limited it to one of each.

    Here is, broadly, what my vf_myfilter.c looks like.

    // declare my functions from vf_myfilter_cu.cu
    extern void cudaMyInit(unsigned char **dev_var, size_t mysize);
    extern void cudaMyUninit(unsigned char *dev_var);
    extern void cudaMyFunction(unsigned char *host_var, unsigned char *dev_var, size_t mysize);

    // part of the MyFilterContext structure, which ffmpeg keeps track of for me.
    typedef struct {
       unsigned char *host_var;
       unsigned char *dev_var;
    } MyFilterContext;

    // ffmpeg calls this function once per file, before any frames are processed.
    static int config_input(AVFilterLink *inlink) {
           // how ffmpeg passes me my context, fairly standard.
       MyFilterContext * myContext = inlink->dst->priv;
           // compute the size of one video plane of one frame of video
       size_t mysize = sizeof(unsigned char) * inlink->w * inlink->h;
           // av_mallocz is a malloc wrapper provided and required by ffmpeg
       myContext->host_var = (unsigned char*) av_mallocz(mysize);
           // Here's where I attempt to allocate my device memory.
       cudaMyInit( & myContext->dev_var, mysize);
       return 0;
    }

    // Called once per frame of video
    static int filter_frame(AVFilterLink *inlink, AVFilterBufferRef *frame) {
       MyFilterContext *myContext = inlink->dst->priv;

       // sanity check to make sure that this isn't part of the multithreaded code
       if ( syscall(SYS_gettid) == getpid() )
           av_log(.... ); // This line never runs, so it's not threaded?

       // ...fill host_var with data from frame,
       // set mysize to the size of the buffer

       // Call my wrapper function defined in the .cu file
       cudaMyFunction(myContext->host_var, myContext->dev_var, mysize);

       // ... take the results from host_var and apply them to frame
       // ... and return the processed frame to ffmpeg
    }

    // called after everything else has happened:  free up the memory.
    static av_cold void uninit(AVFilterContext *ctx) {
       MyFilterContext *myContext = ctx->priv;
       // free my host_var
       if(myContext->host_var!=NULL) {
           av_free(myContext->host_var);
           myContext->host_var=NULL;
       }
       // free my dev_var
       cudaMyUninit(myContext->dev_var);
    }
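The "fill host_var with data from frame" step above is elided in the question. One thing worth keeping in mind there: FFmpeg frame planes are stored with a line size (stride) that can be larger than the visible width, while `host_var` is sized as a tightly packed `w * h` buffer. A minimal sketch of the copy in both directions, assuming an 8-bit planar format and that the plane pointer and stride come from the frame's `data[0]` and `linesize[0]`; `pack_plane` and `unpack_plane` are hypothetical helper names, not FFmpeg functions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg API): copy one 8-bit plane of
 * width x height pixels from a strided source into a tightly packed
 * destination of width * height bytes. src_linesize is the distance
 * in bytes between the starts of two consecutive source rows. */
static void pack_plane(unsigned char *dst, const unsigned char *src,
                       int src_linesize, int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++) {
        memcpy(dst + (size_t)y * (size_t)width,
               src + (ptrdiff_t)y * src_linesize,
               (size_t)width);
    }
}

/* Hypothetical inverse: write the packed buffer back into the strided
 * plane, leaving any padding bytes at the end of each row untouched. */
static void unpack_plane(unsigned char *dst, int dst_linesize,
                         const unsigned char *src, int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++) {
        memcpy(dst + (ptrdiff_t)y * dst_linesize,
               src + (size_t)y * (size_t)width,
               (size_t)width);
    }
}
```

With a packed buffer like this, the `mysize` passed to `cudaMyFunction` stays equal to `w * h`, which matches the size computed in `config_input`.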

    Here is, broadly, what my vf_myfilter_cu.cu looks like:

    // my kernel function that does the work.
    __global__ void myfunc(unsigned char *dev_var, size_t mysize) {
       // find the offset for this particular GPU thread to process
       // exit this function if the block/thread combo points to somewhere
       //     outside the frame
       // make sure we're less than mysize bytes from the beginning of dev_var
       // do things to dev_var[some_offset]
    }
    // Allocate the device memory
    extern "C" void cudaMyInit(unsigned char **dev_var, size_t mysize) {
       if(cudaMalloc( (void**) dev_var, mysize) != cudaSuccess) {
           printf("Cannot allocate the memory\n");
       }
    }

    // Free the device memory.
    extern "C" void cudaMyUninit(unsigned char *dev_var) {
       cudaFree(dev_var);
    }

    // Copy data from the host to the device,
    // Call the kernel function, and
    // Copy data from the device to the host.
    extern "C" void cudaMyFunction(
           unsigned char *host_var,
           unsigned char *dev_var,
           size_t mysize         )
    {
       cudaError_t cres;

       // dev_works is what I want to get rid of, but
       // to make sure that there's not something more obvious going
       // on, I made sure that my cudaMemcpy works if I'm allocating
       // the device memory in every frame.
       unsigned char *dev_works;
       if(cudaMalloc( (void **) &dev_works, mysize)!=cudaSuccess) {
           // I don't see this message
           printf("failed at per-frame malloc\n");
       }

       // THIS PART WORKS, copying host_var to dev_works
       cres=cudaMemcpy( (void *) dev_works, host_var, mysize, cudaMemcpyHostToDevice);
       if(cres!=cudaSuccess) {
           if(cres==cudaErrorInvalidValue) {
               // I don't see this message.
               printf("cudaErrorInvalidValue at per-frame cudaMemcpy\n");
           }
       }

       // THIS PART FAILS, copying host_var to dev_var
       cres=cudaMemcpy( (void *) dev_var, host_var, mysize, cudaMemcpyHostToDevice);
       if(cres!=cudaSuccess) {
           if(cres==cudaErrorInvalidValue) {
               // this is the error code that prints.
               printf("cudaErrorInvalidValue at per-frame cudaMemcpy\n");
           }
           // I check for other error codes, but they're not being hit.
       }

       // and this works with dev_works
       myfunc<<<...>>>(dev_works, mysize);

       if(cudaMemcpy(host_var, dev_works, mysize, cudaMemcpyDeviceToHost)!=cudaSuccess) {
           // I don't see this message.
           printf("Failed to copy post-kernel func\n");
       }

       cudaFree(dev_works);

    }

    Any ideas?
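A note on the sanity check in `filter_frame`: on Linux, `syscall(SYS_gettid)` equals `getpid()` only on the process's initial thread, so a branch guarded by `==` runs when the caller is the main thread and is skipped on every other thread. The sketch below shows that behaviour in isolation. It is Linux-specific, and `on_main_thread` and `check_from_worker` are hypothetical names used only for illustration:

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 when called on the process's initial thread, 0 otherwise.
 * Linux-specific: the initial thread's kernel thread ID equals the PID. */
static int on_main_thread(void)
{
    return (pid_t)syscall(SYS_gettid) == getpid();
}

/* Thread entry point: store the result of the check for the caller. */
static void *worker(void *arg)
{
    *(int *)arg = on_main_thread();
    return NULL;
}

/* Runs the same check on a freshly created thread.
 * Returns the worker's result, or -1 if the thread could not be created. */
static int check_from_worker(void)
{
    pthread_t tid;
    int result = -1;

    if (pthread_create(&tid, NULL, worker, &result) != 0)
        return -1;
    pthread_join(tid, NULL);
    return result;
}
```

Called from `main`, `on_main_thread()` gives 1 and `check_from_worker()` gives 0 (compile with `-pthread`). Read that way, a guarded `av_log` that never fires would suggest `filter_frame` is being called on a thread other than the initial one, which may be worth ruling out before looking elsewhere.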

  • Error splitting .mov file with ffmpeg

    5 November 2011, by Deepak Lamichhane

    I have used the following ffmpeg command to split the given media file.

    ffmpeg -i test.mov -ss 00:00:00 -t 00:07:00 -acodec copy -vcodec copy test1.mov

    The video "test.mov" has the following characteristics:

    Dimensions: 320 * 240
    Codecs: MPEG-4 Video, AAC
    Duration: 00:45
    Audio channels: 2
    Total bit rate: 1,292

    But while splitting, it shows the following errors:

    FFmpeg version 0.6, Copyright (c) 2000-2010 the FFmpeg developers
    built on Apr 29 2011 12:03:13 with gcc 4.2.1 (Apple Inc. build 5664)
    configuration: --disable-debug --prefix=/usr/local/Cellar/ffmpeg/0.6 --enable-shared --enable-pthreads --enable-nonfree --enable-gpl --disable-indev=jack --enable-libx264 --enable-libfaac --enable-libfaad --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libvpx
     libavutil     50.15. 1 / 50.15. 1
     libavcodec    52.72. 2 / 52.72. 2
     libavformat   52.64. 2 / 52.64. 2
     libavdevice   52. 2. 0 / 52. 2. 0
     libswscale     0.11. 0 /  0.11. 0
    [aac @ 0x10181e200]channel element 1.0 is not allocated
       Last message repeated 215 times
    [mov,mp4,m4a,3gp,3g2,mj2 @ 0x10180b000]max_analyze_duration reached

    Seems stream 0 codec frame rate differs from container frame rate: 30000.00 (30000/1) -> 29.97 (30000/1001)
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/sts-107-/Dev/cloudfactory/tmp/media/test.mov':
     Duration: 00:00:45.14, start: 0.000000, bitrate: 1297 kb/s
       Stream #0.0(eng): Video: mpeg4, yuv420p, 320x240 [PAR 1:1 DAR 4:3], 1195 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 30k tbc
       Stream #0.1(eng): Audio: aac, 0 channels, s16, 97 kb/s
    File '/Users/sts-107-/Dev/cloudfactory/tmp/media/ten/ten-1.mov' already exists. Overwrite ? [y/N] y
    [mov @ 0x10181cc00]sample rate not set
    Output #0, mov, to '/Users/sts-107-/Dev/cloudfactory/tmp/media/ten/ten-1.mov':
       Stream #0.0(eng): Video: mpeg4, yuv420p, 320x240 [PAR 1:1 DAR 4:3], q=2-31, 1195 kb/s, 90k tbn, 30k tbc
       Stream #0.1(eng): Audio: libfaac, 0 channels, 97 kb/s
    Stream mapping:
     Stream #0.0 -> #0.0
     Stream #0.1 -> #0.1
    Could not write header for output file #0 (incorrect codec parameters ?)
    ""

    I couldn't figure out what the problem is.

    Any suggestions are most welcome.
    Thank you in advance!