Media (0)

Keyword: - Tags -/xmlrpc

No media matching your criteria is available on the site.

Other articles (36)

  • Supporting all media types

    13 April 2011

    Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats:

    • images: png, gif, jpg, bmp and more
    • audio: MP3, Ogg, Wav and more
    • video: AVI, MP4, OGV, mpg, mov, wmv and more
    • text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)

  • HTML5 audio and video support

    13 April 2011

    MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
    The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
    For older browsers the Flowplayer flash fallback is used.
    MediaSPIP allows for media playback on major mobile platforms with the above (...)

  • From upload to the final video [standalone version]

    31 January 2010

    The path of an audio or video document through SPIPMotion is divided into three distinct stages.
    Upload and retrieval of information about the source video
    First, you need to create a SPIP article and attach the "source" video document to it.
    When this document is attached to the article, two actions are performed in addition to the normal behaviour: retrieval of the technical information of the file's audio and video streams; generation of a thumbnail: extraction of a (...)

On other sites (3498)

  • Ffmpeg set output format C++

    7 September 2022, by Turgut

    I made a program that encodes a video, and I want to specify the format as h264, but I can't figure out how to do it. It automatically sets the format to mpeg4 and I can't change it. I got my code from FFmpeg's official muxing.c example and edited it slightly to fit my program (I haven't changed much, and in particular did not touch the parts where it sets the format).

    Here is my code so far (I have trimmed it down slightly, removing redundant parts):

    video_encoder.cpp:

    


    
video_encoder::video_encoder(int w, int h, float fps, unsigned int duration) 
 :width(w), height(h), STREAM_FRAME_RATE(fps), STREAM_DURATION(duration)
{
    std::string as_str = "./output/video.mp4";

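    // c_str() yields a const char*; the FFmpeg calls below only read the name.
    const char* filename = as_str.c_str();
    // Reset both streams; "a, b = x" would only assign b (comma operator).
    enc_inf.video_st = OutputStream{};
    enc_inf.audio_st = OutputStream{};
    enc_inf.video_st.next_pts = 1;
    enc_inf.audio_st.next_pts = 1;
    enc_inf.encode_audio = 0;
    enc_inf.encode_video = 0;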
    int ret;
    int i;

    /* allocate the output media context */
    avformat_alloc_output_context2(&enc_inf.oc, NULL, NULL, filename);

    if (!enc_inf.oc) {
        std::cout << "FAILED" << std::endl;
        avformat_alloc_output_context2(&enc_inf.oc, NULL, "mpeg", filename);
    }

    enc_inf.fmt = enc_inf.oc->oformat;

    /* Add the audio and video streams using the default format codecs
     * and initialize the codecs. */
    if (enc_inf.fmt->video_codec != AV_CODEC_ID_NONE) {
        add_stream(&enc_inf.video_st, enc_inf.oc, &video_codec, enc_inf.fmt->video_codec);
        enc_inf.have_video = 1;
        enc_inf.encode_video = 1;
    }
    if (enc_inf.fmt->audio_codec != AV_CODEC_ID_NONE) {
        add_stream(&enc_inf.audio_st, enc_inf.oc, &audio_codec, enc_inf.fmt->audio_codec);
        enc_inf.have_audio = 1;
        enc_inf.encode_audio = 1;
    }

    /* Now that all the parameters are set, we can open the audio and
     * video codecs and allocate the necessary encode buffers. */
    if (enc_inf.have_video)
        open_video(enc_inf.oc, video_codec, &enc_inf.video_st, opt);

    if (enc_inf.have_audio)
        open_audio(enc_inf.oc, audio_codec, &enc_inf.audio_st, opt);
    av_dump_format(enc_inf.oc, 0, filename, 1);

    /* open the output file, if needed */
    if (!(enc_inf.fmt->flags & AVFMT_NOFILE)) {
        ret = avio_open(&enc_inf.oc->pb, filename, AVIO_FLAG_WRITE);
        if (ret < 0) {
            //VI_ERROR("Could not open '%s': %s\n", filename, ret);
            //return 1;
        }
    }

    /* Write the stream header, if any. */
    ret = avformat_write_header(enc_inf.oc, &opt);
    if (ret < 0) {
        VI_ERROR("Error occurred when opening output file:");
        //return 1;
    }
    
    //return 0;
}


/* Add an output stream. */
void video_encoder::add_stream(OutputStream *ost, AVFormatContext *oc,
                       const AVCodec **codec,
                       enum AVCodecID codec_id)
{
    AVCodecContext *c;
    int i;

    /* find the encoder */
    *codec = avcodec_find_encoder(codec_id);
    
    if (!(*codec)) {
        fprintf(stderr, "Could not find encoder for '%s'\n",
                avcodec_get_name(codec_id));
        exit(1);
    }

    ost->tmp_pkt = av_packet_alloc();

    if (!ost->tmp_pkt) {
        fprintf(stderr, "Could not allocate AVPacket\n");
        exit(1);
    }

    ost->st = avformat_new_stream(oc, NULL);
    if (!ost->st) {
        fprintf(stderr, "Could not allocate stream\n");
        exit(1);
    }
    ost->st->id = oc->nb_streams-1;
    c = avcodec_alloc_context3(*codec);
    if (!c) {
        fprintf(stderr, "Could not alloc an encoding context\n");
        exit(1);
    }
    ost->enc = c;


    switch ((*codec)->type) {
    case AVMEDIA_TYPE_AUDIO:
        ...
        break;
    case AVMEDIA_TYPE_VIDEO:
        c->codec_id = codec_id;

        c->bit_rate = 10000;
        /* Resolution must be a multiple of two. */
        c->width    = width;
        c->height   = height;
        /* timebase: This is the fundamental unit of time (in seconds) in terms
         * of which frame timestamps are represented. For fixed-fps content,
         * timebase should be 1/framerate and timestamp increments should be
         * identical to 1. */
        ost->st->time_base = (AVRational){ 1, STREAM_FRAME_RATE }; // *frame_rate
        c->time_base       = ost->st->time_base;

        c->gop_size      = 7; /* emit one intra frame every seven frames at most */
        //c->codec_id      = AV_CODEC_ID_H264;
        c->pix_fmt       = STREAM_PIX_FMT;
        //if (c->codec_id == AV_CODEC_ID_MPEG2VIDEO) 
        //    c->max_b_frames = 2;
        if (c->codec_id == AV_CODEC_ID_MPEG1VIDEO) {
            /* Needed to avoid using macroblocks in which some coeffs overflow.
             * This does not happen with normal video, it just happens here as
             * the motion of the chroma plane does not match the luma plane. */
            c->mb_decision = 2;
        }

        if ((*codec)->pix_fmts){
            //c->pix_fmt = (*codec)->pix_fmts[0];
            std::cout << "NEW FORMAT : " << c->pix_fmt << std::endl;
        }

        break;
    }
     

    /* Some formats want stream headers to be separate. */
    if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        c->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
}


    


    video_encoder.h

    


    
typedef struct OutputStream {
    AVStream *st;
    AVCodecContext *enc;

    /* pts of the next frame that will be generated */
    int64_t next_pts;
    int samples_count;

    AVFrame *frame;
    AVFrame *tmp_frame;

    AVPacket *tmp_pkt;

    float t, tincr, tincr2;

    struct SwsContext *sws_ctx;
    struct SwrContext *swr_ctx;
} OutputStream;

class video_encoder{
    private:
        typedef struct {
            OutputStream video_st, audio_st;
            const AVOutputFormat *fmt;
            AVFormatContext *oc;
            int have_video, have_audio, encode_video, encode_audio;
            std::string name;
        } encode_info;
    public:
        encode_info enc_inf;
        video_encoder(int w, int h, float fps, unsigned int duration);
        ~video_encoder();  
        ...
    private:
        ...
        void add_stream(OutputStream *ost, AVFormatContext *oc,
                       const AVCodec **codec,
                       enum AVCodecID codec_id);
};


    


    I think the example sets the codec in avformat_alloc_output_context2(&enc_inf.oc, NULL, NULL, filename), but I'm not sure how to set it to h264.

    I've tried something like this: avformat_alloc_output_context2(&enc_inf.oc, enc_inf.fmt, "h264", filename)

    But it just gives a segmentation fault. What am I supposed to do?

    Edit: I've tried deleting avformat_alloc_output_context2(&enc_inf.oc, NULL, NULL, filename); from video_encoder::video_encoder and adding these two lines in its place:

    


    
    video_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    enc_inf.video_st.enc = avcodec_alloc_context3(video_codec);



    


    But it resulted in these errors. It says this every frame (a bunch of times):

    


    [mpeg @ 0x56057c465480] buffer underflow st=0 bufi=26822 size=31816


    


    It says this once when the frame-encoding loop is over:

    


    [mpeg @ 0x5565ac4a04c0] start time for stream 0 is not set in estimate_timings_from_pts
[mpeg @ 0x5565ac4a04c0] stream 0: no TS found at start of file, duration not set
[mpeg @ 0x5565ac4a04c0] Could not find codec parameters for stream 0 (Video: mpeg2video, none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options


    


  • Neutral net or neutered

    4 June 2013, by Mans (Law and liberty)

    In recent weeks, a number of high-profile events, in the UK and elsewhere, have been quickly seized upon to promote a variety of schemes for monitoring or filtering Internet access. These proposals, despite their good intentions of protecting children or fighting terrorism, pose a serious threat to fundamental liberties. Although at a glance the ideas may seem like a reasonable price to pay for the prevention of some truly hideous crimes, there is more than first meets the eye. Internet regulation in any form whatsoever is the thin end of a wedge at whose other end we find severely restricted freedom of expression of the kind usually associated with oppressive dictatorships. Where the Internet was once a novelty, it now forms an integrated part of modern society; regulating the Internet means regulating our lives.

    Terrorism

    Following the brutal murder of British soldier Lee Rigby in Woolwich, attempts were made in the UK to revive the controversial Communications Data Bill, also dubbed the snooper’s charter. The bill would give police and security services unfettered access to details (excluding content) of all digital communication in the UK without needing so much as a warrant.

    The powers afforded by the snooper’s charter would, the argument goes, enable police to prevent crimes such as the one witnessed in Woolwich. True or not, the proposal would, if implemented, also bring about infrastructure for snooping on anyone at any time for any purpose. Once available, the temptation may become strong to extend, little by little, the legal use of these abilities to cover ever more everyday activities, all in the name of crime prevention, of course.

    In the emotional aftermath of a gruesome act, anything with the promise of preventing it happening again may seem like a good idea. At times like these it is important, more than ever, to remain rational and carefully consider all the potential consequences of legislation, not only the intended ones.

    Hate speech

    Hand in hand with terrorism goes hate speech, preachings designed to inspire violence against people of some singled-out nation, race, or other group. Naturally, hate speech is often to be found on the Internet, where it can reach large audiences while the author remains relatively protected. Naturally, we would prefer for it not to exist.

    To fulfil the utopian desire of a clean Internet, some advocate mandatory filtering by Internet service providers and search engines to remove this unwanted content. Exactly how such censoring might be implemented is however rarely dwelt upon, much less the consequences inadvertent blocking of innocent material might have.

    Pornography

    Another common target of calls for filtering is pornography. While few object to the blocking of child pornography, at least in principle, the debate runs hotter when it comes to the legal variety. Pornography, it is claimed, promotes violence towards women and is immoral or generally offensive. As such it ought to be blocked in the name of the greater good.

    The conviction last week of paedophile Mark Bridger for the abduction and murder of five-year-old April Jones renewed the debate about filtering of pornography in the UK; his laptop was found to contain child pornography. John Carr of the UK government’s Council on Child Internet Safety went so far as suggesting a default blocking of all pornography, access being granted to an Internet user only once he or she had registered with some unspecified entity. Registering people wishing only to access perfectly legal material is not something we do in a democracy.

    The reality is that Google and other major search engines already remove illegal images from search results and report them to the appropriate authorities. In the UK, the Internet Watch Foundation, a non-government organisation, maintains a blacklist of what it deems ‘potentially criminal’ content, and many Internet service providers block access based on this list.

    While well-intentioned, the IWF and its blacklist should raise some concerns. Firstly, a vigilante organisation operating in secret and with no government oversight acting as the nation’s morality police has serious implications for freedom of speech. Secondly, the blocks imposed are sometimes more far-reaching than intended. In one incident, an attempt to block the cover image of the Scorpions album Virgin Killer hosted by Wikipedia (in itself a dubious decision) rendered the entire related article inaccessible as well as interfered with editing.

    Net neutrality

    Content filtering, or more precisely the lack thereof, is central to the concept of net neutrality. Usually discussed in the context of Internet service providers, this is the principle that the user should have equal, unfiltered access to all content. As a consequence, ISPs should not be held responsible for the content they deliver. Compare this to how the postal system works.

    The current debate shows that the principle of net neutrality is important not only at the ISP level, but should also include providers of essential services on the Internet. This means search engines should not be responsible for or be required to filter results, email hosts should not be required to scan users’ messages, and so on. No mandatory censoring can be effective without infringing the essential liberties of freedom of speech and press.

    Social networks operate in a less well-defined space. They are clearly not part of the essential Internet infrastructure, and they require that users sign up and agree to their terms and conditions. Because of this, they can include restrictions that would be unacceptable for the Internet as a whole. At the same time, social networks are growing in importance as means of communication between people, and as such they have a moral obligation to act fairly and apply their rules in a transparent manner.

    Facebook was recently under fire, accused of not taking sufficient measures to curb ‘hate speech,’ particularly against women. Eventually they pledged to review their policies and methods, and reducing the proliferation of such content will surely make the web a better place. Nevertheless, one must ask how Facebook (or another social network) might react to similar pressure from, say, a religious group demanding removal of ‘blasphemous’ content. What about demands from a foreign government? Only yesterday, the Turkish prime minister Erdogan branded Twitter ‘a plague’ in a TV interview.

    Rather than impose upon Internet companies the burden of law enforcement, we should provide them the latitude to set their own policies as well as the legal confidence to stand firm in the face of unreasonable demands. The usual market forces will promote those acting responsibly.


  • Open Media Developers Track at OVC 2011

    11 October 2011, by silvia

    The Open Video Conference that took place on 10-12 September was so overwhelming, I’ve still not been able to catch my breath! It was a dense three days for me, even though I only focused on the technology sessions of the conference and utterly missed out on all the policy and content discussions.

    Roughly 60 people participated in the Open Media Software (OMS) developers track. This was an amazing group of people capable and willing to shape the future of video technology on the Web:

    • HTML5 video developers from Apple, Google, Opera, and Mozilla (though we missed the NZ folks),
    • codec developers from WebM, Xiph, and MPEG,
    • Web video developers from YouTube, JWPlayer, Kaltura, VideoJS, PopcornJS, etc.,
    • content publishers from Wikipedia, Internet Archive, YouTube, Netflix, etc.,
    • open source tool developers from FFmpeg, gstreamer, flumotion, VideoLAN, PiTiVi, etc.,
    • and many more.

    To provide a summary of all the discussions would be impossible, so I just want to share the key take-aways that I had from the main sessions.

    WebRTC: Realtime Communications and HTML5

    Tim Terriberry (Mozilla), Serge Lachapelle (Google) and Ethan Hugg (Cisco) moderated this session together (slides). There are activities both at the W3C and at IETF – the ones at IETF are supposed to focus on protocols, while the W3C ones focus on HTML5 extensions.

    The current proposal of a PeerConnection API has been implemented in WebKit/Chrome as open source. It is expected that Firefox will have an add-on by Q1 next year. It enables video conferencing, including media capture, media encoding, signal processing (echo cancellation etc), secure transmission, and a data stream exchange.

    Current discussions are around the signalling protocol and whether SIP needs to be required by the standard. Further, the codec question is under discussion with a question whether to mandate VP8 and Opus, since transcoding gateways are not desirable. Another question is how to measure the quality of the connection and how to report errors so as to allow adaptation.

    What always amazes me around RTC is the sheer number of specialised protocols that seem to be required to implement this. WebRTC does not disappoint: in fact, the question was asked whether there could be a lighter alternative than to re-use dozens of years of protocol development – is it over-engineered? Can desktop players connect to a WebRTC session?

    We are already in a second or third revision of this part of the HTML5 specification and yet it seems the requirements are still being collected. I’m quietly confident that everything is done to make the lives of the Web developer easier, but it sure looks like a huge task.

    The Missing Link: Flash to HTML5

    Zohar Babin (Kaltura) and I moderated this session, and I must admit that it was the biggest eye-opener for me amongst all the sessions. There was a large number of Flash developers present in the room and that was great, because sometimes we just don’t listen enough to lessons learnt in the past.

    This session gave me one of those aha-moments: in the form of the Flash appendBytes() API function.

    The appendBytes() function allows a Flash developer to take a byteArray out of a connected video resource and do something with it – such as feed it to a video for display. When I heard that Web developers want that functionality for JavaScript and the video element, too, I instinctively rejected the idea, wondering why on earth a Web developer would want to touch encoded video bytes – why not leave that to the browser.

    But as it turns out, this is actually a really powerful enabler of functionality. For example, you can use it to:

    • display mid-roll video ads as part of the same video element,
    • sequence playlists of videos into the same video element,
    • implement DVR functionality (high-speed seeking),
    • do mash-ups,
    • do video editing,
    • implement adaptive streaming.

    This totally blew my mind and I am now completely supportive of having such a function in HTML5. Together with media fragment URIs you could even leave all the header download management for resources to the Web browser and just request time ranges from a video through an appendBytes() function. This would be easier on the Web developer than having to deal with byte ranges and making sure that appropriate decoding pipelines are set up.

    Standards for Video Accessibility

    Philip Jagenstedt (Opera) and I moderated this session. We focused on the HTML5 track element and the WebVTT file format. Many issues were identified that will still require work.

    One particular topic was to find a standard means of rendering the UI for caption, subtitle, and description selection. For example, what icons should be used to indicate that subtitles or captions are available? While this is not part of the HTML5 specification, it’s still important to get this right across browsers since otherwise users will get confused with diverging interfaces.

    Chaptering was discussed and a particular need to allow URLs to directly point at chapters was expressed. I suggested the use of named Media Fragment URLs.

    The use of WebVTT for descriptions for the blind was also discussed. A suggestion was made to use the voice tag <v> to allow for “styling” (i.e. selection) of the screen reader voice.

    Finally, multitrack audio or video resources were also discussed and the @mediagroup attribute was explained. A question about how to identify the language used in different alternative dubs was asked. This is an issue because @srclang is not on audio or video, only on text, so it’s a missing feature for the multitrack API.

    Beyond this session, there was also a breakout session on WebVTT and the track element. As a consequence, a number of bugs were registered in the W3C bug tracker.

    WebM: Testing, Metrics and New Features

    This session was moderated by John Luther and John Koleszar, both of the WebM Project. They started off with a presentation on current work on WebM, which includes quality testing and improvements, and encoder speed improvement. Then they moved on to questions about how to involve the community more.

    The community criticised the scarcity of communication about what is happening around WebM. More sharing of information was requested, including a move to using open Google+ hangouts instead of Google internal video conferences. More use of the public bug tracker can also help include the community better.

    Another pain point of the community was that code is introduced and removed without much feedback. A peer review process was requested, as was the publication of example code snippets when new features are announced, so that others can replicate the claims.

    This all indicates to me that the WebM project is increasingly open, but that there is still a lot to learn.

    Standards for HTTP Adaptive Streaming

    This session was moderated by Frank Galligan and Aaron Colwell (Google), and Mark Watson (Netflix).

    Mark started off by giving us an introduction to MPEG DASH, the MPEG file format for HTTP adaptive streaming. MPEG has just finalized the format and he was able to show us some examples. DASH is XML-based and thus rather verbose. It covers all eventualities of what parameters could be switched during transmission, which makes it very broad. These include trick modes (e.g. for fast forwarding), 3D, multi-view and multitrack content.

    MPEG have defined profiles – one for live streaming which requires chunking of the files on the server, and one for on-demand which requires keyframe alignment of the files. There are clear specifications for how to do these with MPEG. Such profiles would need to be created for WebM and Ogg Theora, too, to make DASH universally applicable.

    Further, the Web case needs a more restrictive adaptation approach, since the video element’s API is already accounting for some of the features that DASH provides for desktop applications. So, a Web-specific profile of DASH would be required.

    Then Aaron introduced us to the MediaSource API and in particular the webkitSourceAppend() extension that he has been experimenting with. It is essentially an implementation of the appendBytes() function of Flash, which the Web developers had been asking for just a few sessions earlier. This was likely the biggest announcement of OVC, albeit a quiet and technically focused one.

    Aaron explained that he had been trying to find a way to implement HTTP adaptive streaming into WebKit in a way in which it could be standardised. While doing so, he also came across other requirements around such chunked video handling, in particular around dynamic ad insertion, live streaming, DVR functionality (fast forward), constrained video editing, and mashups. While trying to sort out all these requirements, it became clear that it would be very difficult to implement strategies for stream switching, buffering and delivery of video chunks into the browser when so many different and likely contradictory requirements exist. Also, once an approach is implemented and specified for the browser, it becomes very difficult to innovate on it.

    Instead, the easiest way to solve it right now and learn about what would be necessary to implement into the browser would be to actually allow Web developers to queue up a chunk of encoded video into a video element for decoding and display. Thus, the webkitSourceAppend() function was born (specification).

    The proposed extension to the HTMLMediaElement is as follows:

    partial interface HTMLMediaElement {
      // URL passed to src attribute to enable the media source logic.
      readonly attribute [URL] DOMString webkitMediaSourceURL;

      bool webkitSourceAppend(in Uint8Array data);

      // end of stream status codes.
      const unsigned short EOS_NO_ERROR = 0;
      const unsigned short EOS_NETWORK_ERR = 1;
      const unsigned short EOS_DECODE_ERR = 2;

      void webkitSourceEndOfStream(in unsigned short status);

      // states
      const unsigned short SOURCE_CLOSED = 0;
      const unsigned short SOURCE_OPEN = 1;
      const unsigned short SOURCE_ENDED = 2;

      readonly attribute unsigned short webkitSourceState;
    };

    The code is already checked into WebKit, but commented out behind a command-line compiler flag.

    Frank then stepped forward to show how webkitSourceAppend() can be used to implement HTTP adaptive streaming. His example uses WebM – there are no examples with MPEG or Ogg yet.

    The video chunks that Frank’s demo used were 150 frames long (6.25s) and the audio chunks were 5s long. Stream switching only switched video, since audio data is much lower bandwidth and more important to retain at high quality. Switching was done on multiplexed files.

    Every chunk requires an XHR range request – this could be optimised if the connections were kept open per adaptation. Seeking works, too, but since decoding requires download of a whole chunk, seeking latency is determined by the time it takes to download and decode that chunk.

    Similar to DASH, when using this approach for live streaming, the server has to produce one file per chunk, since byte range requests are not possible on a continuously growing file.

    Frank did not use DASH as the manifest format for his HTTP adaptive streaming demo, but instead used a hacked-up custom XML format. It would be possible to use JSON or any other format, too.

    After this session, I was actually completely blown away by the possibilities that such a simple API extension allows. If I wasn’t sold on the idea of an appendBytes() function in the earlier session, this one completely changed my mind. While I still believe we need to standardise an HTTP adaptive streaming file format that all browsers will support for all codecs, and I still believe that a native implementation for support of such a file format is necessary, I also believe that this approach of webkitSourceAppend() is what HTML needs – and maybe it needs it faster than native HTTP adaptive streaming support.

    Standards for Browser Video Playback Metrics

    This session was moderated by Zachary Ozer and Pablo Schklowsky (JWPlayer). Their motivation for the topic was, in fact, also HTTP adaptive streaming. Once you leave the decisions about when to do stream switching to JavaScript (through a function such as webkitSourceAppend()), you have to expose stream metrics to the JS developer so they can make informed decisions. The other use case is, of course, monitoring of the quality of video delivery for reporting to the provider, who may then decide to change their delivery environment.

    The discussion found that we really care about metrics on three different levels:

    • measuring the network performance (bandwidth)
    • measuring the decoding pipeline performance
    • measuring the display quality

    In the end, it seemed that work previously done by Steve Lacey on a proposal for video metrics was generally acceptable, except for the playbackJitter metric, which may be too aggregate to mean much.

    Device Inputs / A/V in the Browser

    I didn’t actually attend this session held by Anant Narayanan (Mozilla), but from what I heard, the discussion focused on how to manage permission of access to video camera, microphone and screen, e.g. when multiple applications (tabs) want access or when the same site wants access in a different session. This may apply to real-time communication with screen sharing, but also to photo sharing, video upload, or canvas access to devices e.g. for time lapse photography.

    Open Video Editors

    This was another session that I wasn’t able to attend, but I believe the creation of good open source video editing software and similar video creation software is really crucial to giving video a broader user appeal.

    Jeff Fortin (PiTiVi) moderated this session and I was fascinated to later see his analysis of the lifecycle of open source video editors. It is shocking to see how many people/projects have tried to create an open source video editor and how many have stopped their project. It is likely that the creation of a video editor is such a complex challenge that it requires a larger and more committed open source project – single people will just run out of steam too quickly. This may be comparable to the creation of a Web browser (see the size of the Mozilla project) or a text processing system (see the size of the OpenOffice project).

    Jeff also mentioned the need to create open video editor standards around playlist file formats etc. Possibly the Open Video Alliance could help. In any case, something has to be done in this space – maybe this would be a good topic to focus next year’s OVC on?

    Monday’s Breakout Groups

    The conference ended officially on Sunday night, but we had a third day of discussions / hackday at the wonderful New York Law School venue. We had collected issues of interest during the two previous days and organised the breakout groups in the morning (Schedule).

    In the Content Protection/DRM session, Mark Watson from Netflix explained how their API works and that they believe that all we need in browsers is a secure way to exchange keys and an indicator of which protection scheme is used – the actual protection scheme would not be implemented by the browser, but be provided by the underlying system (media framework/operating system). I think that until somebody actually implements something in a browser fork and shows how this can be done, we won’t have much progress. In my understanding, we may also need to disable part of the video API for encrypted content, because otherwise you can always e.g. grab frames from the video element into canvas and save them from there.

    In the Playlists and Gapless Playback session, there was massive brainstorming about what new cool things can be done with the video element in browsers if playback between snippets can be made seamless. Further discussions were about standard playlist file formats (such as XSPF, MRSS or M3U), media fragment URIs in playlists for mashups, and the need to expose track metadata for HTML5 media elements.

    What more can I say? It was an amazing three days and the complexity of problems that we’re dealing with is a tribute to how far HTML5 and open video have already come and exciting news for the kind of applications that will be possible (both professional and community) once we’ve solved the problems of today. It will be exciting to see what progress we will have made by next year’s conference.

    Thanks go to Google for sponsoring my trip to OVC.

    UPDATE: We actually have a mailing list for open media developers who are interested in these and similar topics – do join at http://lists.annodex.net/cgi-bin/mailman/listinfo/foms.