Diary Of An x264 Developer

http://x264dev.multimedia.cx/

Articles published on the site

  • Announcing the first free software Blu-ray encoder

    25 April 2010, by Dark Shikari.  Tags: blu-ray, x264

    For many years it has been possible to make your own DVDs with free software tools.  Over the course of the past decade, DVD creation evolved from the exclusive domain of the media publishing companies to something basically anyone could do on their home computer.

    But Blu-ray has yet to get that treatment.  Despite the “format war” between Blu-ray and HD DVD ending over two years ago, free software has lagged behind.  “Professional” tools for Blu-ray video encoding can cost as much as $100,000 and are often utter garbage.  Here are two actual screenshots from real Blu-rays: I wish I were making this up.

    But today, things change.  Today we take the first step towards a free software Blu-ray creation toolkit.

    Thanks to tireless work by Kieran Kunyha, Alex Giladi, Lamont Alston, and the Doom9 crowd, x264 can now produce Blu-ray-compliant video.  Extra special thanks to The Criterion Collection for sponsoring the final compliance test to confirm x264’s Blu-ray compliance.

    With x264’s powerful compression, as demonstrated by the incredibly popular BD-Rebuilder Blu-ray backup software, it’s quite possible to author Blu-ray discs on DVD9s (dual-layer DVDs) or even DVD5s (single-layer DVDs) with a reasonable level of quality.  With a free software encoder and less need for an expensive Blu-ray burner, we are one step closer to putting HD optical media creation in the hands of the everyday user.

    To celebrate this achievement, we are making available for download a demo Blu-ray encoded with x264, containing entirely free content!

    On this Blu-ray are the Open Movie Project films Big Buck Bunny and Elephant’s Dream, available under a Creative Commons license.  Additionally, Microsoft has graciously provided about 6 minutes of lossless HD video and audio (from part of a documentary project) under a very liberal license.  This footage rounds out the Blu-ray by adding some difficult live-action content in addition to the relatively compressible CGI footage from the Open Movie Project.  Finally, we used this sound sample, available under a Creative Commons license.

    You may notice that the Blu-ray image is only just over 2GB.  This is intentional; we have encoded all the content on the disc at appropriate bitrates to be playable from an ordinary 4.7GB DVD.  This should make it far easier to burn a copy of the Blu-ray, since Blu-ray burners and writable media are still relatively rare.  Most Blu-ray players will treat a DVD containing Blu-ray data as a normal Blu-ray disc.  A few, such as the PlayStation 3, will not, but you can still play it as a data disc.

    Finally, note that (in accordance with the Blu-ray spec) the disc image file uses the UDF 2.5 filesystem, which may be incompatible with some older virtual drive and DVD burning applications.  You’ll also need to play it on an actual Blu-ray player if you want to get the menus and such working correctly.  If you’re looking to play it on a PC, a free trial of Arcsoft TMT is available here.

    What are you waiting for?  Grab a copy today!

    UPDATE: Here is an AVCHD-compliant version of the above, which should work better when burned on a DVD-5 instead of a BD-R. (mirror)

    What’s left before we have a fully free software Blu-ray creation toolkit?  Audio is already dealt with; AC3 audio (aka Dolby Digital), the format used in DVD, is still supported by Blu-ray, and there are many free software AC3 encoders.  The primary missing application is a free software Blu-ray authoring tool, to combine the video and audio streams to create a Blu-ray file structure with the menus, chapters, and so forth that we have all come to expect.  But the hardest part is dealt with: we can now create compatible video and audio streams.
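
    As a quick sketch of the audio side, one free implementation is ffmpeg’s built-in AC3 encoder (the bitrate below is just an illustrative, Blu-ray-legal value):

        # Encode PCM audio to AC3 (Dolby Digital) for Blu-ray/DVD authoring.
        ffmpeg -i audio.wav -c:a ac3 -b:a 448k audio.ac3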

    In the meantime, x264 can be used to create streams to be authored using Blu-Print, Scenarist, Encore or other commercial authoring tools.

    More detailed documentation on the new Blu-ray support and how to use it can be found in the official commit message.  Do keep in mind that you have to export to raw H.264 (not MKV or MP4) or else the buffering information will be slightly incorrect.  Finally, also note that the encoding settings given as an example are not a good choice for general-purpose encoding: they are intentionally crippled by Blu-ray restrictions, which will significantly reduce compression for ordinary non-Blu-ray encoding.
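
    As a rough sketch only (the commit message remains the authoritative reference, and every number below is an assumption for a 23.976fps 1080p source), a Blu-ray-leaning invocation with current x264 flags looks something like this:

        # Illustrative Blu-ray-oriented settings; see the commit message for the real ones.
        # keyint 24 keeps GOPs under one second at ~24fps; 4 slices per picture and a
        # 40 Mbps VBV ceiling mirror Blu-ray requirements; output is raw H.264 as noted above.
        x264 --level 4.1 --keyint 24 --slices 4 \
             --vbv-maxrate 40000 --vbv-bufsize 30000 --nal-hrd vbr \
             --bitrate 15000 --sar 1:1 \
             --colorprim bt709 --transfer bt709 --colormatrix bt709 \
             -o movie.264 movie.y4m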

    In addition to Blu-ray support, the aforementioned commit comes with a lot of fun extras:

    x264 now has native variable-framerate ratecontrol, which makes sure your encodes get a correct target bitrate and proper limiting of maximum bitrate even if the duration of every frame is different and the “framerate” is completely unknown.  This helps a lot when encoding from variable-framerate container formats such as FLV and WMV, along with variable-framerate content such as anime.
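
    As a hedged illustration (assuming x264 is built with FFMS2/libavformat input support, so per-frame timestamps survive demuxing), a plain two-pass bitrate-targeted encode of a variable-framerate FLV should now land on target with no --fps guesswork:

        # Two-pass ABR on a VFR source; the timestamps, not a nominal framerate,
        # drive ratecontrol.
        x264 --pass 1 --bitrate 1000 -o /dev/null input.flv
        x264 --pass 2 --bitrate 1000 -o output.264 input.flv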

    x264 now supports pulldown (telecine) in much the same fashion as it is handled in MPEG-2.  The calling application can pass in flags representing how to display a frame, allowing easy transcoding from MPEG-2 sources with pulldown, such as broadcast television.  The x264 commandline app contains some examples of these (such as the common 3:2 pulldown pattern).
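
    For example, soft 3:2 pulldown on a 23.976fps progressive source can be signaled from the command line (the --pulldown presets are the examples referred to above; the rates here are the usual film-to-NTSC case):

        # Encode progressive 23.976fps frames, flagging them for 29.97 display.
        x264 --pulldown 32 --fps 24000/1001 -o out.264 film.y4m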

    x264 now also exports HRD timing information, which is critical for compliant transport stream muxing.  There is currently an active project to write a fully DVB-compatible free software TS muxer that will be able to interface with x264 for a seamless free software broadcast system.  It will likely also be possible to repurpose this muxer as part of a free software Blu-ray authoring package.
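
    HRD signaling is opt-in; a minimal sketch with placeholder rates (CBR mode additionally requires the VBV maximum to equal the target bitrate):

        # VBR: write HRD parameters with a maxrate above the average bitrate.
        x264 --nal-hrd vbr --bitrate 4500 --vbv-maxrate 6000 --vbv-bufsize 9000 -o out.264 in.y4m
        # CBR: constant-rate mode for transport streams (vbv-maxrate must equal bitrate).
        x264 --nal-hrd cbr --bitrate 4500 --vbv-maxrate 4500 --vbv-bufsize 9000 -o out.264 in.y4m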

    All of this is now available in the latest x264.

  • Announcing x264 Summer of Code 2010!

    19 March 2010, by Dark Shikari.  Tags: GSOC, development, google, x264

    With the announcement of Google Summer of Code 2010 and the acceptance of our umbrella organization, VideoLAN, we are proud to announce the third x264 Summer of Code!  After two years of progressively increasing success, we expect this year to be better than ever.  Last year’s successes include ARM support and weighted P-frame prediction.  This year we have a wide variety of projects of varying difficulty, including some old ones and a host of new tasks.  The qualification tasks are tough, so if you want to get involved, the sooner the better!

    Interested in getting started?  Check out the wiki page, hop on #x264 on Freenode IRC, and say hi to the gang!  No prior experience or knowledge in video compression necessary: just dedication and the willingness to ask questions and experiment until you figure things out.

  • The problems with wavelets

    27 February 2010, by Dark Shikari.  Tags: DCT, Dirac, Snow, psychovisual optimizations, wavelets

    I have periodically noted in this blog and elsewhere various problems with wavelet compression, but many readers have requested that I write a more detailed post about it, so here it is.

    Wavelets have been researched for quite some time as a replacement for the standard discrete cosine transform used in most modern video compression.  Their methodology is basically opposite: each coefficient in a DCT represents a constant pattern applied to the whole block, while each coefficient in a wavelet transform represents a single, localized pattern applied to a section of the block.  Accordingly, wavelet transforms are usually very large with the intention of taking advantage of large-scale redundancy in an image.  DCTs are usually quite small and are intended to cover areas of roughly uniform patterns and complexity.
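
    To make the contrast concrete: the 2D DCT-II used in block codecs expands an NxN block of pixels p_{x,y} into coefficients, each of which weights a cosine pattern spanning the entire block (c_u, c_v are the usual normalization factors):

        X_{u,v} = c_u c_v \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} p_{x,y} \cos\left[\frac{\pi}{N}\left(x+\tfrac{1}{2}\right)u\right] \cos\left[\frac{\pi}{N}\left(y+\tfrac{1}{2}\right)v\right]

    A wavelet coefficient, by contrast, weights a basis function with compact support, so it influences only a localized region at one scale.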

    Both are complete transforms, offering equally accurate frequency-domain representations of pixel data.  I won’t go into the mathematical details of each here; the real question is whether one offers better compression opportunities for real-world video.

    Though it isn’t mathematically required, DCTs are usually applied as block transforms, each handling a single sharp-edged block of data.  Accordingly, they usually need a deblocking filter to smooth the edges between DCT blocks.  Wavelet transforms typically overlap, avoiding such a need.  But because wavelets don’t cover a sharp-edged block of data, they don’t compress well when the predicted data is in the form of blocks.

    Thus motion compensation is usually performed as overlapped-block motion compensation (OBMC), in which every pixel is calculated by performing the motion compensation of a number of blocks and averaging the result based on the distance of those blocks from the current pixel.  Another option, which can be combined with OBMC, is “mesh MC”, where every pixel gets its own motion vector, which is a weighted average of the closest nearby motion vectors.  The end result of either is the elimination of sharp edges between blocks and better prediction, at the cost of greatly increased CPU requirements.  For an overlap factor of 2, it’s 4 times the amount of motion compensation, plus the averaging step.  With mesh MC, it’s even worse, with SIMD optimizations becoming nearly impossible.
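
    Written out, OBMC forms each output pixel as a normalized blend of the motion-compensated candidates from every block whose window covers it:

        p(x,y) = \sum_i w_i(x,y)\,\hat{p}_i(x,y), \qquad \sum_i w_i(x,y) = 1

    where \hat{p}_i is block i’s motion-compensated prediction and w_i falls off with distance from block i’s center.  With an overlap factor of 2, each pixel sits inside 4 blocks’ windows, which is where the 4x motion compensation cost above comes from.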

    At this point, it would seem wavelets have pretty big advantages: when used with OBMC, they offer better inter prediction, eliminate the need for deblocking, and take advantage of larger-scale correlations.  Why hasn’t everyone switched over to wavelets, then?  Dirac and Snow offer modern implementations.  Yet despite decades of research, wavelets have consistently disappointed for image and video compression.  It turns out there are a lot of serious practical issues with wavelets, many of which are open problems.

    1.  No known method exists for efficient intra coding. H.264’s spatial intra prediction is extraordinarily powerful, but relies on knowing the exact decoded pixels to the top and left of the current block.  Since there is no such boundary in overlapped-wavelet coding, such prediction is impossible.  Newer intra prediction methods, such as Markov-chain intra prediction, also seem to require an H.264-like situation with exactly-known neighboring pixels.  Intra coding in wavelets is in the same state that DCT intra coding was in 20 years ago: the best known method is simply to transform the block with no prediction at all besides DC.  NB: as described by Pengvado in the comments, the switching between inter and intra coding is potentially even more costly than the inefficient intra coding.

    2.  Mixing partition sizes has serious practical problems. Because the overlap between two motion partitions depends on the partitions’ size, mixing block sizes becomes quite difficult to define.  While in H.264 a smaller partition always gives equal or better compression than a larger one when one ignores the extra overhead, with OBMC it is actually possible for a larger partition to win due to the larger overlap.  All of this makes both defining the result of mixed block sizes and making decisions about them very difficult.

    Both Snow and Dirac offer variable block size, but the overlap amount is constant; larger blocks serve only to save bits on motion vectors, not to offer better overlap characteristics.

    3.  Lack of spatial adaptive quantization. As shown in x264 with VAQ (see the example invocation after this list), and correspondingly in HCEnc’s implementation and Theora’s recent implementation, spatial adaptive quantization has staggeringly impressive (before, after) effects on visual quality.  Only Dirac seems to have such a feature, and even there the encoder doesn’t use it.  No other wavelet format (Snow, JPEG2K, etc.) seems to have such a feature.  This results in serious blurring problems in areas with subtle texture (as in the comparison below).

    4.  Wavelets don’t seem to code visual energy effectively. Remember that a single coefficient in a DCT represents a pattern which applies across an entire block: this makes it very easy to create apparent “detail” with a DCT.  Furthermore, the sharp edges of DCT blocks, despite being an apparent weakness, often result in a “fake sharpness” that can actually improve the visual appearance of videos, as was seen with Xvid.  Thus wavelet codecs have a tendency to look much blurrier than DCT-based codecs, but since PSNR likes blur, this is often reported as a benefit in video compression research.  Some of the consequences of these factors can be seen in this comparison; it’s somewhat outdated and not general-case, but it shows very effectively how differently wavelets handle sharp edges and subtle textures.
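
    As promised in point 3, here is what enabling x264’s variance-based adaptive quantization looks like on the command line (the strength shown is simply the default value):

        # CRF encode with spatial adaptive quantization enabled.
        x264 --crf 20 --aq-mode 1 --aq-strength 1.0 -o out.264 input.y4m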

    Another problem that periodically crops up is the visual aliasing that tends to be associated with wavelets at lower bitrates.  Standard wavelets effectively consist of a recursive function that upscales the coefficients coded by the previous level by a factor of 2 and then adds a new set of coefficients.  If the upscaling algorithm is naive — as it often is, for the sake of speed — the result can look quite ugly, as if parts of the image were coded at a lower resolution and then badly scaled up.  Of course, it looks like that because they were coded at a lower resolution and then badly scaled up.
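
    Schematically, each synthesis level reconstructs

        I_k = U_2(I_{k-1}) + d_k

    where U_2 upsamples the previous level by a factor of 2 and d_k is the new detail band.  When bits run short, d_k is mostly zeroed and the output is essentially U_2(I_{k-1}); if U_2 is a cheap interpolator, that is precisely a badly upscaled lower-resolution image.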

    JPEG2000 is a classic example of wavelet failure: despite having more advanced entropy coding, being designed much later than JPEG, being much more computationally intensive, and having much better PSNR, comparisons have consistently shown it to be visually worse than JPEG at sane filesizes.  Here’s an example from Wikipedia.  By comparison, H.264’s intra coding, when used for still image compression, can beat JPEG by a factor of 2 or more (I’ll make a post on this later).  With the various advancements in DCT intra coding since H.264, I suspect that a state-of-the-art DCT compressor could win by an even larger factor.

    Despite the promised benefits of wavelets, a wavelet encoder even close to competitive with x264 has yet to be created.  With some tests even showing Dirac losing to Theora in visual comparisons, it’s clear that many problems remain to be solved before wavelets can eliminate the ugliness of block-based transforms once and for all.

  • Flash, Google, VP8, and the future of internet video

    23 February 2010, by Dark Shikari.  Tags: H.264, HTML5, Theora, VP8, google, x264

    This is going to be a much longer post than usual, as it’s going to cover a lot of ground.  The internet has been filled for quite some time with an enormous number of blog posts complaining about how Flash sucks, so much so that it sounds as if the entire internet is crying wolf.  But, of [...]