
Media (1)
-
The Pirate Bay from Belgium
April 1, 2013
Updated: April 2013
Language: French
Type: Image
Other articles (36)
-
Sites built with MediaSPIP
May 2, 2011. This page presents some of the sites running MediaSPIP.
You can of course add your own using the form at the bottom of the page. -
HTML5 audio and video support
April 10, 2011. MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flowplayer Flash player is used.
The HTML5 player was created specifically for MediaSPIP: it is fully skinnable to match a chosen theme.
These technologies make it possible to deliver video and audio both to conventional computers (...) -
HTML5 audio and video support
April 13, 2011. MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers, the Flowplayer Flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...)
On other sites (2986)
-
Revision 32595: with the plugin version number
November 1, 2009, by fil@… — Log: with the plugin version number
-
Inside WebM Technology: The VP8 Alternate Reference Frame
June 15, 2010, by noreply@blogger.com (John Luther) — inside webm, vp8
Since the WebM project was open-sourced just a week ago, we’ve seen blog posts and articles about its capabilities. As an open project, we welcome technical scrutiny and contributions that improve the codec. We know from our extensive testing that VP8 can match or exceed other leading codecs, but to get the best results, it helps to understand more about how the codec works. In this first of a series of blog posts, I’ll explain some of the fundamental techniques in VP8, along with examples and metrics.
The alternate reference frame is one of the most exciting quality innovations in VP8. Let’s delve into how VP8 uses these frames to improve prediction and thereby overall video quality.
Alternate Reference Frames in VP8
VP8 uses three types of reference frames for inter prediction: the last frame, a "golden" frame (one frame's worth of decompressed data from the arbitrarily distant past) and an alternate reference frame. Overall, this design has a much smaller memory footprint on both encoders and decoders than designs with many more reference frames. In video compression it is very rare for more than three reference frames to provide significant quality benefit, while the undesirable increase in memory footprint from the extra frames is substantial.
Unlike other types of reference frames used in video compression, which are displayed to the user by the decoder, the VP8 alternate reference frame is decoded normally but is never shown to the user. It is used solely as a reference to improve inter prediction for other coded frames. Because alternate reference frames are not displayed, VP8 encoders can use them to transmit any data that are helpful to compression. For example, a VP8 encoder can construct one alternate reference frame from multiple source frames, or it can create an alternate reference frame using different macroblocks from hundreds of different video frames.
The current VP8 implementation enables two different types of usage for the alternate reference frame: noise-reduced prediction and past/future directional prediction.
Noise-Reduced Prediction
The alternate reference frame is transmitted and decoded like any other frame, so its use adds no extra computation on the decoding side. The VP8 encoder, however, is free to use more sophisticated processing to create it during offline encoding. One application of the alternate reference frame is noise-reduced prediction: the VP8 encoder uses multiple input source frames to construct one reference frame through temporal or spatial noise filtering. This "noise-free" alternate reference frame is then used to improve prediction when encoding subsequent frames.
You can make use of this feature by setting ARNR parameters in VP8 encoding, where ARNR stands for "Alternate Reference Noise Reduction." A sample two-pass encoding setting with the parameters:
--arnr-maxframes=5 --arnr-strength=3
enables the encoder to use up to 5 consecutive input source frames to produce one alternate reference frame, using a filtering strength of 3. Here is an example showing the quality benefit of using this experimental ARNR feature on the standard test clip "Hall Monitor." (Each line on the graph represents the quality of an encoded stream on a given clip at multiple datarates; higher points on the Y axis (PSNR) indicate better quality.)
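As a rough mental model of what the encoder builds, here is a toy numpy sketch; build_arnr_reference is a hypothetical name, and the real libvpx filter also motion-compensates each frame and weights pixels adaptively rather than taking a plain mean:

    import numpy as np

    def build_arnr_reference(frames):
        # Toy stand-in for ARNR: blend N consecutive source frames
        # (N = --arnr-maxframes) into one low-noise alternate reference.
        # libvpx additionally motion-compensates each frame toward the
        # center frame and uses --arnr-strength to down-weight pixels
        # that differ too much; a plain temporal mean shows the idea.
        stack = np.stack([f.astype(np.float64) for f in frames])
        return np.clip(stack.mean(axis=0) + 0.5, 0, 255).astype(np.uint8)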
The only difference between the two curves is that VP8_ARNR was produced by encodings with ARNR parameters and VP8_NO_ARNR was not. As the graph shows, noise-reduced prediction is very helpful to compression quality when encoding noisy sources. We’ve just started to explore this idea but have already seen strong improvements on noisy input clips similar to "Hall Monitor," and we feel there’s a lot more we can do in this area.
Improving Prediction without B Frames
The lack of B frames in VP8 has sparked some discussion about its ability to achieve competitive compression efficiency. VP8 encoders, however, can make intelligent use of the golden reference and the alternate reference frames to compensate for this. The VP8 encoder can choose to transmit an alternate reference frame similar to a "future" frame, and encoding of subsequent frames can then make use of information from the past (last frame and golden frame) and from the future (alternate reference frame). Effectively, this lets the encoder achieve results similar to bidirectional (B frame) prediction without requiring frame reordering in the decoder. In two-pass encoding mode, the VP8 encoder can improve compression by using encoding parameters that enable lagged encoding and automatic placement of alternate reference frames:
--auto-alt-ref=1 --lag-in-frames=16
Used this way, the VP8 encoder can achieve improved prediction and compression efficiency without increasing the decoder’s complexity.
In the video compression community, "Mobile and calendar" is known as a clip that benefits significantly from the use of B frames. The graph above illustrates that the alternate reference frame benefits VP8 significantly without using B frames.
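As a sketch of why this mimics bidirectional prediction: once the alternate reference has been built from lagged (future) input frames, each block of the current frame can predict from the past or the future, whichever matches better. The following toy selector is hypothetical, not VP8’s actual rate-distortion search:

    import numpy as np

    def pick_reference(block, candidates):
        # For one block being encoded, choose whichever reference
        # (last, golden, or the future-aligned alternate reference)
        # predicts it with the lowest SAD. A real encoder also runs
        # a motion search per reference and weighs bit costs.
        def sad(a, b):
            return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())
        return min(candidates, key=lambda name: sad(block, candidates[name]))

    # candidates = {"last": ..., "golden": ..., "altref": ...}
    # With --lag-in-frames, "altref" can be built from frames that come
    # after the current one, giving B-frame-like two-sided prediction
    # without any frame reordering in the decoder.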
Keep an eye on this blog for more posts about VP8 encoding. You can find more information on the encoding parameters above, and other detailed instructions for using our VP8 encoders, on our site, or join our discussion list.
Yaowu Xu, Ph.D. is a codec engineer at Google.
-
The problems with wavelets
I have periodically noted in this blog and elsewhere various problems with wavelet compression, but many readers have requested that I write a more detailed post about it, so here it is.
Wavelets have been researched for quite some time as a replacement for the standard discrete cosine transform used in most modern video compression. Their methodology is basically the opposite: each coefficient in a DCT represents a constant pattern applied to the whole block, while each coefficient in a wavelet transform represents a single, localized pattern applied to a section of the block. Accordingly, wavelet transforms are usually very large, with the intention of taking advantage of large-scale redundancy in an image. DCTs are usually quite small and are intended to cover areas of roughly uniform patterns and complexity.
Both are complete transforms, offering equally accurate frequency-domain representations of pixel data. I won’t go into the mathematical details of each here; the real question is whether one offers better compression opportunities for real-world video.
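To make the structural difference concrete, here is a small sketch using scipy and PyWavelets (library choices are mine, purely illustrative): the DCT is applied per 8x8 block, while the wavelet transform decomposes the whole image at once into localized, multi-scale coefficients.

    import numpy as np
    from scipy.fft import dctn   # type-II 2D DCT
    import pywt                  # PyWavelets

    img = np.random.rand(256, 256)        # stand-in for a luma plane

    # DCT: transform each 8x8 block independently; each coefficient
    # is a cosine pattern spanning its entire block.
    blocks = img.reshape(32, 8, 32, 8).swapaxes(1, 2)   # (32, 32, 8, 8)
    dct_coeffs = dctn(blocks, axes=(-2, -1), norm="ortho")

    # Wavelet: one transform over the whole image; each coefficient
    # is a localized pattern at some position and scale.
    wavelet_coeffs = pywt.wavedec2(img, "haar", level=5)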
Though it isn’t mathematically required, DCTs are usually applied as block transforms, handling a single sharp-edged block of data. Accordingly, they usually need a deblocking filter to smooth the edges between DCT blocks. Wavelet transforms typically overlap, avoiding such a need. But because wavelets don’t cover a sharp-edged block of data, they don’t compress well when the predicted data is in the form of blocks.
Thus motion compensation is usually performed as overlapped-block motion compensation (OBMC), in which every pixel is calculated by performing the motion compensation of a number of blocks and averaging the results based on the distance of those blocks from the current pixel. Another option, which can be combined with OBMC, is "mesh MC", where every pixel gets its own motion vector, which is a weighted average of the closest nearby motion vectors. The end result of either is the elimination of sharp edges between blocks and better prediction, at the cost of greatly increased CPU requirements. For an overlap factor of 2, it’s 4 times the amount of motion compensation, plus the averaging step. With mesh MC it’s even worse, with SIMD optimizations becoming nearly impossible.
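A minimal sketch of the OBMC averaging step may help (not Snow’s or Dirac’s exact window or layout; obmc_blend is a hypothetical name):

    import numpy as np

    def obmc_blend(predictions, positions, window, out_shape):
        # Each block's motion-compensated prediction is pasted at its
        # position, weighted by a window that falls off toward the block
        # edges; overlapping contributions are summed and renormalized.
        # With 2x overlap, every output pixel is covered by four blocks,
        # hence roughly 4x the motion compensation work.
        acc = np.zeros(out_shape)
        wsum = np.zeros(out_shape)
        for pred, (y, x) in zip(predictions, positions):
            h, w = pred.shape
            acc[y:y+h, x:x+w] += window * pred
            wsum[y:y+h, x:x+w] += window
        return acc / np.maximum(wsum, 1e-9)

    # A separable raised-cosine window is a typical choice:
    n = 16                                  # block size, 2x overlap
    w1 = np.hanning(n + 2)[1:-1]            # avoid exact zeros at edges
    window = np.outer(w1, w1)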
At this point, it would seem wavelets have pretty big advantages: when used with OBMC, they give better inter prediction, eliminate the need for deblocking, and take advantage of larger-scale correlations. Why, then, hasn’t everyone switched over to wavelets? Dirac and Snow offer modern implementations. Yet despite decades of research, wavelets have consistently disappointed for image and video compression. It turns out there are a lot of serious practical issues with wavelets, many of which are open problems.
1. No known method exists for efficient intra coding. H.264’s spatial intra prediction is extraordinarily powerful, but it relies on knowing the exact decoded pixels to the top and left of the current block. Since there is no such boundary in overlapped-wavelet coding, such prediction is impossible. Newer intra prediction methods, such as Markov-chain intra prediction, also seem to require an H.264-like situation with exactly known neighboring pixels. Intra coding in wavelets is in the same state that DCT intra coding was in 20 years ago: the best known method is to simply transform the block with no prediction at all besides DC. NB: as described by Pengvado in the comments, the switching between inter and intra coding is potentially even more costly than the inefficient intra coding.
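For contrast, here is what H.264-style 4x4 spatial intra prediction looks like when the boundary pixels are exactly known; the point is that every mode is a pure function of already-decoded neighbors, which overlapped wavelets cannot supply (simplified sketch; H.264 has more modes, including angular ones):

    import numpy as np

    def intra_predict_4x4(top, left, mode):
        # 'top' and 'left' are the 4 already-decoded pixels directly
        # above and to the left of the current block.
        if mode == "vertical":              # copy the row above downward
            return np.tile(top, (4, 1))
        if mode == "horizontal":            # copy the left column across
            return np.tile(left.reshape(4, 1), (1, 4))
        # DC: flat block at the rounded mean of the 8 neighbors
        return np.full((4, 4), (int(top.sum()) + int(left.sum()) + 4) // 8)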
2. Mixing partition sizes has serious practical problems. Because the overlap between two motion partitions depends on the partitions’ size, mixing block sizes becomes quite difficult to define. While in H.264 a smaller partition always gives equal or better compression than a larger one (ignoring the extra overhead), with OBMC it is actually possible for a larger partition to win due to the larger overlap. All of this makes it very difficult both to define the result of mixed block sizes and to make decisions about them.
Both Snow and Dirac offer variable block size, but the overlap amount is constant; larger blocks serve only to save bits on motion vectors, not to offer better overlap characteristics.
3. Lack of spatial adaptive quantization. As shown in x264 with VAQ, and correspondingly in HCEnc’s implementation and Theora’s recent implementation, spatial adaptive quantization has staggeringly impressive (before, after) effects on visual quality. Only Dirac seems to have such a feature, and its encoder doesn’t even use it. No other wavelet formats (Snow, JPEG2K, etc.) seem to have such a feature. This results in serious blurring problems in areas with subtle texture (as in the comparison below).
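The idea behind variance-based adaptive quantization is simple to sketch; the constants below are hypothetical, not x264’s actual VAQ tuning:

    import numpy as np

    def aq_offsets(frame, strength=1.0):
        # Measure the AC energy (variance) of each 16x16 macroblock
        # and return a per-MB quantizer delta: negative (finer) in
        # flat regions, positive (coarser) in busy ones, so subtle
        # textures aren't the first thing quantized away.
        h, w = (frame.shape[0] // 16) * 16, (frame.shape[1] // 16) * 16
        mbs = frame[:h, :w].reshape(h // 16, 16, w // 16, 16)
        var = mbs.swapaxes(1, 2).reshape(h // 16, w // 16, 256).var(axis=2)
        return strength * (np.log2(var + 1) - 14)  # 14 is a made-up midpoint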
4. Wavelets don’t seem to code visual energy effectively. Remember that a single coefficient in a DCT represents a pattern which applies across an entire block: this makes it very easy to create apparent "detail" with a DCT. Furthermore, the sharp edges of DCT blocks, despite being an apparent weakness, often result in a "fake sharpness" that can actually improve the visual appearance of videos, as was seen with Xvid. Thus wavelet codecs have a tendency to look much blurrier than DCT-based codecs, but since PSNR likes blur, this is often seen as a benefit during video compression research. Some of the consequences of these factors can be seen in this comparison; it is somewhat outdated and not general-case, but it very effectively shows the difference in how wavelets handle sharp edges and subtle textures.
Another problem that periodically crops up is the visual aliasing that tends to be associated with wavelets at lower bitrates. Standard wavelets effectively consist of a recursive function that upscales the coefficients coded by the previous level by a factor of 2 and then adds a new set of coefficients. If the upscaling algorithm is naive — as it often is, for the sake of speed — the result can look quite ugly, as if parts of the image were coded at a lower resolution and then badly scaled up. Of course, it looks like that because they were coded at a lower resolution and then badly scaled up.
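A toy reconstruction loop shows where that artifact comes from; everything here is schematic:

    import numpy as np

    def reconstruct(levels):
        # 'levels[0]' is the coarsest approximation band; each later
        # entry holds the detail coefficients for the next scale up.
        img = levels[0]
        for detail in levels[1:]:
            # Naive nearest-neighbor 2x upscale, chosen for speed.
            img = img.repeat(2, axis=0).repeat(2, axis=1)
            img = img + detail
        # If the detail bands are starved of bits at low bitrates,
        # the output is literally a badly upscaled low-res image.
        return img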
JPEG2000 is a classic example of wavelet failure: despite having more advanced entropy coding, being designed much later than JPEG, being much more computationally intensive, and having much better PSNR, comparisons have consistently shown it to be visually worse than JPEG at sane filesizes. Here’s an example from Wikipedia. By comparison, H.264’s intra coding, when used for still image compression, can beat JPEG by a factor of 2 or more (I’ll make a post on this later). With the various advancements in DCT intra coding since H.264, I suspect that a state-of-the-art DCT compressor could win by an even larger factor.
Despite the promised benefits of wavelets, a wavelet encoder even close to competitive with x264 has yet to be created. With some tests even showing Dirac losing to Theora in visual comparisons, it’s clear that many problems remain to be solved before wavelets can eliminate the ugliness of block-based transforms once and for all.