
Other articles (67)

  • (De)Activating features (plugins)

    18 February 2011, by

    To manage the addition and removal of extra features (plugins), MediaSPIP uses SVP as of version 0.2.
    SVP makes it easy to activate plugins from the MediaSPIP configuration area.
    To reach it, simply go to the configuration area and open the "Plugin management" page.
    MediaSPIP ships by default with the full set of so-called "compatible" plugins; they have been tested and integrated so as to work perfectly with each (...)

  • Enabling visitor registration

    12 April 2011, by

    It is also possible to enable visitor registration, which lets anyone open an account on the channel in question, for example in the context of open projects.
    To do so, go to the site configuration area and choose the "User management" submenu. The first form shown corresponds to this feature.
    By default, MediaSPIP created during its initialization a menu item in the top menu of the page leading (...)

  • Writing a news item

    21 June 2013, by

    Present the changes in your MediaSPIP, or news about your projects, using your MediaSPIP's news section.
    In MediaSPIP's default theme, spipeo, news items are displayed at the bottom of the main page, below the editorials.
    You can customize the form used to create a news item.
    News item creation form: for a document of type news item, the default fields are: Publication date (customize the publication date) (...)

On other sites (9294)

  • The Big VP8 Debug

    20 November 2010, by Multimedia Mike — VP8

    I hope my previous walkthrough of the VP8 4x4 intra coding process was educational. Today, I’ll be walking through an example of what happens when my toy VP8 encoder encodes an intra 16x16 block. This may prove educational to those who have never been exposed to the deep details of this or related algorithms. Also, I wanted to illustrate where I think my VP8 encoder process is going bad and generating such grotesque results.

    Before I start, let me give a shout-out to Google Docs’ Drawing tool which I used to generate these diagrams. It works quite well.

    Results

    (Always cut to the chase in a blog post; results first.) I’m glad I composed this post. In the course of doing so, I found the problem, fixed it, and am now able to present this image that was decoded from the bitstream encoded by my working toy VP8 encoder:



    Yeah, I know that image doesn’t look like anything you haven’t seen before. The difference is that it has made a successful trip through my VP8 encoder.

    Follow along through the encoding process and learn of the mistake...

    Original Block and Subblocks

    Here is the 16x16 block to be encoded:



    The block is broken down into 16 4x4 subblocks for further encoding:



    Prediction

    The first step is to pick a prediction mode, generate a prediction block, and subtract the predictors from the macroblock. In this case, we will use DC prediction which means the predictor will be the same for each element.

    In 4x4 VP8 DC intra prediction, samples outside of the frame are assumed to be 128. It’s a little different in 16x16 DC intra prediction— samples above the top row are assumed to be 127 while samples left of the leftmost column are assumed to be 129. For the top left macroblock, this still works out to 128.
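    To make those edge numbers concrete, here is a minimal sketch of a 16x16 DC predictor under these assumptions. This is my own illustration, not spec or libvpx code; real implementations special-case missing edges rather than synthesizing them, but the arithmetic below shows why the top-left macroblock works out to 128:

    /* Sketch: 16x16 DC predictor with VP8's frame-edge assumptions. */
    static int dc_predict_16x16(const unsigned char *above, /* NULL at top edge */
                                const unsigned char *left)  /* NULL at left edge */
    {
        int i, sum = 0;
        for (i = 0; i < 16; i++)
            sum += above ? above[i] : 127;  /* above the top row: 127 */
        for (i = 0; i < 16; i++)
            sum += left ? left[i] : 129;    /* left of the leftmost column: 129 */
        /* Top-left macroblock: (16*127 + 16*129 + 16) >> 5 == 128 */
        return (sum + 16) >> 5;             /* rounded average of 32 samples */
    }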

    Subtract 128 from each of the samples:



    Forward Transform

    Run each of the 16 prediction-removed subblocks through the forward transform. This example uses the forward transform from libvpx 0.9.5:



    I have highlighted the DC coefficients in each subblock. That’s because those receive special consideration in 16x16 intra coding.
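    As a sketch of how this step might be driven over the macroblock (the driver loop is mine; the function name and signature are libvpx’s, where pitch is the input row stride in bytes):

    extern void vp8_short_fdct4x4_c(short *input, short *output, int pitch);

    /* Transform all 16 4x4 subblocks of a 16x16 residual macroblock.
     * residual is stored row-major, 16 shorts per row; coeffs[b] holds
     * the 16 transform coefficients of subblock b (coeffs[b][0] is DC). */
    void fdct_macroblock(short residual[16][16], short coeffs[16][16])
    {
        int r, c;
        for (r = 0; r < 4; r++)
            for (c = 0; c < 4; c++)
                vp8_short_fdct4x4_c(&residual[4 * r][4 * c],
                                    coeffs[4 * r + c],
                                    16 * sizeof(short)); /* 32-byte pitch */
    }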

    Quantization

    The Y plane AC quantizer is 4 in this example, the minimum allowed. (The Y plane DC quantizer is also 4 but doesn’t come into play for intra 16x16 coding since the DC coefficients follow a different process.) Thus, quantize (integer divide) each AC element in each subblock (we’ll ignore the DC coefficient for this part):



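    In code, this quantization step is just a truncating integer division per AC coefficient. A minimal sketch, simplified relative to libvpx (which also applies rounding and zero-bin logic):

    /* Quantize the 15 AC coefficients of one subblock; the DC
     * coefficient (index 0) is deferred to the Y2 process below. */
    void quantize_ac(short coeffs[16], int q_ac) /* q_ac == 4 here */
    {
        int i;
        for (i = 1; i < 16; i++)
            coeffs[i] /= q_ac;  /* C integer division truncates toward zero */
    }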
    The Y2 Round Trip

    Those highlighted DC coefficients from each of the 16 subblocks comprise the Y2 block. This block is transformed with a slightly different algorithm called the Walsh-Hadamard Transform (WHT). The results of this transform are then quantized (using 8 for both Y2 DC and AC in this example, as those are the smallest Y2 quantizers that VP8 allows), then zigzagged and entropy-coded along with the rest of the macroblock coefficients.

    On the decoder side, the Y2 coefficients are decoded, de-zigzagged, dequantized and run through the inverse WHT.
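    The whole Y2 round trip can be sketched in code as follows. The driver and the simple divide/multiply quantization are mine; the WHT function names and signatures follow libvpx, and the zigzag is omitted since it cancels out over a round trip:

    extern void vp8_short_walsh4x4_c(short *input, short *output, int pitch);
    extern void vp8_short_inv_walsh4x4_c(short *input, short *output);

    /* Round-trip the 16 DC coefficients through the Y2 process.
     * q_dc and q_ac are the Y2 quantizers (both 8 in this example). */
    void y2_round_trip(short y2_in[16], short y2_out[16], int q_dc, int q_ac)
    {
        short coeffs[16];
        int i;

        vp8_short_walsh4x4_c(y2_in, coeffs, 4 * sizeof(short));

        coeffs[0] /= q_dc;                /* quantize...       */
        for (i = 1; i < 16; i++)
            coeffs[i] /= q_ac;
        coeffs[0] *= q_dc;                /* ...and dequantize */
        for (i = 1; i < 16; i++)
            coeffs[i] *= q_ac;

        vp8_short_inv_walsh4x4_c(coeffs, y2_out); /* spec-defined inverse */
    }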

    And this is where I suspect that most of the error is creeping into my VP8 encoder. Observe the round-trip through the Y2 process:



    As intimated, this part causes me consternation due to the wide discrepancy between the original and the reconstructed Y2 blocks. Observe the absolute difference between the 2 vectors:



    That’s really significant and leads me to believe that this is where the big problem is.

    What’s Wrong?

    My first suspicion is that the quantization is throwing off the process. I was disabused of this idea when I removed quantization from the equation and immediately reversed the transform:



    So perhaps there is a problem with the forward WHT. Just like with the usual subblock transform, the VP8 spec doesn’t define how to perform the forward WHT, only the inverse WHT. Do I need to audition different forward WHTs from various versions of libvpx, similar to what I did with the other transform? That doesn’t make a lot of sense— libvpx doesn’t seem to have so much trouble with basic encoding.

    The Punchline

    I reviewed the forward WHT code, the stuff that I plagiarized from libvpx 0.9.0. The function takes, among other parameters, a pitch value. There are 2 loops in the code. The first iterates through the rows of the input matrix— which I assumed was a 4x4 matrix. I was puzzled that during each iteration of the row loop, the input pointer was only being advanced by (pitch/2). I removed the division by 2 and the problem went away. I.e., the encoded image looks correct.

    What’s up with the (pitch/2), anyway? It seems that the encoder likes to pack 2 4x4 subblocks into an 8x4 block data structure. In fact, the forward DCTs in the libvpx encoder have the same artifact. Remember how I surveyed several variations of forward DCT from different versions of libvpx? The one that proved most accurate in that test was the one I had already modified to advance the input pointer properly. Fixing the other 2 candidates yields similar results:

    input:         92   91   89   86   91   90   88   86   89   89   89   88   89   87   88   93
    short 0.9.0: -311    6    2    0    0   11   -6    1    2   -3    3    0    0    0   -2    1
    inverse:       92   91   89   86   91   90   88   87   90   89   89   88   89   87   88   93
    fast 0.9.0:  -313    5    1    0    1   11   -6    1    3   -3    4    0    0    0   -2    1
    inverse:       91   91   89   86   90   90   88   86   89   89   89   88   89   87   88   93
    short 0.9.5: -312    7    1    0    1   12   -5    2    2   -3    3   -1    1    0   -2    1
    inverse:       92   91   89   86   91   90   88   86   89   89   89   88   89   87   88   93
    

    Code cribber beware!
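    In code terms, the row loop in question looks roughly like this (a sketch of the 0.9.0-era row pass from memory; the comments and layout notes are mine):

    /* Row pass of the forward WHT. pitch is in bytes, so the input
     * pointer advances pitch/2 shorts per row. libvpx packs two 4x4
     * subblocks side by side into an 8x4 structure and passes a pitch
     * matching that 8-short row stride; feed this a lone 4x4 block
     * without adjusting the pitch, and every row after the first is
     * read from the wrong place. */
    void walsh_row_pass(short *input, short *output, int pitch)
    {
        short *ip = input;
        short *op = output;
        for (int i = 0; i < 4; i++)
        {
            int a1 = ip[0] + ip[3];
            int b1 = ip[1] + ip[2];
            int c1 = ip[1] - ip[2];
            int d1 = ip[0] - ip[3];

            op[0] = (short)(a1 + b1);
            op[1] = (short)(c1 + d1);
            op[2] = (short)(a1 - b1);
            op[3] = (short)(d1 - c1);

            ip += pitch / 2;   /* the (pitch/2) in question */
            op += 4;
        }
    }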

    Corrected Y2 Round Trip

    Let’s look at that Y2 round trip one more time:



    And another look at the error between the original and the reconstruction :



    Better.

    Dequantization, Prediction, Inverse Transforms, and Reconstruction

    To be honest, now that I solved the major problem, I’m getting a little tired of making these pictures. Long story short, all elements of the original 16 subblocks are dequantized and their DC coefficients are filled in with the appropriate item from the reconstructed Y2 block. A base predictor block is generated (all 128 values in this case). And each Y block is run through the inverse transform and added to the predictor block. The following is the reconstruction:



    And if you compare that against the original luma macroblock (I don’t feel like doing it right now), you’ll find that it’s pretty close.
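    For completeness, a sketch of that dequantize/predict/inverse-transform sequence; the names and layout are mine, and inverse_dct4x4 is a hypothetical stand-in for whatever inverse 4x4 transform implementation is used:

    void inverse_dct4x4(const short *in, short *out); /* hypothetical helper */

    /* Reconstruct a 16x16 intra macroblock: dequantize each subblock's
     * AC terms, drop in its DC from the reconstructed Y2 block, inverse
     * transform, add the predictor (a flat 128 here), and clamp. */
    void reconstruct_mb(short coeffs[16][16], short y2_dc[16], int q_ac,
                        unsigned char recon[16][16])
    {
        int b, i, r, c;
        for (b = 0; b < 16; b++) {
            short deq[16], pixels[16];

            deq[0] = y2_dc[b];                 /* DC from the Y2 block */
            for (i = 1; i < 16; i++)
                deq[i] = coeffs[b][i] * q_ac;  /* dequantize AC terms  */

            inverse_dct4x4(deq, pixels);

            for (r = 0; r < 4; r++)
                for (c = 0; c < 4; c++) {
                    int v = 128 + pixels[4 * r + c];  /* add predictor   */
                    if (v < 0)   v = 0;
                    if (v > 255) v = 255;             /* clamp to 8 bits */
                    recon[4 * (b / 4) + r][4 * (b % 4) + c] = (unsigned char)v;
                }
        }
    }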

    I can’t believe how close I was all this time, and how long that pitch bug held me up.

  • Inside WebM Technology: VP8 Intra and Inter Prediction

    20 July 2010, by noreply@blogger.com (Lou Quillio)
    Continuing our series on WebM technology, I will discuss the use of prediction methods in the VP8 video codec, with special attention to the TM_PRED and SPLITMV modes, which are unique to VP8.

    First, some background. To encode a video frame, block-based codecs such as VP8 first divide the frame into smaller segments called macroblocks. Within each macroblock, the encoder can predict redundant motion and color information based on previously processed blocks. The redundant data can be subtracted from the block, resulting in more efficient compression.

    Image by Fido Factor, licensed under Creative Commons Attribution License.
    Based on a work at www.flickr.com

    A VP8 encoder uses two classes of prediction:
    • Intra prediction uses data within a single video frame
    • Inter prediction uses data from previously encoded frames
    The residual signal data is then encoded using other techniques, such as transform coding.

    VP8 Intra Prediction Modes
    VP8 intra prediction modes are used with three types of macroblocks:
    • 4x4 luma
    • 16x16 luma
    • 8x8 chroma
    Four common intra prediction modes are shared by these macroblocks (the first three are sketched in code after this list):
    • H_PRED (horizontal prediction). Fills each column of the block with a copy of the left column, L.
    • V_PRED (vertical prediction). Fills each row of the block with a copy of the above row, A.
    • DC_PRED (DC prediction). Fills the block with a single value: the average of the pixels in the above row A and the left column L.
    • TM_PRED (TrueMotion prediction). A mode that gets its name from a compression technique developed by On2 Technologies. In addition to the row A and column L, TM_PRED uses the pixel P above and to the left of the block. Horizontal differences between pixels in A (starting from P) are propagated using the pixels from L to start each row.
    For 4x4 luma blocks, there are six additional intra modes similar to V_PRED and H_PRED, but corresponding to predicting pixels in different directions. These modes are outside the scope of this post, but if you want to learn more, see the VP8 Bitstream Guide.
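    For illustration, minimal sketches of the first three modes on a 4x4 block (my own code, not libvpx’s; A is the above row, L the left column):

    void h_pred(unsigned char b[4][4], const unsigned char L[4])
    {
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                b[i][j] = L[i];   /* copy the left column across each row */
    }

    void v_pred(unsigned char b[4][4], const unsigned char A[4])
    {
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                b[i][j] = A[j];   /* copy the above row down each column */
    }

    void dc_pred(unsigned char b[4][4], const unsigned char A[4],
                 const unsigned char L[4])
    {
        int sum = 0;
        for (int i = 0; i < 4; i++)
            sum += A[i] + L[i];
        unsigned char dc = (unsigned char)((sum + 4) >> 3); /* rounded mean */
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                b[i][j] = dc;     /* flat fill with the average */
    }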

    As mentioned above, the TM_PRED mode is unique to VP8. The following figure uses an example 4x4 block of pixels to illustrate how the TM_PRED mode works:
    Where C, As and Ls represent reconstructed pixel values from previously coded blocks, and X00 through X33 represent predicted values for the current block. TM_PRED uses the following equation to calculate Xij:

    Xij = Li + Aj - C (i, j=0, 1, 2, 3)

    Although the above example uses a 4x4 block, the TM_PRED mode for 8x8 and 16x16 blocks works in the same fashion.
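    As a sketch in code (mine, not libvpx’s), with the result clamped to the 8-bit pixel range as VP8 requires:

    /* TM_PRED for a 4x4 block: X[i][j] = L[i] + A[j] - C, clamped. */
    void tm_pred(unsigned char X[4][4], const unsigned char A[4],
                 const unsigned char L[4], unsigned char C)
    {
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++) {
                int v = L[i] + A[j] - C;
                if (v < 0)   v = 0;
                if (v > 255) v = 255;
                X[i][j] = (unsigned char)v;
            }
    }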
    TM_PRED is one of the more frequently used intra prediction modes in VP8, and for common video sequences it is typically used by 20% to 45% of all blocks that are intra coded. Overall, together with other intra prediction modes, TM_PRED helps VP8 to achieve very good compression efficiency, especially for key frames, which can only use intra modes (key frames by their very nature cannot refer to previously encoded frames).

    VP8 Inter Prediction Modes

    In VP8, inter prediction modes are used only on inter frames (non-key frames). For any VP8 inter frame, there are typically three previously coded reference frames that can be used for prediction. A typical inter prediction block is constructed using a motion vector to copy a block from one of the three frames. The motion vector points to the location of a pixel block to be copied. In most video compression schemes, a good portion of the bits are spent on encoding motion vectors; the portion can be especially large for video encoded at lower datarates.
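    The basic operation can be sketched as a block copy (illustration only; this shows full-pel motion, whereas real VP8 motion vectors have quarter-pel precision and interpolate sub-pel positions):

    /* Build a 16x16 inter predictor by copying a block from a reference
     * frame at an offset given by the motion vector (mv_row, mv_col). */
    void inter_predict_16x16(const unsigned char *ref, int ref_stride,
                             int row, int col, int mv_row, int mv_col,
                             unsigned char *pred, int pred_stride)
    {
        const unsigned char *src =
            ref + (row + mv_row) * ref_stride + (col + mv_col);
        for (int r = 0; r < 16; r++)
            for (int c = 0; c < 16; c++)
                pred[r * pred_stride + c] = src[r * ref_stride + c];
    }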

    Like previous VPx codecs, VP8 encodes motion vectors very efficiently by reusing vectors from neighboring macroblocks (a macroblock includes one 16x16 luma block and two 8x8 chroma blocks). VP8 uses a similar strategy in the overall design of inter prediction modes. For example, the prediction modes "NEAREST" and "NEAR" make use of last and second-to-last, non-zero motion vectors from neighboring macroblocks. These inter prediction modes can be used in combination with any of the three different reference frames.

    In addition, VP8 has a very sophisticated, flexible inter prediction mode called SPLITMV. This mode was designed to enable flexible partitioning of a macroblock into sub-blocks to achieve better inter prediction. SPLITMV is very useful when objects within a macroblock have different motion characteristics. Within a macroblock coded using SPLITMV mode, each sub-block can have its own motion vector. Similar to the strategy of reusing motion vectors at the macroblock level, a sub-block can also use motion vectors from neighboring sub-blocks above or to the left of the current block. This strategy is very flexible and can effectively encode any shape of sub-macroblock partitioning, and does so efficiently. Here is an example of a macroblock with 16x16 luma pixels that is partitioned into 16 4x4 blocks:


    where New represents a 4x4 block coded with a new motion vector, and Left and Above represent a 4x4 block coded using the motion vector from the left and above, respectively. This example effectively partitions the 16x16 macroblock into 3 different segments with 3 different motion vectors (represented below by 1, 2 and 3):


    Through effective use of intra and inter prediction modes, WebM encoder implementations can achieve great compression quality on a wide range of source material. If you want to delve further into VP8 prediction modes, read the VP8 Bitstream Guide or examine the reconintra.c and rdopt.c files in the VP8 source tree.

    Yaowu Xu, Ph.D. is a codec engineer at Google.

  • Inside WebM Technology: The VP8 Alternate Reference Frame

    15 June 2010, by noreply@blogger.com (John Luther) — inside webm, vp8

    Since the WebM project was open-sourced just a week ago, we’ve seen blog posts and articles about its capabilities. As an open project, we welcome technical scrutiny and contributions that improve the codec. We know from our extensive testing that VP8 can match or exceed other leading codecs, but to get the best results, it helps to understand more about how the codec works. In this first of a series of blog posts, I’ll explain some of the fundamental techniques in VP8, along with examples and metrics.

    The alternate reference frame is one of the most exciting quality innovations in VP8. Let’s delve into how VP8 uses these frames to improve prediction and thereby overall video quality.

    Alternate Reference Frames in VP8

    VP8 uses three types of reference frames for inter prediction: the last frame, a "golden" frame (one frame worth of decompressed data from the arbitrarily distant past) and an alternate reference frame. Overall, this design has a much smaller memory footprint on both encoders and decoders than designs with many more reference frames. In video compression, it is very rare for more than three reference frames to provide significant quality benefit, but the undesirable increase in memory footprint from the extra frames is substantial.

    Unlike other types of reference frames used in video compression, which are displayed to the user by the decoder, the VP8 alternate reference frame is decoded normally but is never shown to the user. It is used solely as a reference to improve inter prediction for other coded frames. Because alternate reference frames are not displayed, VP8 encoders can use them to transmit any data that are helpful to compression. For example, a VP8 encoder can construct one alternate reference frame from multiple source frames, or it can create an alternate reference frame using different macroblocks from hundreds of different video frames.

    The current VP8 implementation enables two different types of usage for the alternate reference frame : noise-reduced prediction and past/future directional prediction.

    Noise-Reduced Prediction

    The alternate reference frame is transmitted and decoded like other frames, so its usage does not add extra computation in decoding. The VP8 encoder, however, is free to use more sophisticated processing to create it in off-line encoding. One application of the alternate reference frame is noise-reduced prediction. In this application, the VP8 encoder uses multiple input source frames to construct one reference frame through temporal or spatial noise filtering. This "noise-free" alternate reference frame is then used to improve prediction for encoding subsequent frames.

    You can make use of this feature by setting ARNR parameters in VP8 encoding, where ARNR stands for "Alternate Reference Noise Reduction." A sample two-pass encoding setting with the parameters:

    --arnr-maxframes=5 --arnr-strength=3

    enables the encoder to use "5" consecutive input source frames to produce one alternate reference frame using a filtering strength of "3". Here is an example showing the quality benefit of using this experimental "ARNR" feature on the standard test clip "Hall Monitor." (Each line on the graph represents the quality of an encoded stream on a given clip at multiple datarates. Higher points on the Y axis (PSNR) indicate better quality.)


    The only difference between the two curves in the graph is that VP8_ARNR was produced by encodings with ARNR parameters and VP8_NO_ARNR was not. As we can see from the graph, noise-reduced prediction is very helpful to compression quality when encoding noisy sources. We’ve just started to explore this idea but have already seen strong improvements on noisy input clips similar to this "Hall Monitor." We feel there’s a lot more we can do in this area.
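    For reference, one plausible two-pass vpxenc invocation using these parameters; the file names are hypothetical and flag spellings can vary between libvpx releases, so treat this as a sketch:

    vpxenc --passes=2 --good --target-bitrate=500 \
           --arnr-maxframes=5 --arnr-strength=3 \
           -o hall_monitor_arnr.webm hall_monitor.y4m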

    Improving Prediction without B Frames

    The lack of B frames in VP8 has sparked some discussion about its ability to achieve competitive compression efficiency. VP8 encoders, however, can make intelligent use of the golden reference and the alternate reference frames to compensate for this. The VP8 encoder can choose to transmit an alternate reference frame similar to a "future" frame, and encoding of subsequent frames can make use of information from the past (last frame and golden frame) and from the future (alternate reference frame). Effectively, this helps the encoder to achieve results similar to bidirectional (B frame) prediction without requiring frame reordering in the decoder. Running in two-pass encoding mode, compression can be improved in the VP8 encoder by using encoding parameters that enable lagged encoding and automatic placement of alternate reference frames:

    --auto-alt-ref=1 --lag-in-frames=16

    Used this way, the VP8 encoder can achieve improved prediction and compression efficiency without increasing the decoder’s complexity:


    In the video compression community, "Mobile and calendar" is known as a clip that benefits significantly from the usage of B frames. The graph above illustrates that the use of the alternate reference frame benefits VP8 significantly without using B frames.
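    A corresponding sketch of a two-pass invocation with lagged encoding and automatic alternate reference placement, again with hypothetical file names:

    vpxenc --passes=2 --good --target-bitrate=500 \
           --auto-alt-ref=1 --lag-in-frames=16 \
           -o mobile_calendar_altref.webm mobile_calendar.y4m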

    Keep an eye on this blog for more posts about VP8 encoding. You can find more information on the above encoding parameters, and other detailed instructions for using our VP8 encoders, on our site, or join our discussion list.

    Yaowu Xu, Ph.D. is a codec engineer at Google.