Recherche avancée

Recherche
Choix de la période de publication
Date minimale :

Date maximale :

Type de date :
Choix de la langue
Choix du type de média
Choix de la rubrique
Choix de la licence de publication
Choix de l’auteur

Médias (0)

Mot : - Tags -/gis

Aucun média correspondant à vos critères n’est disponible sur le site.

Autres articles (91)

Le profil des utilisateurs

12 avril 2011, par kent1

Chaque utilisateur dispose d’une page de profil lui permettant de modifier ses informations personnelle. Dans le menu de haut de page par défaut, un élément de menu est automatiquement créé à l’initialisation de MediaSPIP, visible uniquement si le visiteur est identifié sur le site.
L’utilisateur a accès à la modification de profil depuis sa page auteur, un lien dans la navigation "Modifier votre profil" est (...)
Configurer la prise en compte des langues

15 novembre 2010, par kent1

Accéder à la configuration et ajouter des langues prises en compte
Afin de configurer la prise en compte de nouvelles langues, il est nécessaire de se rendre dans la partie "Administrer" du site.
De là, dans le menu de navigation, vous pouvez accéder à une partie "Gestion des langues" permettant d’activer la prise en compte de nouvelles langues.
Chaque nouvelle langue ajoutée reste désactivable tant qu’aucun objet n’est créé dans cette langue. Dans ce cas, elle devient grisée dans la configuration et (...)
XMP PHP

13 mai 2011, par kent1

Dixit Wikipedia, XMP signifie :
Extensible Metadata Platform ou XMP est un format de métadonnées basé sur XML utilisé dans les applications PDF, de photographie et de graphisme. Il a été lancé par Adobe Systems en avril 2001 en étant intégré à la version 5.0 d’Adobe Acrobat.
Étant basé sur XML, il gère un ensemble de tags dynamiques pour l’utilisation dans le cadre du Web sémantique.
XMP permet d’enregistrer sous forme d’un document XML des informations relatives à un fichier : titre, auteur, historique (...)

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 31

Sur d’autres sites (6573)

Patent skullduggery : Tandberg rips off x264 algorithm

25 novembre 2010, par Dark Shikari — patents, ripoffs, x264

Update : Tandberg claims they came up with the algorithm independently : to be fair, I can actually believe this to some extent, as I think the algorithm is way too obvious to be patented. Of course, they also claim that the algorithm isn’t actually identical, since they don’t want to lose their patent application.

I still don’t trust them, but it’s possible it’s merely bad research (and thus being unaware of prior art) as opposed to anything malicious. Furthermore, word from within their office suggests they’re quite possibly being honest : supposedly the development team does not read x264 code at all. So this might just all be very bad luck.

Regardless, the patent is still complete tripe, and should never have been filed.

Most importantly, stop harassing the guy whose name is on the patent (Lars) : he’s just a programmer, not the management or lawyers responsible for filing the patent. This is stupid and unnecessary. I’ve removed the original post because of this ; it can be found here for those who want to read it.

Appendix : the details of the patent :

I figure I’ll go over the exact correspondence between the patent and my code here.

1. A method for calculating run and level representations of quantized transform coefficients representing pixel values included in a block of a video picture, the method comprising :

Translation : It’s a run-level coder.

packing, at a video processing apparatus, each quantized transform coefficients in a value interval [Max, Min] by setting all quantized transform coefficients greater than Max equal to Max, and all quantized transform coefficients less than Min equal to Min

The quantized coefficients are clipped to a certain valid range to allow them to be packed into bytes (they start as 16-bit values).

reordering, at the video processing apparatus, the quantized transform ID coefficients according to a predefined order depending on respective positions in the block resulting in an array C of reordered quantized transform coefficients

This is the zigzag pattern used in H.264 (and most formats) for reordering DCT coefficients. In x264, this is done before the run-level coder ste.

masking, at the video processing apparatus, C by generating an array M containing ones in positions corresponding to positions of C having non-zero values, and zeros in positions corresponding to positions of C having zero values

This is creating a bitmask based on the coefficient values, the pmovmskb step.

is generating, at the video processing apparatus, for each position containing a one in M, a run and a level representation by setting the level value equal to an occurring value in a corresponding position of C ; and setting, at the video processing apparatus, for each position containing a one in M₅ the run value equal to the number of proceeding positions relative to a current position in M since a previous occurrence of one in M.

This is the process of creating run/level values from the bitmask.

Now into the detailed claims :

2. The method according to Claim 1, wherein the masking further includes, creating an array C from C where positions corresponding to positions of nonzero values in C are filled with ones, and positions corresponding to positions of zero values in C are filled with zeros, and creating M from C by extracting the most significant bit from values in respective position of C and inserting the bits in corresponding positions in M.

They’re extracting the most significant bit of the values to create a bitmask. This is exactly what the pmovmskb in my algorithm does.

3. The method according to Claim 2, wherein the creating of the array C is executed by a C++ function PCMPGTB, and the creating of M from C is executed by a C++ function PMOVMSKB.

And here they use pcmpgtb (they call it a C++ function for some reason, but it’s a SSE instruction) to do the clipping of the input values. This is exactly the same method I used in decimate_score. They also use pmovmskb as mentioned.

4. The method according to Claim 1 , wherein the generating of the run and level representation further includes determining positions containing non-zero values in C by corresponding positions containing ones in M.

5. The method according to Claim 4, wherein the determining of positions containing non-zero values in C is executed by a C++ function BSF.

Here they iterate over the bitmask of transform coefficients using a “BSF” function to find runs, which is exactly what I did. Of course, BSF isn’t a function, it’s an x86 instruction.

6. The method according to Claim 1 , wherein Max is 256 and Min is 0.

This is almost surely a typo or mistake of some sort. They mean the Max should be 255, not 256 : 256 doesn’t fit in a uint8_t.

7. The method according to Claim 1 , wherein the predefined order follows a zigzag path of transform coefficient positions in the block starting in an upper left corner heading towards a lower right corner.

This is a description of the typical DCT zigzag pattern (like in H.264, MPEG-2, Theora, etc).

Everything after this part is just repeating itself with the phrase “an apparatus” added in order to make the USPTO listen to them.
VP8 Codec SDK "Aylesbury" Release

28 octobre 2010, par noreply@blogger.com (John Luther)
Today we’re making available "Aylesbury," our first named release of libvpx, the VP8 codec SDK. VP8 is the video codec used in WebM. Note that the VP8 specification has not changed, only the SDK.

What’s an Aylesbury ? It’s a breed of duck. We like ducks, so we plan to use duck-related names for each major libvpx release, in alphabetical order. Our goal is to have one named release of libvpx per calendar quarter, each with a theme.

You can download the Aylesbury libvpx release from our Downloads page or check it out of our Git repository and build it yourself. In the coming days Aylesbury will be integrated into all of the WebM project components (DirectShow filters, QuickTime plugins, etc.). We encourage anyone using our components to upgrade to the Aylesbury releases.

For Aylesbury the theme was faster decoder, better encoder. We used our May 19, 2010 launch release of libvpx as the benchmark. We’re very happy with the results (see graphs below) :
- 20-40% (average 28%) improvement in libvpx decoder speed
- Over 7% overall PSNR improvement (6.3% SSIM) in VP8 "best" quality encoding mode, and up to 60% improvement on very noisy, still or slow moving source video.
The main improvements to the decoder are :
- Single-core assembly "hot spot" optimizations, including improved vp8_sixtap_predict() and SSE2 loopfilter functions
- Threading improvements for more efficient use of multiple processor cores
- Improved memory handling and reduced footprint
- Combining IDCT and reconstruction steps
- SSSE3 usage in functions where appropriate
On the encoder front, we concentrated on clips in the 30-45 dB range and saw the biggest gains in higher-quality source clips (greater that 38 dB), low to medium-motion clips, and clips with noisy source material. Many code contributions made this possible, but a few of the highlights were :
- Adaptive width and strength alternate reference frame noise suppression filter with optional motion compensation.
- Transform improvements (improved accuracy and reduction in round trip error)
- Trellis-based quantized coefficient optimization
- Two-pass rate control and quantizer changes
- Rate distortion changes
- Zero bin and rounding changes
- Work on MB-level quality control and bit allocation
We’re targeting Q1 2011 for the next named libvpx release, which we’re calling Bali. The theme for that release will be faster encoder. We are constantly working on improvements to video quality in the encoder, so after Aylesbury we won’t tie that work to specific named releases.

WebM at Streaming Media West

Members of the WebM project will discuss Aylesbury during a session at the Streaming Media West conference on November 3rd (session C203 : WebM Open Video Project Update). For more information, visit www.streamingmedia.com/west.

John Luther is Product Manager of the WebM Project.
H.264 and VP8 for still image coding : WebP ?

1er octobre 2010, par Dark Shikari — google, H.264, psychovisual optimizations, VP8

Update : post now contains a Theora comparison as well ; see below.

JPEG is a very old lossy image format. By today’s standards, it’s awful compression-wise : practically every video format since the days of MPEG-2 has been able to tie or beat JPEG at its own game. The reasons people haven’t switched to something more modern practically always boil down to a simple one — it’s just not worth the hassle. Even if JPEG can be beaten by a factor of 2, convincing the entire world to change image formats after 20 years is nigh impossible. Furthermore, JPEG is fast, simple, and practically guaranteed to be free of any intellectual property worries. It’s been tried before : JPEG-2000 first, then Microsoft’s JPEG XR, both tried to unseat JPEG. Neither got much of anywhere.

Now Google is trying to dump yet another image format on us, “WebP”. But really, it’s just a VP8 intra frame. There are some obvious practical problems with this new image format in comparison to JPEG ; it doesn’t even support all of JPEG’s features, let alone many of the much-wanted features JPEG was missing (alpha channel support, lossless support). It only supports 4:2:0 chroma subsampling, while JPEG can handle 4:2:2 and 4:4:4. Google doesn’t seem interested in adding any of these features either.

But let’s get to the meat and see how these encoders stack up on compressing still images. As I explained in my original analysis, VP8 has the advantage of H.264′s intra prediction, which is one of the primary reasons why H.264 has such an advantage in intra compression. It only has i4x4 and i16x16 modes, not i8x8, so it’s not quite as fancy as H.264′s, but it comes close.

The test files are all around 155KB ; download them for the exact filesizes. For all three, I did a binary search of quality levels to get the file sizes close. For x264, I encoded with --tune stillimage --preset placebo. For libvpx, I encoded with --best. For JPEG, I encoded with ffmpeg, then applied jpgcrush, a lossless jpeg compressor. I suspect there are better JPEG encoders out there than ffmpeg ; if you have one, feel free to test it and post the results. The source image is the 200th frame of Parkjoy, from derf’s page (fun fact : this video was shot here ! More info on the video here.).

Files : (x264 [154KB], vp8 [155KB], jpg [156KB])

Results (decoded to PNG) : (x264, vp8, jpg)

This seems rather embarrassing for libvpx. Personally I think VP8 looks by far the worst of the bunch, despite JPEG’s blocking. What’s going on here ? VP8 certainly has better entropy coding than JPEG does (by far !). It has better intra prediction (JPEG has just DC prediction). How could VP8 look worse ? Let’s investigate.

VP8 uses a 4×4 transform, which tends to blur and lose more detail than JPEG’s 8×8 transform. But that alone certainly isn’t enough to create such a dramatic difference. Let’s investigate a hypothesis — that the problem is that libvpx is optimizing for PSNR and ignoring psychovisual considerations when encoding the image… I’ll encode with --tune psnr --preset placebo in x264, turning off all psy optimizations.

Files : (x264, optimized for PSNR [154KB]) [Note for the technical people : because adaptive quantization is off, to get the filesize on target I had to use a CQM here.]

Results (decoded to PNG) : (x264, optimized for PSNR)

What a blur ! Only somewhat better than VP8, and still worse than JPEG. And that’s using the same encoder and the same level of analysis — the only thing done differently is dropping the psy optimizations. Thus we come back to the conclusion I’ve made over and over on this blog — the encoder matters more than the video format, and good psy optimizations are more important than anything else for compression. libvpx, a much more powerful encoder than ffmpeg’s jpeg encoder, loses because it tries too hard to optimize for PSNR.

These results raise an obvious question — is Google nuts ? I could understand the push for “WebP” if it was better than JPEG. And sure, technically as a file format it is, and an encoder could be made for it that’s better than JPEG. But note the word “could”. Why announce it now when libvpx is still such an awful encoder ? You’d have to be nuts to try to replace JPEG with this blurry mess as-is. Now, I don’t expect libvpx to be able to compete with x264, the best encoder in the world — but surely it should be able to beat an image format released in 1992 ?

Earth to Google : make the encoder good first, then promote it as better than the alternatives. The reverse doesn’t work quite as well.

Addendum (added Oct. 2, 03:51) :

maikmerten gave me a Theora-encoded image to compare as well. Here’s the PNG and the source (155KB). And yes, that’s Theora 1.2 (Ptalarbvorm) beating VP8 handily. Now that is embarassing. Guess what the main new feature of Ptalarbvorm is ? Psy optimizations…

Addendum (added Apr. 20, 23:33) :

There’s a new webp encoder out, written from scratch by skal (available in libwebp). It’s significantly better than libvpx — not like that says much — but it should probably beat JPEG much more readily now. The encoder design is rather unique — it basically uses K-means for a large part of the encoding process. It still loses to x264, but that was expected.

[155KB]