
Other articles (111)
-
Supporting all media types
13 April 2011
Unlike most software and media-sharing platforms, MediaSPIP aims to manage as many different media types as possible. The following are just a few examples from an ever-expanding list of supported formats:
- images: png, gif, jpg, bmp and more
- audio: MP3, Ogg, Wav and more
- video: AVI, MP4, OGV, mpg, mov, wmv and more
- text, code and other data: OpenOffice, Microsoft Office (Word, PowerPoint, Excel), web (html, CSS), LaTeX, Google Earth and (...)
-
MediaSPIP version 0.1 Beta
16 April 2011
MediaSPIP 0.1 beta is the first version of MediaSPIP declared "usable".
The zip file provided here contains only the MediaSPIP sources, in standalone form.
To get a working installation, you must manually install all of the software dependencies on the server.
If you want to use this archive for a farm-mode installation, you will also need to make other modifications (...)
-
MediaSPIP v0.2
21 June 2013
MediaSPIP 0.2 is the first stable version of MediaSPIP.
Its official release date is 21 June 2013, and it is announced here.
The zip file provided here contains only the MediaSPIP sources, in standalone form.
As with the previous version, you must manually install all of the software dependencies on the server.
If you want to use this archive for a farm-mode installation, you will also need to make other modifications (...)
On other sites (2758)
-
VP8 : a retrospective
I’ve been working the past few weeks to help finish up the ffmpeg VP8 decoder, the first community implementation of On2’s VP8 video format. Now that I’ve written a thousand or two lines of assembly code and optimized a good bit of the C code, I’d like to look back at VP8 and comment on a variety of things, both good and bad, that slipped the net the first time, along with things that have changed since the time of that blog post.
These are less-so issues related to compression — that issue has been beaten to death, particularly in MSU’s recent comparison, where x264 beat the crap out of VP8 and the VP8 developers pulled a Pinocchio in the developer comments. But that was expected and isn’t particularly interesting, so I won’t go into that. VP8 doesn’t have to be the best in the world in order to be useful.
When the ffmpeg VP8 decoder is complete (just a few more asm functions to go), we’ll hopefully be able to post some benchmarks comparing it to libvpx.
1. The spec, er, I mean, bitstream guide.
Google has reneged on their claim that a spec existed at all and renamed it a “bitstream guide”. This was probably after it was found that it was not merely incomplete: at least a dozen places in the spec differed wildly from what was actually in their own encoder and decoder software! The deblocking filter, motion vector clamping, probability tables, and many more parts simply disagreed flat-out with the spec. Fortunately, Ronald Bultje, one of the main authors of the ffmpeg VP8 decoder, is rather skilled at reverse-engineering, so we were able to put together a matching implementation regardless.
Most of the differences aren’t particularly important (they don’t have a huge effect on compression or anything), but they make it vastly more difficult to implement a “working” VP8 decoder, or for that matter, decide what “working” really is. For example, Google’s decoder will, if told to “swap the ALT and GOLDEN reference frames”, overwrite both with GOLDEN, because it first sets GOLDEN = ALT, and then sets ALT = GOLDEN. Is this a bug? Or is this how it’s supposed to work? It’s hard to tell, since there isn’t a spec to say so. Google says that whatever libvpx does is right, but I doubt they intended this.
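The ordering problem is easy to reproduce outside any decoder. Here is a minimal Python sketch, not libvpx's actual code: the `refs` dictionary and both function names are made up for illustration, and the assignments are read C-style (left side receives the right side).

```python
def swap_as_described(refs):
    """Mimic the sequence described above: GOLDEN = ALT, then ALT = GOLDEN."""
    refs = dict(refs)
    refs["GOLDEN"] = refs["ALT"]   # GOLDEN's previous contents are lost here
    refs["ALT"] = refs["GOLDEN"]   # copies back the value just written: no swap
    return refs


def swap_with_simultaneous_assignment(refs):
    """A real swap: both right-hand sides are read before either slot is written."""
    refs = dict(refs)
    refs["GOLDEN"], refs["ALT"] = refs["ALT"], refs["GOLDEN"]
    return refs


frames = {"GOLDEN": "frame_g", "ALT": "frame_a"}
print(swap_as_described(frames))
print(swap_with_simultaneous_assignment(frames))
```

Whichever way the two assignments are read, the first function leaves both slots holding the same frame, which is the behaviour the paragraph above questions.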
I expect a spec will eventually be written, but it was a bit obnoxious of Google — both to the community and to their own developers — to release so early that they didn’t even have their own documentation ready.
2. The TM intra prediction mode.
One thing I glossed over in the original piece was that On2 had added an extra intra prediction mode to the standard batch that H.264 came with: they replaced Planar with “TM pred”. For i4x4, which didn’t have a Planar mode, they just added it without replacing an old one, resulting in a total of 10 modes to H.264’s 9. After understanding and writing assembly code for TM pred, I have to say that it is quite a cool idea. Here’s how it works:
1. Let us take a block of size 4×4, 8×8, or 16×16.
2. Define the pixels bordering the top of this block (starting from the left) as T[0], T[1], T[2]…
3. Define the pixels bordering the left of this block (starting from the top) as L[0], L[1], L[2]…
4. Define the pixel above the top-left of the block as TL.
5. Predict every pixel <X,Y> in the block to be equal to clip3( T[X] + L[Y] - TL, 0, 255 ).
It’s effectively a generalization of gradient prediction to the block level: predict each pixel based on the gradient between its top and left pixels, and the topleft. According to the VP8 devs, it’s chosen by the encoder quite a lot of the time, which isn’t surprising; it seems like a pretty good idea. As just one more intra pred mode, it’s not going to do magic for compression, but it’s a cool idea and elegantly simple.
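The five steps above translate almost directly into code. This is a plain Python sketch of the formula as stated, not the decoder's assembly. The function names are made up for illustration, and `clip3` takes its arguments in the order used in step 5 (value, low, high).

```python
def clip3(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(high, value))


def tm_predict(top, left, top_left):
    """Return an N x N block predicted from its neighbouring pixels.

    top      -- the N pixels bordering the top of the block, left to right (T)
    left     -- the N pixels bordering the left of the block, top to bottom (L)
    top_left -- the single pixel above and to the left of the block (TL)
    """
    size = len(top)
    return [
        [clip3(top[x] + left[y] - top_left, 0, 255) for x in range(size)]
        for y in range(size)
    ]


block = tm_predict(top=[100, 110, 120, 130], left=[90, 80, 70, 60], top_left=100)
for row in block:
    print(row)
```

With these inputs the first row is T shifted by L[0] - TL = -10, and every following row drops by a further 10: the horizontal gradient of the top edge and the vertical gradient of the left edge are both carried across the whole block.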
3. Performance and the deblocking filter.
On2 advertised for quite some time that VP8’s goal was to be significantly faster to decode than H.264. When I saw the spec, I waited for the punchline, but apparently they were serious. There’s nothing wrong with being of similar speed or a bit slower, but I was rather confused that their design didn’t match their stated goal at all. What apparently happened is that they had multiple profiles of VP8: high and low complexity profiles. They marketed the performance of the low complexity ones while touting the quality of the high complexity ones, which is a tad dishonest. More importantly though, practically nobody is using the low complexity modes, so anyone writing a decoder has to be prepared to handle the high complexity ones, which are the default.
The primary time-eater here is the deblocking filter. VP8, being an H.264 derivative, has much the same problem as H.264 does in terms of deblocking: it spends an absurd amount of time there. As I write this post, we’re about to finish some of the deblocking filter asm code, but before it’s committed, up to 70% or more of total decoding time is spent in the deblocking filter! Like H.264, it suffers from the 4×4 transform problem: a 4×4 transform requires a total of 8 length-16 and 8 length-8 loopfilter calls per macroblock, while Theora, with only an 8×8 transform, requires half that.
This problem is aggravated in VP8 by the fact that the deblocking filter isn’t strength-adaptive; if even one 4×4 block in a macroblock contains coefficients, every single edge has to be deblocked. Furthermore, the deblocking filter itself is quite complicated; the “inner edge” filter is a bit more complex than H.264’s and the “macroblock edge” filter is vastly more complicated, having two entirely different codepaths chosen on a per-pixel basis. Of course, in SIMD, this means you have to do both and mask them together at the end.
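The edge counts quoted above can be reproduced with a little arithmetic. The sketch below is my own reading of where the numbers come from, not something stated in the post. It assumes a 4:2:0 macroblock (one 16×16 luma plane and two 8×8 chroma planes) and counts one filtered edge per transform-block boundary in each direction, including the macroblock's own top and left edge.

```python
def edges_in_plane(plane_size, transform_size):
    """Filtered edges in one square plane: vertical plus horizontal boundaries."""
    boundaries_per_direction = plane_size // transform_size
    return 2 * boundaries_per_direction


def loopfilter_calls(transform_size):
    """Return (length-16 calls, length-8 calls) for one 4:2:0 macroblock."""
    luma_calls = edges_in_plane(16, transform_size)        # one 16x16 luma plane
    chroma_calls = 2 * edges_in_plane(8, transform_size)   # two 8x8 chroma planes
    return luma_calls, chroma_calls


print(loopfilter_calls(4))  # 4x4 transform
print(loopfilter_calls(8))  # 8x8 transform
```

Under those assumptions the 4×4 case gives 8 length-16 and 8 length-8 calls, and the 8×8 case gives 4 of each, matching the "half that" figure in the text.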
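To show what "do both and mask them together" means, here is a toy Python model of the pattern. The two filters are placeholders invented for this sketch and have nothing to do with VP8's actual macroblock-edge filter. Only the structure matters: each list element stands in for one SIMD lane.

```python
def filter_a(pixel):
    """Hypothetical codepath A (placeholder arithmetic)."""
    return (pixel + 1) & 0xFF


def filter_b(pixel):
    """Hypothetical codepath B (placeholder arithmetic)."""
    return (pixel * 3) >> 2


def scalar_version(pixels, use_a):
    """Scalar code can branch per pixel and run only the path it needs."""
    return [filter_a(p) if flag else filter_b(p) for p, flag in zip(pixels, use_a)]


def simd_style_version(pixels, use_a):
    """SIMD-style code runs both paths on every lane, then blends with a mask."""
    result_a = [filter_a(p) for p in pixels]               # computed for all lanes
    result_b = [filter_b(p) for p in pixels]               # also computed for all lanes
    masks = [0xFF if flag else 0x00 for flag in use_a]     # all-ones or all-zeros per lane
    return [(a & m) | (b & ~m & 0xFF) for a, b, m in zip(result_a, result_b, masks)]


pixels = [10, 200, 255, 0]
choice = [True, False, True, False]
print(scalar_version(pixels, choice))
print(simd_style_version(pixels, choice))
```

Both functions return the same values. The SIMD-style one pays for both filters on every lane, which is why a per-pixel choice between two expensive codepaths is costly in vector code.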
There’s nothing wrong with a good-but-slow deblocking filter. But given the amount of deblocking one needs to do in a 4×4-transform-based format, it might have been a better choice to make the filter simpler. It’s pretty difficult to beat H.264 on compression, but it’s certainly not hard to beat it on speed — and yet it seems VP8 missed a perfectly good chance to do so. Another option would have been to pick an 8×8 transform instead of 4×4, reducing the amount of deblocking by a factor of 2.
And yes, there’s a simple filter available in the low complexity profile, but it doesn’t help if nobody uses it.
4. Tree-based arithmetic coding.
Binary arithmetic coding has become the standard entropy coding method for a wide variety of compressed formats, ranging from LZMA to VP6, H.264 and VP8. It’s simple, relatively fast compared to other arithmetic coding schemes, and easy to make adaptive. The problem with this is that you have to come up with a method for converting non-binary symbols into a list of binary symbols, and then choosing what probabilities to use to code each one. Here’s an example from H.264, the sub-partition mode symbol, which is either 8×8, 8×4, 4×8, or 4×4. encode_decision( context, bit ) writes a binary decision (bit) into a numbered context (context).
8×8: encode_decision( 21, 0 );
8×4: encode_decision( 21, 1 ); encode_decision( 22, 0 );
4×8: encode_decision( 21, 1 ); encode_decision( 22, 1 ); encode_decision( 23, 1 );
4×4: encode_decision( 21, 1 ); encode_decision( 22, 1 ); encode_decision( 23, 0 );
As can be seen, this is clearly like a Huffman tree. Wouldn’t it be nice if we could represent this in the form of an actual tree data structure instead of code? On2 thought so: they designed a simple system in VP8 that allowed all binarization schemes in the entire format to be represented as simple tree data structures. This greatly reduces the complexity (not speed-wise, but implementation-wise) of the entropy coder. Personally, I quite like it.
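As an illustration, the H.264 sub-partition example above can be rewritten as a flat tree array in the style VP8 uses: entries come in pairs (one per bit value), a positive entry is the index of the next pair, and a zero or negative entry is a negated leaf value. This is a sketch only. The leaf numbering, the function names and the reuse of contexts 21 to 23 as "one context per node" are assumptions made to match the listing, and no actual arithmetic coding is performed.

```python
# Leaf values (arbitrary numbering chosen for this sketch).
P8X8, P8X4, P4X8, P4X4 = 0, 1, 2, 3
FIRST_CONTEXT = 21

SUB_PARTITION_TREE = [
    -P8X8, 2,        # node at index 0 (context 21): bit 0 -> 8x8, bit 1 -> node 2
    -P8X4, 4,        # node at index 2 (context 22): bit 0 -> 8x4, bit 1 -> node 4
    -P4X4, -P4X8,    # node at index 4 (context 23): bit 0 -> 4x4, bit 1 -> 4x8
]


def encode_symbol(tree, symbol, node=0, path=()):
    """Return the (context, bit) decisions that lead to symbol, or None."""
    for bit in (0, 1):
        entry = tree[node + bit]
        step = path + ((FIRST_CONTEXT + (node >> 1), bit),)
        if entry <= 0:                      # leaf
            if -entry == symbol:
                return list(step)
        else:                               # interior node: keep descending
            found = encode_symbol(tree, symbol, entry, step)
            if found is not None:
                return found
    return None


def decode_symbol(tree, bits):
    """Walk the tree with already-decoded bits until a leaf is reached."""
    node = 0
    for bit in bits:
        node = tree[node + bit]
        if node <= 0:
            return -node
    raise ValueError("ran out of bits before reaching a leaf")


for name, symbol in [("8x8", P8X8), ("8x4", P8X4), ("4x8", P4X8), ("4x4", P4X4)]:
    print(name, encode_symbol(SUB_PARTITION_TREE, symbol))
```

The same two small functions serve every symbol type; adding a new binarization means adding a new array rather than new code.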
5. The inverse transform ordering.
I should at some point write a post about common mistakes made in video formats that everyone keeps making. These are not issues that are patent worries or huge issues for compression, just stupid mistakes that are repeatedly made in new video formats, probably because someone just never asked the guy next to him “does this look stupid?” before sticking it in the spec.
One common mistake is the problem of transform ordering. Every sane 2D transform is “separable”; that is, it can be done by doing a 1D transform vertically and doing the 1D transform again horizontally (or vice versa). The original iDCT as used in JPEG, H.263, and MPEG-1/2/4 was an “idealized” iDCT: nobody had to use the exact same iDCT, theirs just had to give very close results to a reference implementation. This ended up resulting in a lot of practical problems. It was also slow; the only way to get an accurate enough iDCT was to do all the intermediate math in 32-bit.
Practically every modern format, accordingly, has specified an exact iDCT. This includes H.264, VC-1, RV40, Theora, VP8, and many more. Of course, with an exact iDCT comes an exact ordering: while the “real” iDCT can be done in any order, an exact iDCT usually requires an exact order. That is, it specifies horizontal and then vertical, or vertical and then horizontal.
All of these transforms end up being implemented in SIMD. In SIMD, a vertical transform is generally the only option, so a transpose is added to the process instead of doing a horizontal transform. Accordingly, there are two ways to do it:
1. Transpose, vertical transform, transpose, vertical transform.
2. Vertical transform, transpose, vertical transform, transpose.
These may seem to be equally good, but there’s one catch: if the transpose is done first, it can be completely eliminated by merging it into the coefficient decoding process. On many modern CPUs, particularly x86, transposes are very expensive, so eliminating one of the two gives a pretty significant speed benefit.
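The "merge the transpose into coefficient decoding" trick can be shown in a few lines of Python. This is a toy model: `vertical_transform` is a stand-in butterfly, not any codec's real iDCT, and the zigzag table is a conventional 4×4 scan used purely for illustration. The point is only that writing coefficients through a transposed scan table yields the already-transposed block, so way 1's first transpose costs nothing.

```python
# Scan table: the i-th decoded coefficient goes to raster position ZIGZAG[i].
ZIGZAG = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
# Same scan, but with row and column swapped in every target position.
TRANSPOSED_ZIGZAG = [(pos % 4) * 4 + pos // 4 for pos in ZIGZAG]


def place(coeffs, scan):
    """Scatter decoded coefficients into a 4x4 block using a scan table."""
    flat = [0] * 16
    for value, pos in zip(coeffs, scan):
        flat[pos] = value
    return [flat[row * 4:row * 4 + 4] for row in range(4)]


def transpose(block):
    return [list(column) for column in zip(*block)]


def vertical_transform(block):
    """Toy 1D transform applied down each column (placeholder for a real iDCT)."""
    out = [[0] * 4 for _ in range(4)]
    for x in range(4):
        a, b, c, d = (block[y][x] for y in range(4))
        out[0][x] = a + b + c + d
        out[1][x] = a + b - c - d
        out[2][x] = a - b - c + d
        out[3][x] = a - b + c - d
    return out


def way_one(coeffs):
    """Transpose, vertical transform, transpose, vertical transform."""
    block = place(coeffs, ZIGZAG)
    return vertical_transform(transpose(vertical_transform(transpose(block))))


def way_one_merged(coeffs):
    """Same result, with the first transpose folded into the scan table."""
    block = place(coeffs, TRANSPOSED_ZIGZAG)
    return vertical_transform(transpose(vertical_transform(block)))


coeffs = [37, -5, 12, 0, 3, 0, 0, -1, 0, 0, 2, 0, 0, 0, 0, 1]
print(way_one(coeffs) == way_one_merged(coeffs))
```

In way 2 the final transpose comes after the last transform, so there is no decoding step left to fold it into.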
H.264 did it way 1).
VC-1 did it way 1).
Theora (inherited from VP3) did it way 1).
But no. VP8 has to do it way 2), where you can’t eliminate the transpose. Bah. It’s not a huge deal; probably only 1-2% overall at most speed-wise, but it’s just a needless waste. What really bugs me is that VP3 got it right, so why in the world did they screw it up this time around if they got it right beforehand?
RV40 is the other modern format I know that made this mistake.
(NB: You can do transforms without a transpose, but it’s generally not worth it unless the intermediate needs 32-bit math, as in the case of the “real” iDCT.)
6. Not supporting interlacing.
THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU.
Interlacing was the scourge of H.264. It weaseled its way into every nook and cranny of the spec, making every decoder a thousand lines longer. H.264 even included a highly complicated (and effective) dedicated interlaced coding scheme, MBAFF. The mere existence of MBAFF, despite its usefulness for broadcasters and others still stuck in the analog age with their 1080i, 576i, and 480i content, was a blight upon the video format.
VP8 has once and for all avoided it.
And if anyone suggests adding interlaced support to the experimental VP8 branch, find a straitjacket and padded cell for them before they cause any real damage.
-
Translating Return To Ringworld
17 August 2016, by Multimedia Mike - Game Hacking
As indicated in my previous post, the Translator has expressed interest in applying his hobby towards another DOS adventure game from the mid 1990s: Return to Ringworld (henceforth R2RW) by Tsunami Media. This represents significantly more work than the previous outing, Phantasmagoria.
Return to Ringworld Title Screen
I have been largely successful thus far in crafting translation tools. I have pushed the fruits of these labors to a Github repository named improved-spoon (named using Github’s random name generator because I wanted something more interesting than ‘game-hacking-tools’).
Further, I have recorded everything I have learned about the game’s resource format (named RLB) at the XentaxWiki.
New Challenges
The previous project mostly involved scribbling subtitle text on an endless series of video files by leveraging a separate software library which took care of rendering fonts. In contrast, R2RW has at least 30k words of English text contained in various blocks which require translation. Further, the game encodes its own fonts (9 of them) which stubbornly refuse to be useful for rendering text in nearly any other language.
Thus, the 2 immediate challenges are:
- Translating volumes of text to Spanish
- Expanding the fonts to represent Spanish characters
Normally, “figuring out the file format data structures involved” is on the list as well. Thankfully, understanding the formats is not a huge challenge since the folks at the ScummVM project already did all the heavy lifting of reverse engineering the file formats.
The Pitch
Here was the plan:
- Create a tool that can dump out the interesting data from the game’s master resource file.
- Create a tool that can perform the elaborate file copy described in the previous post. The new file should be bit for bit compatible with the original file.
- Modify the rewriting tool to repack some modified strings into the new resource file.
- Unpack the fonts and figure out a way to add new characters.
- Repack the new fonts into the resource file.
- Repack message strings with Spanish characters.
Showing The Work : Modifying Strings
First, I created the tool to unpack blocks of message string resources. I elected to dump the strings to disk as JSON data since it’s easy to write and read JSON using Python, and it’s quick to check if any mistakes have crept in.
The next step is to find a string to focus on. So I started the game and looked for the first string I could trigger.
This shows up in the JSON string dump as:
"Spanish": "!0205Your quarters on the Lance of Truth are spartan, in accord with your mercenary lifestyle.",
"English": "!0205Your quarters on the Lance of Truth are spartan, in accord with your mercenary lifestyle.",
As you can see, many of the strings are encoded with an ID key as part of the string which should probably be left unmodified. I changed the Spanish string:
"Spanish": "!0205Hey, is this thing on?",
"English": "!0205Your quarters on the Lance of Truth are spartan, in accord with your mercenary lifestyle.",
And then I wrote the repacking tool to substitute this message block for the original one. Look! The engine liked it!
Little steps, little steps.
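For illustration, here is a minimal Python sketch of editing one entry of such a JSON dump while leaving the ID key alone. It is not the author's tool. The helper name is hypothetical, and the ID pattern (an exclamation mark followed by four digits, optionally preceded by whitespace) is an assumption based only on the sample string above.

```python
import json
import re

# Assumption: an ID key looks like "!0205" at the very start of the string.
ID_KEY = re.compile(r"^(\s*!\d{4})")


def replace_translation(entry, new_text, language="Spanish"):
    """Return a copy of entry with new_text substituted, keeping any ID key."""
    match = ID_KEY.match(entry[language])
    prefix = match.group(1) if match else ""
    updated = dict(entry)
    updated[language] = prefix + new_text
    return updated


dumped = json.loads(
    '{"Spanish": "!0205Your quarters on the Lance of Truth are spartan, '
    'in accord with your mercenary lifestyle.", '
    '"English": "!0205Your quarters on the Lance of Truth are spartan, '
    'in accord with your mercenary lifestyle."}'
)
edited = replace_translation(dumped, "Hey, is this thing on?")
print(json.dumps(edited, ensure_ascii=False, indent=2))
```

Keeping the English text untouched next to the edited Spanish text makes it easy to spot an accidentally damaged ID key when reviewing the dump.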
Showing The Work : Modifying Fonts
The next little step is to find a place to put the new characters. First, a problem definition: The immediate goal is to translate the game into Spanish. The current fonts encoded in the game resource only support 128 characters, corresponding to 7-bit ASCII. In order to properly express Spanish, 16 new characters are required: á, é, í, ó, ú, ü, ñ (each in upper and lower case for a total of 14 characters) as well as the inverted punctuation symbols: ¿, ¡.
Again, ScummVM already documents (via code) the font coding format. So I quickly determined that each of the 9 fonts is comprised of 128 individual bitmaps with either 1 or 2 bits per pixel. I wrote a tool to unpack each character into an individual portable grey map (PGM) image. These can be edited with graphics editors or with text editors since they are just text files.
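A rough Python sketch of that unpacking step follows. The plain-text PGM layout ("P2" header, dimensions, maximum value, then pixel values) is the standard one, but the way pixels are packed into bytes here (most significant bits first, rows running on without padding) is an assumption for illustration. The real layout is whatever the ScummVM font code documents, and both function names are hypothetical.

```python
def unpack_pixels(data, width, height, bits_per_pixel):
    """Unpack a 1- or 2-bit-per-pixel bitmap into rows of integer values.

    Assumption: pixels are packed MSB-first with no padding between rows.
    """
    mask = (1 << bits_per_pixel) - 1
    pixels = []
    for index in range(width * height):
        bit_position = index * bits_per_pixel
        byte = data[bit_position // 8]
        shift = 8 - bits_per_pixel - (bit_position % 8)
        pixels.append((byte >> shift) & mask)
    return [pixels[row * width:(row + 1) * width] for row in range(height)]


def to_pgm(rows, bits_per_pixel):
    """Render rows of pixel values as a plain-text (P2) PGM image."""
    max_value = (1 << bits_per_pixel) - 1
    lines = ["P2", f"{len(rows[0])} {len(rows)}", str(max_value)]
    lines += [" ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"


# A made-up 4x2 glyph at 2 bits per pixel, packed into two bytes.
glyph = unpack_pixels(bytes([0b11100100, 0b00011011]), width=4, height=2, bits_per_pixel=2)
print(to_pgm(glyph, bits_per_pixel=2))
```

Because the output is plain text, a glyph can be tweaked in any text editor and packed back by reversing `unpack_pixels`.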
Where to put the 16 new Spanish characters? ASCII characters 1-31 are non-printable, so my first theory was that these characters would be empty and could be repurposed. However, after dumping and inspecting, I learned that they represent the same set of characters as seen in DOS Code Page 437. So that’s a no-go (so I assumed; I didn’t check if any existing strings leveraged those characters).
My next plan was to hope that I could extend the font beyond index 127 and use positions 128-143. This worked superbly. This is the new example string:
"Spanish": "!0205¿Ves esto? ¡La puntuacion se hace girar!",
"English": "!0205Your quarters on the Lance of Truth are spartan, in accord with your mercenary lifestyle.",
Fortunately, JSON understands UTF-8 and after mapping the 16 necessary characters down to the numeric range of 128-143, I repacked the new fonts and the new string.
Translation: “See this? The punctuation is rotated!”
Another victory. Notice that there are no diacritics in this string. None are required for this translation (according to Google Translate). But adding the diacritics to the 14 characters isn’t my department. My tool does help by prepopulating [aeiounAEIOUN] into the right positions to make editing easier for the Translator. But the tool does make the effort to rotate the punctuation since that is easy to automate.
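The mapping step might look something like the following Python sketch. The post only says that the 16 characters land on 128-143. The particular order used here, the function name and the error handling are assumptions for illustration. `BASE_GLYPH` mirrors the idea of pre-populating each new slot with its unaccented letter.

```python
# Assumed ordering of the 16 new glyph slots; the real tool's order is not given.
EXTRA_CHARS = "áéíóúüñÁÉÍÓÚÜÑ¿¡"
FIRST_EXTRA_INDEX = 128
CHAR_TO_INDEX = {ch: FIRST_EXTRA_INDEX + i for i, ch in enumerate(EXTRA_CHARS)}

# Unaccented letter whose bitmap seeds each accented slot before hand editing.
BASE_GLYPH = dict(zip("áéíóúüñÁÉÍÓÚÜÑ", "aeiouunAEIOUUN"))


def encode_for_game(text):
    """Convert a UTF-8 decoded string to single-byte font indices."""
    encoded = bytearray()
    for ch in text:
        if ch in CHAR_TO_INDEX:
            encoded.append(CHAR_TO_INDEX[ch])      # one of the 16 new glyphs
        elif ord(ch) < 128:
            encoded.append(ord(ch))                # plain 7-bit ASCII
        else:
            raise ValueError(f"no glyph available for {ch!r}")
    return bytes(encoded)


print(encode_for_game("¿Ves esto? ¡La puntuacion se hace girar!"))
```

Raising an error on any other non-ASCII character catches text the extended fonts cannot draw before it is packed into a resource file.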
Next Steps and Residual Weirdness
There is another method for storing ASCII text inside the R2RW resource called strip resources. These store conversation scripts. There are plenty of fields in the data structures that I don’t fully understand. So, following the lessons I learned from my previous translation outing, I was determined to modify as little as possible. This means copying over most of the original data structures intact, but changing the field representing the relative offset that points to the corresponding string. This works well since the strings are invariably stored NULL-terminated in a concatenated manner.
I wanted to document for the record that the format that R2RW uses has some weirdness in the way it handles residual bytes in a resource. The variant of the resource format that R2RW uses requires every block to be aligned on a 16-byte boundary. If there is space between the logical end of the resource and the start of the next resource, there are random bytes in that space. This leads me to believe that these bytes were originally recorded from stale/uninitialized memory. This frustrates me because when I wrote the initial file copy tool which unpacks and repacks each block, I wanted the new file to be identical to the original. However, these apparent nonsense bytes at the end thwart that effort.
But leaving those bytes as 0 produces an acceptable resource file.
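The two mechanics described in this section (relative offsets into a run of NUL-terminated strings, and zero padding up to a 16-byte boundary) can be sketched in Python as follows. The function names are hypothetical, and the sketch deliberately leaves out the surrounding strip-resource fields, which the post says are copied over unmodified.

```python
def build_string_table(strings):
    """Concatenate NUL-terminated byte strings and record each relative offset."""
    table = bytearray()
    offsets = []
    for text in strings:
        offsets.append(len(table))     # offset of this string from the table start
        table += text + b"\x00"
    return bytes(table), offsets


def pad_to_boundary(block, alignment=16):
    """Append zero bytes until the block length is a multiple of alignment."""
    remainder = len(block) % alignment
    if remainder:
        block += b"\x00" * (alignment - remainder)
    return block


table, offsets = build_string_table([b"Hola", b"Adios"])
padded = pad_to_boundary(table)
print(offsets, len(table), len(padded))
```

Since translated strings rarely have the same length as the originals, every offset after the first edited string shifts, which is why the offset field is the one thing that has to be rewritten.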
Text On Static Images
There is one last resource type we are working on translating. There are various bits of text that are rendered as images, for example in the intro.
It’s possible to locate and extract the exact image that is overlaid on this scene, though without the colors.
The palettes are stored in a separate resource type. So it seems the challenge is to figure out the palette in use for these frames and render a transparent image that uses the same palette, then repack the new text-image into the new resource file.
The post Translating Return To Ringworld first appeared on Breaking Eggs And Making Omelettes.
-
Matomo Celebrates 15 Years of Building an Open-Source & Transparent Web Analytics Solution