Recherche avancée

Médias (0)

Mot : - Tags -/objet éditorial

Aucun média correspondant à vos critères n’est disponible sur le site.

Autres articles (27)

  • La sauvegarde automatique de canaux SPIP

    1er avril 2010, par

    Dans le cadre de la mise en place d’une plateforme ouverte, il est important pour les hébergeurs de pouvoir disposer de sauvegardes assez régulières pour parer à tout problème éventuel.
    Pour réaliser cette tâche on se base sur deux plugins SPIP : Saveauto qui permet une sauvegarde régulière de la base de donnée sous la forme d’un dump mysql (utilisable dans phpmyadmin) mes_fichiers_2 qui permet de réaliser une archive au format zip des données importantes du site (les documents, les éléments (...)

  • Script d’installation automatique de MediaSPIP

    25 avril 2011, par

    Afin de palier aux difficultés d’installation dues principalement aux dépendances logicielles coté serveur, un script d’installation "tout en un" en bash a été créé afin de faciliter cette étape sur un serveur doté d’une distribution Linux compatible.
    Vous devez bénéficier d’un accès SSH à votre serveur et d’un compte "root" afin de l’utiliser, ce qui permettra d’installer les dépendances. Contactez votre hébergeur si vous ne disposez pas de cela.
    La documentation de l’utilisation du script d’installation (...)

  • Automated installation script of MediaSPIP

    25 avril 2011, par

    To overcome the difficulties mainly due to the installation of server side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
    You must have access to your server via SSH and a root account to use it, which will install the dependencies. Contact your provider if you do not have that.
    The documentation of the use of this installation script is available here.
    The code of this (...)

Sur d’autres sites (2950)

  • Things I Have Learned About Emscripten

    1er septembre 2015, par Multimedia Mike — Cirrus Retro

    3 years ago, I released my Game Music Appreciation project, a website with a ludicrously uninspired title which allowed users a relatively frictionless method to experience a range of specialized music files related to old video games. However, the site required use of a special Chrome plugin. Ever since that initial release, my #1 most requested feature has been for a pure JavaScript version of the music player.

    “Impossible !” I exclaimed. “There’s no way JS could ever run fast enough to run these CPU emulators and audio synthesizers in real time, and allow for the visualization that I demand !” Well, I’m pleased to report that I have proved me wrong. I recently quietly launched a new site with what I hope is a catchier title, meant to evoke a cloud-based retro-music-as-a-service product : Cirrus Retro. Right now, it’s basically the same as the old site, but without the wonky Chrome-specific technology.

    Along the way, I’ve learned a few things about using Emscripten that I thought might be useful to share with other people who wish to embark on a similar journey. This is geared more towards someone who has a stronger low-level background (such as C/C++) vs. high-level (like JavaScript).

    General Goals
    Do you want to cross-compile an entire desktop application, one that relies on an extensive GUI toolkit ? That might be difficult (though I believe there is a path for porting qt code directly with Emscripten). Your better wager might be to abstract out the core logic and processes of the program and then create a new web UI to access them.

    Do you want to compile a game that basically just paints stuff to a 2D canvas ? You’re in luck ! Emscripten has a porting path for SDL. Make a version of your C/C++ software that targets SDL (generally not a tall order) and then compile that with Emscripten.

    Do you just want to cross-compile some functionality that lives in a library ? That’s what I’ve done with the Cirrus Retro project. For this, plan to compile the library into a JS file that exports some public functions that other, higher-level, native JS (i.e., JS written by a human and not a computer) will invoke.

    Memory Levels
    When porting C/C++ software to JavaScript using Emscripten, you have to think on 2 different levels. Or perhaps you need to force JavaScript into a low level C lens, especially if you want to write native JS code that will interact with Emscripten-compiled code. This often means somehow allocating chunks of memory via JS and passing them to the Emscripten-compiled functions. And you wouldn’t believe the type of gymnastics you need to execute to get native JS and Emscripten-compiled JS to cooperate.

    “Emscripten : Pointers and Pointers” is the best (and, really, ONLY) explanation I could find for understanding the basic mechanics of this process, at least when I started this journey. However, there’s a mistake in the explanation that left me confused for a little while, and I’m at a loss to contact the author (doesn’t anyone post a simple email address anymore ?).

    Per the best of my understanding, Emscripten allocates a large JS array and calls that the memory space that the compiled C/C++ code is allowed to operate in. A pointer in C/C++ code will just be an index into that mighty array. Really, that’s not too far off from how a low-level program process is supposed to view memory– as a flat array.

    Eventually, I just learned to cargo-cult my way through the memory allocation process. Here’s the JS code for allocating an Emscripten-compatible byte buffer, taken from my test harness (more on that later) :

    var musicBuffer = fs.readFileSync(testSpec[’filename’]) ;
    var musicBufferBytes = new Uint8Array(musicBuffer) ;
    var bytesMalloc = player._malloc(musicBufferBytes.length) ;
    var bytes = new Uint8Array(player.HEAPU8.buffer, bytesMalloc, musicBufferBytes.length) ;
    bytes.set(new Uint8Array(musicBufferBytes.buffer)) ;
    

    So, read the array of bytes from some input source, create a Uint8Array from the bytes, use the Emscripten _malloc() function to allocate enough bytes from the Emscripten memory array for the input bytes, then create a new array… then copy the bytes…

    You know what ? It’s late and I can’t remember how it works exactly, but it does. It has been a few months since I touched that code (been fighting with front-end website tech since then). You write that memory allocation code enough times and it begins to make sense, and then you hope you don’t have to write it too many more times.

    Multithreading
    You can’t port multithreaded code to JS via Emscripten. JavaScript has no notion of threads ! If you don’t understand the computer science behind this limitation, a more thorough explanation is beyond the scope of this post. But trust me, I’ve thought about it a lot. In fact, the official Emscripten literature states that you should be able to port most any C/C++ code as long as 1) none of the code is proprietary (i.e., all the raw source is available) ; and 2) there are no threads.

    Yes, I read about the experimental pthreads support added to Emscripten recently. Don’t get too excited ; that won’t be ready and widespread for a long time to come as it relies on a new browser API. In the meantime, figure out how to make your multithreaded C/C++ code run in a single thread if you want it to run in a browser.

    Printing Facility
    Eventually, getting software to work boils down to debugging, and the most primitive tool in many a programmer’s toolbox is the humble print statement. A print statement allows you to inspect a piece of a program’s state at key junctures. Eventually, when you try to cross-compile C/C++ code to JS using Emscripten, something is not going to work correctly in the generated JS “object code” and you need to understand what. You’ll be pleading for a method of just inspecting one variable deep in the original C/C++ code.

    I came up with this simple printf-workalike called emprintf() :

    #ifndef EMPRINTF_H
    #define EMPRINTF_H
    

    #include <stdio .h>
    #include <stdarg .h>
    #include <emscripten .h>

    #define MAX_MSG_LEN 1000

    /* NOTE : Don’t pass format strings that contain single quote (’) or newline
    * characters. */
    static void emprintf(const char *format, ...)

    char msg[MAX_MSG_LEN] ;
    char consoleMsg[MAX_MSG_LEN + 16] ;
    va_list args ;

    /* create the string */
    va_start(args, format) ;
    vsnprintf(msg, MAX_MSG_LEN, format, args) ;
    va_end(args) ;

    /* wrap the string in a console.log(’’) statement */
    snprintf(consoleMsg, MAX_MSG_LEN + 16, "console.log(’%s’)", msg) ;

    /* send the final string to the JavaScript console */
    emscripten_run_script(consoleMsg) ;

    #endif /* EMPRINTF_H */

    Put it in a file called “emprint.h”. Include it into any C/C++ file where you need debugging visibility, use emprintf() as a replacement for printf() and the output will magically show up on the browser’s JavaScript debug console. Heed the comments and don’t put any single quotes or newlines in strings, and keep it under 1000 characters. I didn’t say it was perfect, but it has helped me a lot in my Emscripten adventures.

    Optimization Levels
    Remember to turn on optimization when compiling. I have empirically found that optimizing for size (-Os) leads to the best performance all around, in addition to having the smallest size. Just be sure to specify some optimization level. If you don’t, the default is -O0 which offers horrible performance when running in JS.

    Static Compression For HTTP Delivery
    JavaScript code compresses pretty efficiently, even after it has been optimized for size using -Os. I routinely see compression ratios between 3.5:1 and 5:1 using gzip.

    Web servers in this day and age are supposed to be smart enough to detect when a requesting web browser can accept gzip-compressed data and do the compression on the fly. They’re even supposed to be smart enough to cache compressed output so the same content is not recompressed for each request. I would have to set up a series of tests to establish whether either of the foregoing assertions are correct and I can’t be bothered. Instead, I took it into my own hands. The trick is to pre-compress the JS files and then instruct the webserver to serve these files with a ‘Content-Type’ of ‘application/javascript’ and a ‘Content-Encoding’ of ‘gzip’.

    1. Compress your large Emscripten-build JS files with ‘gzip’ : ‘gzip compiled-code.js’
    2. Rename them from extension .js.gz to .jsgz
    3. Tell the webserver to deliver .jsgz files with the correct Content-Type and Content-Encoding headers

    To do that last step with Apache, specify these lines :

    AddType application/javascript jsgz
    AddEncoding gzip jsgz
    

    They belong in either a directory’s .htaccess file or in the sitewide configuration (/etc/apache2/mods-available/mime.conf works on my setup).

    Build System and Build Time Optimization
    Oh goodie, build systems ! I had a very specific manner in which I wanted to build my JS modules using Emscripten. Can I possibly coerce any of the many popular build systems to do this ? It has been a few months since I worked on this problem specifically but I seem to recall that the build systems I tried to used would freak out at the prospect of compiling stuff to a final binary target of .js.

    I had high hopes for Bazel, which Google released while I was developing Cirrus Retro. Surely, this is software that has been battle-tested in the harshest conditions of one of the most prominent software-developing companies in the world, needing to take into account the most bizarre corner cases and still build efficiently and correctly every time. And I have little doubt that it fulfills the order. Similarly, I’m confident that Google also has a team of no fewer than 100 or so people dedicated to developing and supporting the project within the organization. When you only have, at best, 1-2 hours per night to work on projects like this, you prefer not to fight with such cutting edge technology and after losing 2 or 3 nights trying to make a go of Bazel, I eventually put it aside.

    I also tried to use Autotools. It failed horribly for me, mostly for my own carelessness and lack of early-project source control.

    After that, it was strictly vanilla makefiles with no real dependency management. But you know what helps in these cases ? ccache ! Or at least, it would if it didn’t fail with Emscripten.

    Quick tip : ccache has trouble with LLVM unless you set the CCACHE_CPP2 environment variable (e.g. : “export CCACHE_CPP2=1”). I don’t remember the specifics, but it magically fixes things. Then, the lazy build process becomes “make clean && make”.

    Testing
    If you have never used Node.js, testing Emscripten-compiled JS code might be a good opportunity to start. I was able to use Node.js to great effect for testing the individually-compiled music player modules, wiring up a series of invocations using Python for a broader test suite (wouldn’t want to go too deep down the JS rabbit hole, after all).

    Be advised that Node.js doesn’t enjoy the same kind of JIT optimizations that the browser engines leverage. Thus, in the case of time critical code like, say, an audio synthesis library, the code might not run in real time. But as long as it produces the correct bitwise waveform, that’s good enough for continuous integration.

    Also, if you have largely been a low-level programmer for your whole career and are generally unfamiliar with the world of single-threaded, event-driven, callback-oriented programming, you might be in for a bit of a shock. When I wanted to learn how to read the contents of a file in Node.js, this is the first tutorial I found on the matter. I thought the code presented was a parody of bad coding style :

    var fs = require("fs") ;
    var fileName = "foo.txt" ;
    

    fs.exists(fileName, function(exists)
    if (exists)
    fs.stat(fileName, function(error, stats)
    fs.open(fileName, "r", function(error, fd)
    var buffer = new Buffer(stats.size) ;

    fs.read(fd, buffer, 0, buffer.length, null, function(error, bytesRead, buffer)
    var data = buffer.toString("utf8", 0, buffer.length) ;

    console.log(data) ;
    fs.close(fd) ;
    ) ;
    ) ;
    ) ;
    ) ;

    Apparently, this kind of thing doesn’t raise an eyebrow in the JS world.

    Now, I understand and respect the JS programming model. But this was seriously frustrating when I first encountered it because a simple script like the one I was trying to write just has an ordered list of tasks to complete. When it asks for bytes from a file, it really has nothing better to do than to wait for the answer.

    Thankfully, it turns out that Node’s fs module includes synchronous versions of the various file access functions. So it’s all good.

    Conclusion
    I’m sure I missed or underexplained some things. But if other brave souls are interested in dipping their toes in the waters of Emscripten, I hope these tips will come in handy.

  • Things I Have Learned About Emscripten

    1er septembre 2015, par Multimedia Mike — Cirrus Retro

    3 years ago, I released my Game Music Appreciation project, a website with a ludicrously uninspired title which allowed users a relatively frictionless method to experience a range of specialized music files related to old video games. However, the site required use of a special Chrome plugin. Ever since that initial release, my #1 most requested feature has been for a pure JavaScript version of the music player.

    “Impossible !” I exclaimed. “There’s no way JS could ever run fast enough to run these CPU emulators and audio synthesizers in real time, and allow for the visualization that I demand !” Well, I’m pleased to report that I have proved me wrong. I recently quietly launched a new site with what I hope is a catchier title, meant to evoke a cloud-based retro-music-as-a-service product : Cirrus Retro. Right now, it’s basically the same as the old site, but without the wonky Chrome-specific technology.

    Along the way, I’ve learned a few things about using Emscripten that I thought might be useful to share with other people who wish to embark on a similar journey. This is geared more towards someone who has a stronger low-level background (such as C/C++) vs. high-level (like JavaScript).

    General Goals
    Do you want to cross-compile an entire desktop application, one that relies on an extensive GUI toolkit ? That might be difficult (though I believe there is a path for porting qt code directly with Emscripten). Your better wager might be to abstract out the core logic and processes of the program and then create a new web UI to access them.

    Do you want to compile a game that basically just paints stuff to a 2D canvas ? You’re in luck ! Emscripten has a porting path for SDL. Make a version of your C/C++ software that targets SDL (generally not a tall order) and then compile that with Emscripten.

    Do you just want to cross-compile some functionality that lives in a library ? That’s what I’ve done with the Cirrus Retro project. For this, plan to compile the library into a JS file that exports some public functions that other, higher-level, native JS (i.e., JS written by a human and not a computer) will invoke.

    Memory Levels
    When porting C/C++ software to JavaScript using Emscripten, you have to think on 2 different levels. Or perhaps you need to force JavaScript into a low level C lens, especially if you want to write native JS code that will interact with Emscripten-compiled code. This often means somehow allocating chunks of memory via JS and passing them to the Emscripten-compiled functions. And you wouldn’t believe the type of gymnastics you need to execute to get native JS and Emscripten-compiled JS to cooperate.

    “Emscripten : Pointers and Pointers” is the best (and, really, ONLY) explanation I could find for understanding the basic mechanics of this process, at least when I started this journey. However, there’s a mistake in the explanation that left me confused for a little while, and I’m at a loss to contact the author (doesn’t anyone post a simple email address anymore ?).

    Per the best of my understanding, Emscripten allocates a large JS array and calls that the memory space that the compiled C/C++ code is allowed to operate in. A pointer in C/C++ code will just be an index into that mighty array. Really, that’s not too far off from how a low-level program process is supposed to view memory– as a flat array.

    Eventually, I just learned to cargo-cult my way through the memory allocation process. Here’s the JS code for allocating an Emscripten-compatible byte buffer, taken from my test harness (more on that later) :

    var musicBuffer = fs.readFileSync(testSpec[’filename’]) ;
    var musicBufferBytes = new Uint8Array(musicBuffer) ;
    var bytesMalloc = player._malloc(musicBufferBytes.length) ;
    var bytes = new Uint8Array(player.HEAPU8.buffer, bytesMalloc, musicBufferBytes.length) ;
    bytes.set(new Uint8Array(musicBufferBytes.buffer)) ;
    

    So, read the array of bytes from some input source, create a Uint8Array from the bytes, use the Emscripten _malloc() function to allocate enough bytes from the Emscripten memory array for the input bytes, then create a new array… then copy the bytes…

    You know what ? It’s late and I can’t remember how it works exactly, but it does. It has been a few months since I touched that code (been fighting with front-end website tech since then). You write that memory allocation code enough times and it begins to make sense, and then you hope you don’t have to write it too many more times.

    Multithreading
    You can’t port multithreaded code to JS via Emscripten. JavaScript has no notion of threads ! If you don’t understand the computer science behind this limitation, a more thorough explanation is beyond the scope of this post. But trust me, I’ve thought about it a lot. In fact, the official Emscripten literature states that you should be able to port most any C/C++ code as long as 1) none of the code is proprietary (i.e., all the raw source is available) ; and 2) there are no threads.

    Yes, I read about the experimental pthreads support added to Emscripten recently. Don’t get too excited ; that won’t be ready and widespread for a long time to come as it relies on a new browser API. In the meantime, figure out how to make your multithreaded C/C++ code run in a single thread if you want it to run in a browser.

    Printing Facility
    Eventually, getting software to work boils down to debugging, and the most primitive tool in many a programmer’s toolbox is the humble print statement. A print statement allows you to inspect a piece of a program’s state at key junctures. Eventually, when you try to cross-compile C/C++ code to JS using Emscripten, something is not going to work correctly in the generated JS “object code” and you need to understand what. You’ll be pleading for a method of just inspecting one variable deep in the original C/C++ code.

    I came up with this simple printf-workalike called emprintf() :

    #ifndef EMPRINTF_H
    #define EMPRINTF_H
    

    #include <stdio .h>
    #include <stdarg .h>
    #include <emscripten .h>

    #define MAX_MSG_LEN 1000

    /* NOTE : Don’t pass format strings that contain single quote (’) or newline
    * characters. */
    static void emprintf(const char *format, ...)

    char msg[MAX_MSG_LEN] ;
    char consoleMsg[MAX_MSG_LEN + 16] ;
    va_list args ;

    /* create the string */
    va_start(args, format) ;
    vsnprintf(msg, MAX_MSG_LEN, format, args) ;
    va_end(args) ;

    /* wrap the string in a console.log(’’) statement */
    snprintf(consoleMsg, MAX_MSG_LEN + 16, "console.log(’%s’)", msg) ;

    /* send the final string to the JavaScript console */
    emscripten_run_script(consoleMsg) ;

    #endif /* EMPRINTF_H */

    Put it in a file called “emprint.h”. Include it into any C/C++ file where you need debugging visibility, use emprintf() as a replacement for printf() and the output will magically show up on the browser’s JavaScript debug console. Heed the comments and don’t put any single quotes or newlines in strings, and keep it under 1000 characters. I didn’t say it was perfect, but it has helped me a lot in my Emscripten adventures.

    Optimization Levels
    Remember to turn on optimization when compiling. I have empirically found that optimizing for size (-Os) leads to the best performance all around, in addition to having the smallest size. Just be sure to specify some optimization level. If you don’t, the default is -O0 which offers horrible performance when running in JS.

    Static Compression For HTTP Delivery
    JavaScript code compresses pretty efficiently, even after it has been optimized for size using -Os. I routinely see compression ratios between 3.5:1 and 5:1 using gzip.

    Web servers in this day and age are supposed to be smart enough to detect when a requesting web browser can accept gzip-compressed data and do the compression on the fly. They’re even supposed to be smart enough to cache compressed output so the same content is not recompressed for each request. I would have to set up a series of tests to establish whether either of the foregoing assertions are correct and I can’t be bothered. Instead, I took it into my own hands. The trick is to pre-compress the JS files and then instruct the webserver to serve these files with a ‘Content-Type’ of ‘application/javascript’ and a ‘Content-Encoding’ of ‘gzip’.

    1. Compress your large Emscripten-build JS files with ‘gzip’ : ‘gzip compiled-code.js’
    2. Rename them from extension .js.gz to .jsgz
    3. Tell the webserver to deliver .jsgz files with the correct Content-Type and Content-Encoding headers

    To do that last step with Apache, specify these lines :

    AddType application/javascript jsgz
    AddEncoding gzip jsgz
    

    They belong in either a directory’s .htaccess file or in the sitewide configuration (/etc/apache2/mods-available/mime.conf works on my setup).

    Build System and Build Time Optimization
    Oh goodie, build systems ! I had a very specific manner in which I wanted to build my JS modules using Emscripten. Can I possibly coerce any of the many popular build systems to do this ? It has been a few months since I worked on this problem specifically but I seem to recall that the build systems I tried to used would freak out at the prospect of compiling stuff to a final binary target of .js.

    I had high hopes for Bazel, which Google released while I was developing Cirrus Retro. Surely, this is software that has been battle-tested in the harshest conditions of one of the most prominent software-developing companies in the world, needing to take into account the most bizarre corner cases and still build efficiently and correctly every time. And I have little doubt that it fulfills the order. Similarly, I’m confident that Google also has a team of no fewer than 100 or so people dedicated to developing and supporting the project within the organization. When you only have, at best, 1-2 hours per night to work on projects like this, you prefer not to fight with such cutting edge technology and after losing 2 or 3 nights trying to make a go of Bazel, I eventually put it aside.

    I also tried to use Autotools. It failed horribly for me, mostly for my own carelessness and lack of early-project source control.

    After that, it was strictly vanilla makefiles with no real dependency management. But you know what helps in these cases ? ccache ! Or at least, it would if it didn’t fail with Emscripten.

    Quick tip : ccache has trouble with LLVM unless you set the CCACHE_CPP2 environment variable (e.g. : “export CCACHE_CPP2=1”). I don’t remember the specifics, but it magically fixes things. Then, the lazy build process becomes “make clean && make”.

    Testing
    If you have never used Node.js, testing Emscripten-compiled JS code might be a good opportunity to start. I was able to use Node.js to great effect for testing the individually-compiled music player modules, wiring up a series of invocations using Python for a broader test suite (wouldn’t want to go too deep down the JS rabbit hole, after all).

    Be advised that Node.js doesn’t enjoy the same kind of JIT optimizations that the browser engines leverage. Thus, in the case of time critical code like, say, an audio synthesis library, the code might not run in real time. But as long as it produces the correct bitwise waveform, that’s good enough for continuous integration.

    Also, if you have largely been a low-level programmer for your whole career and are generally unfamiliar with the world of single-threaded, event-driven, callback-oriented programming, you might be in for a bit of a shock. When I wanted to learn how to read the contents of a file in Node.js, this is the first tutorial I found on the matter. I thought the code presented was a parody of bad coding style :

    var fs = require("fs") ;
    var fileName = "foo.txt" ;
    

    fs.exists(fileName, function(exists)
    if (exists)
    fs.stat(fileName, function(error, stats)
    fs.open(fileName, "r", function(error, fd)
    var buffer = new Buffer(stats.size) ;

    fs.read(fd, buffer, 0, buffer.length, null, function(error, bytesRead, buffer)
    var data = buffer.toString("utf8", 0, buffer.length) ;

    console.log(data) ;
    fs.close(fd) ;
    ) ;
    ) ;
    ) ;
    ) ;

    Apparently, this kind of thing doesn’t raise an eyebrow in the JS world.

    Now, I understand and respect the JS programming model. But this was seriously frustrating when I first encountered it because a simple script like the one I was trying to write just has an ordered list of tasks to complete. When it asks for bytes from a file, it really has nothing better to do than to wait for the answer.

    Thankfully, it turns out that Node’s fs module includes synchronous versions of the various file access functions. So it’s all good.

    Conclusion
    I’m sure I missed or underexplained some things. But if other brave souls are interested in dipping their toes in the waters of Emscripten, I hope these tips will come in handy.

    The post Things I Have Learned About Emscripten first appeared on Breaking Eggs And Making Omelettes.

  • The Big VP8 Debug

    20 novembre 2010, par Multimedia Mike — VP8

    I hope my previous walkthrough of the VP8 4x4 intra coding process was educational. Today, I’ll be walking through an example of what happens when my toy VP8 encoder encodes an intra 16x16 block. This may prove educational to those who have never been exposed to the deep details of this or related algorithms. Also, I wanted to illustrate where I think my VP8 encoder process is going bad and generating such grotesque results.

    Before I start, let me give a shout-out to Google Docs’ Drawing tool which I used to generate these diagrams. It works quite well.

    Results

    (Always cut to the chase in a blog post ; results first.) I’m glad I composed this post. In the course of doing so, I found the problem, fixed it, and am now able to present this image that was decoded from the bitstream encoded by my toy working VP8 encoder :



    Yeah, I know that image doesn’t look like anything you haven’t seen before. The difference is that it has made a successful trip through my VP8 encoder.

    Follow along through the encoding process and learn of the mistake...

    Original Block and Subblocks

    Here is the 16x16 block to be encoded :



    The block is broken down into 16 4x4 subblocks for further encoding :



    Prediction

    The first step is to pick a prediction mode, generate a prediction block, and subtract the predictors from the macroblock. In this case, we will use DC prediction which means the predictor will be the same for each element.

    In 4x4 VP8 DC intra prediction, samples outside of the frame are assumed to be 128. It’s a little different in 16x16 DC intra prediction— samples above the top row are assumed to be 127 while samples left of the leftmost column are assumed to be 129. For the top left macroblock, this still works out to 128.

    Subtract 128 from each of the samples :



    Forward Transform

    Run each of the 16 prediction-removed subblocks through the forward transform. This example uses the forward transform from libvpx 0.9.5 :



    I have highlighted the DC coefficients in each subblock. That’s because those receive special consideration in 16x16 intra coding.

    Quantization

    The Y plane AC quantizer is 4 in this example, the minimum allowed. (The Y plane DC quantizer is also 4 but doesn’t come into play for intra 16x16 coding since the DC coefficients follow a different process.) Thus, quantize (integer divide) each AC element in each subblock (we’ll ignore the DC coefficient for this part) :



    The Y2 Round Trip

    Those highlighted DC coefficients from each of the 16 subblocks comprise the Y2 block. This block is transformed with a slightly different algorithm called the Walsh-Hadamard Transform (WHT). The results of this transform are then quantized (using 8 for both Y2 DC and AC in this example, as those are the smallest Y2 quantizers that VP8 allows), then zigzagged and entropy-coded along with the rest of the macroblock coefficients.

    On the decoder side, the Y2 coefficients are decoded, de-zigzagged, dequantized and run through the inverse WHT.

    And this is where I suspect that most of the error is creeping into my VP8 encoder. Observe the round-trip through the Y2 process :



    As intimated, this part causes me consternation due to the wide discrepancy between the original and the reconstructed Y2 blocks. Observe the absolute difference between the 2 vectors :



    That’s really significant and leads me to believe that this is where the big problem is.

    What’s Wrong ?

    My first suspicion is that the quantization is throwing off the process. I was disabused of this idea when I removed quantization from the equation and immediately reversed the transform :



    So perhaps there is a problem with the forward WHT. Just like with the usual subblock transform, the VP8 spec doesn’t define how to perform the forward WHT, only the inverse WHT. Do I need to audition different forward WHTs from various versions of libvpx, similar to what I did with the other transform ? That doesn’t make a lot of sense— libvpx doesn’t seem to have so much trouble with basic encoding.

    The Punchline

    I reviewed the forward WHT code, the stuff that I plagiarized from libvpx 0.9.0. The function takes, among other parameters, a pitch value. There are 2 loops in the code. The first iterates through the rows of the input matrix— which I assumed was a 4x4 matrix. I was puzzled that during each iteration of the row loop, the input pointer was only being advanced by (pitch/2). I removed the division by 2 and the problem went away. I.e., the encoded image looks correct.

    What’s up with the (pitch/2), anyway ? It seems that the encoder likes to pack 2 4x4 subblocks into an 8x4 block data structure. In fact, the forward DCTs in the libvpx encoder have the same artifact. Remember how I surveyed several variations of forward DCT from different versions of libvpx ? The one that proved most accurate in that test was the one I had already modified to advance the input pointer properly. Fixing the other 2 candidates yields similar results :

    input :   92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
    short 0.9.0 : -311 6 2 0 0 11 -6 1 2 -3 3 0 0 0 -2 1
    inverse : 92 91 89 86 91 90 88 87 90 89 89 88 89 87 88 93
    fast  0.9.0 : -313 5 1 0 1 11 -6 1 3 -3 4 0 0 0 -2 1
    inverse : 91 91 89 86 90 90 88 86 89 89 89 88 89 87 88 93
    short 0.9.5 : -312 7 1 0 1 12 -5 2 2 -3 3 -1 1 0 -2 1
    inverse : 92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
    

    Code cribber beware !

    Corrected Y2 Round Trip

    Let’s look at that Y2 round trip one more time :



    And another look at the error between the original and the reconstruction :



    Better.

    Dequantization, Prediction, Inverse Transforms, and Reconstruction

    To be honest, now that I solved the major problem, I’m getting a little tired of making these pictures. Long story short, all elements of the original 16 subblocks are dequantized and their DC coefficients are filled in with the appropriate item from the reconstructed Y2 block. A base predictor block is generated (all 128 values in this case). And each Y block is run through the inverse transform and added to the predictor block. The following is the reconstruction :



    And if you compare that against the original luma macroblock (I don’t feel like doing it right now), you’ll find that it’s pretty close.

    I can’t believe how close I was all this time, and how long that pitch bug held me up.