Recherche avancée

Médias (0)

Mot : - Tags -/organisation

Aucun média correspondant à vos critères n’est disponible sur le site.

Autres articles (91)

  • Problèmes fréquents

    10 mars 2010, par

    PHP et safe_mode activé
    Une des principales sources de problèmes relève de la configuration de PHP et notamment de l’activation du safe_mode
    La solution consiterait à soit désactiver le safe_mode soit placer le script dans un répertoire accessible par apache pour le site

  • Les autorisations surchargées par les plugins

    27 avril 2010, par

    Mediaspip core
    autoriser_auteur_modifier() afin que les visiteurs soient capables de modifier leurs informations sur la page d’auteurs

  • Publier sur MédiaSpip

    13 juin 2013

    Puis-je poster des contenus à partir d’une tablette Ipad ?
    Oui, si votre Médiaspip installé est à la version 0.2 ou supérieure. Contacter au besoin l’administrateur de votre MédiaSpip pour le savoir

Sur d’autres sites (16350)

  • Announcing the World’s Worst VP8 Encoder

    5 octobre 2010, par Multimedia Mike — Outlandish Brainstorms, VP8

    I wanted to see if I could write an extremely basic VP8 encoder. It turned out to be one of the hardest endeavors I have ever attempted (and arguably one of the least successful).

    Results
    I started with the Big Buck Bunny title image :



    And this is the best encoding that this experiment could yield :



    Squint hard enough and you can totally make out the logo. Pretty silly effort, I know. It should also be noted that the resultant .webm file holding that single 400×225 image was 191324 bytes. When FFmpeg decoded it to a PNG, it was only 187200 bytes.

    The Story
    Remember my post about a naive SVQ1 encoder ? Long story short, I set out to do the same thing with VP8. (I wanted to the same thing with VP3/Theora for years. But take a good look at what it would entail to create even the most basic bitstream. As involved as VP8 may be, its bitstream is absolutely trivial compared to VP3/Theora.)

    With the naive SVQ1 encoder, the goal was to create a minimally compliant SVQ1 encoded bitstream. For this exercise, I similarly hypothesized what it would take to create the most basic, syntactically correct VP8 bitstream with the least amount of effort. These are the overall steps I came up with :

    • Intra-only
    • Create a basic bitstream header that disables any extra features (no modification of default tables)
    • Use a static quantizer
    • Use intra 16×16 coding for each macroblock
    • Use vertical prediction for the 16×16 intra coding

    For coding each macroblock :

    • Subtract vertical predictor from each row
    • Perform forward transform on each 4×4 sub block
    • Perform forward WHT on luma plane DCT coefficients
    • Pack the coefficients into the bitstream via the Boolean encoder

    It all sounds so simple. But, like I said in the SVQ1 post, it’s all very much like carefully bootstrapping a program to run on a particular CPU, and the VP8 decoder serves as the CPU. I’m confident that I have the bitstream encoding correct because, at the very least, the decoder agrees precisely with the encoder about the numbers represented by those 0s and 1s.

    What’s Wrong ?
    Compromises were made for the sake of getting some vaguely recognizable image encoded in a minimally valid manner. One big stumbling block is that I couldn’t seem to encode an end of block (EOB) condition correctly. I then realized that it’s perfectly valid to just encode a lot of zero coefficients rather than signaling EOB. An encoding travesty, I know, and likely one reason that the resulting filesize is so huge.

    More drama occurred when I hit my first block that had all zeros. There were complications in that situation that I couldn’t seem to avoid. So I forced the first AC coefficient to be 1 in that case. Hey, the decoder liked it.

    As for the generally weird look of the decoded image, I’m thinking that could either be : A) an artifact of forcing 16×16 vertical prediction or ; or B) a mistake in the way that I transformed and predicted stuff before sending it to the decoder. The smart money is on a combination of both A and B.

    Then again, as the SVQ1 experiment demonstrated, I shouldn’t expect extraordinary visual quality when setting the bar this low (i.e., just getting some bag of bits that doesn’t make the decoder barf).

  • Minimal Understanding of VP8′s Forward Transform

    16 novembre 2010, par Multimedia Mike — VP8

    Regarding my toy VP8 encoder, Pengvado mentioned in the comments of my last post, “x264 looks perfect using only i16x16 DC mode. You must be doing something wrong in computing residual or fdct or quantization.” This makes a lot of sense. The encoder generates a series of elements which describe how to reconstruct the original image. Intra block reconstruction takes into consideration the following elements :



    I have already verified that both my encoder and FFmpeg’s VP8 decoder agree precisely on how to reconstruct blocks based on the predictors, coefficients, and quantizers. Thus, if the decoded image still looks crazy, the elements the encoder is generating to describe the image must be wrong.

    So I started studying the forward DCT, which I had cribbed wholesale from the original libvpx 0.9.0 source code. It should be noted that the formal VP8 spec only defines the inverse transform process, not the forward process. I was using a version designated as the “short” version, vs. the “fast” version. Then I looked at the 0.9.5 FDCT. Then I got the idea of comparing the results of each.

    input:   92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93

    • libvpx 0.9.0 “short” :
      forward : -314 5 1 5 4 5 -2 0 0 1 -1 -1 1 11 -3 -4
      inverse : 92 91 89 86 89 86 91 90 91 90 88 86 88 86 89 89
      
    • libvpx 0.9.0 “fast” :
      forward : -314 4 0 5 4 4 -2 0 0 1 0 -1 1 11 -2 -5
      inverse : 91 91 89 86 88 86 91 90 91 90 88 86 88 86 89 89
      
    • libvpx 0.9.5 “short” :
      forward : -312 7 1 0 1 12 -5 2 2 -3 3 -1 1 0 -2 1
      inverse : 92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
      

    I was surprised when I noticed that input[] != idct(fdct(input[])) in some of the above cases. Then I remembered that the aforementioned property isn’t what is meant by a “bit-exact” transform– only that all implementations of the inverse transform are supposed to produce bit-exact output for a given vector of input coefficients.

    Anyway, I tried applying each of these forward transforms. I got slightly differing results, with the latest one I tried (the fdct from libvpx 0.9.5) producing the best results (to my eye). At least the trees look better in the Big Buck Bunny logo image :



    The dense trees of the Big Buck Bunny logo using one of the libvpx 0.9.0 forward transforms


    The same segment of the image using the libvpx 0.9.5 forward transform

    Then again, it could be that the different numbers generated by the newer forward transform triggered different prediction modes to be chosen. Overall, adapting the newer FDCT did not dramatically improve the encoding quality.

    Working on the intra 4×4 mode encoding is generating some rather more accurate blocks than my intra 16×16 encoder. Pengvado indicated that x264 generates perfectly legible results when forcing the encoder to only use intra 16×16 mode. To be honest, I’m having trouble understanding how that can possibly occur thanks to the Walsh-Hadamard transform (WHT). I think that’s where a lot of the error is creeping in with my intra 16×16 encoder. Then again, FFmpeg implements an inverse WHT function that bears ‘vp8′ in its name. This implies that it’s custom to the algorithm and not exactly shared with H.264.

  • RoQ on Dreamcast

    18 mars 2011, par Multimedia Mike — Sega Dreamcast

    I have been working on that challenge to play back video on the Sega Dreamcast. To review, I asserted that the RoQ format would be a good fit for the Sega Dreamcast hardware. The goal was to play 640x480 video at 30 frames/second. Short version : I have determined that it is possible to decode such video in real time. However, I ran into certain data rate caveats.

    First off : Have you ever wondered if the Dreamcast can read an 80mm optical disc ? It can ! I discovered this when I only had 60 MB of RoQ samples to burn on a disc and a spindle full of these 210MB-capacity 80mm CD-Rs that I never have occasion to use.



    New RoQ Library
    There are open source RoQ decoders out there but I decided to write a new one. A few reasons : 1) RoQ is so simple that I didn’t think it would take too long ; 2) it would be nice to have a RoQ library that is license-compatible (BSD-like) with the rest of the KallistiOS distribution ; 3) the idroq.tar.gz distribution, while license-compatible, has enough issues that I didn’t want to correct it.

    Thankfully, I was correct about the task not being too difficult : I put together a new RoQ decoder in short order. I’m a bit embarrassed to admit that the part I had the most trouble with was properly converting YUV -> RGB.

    About the approach I took : While the original idroq.tar.gz decoder maintains YUV 4:2:0 codebooks (which led to chroma bugs during motion compensation) and FFmpeg’s decoder maintains YUV 4:4:4 codebooks, this decoder is built to convert the YUV 4:2:0 vectors into RGB565 vectors during the vector unpacking phase. Thus, the entire frame is rendered in RGB565 — no lengthy YUV -> RGB conversion after decoding — and all pixels are shuffled around as 16-bit units (minor speedup vs. shuffling everything as bytes).

    I also entertained the idea of maintaining YUYV codebooks (since the DC supports that colorspace as a texture format). But I scrapped that idea when I remembered it would lead to the same chroma bleeding problem seen in the original idroq.tar.gz decoder.

    Onto The Dreamcast
    I developed the library on a Linux computer, allowing it to output a series of PNM files for visual verification and debugging. Dropping it into a basic DC/KOS-compatible program was trivial and the first order of business was profiling.

    At first, I profiled the entire decode operation : open file, then read and decode each chunk while tossing away the results. I was roundly disappointed to see that, e.g., an 8.5-second RoQ sample needed a little more than 20 seconds to complete. Not real time. I performed a series of optimizations on the decoding library that netted notable performance gains when profiling on Linux. When I brought these same optimizations over to the DC, decoding time didn’t improve at all. This was my first suspicion that perhaps my assumptions regarding the DC’s optical drive’s data rate were not correct.

    Dreamcast Data Rate Profiling
    Let’s start with some definitions : In terms of data rate, an ’X’, i.e., 1X is the minimum data rate needed to read CD quality audio from a disc. At that speed, a drive should be able to stream 75 sectors each second. When reading mode 1/form 1 CD-ROM data, each sector has 2048 bytes (2 kbytes), so a single-speed data rate should achieve 150 kbytes/sec.

    The Dreamcast is supposed to possess a 12X optical drive. This would imply a maximum data rate of 150 kbytes/sec * 12 = 1800 kbytes/sec.

    Rigging up a trivial experiment using the RoQ samples burned on a few different CD-R discs, the best data rate I can see is about 500-525 kbytes/sec, or around 3.5X.

    Where’s the discrepancy ? My first theory has to do with the fact that not all optical media is created equal. This is why optical drives often advertise a slew of numbers which refer to the best theoretical speed for reading a CD vs. writing a CD-R vs. writing a CD-RW, etc. Perhaps the DC drive can’t read CD-Rs very quickly. To test this theory, I tried streaming a large file from a conventionally mastered CD-ROM. This worked well for the closest CD-ROM I had on hand : I was able to stream data at a rate that works out to about 6.5X.

    I smell a science project for another evening : Profiling read speeds from a mastered CD-ROM, burned CD-R, and also a mastered GD-ROM, on each of the 3 Dreamcast consoles I possess (I’ve heard that there’s variance between optical drives depending on manufacturing run).

    The Good News
    I added a little finer-grained code to profile just the video decoding functions. The good news is that the decoder meets my real time goals : That 8.5-second RoQ sample encoded at 640x480x30fps makes its way through the video decoding functions on the DC in a little less than 5 seconds. If the optical drive can supply the data fast enough, the video decoder can take care of the rest.

    The RoQ encoder included with FFmpeg does not honor any bitrate parameters. Instead, I encoded the same file at 320x240. It reportedly decoded in real time and can be streamed in real time as well.

    I say "reportedly" because I’m simply working from textual output at this point ; the next phase is to hook the decoder up to the display hardware.