Recherche avancée

Médias (1)

Mot : - Tags -/book

Autres articles (47)

  • HTML5 audio and video support

    13 avril 2011, par

    MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
    The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
    For older browsers the Flowplayer flash fallback is used.
    MediaSPIP allows for media playback on major mobile platforms with the above (...)

  • De l’upload à la vidéo finale [version standalone]

    31 janvier 2010, par

    Le chemin d’un document audio ou vidéo dans SPIPMotion est divisé en trois étapes distinctes.
    Upload et récupération d’informations de la vidéo source
    Dans un premier temps, il est nécessaire de créer un article SPIP et de lui joindre le document vidéo "source".
    Au moment où ce document est joint à l’article, deux actions supplémentaires au comportement normal sont exécutées : La récupération des informations techniques des flux audio et video du fichier ; La génération d’une vignette : extraction d’une (...)

  • Support audio et vidéo HTML5

    10 avril 2011

    MediaSPIP utilise les balises HTML5 video et audio pour la lecture de documents multimedia en profitant des dernières innovations du W3C supportées par les navigateurs modernes.
    Pour les navigateurs plus anciens, le lecteur flash Flowplayer est utilisé.
    Le lecteur HTML5 utilisé a été spécifiquement créé pour MediaSPIP : il est complètement modifiable graphiquement pour correspondre à un thème choisi.
    Ces technologies permettent de distribuer vidéo et son à la fois sur des ordinateurs conventionnels (...)

Sur d’autres sites (4817)

  • x264 encoder with JNA

    17 mai 2014, par baba

    I have been busy creating a JNA wrapper around x264.dll. I have the following class for my x264_param_t :

    http://pastebin.com/Mh4JkVpP

    However, when I try to initialize my x264_param_t like that

    x264_param_t param_t = new x264_param_t;

    I get the following error :

    Exception in thread "main" java.lang.IllegalArgumentException: Can't determine size of nested structure: Can't instantiate class anotherReversed.x264_param_t$Vui (java.lang.InstantiationException: anotherReversed.x264_param_t$Vui)
           at com.sun.jna.Structure.calculateSize(Structure.java:790)
           at com.sun.jna.Structure.allocateMemory(Structure.java:287)
           at com.sun.jna.Structure.<init>(Structure.java:177)
           at com.sun.jna.Structure.<init>(Structure.java:167)
           at com.sun.jna.Structure.<init>(Structure.java:163)
           at com.sun.jna.Structure.<init>(Structure.java:154)
           at anotherReversed.x264_param_t.<init>(x264_param_t.java:7)
    </init></init></init></init></init>

    If I comment out the Vui in it’s parent class constructor, the instantiation is ok. I wonder what is different with EXACTLY this nested structure, as there are 2 others (namely Rc and Analyse ) that are nested in the same way. Somehow, though, JNA isn’t able to find the required size for Vui. Any pointers ?

    Edit :
    It seems that all the other nested structs (analyse and rc ) were also not initialized. I wonder why ?

  • The 11th Hour RoQ Variation

    12 avril 2012, par Multimedia Mike — Game Hacking, dreamroq, Reverse Engineering, roq, Vector Quantization

    I have been looking at the RoQ file format almost as long as I have been doing practical multimedia hacking. However, I have never figured out how the RoQ format works on The 11th Hour, which was the game for which the RoQ format was initially developed. When I procured the game years ago, I remember finding what appeared to be RoQ files and shoving them through the open source decoders but not getting the right images out.

    I decided to dust off that old copy of The 11th Hour and have another go at it.



    Baseline
    The game consists of 4 CD-ROMs. Each disc has a media/ directory that has a series of files bearing the extension .gjd, likely the initials of one Graeme J. Devine. These are resource files which are merely headerless concatenations of other files. Thus, at first glance, one file might appear to be a single RoQ file. So that’s the source of some of the difficulty : Sending an apparent RoQ .gjd file through a RoQ player will often cause the program to complain when it encounters the header of another RoQ file.

    I have uploaded some samples to the usual place.

    However, even the frames that a player can decode (before encountering a file boundary within the resource file) look wrong.

    Investigating Codebooks Using dreamroq
    I wrote dreamroq last year– an independent RoQ playback library targeted towards embedded systems. I aimed it at a gjd file and quickly hit a codebook error.

    RoQ is a vector quantizer video codec that maintains a codebook of 256 2×2 pixel vectors. In the Quake III and later RoQ files, these are transported using a YUV 4:2:0 colorspace– 4 Y samples, a U sample, and a V sample to represent 4 pixels. This totals 6 bytes per vector. A RoQ codebook chunk contains a field that indicates the number of 2×2 vectors as well as the number of 4×4 vectors. The latter vectors are each comprised of 4 2×2 vectors.

    Thus, the total size of a codebook chunk ought to be (# of 2×2 vectors) * 6 + (# of 4×4 vectors) * 4.

    However, this is not the case with The 11th Hour RoQ files.

    Longer Codebooks And Mystery Colorspace
    Juggling the numbers for a few of the codebook chunks, I empirically determined that the 2×2 vectors are represented by 10 bytes instead of 6. Now I need to determine what exactly these 10 bytes represent.

    I should note that I suspect that everything else about these files lines up with successive generations of the format. For example if a file has 640×320 resolution, that amounts to 40×20 macroblocks. dreamroq iterates through 40×20 8×8 blocks and precisely exhausts the VQ bitstream. So that all looks valid. I’m just puzzled on the codebook format.

    Here is an example codebook dump :

    ID 0x1002, len = 0x0000014C, args = 0x1C0D
      0 : 00 00 00 00 00 00 00 00 80 80
      1 : 08 07 00 00 1F 5B 00 00 7E 81
      2 : 00 00 15 0F 00 00 40 3B 7F 84
      3 : 00 00 00 00 3A 5F 18 13 7E 84
      4 : 00 00 00 00 3B 63 1B 17 7E 85
      5 : 18 13 00 00 3C 63 00 00 7E 88
      6 : 00 00 00 00 00 00 59 3B 7F 81
      7 : 00 00 56 23 00 00 61 2B 80 80
      8 : 00 00 2F 13 00 00 79 63 81 83
      9 : 00 00 00 00 5E 3F AC 9B 7E 81
      10 : 1B 17 00 00 B6 EF 77 AB 7E 85
      11 : 2E 43 00 00 C1 F7 75 AF 7D 88
      12 : 6A AB 28 5F B6 B3 8C B3 80 8A
      13 : 86 BF 0A 03 D5 FF 3A 5F 7C 8C
      14 : 00 00 9E 6B AB 97 F5 EF 7F 80
      15 : 86 73 C8 CB B6 B7 B7 B7 85 8B
      16 : 31 17 84 6B E7 EF FF FF 7E 81
      17 : 79 AF 3B 5F FC FF E2 FF 7D 87
      18 : DC FF AE EF B3 B3 B8 B3 85 8B
      19 : EF FF F5 FF BA B7 B6 B7 88 8B
      20 : F8 FF F7 FF B3 B7 B7 B7 88 8B
      21 : FB FF FB FF B8 B3 B4 B3 85 88
      22 : F7 FF F7 FF B7 B7 B9 B7 87 8B
      23 : FD FF FE FF B9 B7 BB B7 85 8A
      24 : E4 FF B7 EF FF FF FF FF 7F 83
      25 : FF FF AC EB FF FF FC FF 7F 83
      26 : CC C7 F7 FF FF FF FF FF 7F 81
      27 : FF FF FE FF FF FF FF FF 80 80
    

    Note that 0x14C (the chunk size) = 332, 0x1C and 0x0D (the chunk arguments — count of 2×2 and 4×4 vectors, respectively) are 28 and 13. 28 * 10 + 13 * 4 = 332, so the numbers check out.

    Do you see any patterns in the codebook ? Here are some things I tried :

    • Treating the last 2 bytes as U & V and treating the first 4 as the 4 Y samples :


    • Treating the last 2 bytes as U & V and treating the first 8 as 4 16-bit little-endian Y samples :


    • Disregarding the final 2 bytes and treating the first 8 bytes as 4 RGB565 pixels (both little- and big-endian, respectively, shown here) :


    • Based on the type of data I’m seeing in these movies (which appears to be intended as overlays), I figured that some of these bits might indicate transparency ; here is 15-bit big-endian RGB which disregards the top bit of each pixel :


    These images are taken from the uploaded sample bdpuz.gjd, apparently a component of the puzzle represented in this screenshot.

    Unseen Types
    It has long been rumored that early RoQ files could contain JPEG images. I finally found one such specimen. One of the files bundled early in the uploaded fhpuz.gjd sample contains a JPEG frame. It’s a standard JFIF file and can easily be decoded after separating the bytes from the resource using ‘dd’. JPEGs serve as intraframes in the coding scheme, with successive RoQ frames moving objects on top.

    However, a new chunk type showed up as well, one identified by 0×1030. I have never encountered this type. Where could I possibly find data about this ? Fortunately, iD Games recently posted all of their open sourced games at Github. Reading through the code for their official RoQ decoder, I see that this is called a RoQ_PACKET. The name and the code behind it are both supremely unhelpful. The code is basically a no-op. The payloads of the various RoQ_PACKETs from one sample are observed to be either 8784, 14752, or 14760 bytes in length. It’s very likely that this serves the same purpose as the JPEG intraframes.

    Other Tidbits
    I read through the readme.txt on the first game disc and found this nugget :

            g)      Animations displayed normally or in SPOOKY MODE
    

    SPOOKY MODE is blue-tinted grayscale with color cursors, puzzle
    and game pieces. It is the preferred display setting of the
    developers at Trilobyte. Just for fun, try out the SPOOKY
    MODE.

    The MobyGames screenshot page has a number of screenshots labeled as being captured in spooky mode. Color tricks ?

    Meanwhile, another twist arose as I kept tweaking dreamroq to deal with more RoQ weirdness : After modifying my dreamroq code to handle these 10-byte vectors, it eventually chokes on another codebook. These codebooks happen to have 6-byte vectors again ! Fortunately, I was already working on a scheme to automatically detect which codebook is in play (plugging the numbers into a formula and seeing which vector size checks out).

  • Creating A Lossless SMC Encoder

    26 avril 2011, par Multimedia Mike — General

    Look, I can’t explain how or why I come up with this stuff. For some reason, I thought it would be interesting to write a new encoder for the Apple SMC video codec. I can’t even remember why. I just sat down the other day, started writing, and now I have a lossless SMC encoder that I’m not sure what to do with. Maybe this is to be my new thing— writing encoders for marginal multimedia formats.

    Introduction
    SMC is a vector quantizer (a lossy method) but I decided to attack it from the angle of lossless encoding. A.k.a. Apple Graphics Codec, SMC operates on 4x4 blocks in an 8-bit paletted colorspace. Each 4x4 block can be encoded with 1, 2, 4, 8, or 16 colors. Blocks can also be skipped (copied from previous frame) or copied from blocks rendered immediately prior within the same frame.

    Step 1 : Validating Infrastructure
    The goal of this step is to encode the most braindead SMC frame possible and see if FFmpeg/libav’s QuickTime muxer can create a valid file. I think the simplest frame would be one in which each vector is encoded with the single-color mode, starting with color 0 and incrementing through the palette.

    Status : Successful. The only ’trick’ was to set avctx->bits_per_coded_sample to 8. (For fun, this can also be set to 40 (8 | 0x20) to specify a grayscale palette.)



    Step 2 : Preprocessing
    The video frames will arrive at the encoder as 32-bit RGB. These will need to be converted to a paletted colorspace before encoding. I don’t want to use FFmpeg’s default dithering approach as this will result in a substantial loss of quality as described in this post. I would rather maintain a palette built from observed colors throughout successive frames. If the total number of unique observed colors ever exceeds 256, error out.

    That’s what I would like to do. However, I noticed that FFmpeg/libav’s QuickTime muxer has never taken into account the possibility of encoding palettes. The path of least resistance in this case is to dither the input to match QuickTime’s default 8-bit palette (if a paletted QuickTime file does not specify a palette, a default 1-, 2-, 4-, or 8-bit palette is selected).

    Status : Successful, if slow. I definitely need to optimize this step later.

    Step 3 : Most Naive Encoding
    The most basic encoding is to "encode" each block as a 16-color block. This will actually result in a slightly larger frame size than a raw encoding since each 4x4 block will be prepended by a byte opcode (0xE0 in this case) to indicate encoding mode. This should demonstrate that the encoder is functioning at the most basic level.

    Status : Successful. Try not to laugh too hard at the Big Buck Bunny dithered to an 8-bit palette :



    Step 4 : Better Representation
    It seems to me that encoding this format (losslessly) will entail performing vector operations on lots of 16-element (4x4-pixel) vectors. These could be done on the frame as-is, but it strikes me as more efficient and perhaps less error prone to rearrange the input images into a vector of vectors (or array of arrays if you prefer) :

      0  1  2  3  w ...
      4  5  6  7  x ...
      8  9  A  B  y ...
      C  D  E  F  z ...
    
      0 : [0 1 2 3 4 5 6 7 8 9 A B C D E F]
      1 : [...]
    

    Status : Successful.

    Step 5 : Add Interframe Skip Codes
    Time to add a bit of brainpower to the proceedings : On non-keyframes, compare the current vector to the vector at the same position from the previous frame.

    Test this by encoding a pair of identical frames. Ideally, all codes should be skip codes.

    Status : Successful, though my vector matching function could probably be improved.

    Step 6 : Analyze Blocks For Optimal Color Coding
    This is where things get potentially interesting, algorithmically. At least, I need to figure out (or look up) an algorithm to count the unique elements in a vector.

    Naive algorithm (i.e., first thing I can think of) :

    • initialize a count variable to 0
    • initialize an array of 256 flags to false
    • for each 8-bit element in vector :
      • if flag array[element] is 0, set array[element] to true and increment count

    Status : Successful. Here is the distribution for the 640x360 Big Buck Bunny title :

    1194 4636 4113 2140 1138 568 325 154 80 36 9 5 2 0 0 0

    Or, in pretty graph form, demonstrating that vectors with few distinct elements dominate :



    Step 7 : Encode Monochrome Blocks
    At this point, the structure is starting to come together pretty well. This phase involves encoding a 0x60 opcode and a palette index when the count_distinct() function returns 1.

    Status : Absolutely no problem.

    Step 8 : Encode 2-, 4-, and 8-color Modes
    This step is a little more involved. This is where SMC’s 2-, 4-, and 8-color circular palette caches come into play. E.g., when the first 2-color block is encoded, the pair of colors it uses will be inserted into entry 0 of the 2-color cache. During the next 2-color block encoding, if the block uses a pair of colors that already occurs in the cache, the encoding can reference that cache entry. Otherwise, it adds the pair to the next available cache entry, looping back around to 0 as necessary.

    I think I should modify the count_distinct() function to also return a 16-byte array that contains a sorted list of the palette indicies used in the vector. The color pair cache will contain 256 16-bit, 32-bit ints for the quads and 64-bit ints for the octets. This will allow a slightly faster linear cache search.

    Status : The 2-color encoding wasn’t too much trouble and I was able to adapt it to the 4-color mode pretty quickly afterward. I’m still having trouble with the insane 8-color coding mode, though. So that’s commented out for the time being.

    Step 9 : Run Encoding and Putting It All Together
    For each frame, convert the input pixels to a paletted format via one method or another (match to default QuickTime palette for first pass). Then, preprocess each vector to determine the minimum number of elements that can be used to represent it, storing the sorted list of distinct colors in a separate array. The number of elements can either be 0 (only for interframes and indicates a skip block), 1, 2, 4, 8, or 16. Also during this phase, for each vector after the first, test if the vector is the same as the previous vector. If it is, denote this fact in the preprocessed encoding (set the high bit of the element count number).

    Finally, pack it into the bytestream. Iterate through the element count array and search for the longest runs of elements that are encoded with the same mode (up to 256 for skip modes, up to 16 for other modes). If the high bit of an element count is set, that indicates that a copy mode can be encoded. Look for the longest run of element counts with the high bit set and encode a copy mode.

    Status : In-process. Will finish this as motivation strikes.