Breaking Eggs And Making Omelettes
A blog dealing with technical multimedia matters, binary reverse engineering, and the occasional video game hacking.
Articles published on the site
-
How To Write An Oscilloscope
I'm trying to figure out how to write a software oscilloscope audio visualization. It's made more frustrating by the knowledge that I am certain that I have accomplished this task before.
In this context, the oscilloscope is used to draw the time-domain samples of an audio waveform. I have written such a plugin as part of the xine project. However, for that project, I didn't have to write the full playback pipeline: my plugin was just handed some PCM data and drew some graphical data in response. Now I'm trying to write the entire engine in a standalone program and I'm wondering how to get it just right.
This is an SDL-based oscilloscope visualizer and audio player for the Game Music Emu library. My approach is to have an audio buffer that holds a second of audio (44100 stereo 16-bit samples). The player updates the visualization at 30 frames per second. The o-scope is 512 pixels wide. So, at every 1/30th second interval, the player dips into the audio buffer at position ((frame_number % 30) * 44100 / 30) and takes the first 512 stereo frames for plotting on the graph.
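A minimal sketch of that indexing, in Python rather than the player's own code. The function names, the 256-pixel graph height, and the choice to plot only the left channel are assumptions made for illustration; only the constants and the position formula come from the description above.

```python
SAMPLE_RATE = 44100      # audio frames per second
FPS = 30                 # visualization updates per second
SCOPE_WIDTH = 512        # pixels across, one audio frame per pixel


def scope_window(frame_number, sample_rate=SAMPLE_RATE, fps=FPS, width=SCOPE_WIDTH):
    """Return (start, end) audio-frame indices into a one-second buffer."""
    start = (frame_number % fps) * sample_rate // fps
    return start, start + width


def scope_points(buffer, frame_number, height=256):
    """Map left-channel 16-bit samples to (x, y) pixel coordinates.

    `buffer` is assumed to be a sequence of (left, right) tuples holding
    exactly one second of audio.
    """
    start, end = scope_window(frame_number)
    points = []
    for x, (left, _right) in enumerate(buffer[start:end]):
        # Scale -32768..32767 onto the graph, with silence at mid-height.
        y = (height // 2) - (left * (height // 2)) // 32768
        points.append((x, min(max(y, 0), height - 1)))
    return points
```

Each update advances the window by 44100 / 30 = 1470 audio frames but plots only 512 of them, so roughly two thirds of the audio in each interval never reaches the graph. The last window of the second starts at frame 42630 and ends at 43142, so it never runs off the end of the buffer.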
It seems to be working okay, I guess. The only problem is that the A/V sync seems to be slightly misaligned. I am just wondering if this is the correct approach. Perhaps the player should be performing some slightly more complicated calculation over those (44100/30) audio frames during each update in order to obtain a more accurate graph? I described my process to an electrical engineer friend of mine and he insisted that I needed to apply something called hysteresis to the output or I would never get accurate A/V sync in this scenario.
Further, I know that some schools of thought on these matters require that the dots in those graphs be connected, that the scattered points simply won't do. I guess it's a stylistic choice.
Still, I think I have a reasonable, workable approach here. I might just be starting the visualization 1/30th of a second too late.
-
Libav/FFmpeg and Google Summer of Code 2012
26 April 2012, by Multimedia Mike - General, ffmpeg, gsoc, gsoc2012, hevc, libav, opus, rtmp, ut video
So, the projects are participating in the Google Summer of Code for the 2012 season. (While Libav is the project officially accepted to participate, I still refer to the projects because FFmpeg will also benefit.)
Here are the students, projects, and mentors for this summer:
- Andrew D’Addesio is working on an Opus Decoder, mentored by Justin Ruggles
- Guillaume Martres is working on an HEVC video decoder, mentored by Mashiat Sarker Shakkhar
- Jan Ekström is working on an LGPL Ut Video encoder, mentored by Kostya Shishkov
- Jordi Ortiz is working to rewrite avserver, mentored by Luca Barbato
- Samuel Pitoiset is working on an RTMP[E|S|T|TE] protocol implementation, mentored by Martin Storsjö
Wish them luck– these are some ambitious projects.
-
Solving The XVD Puzzle
I downloaded a multimedia file a long time ago (at least, I strongly suspected it was a multimedia file which is why I downloaded it). It went by the name of ‘lamborghini_850kbps.vg2’. I have had it in my collection for at least 7 years. I couldn’t remember where I found it. I downloaded it before it occurred to me to take notes about this sort of stuff.
I found myself staring at the file again today and Googled the filename. This led me to a few Japanese sites which also contained working URLs for a few more .vg2 samples. Some other clues led me to a Russian language forum where someone had linked to a site that had Win32 codec modules that could process the files. The site was defunct but the Internet Archive Wayback Machine kept a copy for me, as well as copies of several more .vg2 samples from a defunct Japanese site previously involved with this codec.
Sometimes this internet technology works really well. But I digress.
Anyway, through all this, I finally found a clue: XVD. And wouldn’t you know, there is already a basic page on the MultimediaWiki describing the technology. In fact, while VG2 is a custom container, the MultimediaWiki states that the video component has a FourCC of VGMV, and there is already a file named VGMV.avi in the root V-codecs/ samples directory, something I vow to correct (that’s a big pet peeve of mine: putting samples in the root V-codecs/ or A-codecs/ directories).
XVD… XVD… XVD… why does that sound so familiar? Oh, of course; there is a company named XVD and they have an office in the Bay Area which I have passed on numerous occasions, like this morning:
Someone originally connected with the multimedia technology in question operates a website which contains an unofficial history of the XVD tech. At first, I was wondering if the technology was completely defunct (and should therefore be open sourced). But if XVD’s solutions page (dated 2010) is to be believed, the technology is still in service, and purported to be better than H.264 and VC-1: “The current generation of XVD video compression technology provides better video quality at any given data rate than standards-based codecs (H.264 or VC-1) with four times lower encoding complexity (when compared with H.264 Main Profile).”
If they say so. For my part, I’m just happy that I have finally figured out what this lamborghini_850kbps.vg2 is so that I can properly catalog it on the samples site, which I have now done, along with other samples and various codecs modules.
This episode reminds me that there’s a branch office of Zygo Corporation close to my home (though the headquarters are far, far away). The companies you see in Silicon Valley. Anyway, long-time open source multimedia hackers will no doubt recognize Zygo from the ZyGo FourCC & video codec transported in QuickTime files that was almost decode-able using an H.263 decoder.
I may never learn what Zygo’s core competency actually is, but I will always remember their multimedia tech every time I run past their office.
-
The 11th Hour RoQ Variation
12 April 2012, by Multimedia Mike - Game Hacking, dreamroq, Reverse Engineering, roq, Vector Quantization
I have been looking at the RoQ file format almost as long as I have been doing practical multimedia hacking. However, I have never figured out how the RoQ format works on The 11th Hour, which was the game for which the RoQ format was initially developed. When I procured the game years ago, I remember finding what appeared to be RoQ files and shoving them through the open source decoders but not getting the right images out.
I decided to dust off that old copy of The 11th Hour and have another go at it.
Baseline
The game consists of 4 CD-ROMs. Each disc has a media/ directory that has a series of files bearing the extension .gjd, likely the initials of one Graeme J. Devine. These are resource files which are merely headerless concatenations of other files. Thus, at first glance, one file might appear to be a single RoQ file. So that’s the source of some of the difficulty: Sending an apparent RoQ .gjd file through a RoQ player will often cause the program to complain when it encounters the header of another RoQ file.
I have uploaded some samples to the usual place.
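One way to pull the concatenated files apart is to scan for the RoQ signature and cut at each hit. The sketch below is Python and rests on an assumption that is not stated in this post: that every RoQ file opens with a chunk whose ID is 0x1084 and whose size field is 0xFFFFFFFF, both stored little-endian. The function names are made up.

```python
ROQ_MAGIC = b"\x84\x10\xff\xff\xff\xff"  # assumed: chunk ID 0x1084, size 0xFFFFFFFF


def find_roq_offsets(data):
    """Return every offset in `data` where the assumed RoQ signature starts."""
    offsets = []
    pos = data.find(ROQ_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(ROQ_MAGIC, pos + 1)
    return offsets


def split_resource(data):
    """Slice a headerless concatenation into candidate RoQ files."""
    offsets = find_roq_offsets(data)
    ends = offsets[1:] + [len(data)]
    return [data[start:end] for start, end in zip(offsets, ends)]
```

This is a heuristic. Any non-RoQ file stored in the resource ends up glued to the piece before it, and the six signature bytes could in principle turn up by chance inside a payload.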
However, even the frames that a player can decode (before encountering a file boundary within the resource file) look wrong.
Investigating Codebooks Using dreamroq
I wrote dreamroq last year, an independent RoQ playback library targeted towards embedded systems. I aimed it at a gjd file and quickly hit a codebook error.
RoQ is a vector quantizer video codec that maintains a codebook of 256 2×2 pixel vectors. In the Quake III and later RoQ files, these are transported using a YUV 4:2:0 colorspace: 4 Y samples, a U sample, and a V sample to represent 4 pixels. This totals 6 bytes per vector. A RoQ codebook chunk contains a field that indicates the number of 2×2 vectors as well as the number of 4×4 vectors. The latter vectors are each composed of 4 2×2 vectors.
Thus, the total size of a codebook chunk ought to be (# of 2×2 vectors) * 6 + (# of 4×4 vectors) * 4.
However, this is not the case with The 11th Hour RoQ files.
Longer Codebooks And Mystery Colorspace
Juggling the numbers for a few of the codebook chunks, I empirically determined that the 2×2 vectors are represented by 10 bytes instead of 6. Now I need to determine what exactly these 10 bytes represent.
I should note that I suspect that everything else about these files lines up with successive generations of the format. For example if a file has 640×320 resolution, that amounts to 40×20 macroblocks. dreamroq iterates through 40×20 8×8 blocks and precisely exhausts the VQ bitstream. So that all looks valid. I’m just puzzled by the codebook format.
Here is an example codebook dump:
ID 0x1002, len = 0x0000014C, args = 0x1C0D
 0: 00 00 00 00 00 00 00 00 80 80
 1: 08 07 00 00 1F 5B 00 00 7E 81
 2: 00 00 15 0F 00 00 40 3B 7F 84
 3: 00 00 00 00 3A 5F 18 13 7E 84
 4: 00 00 00 00 3B 63 1B 17 7E 85
 5: 18 13 00 00 3C 63 00 00 7E 88
 6: 00 00 00 00 00 00 59 3B 7F 81
 7: 00 00 56 23 00 00 61 2B 80 80
 8: 00 00 2F 13 00 00 79 63 81 83
 9: 00 00 00 00 5E 3F AC 9B 7E 81
10: 1B 17 00 00 B6 EF 77 AB 7E 85
11: 2E 43 00 00 C1 F7 75 AF 7D 88
12: 6A AB 28 5F B6 B3 8C B3 80 8A
13: 86 BF 0A 03 D5 FF 3A 5F 7C 8C
14: 00 00 9E 6B AB 97 F5 EF 7F 80
15: 86 73 C8 CB B6 B7 B7 B7 85 8B
16: 31 17 84 6B E7 EF FF FF 7E 81
17: 79 AF 3B 5F FC FF E2 FF 7D 87
18: DC FF AE EF B3 B3 B8 B3 85 8B
19: EF FF F5 FF BA B7 B6 B7 88 8B
20: F8 FF F7 FF B3 B7 B7 B7 88 8B
21: FB FF FB FF B8 B3 B4 B3 85 88
22: F7 FF F7 FF B7 B7 B9 B7 87 8B
23: FD FF FE FF B9 B7 BB B7 85 8A
24: E4 FF B7 EF FF FF FF FF 7F 83
25: FF FF AC EB FF FF FC FF 7F 83
26: CC C7 F7 FF FF FF FF FF 7F 81
27: FF FF FE FF FF FF FF FF 80 80
Note that 0x14C (the chunk size) = 332, 0x1C and 0x0D (the chunk arguments — count of 2×2 and 4×4 vectors, respectively) are 28 and 13. 28 * 10 + 13 * 4 = 332, so the numbers check out.
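The same arithmetic as a short Python sketch. It assumes the 8-byte chunk header is laid out little-endian as a 16-bit ID, a 32-bit length, and a 16-bit argument; the header bytes are reconstructed from the values in the dump, not copied from a file, and the helper names are made up.

```python
import struct


def parse_chunk_header(header):
    """Unpack an 8-byte chunk header: 16-bit ID, 32-bit length, 16-bit argument."""
    chunk_id, length, args = struct.unpack("<HIH", header)
    return chunk_id, length, args


def codebook_counts(args):
    """Split the codebook argument into (2x2 count, 4x4 count)."""
    return args >> 8, args & 0xFF


def codebook_size(count_2x2, count_4x4, bytes_per_2x2):
    """Expected payload size for a given 2x2 vector width."""
    return count_2x2 * bytes_per_2x2 + count_4x4 * 4


# Header reconstructed from the dump: ID 0x1002, len 0x14C, args 0x1C0D.
header = bytes.fromhex("02104c0100000d1c")
chunk_id, length, args = parse_chunk_header(header)
n2, n4 = codebook_counts(args)
print(hex(chunk_id), length, n2, n4)   # 0x1002 332 28 13
print(codebook_size(n2, n4, 6))        # 220, the usual 6-byte layout
print(codebook_size(n2, n4, 10))       # 332, matches the chunk length
```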
Do you see any patterns in the codebook? Here are some things I tried:
- Treating the last 2 bytes as U & V and treating the first 4 as the 4 Y samples:
- Treating the last 2 bytes as U & V and treating the first 8 as 4 16-bit little-endian Y samples:
- Disregarding the final 2 bytes and treating the first 8 bytes as 4 RGB565 pixels (both little- and big-endian, respectively, shown here):
- Based on the type of data I’m seeing in these movies (which appears to be intended as overlays), I figured that some of these bits might indicate transparency; here is 15-bit big-endian RGB which disregards the top bit of each pixel:
These images are taken from the uploaded sample bdpuz.gjd, apparently a component of the puzzle represented in this screenshot.
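For reference, the RGB guesses from the list can be sketched in Python as below. It covers only the pixel-unpacking step for a single 10-byte vector, and nothing here claims that any of these layouts is the right one. The function names and the bit-replication used to widen channels to 8 bits are choices made for this sketch.

```python
import struct


def rgb565(value):
    """Expand a 16-bit RGB565 value to an (r, g, b) tuple of 8-bit channels."""
    r = (value >> 11) & 0x1F
    g = (value >> 5) & 0x3F
    b = value & 0x1F
    return (r << 3 | r >> 2, g << 2 | g >> 4, b << 3 | b >> 2)


def rgb555(value):
    """Expand a 15-bit RGB value, ignoring the top bit, to 8-bit channels."""
    r = (value >> 10) & 0x1F
    g = (value >> 5) & 0x1F
    b = value & 0x1F
    return tuple(c << 3 | c >> 2 for c in (r, g, b))


def candidate_pixels(vector):
    """Return three guesses at the four pixels held in a 10-byte vector."""
    little = struct.unpack("<4H", vector[:8])
    big = struct.unpack(">4H", vector[:8])
    return {
        "rgb565_le": [rgb565(v) for v in little],
        "rgb565_be": [rgb565(v) for v in big],
        "rgb555_be": [rgb555(v) for v in big],
    }


# Entry 27 from the codebook dump.
guesses = candidate_pixels(bytes.fromhex("fffffeffffffffff8080"))
```

For that entry all three readings come out nearly white, so a single vector cannot tell them apart. The difference only shows up across a whole decoded frame.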
Unseen Types
It has long been rumored that early RoQ files could contain JPEG images. I finally found one such specimen. One of the files bundled early in the uploaded fhpuz.gjd sample contains a JPEG frame. It’s a standard JFIF file and can easily be decoded after separating the bytes from the resource using ‘dd’. JPEGs serve as intraframes in the coding scheme, with successive RoQ frames moving objects on top.
However, a new chunk type showed up as well, one identified by 0x1030. I have never encountered this type. Where could I possibly find data about this? Fortunately, id Software recently posted all of their open sourced games on GitHub. Reading through the code for their official RoQ decoder, I see that this is called a RoQ_PACKET. The name and the code behind it are both supremely unhelpful. The code is basically a no-op. The payloads of the various RoQ_PACKETs from one sample are observed to be either 8784, 14752, or 14760 bytes in length. It’s very likely that this serves the same purpose as the JPEG intraframes.
Other Tidbits
I read through the readme.txt on the first game disc and found this nugget:
g) Animations displayed normally or in SPOOKY MODE
SPOOKY MODE is blue-tinted grayscale with color cursors, puzzle and game pieces. It is the preferred display setting of the developers at Trilobyte. Just for fun, try out the SPOOKY MODE.
The MobyGames screenshot page has a number of screenshots labeled as being captured in spooky mode. Color tricks?
Meanwhile, another twist arose as I kept tweaking dreamroq to deal with more RoQ weirdness: After modifying my dreamroq code to handle these 10-byte vectors, it eventually chokes on another codebook. These codebooks happen to have 6-byte vectors again! Fortunately, I was already working on a scheme to automatically detect which codebook is in play (plugging the numbers into a formula and seeing which vector size checks out).
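The detection idea can be sketched in a few lines of Python: try each candidate vector width and keep the one whose predicted size equals the chunk length. This is an illustration, not the dreamroq code, and the function name and the candidate list are assumptions.

```python
def detect_vector_size(chunk_length, count_2x2, count_4x4, candidates=(6, 10)):
    """Return the 2x2 vector width whose predicted chunk size matches, else None."""
    for width in candidates:
        if count_2x2 * width + count_4x4 * 4 == chunk_length:
            return width
    return None


print(detect_vector_size(0x14C, 0x1C, 0x0D))  # 10, the codebook dumped earlier
```

A count of zero is commonly treated as 256 in RoQ decoders. That case is not handled here, and with a literal zero 2×2 count every width would match.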
-
G.I. Joe Custom Multimedia
30 March 2012, by Multimedia Mike - General
I received this 3-disc set of G.I. Joe CD-ROMs today:
Copyright 2003, and labeled as PC ONLY. Each disc claims to have 2 episodes. So are these some sort of video discs? Any gaming elements? I dove in to investigate.
So, it turns out that there are some games on these discs, done in Flash Player (which tells me that these were probably available on the web at some point). Here’s a shooting gallery game from the first disc:
As promised by the CD-ROM copy, the menu does grant access to 2 classic G.I. Joe episodes. Selecting either one launches this:
Powered by C-ezy? Am I interpreting that correctly? Anyway, the video player goes fullscreen and looks fine (given the source material). I can’t capture screenshots and controls are limited to: space for pause, ESC to exit player, and up/down to control volume. No seeking and certainly no onscreen controls. Pretty awful player.
Studying the first disc, I find a 550 MB file with the name 5859Hasbro.egm. Coupled with ep58.cfg and ep59.cfg files in the same directory, I gather that the disc has G.I. Joe episodes 58 and 59 (though the exact episodes, “There’s No Place Like Springfield” parts 1 and 2, are listed on Wikipedia as being episodes 154 and 155; but who’s counting?). The cfg files contain this text:
ep58.cfg: EGM_GIJOE.exe 5859Hasbro.egm /noend /track:0 /singletrack
ep59.cfg: EGM_GIJOE.exe 5859Hasbro.egm /noend /track:1 /singletrack
The big EGM file starts with the string “Egenie Player”. After that, I see absolutely no clues. The supporting EGM_GIJOE.exe file has some interesting strings: “Decore Bits Per Pixel” (I know I have seen “Decore” used to mean “decoding core” in some libraries), “Egenie Player – %s, Version:%s”, “4th June 2002”, a list of common FourCC tags seen in AVI files, “Brought to you by Martin, Patrick Bob and Bren” (do you suppose “Patrick Bob” is one person’s name?), a list of command line options…
Aha! A URL: http:\\www.e-genie.tv (yep, backslashes, not forward slashes). e-genie.tv seems to redirect to mygenie.tv, which… doesn’t appear to be strictly related to video technology these days.