Breaking Eggs And Making Omelettes
A blog dealing with technical multimedia matters, binary reverse engineering, and the occasional video game hacking.
Les articles publiés sur le site
-
Tele-Arena Lives On
25 février 2011, par Multimedia Mike — Game HackingReaders know I have a peculiar interest in taking apart video games and that I would rather study a game's inner workings than actually play it. I take an interest on others' efforts in this same area. It's still in my backlog to take a closer look at Clone2727's body of work. But I wanted to highlight my friend's work on re-implementing a game called Tele-Arena.
Back In The Day
As some of you are likely aware, there was a dark age of online communication that predated the era of widespread internet access. This was known as "The BBS Age". People dialed into these BBSes using modems that operated at abysmal transfer speeds and would communicate with other users, upload and download files, and play an occasional game.BBS software evolved and perhaps the ultimate (and final) evolution was Galacticomm's MajorBBS (MBBS). There were assorted games that plugged into the MBBS, all rendered in glorious color ANSI graphics. One of the most famous of these games was Tele-Arena (TA). TA was a multiplayer fantasy-themed text adventure game. Perhaps you could think of it as World of Warcraft, only rendered as interactive fiction instead of a rich 3D landscape. (Disclaimer: I might not be qualified to make that comparison since I have never experienced WoW firsthand, though I did play TA on and off about 17 years ago).
TA was often compared to multi-user dungeons -- or MUDs -- that were played by telneting into internet servers hosting games. Such comparisons were usually unfavorable as people who had experience with both TA and MUDs were sniffy elitists with internet access who thought they were sooooo much better than those filthy, BBS-dialing serfs.
Sorry, didn't mean to open old wounds.
Modern Retelling of A Classic Tale
Anyway, my friend Ron Kinney is perhaps the world's biggest fan of TA. So much so that he has re-implemented the engine in Java under the project name Ether. He's in a similar situation as the ScummVM project in that, while the independent, open source engine is fair game for redistribution, it would be questionable to redistribute the original data files. That's why he created an AreaBuilder application that generates independent game data files.Ironically, you can also telnet into a server on which Ron hosts an instance of Tele-Arena (ironic in the sense that the internet/BBS conflict gets a little blurry).
I hope that one day Ron will regale us with the strangest tales from the classic TA days. My personal favorite was "Wrath of a Sysop."
-
Notes on Linux for Dreamcast
I wanted to write down some notes about compiling Linux on Dreamcast (which I have yet to follow through to success). But before I do, allow me to follow up on my last post where I got Google's libvpx library decoding VP8 video on the DC. Remember when I said the graphics hardware could only process variations of RGB color formats? I was mistaken. Reading over some old documentation, I noticed that the DC's PowerVR hardware can also handle packed YUV textures (UYVY, specifically):
The video looks pretty sharp in the small photo. Up close, less so, due to the low resolution and high quantization of the test vector combined with the naive chroma upscaling. For the curious, the grey box surrounding the image highlights the 256-square texture that the video frame gets plotted on. Texture dimensions have to be powers of 2.
Notes on Linux for Dreamcast
I've occasionally dabbled with Linux on my Dreamcast. There's an ancient (circa 2001) distro based around a build of kernel 2.4.5 out there. But I wanted to try to get something more current compiled. Thus far, I have figured out how to cross compile kernels pretty handily but have been unsuccessful in making them run.Here are notes are the compilation portion:
- kernel.org provides a very useful set of cross compiling toolchains
- get the gcc 4.5.1 cross toolchain for SH-4 (the gcc 4.3.3 one won't work because the binutils is too old; it will fail to assemble certain instructions as described in this post)
- working off of Linux kernel 2.6.37, edit the top-level Makefile; find the ARCH and CROSS_COMPILE variables and set appropriately:
ARCH ?= sh CROSS_COMPILE ?= /path/to/gcc-4.5.1-nolibc/sh4-linux/bin/sh4-linux-
$ make dreamcast_defconfig
$ make menuconfig
... if any changes to the default configuration are desired- manually edit arch/sh/Makefile, changing:
cflags-$(CONFIG_CPU_SH4) := $(call cc-option,-m4,) \ $(call cc-option,-mno-implicit-fp,-m4-nofpu)
to:
cflags-$(CONFIG_CPU_SH4) := $(call cc-option,-m4,) \ $(call cc-option,-mno-implicit-fp)
I.e., remove the
'-m4-nofpu'
option. According to the gcc man page, this will "Generate code for the SH4 without a floating-point unit." Why this is a default is a mystery since the DC's SH-4 has an FPU and compilation fails when enabling this option. - On that note, I was always under the impression that the DC sported an SH-4 CPU with the model number SH7750. According to this LinuxSH wiki page as well as the Linux kernel help, it actually has an SH7091 variant. This photo of the physical DC hardware corroborates the model number.
$ make
... to build a Linux kernel for the Sega Dreamcast
Running
So I can compile the kernel but running the kernel (the resulting vmlinux ELF file) gives me trouble. The default kernel ELF file reports an entry point of 0x8c002000. Attempting to upload this through the serial uploading facility I have available to me triggers a system reset almost immediately, probably because that's the same place that the bootloader calls home. I have attempted to alter the starting address via 'make menuconfig' -> System type -> Memory management options -> Physical memory start address. This allows the upload to complete but it still does not run. It's worth noting that the 2.4.5 vmlinux file from the old distribution can be executed when uploaded through the serial loader, and it begins at 0x8c210000. -
Decoding VP8 On A Sega Dreamcast
I got Google's libvpx VP8 codec library to compile and run on the Sega Dreamcast with its Hitachi/Renesas SH-4 200 MHz CPU. So give Google/On2 their due credit for writing portable software. I'm not sure how best to illustrate this so please accept this still photo depicting my testbench Dreamcast console driving video to my monitor:
Why? Because I wanted to try my hand at porting some existing software to this console and because I tend to be most comfortable working with assorted multimedia software components. This seemed like it would be a good exercise.
You may have observed that the video is blue. Shortest, simplest answer: Pure laziness. Short, technical answer: Path of least resistance for getting through this exercise. Longer answer follows.
Update: I did eventually realize that the Dreamcast can work with YUV textures. Read more in my followup post.
Process and Pitfalls
libvpx comes with a number of little utilities includingdecode_to_md5.c
. The first order of business was porting over enough source files to make the VP8 decoder compile along with the MD5 testbench utility.Again, I used the KallistiOS (KOS) console RTOS (aside: I'm still working to get modern Linux kernels compiled for the Dreamcast). I started by configuring and compiling libvpx on a regular desktop Linux system. From there, I was able to modify a number of configuration options to make the build more amenable to the embedded RTOS.
I had to create a few shim header files that mapped various functions related to threading and synchronization to their KOS equivalents. For example, KOS has a threading library cleverly named kthreads which is mostly compatible with the more common pthread library functions. KOS apparently also predates stdint.h, so I had to contrive a file with those basic types.So I got everything compiled and then uploaded the binary along with a small VP8 IVF test vector. Imagine my surprise when an MD5 sum came out of the serial console. Further, visualize my utter speechlessness when I noticed that the MD5 sum matched what my desktop platform produced. It worked!
Almost. When I tried to decode all frames in a test vector, the program would invariably crash. The problem was that the file that manages motion compensation (reconinter.c) needs to define MUST_BE_ALIGNED which compiles byte-wise block copy functions. This is necessary for CPUs like the SH-4 which can't load unaligned data. Apparently, even ARM CPUs these days can handle unaligned memory accesses which is why this isn't a configure-time option.
Showing The Work
I completed the first testbench application which ran the MD5 test on all 17 official IVF test vectors. The SH-4/Dreamcast version aces the whole suite.However, this is a video game console, so I had better be able to show the decoded video. The Dreamcast is strictly RGB-- forget about displaying YUV data directly. I could take the performance hit to convert YUV -> RGB. Or, I could just display the intensity information (Y plane) rendered on a random color scale (I chose blue) on an RGB565 texture (the DC's graphics hardware can also do paletted textures but those need to be rearranged/twiddled/swizzled).
Results
So, can the Dreamcast decode VP8 video in realtime? Sure! Well, I really need to qualify. In the test depicted in the picture, it seems to be realtime (though I wasn't enforcing proper frame timings, just decoding and displaying as quickly as possible). Obviously, I wasn't bothering to properly convert YUV -> RGB. Plus, that Big Buck Bunny test vector clip is only 176x144. Obviously, no audio decoding either.So, realtime playback, with a little fine print.
On the plus side, it's trivial to get the Dreamcast video hardware to upscale that little blue image to fullscreen.
I was able to tally the total milliseconds' worth of wall clock time required to decode the 17 VP8 test vectors. As you can probably work out from this list, when I try to play a 320x240 video, things start to break down.
- Processed 29 176x144 frames in 987 milliseconds.
- Processed 49 176x144 frames in 1809 milliseconds.
- Processed 49 176x144 frames in 704 milliseconds.
- Processed 29 176x144 frames in 255 milliseconds.
- Processed 49 176x144 frames in 339 milliseconds.
- Processed 48 175x143 frames in 2446 milliseconds.
- Processed 29 176x144 frames in 432 milliseconds.
- Processed 2 1432x888 frames in 2060 milliseconds.
- Processed 49 176x144 frames in 1884 milliseconds.
- Processed 57 320x240 frames in 5792 milliseconds.
- Processed 29 176x144 frames in 989 milliseconds.
- Processed 29 176x144 frames in 740 milliseconds.
- Processed 29 176x144 frames in 839 milliseconds.
- Processed 49 175x143 frames in 2849 milliseconds.
- Processed 260 320x240 frames in 29719 milliseconds.
- Processed 29 176x144 frames in 962 milliseconds.
- Processed 29 176x144 frames in 933 milliseconds.
-
Of ctors and dtors
I haven't given up on the Sega Dreamcast programming. I was able to compile a bunch of homebrew code for the DC many years ago and I can't make it work anymore. Again, I was working with a purpose-built, open source RTOS named KallistiOS (or KOS). I can make the programs compile but not run. I had ELF files left over from years ago which still executed. But when I tried to build new ELF files, no luck-- the programs crashed before even reaching my main() function.
I found the problem: ELF files are comprised of a number of sections and 2 of these sections are named '.ctors' and '.dtors' which stand for constructors and destructors. The KOS RTOS performs a manual traversal of .ctors section during program initialization and this is where things go bad. The traversal code doesn't seem to account for a .ctors section that only contains a single entry. I commented out the function that does the traversal and programs started to work, at least until it was time to exit the program and return control to the program loader. That's when the counterpart .dtors section traversal code ran and demonstrated the same problem. I'll exhibit the problematic code at the end of this post.
So I'm finally tinkering with Sega Dreamcast programming once again and with a slightly better grasp of software engineering than the first time I did this.
Portable and Compatible C?
If nothing else, this low-level embedded stuff exposes you to some serious toolchain arcana, the likes of which you will likely never see working strictly in the desktop arena.Still, this exercise makes me wonder why C code from a decade ago doesn't compile reliably now. Part of it is because gcc has gotten stricter about the syntax it will accept. In the case of this specific crashing problem, I suspect it comes down to a difference in the way the linker generates the final ELF file. I've written a list of items I have had to modify in the KOS codebase in order to get it to compile on more recent gcc versions. I wonder if it would be worth publishing the specifics, or if anyone would ever find the information useful? Oh, who am I kidding? Of course I'll write it up, perhaps publish a new version of the code, if only because that's the best chance I have of finding my own work again some years down the road.
Problematic C Code
See if this code makes any sense to you. It somehow traverse a list of 32-bit function pointers (in different directions, depending on constructors or destructors), executing each in turn. However, it appears to fall over if the list of pointers consists of a single entry.
C:-
typedef void (*fptr)(void);
-
-
static fptr ctor_list[1] __attribute__((section(".ctors"))) = { (fptr) -1 };
-
static fptr dtor_list[1] __attribute__((section(".dtors"))) = { (fptr) -1 };
-
-
/* Call this to execute all ctors */
-
void arch_ctors() {
-
fptr *fpp;
-
-
/* Run up to the end of the list (defined by crtend) */
-
for (fpp=ctor_list + 1; *fpp != 0; ++fpp)
-
;
-
-
/* Now run the ctors backwards */
-
while (--fpp> ctor_list)
-
(**fpp)();
-
}
-
-
/* Call this to execute all dtors */
-
void arch_dtors() {
-
fptr *fpp;
-
-
/* Do the dtors forwards */
-
for (fpp=dtor_list + 1; *fpp != 0; ++fpp )
-
(**fpp)();
-
}
-
-
Google’s YouTube Uses FFmpeg
9 février 2011, par Multimedia Mike — GeneralControversy arose last week when Google accused Microsoft of stealing search engine results for their Bing search engine. It was a pretty novel sting operation and Google did a good job of visually illustrating their side of the story on their official blog.
This reminds me of the fact that Google's YouTube video hosting site uses FFmpeg for converting videos. Not that this is in the same league as the search engine shenanigans (it's perfectly legit to use FFmpeg in this capacity, but to my knowledge, Google/YouTube has never confirmed FFmpeg usage), but I thought I would revisit this item and illustrate it with screenshots. This is not new information-- I first empirically tested this fact 4 years ago. However, a lot of people wonder how exactly I can identify FFmpeg on the backend when I claim that I've written code that helps power YouTube.
Short Answer
How do I know YouTube uses FFmpeg to convert multimedia? Because:- FFmpeg can decode a number of impossibly obscure multimedia formats using code I wrote
- YouTube can transcode many of the same formats
- I screwed up when I wrote the code to support some of these weird formats
- My mistakes are still present when YouTube transcodes certain fringe formats
Longer Answer (With Pictures!)
Let's take a video format named RoQ, developed by noted game designer Graeme Devine. Originated for use in the FMV-heavy game The 11th Hour, the format eventually found its way into the Quake 3 engine as well as many games derived from the same technology.Dr. Tim Ferguson reverse engineered the format (though it would later be open sourced along with the rest of the Q3 engine). I wrote a RoQ playback system for FFmpeg, and I messed up in doing so. I believe my coding error helps demonstrate the case I'm trying to make here.
Observe what happened when I pushed the jk02.roq sample through YouTube in my original experiment 4 years ago:
Do you see how the canyon walls bleed into the sky? That's not supposed to happen. FFmpeg doesn't do that anymore but I was able to go back into the source code history to find when it did do that:
Academic Answer
FFmpeg fixed this bug in June of 2007 (thanks to Eric Lasota). The problem had to do with premature colorspace conversion in my original decoder.Leftovers
I tried uploading the video again to see if the problem persists in YouTube's transcoder. First bit of trivia: YouTube detects when you have uploaded the same video twice and rejects the subsequent attempts. So I created a double concatenation of the video and uploaded it. The problem is gone, illustrating that the backend is actually using a newer version of FFmpeg. This surprises me for somewhat esoteric reasons.Here's another interesting bit of trivia for those who don't do a lot of YouTube uploading-- YouTube reports format details when you upload a video:
So, yep, RoQ format. And you can wager that this will prompt me to go back through the litany of unusual formats that FFmpeg supports to see how YouTube responds.