Breaking Eggs And Making Omelettes

A blog dealing with technical multimedia matters, binary reverse engineering, and the occasional video game hacking.

http://multimedia.cx/eggs/

Articles published on the site


Diamond Rio Artifacts

30 August 2012, by Multimedia Mike - Multimedia History
Remember the Diamond Rio PMP300? It’s credited with being the very first portable MP3 player, released all the way back in 1998 (I say ‘credited’ because I visited an audio museum once which exhibited a Toshiba MP3 player from 1997). I recently rescued a pristine set of Rio artifacts from a recycle pile.

I wondered if I should scan the manual for posterity. However, a Google search indicates that a proper PDF (loaded with pleas to not illegally copy music) isn’t very difficult to come by. Here are the other items that came with the unit:


Ah, more memories (of dialup internet): A tie-in with another Diamond product, this time a modem which claims to enable the user to download songs at up to 112 kilobits per second. I wonder if that was really possible. I remember that 56k modems were a stretch and 33.6k was the best that most users could hope for.

There is also a separate piece of paper that advises the buyer that the parallel port adapter might look a bit different than what is seen in the printed copy. Imagine the age of downloading to your MP3 player via parallel port while pulling down new songs via dialup internet.

The artifacts also included not one, but two CD-ROMs:


One is a driver and software disc, so no big surprise there. The other has a selection of MP3 files for your shiny new MP3 player. I’m wondering if these should be proactively preserved. I was going to process the files’ metadata and publish it here, for the benefit of search engines. However, while metadata is present, the files don’t conform to any metadata format that FFmpeg/Libav recognizes. The files mention Brava Software Inc. in their metadata sections. Still, the individual filenames are listed at the end of this post.

Leftovers:
A few other miscellaneous multimedia acquisitions:

I still want to study all of these old multimedia creation programs in depth some day. Theatrix Hollywood is a creative writing game, Wikipedia alleges (I’m a bit rigid with my exact definition of what constitutes a game). Here is an example movie output from this software. Meanwhile, the Mad Dog Multimedia CD-ROM apparently came packaged with a 56X CD-ROM drive (roughly the pinnacle of CD-ROM speeds). I found it has some version of Sonic Foundry’s ACID software, thus making good on the “applications” claim on the CD-ROM copy.

Diamond Rio MP3 Sampler
These are the names of the MP3 files found on the Diamond Rio MP3 sampler for the benefit of search engines.
```
13_days.mp3
albert_einstein_dreams.mp3
a_man_of_many_colours.mp3
anything_for_love.mp3
a_secret_place.mp3
bake_sale.mp3
bigger_than_the_both_of_us.mp3
boogie_beat.mp3
bring_it_on.mp3
buskersoundcheck_hippo.mp3
charm.mp3
chemical_disturbance.mp3
coastin.mp3
credit_is_due.mp3
dance_again.mp3
destiny.mp3
dig_a_little_deeper.mp3
diplomat6_bigmouthshut.mp3
dirty_littlemonster.mp3
dirty.mp3
drivin.mp3
Eric_Clapton_Last_Train.mp3
etude_in_c_sharp_minor_op_42_n.mp3
everybody_here.mp3
freedom_4_all.mp3
grandpas_advice.mp3
groove.mp3
heartland.mp3
he_loved_her_so.mp3
highway_to_hell.mp3
hit_the_ground_runnin.mp3
i_feel_fine_today.mp3
im_not_lost_im_exploring.mp3
into_the_void.mp3
its_alright.mp3
i_will_be_there.mp3
i_will_pass_this_way_again.mp3
juiceboxwilly_hepcat.mp3
just_an_illusion.mp3
keepin_time_by_the_river.mp3
king_of_the_brooklyn_delta.mp3
lovermilou_ringingbell.mp3
middle_aged_rock_and_rollers.mp3
midnight_high.mp3
mr_schwinn.mp3
my_brilliant_masterpiece.mp3
my_gallery.mp3
on_the_river_road.mp3
pouring_rain.mp3
prayer.mp3
rats_in_my_bedroom.mp3
razor_serpent_and_the_dub_mix.mp3
ruthbuzzy_pleasestophangin.mp3
secret_love.mp3
ships.mp3
silence_the_thunder.mp3
sleeping_beauty.mp3
slow_burn.mp3
standing_in_my_own_way.mp3
take_no_prisoners.mp3
takin_up_space.mp3
Taylor_Dayne_Unstoppable.mp3
the_laundromat_song.mp3
the_old_dun_cow.mp3
the_people_i_meet.mp3
trip_trigger_avenue.mp3
tru-luv.mp3
unfortunate_man.mp3
vertigo.mp3
when_she_runs.mp3
where_do_we_go_from_here.mp3
words_of_earnest.mp3
```
Adjusting The Timetable and SQL Shame

16 August 2012, by Multimedia Mike - General, Python, sql
My Game Music Appreciation website has a big problem that many visitors quickly notice and comment upon. The problem looks like this:

The problem is that all of these songs are 2m30s in length. During the initial import process, unless a chiptune file already had curated length metadata attached, my metadata utility emitted a default play length of 150 seconds. This is not good if you want to listen to all the songs in a soundtrack without interacting with the player page, because soundtracks often include short songs (think “game over” or other quick jingles) that are over in a few seconds. Such songs are still padded out to 150 seconds with silence.

So I needed to correct this. Possible solutions:
1. Manually: At first, I figured I could ask the database which songs needed fixing and listen to them to determine the proper lengths. Then I realized that there were well over 1400 games affected by this problem. This just screams “automated solution”.
2. Automatically: Ask the database which songs need fixing and then somehow ask the computer to listen to the songs and decide their proper lengths. This sounds like a winner, provided that I can figure out how to programmatically determine if a song has “finished”.
SQL Shame
This play adjustment task has been on my plate for a long time. A key factor that has blocked me is that I couldn’t figure out a single SQL query to feed to the SQLite database underlying the site which would give me all the songs I needed. To be clear, it was very simple and obvious to me how to write a program that would query the database in phases to get all the information. However, I felt that it would be impure to proceed with the task unless I could figure out one giant query to get all the information.

This always seems to come up whenever I start interacting with a database in any serious way. I call it SQL shame. This task got some traction when I got over this nagging doubt and told myself that there’s nothing wrong with the multi-step query program if it solves the problem at hand.

Suddenly, I had a flash of inspiration about why the so-called NoSQL movement exists. Maybe there are a lot more people who don’t like trying to derive such long queries and are happy to allow other languages to pick up the slack.

Estimating Lengths
Anyway, my solution involved writing a Python script to iterate through all the games whose metadata was output by a certain engine (the one that makes the default play length 150 seconds). For each of those games, the script queries the song table and determines if each song is exactly 150 seconds. If it is, then go to work trying to estimate the true length.

The foregoing paragraph describes what I figured was possible with only a single (possibly large) SQL query.

For each song represented in the chiptune file, I ran it through a custom length estimator program. My brilliant (err, naïve) solution to the length estimation problem was to synthesize audio one second at a time, up to a maximum of 120 seconds (tightening up the default length just a bit), and count how many consecutive seconds consisted entirely of 0 samples. If the count reached 5 consecutive seconds of silence, the estimator rewound the running length by 5 seconds and declared that to be the proper length. The script then updated the database.
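The estimation loop described above can be sketched roughly as follows. This is only an illustration, not the actual estimator: `render_second` is a hypothetical callback standing in for whatever the playback library does to synthesize one second of samples, and the `threshold` parameter is an assumption that anticipates near-silent (rather than perfectly silent) output.

```python
from typing import Callable, Sequence

SILENCE_LIMIT = 5   # consecutive silent seconds that mark the end of a song
MAX_SECONDS = 120   # hard cap on the estimated play length


def estimate_length(render_second: Callable[[int], Sequence[int]],
                    max_seconds: int = MAX_SECONDS,
                    silence_limit: int = SILENCE_LIMIT,
                    threshold: int = 0) -> int:
    """Return an estimated play length in seconds.

    render_second(n) is a hypothetical callback returning the PCM samples
    for second n of the song. A second counts as silent when every sample
    has an absolute value <= threshold (0 means perfectly silent).
    """
    silent_run = 0
    for second in range(max_seconds):
        samples = render_second(second)
        if all(abs(s) <= threshold for s in samples):
            silent_run += 1
        else:
            silent_run = 0
        if silent_run == silence_limit:
            # Rewind past the silent tail that was just measured.
            return second + 1 - silence_limit
    # Never went quiet for long enough: fall back to the cap.
    return max_seconds
```

A jingle that plays for 7 seconds and then goes quiet would come back as 7, while a song that loops forever would come back as the 120 second cap.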

There were about 1430 chiptune files whose songs needed updates. Some files had 1 single song. Some files had over 100. When I let the script run, it took nearly 65 minutes to process all the files. That was a single-threaded solution, of course. Even though I already had the data I needed, I wanted to try my hand at parallelizing the script. So I went to work with Python’s multiprocessing module and quickly refactored it to use all 4 CPU threads on the machine where the files live. Results:
- Single-threaded solution: 64m42s to process corpus (22 games/minute)
- Multi-threaded solution: 18m48s with 4 CPU threads (75 games/minute)
More than a 3x speedup across 4 CPU threads, which is decent for a primarily CPU-bound operation.
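For readers who have not used the multiprocessing module, the refactoring probably looked something like the sketch below. The worker body is a placeholder (the real one would run the length estimator on one chiptune file), and the names `process_file`, `process_corpus` and `speedup` are made up for illustration.

```python
from multiprocessing import Pool
from typing import Dict, List, Tuple


def process_file(path: str) -> Tuple[str, int]:
    """Placeholder worker: stands in for estimating lengths for one file."""
    return path, len(path)  # dummy result so the sketch is self-contained


def process_corpus(paths: List[str], workers: int = 4) -> Dict[str, int]:
    """Fan the files out over a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # imap_unordered hands out files as workers become free, which
        # helps when some files hold 1 song and others hold over 100.
        return dict(pool.imap_unordered(process_file, paths))


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of single-process time to multi-process time."""
    return serial_seconds / parallel_seconds
```

Plugging in the timings above, `speedup(64 * 60 + 42, 18 * 60 + 48)` works out to roughly 3.44, or about 86% of the ideal 4x. On platforms that spawn rather than fork worker processes, `process_corpus` should be called from under an `if __name__ == "__main__":` guard.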

Epilogue
I suspect that this task will require some refinement or manual intervention. Maybe there are songs which actually have more than 5 legitimate seconds of silence. Also, I entertained the possibility that some songs would generate very low amplitude noise rather than being perfectly silent. In that case, I could refine the script to stipulate that amplitudes below a certain threshold count as 0. Fortunately, I marked which games were modified by this method, so I can run a new script as necessary.

SQL Schema
Here is the schema of my SQLite3 database, for those who want to try their hand at a proper query. I am confident that it’s possible; I just didn’t have the patience to work it out. The task is to retrieve all the rows from the games table where all of the corresponding songs in the songs table are 150000 milliseconds long.
```sql
CREATE TABLE games
(
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  uncompressed_sha1 TEXT,
  uncompressed_size INTEGER,
  compressed_sha1 TEXT,
  compressed_size INTEGER,
  system TEXT,
  game TEXT,
  gme_system TEXT default NULL,
  canonical_url TEXT default NULL,
  extension TEXT default "gamemusicxz",
  enabled INTEGER default 1,
  redirect_to_id INT DEFAULT -1,
  play_lengths_modified INT DEFAULT NULL);
CREATE TABLE songs
(
  game_id INTEGER,
  song_number INTEGER NOT NULL,
  song TEXT,
  author TEXT,
  copyright TEXT,
  dumper TEXT,
  length INTEGER,
  intro_length INTEGER,
  loop_length INTEGER,
  play_length INTEGER,
  play_order INTEGER default -1);
CREATE TABLE tags
(
  game_id INTEGER,
  tag TEXT NOT NULL,
  tag_type TEXT default "filename");
CREATE INDEX gameid_index_songs ON songs(game_id);
CREATE INDEX gameid_index_tag ON tags(game_id);
CREATE UNIQUE INDEX sha1_index ON games(uncompressed_sha1);
```
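One possible single-query answer, offered as a sketch rather than a tested solution against the real database: group the songs by game and keep only the games where the number of songs equals the number of songs at the default length. It assumes `play_length` is the column holding the 150000 millisecond default (the schema also has a `length` column), and the function name is made up.

```python
import sqlite3

DEFAULT_MS = 150000

# In SQLite a comparison evaluates to 1 or 0, so SUM() counts the matches.
# Games with no songs drop out because of the inner join, and NULL play
# lengths never compare equal, so they disqualify a game as well.
QUERY = """
SELECT g.id, g.game
FROM games AS g
JOIN songs AS s ON s.game_id = g.id
GROUP BY g.id
HAVING COUNT(*) = SUM(s.play_length = ?)
"""


def games_with_default_lengths(conn: sqlite3.Connection,
                               default_ms: int = DEFAULT_MS):
    """Return (id, game) rows for games whose songs all use the default."""
    return conn.execute(QUERY, (default_ms,)).fetchall()
```

The multi-step script is still a perfectly good way to get the same list; this is merely the "one giant query" form of it.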
Adding A New System To The Game Music Website

1 August 2012, by Multimedia Mike - General
At first, I was planning to just make a little website where users could install a Chrome browser extension and play music from old 8-bit NES games. But, like many software projects, the goal sort of ballooned. I created a website where users can easily play old video game music. It doesn’t cover too many systems yet, but I have had individual requests to add just about every system you can think of.

The craziest part is that I know it’s possible to represent most of the systems. Eventually, it would be great to reach Chipamp parity (a combination plugin for Winamp that packages together plugins for many of these chiptunes). But there is a process to all of this. I have taken to defining a number of phases that are required to get a new system covered.

Phase 0 informally involves marveling at the obscurity of some of the console systems for which chiptune collections have evolved. WonderSwan? Sharp X68000? PC-88? I may be viewing this through a terribly Ameri-centric lens. I’ve at least heard of the ZX Spectrum and the Amstrad CPC even if I’ve never seen either.

No matter. The goal is to get all their chiptunes cataloged and playable.

Phase 1: Finding A Player
The first step is to find a bit of open source code that can play a particular format. If it’s a library that can handle many formats, like Game Music Emu or Audio Overload SDK, even better (probably). The specific open source license isn’t a big concern for me. I’m almost certain that some of the libraries that SaltyGME currently mixes are somehow incompatible, license-wise. I’ll worry about it when I encounter someone who A) cares, and B) is in a position to do something about it. Historical preservation comes first, and these software libraries aren’t getting any younger (I’m finding some that haven’t been touched in a decade).

Phase 2: Test Program
The next phase is to create a basic test bench program that sends a music file into the library, generates a buffer of audio, and shoves it out to the speakers via PulseAudio’s simple API (people like to rip on PulseAudio, but its simple API really lives up to its name and requires pages less boilerplate code to play a few samples than ALSA).

Phase 3: Plug Into Web Player
After successfully creating the test bench and understanding exactly which source files need to be built, the next phase is to hook it up to the main SaltyGME program via the ad-hoc plugin API I developed. This API requires that a player backend can, at the very least, initialize itself based on a buffer of bytes and generate audio samples into an array of 16-bit numbers. The API also provides functions for managing files with multiple tracks and toggling individual voices/channels if the library supports such a feature. Having the test bench application written beforehand usually smooths out this step.

But really, I’m just getting started.

Phase 4: Collecting A Song Corpus
Then there is the matter of staging a collection of songs for a given system. It seems like it would just be a matter of finding a large collection of songs for a given format, downloading them in bulk, and mirroring them. Honestly, that’s the easy part. People who are interested in this stuff have been lovingly curating massive collections of these songs for years (see SNESmusic.org for one of the best examples, and they also host a torrent of all their music for really quick and easy hoarding).

In my drive to make this game music website more useful for normal people, the goal is to extract as much metadata as possible to make searching better, and to package the data so that it’s as convenient as possible for users. Whenever I seek to add a new format to the collection, this is the phase where I invariably find that I have to fundamentally modify some of the assumptions I originally made in the player.

First, there were the NES Sound Format (NSF) files, the original format I wanted to play. These are files that have any number of songs packed into a single file. Playback libraries expose APIs to jump to individual tracks. So the player was designed around that. Game Boy GBS files also fall into this category but present a different challenge vis-à-vis metadata, addressed in the next phase.

Then, there were the SPC files. Each SPC file is its own song and multiple SPC files are commonly bundled as RAR files. Not wanting to deal with RAR, or any format where I interacted with a general compression API to pull a few files out, I created a custom resource format (inspired by so many I have studied and documented) and compressed it with a simpler compression API. I also had to modify some of the player’s assumptions to deal with this archive format. Genesis VGMs, bundled either in .zip or .7z, followed the same model as SPC in RAR.

Then it was suggested that I attempt to bring SaltyGME closer to feature parity with Chipamp, rather than just being a Chrome browser frontend for Game Music Emu. When I studied the Portable Sound Format (PSF), I realized it didn’t fit into the player model I already had. PSF uses a sort of shared library model for code execution and I developed another resource archive format to cope with it. So that covers quite a few formats.

One more architecture challenge arose when I started to study one of the prevailing metadata formats, explained in the next phase.

Phase 5: Metadata
Finally, for the collections to really be useful, I need to harvest that juicy metadata for search and presentation.

I have created a series of programs and scripts to scrape metadata out of these music files and store it all in a database that drives the website and search engine. I recognize that it’s no good to have a large corpus of songs with minimal metadata, so while importing bulk quantities of music, the scripts harshly reject songs that have too little metadata.

Again, challenges abound. One of the biggest challenges I’m facing is the peculiar quasi-freeform metadata format that emerged under the .m3u extension, which takes a form similar to:
```
#################################################################
#
# GRADIUS2
# (c) KONAMI  by Furukawa Motoaki, IKACHAN
#
#################################################################

nemesis2.kss::KSS,62,[Nemesis2] (Opening),2:23,,0
nemesis2.kss::KSS,61,[Nemesis2] (Start),7,,0
nemesis2.kss::KSS,43,[Nemesis2] (Air Battle),34,0-
nemesis2.kss::KSS,44,[Nemesis2] (1st. BGM),51,0-
[...]
```
A lot of file formats (including Game Boy GBS mentioned earlier) store their metadata separately using this format. I have some ideas about tools I can use to help me process this data but I’m pretty sure each one will require some manual intervention.

As alluded to in phase 4, .m3u presents another architectural challenge: Notice the second field in the CSV .m3u data. That’s a track number. A player can’t expect every track in a bundled chiptune file to be valid, nor to be in any particular order. Thus, I needed to alter the architecture once more to take this into account. However, instead of modifying the SaltyGME player, I simply extended the metadata database to include a playback order which, by default, is the same as the track order but can also accommodate this new issue. This also has the bonus of providing a facility to exclude playback of certain tracks. This comes in handy for many PSF archives which tend to include files that only provide support for other files and aren’t meant to be played on their own.
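To make the track number and playback order idea concrete, here is a rough sketch of pulling entries out of the .m3u sample shown earlier. It is deliberately naive: it splits on plain commas, treats the fourth field as a play time, and keeps any remaining fields as opaque strings because their meaning is not covered here. Real files may use escaped commas or non-decimal track numbers, which this does not handle. All names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class M3uEntry(NamedTuple):
    filename: str
    file_type: str
    track: int
    title: str
    seconds: Optional[int]
    extra: List[str]  # remaining fields, left uninterpreted


def parse_time(text: str) -> Optional[int]:
    """Convert 'S', 'M:SS' or 'H:MM:SS' to seconds; None if unparseable."""
    try:
        total = 0
        for part in text.split(":"):
            total = total * 60 + int(part)
        return total
    except ValueError:
        return None


def parse_m3u(text: str) -> List[M3uEntry]:
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, '#' comment banners, and anything without 'file::TYPE'.
        if not line or line.startswith("#") or "::" not in line:
            continue
        fields = line.split(",")
        if len(fields) < 4:
            continue
        filename, _, file_type = fields[0].partition("::")
        try:
            track = int(fields[1])
        except ValueError:
            continue  # non-decimal track numbers are out of scope here
        entries.append(M3uEntry(filename, file_type, track, fields[2],
                                parse_time(fields[3]), fields[4:]))
    return entries


def play_order(entries: List[M3uEntry]) -> List[Tuple[int, int]]:
    """Pair each listed track with its position in the playlist."""
    return [(order, entry.track) for order, entry in enumerate(entries)]
```

For the Gradius 2 sample, `play_order` yields tracks 62, 61, 43, 44 in positions 0 through 3, and any track number not listed simply never receives a playback position.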

Bright Side
The reward for all of this effort is that the data lands in a proper database in the end. None of it goes back into the chiptune files themselves. This makes further modification easier as all of the data that is indexed and presented on the site comes from the database. Somewhere down the road, I should probably create an API for accessing this metadata.
Re-solving My Search Engine Problem

28 July 2012, by Multimedia Mike - General, swish-e

14 years ago, I created a web database of 8-bit Nintendo Entertainment System games. To make it useful, I developed a very primitive search feature.

A few months ago, I decided to create a web database of video game music. To make it useful, I knew it would need to have a search feature. I realized I needed to solve the exact same problem again.

Requirements
The last time I solved this problem, I came up with an excruciatingly naïve idea. Hey, it worked. I really didn't want to deploy the same solution again because it felt so silly the first time. Surely there are many better ways to solve it now? Many different workable software solutions that do all the hard work for me?

The first time I attacked this, it was 1998 and hosting resources were scarce. On my primary web host I was able to put static HTML pages, perhaps with server side includes. The web host also offered dynamic scripting capabilities via something called htmlscript (a.k.a. MIVA Script). I had a secondary web host in my ISP which allowed me to host conventional CGI scripts on a Unix host, so that's where I hosted the search function (Perl CGI script accessing a key/value data store file).

Nowadays, sky's the limit. Any type of technology you want to deploy should be tractable. Still, a key requirement was that I didn't want to pay for additional hosting resources for this silly little side project. That leaves me with options that my current shared web hosting plan allows, which includes such advanced features as PHP, Perl and Python scripts. I can also access MySQL.

Candidates
There are a lot of mature software packages out there which can index and search data and be plugged into a website. But a lot of them would be unworkable on my web hosting plan due to language or library package limitations. Further, a lot of them feel like overkill. At the most basic level, all I really want to do is map a series of video game titles to URLs in a website.

Based on my research, Lucene seems to hold a fair amount of mindshare as an open source indexing and search solution. But I was unsure of my ability to run it on my hosting plan. I think MySQL does some kind of full text search, so I could have probably made a solution around that. Again, it just feels like way more power than I need for this project.

I used Swish-e once about 3 years ago for a little project. I wasn't confident of my ability to run that on my server either. It has a Perl API but it requires custom modules.

My quest for a search solution grew deep enough that I started perusing a textbook on information retrieval techniques in preparation for possibly writing my own solution from scratch. However, in doing so, I figured out how I might subvert an existing solution to do what I want.

Back to Swish-e
Again, all I wanted to do was pull data out of a database and map that data to a URL in a website. Reading the Swish-e documentation, I learned that the software supports a mode specifically tailored for this. Rather than asking Swish-e to index a series of document files living on disk, you can specify a script for Swish-e to run and the script will generate what appears to be a set of phantom documents for Swish-e to index.

When I 'add' a game music file to the game music website, I have scripts that scrape the metadata (game title, system, song titles, composers, company, copyright, the original file name on disk, even the ripper/dumper who extracted the chiptune in the first place) and store it all in an SQLite database. When it's time to update the database, another script systematically generates a series of pseudo-documents that spell out the metadata for each game and prefix each document with a path name. Searching for a term in the index returns a list of paths that contain the search term. Thus, it makes sense for that path to be a site URL.
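A pseudo-document generator along these lines might look like the sketch below. As I understand the Swish-e documentation for its external program mode (`-S prog`), each document is announced with a `Path-Name:` header and a `Content-Length:` header giving the body size in bytes, followed by a blank line and the body. The layout of the body, the field names, and the example URL are all made up for illustration.

```python
import sys
from typing import Dict, Iterable


def pseudo_document(url: str, metadata: Dict[str, Iterable[str]]) -> bytes:
    """Build one phantom document whose path is the page's URL."""
    lines = []
    for field, values in metadata.items():
        for value in values:
            lines.append(f"{field}: {value}")
    body = ("\n".join(lines) + "\n").encode("utf-8")
    header = (
        f"Path-Name: {url}\n"
        f"Content-Length: {len(body)}\n"  # byte count, not character count
        "\n"
    ).encode("utf-8")
    return header + body


if __name__ == "__main__":
    # Hypothetical record; the real script would iterate over the database.
    doc = pseudo_document(
        "http://example.com/game/123",
        {"title": ["U.N. Squadron", "Area 88"], "system": ["SNES"]},
    )
    sys.stdout.buffer.write(doc)
```

Because the path is the URL, a hit in the index can be handed straight back to the browser without any further lookup.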

But what about a web script which can search this Swish-e index? That's when I noticed Swish-e's C API and came up with a crazy idea: Write the CGI script directly in C. It feels like sheer madness (or at least the height of software insecurity) to write a CGI script directly in C in this day and age. But it works (with the help of cgic for input processing), just as long as I statically link the search script with libswish-e.a (and libz.a). The web host is an x86 machine, after all.

I'm not proud of what I did here-- I'm proud of how little I had to do here. The searching CGI script is all of about 30 lines of C code. The one annoyance I experienced while writing it is that I had to consult the Swish-e source code to learn how to get my search results (the "swishdocpath" key -- or any other key -- for SwishResultPropertyStr() is not documented). Also, the C program just does the simplest job possible, only querying the term in the index and returning the results in plaintext, in order of relevance, to the client-side JavaScript code which requested them. JavaScript gets the job of sorting and grouping the results for presentation.

Tuning the Search
Almost immediately, I noticed that the search engine could not find one of my favorite SNES games, U.N. Squadron. That's because all of its associated metadata names Area 88, the game's original title. Thus, I had to modify the metadata database to allow attaching somewhat free-form tags to games in order to compensate. In this case, an alias title would show up in the game's pseudo-document.

Roman numerals are still a thorn in my side, just as they were 14 years ago in my original iteration. I dealt with it back then by converting all numbers to Roman numerals during the index and searching processes. I'm not willing to do that for this case and I'm still looking for a good solution.

Another annoying problem deals with Mega Man, a popular franchise. The proper spelling is 2 words but it's common for people to mash it into one word, Megaman (see also: Spider-Man, Spiderman, Spider Man). The index doesn't gracefully deal with that and I have some hacks in place to cope for the time being.

Positive Results
I'm pleased with the results so far, and so are the users I have heard from. I know one user expressed amazement that a search for Castlevania turned up Akumajou Densetsu, the Japanese version of Castlevania III: Dracula's Curse. This didn't surprise me because I manually added a hint for that mapping. (BTW, if you are a fan of Castlevania III, definitely check out the Akumajou Densetsu soundtrack which has an upgraded version of the same soundtrack using special audio channels.)

I was a little more surprised when a user announced that searching for 'probotector' correctly turned up Contra: Hard Corps. I looked into why this was. It turns out that the original chiptune filename was extremely descriptive: "Contra - Hard Corps [Probotector] (1994-08-08)(Konami)". The filenames themselves often carry a bunch of useful metadata which is why it's important to index those as well.

And of course, many rippers, dumpers, and taggers have labored for over a decade to lovingly tag these songs with as much composer information as possible, which all gets indexed. The search engine gets a lot of compliments for its ability to find many songs written by favorite composers.
Zlib vs. XZ on 2SF

21 July 2012, by Multimedia Mike - General, psf, saltygme, xz, zlib

I recently released my Game Music Appreciation website. It allows users to play an enormous range of video game music directly in their browsers. To do this, the site has to host the music. And since I'm a compression bore, I have to know how small I can practically make these music files. I already published the results of my effort to see if XZ could beat RAR (RAR won, but only slightly, and I still went with XZ for the project) on the corpus of Super Nintendo chiptune sets. Next is the corpus of Nintendo DS chiptunes.

Repacking Nintendo DS 2SF
The prevailing chiptune format for storing Nintendo DS songs is the .2sf format. This is a subtype of the Portable Sound Format (PSF). The designers had the foresight to build compression directly into the format. Much of the payload data in a PSF file is compressed with zlib. Since I already incorporated Embedded XZ into the player project, I decided to try repacking the PSF payload data from zlib -> xz.

In an effort to not corrupt standards too much, I changed the 'PSF' file signature (seen in the first 3 bytes of a file) to 'psf'.
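The repacking step can be sketched as below, based on the published PSF layout: a 3-byte signature, a version byte, then three little-endian 32-bit fields (reserved area size, compressed program size, CRC-32 of the compressed program), followed by the reserved area, the zlib-compressed program, and optional tags. Two choices here are my assumptions rather than anything stated above: recomputing the CRC-32 field over the new xz data, and asking for a CRC32 integrity check inside the xz stream on the theory that an embedded decoder is more likely to support it.

```python
import lzma
import struct
import zlib

# signature, version, reserved size, program size, CRC-32 (16 bytes total)
HEADER = struct.Struct("<3sBIII")


def repack_psf(data: bytes) -> bytes:
    """Re-compress a PSF file's program section from zlib to xz."""
    signature, version, reserved_size, program_size, _crc = HEADER.unpack_from(data, 0)
    if signature != b"PSF":
        raise ValueError("not a zlib-packed PSF file")

    reserved_end = HEADER.size + reserved_size
    program_end = reserved_end + program_size
    reserved = data[HEADER.size:reserved_end]  # copied through untouched
    tags = data[program_end:]                  # '[TAG]...' block, if any

    packed = b""
    if program_size:
        program = zlib.decompress(data[reserved_end:program_end])
        packed = lzma.compress(program, format=lzma.FORMAT_XZ,
                               check=lzma.CHECK_CRC32)

    # Lower-case signature marks the file as xz-packed.
    header = HEADER.pack(b"psf", version, reserved_size,
                         len(packed), zlib.crc32(packed) & 0xFFFFFFFF)
    return header + reserved + packed + tags
```

A player reading the result would check for 'psf' versus 'PSF' to decide which decompressor to run on the program section.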

Results
There are about 900 Nintendo DS games currently represented in my website's archive. Total size of the original PSF archive, payloads packed with zlib: 2.992 GB. Total size of the same archive with payloads packed as xz: 2.059 GB.

Using xz vs. zlib saved me nearly a gigabyte of storage (about 0.93 GB, or 31%). That extra storage doesn't really impact my hosting plan very much (I have 1/2 TB, which is why I'm so nonchalant about hosting the massive MPlayer Samples Archive). However, smaller individual files translate to a better user experience since the files are faster to download.

Here is a pretty picture to illustrate the space savings:

The blue occasionally appears to dip below the orange but the data indicates that xz is always more efficient than zlib. Here's the raw data (comes in vanilla CSV flavor too).

Interface Impact
So the good news for the end user is that the songs are faster to load up front. The downside is that there can be a noticeable delay when changing tracks. Even though all songs are packaged into one file for download, and the entire file is downloaded before playback begins, each song is individually compressed. Thus, changing tracks triggers another decompression operation. I'm toying with the possibility of some sort of background process that decompresses song (n+1) while playing song (n) in order to help compensate for this.
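The background decompression idea could be prototyped along these lines. This is a sketch of the concept only, written in Python rather than in the player's own code, and the class and method names are invented. It keeps at most one track decompressed ahead of time, so memory use stays far below decompressing everything up front.

```python
import lzma
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List


class PrefetchingPlaylist:
    """Decompress track n+1 in the background while track n plays."""

    def __init__(self, compressed_tracks: List[bytes]):
        self._tracks = compressed_tracks
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: Dict[int, Future] = {}

    def _schedule(self, index: int) -> None:
        if 0 <= index < len(self._tracks) and index not in self._pending:
            self._pending[index] = self._pool.submit(
                lzma.decompress, self._tracks[index])

    def load(self, index: int) -> bytes:
        """Return the decompressed track and start on the next one."""
        if not 0 <= index < len(self._tracks):
            raise IndexError(index)
        self._schedule(index)  # no-op if it was already prefetched
        data = self._pending.pop(index).result()
        # Forget prefetched tracks that are no longer "next", to bound memory.
        for stale in [i for i in self._pending if i != index + 1]:
            self._pending.pop(stale).cancel()
        self._schedule(index + 1)
        return data

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

Playing tracks in order means every `load` after the first usually finds its data already decompressed; jumping to an arbitrary track still pays the full delay once.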

I don't like the idea of decompressing everything up front because A) it would take even longer to start playing; and B) it would take a huge amount of memory.

Corner Case
There was at least one case in which I found zlib to be better than xz. It looks like zlib's minimum block size is smaller than xz's. I think I discovered xz to be unable to compress a few bytes to a block any smaller than about 60-64 bytes while zlib got it down into the teens. However, in those cases, it was more efficient to just leave the data uncompressed anyway.
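The corner case is easy to reproduce with a few lines of Python. The sketch below tries all three representations and keeps the smallest. The method label it returns is an assumption on my part: an archive format using this approach would need some way to record which representation was stored.

```python
import lzma
import zlib
from typing import Tuple


def pack_smallest(data: bytes) -> Tuple[str, bytes]:
    """Return (method, payload) for whichever representation is smallest."""
    candidates = {
        "raw": data,
        "zlib": zlib.compress(data, 9),
        "xz": lzma.compress(data, format=lzma.FORMAT_XZ,
                            check=lzma.CHECK_CRC32),
    }
    # min() keeps the first of any tie, so "raw" wins ties by dict order.
    method = min(candidates, key=lambda name: len(candidates[name]))
    return method, candidates[method]
```

For a payload of only a few bytes, both compressors add more container overhead than they save, with xz adding the most, so "raw" wins. For anything large and repetitive, one of the compressed forms wins instead.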