
Other articles (28)
-
The SPIPmotion queue
28 November 2010
A queue stored in the database
When it is installed, SPIPmotion creates a new table in the database named spip_spipmotion_attentes.
This new table consists of the following fields: id_spipmotion_attente, the unique numeric identifier of the task to process; id_document, the numeric identifier of the original document to encode; id_objet, the unique identifier of the object to which the encoded document should be attached automatically; objet, the type of object to which (...)
-
MediaSPIP version 0.1 Beta
16 April 2011
MediaSPIP 0.1 beta is the first version of MediaSPIP declared "usable".
The zip file provided here contains only the sources of MediaSPIP in its standalone version.
For a working installation, all of the software dependencies must be installed manually on the server.
If you want to use this archive for a farm-mode installation, you will also need to make other modifications (...)
-
MediaSPIP Player: potential problems
22 February 2011
The player does not work on Internet Explorer
On Internet Explorer (at least 7 and 8), the plugin uses the Flash player flowplayer to play video and audio. If the player does not seem to work, the cause may be the configuration of Apache’s mod_deflate.
If the configuration of this Apache module contains a line resembling the following, try deleting it or commenting it out to see whether the player then works correctly: (...)
On other sites (4714)
-
The Big VP8 Debug
20 November 2010, by Multimedia Mike (VP8)
I hope my previous walkthrough of the VP8 4x4 intra coding process was educational. Today, I’ll be walking through an example of what happens when my toy VP8 encoder encodes an intra 16x16 block. This may prove educational to those who have never been exposed to the deep details of this or related algorithms. Also, I wanted to illustrate where I think my VP8 encoder process is going bad and generating such grotesque results.
Before I start, let me give a shout-out to Google Docs’ Drawing tool which I used to generate these diagrams. It works quite well.
Results
(Always cut to the chase in a blog post; results first.) I’m glad I composed this post. In the course of doing so, I found the problem, fixed it, and am now able to present this image that was decoded from the bitstream encoded by my formerly toy, now working, VP8 encoder:
Yeah, I know that image doesn’t look like anything you haven’t seen before. The difference is that it has made a successful trip through my VP8 encoder.
Follow along through the encoding process and learn of the mistake...
Original Block and Subblocks
Here is the 16x16 block to be encoded :
The block is broken down into 16 4x4 subblocks for further encoding :
Prediction
The first step is to pick a prediction mode, generate a prediction block, and subtract the predictors from the macroblock. In this case, we will use DC prediction, which means the predictor will be the same for each element.
In 4x4 VP8 DC intra prediction, samples outside of the frame are assumed to be 128. It’s a little different in 16x16 DC intra prediction: samples above the top row are assumed to be 127 while samples left of the leftmost column are assumed to be 129. For the top left macroblock, this still works out to 128.
Subtract 128 from each of the samples :
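As a rough sketch of the prediction step just described (not code from the encoder in this post), the DC predictor can be computed as a rounded mean of the 16 samples above and the 16 samples to the left. The function names and the rounding formula are assumptions made for illustration; the edge values 127 and 129 come from the text.

```python
def dc_predictor_16x16(above, left):
    """Rounded mean of the 16 samples above and the 16 samples to the left."""
    assert len(above) == 16 and len(left) == 16
    # 32 samples in total: add half the divisor for rounding, then divide by 32.
    return (sum(above) + sum(left) + 16) >> 5


def subtract_predictor(block, predictor):
    """Remove the flat DC predictor from every sample of a block."""
    return [[sample - predictor for sample in row] for row in block]


# Top-left macroblock: no real neighbours exist, so the assumed edge values apply.
above = [127] * 16
left = [129] * 16
dc = dc_predictor_16x16(above, left)

# Hypothetical 2x2 excerpt of luma samples, just to show the subtraction.
residual = subtract_predictor([[130, 128], [120, 128]], dc)
```

With those edge values the mean lands on 128, which is why the text subtracts 128 from every sample.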
Forward Transform
Run each of the 16 prediction-removed subblocks through the forward transform. This example uses the forward transform from libvpx 0.9.5 :
I have highlighted the DC coefficients in each subblock. That’s because those receive special consideration in 16x16 intra coding.
Quantization
The Y plane AC quantizer is 4 in this example, the minimum allowed. (The Y plane DC quantizer is also 4 but doesn’t come into play for intra 16x16 coding since the DC coefficients follow a different process.) Thus, quantize (integer divide) each AC element in each subblock (we’ll ignore the DC coefficient for this part) :
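The AC quantization step can be sketched as follows. This is a minimal illustration, assuming that "integer divide" means C-style division that truncates toward zero and that index 0 of each 16-element subblock holds the DC coefficient; the function names are hypothetical. The sample coefficients are the "short 0.9.5" row quoted later in this post.

```python
def quantize_ac(coeffs, ac_q):
    """Quantize the 15 AC terms of a subblock, leaving the DC term (index 0) alone."""
    dc, ac = coeffs[0], coeffs[1:]
    # int() truncates toward zero, like integer division in C.
    return [dc] + [int(c / ac_q) for c in ac]


def dequantize_ac(levels, ac_q):
    """Scale the AC levels back up; the precision lost to truncation stays lost."""
    dc, ac = levels[0], levels[1:]
    return [dc] + [level * ac_q for level in ac]


coeffs = [-312, 7, 1, 0, 1, 12, -5, 2, 2, -3, 3, -1, 1, 0, -2, 1]
levels = quantize_ac(coeffs, 4)
restored = dequantize_ac(levels, 4)
```

Even at the minimum quantizer of 4, most of the small AC terms collapse to zero, which is where the compression comes from.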
The Y2 Round Trip
Those highlighted DC coefficients from each of the 16 subblocks comprise the Y2 block. This block is transformed with a slightly different algorithm called the Walsh-Hadamard Transform (WHT). The results of this transform are then quantized (using 8 for both Y2 DC and AC in this example, as those are the smallest Y2 quantizers that VP8 allows), then zigzagged and entropy-coded along with the rest of the macroblock coefficients.
On the decoder side, the Y2 coefficients are decoded, de-zigzagged, dequantized and run through the inverse WHT.
And this is where I suspect that most of the error is creeping into my VP8 encoder. Observe the round-trip through the Y2 process :
As intimated, this part causes me consternation due to the wide discrepancy between the original and the reconstructed Y2 blocks. Observe the absolute difference between the 2 vectors :
That’s really significant and leads me to believe that this is where the big problem is.
What’s Wrong ?
My first suspicion was that the quantization was throwing off the process. I was disabused of this idea when I removed quantization from the equation and immediately reversed the transform:
So perhaps there is a problem with the forward WHT. Just like with the usual subblock transform, the VP8 spec doesn’t define how to perform the forward WHT, only the inverse WHT. Do I need to audition different forward WHTs from various versions of libvpx, similar to what I did with the other transform? That doesn’t make a lot of sense; libvpx doesn’t seem to have so much trouble with basic encoding.
The Punchline
I reviewed the forward WHT code, the stuff that I plagiarized from libvpx 0.9.0. The function takes, among other parameters, a pitch value. There are 2 loops in the code. The first iterates through the rows of the input matrix, which I assumed was a 4x4 matrix. I was puzzled that during each iteration of the row loop, the input pointer was only being advanced by (pitch/2). I removed the division by 2 and the problem went away. I.e., the encoded image looks correct.
What’s up with the (pitch/2), anyway? It seems that the encoder likes to pack 2 4x4 subblocks into an 8x4 block data structure. In fact, the forward DCTs in the libvpx encoder have the same artifact. Remember how I surveyed several variations of forward DCT from different versions of libvpx? The one that proved most accurate in that test was the one I had already modified to advance the input pointer properly. Fixing the other 2 candidates yields similar results:
input:        92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
short 0.9.0:  -311 6 2 0 0 11 -6 1 2 -3 3 0 0 0 -2 1
inverse:      92 91 89 86 91 90 88 87 90 89 89 88 89 87 88 93
fast 0.9.0:   -313 5 1 0 1 11 -6 1 3 -3 4 0 0 0 -2 1
inverse:      91 91 89 86 90 90 88 86 89 89 89 88 89 87 88 93
short 0.9.5:  -312 7 1 0 1 12 -5 2 2 -3 3 -1 1 0 -2 1
inverse:      92 91 89 86 91 90 88 86 89 89 89 88 89 87 88 93
Code cribber beware !
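To see why a halved row stride is so destructive, here is a toy sketch. It uses a plain 4x4 Hadamard transform, which is an assumption made for illustration and not VP8's exact WHT or the libvpx code. All names are hypothetical, and the buffer reuses the 16 input values listed above.

```python
# Toy illustration only: a plain 4x4 Hadamard transform, not VP8's exact WHT.
H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def gather_rows(buf, stride):
    """Read a 4x4 block from a flat buffer, advancing `stride` elements per row."""
    return [buf[r * stride:r * stride + 4] for r in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_wht(block):
    return matmul(matmul(H, block), transpose(H))


def inverse_wht(coeffs):
    # H^T * H = 4I, so the two multiplications scale everything by 16.
    scaled = matmul(matmul(transpose(H), coeffs), H)
    return [[v // 16 for v in row] for row in scaled]


# Flat buffer holding one 4x4 block, 4 elements per row.
buf = [92, 91, 89, 86, 91, 90, 88, 86, 89, 89, 89, 88, 89, 87, 88, 93]
pitch = 4

good = gather_rows(buf, pitch)       # rows start at offsets 0, 4, 8, 12
bad = gather_rows(buf, pitch // 2)   # rows start at offsets 0, 2, 4, 6 (overlapping)

round_trip_good = inverse_wht(forward_wht(good))
round_trip_bad = inverse_wht(forward_wht(bad))
```

The transform inverts exactly in both cases, so the arithmetic itself looks healthy. The halved stride simply feeds it the wrong samples, and the reconstruction faithfully reproduces a block that was never in the picture.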
Corrected Y2 Round Trip
Let’s look at that Y2 round trip one more time :
And another look at the error between the original and the reconstruction :
Better.
Dequantization, Prediction, Inverse Transforms, and Reconstruction
To be honest, now that I solved the major problem, I’m getting a little tired of making these pictures. Long story short, all elements of the original 16 subblocks are dequantized and their DC coefficients are filled in with the appropriate item from the reconstructed Y2 block. A base predictor block is generated (all 128 values in this case). And each Y block is run through the inverse transform and added to the predictor block. The following is the reconstruction :
And if you compare that against the original luma macroblock (I don’t feel like doing it right now), you’ll find that it’s pretty close.
I can’t believe how close I was all this time, and how long that pitch bug held me up.
-
How to cheat on video encoder comparisons
Over the past few years, practically everyone and their dog has published some sort of encoder comparison. Sometimes they’re actually intended to be something for the world to rely on, like the old Doom9 comparisons and the MSU comparisons. Other times, they’re just to scratch an itch — someone wants to decide for themselves what is better. And sometimes they’re just there to outright lie in favor of whatever encoder the author likes best. The latter is practically an expected feature on the websites of commercial encoder vendors.
One thing almost all these comparisons have in common — particularly (but not limited to !) the ones done without consulting experts — is that they are horribly done. They’re usually easy to spot : for example, two videos at totally different bitrates are being compared, or the author complains about one of the videos being “washed out” (i.e. he screwed up his colorspace conversion). Or the results are simply nonsensical. Many of these problems result from the person running the test not “sanity checking” the results to catch mistakes that he made in his test. Others are just outright intentional.
The result of all these mistakes, both intentional and accidental, is that the results of encoder comparisons tend to be all over the map, to the point of absurdity. For any pair of encoders, it’s practically a given that a comparison exists somewhere that will “prove” any result you want to claim, even if the result would be beyond impossible in any sane situation. This often results in the appearance of a “controversy” even if there isn’t any.
Keep in mind that every single mistake I mention in this article has actually been done, usually in more than one comparison. And before I offend anyone, keep in mind that when I say “cheating”, I don’t mean to imply that everyone that makes the mistake is doing it intentionally. Especially among amateur comparisons, most of the mistakes are probably honest.
So, without further ado, we will investigate a wide variety of ways, from the blatant to the subtle, with which you too can cheat on your encoder comparisons.
Blatant cheating
1. Screw up your colorspace conversions. A common misconception is that converting from YUV to RGB and back is a simple process where nothing can go wrong. This is quite untrue. There are two primary attributes of YUV : PC range (0-255) vs TV range (16-235) and BT.709 vs BT.601 conversion coefficients. That sums up to a total of 4 possible different types of YUV. When people compare encoders, they often use different frontends, some of which make incorrect assumptions about these attributes.
Incorrect assumptions are so common that it’s often a matter of luck whether the tool gets it right or not. It doesn’t help that most videos don’t even properly signal which they are to begin with ! Often even the tool that the person running the comparison is using to view the source material gets the conversion wrong.
Subsampling YUV (aka what everyone uses) adds yet another dimension to the problem : the locations which the chroma data represents (“chroma siting”) isn’t constant. For example, JPEG and MPEG-2 define different positions. This is even worse because almost nobody actually handles this correctly — the best approach is to simply make sure none of your software is doing any conversion. A mistake in chroma siting is what created that infamous PSNR graph showing Theora beating x264, which has been cited for ages since despite the developers themselves retracting it after realizing their mistake.
Keep in mind that the video encoder is not responsible for colorspace conversion — almost all video encoders operate in the YUV domain (usually subsampled 4:2:0 YUV, aka YV12). Thus any problem in colorspace conversion is usually the fault of the tools used, not the actual encoder.
How to spot it : “The color is a bit off” or “the contrast of the video is a bit duller”. There were a staggering number of “H.264 vs Theora” encoder comparisons which came out in favor of one or the other solely based on “how well the encoder kept the color” — making the results entirely bogus.
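The four YUV interpretations mentioned above can be made concrete with a small sketch. It uses the standard BT.601 and BT.709 luma coefficients and the usual 8-bit full-range and limited-range scalings; the function name and the sample pixel are hypothetical, and real converters differ in rounding and clipping details.

```python
# (Kr, Kb) luma coefficients for the two conversion matrices.
MATRICES = {"BT.601": (0.299, 0.114), "BT.709": (0.2126, 0.0722)}


def yuv_to_rgb(y, cb, cr, matrix, full_range):
    """Convert one 8-bit YCbCr sample to 8-bit RGB under a given interpretation."""
    kr, kb = MATRICES[matrix]
    kg = 1.0 - kr - kb
    if full_range:   # PC range: luma and chroma span 0-255
        yn, pb, pr = y / 255.0, (cb - 128) / 255.0, (cr - 128) / 255.0
    else:            # TV range: luma 16-235, chroma 16-240
        yn, pb, pr = (y - 16) / 219.0, (cb - 128) / 224.0, (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg
    return tuple(min(255, max(0, round(255 * c))) for c in (r, g, b))


# One hypothetical pixel decoded under all four interpretations.
pixel = (120, 90, 200)
results = {
    (matrix, full_range): yuv_to_rgb(*pixel, matrix, full_range)
    for matrix in MATRICES
    for full_range in (True, False)
}
```

The same three bytes decode to four different colors, so a frontend that guesses wrong changes the picture before the encoder ever sees it.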
2. Don’t compare at the same (or nearly the same) bitrate. I saw a VP8 vs x264 comparison the other day that gave VP8 30% more bitrate and then proceeded to demonstrate that it got better PSNR. You would think this is blindingly obvious, but people still make this mistake ! The most common cause of this is assuming that encoders will successfully reach the target bitrate you ask of them — particularly with very broken encoders that don’t. Always check the output filesizes of your encodes.
How to spot it : The comparison lists perfectly round bitrates for every single test, as opposed to the actual bitrates achieved by the encoders, which will never be exactly matching in any real test.
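Checking the achieved bitrate takes only a file size and a duration. The following is a minimal sketch; the helper names and the 2% tolerance are arbitrary assumptions, not an established threshold.

```python
def achieved_kbps(size_bytes, duration_s):
    """Average bitrate actually produced, in kilobits per second."""
    return size_bytes * 8 / 1000 / duration_s


def bitrates_comparable(sizes_bytes, duration_s, tolerance=0.02):
    """True if every encode of the same clip is within `tolerance` of the smallest."""
    rates = [achieved_kbps(size, duration_s) for size in sizes_bytes]
    return max(rates) <= min(rates) * (1 + tolerance)


# Hypothetical output sizes for two encodes of the same 60-second clip.
fair = bitrates_comparable([1_000_000, 1_010_000], 60)
unfair = bitrates_comparable([1_000_000, 1_300_000], 60)  # one encode got 30% more bits
```

In practice the sizes would come from the output files themselves (for example via os.path.getsize), and container overhead should be treated the same way for every encoder.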
3. Use unfair encoding settings. This is a bit of a wide topic : there are many ways to do this. We’ll cover the more blatant ones in this part. Here’s some common ones :
a. Simply cheat. Intentionally pick awful settings for the encoder you don’t like.
b. Don’t consider performance. Pick encoding settings without any regard for some particular performance goal. For example, it’s perfectly reasonable to say “use the best settings possible, regardless of speed”. It’s also reasonable to look for a particular encoding speed target. But what isn’t reasonable is to pick extremely fast settings for one encoder and extremely slow settings for another encoder.
c. Don’t attempt to match compatibility options when it’s reasonable to do so. Keyframe interval is a classic one of these: shorter values reduce compression but improve seeking. An easy way to cheat is to simply not set them to the same value, biasing towards whatever encoder has the longer interval. This is most common as an accidental mistake with comparisons involving ffmpeg, where the default keyframe interval is an insanely low 12 frames.
How to spot it : The comparison doesn’t document its approach regarding choice of encoding settings.
4. Use ratecontrol methods unfairly. Constant bitrate is not the same as average bitrate — using one instead of the other is a great way to completely ruin a comparison. Another method is to use 1-pass bitrate mode for one encoder and 2-pass or constant quality for another. A good general approach is that, for any given encoder, one should use 2-pass if available and constant quality if not (it may take a few runs to get the bitrate you want, of course).
Of course, it’s also fine to run a comparison with a particular mode in mind — for example, a comparison targeted at streaming applications might want to test using 1-pass CBR. Of course, in such a case, if CBR is not available in an encoder, you can’t compare to that encoder.
How to spot it : It’s usually pretty obvious if the encoding settings are given.
5. Use incredibly old versions of encoders. As it happens, Debian stable is not the best source for the most recent encoding software. Equally, use recent versions known to be buggy.
6. Don’t distinguish between video formats and the software that encodes them. This is incredibly common: I’ve seen tests that claim to compare “H.264” against something else while in fact actually comparing “Quicktime” against something else. It’s impossible to compare all H.264 encoders at once, so don’t even try; just call the comparison “Quicktime versus X” instead of “H.264 versus X”. Or better yet, use a good H.264 encoder, like x264, and don’t bother testing awful encoders to begin with.
Less-obvious cheating
1. Pick a bitrate that’s way too low. Low bitrate testing is very effective at making differences between encoders obvious, particularly if doing a visual comparison. But past a certain point, it becomes impossible for some encoders to keep up. This is usually an artifact of the video format itself — a scalability limitation. Practically all DCT-based formats have this kind of limitation (wavelets are mostly immune).
In reality, this is rarely a problem, because one could merely downscale the video to resolve the problem — lower resolutions need fewer bits. But people rarely do this in comparisons (it’s hard to do it fairly), so the best approach is to simply not use absurdly low bitrates. What is “absurdly low” ? That’s a hard question — it ends up being a matter of using one’s best judgement.
This tends to be less of a problem in larger-scale tests that use many different bitrates.
How to spot it : At least one of the encoders being compared falls apart completely and utterly in the screenshots.
Biases towards, a lot : Video formats with completely scalable coding methods (Dirac, Snow, JPEG-2000, SVC).
Biases towards, a little : Video formats with coding methods that improve scalability, such as arithmetic coding, B-frames, and run-length coding. For example, H.264 and Theora tend to be more scalable than MPEG-4.
2. Pick a bitrate that’s way too high. This is a staggeringly common mistake: pick a bitrate so high that all of the resulting encodes look absolutely perfect. The claim is then made that “there’s no significant difference” between any of the encoders tested. This is surprisingly easy to do inadvertently on sources like Big Buck Bunny, which looks transparent at relatively low bitrates. An equally common but similar mistake is to test at a bitrate that isn’t so high that the videos look perfect, but high enough that they all look very good. The claim is then made that “the difference between these encoders is small”. Well, of course, if you give everything tons of bitrate, the difference between encoders is small.
How to spot it : You can’t tell which image is the source and which is the encode.
3. Making invalid comparisons using objective metrics. I explained this earlier in the linked blog post, but in short, if you’re going to measure PSNR, make sure all the encoders are optimized for PSNR. Equally, if you’re going to leave the encoder optimized for visual quality, don’t measure PSNR — post screenshots instead. Same with SSIM or any other objective metric. Furthermore, don’t blindly do metric comparisons — always at least look at the output as a sanity test. Finally, do not claim that PSNR is particularly representative of visual quality, because it isn’t.
How to spot it : Encoders with psy optimizations, such as x264 or Theora 1.2, do considerably worse than expected in PSNR tests, but look much better in visual comparisons.
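For reference, PSNR is nothing more than a log-scaled mean squared error, which is why it cannot see psychovisual optimizations. Below is a minimal sketch of the standard definition for 8-bit samples; the flat sample lists are hypothetical stand-ins for a decoded plane.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    assert reference and len(reference) == len(decoded)
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


reference = [92, 91, 89, 86, 91, 90, 88, 86]
close = [92, 91, 89, 86, 91, 90, 88, 87]    # one sample off by 1
far = [96, 95, 93, 90, 95, 94, 92, 90]      # every sample off by 4

psnr_close = psnr(reference, close)
psnr_far = psnr(reference, far)
```

The metric rewards only sample-by-sample closeness, so an encoder that deliberately trades a little of that closeness for better-looking detail is penalized.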
4. Lying with graphs. Using misleading scales on graphs is a great way to make the differences between encoders seem larger or smaller than they actually are. A common mistake is to scale SSIM linearly: in fact, 0.99 is about twice as good as 0.98, not 1% better. One solution for this is to use dB to compare SSIM values.
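One common convention for expressing SSIM in decibels is -10·log10(1 - SSIM), which treats 1 - SSIM as the distortion. This is a sketch of that convention, assumed here for illustration rather than taken from the post; the function name is hypothetical.

```python
import math


def ssim_to_db(ssim):
    """Map an SSIM score below 1.0 to decibels, treating 1 - SSIM as distortion."""
    return -10 * math.log10(1 - ssim)


db_098 = ssim_to_db(0.98)
db_099 = ssim_to_db(0.99)
gain = db_099 - db_098   # the gap that halving the distortion produces
```

On this scale the step from 0.98 to 0.99 is worth about 3 dB, the same as any other halving of distortion, instead of looking like a 1% nudge.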
5. Using lossy screenshots. Posting screenshots as JPEG is a silly, pointless way to worsen an encoder comparison.
Subtle cheating
1. Unfairly pick screenshots for comparison. Comparing based on stills is not ideal, but it’s often vastly easier than comparing videos in motion. But it also opens up the door to unfairness. One of the most common mistakes is to pick a frame immediately after (or on) a keyframe for one encoder, but which isn’t for the other encoder. Particularly in the case of encoders that massively boost keyframe quality, this will unfairly bias in favor of the one with the recent keyframe.
How to spot it : It’s very difficult to tell, if not impossible, unless they provide the video files to inspect.
2. Cherry-pick source videos. Good source videos are incredibly hard to come by — almost everything is already compressed and what’s left is usually a very poor example of real content. Here’s some common ways to bias unfairly using cherry-picking :
a. Pick source videos that are already heavily compressed. Pre-compressed source isn’t much of an issue if your target quality level for testing is much lower than that of the source, since any compression artifacts in the source will be a lot smaller than those created by the encoders. But if the source is already very compressed, or you’re testing at a relatively high quality level, this becomes a significant issue.
Biases towards : Anything that uses a similar transform to the source content. For MPEG-2 source material, this biases towards formats that use the 8x8dct or a very close approximation : MPEG-1/2/4, H.263, and Theora. For H.264 source material, this biases towards formats that use a 4×4 transform : H.264 and VP8.
b. Pick standard test clips that were not intended for this purpose. There are a wide variety of uncompressed “standard test clips”. Some of these are not intended for general-purpose use, but rather exist to test specific encoder capabilities. For example, Mobile Calendar (“mobcal”) is extremely sharp and low motion, serving to test interpolation capabilities. It will bias incredibly heavily towards whatever encoder uses more B-frames and/or has higher-precision motion compensation. Other test clips are almost completely static, such as the classic “akiyo”. These are also not particularly representative of real content.
c. Pick very noisy content. Noise is — by definition — not particularly compressible. Both in terms of PSNR and visual quality, a very noisy test clip will tend to reduce the differences between encoders dramatically.
d. Pick a test clip to exercise a specific encoder feature. I’ve often used short clips from Touhou games to demonstrate the effectiveness of x264’s macroblock-tree algorithm. I’ve sometimes even used it to compare to other encoders as part of such a demonstration. I’ve also used the standard test clip “parkrun” as a demonstration of adaptive quantization. But claiming that either is representative of most real content, and thus can be used as a general determinant of how good encoders are, is of course insane.
e. Simply encode a bunch of videos and pick the one your favorite encoder does best on.
3. Preprocessing the source. An encoder test is a test of encoders, not preprocessing. Some encoding apps may add preprocessors to the source, such as noise reduction. This may make the video look better (possibly even better than the source), but it’s not a fair part of comparing the actual encoders.
4. Screw up decoding. People often forget that in addition to encoding, a test also involves decoding — a step which is equally possible to do wrong. One common error caused by this is in tests of Theora on content whose resolution isn’t divisible by 16. Decoding is often done with ffmpeg — which doesn’t crop the edges properly in some cases. This isn’t really a big deal visually, but in a PSNR comparison, misaligning the entire frame by 4 or 8 pixels is a great way of completely invalidating the results.
The greatest mistake of all
Above all, the biggest and most common mistake — and the one that leads to many of the problems mentioned here – is the mistaken belief that one, or even a few tests can really represent all usage fairly. Any comparison has to have some specific goal — to compare something in some particular case, whether it be “maximum offline compression ignoring encoding speed” or “real-time high-speed video streaming” or whatnot. And even then, no comparison can represent all use-cases in that category alone. An encoder comparison can only be honest if it’s aware of its limitations.
-
Dreamcast Anniversary Programming
10 September 2010, by Multimedia Mike (Game Hacking)
This day last year saw a lot of nostalgia posts on the internet regarding the Sega Dreamcast, launched 10 years prior to that day (on 9/9/99). Regrettably, none of the retrospectives that I read really seemed to mention the homebrew potential, which is the aspect that interested me. On the occasion of the DC’s 11th anniversary, I wanted to remind myself how to build something for the unit and do so using modern equipment and build tools.
Background
Like many other programmers, I initially gained interest in programming because I desired to program video games. Not content to just plunk out games on a PC, I always had a deep, abiding ambition to program actual video game hardware. That is, I wanted to program a purpose-built video game console. The Sega Dreamcast might be the most ideal candidate to ever emerge for that task. All that was required to run your own software on the unit was the console, a PC, some free software tools, and a special connectivity measure.
The Equipment
Here is the hardware required (ideally) to build software for the DC:
- The console itself (I happen to have 3 of them lying around, as pictured above)
- Some peripherals: such as the basic DC controller, the DC keyboard (flagship title: Typing of the Dead), and the visual memory unit (VMU)
- VGA box: The DC supported 480p gaming via a device that allowed you to connect the console straight to a VGA monitor via 15-pin D-sub. Not required for development, but very useful. I happen to have 3 of them from different third parties:
- Finally, the connectivity measure for hooking the DC to the PC.
There are 2 options here. The first is rare, expensive and relatively fast: a DC broadband adapter. The second is slower but much less expensive and relatively easy to come by: the DC coder’s cable. This was a DB-9 adapter on one end and a DC serial adapter on the other, and a circuit in the middle to monkey with voltage levels or some such; I’m no electrical engineer. I procured this model from the notorious Lik Sang, well before that outfit was sued out of business.
Dealing With Legacy
Take a look at that coder’s cable again. DB-9? When was the last time you owned a computer with one of those? And then think farther back to the last time you had occasion to plug something into one of those ports (likely a serial mouse).
A few years ago, someone was about to toss out this Belkin USB to DB-9 serial converter when I intervened. I foresaw the day when I would dust off the coder’s cable. So now I can connect a USB serial cable to my Eee PC, which then connects via converter to a different serial cable, one which has its own conversion circuit that alters the connection to yet another type of serial cable.
Bits is bits is bits as far as I’m concerned.
Putting It All Together
Now to assemble all the pieces (plus a monitor) into one development desktop :
The monitor says “dcload 1.0.3, idle…”. That’s a custom boot CD-ROM that is patiently waiting to receive commands, code and data via the serial port.
Getting The Software
Back in the day, homebrew software development on the DC revolved around these components:
- GNU binutils: for building base toolchains for the Hitachi SH-4 main CPU as well as the ARM7-based audio coprocessor
- GNU gcc/g++: for building compilers on top of binutils for the 2 CPUs
- Newlib: a C library intended for embedded systems
- KallistiOS: an open source, real-time OS developed for the DC
The DC was my first exposure to building cross compilers. I developed some software for the DC in the earlier part of the decade. Now, I am trying to figure out how I did it, especially since I think I came up with a few interesting ideas at the time.
Struggling With the Software Legacy
The source for KallistiOS has gone untouched since about 2004 but is still around thanks to Sourceforge. The instructions for properly building the toolchain have been lost to time, or would be were it not for the Internet Archive’s copy of a site called Hangar Eleven. Also, KallistiOS makes reference to a program called ‘dc-tool’ which is needed on the client side for communicating with dcload. I was able to find this binary at the Boob! site (well-known in DC circles).
I was able to build the toolchain using binutils 2.20.1, gcc 4.5.1 and newlib 1.18.0. Building the toolchain is an odd process as it requires building the binutils, then building the C compiler, then newlib, and then building the C compiler again along with the C++ compiler because the C++ compiler depends on newlib.
With some effort, I got the toolchain to build KallistiOS and most of its example programs. I documented most of the tweaks I had to make, several of them exactly the same as this one that I recently discovered while resurrecting a 10-year-old C program (common construct in C programming of old ?).
Moment of Truth
So I had some example programs built as ELF files. I told dc-tool to upload and run them on the waiting console. Unfortunately, the tool would just sort of stall, though some communication had evidently taken place. It has been many years since I have seen this in action but I recall that something more ought to be happening.
Plan B (Hardware)
This is the point that I remember that I have been holding onto one rather old little machine that still has a DB-9 serial port. It’s not especially ergonomic to set up. I have to run it on my floor because, to connect it to my network, I need to run a 25′ ethernet cable that just barely reaches from the other room. The machine doesn’t seem to like USB keyboards, which is a shame since I have long since ditched any PS/2 keyboards. Fortunately, the box still has an old Gentoo distro and is running sshd, a holdover from its former life as a headless box.
Now when I run dc-tool, both the PC and DC report the upload progress while pretty overscan bars oscillate on the DC’s monitor. Now I’m back in business, until…
Plan C (Software)
None of these KallistiOS example programs are working. Some are even reporting catastrophic failures (register dumps) via the serial console. That’s when I remember that gcc can be a bit fickle on CPU architectures that are not, shall we say, first-class citizens. Back in the day, gcc 2.95 was a certified no-go for SH-4 development. 3.0.3 or 3.0.4 was called upon at the time. As I’m hosting this toolchain on x86_64 right now, gcc 3.0.4 can’t even be built (predates the architecture).
One last option: As I searched through my old DC project directories, I found that I still have a lot of the resulting binaries, the ones I built 7-8 years ago. I upload a few of those and I finally see homebrew programming at work again, including this old program (described in detail here).
Next Steps
If I ever feel like revisiting this again, I suppose I can try some of the older 4.x series to see if they build valid programs. Alternatively, try building an x86_32-hosted 3.0.4 toolchain which ought to be a known good. And if that fails, search a little bit more to find that there are still active Dreamcast communities out there on the internet which probably have development toolchain binaries ready for download.