
Recherche avancée
Autres articles (97)
-
MediaSPIP 0.1 Beta version
25 avril 2011, parMediaSPIP 0.1 beta is the first version of MediaSPIP proclaimed as "usable".
The zip file provided here only contains the sources of MediaSPIP in its standalone version.
To get a working installation, you must manually install all-software dependencies on the server.
If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...) -
HTML5 audio and video support
13 avril 2011, parMediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers the Flowplayer flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...) -
ANNEXE : Les plugins utilisés spécifiquement pour la ferme
5 mars 2010, parLe site central/maître de la ferme a besoin d’utiliser plusieurs plugins supplémentaires vis à vis des canaux pour son bon fonctionnement. le plugin Gestion de la mutualisation ; le plugin inscription3 pour gérer les inscriptions et les demandes de création d’instance de mutualisation dès l’inscription des utilisateurs ; le plugin verifier qui fournit une API de vérification des champs (utilisé par inscription3) ; le plugin champs extras v2 nécessité par inscription3 (...)
Sur d’autres sites (9752)
-
Anatomy of an optimization : H.264 deblocking
As mentioned in the previous post, H.264 has an adaptive deblocking filter. But what exactly does that mean — and more importantly, what does it mean for performance ? And how can we make it as fast as possible ? In this post I’ll try to answer these questions, particularly in relation to my recent deblocking optimizations in x264.
H.264′s deblocking filter has two steps : strength calculation and the actual filter. The first step calculates the parameters for the second step. The filter runs on all the edges in each macroblock. That’s 4 vertical edges of length 16 pixels and 4 horizontal edges of length 16 pixels. The vertical edges are filtered first, from left to right, then the horizontal edges, from top to bottom (order matters !). The leftmost edge is the one between the current macroblock and the left macroblock, while the topmost edge is the one between the current macroblock and the top macroblock.
Here’s the formula for the strength calculation in progressive mode. The highest strength that applies is always selected.
If we’re on the edge between an intra macroblock and any other macroblock : Strength 4
If we’re on an internal edge of an intra macroblock : Strength 3
If either side of a 4-pixel-long edge has residual data : Strength 2
If the motion vectors on opposite sides of a 4-pixel-long edge are at least a pixel apart (in either x or y direction) or the reference frames aren’t the same : Strength 1
Otherwise : Strength 0 (no deblocking)These values are then thrown into a lookup table depending on the quantizer : higher quantizers have stronger deblocking. Then the actual filter is run with the appropriate parameters. Note that Strength 4 is actually a special deblocking mode that performs a much stronger filter and affects more pixels.
One can see somewhat intuitively why these strengths are chosen. The deblocker exists to get rid of sharp edges caused by the block-based nature of H.264, and so the strength depends on what exists that might cause such sharp edges. The strength calculation is a way to use existing data from the video stream to make better decisions during the deblocking process, improving compression and quality.
Both the strength calculation and the actual filter (not described here) are very complex if naively implemented. The latter can be SIMD’d with not too much difficulty ; no H.264 decoder can get away with reasonable performance without such a thing. But what about optimizing the strength calculation ? A quick analysis shows that this can be beneficial as well.
Since we have to check both horizontal and vertical edges, we have to check up to 32 pairs of coefficient counts (for residual), 16 pairs of reference frame indices, and 128 motion vector values (counting x and y as separate values). This is a lot of calculation ; a naive implementation can take 500-1000 clock cycles on a modern CPU. Of course, there’s a lot of shortcuts we can take. Here’s some examples :
- If the macroblock uses the 8×8 transform, we only need to check 2 edges in each direction instead of 4, because we don’t deblock inside of the 8×8 blocks.
- If the macroblock is a P-skip, we only have to check the first edge in each direction, since there’s guaranteed to be no motion vector differences, reference frame differences, or residual inside of the macroblock.
- If the macroblock has no residual at all, we can skip that check.
- If we know the partition type of the macroblock, we can do motion vector checks only along the edges of the partitions.
- If the effective quantizer is so low that no deblocking would be performed no matter what, don’t bother calculating the strength.
But even all of this doesn’t save us from ourselves. We still have to iterate over a ton of edges, checking each one. Stuff like the partition-checking logic greatly complicates the code and adds overhead even as it reduces the number of checks. And in many cases decoupling the checks to add such logic will make it slower : if the checks are coupled, we can avoid doing a motion vector check if there’s residual, since Strength 2 overrides Strength 1.
But wait. What if we could do this in SIMD, just like the actual loopfilter itself ? Sure, it seems more of a problem for C code than assembly, but there aren’t any obvious things in the way. Many years ago, Loren Merritt (pengvado) wrote the first SIMD implementation that I know of (for ffmpeg’s decoder) ; it is quite fast, so I decided to work on porting the idea to x264 to see if we could eke out a bit more speed here as well.
Before I go over what I had to do to make this change, let me first describe how deblocking is implemented in x264. Since the filter is a loopfilter, it acts “in loop” and must be done in both the encoder and decoder — hence why x264 has it too, not just decoders. At the end of encoding one row of macroblocks, x264 goes back and deblocks the row, then performs half-pixel interpolation for use in encoding the next frame.
We do it per-row for reasons of cache coherency : deblocking accesses a lot of pixels and a lot of code that wouldn’t otherwise be used, so it’s more efficient to do it in a single pass as opposed to deblocking each macroblock immediately after encoding. Then half-pixel interpolation can immediately re-use the resulting data.
Now to the change. First, I modified deblocking to implement a subset of the macroblock_cache_load function : spend an extra bit of effort loading the necessary data into a data structure which is much simpler to address — as an assembly implementation would need (x264_macroblock_cache_load_deblock). Then I massively cleaned up deblocking to move all of the core strength-calculation logic into a single, small function that could be converted to assembly (deblock_strength_c). Finally, I wrote the assembly functions and worked with Loren to optimize them. Here’s the result.
And the timings for the resulting assembly function on my Core i7, in cycles :
deblock_strength_c : 309
deblock_strength_mmx : 79
deblock_strength_sse2 : 37
deblock_strength_ssse3 : 33Now that is a seriously nice improvement. 33 cycles on average to perform that many comparisons–that’s absurdly low, especially considering the SIMD takes no branchy shortcuts : it always checks every single edge ! I walked over to my performance chart and happily crossed off a box.
But I had a hunch that I could do better. Remember, as mentioned earlier, we’re reloading all that data back into our data structures in order to address it. This isn’t that slow, but takes enough time to significantly cut down on the gain of the assembly code. And worse, less than a row ago, all this data was in the correct place to be used (when we just finished encoding the macroblock) ! But if we did the deblocking right after encoding each macroblock, the cache issues would make it too slow to be worth it (yes, I tested this). So I went back to other things, a bit annoyed that I couldn’t get the full benefit of the changes.
Then, yesterday, I was talking with Pascal, a former Xvid dev and current video hacker over at Google, about various possible x264 optimizations. He had seen my deblocking changes and we discussed that a bit as well. Then two lines hit me like a pile of bricks :
<_skal_> tried computing the strength at least ?
<_skal_> while it’s freshWhy hadn’t I thought of that ? Do the strength calculation immediately after encoding each macroblock, save the result, and then go pick it up later for the main deblocking filter. Then we can use the data right there and then for strength calculation, but we don’t have to do the whole deblock process until later.
I went and implemented it and, after working my way through a horde of bugs, eventually got a working implementation. A big catch was that of slices : deblocking normally acts between slices even though normal encoding does not, so I had to perform extra munging to get that to work. By midday today I was able to go cross yet another box off on the performance chart. And now it’s committed.
Sometimes chatting for 10 minutes with another developer is enough to spot the idea that your brain somehow managed to miss for nearly a straight week.
NB : the performance chart is on a specific test clip at a specific set of settings (super fast settings) relevant to the company I work at, so it isn’t accurate nor complete for, say, default settings.
Update : Here’s a higher resolution version of the current chart, as requested in the comments.
-
Dreamcast Anniversary Programming
10 septembre 2010, par Multimedia Mike — Game HackingThis day last year saw a lot of nostalgia posts on the internet regarding the Sega Dreamcast, launched 10 years prior to that day (on 9/9/99). Regrettably, none of the retrospectives that I read really seemed to mention the homebrew potential, which is the aspect that interested me. On the occasion of the DC’s 11th anniversary, I wanted to remind myself how to build something for the unit and do so using modern equipment and build tools.
Background
Like many other programmers, I initially gained interest in programming because I desired to program video games. Not content to just plunk out games on a PC, I always had a deep, abiding ambition to program actual video game hardware. That is, I wanted to program a purpose-built video game console. The Sega Dreamcast might be the most ideal candidate to ever emerge for that task. All that was required to run your own software on the unit was the console, a PC, some free software tools, and a special connectivity measure.The Equipment
Here is the hardware required (ideally) to build software for the DC :- The console itself (I happen to have 3 of them laying around, as pictured above)
- Some peripherals : Such as the basic DC controller, the DC keyboard (flagship title : Typing of the Dead), and the visual memory unit (VMU)
- VGA box : The DC supported 480p gaming via a device that allowed you to connect the console straight to a VGA monitor via 15-pin D-sub. Not required for development, but very useful. I happen to have 3 of them from different third parties :
- Finally, the connectivity measure for hooking the DC to the PC.
There are 2 options here. The first is rare, expensive and relatively fast : A DC broadband adapter. The second is slower but much less expensive and relatively easy to come by– the DC coder’s cable. This was a DB-9 adapter on one end and a DC serial adapter on the other, and a circuit in the middle to monkey with voltage levels or some such ; I’m no electrical engineer. I procured this model from the notorious Lik Sang, well before that outfit was sued out of business.
Dealing With Legacy
Take a look at that coder’s cable again. DB-9 ? When was the last time you owned a computer with one of those ? And then think farther back to the last time to had occasion to plug something into one of those ports (likely a serial mouse).
A few years ago, someone was about to toss out this Belkin USB to DB-9 serial converter when I intervened. I foresaw the day when I would dust off the coder’s cable. So now I can connect a USB serial cable to my Eee PC, which then connects via converter to a different serial cable, one which has its own conversion circuit that alters the connection to yet another type of serial cable.
Bits is bits is bits as far as I’m concerned.
Putting It All Together
Now to assemble all the pieces (plus a monitor) into one development desktop :
The monitor says “dcload 1.0.3, idle…”. That’s a custom boot CD-ROM that is patiently waiting to receive commands, code and data via the serial port.
Getting The Software
Back in the day, homebrew software development on the DC revolved around these components :- GNU binutils : for building base toolchains for the Hitachi SH-4 main CPU as well as the ARM7-based audio coprocessor
- GNU gcc/g++ : for building compilers on top of binutils for the 2 CPUs
- Newlib : a C library intended for embedded systems
- KallistiOS : an open source, real-time OS developed for the DC
The DC was my first exposure to building cross compilers. I developed some software for the DC in the earlier part of the decade. Now, I am trying to figure out how I did it, especially since I think I came up with a few interesting ideas at the time.
Struggling With the Software Legacy
The source for KallistiOS has gone untouched since about 2004 but is still around thanks to Sourceforge. The instructions for properly building the toolchain have been lost to time, or would be were it not for the Internet Archive’s copy of a site called Hangar Eleven. Also, KallistiOS makes reference to a program called ‘dc-tool’ which is needed on the client side for communicating with dcload. I was able to find this binary at the Boob ! site (well-known in DC circles).I was able to build the toolchain using binutils 2.20.1, gcc 4.5.1 and newlib 1.18.0. Building the toolchain is an odd process as it requires building the binutils, then building the C compiler, then newlib, and then building the C compiler again along with the C++ compiler because the C++ compiler depends on newlib.
With some effort, I got the toolchain to build KallistiOS and most of its example programs. I documented most of the tweaks I had to make, several of them exactly the same as this one that I recently discovered while resurrecting a 10-year-old C program (common construct in C programming of old ?).
Moment of Truth
So I had some example programs built as ELF files. I told dc-tool to upload and run them on the waiting console. Unfortunately, the tool would just sort of stall, though some communication had evidently taken place. It has been many years since I have seen this in action but I recall that something more ought to be happening.Plan B (Hardware)
This is the point that I remember that I have been holding onto one rather old little machine that still has a DB-9 serial port. It’s not especially ergonomic to set up. I have to run it on my floor because, to connect it to my network, I need to run a 25′ ethernet cable that just barely reaches from the other room. The machine doesn’t seem to like USB keyboards, which is a shame since I have long since ditched any PS/2 keyboards. Fortunately, the box still has an old Gentoo distro and is running sshd, a holdover from its former life as a headless box.
Now when I run dc-tool, both the PC and DC report the upload progress while pretty overscan bars oscillate on the DC’s monitor. Now I’m back in business, until…
Plan C (Software)
None of these KallistiOS example programs are working. Some are even reporting catastrophic failures (register dumps) via the serial console. That’s when I remember that gcc can be a bit fickle on CPU architectures that are not, shall we say, first-class citizens. Back in the day, gcc 2.95 was a certified no-go for SH-4 development. 3.0.3 or 3.0.4 was called upon at the time. As I’m hosting this toolchain on x86_64 right now, gcc 3.0.4 can’t even be built (predates the architecture).One last option : As I searched through my old DC project directories, I found that I still have a lot of the resulting binaries, the ones I built 7-8 years ago. I upload a few of those and I finally see homebrew programming at work again, including this old program (described in detail here).
Next Steps
If I ever feel like revisiting this again, I suppose I can try some of the older 4.x series to see if they build valid programs. Alternatively, try building an x86_32-hosted 3.0.4 toolchain which ought to be a known good. And if that fails, search a little bit more to find that there are still active Dreamcast communities out there on the internet which probably have development toolchain binaries ready for download. -
Linux Media Player Survey Circa 2001
2 septembre 2010, par Multimedia Mike — GeneralHere’s a document I scavenged from my archives. It was dated September 1, 2001 and I now publish it 9 years later. It serves as sort of a time capsule for the state of media player programs at the time. Looking back on this list, I can’t understand why I couldn’t find MPlayer while I was conducting this survey, especially since MPlayer is the project I eventually started to work for a few months after writing this piece.
For a little context, I had been studying multimedia concepts and tech for a year and was itching to get my hands dirty with practical multimedia coding. But I wanted to tackle what I perceived as unsolved problems– like playback of proprietary codecs. I didn’t want to have to build a new media playback framework just to start working on my problems. So I surveyed the players available to see which ones I could plug into and use as a testbed for implementing new decoders.
Regarding Real Player, I wrote : “We’re trying to move away from the proprietary, closed-source “solutions”. Heh. Was I really an insufferable open source idealist back in the day ?
Anyway, here’s the text with some Where are they now ? commentary [in brackets] :
Towards an All-Inclusive Media Playing Solution for Linux
I don’t feel that the media playing solutions for Linux set their sights high enough, even though they do tend to be quite ambitious.
I want to create a media player for Linux that can open a file, figure out what type of file it is (AVI, MOV, etc.), determine the compression algorithms used to encode the audio and video chunks inside (MPEG, Cinepak, Sorenson, etc.) and replay the file using the best audio, video, and CPU facilities available on the computer.
Video and audio playback is a solved problem on Linux ; I don’t wish to solve that problem again. The problem that isn’t solved is reliance on proprietary multimedia solutions through some kind of WINE-like layer in order to decode compressed multimedia files.
Survey of Linux solutions for decoding proprietary multimedia
updated 2001-09-01AVI Player for XMMS
This is based on Avifile. All the same advantages and limitations apply.
[Top Google hit is a Freshmeat page that doesn’t indicate activity since 2001-2002.]Avifile
This player does a great job at taking apart AVI and ASF files and then feeding the compressed chunks of multimedia data through to the binary Win32 decoders.The program is written in C++ and I’m not very good at interpreting that kind of code. But I’m learning all over again. Examining the object hierarchy, it appears that the designers had the foresight to include native support for decoders that are compiled into the program from source code. However, closer examination reveals that there is support for ONE source decoder and that’s the “decoder” for uncompressed data. Still, I tried to manipulate this routine to accept and decode data from other codecs but no dice. It’s really confounding. The program always crashes when I feed non-uncompressed data through the source decoder.
[Lives at http://avifile.sourceforge.net/ ; not updated since 2006.]Real Player
There’s not much to do with this since it is closed source and proprietary. Even though there is a plugin architecture, that’s not satisfactory. We’re trying to move away from the proprietary, closed-source “solutions”.
[Still kickin’ with version 11.]XAnim
This is a well-established Unix media player. To his credit, the author does as well as he can with the resources he has. In other words, he supports the non-proprietary video codecs well, and even has support for some proprietary video codecs through binary-only decoders.The source code is extremely difficult to work with as the author chose to use the X coding format which I’ve never seen used anywhere else except for X header files. The infrastructure for extending the program and supporting other codecs and file formats is there, I suppose, but I would have to wrap my head around the coding style. Maybe I can learn to work past that. The other thing that bothers me about this program is the decoding approach : It seems that each video decoder includes routines to decompress the multimedia data into every conceivable RGB and YUV output format. This seems backwards to me ; it seems better to have one decoder function that decodes the data into its native format it was compressed from (e.g., YV12 for MPEG data) and then pass that data to another layer of the program that’s in charge of presenting the data and possibly converting it if necessary. This layer would encompass highly-optimized software conversion routines including special CPU-specific instructions (e.g., MMX and SSE) and eliminate the need to place those routines in lots of other routines. But I’m getting ahead of myself.
[This one was pretty much dead before I made this survey, the most recent update being in 1999. Still, we owe it much respect as the granddaddy of Unix multimedia playback programs.]Xine
This seems like a promising program. It was originally designed to play MPEGs from DVDs. It can also play MPEG files on a hard drive and utilizes the Xv extensions for hardware YUV playback. It’s also supposed to play AVI files using the same technique as Avifile but I have never, ever gotten it to work. If an AVI file has both video and sound, the binary video decoder can’t decode any frames. If the AVI file has video and no sound, the program gets confused and crashes, as far as I can tell.Still, it’s promising, and I’ve been trying to work around these crashes. It doesn’t yet have the type of modularization I’d like to see. Right now, it tailored to suit MPEG playback and AVI playback is an afterthought. Still, it appears to have a generalized interface for dropping in new file demultiplexers.
I tried to extend the program for supporting source decoders by rewriting w32codec.c from scratch. I’m not having a smooth time of it so far. I’m able to perform some manipulations on the output window. However, I can’t get the program to deal with an RGB image format. It has trouble allocating an RGB surface with XvShmCreateImage(). This isn’t suprising, per my limited knowledge of X which is that Xv applies to YUV images, but it could also apply to RGB images as well. Anyway, the program should be able to fall back on regular RGB pixmaps if that Xv call fails.
Right now, this program is looking the most promising. It will take some work to extend the underlying infrastructure, but it seems doable since I know C quite well and can understand the flow of this program, as opposed to Avifile and its C++. The C code also compiles about 10 times faster.
[My home project for many years after a brief flirtation with MPlayer. It is still alive ; its latest release was just a month ago.]XMovie
This library is a Quicktime movie player. I haven’t looked at it too extensively yet, but I do remember looking at it at one point and reading the documentation that said it doesn’t support key frames. Still, I should examine it again since they released a new version recently.
[Heroine Virtual still puts out some software but XMovie has not been updated since 2005.]XMPS
This program compiles for me, but doesn’t do much else. It can play an MP3 file. I have been able to get MPEG movies to play through it, but it refuses to show the full video frame, constricting it to a small window (obviously a bug).
[This project is hosted on SourceForge and is listed with a registration date of 2003, well after this survey was made. So the project obviously lived elsewhere in 2001. Meanwhile, it doesn’t look like any files ever made it to SF for hosting.]XTheater
I can’t even get this program to compile. It’s supposed to be an MPEG player based on SMPEG. As such, it probably doesn’t hold much promise for being easily extended into a general media player.
[Last updated in 2002.]GMerlin
I can’t get this to compile yet. I have a bug report in to the dev group.
[Updated consistently in the last 9 years. Last update was in February of this year. I can’t find any record of my bug report, though.]