Recherche avancée

Médias (91)

Autres articles (53)

  • Publier sur MédiaSpip

    13 juin 2013

    Puis-je poster des contenus à partir d’une tablette Ipad ?
    Oui, si votre Médiaspip installé est à la version 0.2 ou supérieure. Contacter au besoin l’administrateur de votre MédiaSpip pour le savoir

  • Submit bugs and patches

    13 avril 2011

    Unfortunately a software is never perfect.
    If you think you have found a bug, report it using our ticket system. Please to help us to fix it by providing the following information : the browser you are using, including the exact version as precise an explanation as possible of the problem if possible, the steps taken resulting in the problem a link to the site / page in question
    If you think you have solved the bug, fill in a ticket and attach to it a corrective patch.
    You may also (...)

  • MediaSPIP 0.1 Beta version

    25 avril 2011, par

    MediaSPIP 0.1 beta is the first version of MediaSPIP proclaimed as "usable".
    The zip file provided here only contains the sources of MediaSPIP in its standalone version.
    To get a working installation, you must manually install all-software dependencies on the server.
    If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...)

Sur d’autres sites (3747)

  • An introduction to reverse engineering

    22 janvier 2011

    (This blog is still in hibernation, but I needed somewhere to post this)

    Reverse engineering is one of those wonderful topics, covering everything from simple "guess how this program works" problem solving, to poking at silicon with scanning electron microscopes. I’m always hugely fascinated by articles that walk through the steps involved in these types of activities, so I thought I’d contribute one back to the world.

    In this case, I’m going to be looking at the export bundle format created by the Tandberg Content Server, a device for recording video conferences. The bundle is intended for moving recordings between Tandberg devices, but it’s also the easiest way to get all of the related assets for a recorded conference. Unfortunately, there’s no parser available to take the bundle files (.tcb) and output the component pieces. Well, that just won’t do.

    For this type of reverse engineering, I basically want to learn enough about the TCB format to be able to parse out the individual files within it. The only tools I’ll need in this process are a hex editor, a notepad, and a way to convert between hex and decimal (the OS X calculator will do fine if you’re not the type to do it in your head).

    Step 1 : Basic Research
    After Googling around to see if this was a solved issue, I decided to dive into the format. I brought a sample bundle into my trusty hex editor (in this case Hex Fiend).

    1-1.jpg

    A few things are immediately obvious. First, we see the first four bytes are the letters TCSB. Another quick visit to Google confirms this header type isn’t found elsewhere, and there’s essentially no discussion of it. Going a few bytes further, we see "contents.xml." And a few bytes after that, we see what looks like plaintext XML. This is a pretty good clue that the TCB file consists of a . Let’s scan a bit further and see if we can confirm that.
    1-2.jpg
    In this segment, we see the end of the XML, and something that could be another filename - "dbtransfer" - followed by what looks like gibberish. That doesn’t help too much. Let’s keep looking.
    1-3.jpg
    Great - a .jpg ! Looking a bit further, we see the letters "JFIF," which is recognizable as part of a JPEG header. If you weren’t already familiar with that, a quick google for "jpg hex header" would clear up any confusion. So, we’ve got the basics of the file format down, but we’ll need a little bit more information if we’re going to write a parser.

    Step 2 : Finding the pattern
    We can make an educated guess that a file like this has to provide a few hints to a decoder. We would either expect a table of contents, describing where in the bundle each individual file was located, some sort of stop bit marking the boundary between files, byte offsets describing the locations of files, or a listing of file lengths.

    There isn’t any sign of a table of contents. Let’s start looking for a stop bit, as that would make writing our parser really easy. Want I’m going to do is pull out all of the data between two prospective files, and I want two sets to compare.
    I’ve placed asterisks to flag the bytes corresponding to the filenames, since those are known.

    1E D1 70 4C 25 06 36 4D 42 E9 65 6A 9F 5D 88 38 0A 00 *64 62 74 72 61 6E 73 66 65 72* 42 06 ED 48 0B 50 0A C4 14 D6 63 42 F2 BF E3 9D 20 29 00 00 00 00 00 00 DE E5 FD

    01 0C 00 *63 6F 6E 74 65 6E 74 73 2E 78 6D 6C* 9E 0E FE D3 C9 3A 3A 85 F4 E4 22 FE D0 21 DC D7 53 03 00 00 00 00 00 00

    The first line corresponds to the "dbtransfer" entry, the second to the "contents.xml" entry. Let’s trim the first entry to match the second.

    38 0A 00 *64 62 74 72 61 6E 73 66 65 72* 42 06 ED 48 0B 50 0A C4 14 D6 63 42 F2 BF E3 9D 20 29 00 00 00 00 00 00

    01 0C 00 *63 6F 6E 74 65 6E 74 73 2E 78 6D 6C* 9E 0E FE D3 C9 3A 3A 85 F4 E4 22 FE D0 21 DC D7 53 03 00 00 00 00 00 00

    It looks like we’ve got three bytes before the filename, followed by 18 bytes, followed by six bytes of zero. Unfortunately, there’s no obvious pattern of bits which would correspond to a "break" between segments. However, looking at those first three bytes, we see a 0x0A, and a 0x0C, two small values in the same place. 10 and 12. Interesting - the 12 entry corresponds with "contents.xml" and the 10 entry corresponds with "dbtransfer". Could that byte describe the length of the filename ? Let’s look at our much longer JPG entry to be sure.

    70 4A 00 *77 77 77 5C 73 6C 69 64 65 73 5C 64 37 30 64 35 34 63 66 2D 32 39 35 62 2D 34 31 34 63 2D 61 38 64 66 2D 32 66 37 32 64 66 33 30 31 31 35 65 5C 74 68 75 6D 62 6E 61 69 6C 73 5C 74 68 75 6D 62 6E 61 69 6C 30 30 2E 6A 70 67*

    0x4A - 74, corresponding to a 74 character filename. Looks like we’re in business.

    At this point, it’s worth an aside to talk about endianness. I happen to know that the Tandberg Content Server runs Windows on Intel, so I went into this with the assumption that the format was little-endian. However, if you’re not sure, it’s always worth looking at words backwards and forwards, just in case.

    So we know how to find our filename. Now how do we find our file data ? Let’s go back to our JPEG. We know that JPEGs start with 0xFFD8FFE0, and a quick trip to Google also tells us that they end with 0xFFD9. We can use that to pull a sample jpeg out of our TCB, save it to disk, and confirm that we’re on the right track.
    2-2.jpg

    This is one of those great steps in reverse engineering - concrete proof that you’re on the right track. Everything seems to go quicker from this point on.

    So, we know we’ve got a JPEG file in a continuous 2177 byte segment. We know that the format used byte lengths to describe filenames - maybe it also uses byte lengths to describe file lengths. Let’s look for 2177, or 0x8108, near our JPEG.

    2-3.jpg

    Well, that’s a good sign. But, it could be coincidental, so at this point we’d want to check a few other files to be sure. In fact, looking further in some file, we find some larger .mp4 files which don’t quite match our guess. It turns out that file length is a 32bit value, not a 16bit value - with our two jpegs, the larger bytes just happened to be zeros.

    Step 3 : Writing a parser

    "Bbbbbut...", I hear you say ! "You have all these chunks of data you don’t understand !"

    True enough, but all I care about is getting the files out, with the proper names. I don’t care about creation dates, file permissions, or any of the other crud that this file format likely contains.

    3-1.jpg

    Let’s look at the first two files in this bundle. A little bit of byte counting shows us the pattern that we can follow. We’ll treat the first file as a special case. After that, we seek 16 bytes from the end of file data to find the filename length (2 bytes), then we’re at the filename, then we seek 16 bytes to find the file length (4 bytes) and seek another 4 bytes to find the start of the file data. Rinse, repeat.

    I wrote a quick parser in PHP, since the eventual use for this information is part of a larger PHP-based application, but any language with basic raw file handling would work just as well.

    tcsParser.txt
    This was about the simplest possible type of reverse engineering - we had known data in an unknown format, without any compression or encryption. It only gets harder from here...

  • VP8 Codec Optimization Update

    16 juin 2010, par noreply@blogger.com (John Luther) — inside webm

    Since WebM launched in May, the team has been working hard to make the VP8 video codec faster. Our community members have contributed improvements, but there’s more work to be done in some interesting areas related to performance (more on those below).


    Encoder


    The VP8 encoder is ripe for speed optimizations. Scott LaVarnway’s efforts in writing an x86 assembly version of the quantizer will help in this goal significantly as the quantizer is called many times while the encoder makes decisions about how much detail from the image will be transmitted.

    For those of you eager to get involved, one piece of low-hanging fruit is writing a SIMD version of the ARNR temporal filtering code. Also, much of the assembly code only makes use of the SSE2 instruction set, and there surely are newer extensions that could be made use of. There are also redundant code removal and other general cleanup to be done ; (Yaowu Xu has submitted some changes for these).

    At a higher level, someone can explore some alternative motion search strategies in the encoder. Eventually the motion search can be decoupled entirely to allow motion fields to be calculated elsewhere (for example, on a graphics processor).

    Decoder


    Decoder optimizations can bring higher resolutions and smoother playback to less powerful hardware.

    Jeff Muizelaar has submitted some changes which combine the IDCT and summation with the predicted block into a single function, helping us avoid storing the intermediate result, thus reducing memory transfers and avoiding cache pollution. This changes the assembly code in a fundamental way, so we will need to sync the other platforms up or switch them to a generic C implementation and accept the performance regression. Johann Koenig is working on implementing this change for ARM processors, and we’ll merge these changes into the mainline soon.

    In addition, Tim Terriberry is attacking a different method of bounds checking on the "bool decoder." The bool decoder is performance-critical, as it is called several times for each bit in the input stream. The current code handles this check with a simple clamp in the innermost loops and a less-frequent copy into a circular buffer. This can be expensive at higher data rates. Tim’s patch removes the circular buffer, but uses a more complex clamp in the innermost loops. These inner loops have historically been troublesome on embedded platforms.

    To contribute in these efforts, I’ve started working on rewriting higher-level parts of the decoder. I believe there is an opportunity to improve performance by paying better attention to data locality and cache layout, and reducing memory bus traffic in general. Another area I plan to explore is improving utilization in the multi-threaded decoder by separating the bitstream decoding from the rest of the image reconstruction, using work units larger than a single macroblock, and not tying functionality to a specific thread. To get involved in these areas, subscribe to the codec-devel mailing list and provide feedback on the code as it’s written.

    Embedded Processors


    We want to optimize multiple platforms, not just desktops. Fritz Koenig has already started looking at the performance of VP8 on the Intel Atom platform. This platform need some attention as we wrote our current x86 assembly code with an out-of-order processor in mind. Since Atom is an in-order processor (much like the original Pentium), the instruction scheduling of all of the x86 assembly code needs to be reexamined. One option we’re looking at is scheduling the code for the Atom processor and seeing if that impacts the performance on other x86 platforms such as the Via C3 and AMD Geode. This is shaping up to be a lot of work, but doing it would provide us with an opportunity to tighten up our assembly code.

    These issues, along with wanting to make better use of the larger register file on x86_64, may reignite every assembly programmer’s (least ?) favorite debate : whether or not to use intrinsics. Yunqing Wang has been experimenting with this a bit, but initial results aren’t promising. If you have experience in dealing with a lot of assembly code across several similar-but-kinda-different platforms, these maintainability issues might be familiar to you. I hope you’ll share your thoughts and experiences on the codec-devel mailing list.

    Optimizing codecs is an iterative (some would say never-ending) process, so stay tuned for more posts on the progress we’re making, and by all means, start hacking yourself.

    It’s exciting to see that we’re starting to get substantial code contributions from developers outside of Google, and I look forward to more as WebM grows into a strong community effort.

    John Koleszar is a software engineer at Google.

  • Revision 30713 : Grosses modifications ... nécessite de refaire la configuration

    8 août 2009, par kent1@… — Log

    Grosses modifications ... nécessite de refaire la configuration