Recherche avancée

Médias (39)

Mot : - Tags -/audio

Autres articles (96)

  • MediaSPIP 0.1 Beta version

    25 avril 2011, par

    MediaSPIP 0.1 beta is the first version of MediaSPIP proclaimed as "usable".
    The zip file provided here only contains the sources of MediaSPIP in its standalone version.
    To get a working installation, you must manually install all-software dependencies on the server.
    If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...)

  • Multilang : améliorer l’interface pour les blocs multilingues

    18 février 2011, par

    Multilang est un plugin supplémentaire qui n’est pas activé par défaut lors de l’initialisation de MediaSPIP.
    Après son activation, une préconfiguration est mise en place automatiquement par MediaSPIP init permettant à la nouvelle fonctionnalité d’être automatiquement opérationnelle. Il n’est donc pas obligatoire de passer par une étape de configuration pour cela.

  • Les vidéos

    21 avril 2011, par

    Comme les documents de type "audio", Mediaspip affiche dans la mesure du possible les vidéos grâce à la balise html5 .
    Un des inconvénients de cette balise est qu’elle n’est pas reconnue correctement par certains navigateurs (Internet Explorer pour ne pas le nommer) et que chaque navigateur ne gère en natif que certains formats de vidéos.
    Son avantage principal quant à lui est de bénéficier de la prise en charge native de vidéos dans les navigateur et donc de se passer de l’utilisation de Flash et (...)

Sur d’autres sites (6170)

  • Simply beyond ridiculous

    7 mai 2010, par Dark Shikari — H.265, speed

    For the past few years, various improvements on H.264 have been periodically proposed, ranging from larger transforms to better intra prediction. These finally came together in the JCT-VC meeting this past April, where over two dozen proposals were made for a next-generation video coding standard. Of course, all of these were in very rough-draft form ; it will likely take years to filter it down into a usable standard. In the process, they’ll pick the most useful features (hopefully) from each proposal and combine them into something a bit more sane. But, of course, it all has to start somewhere.

    A number of features were common : larger block sizes, larger transform sizes, fancier interpolation filters, improved intra prediction schemes, improved motion vector prediction, increased internal bit depth, new entropy coding schemes, and so forth. A lot of these are potentially quite promising and resolve a lot of complaints I’ve had about H.264, so I decided to try out the proposal that appeared the most interesting : the Samsung+BBC proposal (A124), which claims compression improvements of around 40%.

    The proposal combines a bouillabaisse of new features, ranging from a 12-tap interpolation filter to 12thpel motion compensation and transforms as large as 64×64. Overall, I would say it’s a good proposal and I don’t doubt their results given the sheer volume of useful features they’ve dumped into it. I was a bit worried about complexity, however, as 12-tap interpolation filters don’t exactly scream “fast”.

    I prepared myself for the slowness of an unoptimized encoder implementation, compiled their tool, and started a test encode with their recommended settings.

    I waited. The first frame, an I-frame, completed.

    I took a nap.

    I waited. The second frame, a P-frame, was done.

    I played a game of Settlers.

    I waited. The third frame, a B-frame, was done.

    I worked on a term paper.

    I waited. The fourth frame, a B-frame, was done.

    After a full 6 hours, 8 frames had encoded. Yes, at this rate, it would take a full two weeks to encode 10 seconds of HD video. On a Core i7. This is not merely slow ; this is over 1000 times slower than x264 on “placebo” mode. This is so slow that it is not merely impractical ; it is impossible to even test. This encoder is apparently designed for some sort of hypothetical future computer from space. And word from other developers is that the Intel proposal is even slower.

    This has led me to suspect that there is a great deal of cheating going on in the H.265 proposals. The goal of the proposals, of course, is to pick the best feature set for the next generation video compression standard. But there is an extra motivation : organizations whose features get accepted get patents on the resulting standard, and thus income. With such large sums of money in the picture, dishonesty becomes all the more profitable.

    There is a set of rules, of course, to limit how the proposals can optimize their encoders. If different encoders use different optimization techniques, the results will no longer be comparable — remember, they are trying to compare compression features, not methods of optimizing encoder-side decisions. Thus all encoders are required to use a constant quantizer, specified frame types, and so forth. But there are no limits on how slow an encoder can be or what algorithms it can use.

    It would be one thing if the proposed encoder was a mere 10 times slower than the current reference ; that would be reasonable, given the low level of optimization and higher complexity of the new standard. But this is beyond ridiculous. With the prize given to whoever can eke out the most PSNR at a given quantizer at the lowest bitrate (with no limits on speed), we’re just going to get an arms race of slow encoders, with every company trying to use the most ridiculous optimizations possible, even if they involve encoding the frame 100,000 times over to choose the optimal parameters. And the end result will be as I encountered here : encoders so slow that they are simply impossible to even test.

    Such an arms race certainly does little good in optimizing for reality where we don’t have 30 years to encode an HD movie : a feature that gives great compression improvements is useless if it’s impossible to optimize for in a reasonable amount of time. Certainly once the standard is finalized practical encoders will be written — but it makes no sense to optimize the standard for a use-case that doesn’t exist. And even attempting to “optimize” anything is difficult when encoding a few seconds of video takes weeks.

    Update : The people involved have contacted me and insist that there was in fact no cheating going on. This is probably correct ; the problem appears to be that the rules that were set out were simply not strict enough, making many changes that I would intuitively consider “cheating” to be perfectly allowed, and thus everyone can do it.

    I would like to apologize if I implied that the results weren’t valid ; they are — the Samsung-BBC proposal is definitely one of the best, which is why I picked it to test with. It’s just that I think any situation in which it’s impossible to test your own software is unreasonable, and thus the entire situation is an inherently broken one, given the lax rules, slow baseline encoder, and no restrictions on compute time.

  • Beware the builtins

    14 janvier 2010, par Mans — Compilers

    GCC includes a large number of builtin functions allegedly providing optimised code for common operations not easily expressed directly in C. Rather than taking such claims at face value (this is GCC after all), I decided to conduct a small investigation to see how well a few of these functions are actually implemented for various targets.

    For my test, I selected the following functions :

    • __builtin_bswap32 : Byte-swap a 32-bit word.
    • __builtin_bswap64 : Byte-swap a 64-bit word.
    • __builtin_clz : Count leading zeros in a word.
    • __builtin_ctz : Count trailing zeros in a word.
    • __builtin_prefetch : Prefetch data into cache.

    To test the quality of these builtins, I wrapped each in a normal function, then compiled the code for these targets :

    • ARMv7
    • AVR32
    • MIPS
    • MIPS64
    • PowerPC
    • PowerPC64
    • x86
    • x86_64

    In all cases I used compiler flags were -O3 -fomit-frame-pointer plus any flags required to select a modern CPU model.

    ARM

    Both __builtin_clz and __builtin_prefetch generate the expected CLZ and PLD instructions respectively. The code for __builtin_ctz is reasonable for ARMv6 and earlier :

    rsb     r3, r0, #0
    and     r0, r3, r0
    clz     r0, r0
    rsb     r0, r0, #31
    

    For ARMv7 (in fact v6T2), however, using the new bit-reversal instruction would have been better :

    rbit    r0, r0
    clz     r0, r0
    

    I suspect this is simply a matter of the function not yet having been updated for ARMv7, which is perhaps even excusable given the relatively rare use cases for it.

    The byte-reversal functions are where it gets shocking. Rather than use the REV instruction found from ARMv6 on, both of them generate external calls to __bswapsi2 and __bswapdi2 in libgcc, which is plain C code :

    SItype
    __bswapsi2 (SItype u)
    
      return ((((u) & 0xff000000) >> 24)
              | (((u) & 0x00ff0000) >>  8)
              | (((u) & 0x0000ff00) <<  8)
              | (((u) & 0x000000ff) << 24)) ;
    
    

    DItype
    __bswapdi2 (DItype u)

    return ((((u) & 0xff00000000000000ull) >> 56)
    | (((u) & 0x00ff000000000000ull) >> 40)
    | (((u) & 0x0000ff0000000000ull) >> 24)
    | (((u) & 0x000000ff00000000ull) >> 8)
    | (((u) & 0x00000000ff000000ull) << 8)
    | (((u) & 0x0000000000ff0000ull) << 24)
    | (((u) & 0x000000000000ff00ull) << 40)
    | (((u) & 0x00000000000000ffull) << 56)) ;

    While the 32-bit version compiles to a reasonable-looking shift/mask/or job, the 64-bit one is a real WTF. Brace yourselves :

    push    r4, r5, r6, r7, r8, r9, sl, fp
    mov     r5, #0
    mov     r6, #65280 ; 0xff00
    sub     sp, sp, #40 ; 0x28
    and     r7, r0, r5
    and     r8, r1, r6
    str     r7, [sp, #8]
    str     r8, [sp, #12]
    mov     r9, #0
    mov     r4, r1
    and     r5, r0, r9
    mov     sl, #255 ; 0xff
    ldr     r9, [sp, #8]
    and     r6, r4, sl
    mov     ip, #16711680 ; 0xff0000
    str     r5, [sp, #16]
    str     r6, [sp, #20]
    lsl     r2, r0, #24
    and     ip, ip, r1
    lsr     r7, r4, #24
    mov     r1, #0
    lsr     r5, r9, #24
    mov     sl, #0
    mov     r9, #-16777216 ; 0xff000000
    and     fp, r0, r9
    lsr     r6, ip, #8
    orr     r9, r7, r1
    and     ip, r4, sl
    orr     sl, r1, r2
    str     r6, [sp]
    str     r9, [sp, #32]
    str     sl, [sp, #36] ; 0x24
    add     r8, sp, #32
    ldm     r8, r7, r8
    str     r1, [sp, #4]
    ldm     sp, r9, sl
    orr     r7, r7, r9
    orr     r8, r8, sl
    str     r7, [sp, #32]
    str     r8, [sp, #36] ; 0x24
    mov     r3, r0
    mov     r7, #16711680 ; 0xff0000
    mov     r8, #0
    and     r9, r3, r7
    and     sl, r4, r8
    ldr     r0, [sp, #16]
    str     fp, [sp, #24]
    str     ip, [sp, #28]
    stm     sp, r9, sl
    ldr     r7, [sp, #20]
    ldr     sl, [sp, #12]
    ldr     fp, [sp, #12]
    ldr     r8, [sp, #28]
    lsr     r0, r0, #8
    orr     r7, r0, r7, lsl #24
    lsr     r6, sl, #24
    orr     r5, r5, fp, lsl #8
    lsl     sl, r8, #8
    mov     fp, r7
    add     r8, sp, #32
    ldm     r8, r7, r8
    orr     r6, r6, r8
    ldr     r8, [sp, #20]
    ldr     r0, [sp, #24]
    orr     r5, r5, r7
    lsr     r8, r8, #8
    orr     sl, sl, r0, lsr #24
    mov     ip, r8
    ldr     r0, [sp, #4]
    orr     fp, fp, r5
    ldr     r5, [sp, #24]
    orr     ip, ip, r6
    ldr     r6, [sp]
    lsl     r9, r5, #8
    lsl     r8, r0, #24
    orr     fp, fp, r9
    lsl     r3, r3, #8
    orr     r8, r8, r6, lsr #8
    orr     ip, ip, sl
    lsl     r7, r6, #24
    and     r5, r3, #16711680 ; 0xff0000
    orr     r7, r7, fp
    orr     r8, r8, ip
    orr     r4, r1, r7
    orr     r5, r5, r8
    mov     r9, r6
    mov     r1, r5
    mov     r0, r4
    add     sp, sp, #40 ; 0x28
    pop     r4, r5, r6, r7, r8, r9, sl, fp
    bx      lr
    

    That’s right, 91 instructions to move 8 bytes around a bit. GCC definitely has a problem with 64-bit numbers. It is perhaps worth noting that the bswap_64 macro in glibc splits the 64-bit value into 32-bit halves which are then reversed independently, thus side-stepping this weakness of gcc.

    As a side note, ARM RVCT (armcc) compiles those functions perfectly into one and two REV instructions, respectively.

    AVR32

    There is not much to report here. The latest gcc version available is 4.2.4, which doesn’t appear to have the bswap functions. The other three are handled nicely, even using a bit-reverse for __builtin_ctz.

    MIPS / MIPS64

    The situation MIPS is similar to ARM. Both bswap builtins result in external libgcc calls, the rest giving sensible code.

    PowerPC

    I scarcely believe my eyes, but this one is actually not bad. The PowerPC has no byte-reversal instructions, yet someone seems to have taken the time to teach gcc a good instruction sequence for this operation. The PowerPC does have some powerful rotate-and-mask instructions which come in handy here. First the 32-bit version :

    rotlwi  r0,r3,8
    rlwimi  r0,r3,24,0,7
    rlwimi  r0,r3,24,16,23
    mr      r3,r0
    blr
    

    The 64-bit byte-reversal simply applies the above code on each half of the value :

    rotlwi  r0,r3,8
    rlwimi  r0,r3,24,0,7
    rlwimi  r0,r3,24,16,23
    rotlwi  r3,r4,8
    rlwimi  r3,r4,24,0,7
    rlwimi  r3,r4,24,16,23
    mr      r4,r0
    blr
    

    Although I haven’t analysed that code carefully, it looks pretty good.

    PowerPC64

    Doing 64-bit operations is easier on a 64-bit CPU, right ? For you and me perhaps, but not for gcc. Here __builtin_bswap64 gives us the now familiar __bswapdi2 call, and while not as bad as the ARM version, it is not pretty :

    rldicr  r0,r3,8,55
    rldicr  r10,r3,56,7
    rldicr  r0,r0,56,15
    rldicl  r11,r3,8,56
    rldicr  r9,r3,16,47
    or      r11,r10,r11
    rldicr  r9,r9,48,23
    rldicl  r10,r0,24,40
    rldicr  r0,r3,24,39
    or      r11,r11,r10
    rldicl  r9,r9,40,24
    rldicr  r0,r0,40,31
    or      r9,r11,r9
    rlwinm  r10,r3,0,0,7
    rldicl  r0,r0,56,8
    or      r0,r9,r0
    rldicr  r10,r10,8,55
    rlwinm  r11,r3,0,8,15
    or      r0,r0,r10
    rldicr  r11,r11,24,39
    rlwinm  r3,r3,0,16,23
    or      r0,r0,r11
    rldicr  r3,r3,40,23
    or      r3,r0,r3
    blr
    

    That is 6 times longer than the (presumably) hand-written 32-bit version.

    x86 / x86_64

    As one might expect, results on x86 are good. All the tested functions use the available special instructions. One word of caution though : the bit-counting instructions are very slow on some implementations, specifically the Atom, AMD chips, and the notoriously slow Pentium4E.

    Conclusion

    In conclusion, I would say gcc builtins can be useful to avoid fragile inline assembler. Before using them, however, one should make sure they are not in fact harmful on the required targets. Not even those builtins mapping directly to CPU instructions can be trusted.

  • Révision 17171 : remplacer l’usine a gaz lignes_longes qui alterait le texte en lui inserant des ...

    12 février 2011, par cedric -

    par un simple conteneur portant l’attribut style=’word-wrap:break-word ;’ http://www.alsacreations.com/tuto/lire/1038-gerer-debordement-contenu-css.html lignes_longues devrait etre depreciee au profit d’un stylage direct des zones concernees, mais on assure ainsi la compatibilite avec les (...)