
Recherche avancée
Autres articles (32)
-
Menus personnalisés
14 novembre 2010, parMediaSPIP utilise le plugin Menus pour gérer plusieurs menus configurables pour la navigation.
Cela permet de laisser aux administrateurs de canaux la possibilité de configurer finement ces menus.
Menus créés à l’initialisation du site
Par défaut trois menus sont créés automatiquement à l’initialisation du site : Le menu principal ; Identifiant : barrenav ; Ce menu s’insère en général en haut de la page après le bloc d’entête, son identifiant le rend compatible avec les squelettes basés sur Zpip ; (...) -
Use, discuss, criticize
13 avril 2011, parTalk to people directly involved in MediaSPIP’s development, or to people around you who could use MediaSPIP to share, enhance or develop their creative projects.
The bigger the community, the more MediaSPIP’s potential will be explored and the faster the software will evolve.
A discussion list is available for all exchanges between users. -
Encoding and processing into web-friendly formats
13 avril 2011, parMediaSPIP automatically converts uploaded files to internet-compatible formats.
Video files are encoded in MP4, Ogv and WebM (supported by HTML5) and MP4 (supported by Flash).
Audio files are encoded in MP3 and Ogg (supported by HTML5) and MP3 (supported by Flash).
Where possible, text is analyzed in order to retrieve the data needed for search engine detection, and then exported as a series of image files.
All uploaded files are stored online in their original format, so you can (...)
Sur d’autres sites (5163)
-
Bit-field badness
30 janvier 2010, par Mans — Compilers, OptimisationConsider the following C code which is based on an real-world situation.
struct bf1_31 unsigned a:1 ; unsigned b:31 ; ;
void func(struct bf1_31 *p, int n, int a)
int i = 0 ;
do
if (p[i].a)
p[i].b += a ;
while (++i < n) ;
How would we best write this in ARM assembler ? This is how I would do it :
func : ldr r3, [r0], #4 tst r3, #1 add r3, r3, r2, lsl #1 strne r3, [r0, #-4] subs r1, r1, #1 bgt func bx lr
The
add
instruction is unconditional to avoid a dependency on the comparison. Unrolling the loop would mask the latency of theldr
instruction as well, but that is outside the scope of this experiment.Now compile this code with
gcc -march=armv5te -O3
and watch in horror :func : push r4 mov ip, #0 mov r4, r2 loop : ldrb r3, [r0] add ip, ip, #1 tst r3, #1 ldrne r3, [r0] andne r2, r3, #1 addne r3, r4, r3, lsr #1 orrne r2, r2, r3, lsl #1 strne r2, [r0] cmp ip, r1 add r0, r0, #4 blt loop pop r4 bx lr
This is nothing short of awful :
- The same value is loaded from memory twice.
- A complicated mask/shift/or operation is used where a simple shifted add would suffice.
- Write-back addressing is not used.
- The loop control counts up and compares instead of counting down.
- Useless
mov
in the prologue ; swapping the roles orr2
andr4
would avoid this. - Using
lr
in place ofr4
would allow the return to be done withpop {pc}
, saving one instruction (ignoring for the moment that no callee-saved registers are needed at all).
Even for this trivial function the gcc-generated code is more than twice the optimal size and slower by approximately the same factor.
The main issue I wanted to illustrate is the poor handling of bit-fields by gcc. When accessing bitfields from memory, gcc issues a separate load for each field even when they are contained in the same aligned memory word. Although each load after the first will most likely hit L1 cache, this is still bad for several reasons :
- Loads have typically two or three cycles result latency compared to one cycle for data processing instructions. Any bit-field can be extracted from a register with two shifts, and on ARM the second of these can generally be achieved using a shifted second operand to a following instruction. The ARMv6T2 instruction set also adds the
SBFX
andUBFX
instructions for extracting any signed or unsigned bit-field in one cycle. - Most CPUs have more data processing units than load/store units. It is thus more likely for an ALU instruction than a load/store to issue without delay on a superscalar processor.
- Redundant memory accesses can trigger early flushing of store buffers rendering these less efficient.
No gcc bashing is complete without a comparison with another compiler, so without further ado, here is the ARM RVCT output (
armcc --cpu 5te -O3
) :func : mov r3, #0 push r4, lr loop : ldr ip, [r0, r3, lsl #2] tst ip, #1 addne ip, ip, r2, lsl #1 strne ip, [r0, r3, lsl #2] add r3, r3, #1 cmp r3, r1 blt loop pop r4, pc
This is much better, the core loop using only one instruction more than my version. The loop control is counting up, but at least this register is reused as offset for the memory accesses. More remarkable is the push/pop of two registers that are never used. I had not expected to see this from RVCT.
Even the best compilers are still no match for a human.
-
avcodec/mpeg4videodec : set field durations to safe values when they are invalid
19 février 2014, par Michael Niedermayeravcodec/mpeg4videodec : set field durations to safe values when they are invalid
Fixes division by 0
Fixes : 3c980274cecaca6ca5d6e517c0c1983e-signal_sigabrt_7ffff7125425_410_cov_3621042467_2_audio_streams.avi
Found-by : Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by : Michael Niedermayer <michaelni@gmx.at> -
avcodec/g2meet : fix stride calculation, use correct format field
26 novembre 2013, par Michael Niedermayer