git.videolan.org Git - x264.git/summary

x264 git repository

http://git.videolan.org/?p=x264.git;a=summary

Articles published on the site

  • x86: Faster pixel_ssd_nv12

    17 September 2016, by Henrik Gramner
    x86: Faster pixel_ssd_nv12
    
    Also drop the MMX2 version to simplify things.
    
    • [DH] common/pixel.c
    • [DH] common/x86/pixel-a.asm
    • [DH] common/x86/pixel.h
  • x86: SSE zigzag_scan_4x4_field

    11 September 2016, by Henrik Gramner
    x86: SSE zigzag_scan_4x4_field
    
    Replaces the MMX2 version, one cycle faster.
    
    Also change the checkasm test to use the correct alignment macro.
    
    • [DH] common/dct.c
    • [DH] common/x86/dct-a.asm
    • [DH] common/x86/dct.h
    • [DH] tools/checkasm.c
  • x86: AVX2 mbtree_propagate_list

    7 September 2016, by Henrik Gramner
    x86: AVX2 mbtree_propagate_list
    
    SIMD part is around 25% faster than AVX on Haswell, around 7%
    faster when including the runtime of the scalar C wrapper.
    
    • [DH] common/x86/const-a.asm
    • [DH] common/x86/mc-a2.asm
    • [DH] common/x86/mc-c.c
    • [DH] common/x86/trellis-64.asm
  • x86: Move predict_16x16_dc_left calculations to asm

    7 September 2016, by Henrik Gramner
    x86: Move predict_16x16_dc_left calculations to asm
    
    1-2 cycles faster and avoids some code duplication to decrease code size.
    
    Also drop the MMX2 implementation in favor of SSE2 to simplify things.
    
    • [DH] common/pixel.c
    • [DH] common/x86/predict-a.asm
    • [DH] common/x86/predict-c.c
    • [DH] common/x86/predict.h
  • aarch64: implement x264_plane_copy_swap_neon

    26 August 2016, by Janne Grunau
    aarch64: implement x264_plane_copy_swap_neon
    
    plane_copy_swap_c: 27054
    plane_copy_swap_neon: 4152
    
    • [DH] common/aarch64/mc-a.S
    • [DH] common/aarch64/mc-c.c
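
For readers unfamiliar with the operation named in the last entry, here is a minimal scalar sketch of what a "plane copy with swap" typically does: it copies a plane of interleaved byte pairs while exchanging the two bytes of each pair (for example, turning VU-ordered chroma into UV order). The function name plane_copy_swap_ref, its parameter names, and the 8-bit sample type are assumptions made for illustration; this is not the x264 source, and the NEON version in the commit is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (not taken from x264): copy `rows` rows of
 * `pairs_per_row` interleaved byte pairs from src to dst, swapping the two
 * bytes inside every pair. Strides are given in bytes. */
static void plane_copy_swap_ref(uint8_t *dst, ptrdiff_t dst_stride,
                                const uint8_t *src, ptrdiff_t src_stride,
                                int pairs_per_row, int rows)
{
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < 2 * pairs_per_row; x += 2) {
            dst[x]     = src[x + 1]; /* second byte of the pair goes first */
            dst[x + 1] = src[x];     /* first byte of the pair goes second */
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

Called on a 2x2 plane of pairs holding the bytes 1..8 with a stride of 4, this sketch writes 2 1 4 3 on the first row and 6 5 8 7 on the second. A SIMD version performs the same per-pair exchange on many bytes per instruction, which is where a speedup over the scalar loop would come from.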