git.videolan.org Git - ffmpeg.git/rss log
FFmpeg git repo
Articles published on the site
-
tests/swscale: calculate theoretical expected SSIM
4 March, by Niklas Haas
We can calculate, with some confidence, the theoretical SSIM expected from an "ideal" conversion by computing the reference SSIM level for an image dithered with uniformly distributed quantization noise. This gives us an additional safety net for catching regressions even when there is no reference to compare against.
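As a rough illustration of the idea (not the formula used in the patch, which is not shown here), the SSIM between a signal and the same signal plus zero-mean, independent, uniformly distributed noise can be written in closed form. The means match, the covariance equals the signal variance, and uniform noise spanning one quantization step of size Δ has variance Δ²/12. The function name, the [0, 1] sample range, and the choice of the conventional SSIM constant C2 = (0.03)² are assumptions made for this sketch.

```c
#include <assert.h>

/*
 * Hypothetical helper: expected SSIM of a signal against a copy of itself
 * that carries uniform quantization noise at the given bit depth.
 * Samples are assumed to be normalized to [0, 1].
 */
double expected_ssim_uniform_noise(double signal_var, int bits)
{
    assert(bits >= 1 && bits <= 16);
    assert(signal_var >= 0.0);

    /* One quantization step on a [0, 1] scale. */
    const double step = 1.0 / (double)((1 << bits) - 1);
    /* Variance of noise uniformly distributed over one step. */
    const double noise_var = step * step / 12.0;
    /* Conventional SSIM stabilizer (K2 * L)^2 with K2 = 0.03, L = 1. */
    const double c2 = 0.03 * 0.03;

    /* Luminance term is 1 (equal means); only the contrast/structure term remains. */
    return (2.0 * signal_var + c2) / (2.0 * signal_var + noise_var + c2);
}
```

Under these assumptions the expected score approaches 1 as the bit depth grows and drops as the quantization step becomes large relative to the signal variance, which is what makes it usable as a lower bound to test against.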
-
swscale: aarch64: Simplify the assignment of lumToYV12
4 March, by Martin Storsjö
We normally don't need else statements here; the common pattern is to assign the lower-level SIMD implementations first, then conditionally reassign higher-level ones afterwards, if supported.
Signed-off-by: Martin Storsjö <martin@martin.st>
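The assignment pattern described in this commit message can be sketched as follows. This is a minimal illustration, not the swscale code: the flag names, the function pointer type, and the three conversion routines are hypothetical, and the "NEON" and "dotprod" variants are plain C stand-ins so that the example compiles anywhere.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*to_y_fn)(uint8_t *dst, const uint8_t *src, int width);

/* Hypothetical CPU feature bits, not FFmpeg's AV_CPU_FLAG_* values. */
enum { HYP_CPU_NEON = 1 << 0, HYP_CPU_DOTPROD = 1 << 1 };

/* Placeholder 4-bytes-per-pixel to gray conversion shared by all stand-ins. */
static void to_y_common(uint8_t *dst, const uint8_t *src, int width)
{
    assert(width >= 0);
    for (int i = 0; i < width; i++)
        dst[i] = (uint8_t)((src[4 * i] + 2 * src[4 * i + 1] + src[4 * i + 2]) >> 2);
}

void to_y_c(uint8_t *dst, const uint8_t *src, int width)       { to_y_common(dst, src, width); }
void to_y_neon(uint8_t *dst, const uint8_t *src, int width)    { to_y_common(dst, src, width); }
void to_y_dotprod(uint8_t *dst, const uint8_t *src, int width) { to_y_common(dst, src, width); }

/* Assign the baseline first, then let each supported higher level overwrite it. */
to_y_fn select_to_y(int cpu_flags)
{
    to_y_fn fn = to_y_c;

    if (cpu_flags & HYP_CPU_NEON)
        fn = to_y_neon;
    if (cpu_flags & HYP_CPU_DOTPROD)
        fn = to_y_dotprod;   /* no else needed: this simply overrides NEON */

    return fn;
}
```

Because each check only overwrites the previous choice, adding another feature level means adding one more `if` at the end rather than restructuring an if/else chain.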
-
avcodec/aarch64/vvc: Optimize NEON version of vvc_dmvr
3 March, by Krzysztof Pyrkosz
This patch replaces blocks of instructions performing rounding and widening shifts with one-liners achieving the same result.

Before and after on A78:

Before:
dmvr_8_12x20_neon:       86.2 ( 6.90x)
dmvr_8_20x12_neon:       94.8 ( 5.93x)
dmvr_8_20x20_neon:      141.5 ( 6.50x)
dmvr_12_12x20_neon:     158.0 ( 3.76x)
dmvr_12_20x12_neon:     151.2 ( 3.73x)
dmvr_12_20x20_neon:     247.2 ( 3.71x)
dmvr_hv_8_12x20_neon:   423.2 ( 3.75x)
dmvr_hv_8_20x12_neon:   434.0 ( 3.69x)
dmvr_hv_8_20x20_neon:   706.0 ( 3.69x)

After:
dmvr_8_12x20_neon:       77.2 ( 7.70x)
dmvr_8_20x12_neon:       66.5 ( 8.49x)
dmvr_8_20x20_neon:       92.2 ( 9.90x)
dmvr_12_12x20_neon:      80.2 ( 7.38x)
dmvr_12_20x12_neon:      58.2 ( 9.59x)
dmvr_12_20x20_neon:      90.0 (10.15x)
dmvr_hv_8_12x20_neon:   369.0 ( 4.34x)
dmvr_hv_8_20x12_neon:   355.8 ( 4.49x)
dmvr_hv_8_20x20_neon:   574.2 ( 4.51x)

Signed-off-by: Martin Storsjö <martin@martin.st>
-
swscale/aarch64: dotprod implementation of rgba32_to_Y
3 March, by Krzysztof Pyrkosz
The idea is to split the 16-bit coefficients into lower and upper halves, invoke udot for the lower half, shift by 8, and follow with udot for the upper half.

Benchmark on A78:
bgra_to_y_128_c:            682.0 ( 1.00x)
bgra_to_y_128_neon:         181.2 ( 3.76x)
bgra_to_y_128_dotprod:      117.8 ( 5.79x)
bgra_to_y_1080_c:          5742.5 ( 1.00x)
bgra_to_y_1080_neon:       1472.5 ( 3.90x)
bgra_to_y_1080_dotprod:     906.5 ( 6.33x)
bgra_to_y_1920_c:         10194.0 ( 1.00x)
bgra_to_y_1920_neon:       2589.8 ( 3.94x)
bgra_to_y_1920_dotprod:    1573.8 ( 6.48x)

Signed-off-by: Martin Storsjö <martin@martin.st>
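The arithmetic behind the coefficient split can be checked with a scalar model. The sketch below is an assumption-laden illustration rather than the patch itself: `udot4` is a hypothetical stand-in for what one 32-bit lane of the udot instruction accumulates (four unsigned 8-bit products), and all function names are invented for the example. It shows that a dot product with 16-bit coefficients equals the low-byte dot product plus the high-byte dot product scaled by 256.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for one lane of udot: sum of four u8 * u8 products. */
static uint32_t udot4(const uint8_t a[4], const uint8_t b[4])
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += (uint32_t)a[i] * b[i];
    return sum;
}

/* Reference: multiply the 16-bit coefficients by the 8-bit samples directly. */
uint32_t dot16_direct(const uint16_t coef[4], const uint8_t px[4])
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += (uint32_t)coef[i] * px[i];
    return sum;
}

/* Same value built from two 8-bit dot products over the split coefficients. */
uint32_t dot16_split(const uint16_t coef[4], const uint8_t px[4])
{
    uint8_t lo[4], hi[4];
    for (int i = 0; i < 4; i++) {
        lo[i] = (uint8_t)(coef[i] & 0xff);  /* lower half of the coefficient */
        hi[i] = (uint8_t)(coef[i] >> 8);    /* upper half of the coefficient */
    }
    return udot4(lo, px) + (udot4(hi, px) << 8);
}

/* Self-check with arbitrary illustrative values (not real RGB-to-Y coefficients). */
void dot16_selfcheck(void)
{
    const uint16_t coef[4] = { 0x1234, 0x00ff, 0xff00, 0x0000 };
    const uint8_t  px[4]   = { 255, 1, 128, 77 };
    assert(dot16_split(coef, px) == dot16_direct(coef, px));
}
```

The commit message does not say in which direction the shift by 8 is applied. Since the upper-half term is a multiple of 256, shifting the lower-half sum right by 8 before adding the upper-half sum gives the same value as shifting the combined sum right by 8, so either ordering is consistent with the identity above.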
-
avcodec/mpeg12enc: Simplify writing bits
3 March, by Andreas Rheinhardt