git.videolan.org Git - ffmpeg.git/rss log

FFmpeg git repo

http://git.videolan.org/?p=ffmpeg.git;a=summary

Articles published on the site


avcodec/aarch64/vvc: Optimize NEON version of vvc_dmvr

3 March, by Krzysztof Pyrkosz

avcodec/aarch64/vvc: Optimize NEON version of vvc_dmvr

This patch replaces blocks of instructions performing rounding and
widening shifts with one-liners achieving the same result.
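
For readers unfamiliar with the two operations named above, here is a scalar C model of each. The function names and the shift ranges are illustrative assumptions, not taken from the patch. AArch64 NEON provides single instructions for both operations (for example URSHR and USHLL), but which instructions the patch actually uses is not stated in this log.

```c
#include <stdint.h>

/* Rounding right shift: add half of 2^shift as a bias, then shift.
 * Hypothetical helper; assumes 1 <= shift <= 15. */
static inline uint16_t rshr_u16(uint16_t x, unsigned shift)
{
    return (uint16_t)(((uint32_t)x + (1u << (shift - 1))) >> shift);
}

/* Widening left shift: promote the 8-bit value to 16 bits before
 * shifting so that no high bits are lost.
 * Hypothetical helper; assumes shift <= 8. */
static inline uint16_t shll_u8(uint8_t x, unsigned shift)
{
    return (uint16_t)((uint16_t)x << shift);
}
```

Written with separate instructions, the rounding shift needs at least an add and a shift per vector; a fused rounding-shift instruction does both at once, which is the kind of saving the commit message describes.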

Benchmarks on A78, before (first block) and after (second block):
dmvr_8_12x20_neon:                                      86.2 ( 6.90x)
dmvr_8_20x12_neon:                                      94.8 ( 5.93x)
dmvr_8_20x20_neon:                                     141.5 ( 6.50x)
dmvr_12_12x20_neon:                                    158.0 ( 3.76x)
dmvr_12_20x12_neon:                                    151.2 ( 3.73x)
dmvr_12_20x20_neon:                                    247.2 ( 3.71x)
dmvr_hv_8_12x20_neon:                                  423.2 ( 3.75x)
dmvr_hv_8_20x12_neon:                                  434.0 ( 3.69x)
dmvr_hv_8_20x20_neon:                                  706.0 ( 3.69x)

dmvr_8_12x20_neon:                                      77.2 ( 7.70x)
dmvr_8_20x12_neon:                                      66.5 ( 8.49x)
dmvr_8_20x20_neon:                                      92.2 ( 9.90x)
dmvr_12_12x20_neon:                                     80.2 ( 7.38x)
dmvr_12_20x12_neon:                                     58.2 ( 9.59x)
dmvr_12_20x20_neon:                                     90.0 (10.15x)
dmvr_hv_8_12x20_neon:                                  369.0 ( 4.34x)
dmvr_hv_8_20x12_neon:                                  355.8 ( 4.49x)
dmvr_hv_8_20x20_neon:                                  574.2 ( 4.51x)

Signed-off-by: Martin Storsjö <martin@martin.st>

libavcodec/aarch64/vvc/inter.S

swscale/aarch64: dotprod implementation of rgba32_to_Y

3 March, by Krzysztof Pyrkosz

swscale/aarch64: dotprod implementation of rgba32_to_Y

The idea is to split the 16 bit coefficients into lower and upper half,
invoke udot for the lower half, shift by 8, and follow by udot for the
upper half.
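
The arithmetic behind the split can be checked with a scalar C model. This is a sketch under stated assumptions: `udot4` models one 32-bit lane of a UDOT instruction (four unsigned 8-bit products accumulated), the function names are hypothetical, and the two partial sums are recombined here with a left shift so that the identity is exact. The order and direction of the shift in the actual patch may differ.

```c
#include <stdint.h>

/* Model of one UDOT lane: four u8 * u8 products summed into 32 bits. */
static uint32_t udot4(const uint8_t a[4], const uint8_t b[4])
{
    uint32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)a[i] * b[i];
    return acc;
}

/* Reference: dot product using the full 16-bit coefficients. */
static uint32_t dot_direct(const uint8_t px[4], const uint16_t coef[4])
{
    uint32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)px[i] * coef[i];
    return acc;
}

/* Split form: dot the pixels with the low bytes and with the high
 * bytes of the coefficients separately, then weight the high part
 * by 256 (a shift by 8). */
static uint32_t dot_split(const uint8_t px[4], const uint16_t coef[4])
{
    uint8_t lo[4], hi[4];
    for (int i = 0; i < 4; i++) {
        lo[i] = (uint8_t)(coef[i] & 0xff);
        hi[i] = (uint8_t)(coef[i] >> 8);
    }
    return udot4(px, lo) + (udot4(px, hi) << 8);
}
```

Because each coefficient equals `lo + 256 * hi`, `dot_split` and `dot_direct` agree for every input. The largest possible sum, 4 * 255 * 65535, fits comfortably in 32 bits.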

Benchmark on A78:
bgra_to_y_128_c:                                       682.0 ( 1.00x)
bgra_to_y_128_neon:                                    181.2 ( 3.76x)
bgra_to_y_128_dotprod:                                 117.8 ( 5.79x)
bgra_to_y_1080_c:                                     5742.5 ( 1.00x)
bgra_to_y_1080_neon:                                  1472.5 ( 3.90x)
bgra_to_y_1080_dotprod:                                906.5 ( 6.33x)
bgra_to_y_1920_c:                                    10194.0 ( 1.00x)
bgra_to_y_1920_neon:                                  2589.8 ( 3.94x)
bgra_to_y_1920_dotprod:                               1573.8 ( 6.48x)

Signed-off-by: Martin Storsjö <martin@martin.st>

libswscale/aarch64/input.S
libswscale/aarch64/swscale.c

avcodec/mpeg12enc: Simplify writing bits

3 March, by Andreas Rheinhardt

avcodec/mpeg12enc: Simplify writing bits

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

libavcodec/mpeg12enc.c

avcodec/mpegvideo: Mark ff_mpv_common_defaults() as av_cold

3 March, by Andreas Rheinhardt

avcodec/mpegvideo: Mark ff_mpv_common_defaults() as av_cold

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

libavcodec/mpegvideo.c

avcodec/speedhqenc: Inline ff_speedhq_mb_y_order_to_mb()

3 March, by Andreas Rheinhardt

avcodec/speedhqenc: Inline ff_speedhq_mb_y_order_to_mb()

It is an extremely simple function that is only called once,
so it should be inlined.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

libavcodec/speedhqenc.c
libavcodec/speedhqenc.h
