18:32
lavc/vp8dsp: add R-V V vp7_idct_dc_add4y As with idct_dc_add, most of the code is shared with, and replaces, the previous VP8 function. To improve performance, we break down the 16x4 matrix into 4 rows, rather than 4 squares. Thus strided loads and stores are avoided, and the 4 DC calculations are vectored. Unfortunately this requires a vector gather to splat the DC values, but overall this is still a win for performance: T-Head C908: vp7_idct_dc_add4y_c: 7.2 vp7_idct_dc_add4y_rvv_i32: 2.2 vp8_idct_dc_add4y_c: 6.2 vp8_idct_dc_add4y_rvv_i32: 2.2 (before) vp8_idct_dc_add4y_rvv_i32: 1.7 (...)
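The row-wise layout described in this commit can be modelled in scalar C. This is only a sketch, not the RVV assembly: the function name `idct_dc_add4y_rows` and the helper `clip_u8` are hypothetical, and the VP8-style rounding `(dc + 4) >> 3` is assumed (VP7 derives its DC value differently). All four DC values are computed first, then each of the four 16-pixel rows is processed contiguously, with `dc[x >> 2]` standing in for the gather that splats each DC across its four columns.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp an int to the 0..255 pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of the row-wise approach (names are illustrative only).
 * Assumes VP8-style DC rounding and an arithmetic right shift. */
static void idct_dc_add4y_rows(uint8_t *dst, int16_t block[4][16],
                               ptrdiff_t stride)
{
    int dc[4];

    /* Step 1: compute the four DC terms together and clear the inputs. */
    for (int b = 0; b < 4; b++) {
        dc[b] = (block[b][0] + 4) >> 3;
        block[b][0] = 0;
    }

    /* Step 2: walk 4 rows of 16 contiguous pixels instead of 4 squares,
     * so every access within a row is unit-stride. */
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 16; x++)
            dst[x] = clip_u8(dst[x] + dc[x >> 2]);
        dst += stride;
    }
}
```

In a vector implementation, the inner loop would presumably map to a single 16-element operation per row, which is where the unit-stride loads and stores come from.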
15:55
lavc/vp8dsp: add R-V V vp7_idct_dc_add This just computes the direct coefficient and hands over to code shared with VP8. Accordingly the bulk of changes are just rewriting the VP8 code to share. Nothing to write home about: vp7_idct_dc_add_c: 1.7 vp7_idct_dc_add_rvv_i32: 1.2 [DH] libavcodec/riscv/vp7dsp_init.c [DH] libavcodec/riscv/vp8dsp_rvv.S
09:01
lavc/vc1dsp: fix R-V V avg_mspel_pixels The 8x8 pixel arrays are not necessarily aligned to 64 bits, so the current code leads to a bus error on real hardware. This is reproducible with FATE's vc1_ilaced_twomv test case. The new "pessimist" code can trivially be shared for 16x16 pixel arrays so we also do that. FWIW, this also nominally reduces the hardware requirement from Zve64x to Zve32x. T-Head C908: vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_c: 14.7 vc1dsp.avg_vc1_mspel_pixels_tab[0][0]_rvv_i32: 3.5 vc1dsp.avg_vc1_mspel_pixels_tab[1][0]_c: 3.7 (...)
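The alignment issue can be illustrated with a scalar C sketch. This is not the RVV code from the commit: the name `avg_pixels_bytewise` is hypothetical, and a round-up average `(a + b + 1) >> 1` is assumed. Because every access is a single byte, the routine makes no assumption about the alignment of `dst` or `src`, and the same loop serves both 8x8 and 16x16 blocks via the `size` parameter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-wise averaging of a size x size block into dst.
 * Only 8-bit accesses are used, so no 64-bit alignment is required. */
static void avg_pixels_bytewise(uint8_t *dst, const uint8_t *src,
                                ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = (uint8_t)((dst[x] + src[x] + 1) >> 1); /* round up */
        dst += stride;
        src += stride;
    }
}
```

Accessing the same rows through a 64-bit element type would require 8-byte alignment on hardware that does not support misaligned vector accesses, which matches the failure mode the commit describes.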
13:13
avcodec/hevc_ps: Fix UB 1 << 31 Reviewed-by: Tomas Härdin <gitⓐhaerdin.se> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardtⓐoutlook.com> [DH] libavcodec/hevc_ps.c
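For context, `1 << 31` is undefined behaviour in C when `int` is 32 bits wide, because the result is not representable in a signed `int`. The actual change in libavcodec/hevc_ps.c is not shown here; the following is a minimal sketch of the usual remedy, with a hypothetical helper named `bit_mask32` that shifts an unsigned constant instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a mask with only bit n set, 0 <= n <= 31. */
static inline uint32_t bit_mask32(unsigned n)
{
    assert(n < 32); /* shifting by the full width or more is also undefined */
    /* Shifting an unsigned 32-bit one is well defined for every n below 32,
     * whereas the signed expression 1 << 31 overflows int. */
    return UINT32_C(1) << n;
}
```

Writing the constant as `1U` has the same effect on platforms where `unsigned int` is at least 32 bits wide.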
14:15
avutil/float_dsp.h: fix doxy for scalarproduct_double Signed-off-by: James Almer <jamrialⓐgmail.com> [DH] libavutil/float_dsp.h
14:14
avutil/float_dsp: revert accidental doxy removal. Done by accident in 6a7c4d60a1498929c2a366f2ef4ccc35621a4358. Signed-off-by: James Almer <jamrialⓐgmail.com> [DH] libavutil/float_dsp.h
13:56
avcodec/videotoolbox: use the correct HEVCSPS field name Fixes compilation that was broken in 6fed1841a1f5dd3cdcf343f77925af0781ebe83a. Signed-off-by: James Almer <jamrialⓐgmail.com> [DH] libavcodec/videotoolbox.c