Other articles (63)

  • MediaSPIP 0.1 Beta version

    25 April 2011

    MediaSPIP 0.1 beta is the first version of MediaSPIP declared "usable".
    The zip file provided here contains only the sources of MediaSPIP in its standalone version.
    To get a working installation, you must manually install all software dependencies on the server.
    If you want to use this archive for an installation in "farm mode", you will also need to carry out other manual (...)

  • Customize by adding your logo, banner, or background image

    5 September 2013

    Some themes support three customization elements: adding a logo; adding a banner; adding a background image.

  • Improving the base version

    13 September 2013

    Nicer multiple selection
    The Chosen plugin improves the usability of multiple-selection fields. See the two images below for a comparison.
    To use it, activate the Chosen plugin (General site configuration > Plugin management), then configure the plugin (Templates > Chosen) by enabling Chosen on the public site and specifying the form elements to enhance, for example select[multiple] for multiple-selection lists (...)

On other sites (9297)

  • Error initializing filter 'amovie' when joining multiple files and adding background music and a watermark

    25 August 2020, by Nguyễn Trọng

    I am concatenating multiple videos while adding background music and a watermark at the same time (see below):

    


    [-y, -i, 012.mp4, -i, 011.mp4, -i, 010.mp4, -i, 009.mp4, -i, 008.mp4, -i, 007.mp4, -i, 006.mp4, -i, 005.mp4, -i, 004.mp4, -i, 003.mp4, -i, 002.mp4, -i, 001.mp4, -i, 000.mp4, -i, /storage/emulated/0/FXMotion/.cache/.watermark/logo_watermark.png, -filter_complex, [0:v][0:a][1:v][1:a][2:v][2:a][3:v][3:a][4:v][4:a][5:v][5:a][6:v][6:a][7:v][7:a][8:v][8:a][9:v][9:a][10:v][10:a][11:v][11:a][12:v][12:a]concat=n=13:v=1:a=1[video][audio];[13:v]scale=320:-1[watermark];[video][watermark]overlay=main_w-overlay_w-10:main_h-overlay_h-10[vw];amovie=/storage/emulated/0/Download/Afro B - Drogba (Joanna) Prod by Team Salut [Official Music Video].mp3:loop=0,asetpts=N/SR/TB,aformat=sample_fmts=fltp:sample_rates=44100:channel_layouts=stereo[bgmusic];[audio]aformat=sample_fmts=fltp:sample_rates=44100:channel_layouts=stereo,volume=0[fmaudio];[fmaudio][bgmusic]amerge=2,pan=stereo|c0 (...)

    


    When I run the above command, it fails with this error:

    


    [Parsed_amovie_3 @ 0x816c5c00] Failed to avformat_open_input '/storage/emulated/0/Download/Afro B - Drogba (Joanna) Prod by Team Salut'
[AVFilterGraph @ 0xa4c104c0] Error initializing filter 'amovie'[AVFilterGraph @ 0xa4c104c0]  with args '/storage/emulated/0/Download/Afro B - Drogba (Joanna) Prod by Team Salut'[AVFilterGraph @ 0xa4c104c0]
Error initializing complex filters.
No such file or directory
Conversion failed!


    


    I don't know why this happens. Is it a bug in the amovie filter?

    How can I solve it? Thanks in advance.

    


    ----------------Full Log------------------

    


    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '012.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:04.20, start: 0.000000, bitrate: 5349 kb/s
Stream #0:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5340 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '011.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.20, start: 0.000000, bitrate: 5642 kb/s
Stream #1:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5681 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #1:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '010.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.01, start: 0.000000, bitrate: 5689 kb/s
Stream #2:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5695 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #2:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #3, mov,mp4,m4a,3gp,3g2,mj2, from '009.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.01, start: 0.000000, bitrate: 5624 kb/s
Stream #3:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5630 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #3:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #4, mov,mp4,m4a,3gp,3g2,mj2, from '008.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:05.40, start: 0.000000, bitrate: 5226 kb/s
Stream #4:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5218 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #4:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #5, mov,mp4,m4a,3gp,3g2,mj2, from '007.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.60, start: 0.000000, bitrate: 5631 kb/s
Stream #5:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5663 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #5:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #6, mov,mp4,m4a,3gp,3g2,mj2, from '006.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.48, start: 0.000000, bitrate: 5455 kb/s
Stream #6:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5472 kb/s, 20 fps, 20 tbr, 10240 tbn, 40 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #6:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #7, mov,mp4,m4a,3gp,3g2,mj2, from '005.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.11, start: 0.000000, bitrate: 5220 kb/s
Stream #7:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5213 kb/s, 19.81 fps, 19.81 tbr, 720k tbn, 39.61 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #7:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #8, mov,mp4,m4a,3gp,3g2,mj2, from '004.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:05.43, start: 0.000000, bitrate: 5515 kb/s
Stream #8:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5542 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #8:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #9, mov,mp4,m4a,3gp,3g2,mj2, from '003.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:06.43, start: 0.000000, bitrate: 4450 kb/s
Stream #9:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 4445 kb/s, 15.25 fps, 15.25 tbr, 15616 tbn, 30.50 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #9:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #10, mov,mp4,m4a,3gp,3g2,mj2, from '002.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:05.92, start: 0.000000, bitrate: 4323 kb/s
Stream #10:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 4321 kb/s, 20.29 fps, 20.29 tbr, 1800k tbn, 40.58 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #10:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #11, mov,mp4,m4a,3gp,3g2,mj2, from '001.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:05.60, start: 0.000000, bitrate: 3759 kb/s
Stream #11:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 3776 kb/s, 16.19 fps, 16.19 tbr, 1350k tbn, 32.38 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #11:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #12, mov,mp4,m4a,3gp,3g2,mj2, from '000.mp4':
Metadata:
major_brand     : isom
minor_version   : 512
compatible_brands: isomiso2avc1mp41
encoder         : Lavf58.35.101
location-eng    : +18.0104-077.0263/
location        : +18.0104-077.0263/
Duration: 00:00:07.38, start: 0.000000, bitrate: 5448 kb/s
Stream #12:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 720x1280 [SAR 1:1 DAR 9:16], 5441 kb/s, 16.53 fps, 16.53 tbr, 10800k tbn, 33.06 tbc (default)
Metadata:
handler_name    : VideoHandle
Stream #12:1(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 1 kb/s (default)
Metadata:
handler_name    : SoundHandle
Input #13, png_pipe, from '/storage/emulated/0/FXMotion/.cache/.watermark/logo_watermark.png':
Duration: N/A, bitrate: N/A
Stream #13:0: Video: png, rgba(pc), 335x51, 25 tbr, 25 tbn, 25 tbc
[Parsed_amovie_3 @ 0x816c5c00] Failed to avformat_open_input '/storage/emulated/0/Download/Afro B - Drogba (Joanna) Prod by Team Salut'
[AVFilterGraph @ 0xa4c104c0] Error initializing filter 'amovie'[AVFilterGraph @ 0xa4c104c0]  with args '/storage/emulated/0/Download/Afro B - Drogba (Joanna) Prod by Team Salut'[AVFilterGraph @ 0xa4c104c0]
Error initializing complex filters.
No such file or directory
Conversion failed!
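
    One possible reading of the log, offered as a hypothesis rather than a confirmed diagnosis: the path reported by amovie stops just before " [Official Music Video].mp3", which suggests the unescaped "[" in the file name was parsed as the start of a filtergraph link label. FFmpeg's documentation describes two escaping levels for values embedded in a filtergraph string. The sketch below applies them in Python; the helper names are hypothetical, and it assumes the filtergraph is passed as a single argument with no shell in between, as in the argument array above.

```python
def escape_filter_option(value: str) -> str:
    """First level: characters that are special inside one filter option value."""
    # Backslash goes first so the escapes added afterwards are not re-escaped.
    for ch in ("\\", "'", ":"):
        value = value.replace(ch, "\\" + ch)
    return value


def escape_filtergraph(value: str) -> str:
    """Second level: characters that are special in a filtergraph description."""
    for ch in ("\\", "'", "[", "]", ",", ";"):
        value = value.replace(ch, "\\" + ch)
    return value


def amovie_source(path: str, label: str = "bgmusic") -> str:
    """Build an amovie chain fragment whose file name is escaped at both levels."""
    escaped = escape_filtergraph(escape_filter_option(path))
    return f"amovie={escaped}:loop=0,asetpts=N/SR/TB[{label}]"


music = ("/storage/emulated/0/Download/Afro B - Drogba (Joanna) "
         "Prod by Team Salut [Official Music Video].mp3")
print(amovie_source(music))
```

    If the hypothesis holds, the brackets in the file name reach amovie as literal characters instead of being read as a label. Renaming the file to avoid special characters would be a simpler way to test the same idea.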


    


  • A pragmatic strategy to merge multiple video files

    19 June 2021, by saurav

    I currently am working on recording a multiparty video conference which supports up to 6 participants. I am recording the conference using a media server and storing audio/video streams individually for every participant.

    


    Next, I need to merge those individual recordings into a single video file and upload it to cloud storage such as AWS S3. For this I am considering two options, GStreamer or FFmpeg. I am leaning towards FFmpeg because I have used it before. I am currently experimenting with FFmpeg features such as the hstack and vstack filters.

    


    Here is the FFmpeg command I recently used to join two webm videos, of 2 min 40 s and 1 min 40 s, into an mp4 file for upload. Both videos are 1280x720 in this case, but I have included the scale step because in a real-life scenario participants join with different cameras and produce video files of different resolutions, which is a problem for the hstack/vstack filters. Scaling makes the resolution consistent across all participants.

    


    ffmpeg -i 1.webm -i 2.webm -filter_complex "[0:v]scale=1280:720,setsar=1[l];[1:v]scale=1280:720,setsar=1[r];[l][r]hstack;[0][1]amix" output-1280x720.mp4


    


    Currently I am facing 2 issues with this command.

    


      

    1. The output mp4 file is very big, in this case approximately 140 MB for a video of less than 3 minutes.

    2. How do I add a delay to any video before starting to merge? Currently the videos go out of sync if the participants don't all join at the same time, which is highly unlikely in a real-world scenario.


    


    Any pointer in the right direction will be highly appreciated.

    


    Here is a log sample from FFmpeg (or see the full log link):

    


    ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
  configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, matroska,webm, from '3.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 00:01:39.63, start: 0.000000, bitrate: 707 kb/s
    Stream #0:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 1280x720, SAR 1:1 DAR 16:9, 1k tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 00:01:39.618000000
    Stream #0:1: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:01:39.629000000
Input #1, matroska,webm, from '4.webm':
  Metadata:
    title           : FFmpeg
    ENCODER         : Lavf58.29.100
  Duration: 00:02:39.07, start: 0.000000, bitrate: 708 kb/s
    Stream #1:0: Video: vp8, yuv420p(tv, bt470bg/unknown/unknown, progressive), 1280x720, SAR 1:1 DAR 16:9, 1k tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 00:02:39.050000000
    Stream #1:1: Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 00:02:39.068000000
Stream mapping:
  Stream #0:0 (vp8) -> scale
  Stream #0:1 (opus) -> amix:input0
  Stream #1:0 (vp8) -> scale
  Stream #1:1 (opus) -> amix:input1
  hstack -> Stream #0:0 (libx264)
  amix -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[libx264 @ 0x562b4842a500] using SAR=1/1
[libx264 @ 0x562b4842a500] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x562b4842a500] profile High, level 6.1
[libx264 @ 0x562b4842a500] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output-new.mp4':
  Metadata:
    title           : FFmpeg
    encoder         : Lavf58.29.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 2560x720 [SAR 1:1 DAR 32:9], q=-1--1, 1k fps, 16k tbn, 1k tbc (default)
    Metadata:
      encoder         : Lavc58.54.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      encoder         : Lavc58.54.100 aac

frame=  129 fps=0.0 q=33.0 size=       0kB time=00:00:00.23 bitrate=   1.6kbits/s dup=123 drop=0 speed=0.44x    
frame=  257 fps=228 q=33.0 size=       0kB time=00:00:00.51 bitrate=   0.8kbits/s dup=243 drop=0 speed=0.455x    
frame=  379 fps=224 q=33.0 size=     256kB time=00:00:00.73 bitrate=2855.1kbits/s dup=358 drop=0 speed=0.434x    
frame=  497 fps=222 q=33.0 size=     256kB time=00:00:00.86 bitrate=2431.5kbits/s dup=469 drop=0 speed=0.386x    
 
...
More than 1000 frames duplicated
...
  
frame=158751 fps=196 q=33.0 size=  134656kB time=00:02:39.00 bitrate=6937.4kbits/s dup=151385 drop=0 speed=0.196x    
frame=158851 fps=196 q=33.0 size=  134912kB time=00:02:39.00 bitrate=6950.6kbits/s dup=151482 drop=0 speed=0.196x    
frame=158983 fps=196 q=33.0 size=  134912kB time=00:02:39.00 bitrate=6950.6kbits/s dup=151610 drop=0 speed=0.196x    
frame=159081 fps=196 q=-1.0 Lsize=  137197kB time=00:02:39.07 bitrate=7065.2kbits/s dup=151706 drop=0 speed=0.196x    

video:132693kB audio:2494kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.486001%

[libx264 @ 0x562b4842a500] frame I:637   Avg QP:17.73  size:123895
[libx264 @ 0x562b4842a500] frame P:40088 Avg QP:19.73  size:  1134
[libx264 @ 0x562b4842a500] frame B:118356 Avg QP:27.54  size:    97
[libx264 @ 0x562b4842a500] consecutive B-frames:  0.8%  0.0%  0.0% 99.2%
[libx264 @ 0x562b4842a500] mb I  I16..4: 11.1% 67.3% 21.6%
[libx264 @ 0x562b4842a500] mb P  I16..4:  0.1%  0.1%  0.0%  P16..4:  2.6%  0.4%  0.3%  0.0%  0.0%    skip:96.5%
[libx264 @ 0x562b4842a500] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  0.7%  0.0%  0.0%  direct: 0.0%  skip:99.3%  L0:38.7% L1:61.3% BI: 0.0%
[libx264 @ 0x562b4842a500] 8x8 transform intra:66.8% inter:71.4%
[libx264 @ 0x562b4842a500] coded y,uvDC,uvAC intra: 81.8% 89.5% 72.3% inter: 0.2% 0.4% 0.0%
[libx264 @ 0x562b4842a500] i16 v,h,dc,p: 25% 21% 17% 37%
[libx264 @ 0x562b4842a500] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 33% 22% 12%  4%  5%  6%  6%  6%  6%
[libx264 @ 0x562b4842a500] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 42% 24%  6%  4%  5%  5%  6%  4%  4%
[libx264 @ 0x562b4842a500] i8c dc,h,v,p: 42% 24% 26%  9%
[libx264 @ 0x562b4842a500] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x562b4842a500] ref P L0: 82.4% 11.5%  5.3%  0.8%
[libx264 @ 0x562b4842a500] ref B L0: 83.0% 16.9%  0.1%
[libx264 @ 0x562b4842a500] ref B L1: 94.9%  5.1%
[libx264 @ 0x562b4842a500] kb/s:6833.11
[aac @ 0x562b4842b540] Qavg: 239.393
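
    The log above reports "1k fps" for the output stream and dup=151706 out of 159081 frames, so one plausible contributor to the file size is that the 1k tbr inputs are being expanded to roughly 1000 frames per second. The sketch below, in Python, builds a command that forces a constant frame rate with the fps filter and delays late joiners with tpad (video) and adelay (audio). The function name, the join_offsets mapping, the 30 fps target, and the CRF value are all illustrative assumptions, and the recording server would have to supply the join times.

```python
from typing import Dict, List


def build_merge_command(join_offsets: Dict[str, float], output: str,
                        fps: int = 30, crf: int = 28) -> List[str]:
    """Build an ffmpeg argument list that tiles recordings side by side.

    join_offsets maps each input file to the number of seconds after the
    start of the conference at which that participant joined (hypothetical
    metadata, not something ffmpeg can discover on its own).
    """
    args = ["ffmpeg", "-y"]
    chains = []
    video_labels = []
    audio_labels = []
    for index, (path, offset) in enumerate(join_offsets.items()):
        args += ["-i", path]
        delay_ms = int(round(offset * 1000))
        # Constant frame rate first, then the same scaling as the original
        # command, then black padding for the time before the join.
        chains.append(
            f"[{index}:v]fps={fps},scale=1280:720,setsar=1,"
            f"tpad=start_duration={offset}[v{index}]"
        )
        # adelay takes one delay per channel, in milliseconds (stereo here).
        chains.append(f"[{index}:a]adelay={delay_ms}|{delay_ms}[a{index}]")
        video_labels.append(f"[v{index}]")
        audio_labels.append(f"[a{index}]")
    count = len(join_offsets)
    chains.append("".join(video_labels) + f"hstack=inputs={count}[v]")
    chains.append("".join(audio_labels) + f"amix=inputs={count}[a]")
    args += ["-filter_complex", ";".join(chains),
             "-map", "[v]", "-map", "[a]",
             "-c:v", "libx264", "-crf", str(crf), output]
    return args


command = build_merge_command({"1.webm": 0.0, "2.webm": 12.5}, "output.mp4")
print(" ".join(command))
```

    The returned list can be passed to subprocess.run without a shell. Whether a given frame rate and CRF give an acceptable size and quality has to be checked on real recordings.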


    


  • The problems with wavelets

    27 February 2010, by Dark Shikari - DCT, Dirac, Snow, psychovisual optimizations, wavelets

    I have periodically noted in this blog and elsewhere various problems with wavelet compression, but many readers have requested that I write a more detailed post about it, so here it is.

    Wavelets have been researched for quite some time as a replacement for the standard discrete cosine transform used in most modern video compression. Their methodology is basically opposite: each coefficient in a DCT represents a constant pattern applied to the whole block, while each coefficient in a wavelet transform represents a single, localized pattern applied to a section of the block. Accordingly, wavelet transforms are usually very large with the intention of taking advantage of large-scale redundancy in an image. DCTs are usually quite small and are intended to cover areas of roughly uniform patterns and complexity.
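
    The contrast between a whole-block pattern and a localized pattern can be made concrete in one dimension. The following Python sketch compares a DCT-II basis function with a Haar wavelet on an 8-sample block; Haar is chosen only because it is the simplest wavelet, not because any codec mentioned here uses it, and the helper names are illustrative.

```python
import math


def dct_basis(k: int, n: int) -> list:
    """k-th DCT-II basis function sampled on a block of n pixels."""
    return [math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]


def haar_basis(level: int, shift: int, n: int) -> list:
    """Haar wavelet covering n / 2**level samples, starting at shift * support."""
    support = n >> level
    start = shift * support
    half = support // 2
    out = [0.0] * n
    for i in range(start, start + half):
        out[i] = 1.0   # positive half of the wavelet
    for i in range(start + half, start + support):
        out[i] = -1.0  # negative half of the wavelet
    return out


def support_size(basis, eps: float = 1e-9) -> int:
    """Number of samples on which the basis function is non-zero."""
    return sum(1 for v in basis if abs(v) > eps)


n = 8
print(support_size(dct_basis(3, n)))      # 8: the pattern spans the whole block
print(support_size(haar_basis(2, 1, n)))  # 2: the pattern touches two samples
```

    Changing one DCT coefficient therefore alters every pixel of the block, while changing one fine-scale wavelet coefficient alters only a small neighbourhood.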

    Both are complete transforms, offering equally accurate frequency-domain representations of pixel data. I won’t go into the mathematical details of each here; the real question is whether one offers better compression opportunities for real-world video.

    DCT transforms, though it isn’t mathematically required, are usually found as block transforms, handling a single sharp-edged block of data. Accordingly, they usually need a deblocking filter to smooth the edges between DCT blocks. Wavelet transforms typically overlap, avoiding such a need. But because wavelets don’t cover a sharp-edged block of data, they don’t compress well when the predicted data is in the form of blocks.

    Thus motion compensation is usually performed as overlapped-block motion compensation (OBMC), in which every pixel is calculated by performing the motion compensation of a number of blocks and averaging the result based on the distance of those blocks from the current pixel. Another option, which can be combined with OBMC, is “mesh MC”, where every pixel gets its own motion vector, which is a weighted average of the closest nearby motion vectors. The end result of either is the elimination of sharp edges between blocks and better prediction, at the cost of greatly increased CPU requirements. For an overlap factor of 2, it’s 4 times the amount of motion compensation, plus the averaging step. With mesh MC, it’s even worse, with SIMD optimizations becoming nearly impossible.
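
    As a rough illustration of the cost, here is a one-dimensional OBMC sketch in Python. The triangular window, the integer motion vectors, and the function name are assumptions made for the example; they are not taken from Snow or Dirac.

```python
def obmc_predict(reference, vectors, block):
    """Predict len(vectors) * block pixels with overlapped windows.

    Each block's motion vector is applied over a window twice the block
    width (overlap factor 2) and weighted by a triangular window.
    Assumes an even block size.
    """
    n = len(vectors) * block
    acc = [0.0] * n
    weight_sum = [0.0] * n
    fetches = 0
    for b, mv in enumerate(vectors):
        center = b * block + (block - 1) / 2
        for x in range(b * block - block // 2, (b + 1) * block + block // 2):
            if not 0 <= x < n:
                continue  # the window sticks out past the frame edge
            w = 1.0 - abs(x - center) / block  # closer blocks weigh more
            src = min(max(x + mv, 0), len(reference) - 1)  # clamp at the edges
            acc[x] += w * reference[src]
            weight_sum[x] += w
            fetches += 1
    prediction = [a / s for a, s in zip(acc, weight_sum)]
    return prediction, fetches


reference = [float(v) for v in range(16)]
prediction, fetches = obmc_predict(reference, [0, 0, 2, 2], block=4)
print(prediction)
print(fetches / len(prediction))  # motion-compensated fetches per output pixel
```

    Away from the frame edges every pixel is fetched twice in this one-dimensional case. In two dimensions the same overlap applies along both axes, so an interior pixel is covered by 2 x 2 = 4 windows, which matches the fourfold cost quoted above.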

    At this point, it would seem wavelets would have pretty big advantages: when used with OBMC, they have better inter prediction, eliminate the need for deblocking, and take advantage of larger-scale correlations. Why, then, hasn’t everyone switched over to wavelets? Dirac and Snow offer modern implementations. Yet despite decades of research, wavelets have consistently disappointed for image and video compression. It turns out there are a lot of serious practical issues with wavelets, many of which are open problems.

    1. No known method exists for efficient intra coding. H.264’s spatial intra prediction is extraordinarily powerful, but relies on knowing the exact decoded pixels to the top and left of the current block. Since there is no such boundary in overlapped-wavelet coding, such prediction is impossible. Newer intra prediction methods, such as markov-chain intra prediction, also seem to require an H.264-like situation with exactly-known neighboring pixels. Intra coding in wavelets is in the same state that DCT intra coding was in 20 years ago: the best known method was to simply transform the block with no prediction at all besides DC. NB: as described by Pengvado in the comments, the switching between inter and intra coding is potentially even more costly than the inefficient intra coding.

    2. Mixing partition sizes has serious practical problems. Because the overlap between two motion partitions depends on the partitions’ size, mixing block sizes becomes quite difficult to define. While in H.264 a smaller partition always gives equal or better compression than a larger one when one ignores the extra overhead, it is actually possible for a larger partition to win when using OBMC due to the larger overlap. All of this makes both the problem of defining the result of mixed block sizes and making decisions about them very difficult.

    Both Snow and Dirac offer variable block size, but the overlap amount is constant; larger blocks serve only to save bits on motion vectors, not offer better overlap characteristics.

    3. Lack of spatial adaptive quantization. As shown in x264 with VAQ, and correspondingly in HCEnc’s implementation and Theora’s recent implementation, spatial adaptive quantization has staggeringly impressive effects on visual quality. Only Dirac seems to have such a feature, and the encoder doesn’t even use it. No other wavelet formats (Snow, JPEG2K, etc) seem to have such a feature. This results in serious blurring problems in areas with subtle texture (as in the comparison below).

    4. Wavelets don’t seem to code visual energy effectively. Remember that a single coefficient in a DCT represents a pattern which applies across an entire block: this makes it very easy to create apparent “detail” with a DCT. Furthermore, the sharp edges of DCT blocks, despite being an apparent weakness, often result in a “fake sharpness” that can actually improve the visual appearance of videos, as was seen with Xvid. Thus wavelet codecs have a tendency to look much blurrier than DCT-based codecs, but since PSNR likes blur, this is often seen as a benefit during video compression research. Some of the consequences of these factors can be seen in this comparison; it is somewhat outdated and not general-case, but it very effectively shows the difference in how wavelets handle sharp edges and subtle textures.
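
    The remark that PSNR likes blur can be checked with a toy signal. The Python sketch below compares two hypothetical decodes of a fine texture: one flattened to its mean, and one that keeps the texture but shifts it by a pixel. The signal and the helper name are made up for the illustration.

```python
import math


def psnr(original, decoded, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Fine texture: pixels alternate 10 levels above and below mid-grey.
texture = [128 + (10 if i % 2 == 0 else -10) for i in range(64)]
# "Blurry" decode: the texture is flattened to its mean value.
blurred = [128] * 64
# "Sharp" decode: the same texture energy, displaced by one pixel.
shifted = texture[1:] + texture[:1]

print(round(psnr(texture, blurred), 2))
print(round(psnr(texture, shifted), 2))
```

    The flattened decode scores about 6 dB higher even though it has lost all of the texture, while the shifted decode has the same amount of detail as the original. The metric rewards the result that a viewer would be more likely to call blurry.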

    Another problem that periodically crops up is the visual aliasing that tends to be associated with wavelets at lower bitrates. Standard wavelets effectively consist of a recursive function that upscales the coefficients coded by the previous level by a factor of 2 and then adds a new set of coefficients. If the upscaling algorithm is naive — as it often is, for the sake of speed — the result can look quite ugly, as if parts of the image were coded at a lower resolution and then badly scaled up. Of course, it looks like that because they were coded at a lower resolution and then badly scaled up.
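
    The recursive structure described above is easiest to see with the Haar wavelet, where the upscaling step is plain sample repetition. The Python sketch below is a minimal one-level example under that assumption; practical codecs use longer filters, so this shows only the mechanism.

```python
def haar_forward(signal):
    """One level: pairwise averages (low band) and half-differences (details)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Upscale the low band by 2 (sample repetition), then add the details."""
    out = []
    for average, detail in zip(low, high):
        out += [average + detail, average - detail]
    return out


ramp = [0, 1, 2, 3, 4, 5, 6, 7]
low, high = haar_forward(ramp)
exact = haar_inverse(low, high)                # details kept: ramp is recovered
coarse = haar_inverse(low, [0.0] * len(high))  # details quantized to zero
print(exact)
print(coarse)  # [0.5, 0.5, 2.5, 2.5, 4.5, 4.5, 6.5, 6.5]
```

    With the detail coefficients zeroed, as can happen at low bitrates, the smooth ramp comes back as a staircase: the half-resolution signal with each sample repeated.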

    JPEG2000 is a classic example of wavelet failure: despite having more advanced entropy coding, being designed much later than JPEG, being much more computationally intensive, and having much better PSNR, comparisons have consistently shown it to be visually worse than JPEG at sane filesizes. Here’s an example from Wikipedia. By comparison, H.264’s intra coding, when used for still image compression, can beat JPEG by a factor of 2 or more (I’ll make a post on this later). With the various advancements in DCT intra coding since H.264, I suspect that a state-of-the-art DCT compressor could win by an even larger factor.

    Despite the promised benefits of wavelets, a wavelet encoder even close to competitive with x264 has yet to be created. With some tests even showing Dirac losing to Theora in visual comparisons, it’s clear that many problems remain to be solved before wavelets can eliminate the ugliness of block-based transforms once and for all.