
Media (1)
-
Rennes Emotion Map 2010-11
19 October 2011
Updated: July 2013
Language: French
Type: Text
Other articles (67)
-
Accepted formats
28 January 2010
The following commands show which formats and codecs the local ffmpeg installation supports:
ffmpeg -codecs
ffmpeg -formats
Accepted input video formats
This list is not exhaustive; it highlights the main formats in use:
- h264: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
- m4v: raw MPEG-4 video format
- flv: Flash Video (FLV) / Sorenson Spark / Sorenson H.263
- Theora
- wmv:
Possible output video formats
As a first step, we (...)
-
Adding notes and captions to images
7 February 2011
To add notes and captions to images, the first step is to install the "Légendes" plugin.
Once the plugin is activated, you can configure it in the configuration area to change the permissions for creating, modifying and deleting notes. By default, only site administrators can add notes to images.
Changes when adding a media item
When adding a media item of type "image", a new button appears above the preview (...)
-
User profiles
12 April 2011
Each user has a profile page where they can edit their personal information. In the default top-of-page menu, a menu item is created automatically when MediaSPIP is initialised; it is visible only when the visitor is logged in to the site.
Users can edit their profile from their author page; a "Modifier votre profil" ("Edit your profile") link in the navigation is (...)
On other sites (7102)
-
Your 6-step guide to increasing acquisition
2 July 2019, by Matomo Core Team (Analytics Tips)
-
FFMPEG overlay by time not working in my case
18 August 2021, by Patel Milan
Image overlay on image, enabled by time


ffmpeg -y -loop 1 -i .\1080.png -i .\021.jpg -i .\022.jpg -i .\023.jpg -filter_complex " [1:v]scale=534:810[a]; [2:v]scale=534:810[b]; [3:v]scale=534:810[c]; [0:v][a] overlay=10:8:enable='between(t,0,8)'[o1]; [o1][b] overlay=264:778:enable='between(t,1,8)'[o2]; [o2][c] overlay=534:1524:enable='between(t,2,8)'[o3]" -map "[o3]" -t 8 outImageOverlay.mp4
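The filter graph in the command above follows a regular pattern: scale each overlay image, then chain one overlay per image, each with its own time window. As a minimal sketch, that string could be generated rather than typed by hand. The function name and the intermediate labels `s0`, `s1`, ... are hypothetical (the original command uses `a`, `b`, `c`); the positions and time windows are the ones from the command.

```python
def build_overlay_graph(base_label, overlays, scale=(534, 810)):
    """Build a -filter_complex string that chains timed overlays.

    base_label: label of the background stream, e.g. "0:v".
    overlays:   list of (input_index, x, y, start, end) tuples.
    Returns (graph_string, final_output_label).
    """
    parts = []
    # Step 1: scale every overlay input to the same size.
    for n, (idx, _x, _y, _start, _end) in enumerate(overlays):
        parts.append(f"[{idx}:v]scale={scale[0]}:{scale[1]}[s{n}]")
    # Step 2: chain the overlays, each enabled only inside its time window.
    current = base_label
    for n, (_idx, x, y, start, end) in enumerate(overlays):
        out = f"o{n + 1}"
        parts.append(
            f"[{current}][s{n}]overlay={x}:{y}:"
            f"enable='between(t,{start},{end})'[{out}]"
        )
        current = out
    return ";".join(parts), current


graph, last = build_overlay_graph(
    "0:v",
    [(1, 10, 8, 0, 8), (2, 264, 778, 1, 8), (3, 534, 1524, 2, 8)],
)
print(graph)
print("map:", f"[{last}]")
```

The returned label is what would be passed to `-map`, as `-map "[o3]"` is in the original command.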



Log


ffmpeg version 4.3.1-2021-01-01-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
 built with gcc 10.2.0 (Rev5, Built by MSYS2 project)
 configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
 libavutil 56. 51.100 / 56. 51.100
 libavcodec 58. 91.100 / 58. 91.100
 libavformat 58. 45.100 / 58. 45.100
 libavdevice 58. 10.100 / 58. 10.100
 libavfilter 7. 85.100 / 7. 85.100
 libswscale 5. 7.100 / 5. 7.100
 libswresample 3. 7.100 / 3. 7.100
 libpostproc 55. 7.100 / 55. 7.100
Input #0, png_pipe, from '.\1080.png':
 Duration: N/A, bitrate: N/A
 Stream #0:0: Video: png, rgba(pc), 1080x2340, 25 fps, 25 tbr, 25 tbn, 25 tbc
Input #1, image2, from '.\021.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 286665 kb/s
 Stream #1:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #2, image2, from '.\022.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 337493 kb/s
 Stream #2:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #3, image2, from '.\023.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 298403 kb/s
 Stream #3:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
 Stream #0:0 (png) -> overlay:main
 Stream #1:0 (mjpeg) -> scale
 Stream #2:0 (mjpeg) -> scale
 Stream #3:0 (mjpeg) -> scale
 overlay -> Stream #0:0 (libx264)
Press [q] to stop, [?] for help
[swscaler @ 000001cd0b24d000] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 000001cd0b286080] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 000001cd0b2c9c40] deprecated pixel format used, make sure you did set range correctly
[libx264 @ 000001cd0a848500] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001cd0a848500] profile High, level 5.0, 4:2:0, 8-bit
[libx264 @ 000001cd0a848500] 264 - core 161 r3027 4121277 - H.264/MPEG-4 AVC codec - Copyleft 2003-2020 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'outImageOverlay.mp4':
 Metadata:
 encoder : Lavf58.45.100
 Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1080x2340, q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
 Metadata:
 encoder : Lavc58.91.100 libx264
 Side data:
 cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 200 fps= 43 q=-1.0 Lsize= 208kB time=00:00:07.88 bitrate= 216.6kbits/s speed= 1.7x
video:205kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.538915%
[libx264 @ 000001cd0a848500] frame I:1 Avg QP:10.61 size: 68492
[libx264 @ 000001cd0a848500] frame P:50 Avg QP:16.79 size: 2508
[libx264 @ 000001cd0a848500] frame B:149 Avg QP:29.02 size: 104
[libx264 @ 000001cd0a848500] consecutive B-frames: 0.5% 0.0% 1.5% 98.0%
[libx264 @ 000001cd0a848500] mb I I16..4: 82.0% 10.1% 7.9%
[libx264 @ 000001cd0a848500] mb P I16..4: 0.0% 0.5% 0.1% P16..4: 1.0% 0.0% 0.1% 0.0% 0.0% skip:98.2%
[libx264 @ 000001cd0a848500] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 0.3% 0.0% 0.0% direct: 0.0% skip:99.7% L0:39.1% L1:60.9% BI: 0.0%
[libx264 @ 000001cd0a848500] 8x8 transform intra:28.0% inter:98.0%
[libx264 @ 000001cd0a848500] coded y,uvDC,uvAC intra: 35.8% 31.1% 23.2% inter: 0.0% 0.1% 0.0%
[libx264 @ 000001cd0a848500] i16 v,h,dc,p: 98% 1% 0% 1%
[libx264 @ 000001cd0a848500] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 17% 15% 7% 7% 9% 8% 8% 11%
[libx264 @ 000001cd0a848500] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 21% 9% 7% 10% 9% 11% 6% 9%
[libx264 @ 000001cd0a848500] i8c dc,h,v,p: 76% 10% 10% 4%
[libx264 @ 000001cd0a848500] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 000001cd0a848500] ref P L0: 90.9% 0.7% 7.0% 1.3%
[libx264 @ 000001cd0a848500] ref B L0: 57.1% 42.0% 0.9%
[libx264 @ 000001cd0a848500] ref B L1: 93.2% 6.8%
[libx264 @ 000001cd0a848500] kb/s:209.46



Overlay Video on Image Command


ffmpeg -y -i love.mp4 -i .\1080.png -i .\021.jpg -i .\022.jpg -i .\023.jpg -loop 1 -i .\020.jpg -filter_complex " [2:v]scale=534:810[a]; [3:v]scale=534:810[b]; [4:v]scale=534:810[c]; [5:v]scale=8000:4000,zoompan=z='min(zoom+0.0020,1.5)':d=417:s=1080x2340,setsar=1[d]; [0:v]scale=1080x2340,setdar=1080:2340,colorkey=0x1CD51A:0.3:0.2[ckout]; [1:v][a] overlay=10:8:enable='between(t,0,8)'[o1]; [o1][b] overlay=264:778:enable='between(t,1,8)'[o2]; [o2][c]overlay=534:1524:enable='between(t,2,8)'[o3]; [d][o3]overlay[o4]; [o4][ckout]overlay[o5]" -map "[o5]" -pix_fmt yuvj422p -t 8 outvideoOverlayInImage.mp4



Log


ffmpeg version 4.3.1-2021-01-01-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
 built with gcc 10.2.0 (Rev5, Built by MSYS2 project)
 configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
 libavutil 56. 51.100 / 56. 51.100
 libavcodec 58. 91.100 / 58. 91.100
 libavformat 58. 45.100 / 58. 45.100
 libavdevice 58. 10.100 / 58. 10.100
 libavfilter 7. 85.100 / 7. 85.100
 libswscale 5. 7.100 / 5. 7.100
 libswresample 3. 7.100 / 3. 7.100
 libpostproc 55. 7.100 / 55. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'love.mp4':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: isommp42
 creation_time : 2021-08-17T05:35:07.000000Z
 com.android.version: 11
 Duration: 00:00:06.93, start: 0.000000, bitrate: 538 kb/s
 Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 196 kb/s (default)
 Metadata:
 creation_time : 2021-08-17T05:35:07.000000Z
 handler_name : SoundHandle
 Stream #0:1(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p(tv, smpte170m/bt470bg/smpte170m), 1080x2340, 334 kb/s, SAR 1:1 DAR 6:13, 25 fps, 25 tbr, 90k tbn, 180k tbc (default)
 Metadata:
 creation_time : 2021-08-17T05:35:07.000000Z
 handler_name : VideoHandle
Input #1, png_pipe, from '.\1080.png':
 Duration: N/A, bitrate: N/A
 Stream #1:0: Video: png, rgba(pc), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #2, image2, from '.\021.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 286665 kb/s
 Stream #2:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #3, image2, from '.\022.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 337493 kb/s
 Stream #3:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #4, image2, from '.\023.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 298403 kb/s
 Stream #4:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 tbr, 25 tbn, 25 tbc
Input #5, image2, from '.\020.jpg':
 Duration: 00:00:00.04, start: 0.000000, bitrate: 184663 kb/s
 Stream #5:0: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 1080x2340, 25 fps, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
 Stream #0:1 (h264) -> scale
 Stream #1:0 (png) -> overlay:main
 Stream #2:0 (mjpeg) -> scale
 Stream #3:0 (mjpeg) -> scale
 Stream #4:0 (mjpeg) -> scale
 Stream #5:0 (mjpeg) -> scale
 overlay -> Stream #0:0 (libx264)
Press [q] to stop, [?] for help
[swscaler @ 00000230595cff40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0000023059727e80] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 00000230597768c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 00000230597c3c80] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 00000230597faec0] No accelerated colorspace conversion found from yuv420p to argb.
[swscaler @ 0000023059884cc0] deprecated pixel format used, make sure you did set range correctly
[libx264 @ 00000230536e2900] using SAR=1/1
[libx264 @ 00000230536e2900] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 00000230536e2900] profile High 4:2:2, level 5.0, 4:2:2, 8-bit
[libx264 @ 00000230536e2900] 264 - core 161 r3027 4121277 - H.264/MPEG-4 AVC codec - Copyleft 2003-2020 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'outvideoOverlayInImage.mp4':
 Metadata:
 major_brand : mp42
 minor_version : 0
 compatible_brands: isommp42
 com.android.version: 11
 encoder : Lavf58.45.100
 Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuvj422p(pc), 1080x2340 [SAR 1:1 DAR 6:13], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
 Metadata:
 encoder : Lavc58.91.100 libx264
 Side data:
 cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 200 fps= 11 q=-1.0 Lsize= 1411kB time=00:00:07.88 bitrate=1467.1kbits/s speed=0.435x
video:1408kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.226583%
[libx264 @ 00000230536e2900] frame I:1 Avg QP:20.68 size:121139
[libx264 @ 00000230536e2900] frame P:50 Avg QP:20.09 size: 15622
[libx264 @ 00000230536e2900] frame B:149 Avg QP:24.04 size: 3617
[libx264 @ 00000230536e2900] consecutive B-frames: 0.5% 0.0% 1.5% 98.0%
[libx264 @ 00000230536e2900] mb I I16..4: 8.9% 81.4% 9.7%
[libx264 @ 00000230536e2900] mb P I16..4: 1.5% 1.9% 0.3% P16..4: 21.8% 8.1% 4.7% 0.0% 0.0% skip:61.6%
[libx264 @ 00000230536e2900] mb B I16..4: 0.1% 0.1% 0.0% B16..8: 26.8% 0.7% 0.1% direct: 0.3% skip:72.1% L0:46.3% L1:53.3% BI: 0.3%
[libx264 @ 00000230536e2900] 8x8 transform intra:59.8% inter:84.9%
[libx264 @ 00000230536e2900] coded y,uvDC,uvAC intra: 30.0% 30.8% 13.5% inter: 3.0% 2.2% 0.1%
[libx264 @ 00000230536e2900] i16 v,h,dc,p: 66% 30% 2% 2%
[libx264 @ 00000230536e2900] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 12% 43% 3% 3% 3% 3% 2% 3%
[libx264 @ 00000230536e2900] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 39% 19% 11% 5% 6% 5% 6% 4% 5%
[libx264 @ 00000230536e2900] i8c dc,h,v,p: 57% 19% 21% 3%
[libx264 @ 00000230536e2900] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 00000230536e2900] ref P L0: 73.3% 11.4% 8.9% 6.4%
[libx264 @ 00000230536e2900] ref B L0: 96.2% 3.1% 0.7%
[libx264 @ 00000230536e2900] ref B L1: 95.5% 4.5%
[libx264 @ 00000230536e2900] kb/s:1441.17



Input images and output videos


- 020.jpg
- 021.jpg
- 022.jpg
- 023.jpg
- 1080.png
- outImageOverlay
- outvideoOverlayInImage.mp4
- love.mp4

The image-on-image overlay works with the time-based enable option, but the same overlays do not work correctly when the video is overlaid onto the image.
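As a small aid to reasoning about the time windows, the sketch below mimics ffmpeg's `between(x, min, max)` expression, which is inclusive at both ends, and lists which of the three overlays (`a`, `b`, `c`, as labelled in the commands) are enabled at a given timestamp. One difference between the two commands that may be relevant, offered as an observation and not a confirmed diagnosis: `-loop 1` applies only to the input that follows it, so it covers `.\1080.png` in the first command but only `.\020.jpg` in the second, where the base image is therefore read as a single frame at t=0.

```python
def between(x, lo, hi):
    """Mimic ffmpeg's between(): 1 if lo <= x <= hi, else 0."""
    return 1 if lo <= x <= hi else 0


# Time windows copied from the enable='between(t,...)' options in the commands.
WINDOWS = {"a": (0, 8), "b": (1, 8), "c": (2, 8)}


def enabled_overlays(t):
    """Return the overlay labels whose window contains timestamp t (seconds)."""
    return [label for label, (start, end) in WINDOWS.items()
            if between(t, start, end)]


for t in (0, 1, 2, 7.96):
    print(t, enabled_overlays(t))
```

If the base stream only ever delivers a frame at t=0, only the first window is active for it, which would be consistent with the later overlays never appearing; this remains an assumption until tested against the actual files.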


-
IJG swings again, and misses
1 February 2010, by Mans (Multimedia)
Earlier this month the IJG unleashed version 8 of its ubiquitous libjpeg library on the world. Eager to try out the “major breakthrough in image coding technology” promised in the README file accompanying v7, I downloaded the release. A glance at the README file suggests something major indeed is afoot:
Version 8.0 is the first release of a new generation JPEG standard to overcome the limitations of the original JPEG specification.
The text also hints at the existence of a document detailing these marvellous new features, and a Google search later a copy has found its way onto my monitor. As I read, however, my state of mind shifts from an initial excited curiosity, through bewilderment and disbelief, finally arriving at pure merriment.
Already on the first page it becomes clear no new JPEG standard in fact exists. All we have is an unsolicited proposal sent to the ITU-T by members of the IJG. Realising that even the most brilliant of inventions must start off as mere proposals, I carry on reading. The summary informs me that I am about to witness the introduction of three extensions to the T.81 JPEG format:
- An alternative coefficient scan sequence for DCT coefficient serialization
- A SmartScale extension in the Start-Of-Scan (SOS) marker segment
- A Frame Offset definition in or in addition to the Start-Of-Frame (SOF) marker segment
Together these three extensions will, it is promised, “bring DCT based JPEG back to the forefront of state-of-the-art image coding technologies.”
Alternative scan
The first of the proposed extensions introduces an alternative DCT coefficient scan sequence to be used in place of the zigzag scan employed in most block transform based codecs.
Alternative scan sequence
The advantage of this scan would be that combined with the existing progressive mode, it simplifies decoding of an initial low-resolution image which is enhanced through subsequent passes. The author of the document calls this scheme “image-pyramid/hierarchical multi-resolution coding.” It is not immediately obvious to me how this constitutes even a small advance in image coding technology.
At this point I am beginning to suspect that our friend from the IJG has been trapped in a half-world between interlaced GIF images transmitted down noisy phone lines and today’s inferno of SVC, MVC, and other buzzwords.
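For reference, the conventional zigzag scan that the proposal wants to replace can be generated in a few lines. This is a minimal sketch of the standard zigzag order only; the proposed alternative sequence is not reproduced in the text, so it is not modelled here. The function name is hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    # Walk the anti-diagonals; row + col is constant on each one.
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals run bottom-left to top-right, odd ones the reverse.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


scan = zigzag_order()
print(scan[:6])
```

The scan visits low-frequency coefficients (near the top-left) first, which is why truncating it early yields a coarse version of the block.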
(Not so) SmartScale
Disguised behind this camel-cased moniker we encounter a method which, we are told, will provide better image quality at high compression ratios. The author has combined two well-known (to us) properties in a (to him) clever way.
The first property concerns the perceived impact of different types of distortion in an image. When encoding with JPEG, as the quantiser is increased, the decoded image becomes ever more blocky. At a certain point, a better subjective visual quality can be achieved by down-sampling the image before encoding it, thus allowing a lower quantiser to be used. If the decoded image is scaled back up to the original size, the unpleasant, blocky appearance is replaced with a smooth blur.
The second property belongs to the DCT where, as we all know, the top-left (DC) coefficient is the average of the entire block, its neighbours represent the lowest frequency components etc. A top-left-aligned subset of the coefficient block thus represents a low-resolution version of the full block in the spatial domain.
In his flash of genius, our hero came up with the idea of using the DCT for down-scaling the image. Unfortunately, he appears to possess precious little knowledge of sampling theory and human visual perception. Any block-based resampling will inevitably produce sharp artefacts along the block edges. The human visual system is particularly sensitive to sharp edges, so this is one of the most unwanted types of distortion in an encoded image.
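The DCT property described above can be checked numerically. The sketch below uses an orthonormal DCT-II written in plain Python (an assumption for illustration; libjpeg's own scaled integer DCT is not reproduced here). It shows that the top-left coefficient of an 8×8 block is proportional to the block average, and that inverse-transforming the top-left 4×4 subset gives a half-size block with the same mean. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (n x n)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT of a square block: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def dct_downscale(block, m):
    """Downscale an n x n block to m x m by keeping the top-left m x m coefficients."""
    n = len(block)
    coeffs = dct2(block)
    # Rescale by m/n so that the block mean is preserved at the smaller size.
    subset = [[coeffs[u][v] * m / n for v in range(m)] for u in range(m)]
    return idct2(subset)


block = [[float(x + 2 * y) for x in range(8)] for y in range(8)]
mean = sum(map(sum, block)) / 64
print("DC:", dct2(block)[0][0], "8 * mean:", 8 * mean)
small = dct_downscale(block, 4)
print("mean of 4x4 result:", sum(map(sum, small)) / 16)
```

Because each 8×8 block is resampled independently of its neighbours, nothing in this procedure smooths across block boundaries, which is the weakness the text points out.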
Despite the obvious flaws in this approach, I decided to give it a try. After all, the software is already written, allowing downscaling by factors of 8/8..16.
Using a 1280×720 test image, I encoded it with each of the nine scaling options, from unity to half size, each time adjusting the quality parameter for a final encoded file size of no more than 200000 bytes. The following table presents the encoded file size, the libjpeg quality parameter used, and the SSIM metric for each of the images.
Scale  Size    Quality  SSIM
8/8    198462  59       0.940
8/9    196337  70       0.936
8/10   196133  79       0.934
8/11   197179  84       0.927
8/12   193872  89       0.915
8/13   197153  92       0.914
8/14   188334  94       0.899
8/15   198911  96       0.886
8/16   197190  97       0.869
Although the smaller images allowed a higher quality setting to be used, the SSIM value drops significantly. Numbers may of course be misleading, but the images below speak for themselves. These are cut-outs from the full image, the original on the left, unscaled JPEG-compressed in the middle, and JPEG with 8/16 scaling to the right.
Looking at these images, I do not need to hesitate before picking the JPEG variant I prefer.
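The size-constrained quality selection used in the experiment above could be automated with a binary search. This is a minimal sketch assuming that encoded size does not decrease as the quality parameter rises; `encoded_size` stands in for a real encoder call and `toy_size` is a made-up stand-in, not measured data. It does not claim to be the procedure the author actually followed.

```python
from typing import Callable, Optional


def max_quality_within_budget(encoded_size: Callable[[int], int],
                              budget: int,
                              lo: int = 1,
                              hi: int = 100) -> Optional[int]:
    """Return the highest quality in [lo, hi] whose encoded size fits the budget.

    Returns None if even the lowest quality exceeds the budget.
    Assumes encoded_size is non-decreasing in quality.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if encoded_size(mid) <= budget:
            best = mid       # mid fits; try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1     # mid is too large; try a lower quality
    return best


def toy_size(quality: int) -> int:
    """Hypothetical size model used only to exercise the search."""
    return 50_000 + 2_500 * quality


print(max_quality_within_budget(toy_size, 200_000))
```

With a real encoder, `encoded_size` would encode the image at the given quality and scale factor and return the resulting byte count.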
Frame offset
The third and final extension proposed is quite simple and also quite pointless: a top-left cropping to be applied to the decoded image. The alleged utility of this feature would be to enable lossless cropping of a JPEG image. In a typical image workflow, however, JPEG is only used for the final published version, so the need for this feature appears quite far-fetched.
The grand finale
Throughout the text, the author makes references to “the fundamental DCT property for image representation.” In his own words:
This property was found by the author during implementation of the new DCT scaling features and is after his belief one of the most important discoveries in digital image coding after releasing the JPEG standard in 1992.
The secret is to be revealed in an annex to the main text. This annex quotes in full a post by the author to the comp.dsp Usenet group in a thread with the subject “why DCT”. Reading the entire thread proves quite amusing. A few excerpts follow.
The actual reason is much simpler, and therefore apparently very difficult to recognize by complicated-thinking people.
Here is the explanation:
What are people doing when they have a bunch of images and want a quick preview? They use thumbnails! What are thumbnails? Thumbnails are small downscaled versions of the original image! If you want more details of the image, you can zoom in stepwise by enlarging (upscaling) the image.
So with proper understanding of the fundamental DCT property, the MPEG folks could make their videos more scalable, but, as in the case of JPEG, they are unable to recognize this simple but basic property, unfortunately, and pursue rather inferior approaches in actual developments.
These are just phrases, and they don’t explain anything. But this is typical for the current state in this field: The relevant people ignore and deny the true reasons, and thus they turn in a circle and no progress is being made.
However, there are dark forces in action today which ignore and deny any fruitful advances in this field. That is the reason that we didn’t see any progress in JPEG for more than a decade, and as long as those forces dominate, we will see more confusion and less enlightenment. The truth is always simple, and the DCT *is* simple, but this fact is suppressed by established people who don’t want to lose their dubious position.
I believe a trip to the Total Perspective Vortex may be in order. Perhaps his tin-foil hat will save him.