
Recherche avancée
Médias (1)
-
The Great Big Beautiful Tomorrow
28 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Texte
Autres articles (54)
-
Websites made with MediaSPIP
2 mai 2011, parThis page lists some websites based on MediaSPIP.
-
Automated installation script of MediaSPIP
25 avril 2011, parTo overcome the difficulties mainly due to the installation of server side software dependencies, an "all-in-one" installation script written in bash was created to facilitate this step on a server with a compatible Linux distribution.
You must have access to your server via SSH and a root account to use it, which will install the dependencies. Contact your provider if you do not have that.
The documentation of the use of this installation script is available here.
The code of this (...) -
Demande de création d’un canal
12 mars 2010, parEn fonction de la configuration de la plateforme, l’utilisateur peu avoir à sa disposition deux méthodes différentes de demande de création de canal. La première est au moment de son inscription, la seconde, après son inscription en remplissant un formulaire de demande.
Les deux manières demandent les mêmes choses fonctionnent à peu près de la même manière, le futur utilisateur doit remplir une série de champ de formulaire permettant tout d’abord aux administrateurs d’avoir des informations quant à (...)
Sur d’autres sites (5846)
-
Greed is Good ; Greed Works
25 novembre 2010, par Multimedia Mike — VP8Greed, for lack of a better word, is good ; Greed works. Well, most of the time. Maybe.
Picking Prediction Modes
VP8 uses one of 4 prediction modes to predict a 16x16 luma block or 8x8 chroma block before processing it (for luma, a block can also be broken into 16 4x4 blocks for individual prediction using even more modes).So, how to pick the best predictor mode ? I had no idea when I started writing my VP8 encoder. I did not read any literature on the matter ; I just sat down and thought of a brute-force approach. According to the comments in my code :
// naive, greedy algorithm : // residual = source - predictor // mean = mean(residual) // residual -= mean // find the max diff between the mean and the residual // the thinking is that, post-prediction, the best block will // be comprised of similar samples
After removing the predictor from the macroblock, individual 4x4 subblocks are put through a forward DCT and quantized. Optimal compression in this scenario results when all samples are the same since only the DC coefficient will be non-zero. Failing that, when the input samples are at least similar to each other, few of the AC coefficients will be non-zero, which helps compression. When the samples are all over the scale, there aren’t a whole lot of non-zero coefficients unless you crank up the quantizer, which results in poor quality in the reconstructed subblocks.
Thus, my goal was to pick a prediction mode that, when applied to the input block, resulted in a residual in which each element would feature the least deviation from the mean of the residual (relative to other prediction choices).
Greedy Approach
I realized that this algorithm falls into the broad general category of "greedy" algorithms— one that makes locally optimal decisions at each stage. There are most likely smarter algorithms. But this one was good enough for making an encoder that just barely works.Compression Results
I checked the total file compression size on my usual 640x360 Big Buck Bunny logo image while forcing prediction modes vs. using my greedy prediction picking algorithm. In this very simple test, DC-only actually resulted in slightly better compression than the greedy algorithm (which says nothing about overall quality).prediction mode quantizer index = 0 (minimum) quantizer index = 10 greedy 286260 98028 DC 280593 95378 vertical 297206 105316 horizontal 295357 104185 TrueMotion 311660 113480 As another data point, in both quantizer cases, my greedy algorithm selected a healthy mix of prediction modes :
- quantizer index 0 : DC = 521, VERT = 151, HORIZ = 183, TM = 65
- quantizer index 10 : DC = 486, VERT = 167, HORIZ = 190, TM = 77
Size vs. Quality
Again, note that this ad-hoc test only measures one property (a highly objective one)— compression size. It did not account for quality which is a far more controversial topic that I have yet to wade into. -
Basic Video Palette Conversion
How do you take a 24-bit RGB image and convert it to an 8-bit paletted image for the purpose of compression using a codec that requires 8-bit input images ? Seems simple enough and that’s what I’m tackling in this post.
Ask FFmpeg/Libav To Do It
Ideally, FFmpeg / Libav should be able to handle this automatically. Indeed, FFmpeg used to be able to, at least at the time I wrote this post about ZMBV and was unhappy with FFmpeg’s default results. Somewhere along the line, FFmpeg and Libav lost the ability to do this. I suspect it got removed during some swscale refactoring.Still, there’s no telling if the old system would have computed palettes correctly for QuickTime files.
Distance Approach
When I started writing my SMC video encoder, I needed to convert RGB (from PNG files) to PAL8 colorspace. The path of least resistance was to match the pixels in the input image to the default 256-color palette that QuickTime assumes (and is hardcoded into FFmpeg/Libav).How to perform the matching ? Find the palette entry that is closest to a given input pixel, where "closest" is the minimum distance as computed by the usual distance formula (square root of the sum of the squares of the diffs of all the components).
That means for each pixel in an image, check the pixel against 256 palette entries (early termination is possible if an acceptable threshold is met). As you might imagine, this can be a bit time-consuming. I wondered about a faster approach...
Lookup Table
I think this is the approach that FFmpeg used to use, but I went and derived it for myself after studying the default QuickTime palette table. There’s a pattern there— all of the RGB entries are comprised of combinations of 6 values — 0x00, 0x33, 0x66, 0x99, 0xCC, and 0xFF. If you mix and match these for red, green, and blue values, you come up with6 * 6 * 6 = 216
different colors. This happens to be identical to the web-safe color palette.The first (0th) entry in the table is (FF, FF, FF), followed by (FF, FF, CC), (FF, FF, 99), and on down to (FF, FF, 00) when the green component gets knocked down and step and the next color is (FF, CC, FF). The first 36 palette entries in the table all have a red component of 0xFF. Thus, if an input RGB pixel has a red color closest to 0xFF, it must map to one of those first 36 entries.
I created a table which maps indices 0..215 to values from 5..0. Each of the R, G, and B components of an input pixel are used to index into this table and derive 3 indices ri, gi, and bi. Finally, the index into the palette table is given by :
index = ri * 36 + gi * 6 + bi
For example, the pixel (0xFE, 0xFE, 0x01) would yield ri, gi, and bi values of 0, 0, and 5. Therefore :
index = 0 * 36 + 0 * 6 + 5
The palette index is 5, which maps to color (0xFF, 0xFF, 0x00).
Validation
So I was pretty pleased with myself for coming up with that. Now, ideally, swapping out one algorithm for another in my SMC encoder should yield identical results. That wasn’t the case, initially.One problem is that the regulation QuickTime palette actually has 40 more entries above and beyond the typical 216-entry color cube (rounding out the grand total of 256 colors). Thus, using the distance approach with the full default table provides for a little more accuracy.
However, there still seems to be a problem. Let’s check our old standby, the Big Buck Bunny logo image :
Distance approach using the full 256-color QuickTime default palette
Distance approach using the 216-color palette
Table lookup approach using the 216-color palette
I can’t quite account for that big red splotch there. That’s the most notable difference between images 1 and 2 and the only visible difference between images 2 and 3.
To prove to myself that the distance approach is equivalent to the table approach, I wrote a Python script to iterate through all possible RGB combinations and verify the equivalence. If you’re not up on your base 2 math, that’s 224 or 16,777,216 colors to run through. I used Python’s multiprocessing module to great effect and really maximized a Core i7 CPU with 8 hardware threads.
So I’m confident that the palette conversion techniques are sound. The red spot is probably attributable to a bug in my WIP SMC encoder.
Source Code
Update August 23, 2011 : Here’s the Python code I used for proving equivalence between the 2 approaches. In terms of leveraging multiple CPUs, it’s possibly the best program I have written to date.PYTHON :-
# !/usr/bin/python
-
-
from multiprocessing import Pool
-
-
palette = []
-
pal8_table = []
-
-
def process_r(r) :
-
counts = []
-
-
for i in xrange(216) :
-
counts.append(0)
-
-
print "r = %d" % (r)
-
for g in xrange(256) :
-
for b in xrange(256) :
-
min_dsqrd = 0xFFFFFFFF
-
best_index = 0
-
for i in xrange(len(palette)) :
-
dr = palette[i][0] - r
-
dg = palette[i][1] - g
-
db = palette[i][2] - b
-
dsqrd = dr * dr + dg * dg + db * db
-
if dsqrd <min_dsqrd :
-
min_dsqrd = dsqrd
-
best_index = i
-
counts[best_index] += 1
-
-
# check if the distance approach deviates from the table-based approach
-
i = best_index
-
r = palette[i][0]
-
g = palette[i][1]
-
b = palette[i][2]
-
ri = pal8_table[r]
-
gi = pal8_table[g]
-
bi = pal8_table[b]
-
table_index = ri * 36 + gi * 6 + bi ;
-
if table_index != best_index :
-
print "(0x%02X 0x%02X 0x%02X) : distance index = %d, table index = %d" % (r, g, b, best_index, table_index)
-
-
return counts
-
-
if __name__ == ’__main__’ :
-
counts = []
-
for i in xrange(216) :
-
counts.append(0)
-
-
# initialize reference palette
-
color_steps = [ 0xFF, 0xCC, 0x99, 0x66, 0x33, 0x00 ]
-
for r in color_steps :
-
for g in color_steps :
-
for b in color_steps :
-
palette.append([r, g, b])
-
-
# initialize palette conversion table
-
for i in range(0, 26) :
-
pal8_table.append(5)
-
for i in range(26, 77) :
-
pal8_table.append(4)
-
for i in range(77, 128) :
-
pal8_table.append(3)
-
for i in range(128, 179) :
-
pal8_table.append(2)
-
for i in range(179, 230) :
-
pal8_table.append(1)
-
for i in range(230, 256) :
-
pal8_table.append(0)
-
-
# create a pool of worker threads and break up the overall job
-
pool = Pool()
-
it = pool.imap_unordered(process_r, range(256))
-
try :
-
while 1 :
-
partial_counts = it.next()
-
for i in xrange(216) :
-
counts[i] += partial_counts[i]
-
except StopIteration :
-
pass
-
-
print "index, count, red, green, blue"
-
for i in xrange(len(counts)) :
-
print "%d, %d, %d, %d, %d" % (i, counts[i], palette[i][0], palette[i][1], palette[i][2])
-
-
FFMPEG - Drawtext or drawbox or overlay on single frame
2 novembre 2011, par waxicalI'm using the avfilters on FFMPEG to drawtext and drawbox. Two of the most poorly documented functions known to man.
I'm struggling to work out how and if I can use this on a single frame. I.e. appear drawtext on frame 22.
Current command :-
ffmpeg -i /home/vtest/test.wmv -y -b 800000 -f flv -vcodec libx264 -vpre default -s 768x432 -g 250 -vf drawtext="fontfile=/home/Cyberbit.ttf:fontsize=24:text=testical:fontcolor=green:x=100:y=200" -qscale 8 -acodec libfaac -sn -vstats /home/testout.flv
Two elements mention here in the documentation are n and t - however I only seem to be able to use them in x and y. Not in text or even as other parameters.
Any help or ffmpeg guidance would be gratefully received.