
Other articles (35)
-
HTML5 audio and video support
10 April 2011
MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flash player Flowplayer is used.
The HTML5 player was created specifically for MediaSPIP: its appearance can be fully customized to match a chosen theme.
These technologies make it possible to deliver video and sound both on conventional computers (...)
-
Configurable image and logo sizes
9 February 2011
In many places on the site, logos and images are resized to fit the slots defined by the themes. Because all of these sizes can change from one theme to another, they can be defined directly in the theme, which spares the user from having to configure them manually after changing the appearance of the site.
These image sizes are also available in the specific configuration of MediaSPIP Core. The maximum size of the site logo in pixels, we allow (...)
-
Farm management
2 March 2010
The farm as a whole is managed by "super admins".
Certain settings can be made to regulate the needs of the different channels.
Initially it uses the "Gestion de mutualisation" plugin
On other sites (4804)
-
FFmpeg delay and mix audio streams while keeping overall volume constant
5 October 2020, by unstuck
I have about 100 audio streams, all with the same intro music/sound, and in some of them the intro is delayed by a few seconds. I want to align and mix all the audio streams such that all the intros play at the same time and the output remains pretty much the same volume throughout. I know in advance how much each stream needs to be delayed by.


Like this in Audacity. Each audio stream is aligned to the intro, and the duration before the intro is arbitrary. (This doesn't solve the volume problem though.)


What I have so far uses adelay and amix. It looks something like this but with more audio streams.

ffmpeg -i 00.oga \
 -i 01.oga \
 -i 02.oga \
 -i 03.oga -filter_complex \
"[0]adelay=delays= 123S:all=1[a0]; \
 [1]adelay=delays= 2718S:all=1[a1]; \
 [2]adelay=delays= 6283185S:all=1[a2]; \
 [3]adelay=delays=11235813S:all=1[a3]; \
 [a0][a1][a2][a3]amix=inputs=4" output.oga



In this example the first stream is delayed by 123 samples, the second by 2 718, the third by 6 283 185, and the fourth by 11 235 813.
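With around 100 streams, writing the filtergraph by hand gets tedious. A minimal sketch of generating it, assuming the delays are already known in samples; the function names build_filter_complex and build_command are hypothetical, and only the adelay and amix options shown in the command above are used:

```python
def build_filter_complex(delays_in_samples):
    """Return an adelay/amix filtergraph string for the given per-stream delays."""
    chains = []
    labels = []
    for index, delay in enumerate(delays_in_samples):
        label = f"a{index}"
        # The S suffix gives the delay in samples; all=1 reuses it for every channel.
        chains.append(f"[{index}]adelay=delays={delay}S:all=1[{label}]")
        labels.append(f"[{label}]")
    mix = "".join(labels) + f"amix=inputs={len(delays_in_samples)}"
    return ";".join(chains + [mix])


def build_command(input_files, delays_in_samples, output_file):
    """Return the ffmpeg argument list (suitable for subprocess.run)."""
    if len(input_files) != len(delays_in_samples):
        raise ValueError("need exactly one delay per input file")
    args = ["ffmpeg"]
    for name in input_files:
        args += ["-i", name]
    args += ["-filter_complex", build_filter_complex(delays_in_samples), output_file]
    return args
```

Passing the result as a list to subprocess.run avoids shell quoting problems with the brackets and semicolons in the filtergraph.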


This works, except at the beginning of the output it's very quiet. When fed n streams, amix makes each stream 1/nth its original volume, which is a good thing in principle. In this case it's not an entirely good thing, because at the beginning of the output 3 of the 4 audio streams are silent (adelay fills delayed streams with silence), meaning the only audible stream is 1/4 = 25% of its original volume. When the second stream becomes audible, the overall volume is 2/4, with three audible streams 3/4, and with all four streams audible it's 4/4 = 100%.
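The arithmetic above can be tabulated for any set of delays. This is a rough sketch, assuming every stream is audible from the end of its delay onward (it ignores streams ending and any silence inside a stream); overall_level_steps is a hypothetical helper name:

```python
from fractions import Fraction


def overall_level_steps(delays_in_samples):
    """Return (start_sample, level) pairs; level is the m/n share of full volume."""
    n = len(delays_in_samples)
    steps = []
    # Each time another delayed stream starts, one more of the n streams is audible.
    for audible, start in enumerate(sorted(delays_in_samples), start=1):
        steps.append((start, Fraction(audible, n)))
    return steps
```

For the four delays in the example this gives 1/4 from sample 123, 1/2 from sample 2718, 3/4 from sample 6283185 and 1 from sample 11235813.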

Instead, I want the first stream to be at 100% volume when it's the only audible one, 50% volume each when there are two audible streams, etc.


Is there a way to make it so when there are n audio streams but m non-silent audio streams, the volume for each of the audio streams is 1/m, not 1/n? amix does this when streams end; if one stream ends it changes the volume of the others from 1/n to 1/(n-1) over a period of time (dropout_transition: https://ffmpeg.org/ffmpeg-filters.html#amix).
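To make the requested 1/m behaviour concrete outside FFmpeg, here is a minimal sketch on plain lists of samples. It is not an FFmpeg feature. It assumes at least one stream, treats a stream as audible from the end of its delay until its last sample, and switches gain instantly instead of ramping the way dropout_transition does. The name mix_active_average is hypothetical.

```python
def mix_active_average(streams, delays):
    """Mix sample lists so each output sample is the mean of the streams audible there."""
    length = max(delay + len(samples) for samples, delay in zip(streams, delays))
    totals = [0.0] * length  # sum of all audible samples at each position
    counts = [0] * length    # m, the number of audible streams at each position
    for samples, delay in zip(streams, delays):
        for offset, value in enumerate(samples):
            totals[delay + offset] += value
            counts[delay + offset] += 1
    # Divide by m rather than by n; positions with no audible stream stay silent.
    return [total / count if count else 0.0 for total, count in zip(totals, counts)]
```

With two full-scale streams where the second is delayed by two samples, the output stays at 1.0 throughout, whereas dividing by n would give 0.5 until the second stream starts.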

I found a similar question where someone wanted to do something like this but only with 2 audio streams. The answer was to split, trim, and change the volume manually. This would be incredibly complicated with 100 audio streams or more, like in my situation.


Is there any easy way to achieve this, even without FFmpeg?


-
ARM inline asm secrets
Although I generally recommend against using GCC inline assembly, preferring instead pure assembly code in separate files, there are occasions where inline is the appropriate solution. Should one, at a time like this, turn to the GCC documentation for guidance, one must be prepared for a degree of disappointment. As it happens, much of the inline asm syntax is left entirely undocumented. This article attempts to fill in some of the blanks for the ARM target.
Constraints
Each operand of an inline asm block is described by a constraint string encoding the valid representations of the operand in the generated assembly. For example the “r” code denotes a general-purpose register. In addition to the standard constraints, ARM allows a number of special codes, only some of which are documented. The full list, including a brief description, is available in the constraints.md file in the GCC source tree. The following table is an extract from this file consisting of the codes which are meaningful in an inline asm block (a few are only useful in the machine description itself).
f    Legacy FPA registers f0-f7.
t    The VFP registers s0-s31.
v    The Cirrus Maverick co-processor registers.
w    The VFP registers d0-d15, or d0-d31 for VFPv3.
x    The VFP registers d0-d7.
y    The Intel iWMMX co-processor registers.
z    The Intel iWMMX GR registers.
l    In Thumb state the core registers r0-r7.
h    In Thumb state the core registers r8-r15.
j    A constant suitable for a MOVW instruction. (ARM/Thumb-2)
b    Thumb only. The union of the low registers and the stack register.
I    In ARM/Thumb-2 state a constant that can be used as an immediate value in a Data Processing instruction. In Thumb-1 state a constant in the range 0 to 255.
J    In ARM/Thumb-2 state a constant in the range -4095 to 4095. In Thumb-1 state a constant in the range -255 to -1.
K    In ARM/Thumb-2 state a constant that satisfies the I constraint if inverted. In Thumb-1 state a constant that satisfies the I constraint multiplied by any power of 2.
L    In ARM/Thumb-2 state a constant that satisfies the I constraint if negated. In Thumb-1 state a constant in the range -7 to 7.
M    In Thumb-1 state a constant that is a multiple of 4 in the range 0 to 1020.
N    Thumb-1 state a constant in the range 0 to 31.
O    In Thumb-1 state a constant that is a multiple of 4 in the range -508 to 508.
Pa   In Thumb-1 state a constant in the range -510 to +510
Pb   In Thumb-1 state a constant in the range -262 to +262
Ps   In Thumb-2 state a constant in the range -255 to +255
Pt   In Thumb-2 state a constant in the range -7 to +7
G    In ARM/Thumb-2 state a valid FPA immediate constant.
H    In ARM/Thumb-2 state a valid FPA immediate constant when negated.
Da   In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with two Data Processing insns.
Db   In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with three Data Processing insns.
Dc   In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with four Data Processing insns. This pattern is disabled if optimizing for space or when we have load-delay slots to fill.
Dn   In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov immediate instruction.
Dl   In ARM/Thumb-2 state a const_vector which can be used with a Neon vorr or vbic instruction.
DL   In ARM/Thumb-2 state a const_vector which can be used with a Neon vorn or vand instruction.
Dv   In ARM/Thumb-2 state a const_double which can be used with a VFP fconsts instruction.
Dy   In ARM/Thumb-2 state a const_double which can be used with a VFP fconstd instruction.
Ut   In ARM/Thumb-2 state an address valid for loading/storing opaque structure types wider than TImode.
Uv   In ARM/Thumb-2 state a valid VFP load/store address.
Uy   In ARM/Thumb-2 state a valid iWMMX load/store address.
Un   In ARM/Thumb-2 state a valid address for Neon doubleword vector load/store instructions.
Um   In ARM/Thumb-2 state a valid address for Neon element and structure load/store instructions.
Us   In ARM/Thumb-2 state a valid address for non-offset loads/stores of quad-word values in four ARM registers.
Uq   In ARM state an address valid in ldrsb instructions.
Q    In ARM/Thumb-2 state an address that is a single base register.

Operand codes
Within the text of an inline asm block, operands are referenced as %0, %1 etc. Register operands are printed as rN, memory operands as [rN, #offset], and so forth. In some situations, for example with operands occupying multiple registers, more detailed control of the output may be required, and once again, an undocumented feature comes to our rescue.
Special code letters inserted between the % and the operand number alter the output from the default for each type of operand. The table below lists the more useful ones.
c    An integer or symbol address without a preceding # sign
B    Bitwise inverse of integer or symbol without a preceding #
L    The low 16 bits of an immediate constant
m    The base register of a memory operand
M    A register range suitable for LDM/STM
H    The highest-numbered register of a pair
Q    The least significant register of a pair
R    The most significant register of a pair
P    A double-precision VFP register
p    The high single-precision register of a VFP double-precision register
q    A NEON quad register
e    The low doubleword register of a NEON quad register
f    The high doubleword register of a NEON quad register
h    A range of VFP/NEON registers suitable for VLD1/VST1
A    A memory operand for a VLD1/VST1 instruction
y    S register as indexed D register, e.g. s5 becomes d2[1]
-
Patent skullduggery: Tandberg rips off x264 algorithm
Update: Tandberg claims they came up with the algorithm independently: to be fair, I can actually believe this to some extent, as I think the algorithm is way too obvious to be patented. Of course, they also claim that the algorithm isn’t actually identical, since they don’t want to lose their patent application.
I still don’t trust them, but it’s possible it’s merely bad research (and thus being unaware of prior art) as opposed to anything malicious. Furthermore, word from within their office suggests they’re quite possibly being honest: supposedly the development team does not read x264 code at all. So this might just all be very bad luck.
Regardless, the patent is still complete tripe, and should never have been filed.
Most importantly, stop harassing the guy whose name is on the patent (Lars): he’s just a programmer, not the management or lawyers responsible for filing the patent. This is stupid and unnecessary. I’ve removed the original post because of this; it can be found here for those who want to read it.
Appendix: the details of the patent:
I figure I’ll go over the exact correspondence between the patent and my code here.
1. A method for calculating run and level representations of quantized transform coefficients representing pixel values included in a block of a video picture, the method comprising:
Translation: It’s a run-level coder.
packing, at a video processing apparatus, each quantized transform coefficients in a value interval [Max, Min] by setting all quantized transform coefficients greater than Max equal to Max, and all quantized transform coefficients less than Min equal to Min
The quantized coefficients are clipped to a certain valid range to allow them to be packed into bytes (they start as 16-bit values).
reordering, at the video processing apparatus, the quantized transform ID coefficients according to a predefined order depending on respective positions in the block resulting in an array C of reordered quantized transform coefficients
This is the zigzag pattern used in H.264 (and most formats) for reordering DCT coefficients. In x264, this is done before the run-level coder step.
masking, at the video processing apparatus, C by generating an array M containing ones in positions corresponding to positions of C having non-zero values, and zeros in positions corresponding to positions of C having zero values
This is creating a bitmask based on the coefficient values, the pmovmskb step.
is generating, at the video processing apparatus, for each position containing a one in M, a run and a level representation by setting the level value equal to an occurring value in a corresponding position of C; and setting, at the video processing apparatus, for each position containing a one in M5 the run value equal to the number of proceeding positions relative to a current position in M since a previous occurrence of one in M.
This is the process of creating run/level values from the bitmask.
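The steps of Claim 1 can be sketched in a few lines. This is only an illustration, not the x264 code or the patented implementation. It assumes the input is already in zigzag order, and the clipping bounds lo and hi are placeholder values chosen to fit a signed byte. The name run_level_encode is hypothetical.

```python
def run_level_encode(coefficients, lo=-128, hi=127):
    """Clip, build a non-zero bitmask, then walk its set bits to emit (run, level) pairs."""
    # Packing step: clamp every coefficient into [lo, hi].
    clipped = [max(lo, min(hi, c)) for c in coefficients]

    # Masking step: bit i of the mask is 1 when coefficient i is non-zero.
    mask = 0
    for position, value in enumerate(clipped):
        if value != 0:
            mask |= 1 << position

    # Run/level step: visit the set bits from lowest to highest.
    pairs = []
    previous = -1
    while mask:
        position = (mask & -mask).bit_length() - 1  # index of the lowest set bit
        run = position - previous - 1               # zeros skipped since the last hit
        pairs.append((run, clipped[position]))
        previous = position
        mask &= mask - 1                            # clear the lowest set bit
    return pairs
```

For the block [5, 0, 0, -3, 1, 0, 0, 0] this returns [(0, 5), (2, -3), (0, 1)].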
Now on to the detailed claims:
2. The method according to Claim 1, wherein the masking further includes, creating an array C from C where positions corresponding to positions of nonzero values in C are filled with ones, and positions corresponding to positions of zero values in C are filled with zeros, and creating M from C by extracting the most significant bit from values in respective position of C and inserting the bits in corresponding positions in M.
They’re extracting the most significant bit of the values to create a bitmask. This is exactly what the pmovmskb in my algorithm does.
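For readers unfamiliar with the instruction, here is a small Python emulation of what pmovmskb does to a vector of bytes: bit i of the result is the most significant bit of byte i. The nonzero_mask helper is a hypothetical stand-in for the preceding compare step and assumes that step leaves 0xFF in non-zero positions and 0x00 elsewhere.

```python
def pmovmskb(byte_values):
    """Collect the most significant bit of each byte into an integer mask."""
    mask = 0
    for index, byte in enumerate(byte_values):
        mask |= ((byte >> 7) & 1) << index
    return mask


def nonzero_mask(byte_values):
    """Mark non-zero bytes as 0xFF (assumed compare result), then gather their top bits."""
    flags = [0xFF if byte != 0 else 0x00 for byte in byte_values]
    return pmovmskb(flags)
```

For example, pmovmskb([0x80, 0x7F, 0xFF, 0x00]) is 0b0101, and nonzero_mask([3, 0, 0, 200]) is 0b1001.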
3. The method according to Claim 2, wherein the creating of the array C is executed by a C++ function PCMPGTB, and the creating of M from C is executed by a C++ function PMOVMSKB.
And here they use pcmpgtb (they call it a C++ function for some reason, but it’s an SSE instruction) to do the clipping of the input values. This is exactly the same method I used in decimate_score. They also use pmovmskb as mentioned.
4. The method according to Claim 1, wherein the generating of the run and level representation further includes determining positions containing non-zero values in C by corresponding positions containing ones in M.
5. The method according to Claim 4, wherein the determining of positions containing non-zero values in C is executed by a C++ function BSF.
Here they iterate over the bitmask of transform coefficients using a “BSF” function to find runs, which is exactly what I did. Of course, BSF isn’t a function, it’s an x86 instruction.
6. The method according to Claim 1, wherein Max is 256 and Min is 0.
This is almost surely a typo or mistake of some sort. They mean the Max should be 255, not 256: 256 doesn’t fit in a uint8_t.
7. The method according to Claim 1, wherein the predefined order follows a zigzag path of transform coefficient positions in the block starting in an upper left corner heading towards a lower right corner.
This is a description of the typical DCT zigzag pattern (like in H.264, MPEG-2, Theora, etc).
Everything after this part is just repeating itself with the phrase “an apparatus” added in order to make the USPTO listen to them.