
Media (1)
-
SPIP - plugins - embed code - Exemple
2 September 2013, by
Updated: September 2013
Language: French
Type: Image
Other articles (87)
-
HTML5 audio and video support
13 April 2011, by
MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
For older browsers the Flowplayer flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...) -
HTML5 audio and video support
10 April 2011
MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flowplayer Flash player is used.
The HTML5 player used was created specifically for MediaSPIP: its appearance is fully customisable to match a chosen theme.
These technologies make it possible to deliver video and sound both on conventional computers (...) -
From upload to the final video [standalone version]
31 January 2010, by
The path of an audio or video document through SPIPMotion is divided into three distinct stages.
Upload and retrieval of information about the source video
First, a SPIP article must be created and the "source" video document attached to it.
When this document is attached to the article, two actions are carried out in addition to the normal behaviour: retrieval of the technical information of the file's audio and video streams; generation of a thumbnail: extraction of a (...)
On other sites (7648)
-
Summary Video Accessibility Talk
23 April 2013, by silvia
I've just got off a call to the UK Digital TV Group, for which I gave a talk on HTML5 video accessibility (slides best viewed in Google Chrome).
The slides provide a high-level summary of the accessibility features that we've developed in the W3C for HTML5 (a markup sketch follows the list), including :
- Subtitles & Captions with WebVTT and the track element
- Video Descriptions with WebVTT, the track element and speech synthesis
- Chapters with WebVTT for semantic navigation
- Audio Descriptions through synchronising an audio track with a video
- Sign Language video synchronized with a main video
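As a rough illustration of how these pieces attach to a video element, here is a minimal markup sketch (file names and labels are placeholders, not taken from the talk):

<video id="main" controls>
  <source src="talk.webm" type="video/webm">
  <!-- Subtitles & captions -->
  <track kind="captions" src="captions-en.vtt" srclang="en" label="English" default>
  <!-- Text video descriptions, to be voiced by script via speech synthesis -->
  <track kind="descriptions" src="descriptions-en.vtt" srclang="en">
  <!-- Chapters for semantic navigation -->
  <track kind="chapters" src="chapters-en.vtt" srclang="en">
</video>
<!-- A sign language version would typically be a second video element
     kept in sync with the main one by script or a media group. -->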
I received some excellent questions.
The obvious one was about why WebVTT and not TTML. While for anyone who has tried to implement TTML support, the advantages of WebVTT should be clear, for some the decision of the browsers to go with WebVTT still seems to be bothersome. The advantages of CSS over XSL-FO in a browser-context are obvious, but not as much outside browsers. So, the simplicity of WebVTT and the clear integration with HTML have to speak for themselves. Conversion between TTML and WebVTT was a feature that was being asked for.
I received a question about how to support ducking (reduce the volume of the main audio track) when using video descriptions. My reply was to either use video descriptions with WebVTT and do ducking during the times that a cue is active, or when using audio descriptions (i.e. actual audio tracks) to add an additional WebVTT file of kind=metadata to mark the intervals in which to do ducking. In both cases some JavaScript will be necessary.
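A minimal sketch of the second option, assuming an extra WebVTT file (here called ducking.vtt) whose cues cover the intervals in which the main audio should be attenuated; the volume levels are arbitrary:

<video id="main" controls>
  <source src="movie.webm" type="video/webm">
  <track id="ducking" kind="metadata" src="ducking.vtt" srclang="en">
</video>
<script>
  var video = document.getElementById('main');
  var track = document.getElementById('ducking').track;
  track.mode = 'hidden';  // load the cues without rendering anything
  track.addEventListener('cuechange', function () {
    // Duck the main audio while a cue is active, restore it afterwards.
    video.volume = track.activeCues.length > 0 ? 0.3 : 1.0;
  });
</script>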
I received another question about how to do clean audio, which I had almost forgotten was a requirement from our earlier media accessibility document. “Clean audio” consists of isolating the audio channel containing the spoken dialog and important non-speech information that can then be amplified or otherwise modified, while other channels containing music or ambient sounds are attenuated. I suggested using the mediagroup attribute to provide a main video element (without an audio track) and then the other channels as parallel audio tracks that can be turned on and off and attenuated individually. There is some JavaScript coding involved on top of the APIs that we have defined in HTML, but it can be implemented in browsers that support the mediagroup attribute.
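A sketch of that suggestion, assuming a browser that implements the mediagroup attribute (it was part of the HTML spec at the time, though support has always been limited) and using hypothetical file names:

<!-- Main video without an audio track, plus separate audio channels
     kept in sync through a common media group. -->
<video id="picture" mediagroup="film" controls>
  <source src="movie-video-only.webm" type="video/webm">
</video>
<audio id="dialogue" mediagroup="film" src="dialogue.oga"></audio>
<audio id="ambience" mediagroup="film" src="music-and-effects.oga"></audio>
<script>
  // "Clean audio": amplify the spoken dialogue, attenuate the rest.
  document.getElementById('dialogue').volume = 1.0;
  document.getElementById('ambience').volume = 0.2;
  // A channel can also be switched off entirely:
  // document.getElementById('ambience').muted = true;
</script>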
Another question was about the possibilities to extend the list of @kind attribute values. I explained that right now we have a proposal for a new text track kind=”forced” so as to provide forced subtitles for sections of video with foreign language. These would be on when no other subtitle or caption tracks are activated. I also explained that if there is a need for application-specific text tracks, the kind=”metadata” would be the correct choice.
I received some further questions, in particular about how to apply styling to captions (e.g. color changes to text) and about how closely the browsers are able to keep synchronization across multiple media elements. The former was easily answered with the ::cue pseudo-element, but the latter is a quality of implementation feature, so I had to defer to individual browsers.
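For the styling question, the kind of CSS involved looks like this (the colours are arbitrary examples):

/* Style every rendered cue of the video's text tracks */
video::cue {
  color: yellow;
  background-color: rgba(0, 0, 0, 0.8);
}
/* Style spans marked with a class in the WebVTT file, e.g. <c.shout>text</c> */
video::cue(.shout) {
  color: red;
}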
Overall it was a good exercise to summarize the current state of HTML5 video accessibility and I was excited to show off support in Chrome for all the features that we designed into the standard.
-
The Big VP8 Debug
20 November 2010, by Multimedia Mike — VP8
I hope my previous walkthrough of the VP8 4x4 intra coding process was educational. Today, I'll be walking through an example of what happens when my toy VP8 encoder encodes an intra 16x16 block. This may prove educational to those who have never been exposed to the deep details of this or related algorithms. Also, I wanted to illustrate where I think my VP8 encoder process is going bad and generating such grotesque results.
Before I start, let me give a shout-out to Google Docs’ Drawing tool which I used to generate these diagrams. It works quite well.
Results
(Always cut to the chase in a blog post ; results first.) I'm glad I composed this post. In the course of doing so, I found the problem, fixed it, and am now able to present this image that was decoded from the bitstream encoded by my toy (now working) VP8 encoder :
Yeah, I know that image doesn’t look like anything you haven’t seen before. The difference is that it has made a successful trip through my VP8 encoder.
Follow along through the encoding process and learn of the mistake...
Original Block and Subblocks
Here is the 16x16 block to be encoded :
The block is broken down into 16 4x4 subblocks for further encoding :
Prediction
The first step is to pick a prediction mode, generate a prediction block, and subtract the predictors from the macroblock. In this case, we will use DC prediction which means the predictor will be the same for each element. In 4x4 VP8 DC intra prediction, samples outside of the frame are assumed to be 128. It's a little different in 16x16 DC intra prediction— samples above the top row are assumed to be 127 while samples left of the leftmost column are assumed to be 129. For the top left macroblock, this still works out to 128.
Subtract 128 from each of the samples :
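In code, that step amounts to something like the following sketch (my own illustration, not the toy encoder's actual source; all names are made up):

#include <stdint.h>

#define MB_SIZE 16

/* DC prediction for the top-left macroblock: samples above the frame count
 * as 127 and samples left of it as 129, so the averaged predictor is 128.
 * The prediction is then subtracted from every sample. */
static void dc_predict_and_subtract(const uint8_t src[MB_SIZE][MB_SIZE],
                                    int16_t residual[MB_SIZE][MB_SIZE])
{
    int sum = 0;
    for (int i = 0; i < MB_SIZE; i++)
        sum += 127 + 129;                       /* above-row + left-column fill values */
    int dc = (sum + MB_SIZE) / (2 * MB_SIZE);   /* works out to 128 here */

    for (int r = 0; r < MB_SIZE; r++)
        for (int c = 0; c < MB_SIZE; c++)
            residual[r][c] = (int16_t)src[r][c] - dc;
}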
Forward Transform
Run each of the 16 prediction-removed subblocks through the forward transform. This example uses the forward transform from libvpx 0.9.5 :
I have highlighted the DC coefficients in each subblock. That’s because those receive special consideration in 16x16 intra coding.
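The driving loop is simple; a sketch of it follows (fdct4x4 stands in for the libvpx forward transform, which is not reproduced here):

#include <stdint.h>

/* Assumed to exist: a 4x4 forward transform taking 16 residual samples
 * and producing 16 coefficients, DC first. */
void fdct4x4(const int16_t in[16], int16_t out[16]);

/* Transform all 16 subblocks of the macroblock and collect their DC
 * coefficients, which will form the Y2 block. */
static void transform_macroblock(const int16_t residual[16][16],
                                 int16_t coeffs[16][16], int16_t y2[16])
{
    for (int sb = 0; sb < 16; sb++) {
        int row = (sb / 4) * 4, col = (sb % 4) * 4;
        int16_t block[16];

        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++)
                block[y * 4 + x] = residual[row + y][col + x];

        fdct4x4(block, coeffs[sb]);
        y2[sb] = coeffs[sb][0];   /* the highlighted DC coefficient */
    }
}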
Quantization
The Y plane AC quantizer is 4 in this example, the minimum allowed. (The Y plane DC quantizer is also 4 but doesn’t come into play for intra 16x16 coding since the DC coefficients follow a different process.) Thus, quantize (integer divide) each AC element in each subblock (we’ll ignore the DC coefficient for this part) :
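Expressed as code, the step is just an integer division per AC coefficient (a sketch; the coefficient layout matches the previous snippet):

#include <stdint.h>

/* Quantize the 15 AC coefficients of each subblock with the Y plane AC
 * quantizer (4 in this example); index 0, the DC coefficient, is skipped
 * because it is carried by the Y2 block instead. */
static void quantize_y_ac(int16_t coeffs[16][16], int ac_q)
{
    for (int sb = 0; sb < 16; sb++)
        for (int i = 1; i < 16; i++)
            coeffs[sb][i] /= ac_q;
}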
The Y2 Round Trip
Those highlighted DC coefficients from each of the 16 subblocks comprise the Y2 block. This block is transformed with a slightly different algorithm called the Walsh-Hadamard Transform (WHT). The results of this transform are then quantized (using 8 for both Y2 DC and AC in this example, as those are the smallest Y2 quantizers that VP8 allows), then zigzagged and entropy-coded along with the rest of the macroblock coefficients.
On the decoder side, the Y2 coefficients are decoded, de-zigzagged, dequantized and run through the inverse WHT.
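Schematically, the round trip looks like this (forward_wht4x4 and inverse_wht4x4 stand in for the WHT pair; the zigzag and entropy coding steps are omitted):

#include <stdint.h>

/* Assumed to exist: the forward WHT (taken from libvpx in my encoder) and
 * the inverse WHT defined by the VP8 spec. */
void forward_wht4x4(const int16_t in[16], int16_t out[16]);
void inverse_wht4x4(const int16_t in[16], int16_t out[16]);

/* Y2 round trip: transform the 16 collected DC coefficients, quantize them
 * (8 for both DC and AC here), then dequantize and inverse-transform as the
 * decoder would. */
static void y2_round_trip(const int16_t y2_in[16], int16_t y2_out[16], int y2_q)
{
    int16_t wht[16];

    forward_wht4x4(y2_in, wht);
    for (int i = 0; i < 16; i++)
        wht[i] /= y2_q;            /* encoder side: quantize */

    for (int i = 0; i < 16; i++)
        wht[i] *= y2_q;            /* decoder side: dequantize */
    inverse_wht4x4(wht, y2_out);
}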
And this is where I suspect that most of the error is creeping into my VP8 encoder. Observe the round-trip through the Y2 process :
As intimated, this part causes me consternation due to the wide discrepancy between the original and the reconstructed Y2 blocks. Observe the absolute difference between the 2 vectors :
That’s really significant and leads me to believe that this is where the big problem is.
What’s Wrong ?
My first suspicion is that the quantization is throwing off the process. I was disabused of this idea when I removed quantization from the equation and immediately reversed the transform :
So perhaps there is a problem with the forward WHT. Just like with the usual subblock transform, the VP8 spec doesn’t define how to perform the forward WHT, only the inverse WHT. Do I need to audition different forward WHTs from various versions of libvpx, similar to what I did with the other transform ? That doesn’t make a lot of sense— libvpx doesn’t seem to have so much trouble with basic encoding.
The Punchline
I reviewed the forward WHT code, the stuff that I plagiarized from libvpx 0.9.0. The function takes, among other parameters, a pitch value. There are 2 loops in the code. The first iterates through the rows of the input matrix— which I assumed was a 4x4 matrix. I was puzzled that during each iteration of the row loop, the input pointer was only being advanced by (pitch/2). I removed the division by 2 and the problem went away. I.e., the encoded image looks correct.

What's up with the (pitch/2), anyway ? It seems that the encoder likes to pack 2 4x4 subblocks into an 8x4 block data structure. In fact, the forward DCTs in the libvpx encoder have the same artifact. Remember how I surveyed several variations of forward DCT from different versions of libvpx ? The one that proved most accurate in that test was the one I had already modified to advance the input pointer properly. Fixing the other 2 candidates yields similar results :

input :        92   91   89   86   91   90   88   86   89   89   89   88   89   87   88   93
short 0.9.0 : -311    6    2    0    0   11   -6    1    2   -3    3    0    0    0   -2    1
inverse :      92   91   89   86   91   90   88   87   90   89   89   88   89   87   88   93
fast 0.9.0 :  -313    5    1    0    1   11   -6    1    3   -3    4    0    0    0   -2    1
inverse :      91   91   89   86   90   90   88   86   89   89   89   88   89   87   88   93
short 0.9.5 : -312    7    1    0    1   12   -5    2    2   -3    3   -1    1    0   -2    1
inverse :      92   91   89   86   91   90   88   86   89   89   89   88   89   87   88   93
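Schematically, the row loop in question has this shape (not the actual libvpx code, just an illustration of the bug and the fix):

/* The forward WHT walks the rows of its input.  libvpx packs two 4x4
 * subblocks into an 8x4 structure, so its loop steps by pitch/2 shorts per
 * row.  Fed a plain 4x4 block, that stride is wrong; removing the division
 * by 2, as described above, fixed my encoder. */
static void forward_wht_rows(const short *input, int pitch)
{
    const short *ip = input;

    for (int r = 0; r < 4; r++) {
        /* ... row butterfly over ip[0], ip[1], ip[2], ip[3] ... */
        ip += pitch / 2;   /* original stride for the packed 8x4 layout */
        /* ip += pitch;       the fix for a plain 4x4 input block */
    }
    (void)ip;
}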
Code cribber beware !
Corrected Y2 Round Trip
Let’s look at that Y2 round trip one more time :
And another look at the error between the original and the reconstruction :
Better.
Dequantization, Prediction, Inverse Transforms, and Reconstruction
To be honest, now that I solved the major problem, I’m getting a little tired of making these pictures. Long story short, all elements of the original 16 subblocks are dequantized and their DC coefficients are filled in with the appropriate item from the reconstructed Y2 block. A base predictor block is generated (all 128 values in this case). And each Y block is run through the inverse transform and added to the predictor block. The following is the reconstruction :
And if you compare that against the original luma macroblock (I don’t feel like doing it right now), you’ll find that it’s pretty close.
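Put as code, the whole reconstruction path is roughly the following (a sketch reusing the layout of the earlier snippets; idct4x4 stands in for the inverse transform):

#include <stdint.h>

/* Assumed to exist: the inverse 4x4 transform. */
void idct4x4(const int16_t in[16], int16_t out[16]);

/* Dequantize each subblock, patch in its DC coefficient from the
 * reconstructed Y2 block, inverse-transform, and add the DC predictor
 * (128 here) back, clamping to the 8-bit sample range. */
static void reconstruct_macroblock(const int16_t coeffs[16][16],
                                   const int16_t y2_dc[16], int ac_q,
                                   uint8_t recon[16][16])
{
    for (int sb = 0; sb < 16; sb++) {
        int16_t block[16], pixels[16];
        int row = (sb / 4) * 4, col = (sb % 4) * 4;

        for (int i = 1; i < 16; i++)
            block[i] = coeffs[sb][i] * ac_q;   /* dequantize the AC coefficients */
        block[0] = y2_dc[sb];                  /* DC comes from the Y2 block */

        idct4x4(block, pixels);

        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++) {
                int v = 128 + pixels[y * 4 + x];
                recon[row + y][col + x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
            }
    }
}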
I can’t believe how close I was all this time, and how long that pitch bug held me up.
-
Revision 642dba8de0ea9d1a1d493a04add30327f389571a : Backports: backport of r16556. The find_in_path for the local DTD introduced ...
8 December 2010, by Cerdic — Log
Backports: backport of r16556. The find_in_path for the local DTD introduced by r15963 was not general enough, as it was constrained upstream by the 'prive' prefix. Backport of r16558: accept, in a DTD, an element whose content is completely empty (not even an explicit EMPTY). Backport of r16560: create the directory (...)