
On other sites (5823)
-
Latency and DAF in RTP transmissions
24 February 2023, by jfernandz
I'm trying to run some tests on audio RTP transmissions to learn their technical limitations. The idea is to prevent the DAF (delayed auditory feedback) effect in this kind of transmission; I'm assuming a latency lower than 50 ms will prevent it. There is another constraint in my analysis: the RTP transmission must go over WiFi.


For these tests I'm trying to transmit raw audio (not sure whether skipping the encoding stage will improve latency) through ffmpeg between two different laptops, so I'm running ffmpeg on the first laptop (172.20.1.2) as:

$ ffmpeg -f pulse -i 56 -c copy -f rtp rtp://172.20.1.5:10000


which produces the following output:


ffmpeg version n5.1.2 Copyright (c) 2000-2022 the FFmpeg developers
 built with gcc 12.2.0 (GCC)
 configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-version3 --enable-vulkan
 libavutil 57. 28.100 / 57. 28.100
 libavcodec 59. 37.100 / 59. 37.100
 libavformat 59. 27.100 / 59. 27.100
 libavdevice 59. 7.100 / 59. 7.100
 libavfilter 8. 44.100 / 8. 44.100
 libswscale 6. 7.100 / 6. 7.100
 libswresample 4. 7.100 / 4. 7.100
 libpostproc 56. 6.100 / 56. 6.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from '56':
 Duration: N/A, start: 1677234050.938677, bitrate: 1536 kb/s
 Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Output #0, rtp, to 'rtp://172.20.1.5:10000':
 Metadata:
 encoder : Lavf59.27.100
 Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 172.20.1.5
t=0 0
a=tool:libavformat LIBAVFORMAT_VERSION
m=audio 10000 RTP/AVP 97
b=AS:1536

Stream mapping:
 Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size= 322kB time=00:00:01.67 bitrate=1573.6kbits/s speed=1.06x



I'm assuming the SDP shown is a valid one:


v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 172.20.1.5
t=0 0
a=tool:libavformat LIBAVFORMAT_VERSION
m=audio 10000 RTP/AVP 97
b=AS:1536



So I saved it in a file called ccopy.sdp on the second laptop (172.20.1.5). However, when I run ffplay on this other laptop as:

$ ffplay -protocol_whitelist file,rtp,udp -i ccopy.sdp


I can see there are problems with this SDP:


ffplay version n5.1.2 Copyright (c) 2003-2022 the FFmpeg developers
 built with gcc 12.2.0 (GCC)
 configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-version3 --enable-vulkan
 libavutil 57. 28.100 / 57. 28.100
 libavcodec 59. 37.100 / 59. 37.100
 libavformat 59. 27.100 / 59. 27.100
 libavdevice 59. 7.100 / 59. 7.100
 libavfilter 8. 44.100 / 8. 44.100
 libswscale 6. 7.100 / 6. 7.100
 libswresample 4. 7.100 / 4. 7.100
 libpostproc 56. 6.100 / 56. 6.100
[sdp @ 0x7f8eec000c80] Could not find codec parameters for stream 0 (Audio: none, 0 channels): unknown codec
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, sdp, from 'ccopy.sdp':
 Metadata:
 title : No Name
 Duration: N/A, bitrate: N/A
 Stream #0:0: Audio: none, 0 channels
Failed to open file 'ccopy.sdp' or configure filtergraph
 nan : 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0 



I'm not sure whether I'm doing something wrong or whether I simply cannot use pcm_s16le for an RTP transmission. Moreover, is there some argument for ffmpeg that I can use to improve this RTP transmission and reduce latency below 50 ms?
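As a rough aid for the 50 ms target, the arithmetic behind the stream shown in the log (48000 Hz, stereo, s16) can be sketched in Python. The function names and the 1440-byte payload size are hypothetical, chosen only for illustration; the sketch counts only the time needed to fill one packet with audio and says nothing about WiFi, jitter buffers, or device buffering.

```python
def pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample):
    """Raw PCM bitrate in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


def pcm_bytes_per_ms(sample_rate_hz, channels, bits_per_sample):
    """Bytes of raw PCM produced per millisecond of audio."""
    return sample_rate_hz * channels * (bits_per_sample // 8) / 1000


def packet_audio_ms(payload_bytes, sample_rate_hz=48000, channels=2,
                    bits_per_sample=16):
    """Milliseconds of audio carried by one payload of the given size.

    This is a lower bound on the delay a packet adds: the sender has to
    capture that much audio before the packet can leave.
    """
    return payload_bytes / pcm_bytes_per_ms(sample_rate_hz, channels,
                                            bits_per_sample)


if __name__ == "__main__":
    # Matches the "1536 kb/s" reported by ffmpeg for pcm_s16le.
    print(pcm_bitrate_kbps(48000, 2, 16))
    # Hypothetical 1440-byte payload: how much audio does it hold?
    print(packet_audio_ms(1440))
    # Bytes of audio that fit in the whole 50 ms budget.
    print(50 * pcm_bytes_per_ms(48000, 2, 16))
```

Under these assumptions a single packet accounts for only a few milliseconds, so the rest of the budget would be spent elsewhere in the chain.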

Thank you all :-)


PS: When I don't use the -c copy argument for ffmpeg, I get this SDP instead:

v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 172.20.1.5
t=0 0
a=tool:libavformat LIBAVFORMAT_VERSION
m=audio 10000 RTP/AVP 97
b=AS:768
a=rtpmap:97 PCMU/48000/2



The RTP transmission works as I expect, but with a significant DAF.
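One visible difference between the two SDPs is the a=rtpmap line. Payload type 97 lies in the dynamic range (96 to 127), so a receiver can only learn the codec from such a line, which would be consistent with the "unknown codec" message from ffplay. The Python sketch below checks an SDP for dynamic payload types that have no mapping. It is an illustration, not a confirmed diagnosis; the function name and the two constants are made up for the example and are not part of ffmpeg.

```python
COPY_SDP = """v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 172.20.1.5
t=0 0
a=tool:libavformat LIBAVFORMAT_VERSION
m=audio 10000 RTP/AVP 97
b=AS:1536
"""

TRANSCODED_SDP = """v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 172.20.1.5
t=0 0
a=tool:libavformat LIBAVFORMAT_VERSION
m=audio 10000 RTP/AVP 97
b=AS:768
a=rtpmap:97 PCMU/48000/2
"""


def missing_rtpmap(sdp_text):
    """Return dynamic payload types (>= 96) that have no a=rtpmap line."""
    declared = []
    mapped = set()
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> [<fmt> ...]
            fields = line[2:].split()
            declared.extend(int(f) for f in fields[3:] if f.isdigit())
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
            mapped.add(int(line[len("a=rtpmap:"):].split()[0]))
    return [pt for pt in declared if pt >= 96 and pt not in mapped]


if __name__ == "__main__":
    print(missing_rtpmap(COPY_SDP))        # SDP produced with -c copy
    print(missing_rtpmap(TRANSCODED_SDP))  # SDP produced without -c copy
```

For context, RFC 3551 defines the 16-bit linear PCM payload (L16) in network byte order, that is, big-endian. That may be why the little-endian pcm_s16le stream gets no mapping, though this has not been verified here.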


-
Who Invented FLIC?
26 May 2011, by Multimedia Mike (Multimedia History)
I have been reading through “All Your Base Are Belong To Us: How 50 Years of Video Games Conquered Pop Culture” by Harold Goldberg. Despite the title, Zero Wing has yet to be mentioned (I’m about halfway done).
I just made it through the chapter describing early breakthrough CD-ROM games, including Myst, The 7th Guest, and The 11th Hour. Some interesting tidbits:
The 7th Guest
Of course, Graeme Devine created a new FMV format (called VDX, documented here) for The 7th Guest. The player was apparently called PLAY and the book claims that Autodesk was so impressed by the technology that it licensed the player for use in its own products. When I think of an Autodesk multimedia format, I think of FLIC. The VDX coding format doesn’t look too much like FLIC, per my reading.
Here’s the relevant passage (pp. 118-119):
Devine began working on creating software within the CD-ROM disk that would play full-motion video. Within days he had a robust but small ninety-kilobyte player called PLAY that was so good, it was licensed by Autodesk, the makers of the best 3-D animation program at the time. Then Devine figured out a way to compress the huge video files so that they would easily fit on two CD-ROMs.
Googling for “autodesk trilobyte play program” (Trilobyte was the company behind 7th Guest) led me to this readme file for a program called PLAY73 (hosted at Jason Scott’s massive CD-ROM archive, and it’s on a disc that, incidentally, I donated to the archive; so, let’s hear it for Jason’s tireless archival efforts! And for Google’s remarkable indexing prowess). The file, dated September 10, 1991, mentions that it’s a FLICK player, copyright Trilobyte software.
However, it also mentions being a Groovie Player. Based on ScummVM’s reimplementation of the VDX format, Groovie might refer to the engine behind The 7th Guest.
So now I’m really interested: Did Graeme Devine create the FLIC file format? Multimedia nerds want to know!
I guess not. Thanks to Jim Leonard for digging up this item: “I developed the flic file format for the Autodesk Animator.” Jim Kent, Dr. Dobbs Magazine, March 1993.
The PLAY73 changelog reveals something from the bad old days of DOS/PC programming: the necessity of writing graphics drivers for half a dozen different video adapters. The PLAY73 readme file also has some vintage contact addresses for Graeme Devine; remember when addresses looked like these?
If you have any comments, please send them to: Compuserve: 72330,3276 Genie: G.DEVINE Internet: 72330,3276@compuserve.com
The 11th Hour
The book didn’t really add anything I didn’t already know regarding the compression format (RoQ) used in 11th Hour. I already knew how hard Devine worked at it. This book took pains to emphasize the emotional toll taken on the format’s creator.
I wonder if he would be comforted to know that, more than 15 years later, people are still finding ways to use the format.
-
IJG swings again, and misses
1 February 2010, by Mans (Multimedia)
Earlier this month the IJG unleashed version 8 of its ubiquitous libjpeg library on the world. Eager to try out the “major breakthrough in image coding technology” promised in the README file accompanying v7, I downloaded the release. A glance at the README file suggests something major indeed is afoot:
Version 8.0 is the first release of a new generation JPEG standard to overcome the limitations of the original JPEG specification.
The text also hints at the existence of a document detailing these marvellous new features, and a Google search later a copy has found its way onto my monitor. As I read, however, my state of mind shifts from an initial excited curiosity, through bewilderment and disbelief, finally arriving at pure merriment.
Already on the first page it becomes clear no new JPEG standard in fact exists. All we have is an unsolicited proposal sent to the ITU-T by members of the IJG. Realising that even the most brilliant of inventions must start off as mere proposals, I carry on reading. The summary informs me that I am about to witness the introduction of three extensions to the T.81 JPEG format:
- An alternative coefficient scan sequence for DCT coefficient serialization
- A SmartScale extension in the Start-Of-Scan (SOS) marker segment
- A Frame Offset definition in or in addition to the Start-Of-Frame (SOF) marker segment
Together these three extensions will, it is promised, “bring DCT based JPEG back to the forefront of state-of-the-art image coding technologies.”
Alternative scan
The first of the proposed extensions introduces an alternative DCT coefficient scan sequence to be used in place of the zigzag scan employed in most block transform based codecs.
Alternative scan sequence
The advantage of this scan would be that combined with the existing progressive mode, it simplifies decoding of an initial low-resolution image which is enhanced through subsequent passes. The author of the document calls this scheme “image-pyramid/hierarchical multi-resolution coding.” It is not immediately obvious to me how this constitutes even a small advance in image coding technology.
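For reference, the conventional zigzag order that the proposal would replace can be generated in a few lines of Python. This is a sketch of the standard zigzag traversal only, not of the IJG's alternative sequence, which the text here does not spell out; zigzag_order is an illustrative name.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order.

    Coefficients are visited one anti-diagonal (row + col = constant) at a
    time. On odd diagonals the row index increases; on even diagonals it
    decreases, which produces the familiar back-and-forth path.
    """
    cells = [(r, c) for r in range(n) for c in range(n)]

    def key(cell):
        r, c = cell
        d = r + c
        return (d, r if d % 2 else -r)

    return sorted(cells, key=key)


if __name__ == "__main__":
    # Print the order as raster indices (row * 8 + col) of an 8x8 block.
    print([r * 8 + c for r, c in zigzag_order(8)])
```

The scan starts at the DC coefficient in the top-left corner and ends at the highest-frequency coefficient in the bottom-right corner, so low frequencies come first in the serialized stream.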
At this point I am beginning to suspect that our friend from the IJG has been trapped in a half-world between interlaced GIF images transmitted down noisy phone lines and today’s inferno of SVC, MVC, and other buzzwords.
(Not so) SmartScale
Disguised behind this camel-cased moniker we encounter a method which, we are told, will provide better image quality at high compression ratios. The author has combined two well-known (to us) properties in a (to him) clever way.
The first property concerns the perceived impact of different types of distortion in an image. When encoding with JPEG, as the quantiser is increased, the decoded image becomes ever more blocky. At a certain point, a better subjective visual quality can be achieved by down-sampling the image before encoding it, thus allowing a lower quantiser to be used. If the decoded image is scaled back up to the original size, the unpleasant, blocky appearance is replaced with a smooth blur.
The second property belongs to the DCT where, as we all know, the top-left (DC) coefficient is the average of the entire block, its neighbours represent the lowest frequency components etc. A top-left-aligned subset of the coefficient block thus represents a low-resolution version of the full block in the spatial domain.
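The top-left-subset property can be checked numerically. The Python sketch below uses a plain orthonormal DCT-II written from scratch: it keeps the top-left m×m coefficients of an n×n block, rescales them by m/n, and inverts with an m-point DCT. It only illustrates the property described here and is not libjpeg's scaling code; all function names are made up for the example.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of a square block: C * B * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """2-D inverse DCT of a square coefficient block: C^T * X * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def downscale_block(block, m):
    """Downscale an n x n block to m x m by keeping the top-left m x m
    DCT coefficients (the lowest frequencies).

    The factor m/n compensates for the different normalisation of the
    n-point and m-point transforms, so the block average is preserved.
    """
    n = len(block)
    coeffs = dct2(block)
    kept = [[coeffs[u][v] * m / n for v in range(m)] for u in range(m)]
    return idct2(kept)


if __name__ == "__main__":
    ramp = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    for row in downscale_block(ramp, 4):
        print([round(v, 2) for v in row])
```

Because every basis function other than DC averages to zero over the block, the downscaled block keeps the same mean as the original.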
In his flash of genius, our hero came up with the idea of using the DCT for down-scaling the image. Unfortunately, he appears to possess precious little knowledge of sampling theory and human visual perception. Any block-based resampling will inevitably produce sharp artefacts along the block edges. The human visual system is particularly sensitive to sharp edges, so this is one of the most unwanted types of distortion in an encoded image.
Despite the obvious flaws in this approach, I decided to give it a try. After all, the software is already written, allowing downscaling by factors of 8/8..16.
Using a 1280×720 test image, I encoded it with each of the nine scaling options, from unity to half size, each time adjusting the quality parameter for a final encoded file size of no more than 200000 bytes. The following table presents the encoded file size, the libjpeg quality parameter used, and the SSIM metric for each of the images.
Scale  Size    Quality  SSIM
8/8    198462  59       0.940
8/9    196337  70       0.936
8/10   196133  79       0.934
8/11   197179  84       0.927
8/12   193872  89       0.915
8/13   197153  92       0.914
8/14   188334  94       0.899
8/15   198911  96       0.886
8/16   197190  97       0.869
Although the smaller images allowed a higher quality setting to be used, the SSIM value drops significantly. Numbers may of course be misleading, but the images below speak for themselves. These are cut-outs from the full image, the original on the left, unscaled JPEG-compressed in the middle, and JPEG with 8/16 scaling to the right.
Looking at these images, I do not need to hesitate before picking the JPEG variant I prefer.
Frame offset
The third and final extension proposed is quite simple and also quite pointless: a top-left cropping to be applied to the decoded image. The alleged utility of this feature would be to enable lossless cropping of a JPEG image. In a typical image workflow, however, JPEG is only used for the final published version, so the need for this feature appears quite far-fetched.
The grand finale
Throughout the text, the author makes references to “the fundamental DCT property for image representation.” In his own words:
This property was found by the author during implementation of the new DCT scaling features and is after his belief one of the most important discoveries in digital image coding after releasing the JPEG standard in 1992.
The secret is to be revealed in an annex to the main text. This annex quotes in full a post by the author to the comp.dsp Usenet group in a thread with the subject why DCT. Reading the entire thread proves quite amusing. A few excerpts follow.
The actual reason is much simpler, and therefore apparently very difficult to recognize by complicated-thinking people.
Here is the explanation:
What are people doing when they have a bunch of images and want a quick preview? They use thumbnails! What are thumbnails? Thumbnails are small downscaled versions of the original image! If you want more details of the image, you can zoom in stepwise by enlarging (upscaling) the image.
So with proper understanding of the fundamental DCT property, the MPEG folks could make their videos more scalable, but, as in the case of JPEG, they are unable to recognize this simple but basic property, unfortunately, and pursue rather inferior approaches in actual developments.
These are just phrases, and they don’t explain anything. But this is typical for the current state in this field: The relevant people ignore and deny the true reasons, and thus they turn in a circle and no progress is being made.
However, there are dark forces in action today which ignore and deny any fruitful advances in this field. That is the reason that we didn’t see any progress in JPEG for more than a decade, and as long as those forces dominate, we will see more confusion and less enlightenment. The truth is always simple, and the DCT *is* simple, but this fact is suppressed by established people who don’t want to lose their dubious position.
I believe a trip to the Total Perspective Vortex may be in order. Perhaps his tin-foil hat will save him.