
Other articles (51)

  • Keeping control of your media in your hands

    13 April 2011

    The vocabulary used on this site and around MediaSPIP in general aims to avoid reference to Web 2.0 and the companies that profit from media-sharing.
    While using MediaSPIP, you are invited to avoid using words like "Brand", "Cloud" and "Market".
    MediaSPIP is designed to facilitate the sharing of creative media online, while allowing authors to retain complete control of their work.
    MediaSPIP aims to be accessible to as many people as possible and development is based on expanding the (...)

  • Submit bugs and patches

    13 April 2011

    Unfortunately, software is never perfect.
    If you think you have found a bug, report it using our ticket system. Please help us to fix it by providing the following information: the browser you are using, including the exact version; as precise an explanation as possible of the problem; if possible, the steps taken that resulted in the problem; a link to the site / page in question.
    If you think you have solved the bug, fill in a ticket and attach a corrective patch to it.
    You may also (...)

  • Creating farms of unique websites

    13 April 2011

    MediaSPIP platforms can be installed as a farm, with a single "core" hosted on a dedicated server and used by multiple websites.
    This allows (among other things): implementation costs to be shared between several different projects / individuals; rapid deployment of multiple unique sites; creation of groups of like-minded sites, making it possible to browse media in a more controlled and selective environment than the major “open” (...)

On other sites (6408)

  • What is “interoperable TTML”?

    19 September 2012, by silvia

    I’ve just tried to come to terms with the latest state of TTML, the Timed Text Markup Language.

    TTML has been specified by the W3C Timed Text Working Group and released as a RECommendation v1.0 in November 2010. Since then, several organisations have tried to adopt it as their caption file format. This includes the SMPTE, the EBU (European Broadcasting Union), and Microsoft.

    Both Microsoft and the EBU actually looked at TTML in detail and decided that, in order to make it usable for their use cases, a restriction of its functionality was needed.

    EBU-TT

    The EBU released EBU-TT, which restricts the set of valid attributes and features. “The EBU-TT format is intended to constrain the features provided by TTML, especially to make EBU-TT more suitable for the use with broadcast video and web video applications.” (see EBU-TT).

    In addition, EBU-specific namespaces were introduced to extend TTML with EBU-specific data types, e.g. ebuttdt:frameRateMultiplierType or ebuttdt:smpteTimingType. Similarly, a bunch of metadata elements were introduced, e.g. ebuttm:documentMetadata, ebuttm:documentEbuttVersion, or ebuttm:documentIdentifier.

    The use of namespaces as an extensibility mechanism ensures that EBU-TT files continue to be valid TTML files. However, any vanilla TTML parser will not know what to do with these custom extensions and will drop them on the floor.
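
    To picture what “dropping them on the floor” means in practice, here is a small Python sketch. The snippet below is my own illustration, not taken from the EBU spec, and the ebuttm namespace URI is only indicative: a consumer that only queries the core TTML namespace never even looks at the EBU extensions.

    import xml.etree.ElementTree as ET

    TT = "{http://www.w3.org/ns/ttml}"  # core TTML namespace shared by every profile

    # Illustrative EBU-TT-style document: the ebuttm: metadata lives in its own
    # namespace, so the file remains valid TTML, but a vanilla consumer that only
    # knows the core namespace never touches it.
    doc = """<tt xmlns="http://www.w3.org/ns/ttml" xmlns:ebuttm="urn:ebu:tt:metadata">
      <head><metadata>
        <ebuttm:documentMetadata>
          <ebuttm:documentEbuttVersion>v1.0</ebuttm:documentEbuttVersion>
        </ebuttm:documentMetadata>
      </metadata></head>
      <body><div>
        <p begin="00:00:01.000" end="00:00:03.000">Hello world</p>
      </div></body>
    </tt>"""

    root = ET.fromstring(doc)
    for p in root.iter(TT + "p"):  # only core-namespace elements are queried
        print(p.get("begin"), p.get("end"), "".join(p.itertext()).strip())
    # The ebuttm:* elements survive parsing but are effectively ignored.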

    Simple Delivery Profile

    With the intention of making TTML ready for “internet delivery of Captions originated in the United States”, Microsoft proposed a “Simple Delivery Profile for Closed Captions (US)” (see Simple Profile). The Simple Profile is also a restriction of TTML.

    Unfortunately, the Microsoft profile is not the same as the EBU-TT profile: for example, it contains the “set” element, which is not conformant in EBU-TT. Similarly, the supported style features are different, e.g. Simple Profile supports “display-region”, while EBU-TT does not. On the other hand, EBU-TT supports monospace, sans-serif and serif fonts, while the Simple profile does not.

    Thus files created for the Simple Delivery Profile will not work on players that expect EBU-TT, and vice versa.

    Fortunately, the Simple Delivery Profile does not introduce any new namespaces or new features, so at least it is an explicit subset of TTML and not both a restriction and an extension like EBU-TT.

    SMPTE-TT

    SMPTE also created a version of the TTML standard called SMPTE-TT. SMPTE did not decide on a subset of TTML for their purposes – it was simply adopted as a complete set. “This Standard provides a framework for timed text to be supported for content delivered via broadband means,…” (see SMPTE-TT).

    However, SMPTE extended TTML in SMPTE-TT with an ability to store a binary blob with captions in another format. This allows using SMPTE-TT as a transport format for any caption format and is deemed to help with “backwards compatibility”.

    Now, instead of specifying a profile, SMPTE decided to define how to convert CEA-608 captions to SMPTE-TT. Even if it’s not called a “profile”, that’s actually what it is. It even has its own namespace: “m608:”.

    Conclusion

    With all these different versions of TTML, I ask myself what a video player that claims support for TTML will do to get something working. The only chance it has is to implement all the extensions defined in all the different profiles. I pity the player that has to deal with a SMPTE-TT file that has a binary blob in it and is expected to be able to decode this.

    Now, what is a caption author supposed to do when creating TTML? They obviously cannot expect all players to be able to play back all TTML versions. Should they create different files depending on what platform they are targeting, i.e. an EBU-TT version, a SMPTE-TT version, a vanilla TTML version, and a Simple Delivery Profile version? Or should they throw all the features of all the versions into one TTML file and hope that players will pick out the things they require and drop the rest on the floor?

    Maybe the best way to progress would be to make a list of the “safe” features : those features that every TTML profile supports. That may be the best way to get an “interoperable TTML” file. Here’s me hoping that this minimal set of features doesn’t just end up being the usual (starttime, endtime, text) triple.
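
    If that “safe” set does end up being just (starttime, endtime, text), then at least producing a file that every profile should accept is trivial. Here is a rough Python sketch of such a lowest-common-denominator writer (the function and file names are my own, for illustration only):

    import xml.etree.ElementTree as ET

    TT = "http://www.w3.org/ns/ttml"

    def write_minimal_ttml(cues, path):
        """Write vanilla TTML using only the lowest common denominator:
        one <p> per cue carrying nothing but begin, end and text."""
        ET.register_namespace("", TT)
        tt = ET.Element("{%s}tt" % TT)
        body = ET.SubElement(tt, "{%s}body" % TT)
        div = ET.SubElement(body, "{%s}div" % TT)
        for begin, end, text in cues:
            p = ET.SubElement(div, "{%s}p" % TT, begin=begin, end=end)
            p.text = text
        ET.ElementTree(tt).write(path, xml_declaration=True, encoding="utf-8")

    write_minimal_ttml([("00:00:01.000", "00:00:03.000", "Hello world")], "captions.ttml")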

    UPDATE:

    I just found out that UltraViolet have their own profile of SMPTE-TT called CFF-TT (see UltraViolet FAQ and spec). They are making some SMPTE-TT fields optional, but introduce a new @forcedDisplayMode attribute under their own namespace “cff:”.

  • lavu/x86: add FFT assembly

    10 April 2021, by Lynne
    lavu/x86: add FFT assembly
    

    This commit adds a pure x86 assembly SIMD version of the FFT in libavutil/tx.
    The design of this pure assembly FFT is pretty unconventional.

    On the lowest level, instead of splitting the complex numbers into
    real and imaginary parts, we keep complex numbers together but split
    them in terms of parity. This saves a number of shuffles in each transform,
    but more importantly, it splits each transform into two independent
    paths, which we process using separate registers in parallel.
    This allows us to keep all units saturated and lets us use all available
    registers to avoid dependencies.
    Moreover, it allows us to double the granularity of our per-load permutation,
    skipping many expensive lookups and allowing us to use just 4 loads per register,
    rather than 8, or, in the case of FMA3 (and by extension, AVX2), to use the vgatherdpd
    instruction, which is at least as fast as 4 separate loads on old hardware,
    and quite a bit faster on modern CPUs.

    Higher up, we go for a bottom-up construction of large transforms, foregoing
    the traditional per-transform call-return recursion chains. Instead, we always
    start at the bottom-most basis transform (in this case, a 32-point transform),
    and continue constructing larger and larger transforms until we return to the
    top-most transform.
    This way, we only touch the stack 3 times per complete target transform:
    once for the 1/2 length transform and twice for the 1/4 length transform.

    The combination algorithm we use is a standard Split-Radix algorithm,
    as used in our C code. Although a version with fewer operations exists
    (Steven G. Johnson and Matteo Frigo's "A modified split-radix FFT with fewer
    arithmetic operations", IEEE Trans. Signal Process. 55 (1), 111–119 (2007),
    which is the one FFTW uses), it only has 2% fewer operations and requires at least 4x
    the binary code (since it needs 4 different paths to do a single transform).
    That version also has other issues which prevent it from being implemented
    as efficiently with SIMD code, which makes it lose the marginal gains it offered,
    and it cannot be performed bottom-up, requiring many recursive call-return chains
    whose overhead adds up.
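
    (For reference, the standard split-radix decimation-in-time recursion described above can be sketched in a few lines of Python; this is only a readability aid and not part of the commit, whose actual implementation is the hand-written assembly in libavutil/x86/tx_float.asm.)

    import cmath

    def split_radix_fft(x):
        """Reference split-radix FFT for power-of-two lengths (readability aid only)."""
        n = len(x)
        if n == 1:
            return list(x)
        if n == 2:
            return [x[0] + x[1], x[0] - x[1]]
        u = split_radix_fft(x[0::2])    # length n/2: even-indexed samples
        z1 = split_radix_fft(x[1::4])   # length n/4: indices 1 mod 4
        z3 = split_radix_fft(x[3::4])   # length n/4: indices 3 mod 4
        out = [0j] * n
        for k in range(n // 4):
            w1 = cmath.exp(-2j * cmath.pi * k / n)      # twiddle W_n^k
            w3 = cmath.exp(-2j * cmath.pi * 3 * k / n)  # twiddle W_n^(3k)
            s = w1 * z1[k] + w3 * z3[k]
            d = w1 * z1[k] - w3 * z3[k]
            out[k] = u[k] + s
            out[k + n // 2] = u[k] - s
            out[k + n // 4] = u[k + n // 4] - 1j * d
            out[k + 3 * n // 4] = u[k + n // 4] + 1j * d
        return out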

    We go through a lot of effort to minimize loads/stores by keeping as much as
    possible in registers between constructing transforms. This saves us around 32
    cycles on paper, but in reality a lot more due to load/store aliasing (a load from a
    memory location cannot be issued while there's a store pending, and there are
    only so many (2 for Zen 3) load/store units in a CPU).
    Also, we interleave coefficients during the last stage to save on a store+load
    per register.

    Each of the smallest, basis transforms (4, 8 and 16-point in our case)
    has been extremely optimized. Our 8-point transform is barely 20 instructions
    in total, beating our old implementation's 8-point transform by 1 instruction.
    Our 2x8-point transform is 23 instructions, beating our old implementation by
    6 instructions and needing 50% fewer cycles. Our 16-point transform's combination
    code takes slightly more instructions than our old implementation's, but makes up
    for it by requiring far fewer arithmetic operations.

    Overall, the transform was optimized for the timings of Zen 3, which at the
    time of writing has the highest IPC of all documented CPUs. Shuffles were
    preferred over arithmetic operations due to their 1/0.5 latency/throughput.

    On average, this code is 30% faster than our old libavcodec implementation.
    It's able to trade blows with the previously-untouchable FFTW on small transforms,
    and due to its tiny size and better prediction, outdoes FFTW on larger transforms
    by 11% on the largest currently supported size.

    • [DH] libavutil/tx.c
    • [DH] libavutil/tx_priv.h
    • [DH] libavutil/x86/Makefile
    • [DH] libavutil/x86/tx_float.asm
    • [DH] libavutil/x86/tx_float_init.c
  • Combine Audio and Images in Stream

    19 December 2017, by SenorContento

    I would like to be able to create images on the fly and also create audio on the fly too and be able to combine them together into an rtmp stream (for Twitch or YouTube). The goal is to accomplish this in Python 3 as that is the language my bot is written in. Bonus points for not having to save to disk.

    So far, I have figured out how to stream to rtmp servers using ffmpeg by loading a PNG image and playing it on loop, as well as loading an mp3 and then combining them together in the stream. The problem is I have to load at least one of them from a file.

    I know I can use Moviepy to create videos, but I cannot figure out whether or not I can stream the video from Moviepy to ffmpeg or directly to rtmp. I think that I have to generate a lot of really short clips and send them, but I want to know if there’s an existing solution.

    There’s also OpenCV which I hear can stream to rtmp, but cannot handle audio.

    A redacted version of an ffmpeg command I have successfully tested with is

    ffmpeg -loop 1 -framerate 15 -i ScreenRover.png -i "Song-Stereo.mp3" -c:v libx264 -preset fast -pix_fmt yuv420p -threads 0 -f flv rtmp://SITE-SUCH-AS-TWITCH/.../STREAM-KEY

    or

    cat Song-Stereo.mp3 | ffmpeg -loop 1 -framerate 15 -i ScreenRover.png -i - -c:v libx264 -preset fast -pix_fmt yuv420p -threads 0 -f flv rtmp://SITE-SUCH-AS-TWITCH/.../STREAM-KEY

    I know these commands are not set up properly for smooth streaming; the result manages to screw up both Twitch’s and Youtube’s players and I will have to figure out how to fix that.

    The problem with this is I don’t think I can stream both the image and the audio at once when creating them on the spot. I have to load one of them from the hard drive. This becomes a problem when trying to react to a command or user chat or anything else that requires live reactions. I also do not want to destroy my hard drive by constantly saving to it.

    As for the python code, what I have tried so far in order to create a video is the following code. This still saves to the HD and is not responsive in realtime, so this is not very useful to me. The video itself is okay, with the one exception that, as time passes, the clock shown in the qr code and the video’s own clock drift farther and farther apart as the video gets closer to the end. I can work around that limitation if it shows up while live streaming.

    def make_frame(t):
        img = qrcode.make("Hello! The second is %s!" % t)
        return numpy.array(img.convert("RGB"))

    clip = mpy.VideoClip(make_frame, duration=120)
    clip.write_gif("test.gif", fps=15)

    gifclip = mpy.VideoFileClip("test.gif")
    gifclip.set_duration(120).write_videofile("test.mp4", fps=15)

    My goal is to be able to produce something along the pseudo-code of

    original_video = qrcode_generator("I don't know, a clock, pyotp, today's news sources, just anything that can be generated on the fly!")
    original_video.overlay_text(0,0,"This is some sample text, the left two are coordinates, the right three are font, size, and color", Times_New_Roman, 12, Blue)
    original_video.add_audio(sine_wave_generator(0,180,2)) # frequency min-max, seconds

    # NOTICE - I did not add any time measurements to the actual video itself. The whole point is this is a live stream and not a video clip, so the time frame would be now. The 2 seconds listed above is for our pseudo sine wave generator to know how long the audio clip should be, not for the actual streaming library.

    stream.send_to_rtmp_server(original_video) # Doesn't matter if ffmpeg or some native library

    The above example is what I am looking for in terms of video creation in Python and then streaming. I am not trying to create a clip and then stream it later; I am trying to have the program be able to respond to outside events and then update its stream to do whatever it wants. It is sort of like a chat bot, but with video instead of text.

    def track_movement(...):
     ...
     return ...

    original_video = user_submitted_clip(chat.lastVideoMessage)
    original_video.overlay_text(0,0,"The robot watches the user's movements and puts a blue square around it.", Times_New_Roman, 12, Blue)
    original_video.add_audio(sine_wave_generator(0,180,2)) # frequency min-max, seconds

    # It would be awesome if I could also figure out how to perform advanced actions such as tracking movements or pulling a face out of a clip and then applying effects to it on the fly. I know OpenCV can track movements and I hear that it can work with streams, but I cannot figure out how that works. Any help would be appreciated! Thanks!

    Because I forgot to add the imports, here are some useful imports I have in my file!

    import pyotp
    import qrcode
    import numpy
    from io import BytesIO
    from moviepy import editor as mpy

    The library, pyotp, is for generating one time pad authenticator codes, qrcode is for the qr codes, BytesIO is used for virtual files, and moviepy is what I used to generate the GIF and MP4. I believe BytesIO might be useful for piping data to the streaming service, but how that happens, depends entirely on how data is sent to the service, whether it be ffmpeg over command line (from subprocess import Popen, PIPE) or it be a native library.