Other articles (48)

  • Libraries and binaries specific to video and audio processing

    31 January 2010, by

    The following software and libraries are used by SPIPmotion in one way or another.
    Required binaries: FFMpeg: the main encoder, able to transcode almost every type of video and audio file into formats playable on the web (see this tutorial for its installation); Oggz-tools: inspection tools for Ogg files; Mediainfo: retrieves technical information from most video and audio formats;
    Complementary, optional binaries: flvtool2: (...)

  • HTML5 audio and video support

    10 April 2011

    MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
    For older browsers, the Flowplayer Flash player is used instead.
    The HTML5 player used was created specifically for MediaSPIP: its appearance is fully customisable to match a chosen theme.
    These technologies make it possible to deliver video and audio both to conventional computers (...)

  • From upload to the final video [standalone version]

    31 January 2010, by

    The path of an audio or video document through SPIPMotion is divided into three distinct stages.
    Upload and retrieval of information about the source video
    First, a SPIP article has to be created and the “source” video document attached to it.
    When this document is attached to the article, two actions are executed in addition to the normal behaviour: retrieval of the technical information of the file’s audio and video streams; and generation of a thumbnail: extraction of a (...)

On other sites (7452)

  • Google Analytics 4 (GA4) vs Matomo

    7 April 2022, by Erin

    Google announced that Universal Analytics’ days are numbered. Universal Analytics will be replaced by Google Analytics 4 (or GA4) on the 1st of July 2023. 

    If Google Analytics users want to compare year-on-year data, they have until July 2022 to get set up and start collecting data before the sun sets on Universal Analytics (or UA).

    But is upgrading to Google Analytics 4 the right move? There’s a lot to consider, and many organisations are looking for an alternative to Google Analytics. So in this blog, we’ll compare GA4 to Matomo – the leading Google Analytics alternative.

    What is Matomo?

    Matomo is a powerful privacy-first web analytics platform that gives you 100% data ownership. First launched in 2007, Matomo is now the world’s leading open-source web analytics platform and is used by more than 1 million websites. 

    Matomo’s core values are based on ethical data collection and processing. More and more businesses and organisations around the globe are adopting data-privacy-compliant web analytics solutions like Matomo.

    Matomo offers both Cloud and On-Premise solutions (and a five-star rated WordPress plugin), making for an adaptable and flexible solution. 

    What is Google Analytics 4?

    Google Analytics 4 is the latest version of Google Analytics and represents a completely different approach to data modelling from its predecessor, Universal Analytics. For an in-depth look at how GA4 and UA compare, check out this Google Analytics 4 vs Universal Analytics comparison.

    Google Analytics 4 will soon be the only available version of analytics software from Google. So what’s the issue? Surely, in 2022, Google makes it easy to migrate to their newest (and only) analytics platform? Not quite.

    Google Analytics 4 vs Matomo

    Whilst the core purpose of GA4 and Matomo is similar (providing web analytics that help to optimise your website and grow your business), there are several key differences that organisations should consider before making the switch.

    Importing Historical Data from Universal Analytics

    Google Analytics 4

    Users assuming that historical data from Universal Analytics could be imported into Google Analytics 4 were faced with swift disappointment. Unfortunately, Google Analytics 4 does not have an option to import data from its predecessor, Universal Analytics. This means that businesses won’t be able to import and compare data from previous years.

    Matomo

    If you don’t want to start from scratch with your web analytics data, then Matomo is an ideal solution for data continuity. Matomo offers users the ability to import their historical Universal Analytics data. So you can keep all that valuable historical data you’ve collected over the years.

    [Embedded tweet on Google Analytics 4 migration, by Tino Didriksen via Twitter]

    User Interface

    Google Analytics 4

    GA4’s new user interface has been met with mixed reviews. Many claim that it’s overly complex and difficult to navigate. Some have even suggested that the tool has been designed specifically for enterprises with specialised analytics teams. 

    [Embedded tweet by Kevin Levesquea via Twitter]

    Matomo

    Matomo, on the other hand, is recognised for its easy-to-use interface, with a rating of 4.5 out of 5 stars for ease of use on Capterra. Matomo perfectly balances powerful features with a user-friendly interface, so valuable insights are only a click away. There’s a reason why over 1 million websites are using Matomo.

    Advanced Behavioural Analytics Features 

    Google Analytics 4

    While Google Analytics is undoubtedly robust in some areas (machine learning, for instance), what it really lacks is advanced behavioural analytics. Heatmaps, session recordings and other advanced tools can give you valuable insights into how users are engaging with your site, well beyond pageviews and other metrics.

    Unfortunately, with this new generation of GA, Google still hasn’t introduced these features, so users have to manage subscriptions and tracking in third-party behavioural analytics tools like Hotjar or Lucky Orange. This is inefficient, costly and time-consuming.

    [Screenshot: Matomo Heatmaps feature]

    Matomo 

    Meanwhile, Matomo is a one-stop shop for all of your web analytics needs. Not only do you get access to the metrics you’ve grown accustomed to with Universal Analytics, but you also get built-in behavioural analytics features like Heatmaps, Scroll Depth, Session Recordings and more. 

    Want to know if visitors are reaching your call to action at the bottom of the page? Scroll Depth will answer that.

    Want to know why visitors aren’t clicking through to the next page? Heatmaps will give you the insights you need.

    You get the picture – the full picture, that is. 

    Data Accuracy

    Google Analytics 4

    GA4 aims to make web and app analytics more privacy-centric by reducing the reliance on cookies to record certain events across platforms and devices. 

    However, when site and application visitors opt out of cookie tracking, GA4 instead relies on machine learning to fill in the gaps. This data sampling could mean that your business is making decisions based on inaccurate reports.

    Matomo

    Data is the backbone of web analytics, so why make critical business decisions on sampled data? With Matomo, you’re guaranteed 100% accurate, unsampled data, so you can rest assured that any decisions you make are based on actual facts.

    Compliance with Privacy Laws (GDPR, CCPA, etc.) 

    Google Analytics 4

    Google is making changes in an attempt to become compliant with privacy laws. However, even with GA4, user data is still transferred to the US. For this reason, both the Austrian and French data protection authorities have ruled Google Analytics illegal under the GDPR.

    The only possible workaround is “Privacy Shield 2.0”, but GDPR experts are still sceptical of this one. 

    Matomo

    If compliance with global privacy laws is a concern (and it should be), then Matomo is the clear winner here. 

    With Matomo’s EU-hosted offering, your data is stored in Europe, and no data is transferred to the US. If you choose to self-host instead, the data is stored in your country of choice.

    In addition, with cookieless tracking enabled, you can say goodbye to those pesky cookie consent screens. 

    Also, remember that under GDPR, and many other data privacy laws like CCPA and LGPD, end users have a legal right to access, amend and/or erase the personal data collected about them. 

    With Matomo you get 100% ownership of your web analytics data. This means that we don’t sell your data on to third parties, we can’t claim ownership of the data, and you can export your data at any time.

    [Embedded tweet on Matomo vs GA4, by @tersmantoll via Twitter]

    Wrap up

    At the end of the day, the worst thing an organisation can do is nothing. Waiting until July 2023 to migrate to GA4 or another web analytics platform would be very disruptive and costly. Organisations need to consider their options now and start migrating in the next few months. 

    With all that said, moving to Google Analytics 4 could prove to be a costly and time-consuming operation. The global trend towards increased data privacy is a threat to platforms like Google Analytics, which use data for advertising and transfer it across borders.

    With Matomo, you get an easy to use all-in-one web analytics platform and keep your historical Universal Analytics data. Plus, you can future-proof your business by being compliant with global privacy laws and get access to advanced behavioural analytics features. 

    There’s a lot to weigh up here but fortunately, getting started with Matomo is easy. Try it free for 21 days (no credit card required) and see for yourself why over 1 million websites choose Matomo.

    While this is the end of the road for Universal Analytics, it’s also an opportune time for organisations to find a better-fitting web analytics tool.

  • A Comprehensive Guide to Robust Digital Marketing Analytics

    30 October 2023, by Erin

    First impressions are everything. This is not only true for dating and job interviews but also for your digital marketing strategy. Like a poorly planned resume getting tossed in the “no thank you” pile, 38% of visitors to your website will stop engaging with your content if they find the layout unpleasant. Thankfully, digital marketers can access data that can be harnessed to optimise websites and turn those “no thank you’s” into “absolutely’s.”

    So, how can we transform raw data into valuable insights that pay off? The key is web analytics tools that can help you make sense of it all while collecting data ethically. In this article, we’ll equip you with ways to take your digital marketing strategy to the next level with the power of web analytics.

    What are the different types of digital marketing analytics?

    Digital marketing analytics are like a cipher for decoding the complex behaviour of your buyers. They help you collect, analyse and interpret data from every online touchpoint where you interact with your buyers. Whether you’re trying to gauge the effectiveness of a new email marketing campaign or improve your mobile app layout, there’s a way for you to make use of the insights you gain.

    As we go through the eight commonly known types of digital marketing analytics, please note we’ll primarily focus on what falls under the umbrella of web analytics. 

    1. Web analytics help you better understand how users interact with your website. Good web analytics tools will help you understand user behaviour while securely handling user data. 
    2. Learn more about the effectiveness of your organisation’s social media platforms with social media analytics. Social media analytics include user engagement, post reach and audience demographics. 
    3. Email marketing analytics help you see how email campaigns are being engaged with.
    4. Search engine optimisation (SEO) analytics help you understand your website’s visibility in search engine results pages (SERPs). 
    5. Pay-per-click (PPC) analytics measure the performance of paid advertising campaigns.
    6. Content marketing analytics focus on how your content is performing with your audience. 
    7. Customer analytics helps organisations identify and examine buyer behaviour to retain the biggest spenders. 
    8. Mobile app analytics track user interactions within mobile applications. 

    Choosing which digital marketing analytics tools are the best fit for your organisation is not an easy task. When making these decisions, it’s critical to remember the ethical implications of data collection. Although data insights can be invaluable to your organisation, they won’t be of much use if you lose the trust of your users. 

    Tips and best practices for developing robust digital marketing analytics 

    So, what separates top-notch, robust digital marketing analytics from the rest? We’ve already touched on it, but a big part involves respecting user privacy and ethically handling data. Data security should be on your list of priorities, alongside conversion rate optimisation, when developing a digital marketing strategy. In this section, we will examine best practices for using digital marketing analytics while retaining user trust.

    Clear objectives

    Before comparing digital marketing analytics tools, you should define clear and measurable goals. Try asking yourself what you need your digital marketing analytics strategy to accomplish. Do you want to improve conversion rates while remaining data compliant? Maybe you’ve noticed users are not engaging with your platform and want to fix that. Save yourself time and energy by focusing on the most relevant pain points and areas of improvement.

    Choose the right tools for the job

    Don’t just base your decision on what other people tell you. Take the tool for a test drive — free trials allow you to test features and user interfaces and learn more about the platform before committing. When choosing digital marketing analytics tools, look for ones that ensure compliance with privacy laws like GDPR.

    Don’t overlook data compliance

    GDPR ensures organisations prioritise data protection and privacy. Violations can cost you up to €20 million or 4% of the previous year’s global revenue, whichever is higher. Without data compliance practices, you can say goodbye to the time and money spent on digital marketing strategies.

    Don’t sacrifice data quality and accuracy

    Inaccurate and low-quality data can taint your analysis, making it hard to glean valuable insights from your digital marketing analytics efforts. Regularly audit and clean your data to remove inaccuracies and inconsistencies. Address data discrepancies promptly to maintain the integrity of your analytics. Data validation measures also help to filter out inaccurate data.

    Communicate your findings

    Having insights is one thing; effectively communicating complex data findings is just as important. Customise dashboards to display key metrics aligned with your objectives. Make sure to automate reports, allowing stakeholders to stay updated without manual intervention.

    Understand the user journey

    To optimise your conversion rates, you need to understand the user journey. Start by analysing visitors’ interactions with your website — this will help you identify conversion bottlenecks in your sales or lead generation processes. Implement A/B testing for landing page optimisation, refining elements like call-to-action buttons or copy (a minimal example follows below), and leverage Form Analytics to make informed, data-driven improvements to your forms.
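
    For illustration only, here is a minimal sketch of such an A/B significance check as a two-proportion z-test in Python; the helper and the conversion numbers are hypothetical, not taken from any analytics product.

        from math import sqrt

        def ab_z_score(conv_a, n_a, conv_b, n_b):
            """Two-proportion z-test on conversion counts from an A/B test."""
            p_a, p_b = conv_a / n_a, conv_b / n_b
            pooled = (conv_a + conv_b) / (n_a + n_b)               # pooled rate
            se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))  # std. error
            return (p_b - p_a) / se

        # Hypothetical results: variant B tries new call-to-action copy.
        z = ab_z_score(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
        print(f"z = {z:.2f}")  # |z| > 1.96 is significant at the 95% level

    Here z is roughly 2.2, so variant B’s lift would be unlikely to be pure chance.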

    Continuous improvement

    Learn from the data insights you gain, and iterate your marketing strategies based on the findings. Stay updated with evolving web analytics trends and technologies to leverage new growth opportunities.

    Why you need web analytics to support your digital marketing analytics toolbox

    You wouldn’t set out on a road trip without a map, right? Digital marketing analytics without insights into how users interact with your website are just as useless. Used ethically, web analytics tools can be an invaluable addition to your digital marketing analytics toolbox.

    The data collected via web analytics reveals user interactions with your website. These could include anything from how long visitors stay on your page to their actions while browsing your website. Web analytics tools help you gather and understand this data so you can better understand buyer preferences. It’s like a domino effect: the more you understand your buyers and user behaviour, the better you can assess the effectiveness of your digital content and campaigns.

    Web analytics reveal user behaviour, highlighting navigation patterns and drop-off points. Understanding these patterns helps you refine website layout and content, improving engagement and conversions for a seamless user experience.

    Concrete CMS harnessed the power of web analytics, specifically Form Analytics, to uncover a crucial insight within their user onboarding process. Their data revealed a significant issue: the “address” input field was causing visitors to drop off and not complete the form, severely impacting the overall onboarding experience and conversion rate.

    Armed with these insights, Concrete CMS made targeted optimisations to the form, resulting in a substantial transformation. By addressing the specific issue identified through Form Analytics, they achieved an impressive outcome – a threefold increase in lead generation.

    This case is a great example of how web analytics can uncover customer needs and preferences and positively impact conversion rates. 

    Ethical implications of digital marketing analytics

    As we’ve touched on, digital marketing analytics are a powerful tool to help better understand online user behaviour. With great power comes great responsibility, however, and it’s a legal and ethical obligation for organisations to protect individual privacy rights. Let’s get into the benefits of practising ethical digital marketing analytics and the potential risks of not respecting user privacy:

    • If someone uses your digital platform and then opens their email one day to find it filled with random targeted ad campaigns, they won’t be happy. Avoid losing user trust (and facing a potential lawsuit) by informing users what their data will be used for, and give them the option to opt in or opt out of letting you use their personal information. If users are also assured you’ll safeguard personal information against unauthorised access, they’ll be more likely to trust you to handle their data securely.
    • Protecting data against breaches means investing in technology that will let you end-to-end encrypt and securely store data. Other important data-security best practices include access control, backing up data regularly and network and physical security of assets.
    • A fine line separates digital marketing analytics from the misuse of user data, and many companies have gotten into big trouble for crossing it. (By big trouble, we mean millions of dollars in fines.) When it comes to digital marketing analytics, you should never cut corners on user privacy and data security. Striking this balance involves understanding what data can be collected versus what should be collected, and respecting user boundaries and preferences.

    Learn more 

    We discussed a lot of facets of digital marketing analytics, namely how to develop a robust digital marketing strategy while prioritising data compliance. With Matomo, you can protect user data and respect user privacy while gaining invaluable insights into user behaviour. Save your organisation time and money by investing in a web analytics solution that gives you the best of both worlds. 

    If you’re ready to begin using ethical and robust digital marketing analytics on your website, try Matomo. Start your 21-day free trial now — no credit card required.

  • Ogg objections

    3 March 2010, by Mans — Multimedia

    The Ogg container format is being promoted by the Xiph Foundation for use with its Vorbis and Theora codecs. Unfortunately, a number of technical shortcomings in the format render it ill-suited to most, if not all, use cases. This article examines the most severe of these flaws.

    Overview of Ogg

    The basic unit in an Ogg stream is the page, consisting of a header followed by one or more packets from a single elementary stream. A page can contain up to 255 packets, and a packet can span any number of pages. The following table describes the page header.

    Field                     Size (bits)   Description
    capture_pattern           32            magic number “OggS”
    version                   8             always zero
    flags                     8             packet continuation, BOS and EOS flags
    granule_position          64            abstract timestamp
    bitstream_serial_number   32            elementary stream number
    page_sequence_number      32            incremented by 1 each page
    checksum                  32            CRC of entire page
    page_segments             8             length of segment_table
    segment_table             variable      list of packet sizes

    Elementary stream types are identified by looking at the payload of the first few pages, which contain any setup data required by the decoders. For full details, see the official format specification.
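
    To make the layout above concrete, here is a minimal sketch, in Python, of how a demuxer might read one page header and recover the packet sizes from the segment table. It is illustrative only; a real demuxer must also verify the checksum and reassemble packets that continue across pages.

        import struct

        def read_page(f):
            """Read one Ogg page header from a binary file object."""
            fixed = f.read(27)                       # fixed part of the header
            if len(fixed) < 27:
                return None                          # end of stream
            (capture, version, flags, granule, serial,
             sequence, crc, n_segments) = struct.unpack('<4sBBqIIIB', fixed)
            assert capture == b'OggS' and version == 0

            # Each packet size is the sum of consecutive lacing values;
            # a value < 255 terminates the current packet.
            sizes, current = [], 0
            for lacing in f.read(n_segments):
                current += lacing
                if lacing < 255:
                    sizes.append(current)
                    current = 0
            # A non-zero remainder means the last packet continues on the
            # next page and must be reassembled by the caller.
            return (flags, granule, serial, sequence, crc), sizes, current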

    Generality

    Ogg, legend tells, was designed to be a general-purpose container format. To most multimedia developers, a general-purpose format is one in which encoded data of any type can be encapsulated with a minimum of effort.

    The Ogg format defined by the specification does not fit this description. For every format one wishes to use with Ogg, a complex mapping must first be defined. This mapping defines how to identify a codec, how to extract setup data, and even how timestamps are to be interpreted. All this is done differently for every codec. To correctly parse an Ogg stream, every such mapping ever defined must be known.

    Under this premise, a centralised repository of codec mappings would seem like a sensible idea, but alas, no such thing exists. It is simply impossible to obtain an exhaustive list of defined mappings, which makes the task of creating a complete implementation somewhat daunting.

    One brave soul, Tobias Waldvogel, created a mapping, OGM, capable of storing any Microsoft AVI compatible codec data in Ogg files. This format saw some use in the wild, but was frowned upon by Xiph, and it was eventually displaced by other formats.

    True generality is evidently not to be found with the Ogg format.

    A good example of a general-purpose format is Matroska. This container can trivially accommodate any codec; all it requires is a unique string to identify the codec. For codecs requiring setup data, a standard location for this is provided in the container. Furthermore, an official list of codec identifiers is maintained, meaning all information required to fully support Matroska files is available from one place.

    Matroska also has probably the greatest advantage of all: it is in active, widespread use. Historically, standards derived from existing practice have proven more successful than those created by a design committee.

    Overhead

    When designing a container format, one important consideration is that of overhead, i.e. the extra space required in addition to the elementary stream data being combined. For any given container, the overhead can be divided into a fixed part, independent of the total file size, and a variable part growing with increasing file size. The fixed overhead is not of much concern, its relative contribution being negligible for typical file sizes.

    The variable overhead in the Ogg format comes from the page headers, mostly from the segment_table field. This field uses a most peculiar encoding, somewhat reminiscent of Roman numerals. In Roman times, numbers were written as a sequence of symbols, each representing a value, the combined value being the sum of the constituent values.

    The segment_table field lists the sizes of all packets in the page. Each value in the list is coded as a number of bytes equal to 255 followed by a final byte with a smaller value. The packet size is simply the sum of all these bytes. Any strictly additive encoding, such as this, has the distinct drawback of coded length being linearly proportional to the encoded value. A value of 5000, a reasonable packet size for video of moderate bitrate, requires no less than 20 bytes to encode.
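
    A two-line Python illustration (mine, not from the specification) shows the cost of this additive encoding directly:

        def lacing_values(packet_size):
            """Encode a packet size as Ogg lacing values: as many 255 bytes
            as fit, then one final byte holding the remainder."""
            return [255] * (packet_size // 255) + [packet_size % 255]

        print(lacing_values(600))        # [255, 255, 90] -> 3 bytes
        print(len(lacing_values(5000)))  # 20 bytes to encode the value 5000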

    On top of this we have the 27-byte page header which, although paling in comparison to the packet size encoding, is still much larger than necessary. Starting at the top of the list:

    • The version field could be disposed of, a single-bit marker being adequate to separate this first version from hypothetical future versions. One of the unused positions in the flags field could be used for this purpose.
    • A 64-bit granule_position is completely overkill. 32 bits would be more than enough for the vast majority of use cases. In extreme cases, a one-bit flag could be used to signal an extended timestamp field.
    • 32-bit elementary stream number? Are they anticipating files with four billion elementary streams? An eight-bit field, if not smaller, would seem more appropriate here.
    • The 32-bit page_sequence_number is inexplicable. The intent is to allow detection of page loss due to transmission errors. ISO MPEG-TS uses a 4-bit counter per 188-byte packet for this purpose, and that format is used where packet loss actually happens, unlike any use of Ogg to date.
    • A mandatory 32-bit checksum is nothing but a waste of space when using a reliable storage/transmission medium. Again, a flag could be used to signal the presence of an optional checksum field.

    With the changes suggested above, the page header would shrink from 27 bytes to 12 bytes in size (for instance: a 4-byte capture pattern, 1-byte flags, 4-byte granule position, 1-byte stream number, 1-byte sequence counter and a 1-byte segment count).

    We thus see that in an Ogg file, the packet size fields alone contribute an overhead of 1/255 or approximately 0.4%. This is a hard lower bound on the overhead, not attainable even in theory. In reality the overhead tends to be closer to 1%.

    Contrast this with the ISO MP4 file format, which can easily achieve an overhead of less than 0.05% with a 1 Mbps elementary stream.

    Latency

    In many applications end-to-end latency is an important factor. Examples include video conferencing, telephony, live sports events, interactive gaming, etc. With the codec layer contributing as little as 10 milliseconds of latency, the amount imposed by the container becomes an important factor.

    Latency in an Ogg-based system is introduced at both the sender and the receiver. Since the page header depends on the entire contents of the page (packet sizes and checksum), a full page of packets must be buffered by the sender before a single bit can be transmitted. This sets a lower bound for the sending latency at the duration of a page.

    On the receiving side, playback cannot commence until packets from all elementary streams are available. Hence, with two streams (audio and video) interleaved at the page level, playback is delayed by at least one page duration (two if checksums are verified).

    Taking both send and receive latencies into account, the minimum end-to-end latency for Ogg is thus twice the duration of a page, triple if strict checksum verification is required. If page durations are variable, the maximum value must be used in order to avoid buffer underflows.

    Minimum latency is clearly achieved by minimising the page duration, which in turn implies sending only one packet per page. This is where the size of the page header becomes important. The header for a single-packet page is 27 + packet_size/255 bytes in size. For a 1 Mbps video stream at 25 fps this gives an overhead of approximately 1%. With a typical audio packet size of 400 bytes, the overhead becomes a staggering 7%. The average overhead for a multiplex of these two streams is 1.4%.
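
    These percentages are easy to verify; the following snippet (illustrative only) reproduces them for the single-packet-per-page case:

        def overhead(packet_size):
            """Relative overhead of a page carrying a single packet."""
            header = 27 + packet_size // 255 + 1   # fixed header + lacing bytes
            return header / packet_size

        video_packet = 1_000_000 // 8 // 25   # 1 Mbps at 25 fps -> 5000 bytes
        audio_packet = 400                    # typical audio packet size

        print(f"video: {overhead(video_packet):.1%}")  # ~0.9%, about 1%
        print(f"audio: {overhead(audio_packet):.1%}")  # ~7.2%, the 7% above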

    As it stands, the Ogg format is clearly not a good choice for a low-latency application. The key to low latency is small packets and fine-grained interleaving of streams, and although Ogg can provide both of these, by sending a single packet per page, the price in overhead is simply too high.

    ISO MPEG-PS has an overhead of 9 bytes on most packets (a 5-byte timestamp is added a few times per second), and Microsoft’s ASF has a 12-byte packet header. My suggestions for compacting the Ogg page header would bring it in line with these formats.

    Random access

    Any general-purpose container format needs to allow random access for direct seeking to any given position in the file. Despite this goal being explicitly mentioned in the Ogg specification, the format only allows the most crude of random access methods.

    While many container formats include an index allowing a time to be directly translated into an offset into the file, Ogg has nothing of this kind, the stated rationale for the omission being that it would require two-pass multiplexing, the second pass creating the index. This is obviously not true; the index could simply be written at the end of the file. Those objecting that this index would be unavailable in a streaming scenario are forgetting that seeking is impossible there regardless.

    The method for seeking suggested by the Ogg documentation is to perform a binary search on the file, after each file-level seek operation scanning for a page header, extracting the timestamp, and comparing it to the desired position. When the elementary stream encoding allows only certain packets as random access points (video key frames), a second search will have to be performed to locate the entry point closest to the desired time. In a large file (sizes upwards of 10 GB are common), 50 seeks might be required to find the correct position.

    A typical hard drive has an average seek time of roughly 10 ms, giving a total time for the seek operation of around 500 ms, an annoyingly long time. On a slow medium, such as an optical disc or files served over a network, the times are orders of magnitude longer.
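
    The figures are easy to reproduce with a rough estimate of the bisection arithmetic (the probe resolution here is an assumption on my part):

        import math

        def bisection_seeks(file_size, resolution=4096):
            """Seeks a binary search needs to narrow a position to one probe."""
            return math.ceil(math.log2(file_size / resolution))

        size = 10 * 1024**3                  # a 10 GB file
        seeks = 2 * bisection_seeks(size)    # doubled to then find a key frame
        print(seeks, "seeks,", seeks * 10, "ms at 10 ms per seek")

    This prints 44 seeks and 440 ms, in the neighbourhood of the 50 seeks and 500 ms cited above.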

    A factor further complicating the seeking process is the possibility of header emulation within the elementary stream data. To safeguard against this, one has to read the entire page and verify the checksum. If the storage medium cannot provide data much faster than during normal playback, this provides yet another substantial delay towards finishing the seeking operation. This too applies to both network delivery and optical discs.

    Although optical disc usage is perhaps in decline today, one should bear in mind that the Ogg format was designed at a time when CDs and DVDs were rapidly gaining ground, and network-based storage is most certainly on the rise.

    The final nail in the coffin of seeking is the codec-dependent timestamp format. At each step in the seeking process, the timestamp parsing specified by the codec mapping corresponding to the current page must be invoked. If the mapping is not known, the best one can do is skip pages until one with a known mapping is found. This delays the seeking and complicates the implementation, both bad things.

    Timestamps

    A problem as old as multimedia itself is that of synchronising multiple elementary streams (e.g. audio and video) during playback; badly synchronised A/V is highly unpleasant to view. By the time Ogg was invented, solutions to this problem were long since explored and well understood. The key to proper synchronisation lies in tagging elementary stream packets with timestamps, packets carrying the same timestamp being intended for simultaneous presentation. The concept is as simple as it seems, so it is astonishing to see the amount of complexity with which the Ogg designers managed to imbue it. So bizarre is it that I have devoted an entire article to the topic, and will not cover it further here.

    Complexity

    Video and audio decoding are time-consuming tasks, so containers should be designed to minimise extra processing required. With the data volumes involved, even an act as simple as copying a packet of compressed data can have a significant impact. Once again, however, Ogg lets us down. Despite the brevity of the specification, the format is remarkably complicated to parse properly.

    The unusual and inefficient encoding of the packet sizes limits the page size to somewhat less than 64 kB. To still allow individual packets larger than this limit, it was decided to allow packets spanning multiple pages, a decision with unfortunate implications. A page-spanning packet as it arrives in the Ogg stream will be discontiguous in memory, a situation most decoders are unable to handle, and reassembly, i.e. copying, is required.
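
    Continuing the earlier page-reading sketch (still illustrative), the reassembly amounts to exactly the copying described above. Each parsed page is assumed here to be a (sizes, payload, spans_next) tuple, where a true spans_next flags that the last size is only the start of a packet.

        def packets(pages):
            """Yield whole packets, copying any packet that spans pages."""
            pending = b''
            for sizes, payload, spans_next in pages:
                offset = 0
                for i, size in enumerate(sizes):
                    chunk = payload[offset:offset + size]
                    offset += size
                    if spans_next and i == len(sizes) - 1:
                        pending += chunk     # the copy Ogg forces on us
                    else:
                        yield pending + chunk
                        pending = b''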

    The knowledgeable reader may at this point remark that the MPEG-TS format also splits packets into pieces requiring reassembly before decoding. There is, however, a significant difference there. MPEG-TS was designed for hardware demultiplexing feeding directly into hardware decoders. In such an implementation the fragmentation is not a problem. Rather, the fine-grained interleaving is a feature allowing smaller on-chip buffers.

    Buffering is also an area in which Ogg suffers. To keep the overhead down, pages must be made as large as practically possible, and page size translates directly into demultiplexer buffer size. Playback of a file with two elementary streams thus requires 128 kB of buffer space. On a modern PC this is perhaps nothing to be concerned about, but in a small embedded system, e.g. a portable media player, it can be relevant.

    In addition to the above, a number of other issues, some of them minor, others more severe, make Ogg processing a painful experience. A selection follows:

    • 32-bit random elementary stream identifiers mean a simple table-lookup cannot be used. Instead the list of streams must be searched for a match. While trivial to do in software, it is still annoying, and a hardware demultiplexer would be significantly more complicated than with a smaller identifier.
    • Semantically ambiguous streams are possible. For example, the continuation flag (bit 1) may conflict with continuation (or lack thereof) implied by the segment table on the preceding page. Such invalid files have been spotted in the wild.
    • Concatenating independent Ogg streams forms a valid stream. While finding a use case for this strange feature is difficult, an implementation must of course be prepared to encounter such streams. Detecting and dealing with these adds pointless complexity.
    • Unusual terminology: inventing new terms for well-known concepts is confusing for the developer trying to understand the format in relation to others. A few examples:
      Ogg name            Usual name
      logical bitstream   elementary stream
      grouping            multiplexing
      lacing value        packet size (approximately)
      segment             imaginary element serving no real purpose
      granule position    timestamp

    Final words

    We have found the Ogg format to be a dubious choice in just about every situation. Why then do certain organisations and individuals persist in promoting it with such ferocity?

    When challenged, three types of reaction are characteristic of the Ogg campaigners.

    On occasion, these people will assume an apologetic tone, explaining how Ogg was only ever designed for simple audio-only streams (ignoring that it is as bad for these as for anything), and this is no doubt true. Why then, I ask again, do they continue to tout Ogg as the one-size-fits-all solution they have already admitted it is not?

    More commonly, the Ogg proponents will respond with hand-waving arguments best summarised as “Ogg isn’t bad, it’s just different”. My reply to this assertion is twofold:

    • Being too different is bad. We live in a world where multimedia files come in many varieties, and a decent media player will need to handle the majority of them. Fortunately, most multimedia file formats share some basic traits, and they can easily be processed in the same general framework, the specifics being taken care of at the input stage. A format deviating too far from the standard model becomes problematic.
    • Ogg is bad. When every angle of examination reveals serious flaws, bad is the only fitting description.

    The third reaction bypasses all technical analysis: “Ogg is patent-free”, a claim I am not qualified to directly discuss. Assuming it is true, it still does not alter the fact that Ogg is a bad format. Being free from patents does not magically make Ogg a good choice as a file format. If all the standard formats are indeed covered by patents, the only proper solution is to design a new, good format which is not, this time hopefully avoiding the old mistakes.