Recherche avancée

Médias (91)

Autres articles (28)

  • MediaSPIP v0.2

    21 juin 2013, par

    MediaSPIP 0.2 is the first MediaSPIP stable release.
    Its official release date is June 21, 2013 and is announced here.
    The zip file provided here only contains the sources of MediaSPIP in its standalone version.
    To get a working installation, you must manually install all-software dependencies on the server.
    If you want to use this archive for an installation in "farm mode", you will also need to proceed to other manual (...)

  • HTML5 audio and video support

    13 avril 2011, par

    MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
    The MediaSPIP player used has been created specifically for MediaSPIP and can be easily adapted to fit in with a specific theme.
    For older browsers the Flowplayer flash fallback is used.
    MediaSPIP allows for media playback on major mobile platforms with the above (...)

  • De l’upload à la vidéo finale [version standalone]

    31 janvier 2010, par

    Le chemin d’un document audio ou vidéo dans SPIPMotion est divisé en trois étapes distinctes.
    Upload et récupération d’informations de la vidéo source
    Dans un premier temps, il est nécessaire de créer un article SPIP et de lui joindre le document vidéo "source".
    Au moment où ce document est joint à l’article, deux actions supplémentaires au comportement normal sont exécutées : La récupération des informations techniques des flux audio et video du fichier ; La génération d’une vignette : extraction d’une (...)

Sur d’autres sites (4072)

  • FFmpeg overlay positioning issue : Converting frontend center coordinates to FFmpeg top-left coordinates

    25 janvier, par tarun

    I'm building a web-based video editor where users can :

    


    Add multiple videos
Add images
Add text overlays with background color

    


    Frontend sends coordinates where each element's (x,y) represents its center position.
on click of the export button i want all data to be exported as one final video
on click i send the data to the backend like -

    


     const exportAllVideos = async () => {
    try {
      const formData = new FormData();
        
      
      const normalizedVideos = videos.map(video => ({
          ...video,
          startTime: parseFloat(video.startTime),
          endTime: parseFloat(video.endTime),
          duration: parseFloat(video.duration)
      })).sort((a, b) => a.startTime - b.startTime);

      
      for (const video of normalizedVideos) {
          const response = await fetch(video.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
          formData.append("videos", file);
      }

      
      const normalizedImages = images.map(image => ({
          ...image,
          startTime: parseFloat(image.startTime),
          endTime: parseFloat(image.endTime),
          x: parseInt(image.x),
          y: parseInt(image.y),
          width: parseInt(image.width),
          height: parseInt(image.height),
          opacity: parseInt(image.opacity)
      }));

      
      for (const image of normalizedImages) {
          const response = await fetch(image.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
          formData.append("images", file);
      }

      
      const normalizedTexts = texts.map(text => ({
          ...text,
          startTime: parseFloat(text.startTime),
          endTime: parseFloat(text.endTime),
          x: parseInt(text.x),
          y: parseInt(text.y),
          fontSize: parseInt(text.fontSize),
          opacity: parseInt(text.opacity)
      }));

      
      formData.append("metadata", JSON.stringify({
          videos: normalizedVideos,
          images: normalizedImages,
          texts: normalizedTexts
      }));

      const response = await fetch("my_flask_endpoint", {
          method: "POST",
          body: formData
      });

      if (!response.ok) {
        
          console.log('wtf', response);
          
      }

      const finalVideo = await response.blob();
      const url = URL.createObjectURL(finalVideo);
      const a = document.createElement("a");
      a.href = url;
      a.download = "final_video.mp4";
      a.click();
      URL.revokeObjectURL(url);

    } catch (e) {
      console.log(e, "err");
    }
  };


    


    the frontend data for each object that is text image and video we are storing it as an array of objects below is the Data strcutre for each object -

    


    // the frontend data for each
  const newVideo = {
      id: uuidv4(),
      src: URL.createObjectURL(videoData.videoBlob),
      originalDuration: videoData.duration,
      duration: videoData.duration,
      startTime: 0,
      playbackOffset: 0,
      endTime: videoData.endTime || videoData.duration,
      isPlaying: false,
      isDragging: false,
      speed: 1,
      volume: 100,
      x: window.innerHeight / 2,
      y: window.innerHeight / 2,
      width: videoData.width,
      height: videoData.height,
    };
    const newTextObject = {
      id: uuidv4(),
      description: text,
      opacity: 100,
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      fontSize: 18,
      duration: 20,
      endTime: 20,
      startTime: 0,
      color: "#ffffff",
      backgroundColor: hasBG,
      padding: 8,
      fontWeight: "normal",
      width: 200,
      height: 40,
    };

    const newImage = {
      id: uuidv4(),
      src: URL.createObjectURL(imageData),
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      width: 200,
      height: 200,
      borderRadius: 0,
      startTime: 0,
      endTime: 20,
      duration: 20,
      opacity: 100,
    };



    


    BACKEND CODE -

    


    import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
CORS(app, resources={r"/*": {"origins": "*"}})



UPLOAD_FOLDER = 'temp_uploads'
if not os.path.exists(UPLOAD_FOLDER):
    os.makedirs(UPLOAD_FOLDER)


@app.route('/')
def home():
    return 'Hello World'


OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080



@app.route('/process', methods=['POST'])
def process_video():
    work_dir = None
    try:
        work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
        os.makedirs(work_dir)
        print(f"Created working directory: {work_dir}")

        metadata = json.loads(request.form['metadata'])
        print("Received metadata:", json.dumps(metadata, indent=2))
        
        video_paths = []
        videos = request.files.getlist('videos')
        for idx, video in enumerate(videos):
            filename = f"video_{idx}.mp4"
            filepath = os.path.join(work_dir, filename)
            video.save(filepath)
            if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
                video_paths.append(filepath)
                print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
            else:
                raise Exception(f"Failed to save video {idx}")

        image_paths = []
        images = request.files.getlist('images')
        for idx, image in enumerate(images):
            filename = f"image_{idx}.png"
            filepath = os.path.join(work_dir, filename)
            image.save(filepath)
            if os.path.exists(filepath):
                image_paths.append(filepath)
                print(f"Saved image to: {filepath}")

        output_path = os.path.join(work_dir, 'output.mp4')

        filter_parts = []

        base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
        filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

        for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
            x_pos = int(meta.get("x", 0) - (meta.get("width", 0) / 2))
            y_pos = int(meta.get("y", 0) - (meta.get("height", 0) / 2))
            
            filter_parts.extend([
                f'[{idx}:v]setpts=PTS-STARTPTS,scale={meta.get("width", -1)}:{meta.get("height", -1)}[v{idx}];',
                f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
            ])

            if idx == 0:
                filter_parts.append(
                    f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
                )
            else:
                filter_parts.append(
                    f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
                    f'[temp{idx}];'
                )

        last_video_temp = f'temp{len(video_paths)-1}'

        if video_paths:
            audio_mix_parts = []
            for idx in range(len(video_paths)):
                audio_mix_parts.append(f'[a{idx}]')
            filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

        
        if image_paths:
            for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
                input_idx = len(video_paths) + idx
                
                
                x_pos = int(img_meta["x"] - (img_meta["width"] / 2))
                y_pos = int(img_meta["y"] - (img_meta["height"] / 2))
                
                filter_parts.extend([
                    f'[{input_idx}:v]scale={img_meta["width"]}:{img_meta["height"]}[img{idx}];',
                    f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
                    f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
                ])
                last_video_temp = f'imgout{idx}'

        if metadata.get('texts'):
            for idx, text in enumerate(metadata['texts']):
                next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'
                
                escaped_text = text["description"].replace("'", "\\'")
                
                x_pos = int(text["x"] - (text["width"] / 2))
                y_pos = int(text["y"] - (text["height"] / 2))
                
                text_filter = (
                    f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
                    f'x={x_pos}:y={y_pos}:'
                    f'fontsize={text["fontSize"]}:'
                    f'fontcolor={text["color"]}'
                )
                
                if text.get('backgroundColor'):
                    text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'
                
                if text.get('fontWeight') == 'bold':
                    text_filter += ':font=Arial-Bold'
                
                text_filter += (
                    f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
                    f'[{next_output}];'
                )
                
                filter_parts.append(text_filter)
                last_video_temp = next_output
        else:
            filter_parts.append(f'[{last_video_temp}]null[vout];')

        
        filter_complex = ''.join(filter_parts)

        
        cmd = [
            'ffmpeg',
            *sum([['-i', path] for path in video_paths], []),
            *sum([['-i', path] for path in image_paths], []),
            '-filter_complex', filter_complex,
            '-map', '[vout]'
        ]
        
        
        if video_paths:
            cmd.extend(['-map', '[aout]'])
        
        cmd.extend(['-y', output_path])

        print(f"Running ffmpeg command: {' '.join(cmd)}")
        result = subprocess.run(cmd, capture_output=True, text=True)
        
        if result.returncode != 0:
            print(f"FFmpeg error output: {result.stderr}")
            raise Exception(f"FFmpeg processing failed: {result.stderr}")

        return send_file(
            output_path,
            mimetype='video/mp4',
            as_attachment=True,
            download_name='final_video.mp4'
        )

    except Exception as e:
        print(f"Error in video processing: {str(e)}")
        return {'error': str(e)}, 500
    
    finally:
        if work_dir and os.path.exists(work_dir):
            try:
                print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
                if not os.environ.get('FLASK_DEBUG'):
                    shutil.rmtree(work_dir)
                else:
                    print(f"Keeping directory for debugging: {work_dir}")
            except Exception as e:
                print(f"Cleanup error: {str(e)}")

                
if __name__ == '__main__':
    app.run(debug=True, port=8000)



    


    I'm also attaching what the final thing looks like on the frontend web vs in the downloaded video
and as u can see the downloaded video has all coords and positions messed up be it of the texts, images as well as videosdownloaded videos view
frontend web view

    


    can somebody please help me figure this out :)

    


  • Data Privacy in Business : A Risk Leading to Major Opportunities

    9 août 2022, par Erin — Privacy

    Data privacy in business is a contentious issue. 

    Claims that “big data is the new oil of the digital economy” and strong links between “data-driven personalisation and customer experience” encourage leaders to set up massive data collection programmes.

    However, many of these conversations downplay the magnitude of security, compliance and ethical risks companies face when betting too much on customer data collection. 

    In this post, we discuss the double-edged nature of privacy issues in business — the risk-ridden and the opportunity-driven. ​​

    3 Major Risks of Ignoring Data Privacy in Business

    As the old adage goes : Just because everyone else is doing it doesn’t make it right.

    Easy data accessibility and ubiquity of analytics tools make data consumer collection and processing sound like a “given”. But the decision to do so opens your business to a spectrum of risks. 

    1. Compliance and Legal Risks 

    Data collection and customer privacy are protected by a host of international laws including GDPR, CCPA, and regional regulations. Only 15% of countries (mostly developing ones) don’t have dedicated laws for protecting consumer privacy. 

    State of global data protection legislature via The UN

    Global legislature includes provisions on : 

    • Collectible data types
    • Allowed uses of obtained data 
    • Consent to data collection and online tracking 
    • Rights to request data removal 

    Personally identifiable information (PII) processing is prohibited or strictly regulated in most jurisdictions. Yet businesses repeatedly circumnavigate existing rules and break them on occasion.

    In Australia, for example, only 2% of brands use logos, icons or messages to transparently call out online tracking, data sharing or other specific uses of data at the sign-up stage. In Europe, around half of small businesses are still not fully GDPR-compliant — and Big Tech companies like Google, Amazon and Facebook can’t get a grip on their data collection practices even when pressed with horrendous fines. 

    Although the media mostly reports on compliance fines for “big names”, smaller businesses are increasingly receiving more scrutiny. 

    As Max Schrems, an Austrian privacy activist and founder of noyb NGO, explained in a Matomo webinar :

    “In Austria, my home country, there are a lot of €5,000 fines going out there as well [to smaller businesses]. Most of the time, they are just not reported. They just happen below the surface. [GDPR fines] are already a reality.”​

    In April 2022, the EU Court of Justice ruled that consumer groups can autonomously sue businesses for breaches of data protection — and nonprofit organisations like noyb enable more people to do so. 

    Finally, new data privacy legislation is underway across the globe. In the US, Colorado, Connecticut, Virginia and Utah have data protection acts at different stages of approval. South African authorities are working on the Protection of Personal Information Act (POPI) act and Brazil is working on a local General Data Protection Law (LGPD).

    Re-thinking your stance on user privacy and data protection now can significantly reduce the compliance burden in the future. 

    2. Security Risks 

    Data collection also mandates data protection for businesses. Yet, many organisations focus on the former and forget about the latter. 

    Lenient attitudes to consumer data protection resulted in a major spike in data breaches.

    Check Point research found that cyberattacks increased 50% year-over-year, with each organisation facing 925 cyberattacks per week globally.

    Many of these attacks end up being successful due to poor data security in place. As a result, billions of stolen consumer records become publicly available or get sold on dark web marketplaces.

    What’s even more troublesome is that stolen consumer records are often purchased by marketing firms or companies, specialising in spam campaigns. Buyers can also use stolen emails to distribute malware, stage phishing and other social engineering attacks – and harvest even more data for sale. 

    One business’s negligence creates a snowball effect of negative changes down the line with customers carrying the brunt of it all. 

    In 2020, hackers successfully targeted a Finnish psychotherapy practice. They managed to steal hundreds of patient records — and then demanded a ransom both from the firm and its patients for not exposing information about their mental health issues. Many patients refused to pay hackers and some 300 records ended up being posted online as Associated Press reported.

    Not only did the practice have to deal with the cyber-breach aftermath, but it also faced vocal regulatory and patient criticisms for failing to properly protect such sensitive information.

    Security negligence can carry both direct (heavy data breach fines) and indirect losses in the form of reputational damages. An overwhelming 90% of consumers say they wouldn’t buy from a business if it doesn’t adequately protect their data. This brings us to the last point. 

    3. Reputational Risks 

    Trust is the new currency. Data negligence and consumer privacy violations are the two fastest ways to lose it. 

    Globally, consumers are concerned about how businesses collect, use, and protect their data. 

    Consumer data sharing attitudes
    • According to Forrester, 47% of UK adults actively limit the amount of data they share with websites and apps. 49% of Italians express willingness to ask companies to delete their personal data. 36% of Germans use privacy and security tools to minimise online tracking of their activities. 
    • A GDMA survey also notes that globally, 82% of consumers want more control over their personal information, shared with companies. 77% also expect brands to be transparent about how their data is collected and used. 

    When businesses fail to hold their end of the bargain — collect just the right amount of data and use it with integrity — consumers are fast to cut ties. 

    Once the information about privacy violations becomes public, companies lose : 

    • Brand equity 
    • Market share 
    • Competitive positioning 

    An AON report estimates that post-data breach companies can lose as much as 25% of their initial value. In some cases, the losses can be even higher. 

    In 2015, British telecom TalkTalk suffered from a major data breach. Over 150,000 customer records were stolen by hackers. To contain the issue, TalkTalk had to throw between $60-$70 million into containment efforts. Still, they lost over 100,000 customers in a matter of months and one-third of their company value, equivalent to $1.4 billion, by the end of the year. 

    Fresher data from Infosys gives the following maximum cost estimates of brand damage, companies could experience after a data breach (accidental or malicious).

    Estimated cost of brand damage due to a data breach

    3 Major Advantages of Privacy in Business 

    Despite all the industry mishaps, a reassuring 77% of CEOs now recognise that their companies must fundamentally change their approaches to customer engagement, in particular when it comes to ensuring data privacy. 

    Many organisations take proactive steps to cultivate a privacy-centred culture and implement transparent data collection policies. 

    Here’s why gaining the “privacy advantage” pays off.

    1. Market Competitiveness 

    There’s a reason why privacy-focused companies are booming. 

    Consumers’ mounting concerns and frustrations over the lack of online privacy, prompt many to look for alternative privacy-centred products and services

    The following B2C and B2B products are moving from the industry margins to the mainstream : 

    Across the board, consumers express greater trust towards companies, protective of their privacy : 

    And as we well know : trust translates to higher engagement, loyalty, and – ultimately revenue. 

    By embedding privacy into the core of your product, you give users more reasons to select, stay and support your business. 

    2. Higher Operational Efficiency

    Customer data protection isn’t just a policy – it’s a culture of collecting “just enough” data, protecting it and using it responsibly. 

    Sadly, that’s the area where most organisations trail behind. At present, some 90% of businesses admit to having amassed massive data silos. 

    Siloed data is expensive to maintain and operationalise. Moreover, when left unattended, it can evolve into a pressing compliance issue. 

    A recently leaked document from Facebook says the company has no idea where all of its first-party, third-party and sensitive categories data goes or how it is processed. Because of this, Facebook struggles to achieve GDPR compliance and remains under regulatory pressure. 

    Similarly, Google Analytics is riddled with privacy issues. Other company products were found to be collecting and operationalising consumer data without users’ knowledge or consent. Again, this creates valid grounds for regulatory investigations. 

    Smaller companies have a better chance of making things right at the onset. 

    By curbing customer data collection, you can : 

    • Reduce data hosting and Cloud computation costs (aka trim your Cloud bill) 
    • Improve data security practices (since you would have fewer assets to protect) 
    • Make your staff more productive by consolidating essential data and making it easy and safe to access

    Privacy-mindful companies also have an easier time when it comes to compliance and can meet new data regulations faster. 

    3. Better Marketing Campaigns 

    The biggest counter-argument to reducing customer data collection is marketing. 

    How can we effectively sell our products if we know nothing about our customers ? – your team might be asking. 

    This might sound counterintuitive, but minimising data collection and usage can lead to better marketing outcomes. 

    Limiting the types of data that can be used encourages your people to become more creative and productive by focusing on fewer metrics that are more important.

    Think of it this way : Every other business uses the same targeting parameters on Facebook or Google for running paid ad campaigns on Facebook. As a result, we see ads everywhere — and people grow unresponsive to them or choose to limit exposure by using ad blocking software, private browsers and VPNs. Your ad budgets get wasted on chasing mirage metrics instead of actual prospects. 

    Case in point : In 2017 Marc Pritchard of Procter & Gamble decided to first cut the company’s digital advertising budget by 6% (or $200 million). Unilever made an even bolder move and reduced its ad budget by 30% in 2018. 

    Guess what happened ?

    P&G saw a 7.5% increase in organic sales and Unilever had a 3.8% gain as HBR reports. So how come both companies became more successful by spending less on advertising ? 

    They found that overexposure to online ads led to diminishing returns and annoyances among loyal customers. By minimising ad exposure and adopting alternative marketing strategies, the two companies managed to market better to new and existing customers. 

    The takeaway : There are more ways to engage consumers aside from pestering them with repetitive retargeting messages or creepy personalisation. 

    You can collect first-party data with consent to incrementally improve your product — and educate them on the benefits of your solution in transparent terms.

    Final Thoughts 

    The definitive advantage of privacy is consumers’ trust. 

    You can’t buy it, you can’t fake it, you can only cultivate it by aligning your external appearances with internal practices. 

    Because when you fail to address privacy internally, your mishaps will quickly become apparent either as social media call-outs or worse — as a security incident, a data breach or a legal investigation. 

    By choosing to treat consumer data with respect, you build an extra layer of protection around your business, plus draw in some banging benefits too. 

    Get one step closer to becoming a privacy-centred company by choosing Matomo as your web analytics solution. We offer robust privacy controls for ensuring ethical, compliant, privacy-friendly and secure website tracking. 

  • The problems with wavelets

    27 février 2010, par Dark Shikari — DCT, Dirac, Snow, psychovisual optimizations, wavelets

    I have periodically noted in this blog and elsewhere various problems with wavelet compression, but many readers have requested that I write a more detailed post about it, so here it is.

    Wavelets have been researched for quite some time as a replacement for the standard discrete cosine transform used in most modern video compression. Their methodology is basically opposite : each coefficient in a DCT represents a constant pattern applied to the whole block, while each coefficient in a wavelet transform represents a single, localized pattern applied to a section of the block. Accordingly, wavelet transforms are usually very large with the intention of taking advantage of large-scale redundancy in an image. DCTs are usually quite small and are intended to cover areas of roughly uniform patterns and complexity.

    Both are complete transforms, offering equally accurate frequency-domain representations of pixel data. I won’t go into the mathematical details of each here ; the real question is whether one offers better compression opportunities for real-world video.

    DCT transforms, though it isn’t mathematically required, are usually found as block transforms, handling a single sharp-edged block of data. Accordingly, they usually need a deblocking filter to smooth the edges between DCT blocks. Wavelet transforms typically overlap, avoiding such a need. But because wavelets don’t cover a sharp-edged block of data, they don’t compress well when the predicted data is in the form of blocks.

    Thus motion compensation is usually performed as overlapped-block motion compensation (OBMC), in which every pixel is calculated by performing the motion compensation of a number of blocks and averaging the result based on the distance of those blocks from the current pixel. Another option, which can be combined with OBMC, is “mesh MC“, where every pixel gets its own motion vector, which is a weighted average of the closest nearby motion vectors. The end result of either is the elimination of sharp edges between blocks and better prediction, at the cost of greatly increased CPU requirements. For an overlap factor of 2, it’s 4 times the amount of motion compensation, plus the averaging step. With mesh MC, it’s even worse, with SIMD optimizations becoming nearly impossible.

    At this point, it would seem wavelets would have pretty big advantages : when used with OBMC, they have better inter prediction, eliminate the need for deblocking, and take advantage of larger-scale correlations. Why then hasn’t everyone switched over to wavelets then ? Dirac and Snow offer modern implementations. Yet despite decades of research, wavelets have consistently disappointed for image and video compression. It turns out there are a lot of serious practical issues with wavelets, many of which are open problems.

    1. No known method exists for efficient intra coding. H.264′s spatial intra prediction is extraordinarily powerful, but relies on knowing the exact decoded pixels to the top and left of the current block. Since there is no such boundary in overlapped-wavelet coding, such prediction is impossible. Newer intra prediction methods, such as markov-chain intra prediction, also seem to require an H.264-like situation with exactly-known neighboring pixels. Intra coding in wavelets is in the same state that DCT intra coding was in 20 years ago : the best known method was to simply transform the block with no prediction at all besides DC. NB : as described by Pengvado in the comments, the switching between inter and intra coding is potentially even more costly than the inefficient intra coding.

    2. Mixing partition sizes has serious practical problems. Because the overlap between two motion partitions depends on the partitions’ size, mixing block sizes becomes quite difficult to define. While in H.264 an smaller partition always gives equal or better compression than a larger one when one ignores the extra overhead, it is actually possible for a larger partition to win when using OBMC due to the larger overlap. All of this makes both the problem of defining the result of mixed block sizes and making decisions about them very difficult.

    Both Snow and Dirac offer variable block size, but the overlap amount is constant ; larger blocks serve only to save bits on motion vectors, not offer better overlap characteristics.

    3. Lack of spatial adaptive quantization. As shown in x264 with VAQ, and correspondingly in HCEnc’s implementation and Theora’s recent implementation, spatial adaptive quantization has staggeringly impressive (before, after) effects on visual quality. Only Dirac seems to have such a feature, and the encoder doesn’t even use it. No other wavelet formats (Snow, JPEG2K, etc) seem to have such a feature. This results in serious blurring problems in areas with subtle texture (as in the comparison below).

    4. Wavelets don’t seem to code visual energy effectively. Remember that a single coefficient in a DCT represents a pattern which applies across an entire block : this makes it very easy to create apparent “detail” with a DCT. Furthermore, the sharp edges of DCT blocks, despite being an apparent weakness, often result in a “fake sharpness” that can actually improve the visual appearance of videos, as was seen with Xvid. Thus wavelet codecs have a tendency to look much blurrier than DCT-based codecs, but since PSNR likes blur, this is often seen as a benefit during video compression research. Some of the consequences of these factors can be seen in this comparison ; somewhat outdated and not general-case, but which very effectively shows the difference in how wavelets handle sharp edges and subtle textures.

    Another problem that periodically crops up is the visual aliasing that tends to be associated with wavelets at lower bitrates. Standard wavelets effectively consist of a recursive function that upscales the coefficients coded by the previous level by a factor of 2 and then adds a new set of coefficients. If the upscaling algorithm is naive — as it often is, for the sake of speed — the result can look quite ugly, as if parts of the image were coded at a lower resolution and then badly scaled up. Of course, it looks like that because they were coded at a lower resolution and then badly scaled up.

    JPEG2000 is a classic example of wavelet failure : despite having more advanced entropy coding, being designed much later than JPEG, being much more computationally intensive, and having much better PSNR, comparisons have consistently shown it to be visually worse than JPEG at sane filesizes. Here’s an example from Wikipedia. By comparison, H.264′s intra coding, when used for still image compression, can beat JPEG by a factor of 2 or more (I’ll make a post on this later). With the various advancements in DCT intra coding since H.264, I suspect that a state-of-the-art DCT compressor could win by an even larger factor.

    Despite the promised benefits of wavelets, a wavelet encoder even close to competitive with x264 has yet to be created. With some tests even showing Dirac losing to Theora in visual comparisons, it’s clear that many problems remain to be solved before wavelets can eliminate the ugliness of block-based transforms once and for all.