Recherche avancée

Médias (91)

Autres articles (71)

  • Des sites réalisés avec MediaSPIP

    2 mai 2011, par

    Cette page présente quelques-uns des sites fonctionnant sous MediaSPIP.
    Vous pouvez bien entendu ajouter le votre grâce au formulaire en bas de page.

  • Use, discuss, criticize

    13 avril 2011, par

    Talk to people directly involved in MediaSPIP’s development, or to people around you who could use MediaSPIP to share, enhance or develop their creative projects.
    The bigger the community, the more MediaSPIP’s potential will be explored and the faster the software will evolve.
    A discussion list is available for all exchanges between users.

  • Contribute to a better visual interface

    13 avril 2011

    MediaSPIP is based on a system of themes and templates. Templates define the placement of information on the page, and can be adapted to a wide range of uses. Themes define the overall graphic appearance of the site.
    Anyone can submit a new graphic theme or template and make it available to the MediaSPIP community.

Sur d’autres sites (6692)

  • Adventures in Unicode

    29 novembre 2012, par Multimedia Mike — Programming, php, Python, sqlite3, unicode

    Tangential to multimedia hacking is proper metadata handling. Recently, I have gathered an interest in processing a large corpus of multimedia files which are likely to contain metadata strings which do not fall into the lower ASCII set. This is significant because the lower ASCII set intersects perfectly with my own programming comfort zone. Indeed, all of my programming life, I have insisted on covering my ears and loudly asserting “LA LA LA LA LA ! ALL TEXT EVERYWHERE IS ASCII !” I suspect I’m not alone in this.

    Thus, I took this as an opportunity to conquer my longstanding fear of Unicode. I developed a self-learning course comprised of a series of exercises which add up to this diagram :



    Part 1 : Understanding Text Encoding
    Python has regular strings by default and then it has Unicode strings. The latter are prefixed by the letter ‘u’. This is what ‘ö’ looks like encoded in each type.

    1. >>> ’ö’, u’ö’
    2. (\xc3\xb6’, u\xf6’)

    A large part of my frustration with Unicode comes from Python yelling at me about UnicodeDecodeErrors and an inability to handle the number 0xc3 for some reason. This usually comes when I’m trying to wrap my head around an unrelated problem and don’t care to get sidetracked by text encoding issues. However, when I studied the above output, I finally understood where the 0xc3 comes from. I just didn’t understand what the encoding represents exactly.

    I can see from assorted tables that ‘ö’ is character 0xF6 in various encodings (in Unicode and Latin-1), so u’\xf6′ makes sense. But what does ‘\xc3\xb6′ mean ? It’s my style to excavate straight down to the lowest levels, and I wanted to understand exactly how characters are represented in memory. The UTF-8 encoding tables inform us that any Unicode code point above 0x7F but less than 0×800 will be encoded with 2 bytes :

     110xxxxx 10xxxxxx
    

    Applying this pattern to the \xc3\xb6 encoding :

                hex : 0xc3      0xb6
               bits : 11000011  10110110
     important bits : ---00011  —110110
          assembled : 00011110110
         code point : 0xf6
    

    I was elated when I drew that out and made the connection. Maybe I’m the last programmer to figure this stuff out. But I’m still happy that I actually understand those Python errors pertaining to the number 0xc3 and that I won’t have to apply canned solutions without understanding the core problem.

    I’m cheating on this part of this exercise just a little bit since the diagram implied that the Unicode text needs to come from a binary file. I’ll return to that in a bit. For now, I’ll just contrive the following Unicode string from the Python REPL :

    1. >>> u = u’Üñìçôđé’
    2. >>> u
    3. u\xdc\xf1\xec\xe7\xf4\u0111\xe9’

    Part 2 : From Python To SQLite3
    The next step is to see what happens when I use Python’s SQLite3 module to dump the string into a new database. Will the Unicode encoding be preserved on disk ? What will UTF-8 look like on disk anyway ?

    1. >>> import sqlite3
    2. >>> conn = sqlite3.connect(’unicode.db’)
    3. >>> conn.execute("CREATE TABLE t (t text)")
    4. >>> conn.execute("INSERT INTO t VALUES (?)", (u, ))
    5. >>> conn.commit()
    6. >>> conn.close()

    Next, I manually view the resulting database file (unicode.db) using a hex editor and look for strings. Here we go :

    000007F0   02 29 C3 9C  C3 B1 C3 AC  C3 A7 C3 B4  C4 91 C3 A9
    

    Look at that ! It’s just like the \xc3\xf6 encoding we see in the regular Python strings.

    Part 3 : From SQLite3 To A Web Page Via PHP
    Finally, use PHP (love it or hate it, but it’s what’s most convenient on my hosting provider) to query the string from the database and display it on a web page, completing the outlined processing pipeline.

    1. < ?php
    2. $dbh = new PDO("sqlite:unicode.db") ;
    3. foreach ($dbh->query("SELECT t from t") as $row) ;
    4. $unicode_string = $row[’t’] ;
    5.  ?>
    6.  
    7. <html>
    8. <head><meta http-equiv="Content-Type" content="text/html ; charset=utf-8"></meta></head>
    9. <body><h1>< ?=$unicode_string ?></h1></body>
    10. </html>

    I tested the foregoing PHP script on 3 separate browsers that I had handy (Firefox, Internet Explorer, and Chrome) :



    I’d say that counts as success ! It’s important to note that the “meta http-equiv” tag is absolutely necessary. Omit and see something like this :



    Since we know what the UTF-8 stream looks like, it’s pretty obvious how the mapping is operating here : 0xc3 and 0xc4 correspond to ‘Ã’ and ‘Ä’, respectively. This corresponds to an encoding named ISO/IEC 8859-1, a.k.a. Latin-1. Speaking of which…

    Part 4 : Converting Binary Data To Unicode
    At the start of the experiment, I was trying to extract metadata strings from these binary multimedia files and I noticed characters like our friend ‘ö’ from above. In the bytestream, this was represented simply with 0xf6. I mistakenly believed that this was the on-disk representation of UTF-8. Wrong. Turns out it’s Latin-1.

    However, I still need to solve the problem of transforming such strings into Unicode to be shoved through the pipeline diagrammed above. For this experiment, I created a 9-byte file with the Latin-1 string ‘Üñìçôdé’ couched by 0′s, to simulate yanking a string out of a binary file. Here’s unicode.file :

    00000000   00 DC F1 EC  E7 F4 64 E9  00         ......d..
    

    (Aside : this experiment uses plain ‘d’ since the ‘đ’ with a bar through it doesn’t occur in Latin-1 ; shows up all over the place in Vietnamese, at least.)

    I’ve been mashing around Python code via the REPL, trying to get this string into a Unicode-friendly format. This is a successful method but it’s probably not the best :

    1. >>> import struct
    2. >>> f = open(’unicode.file’, ’r’).read()
    3. >>> u = u’’
    4. >>> for c in struct.unpack("B"*7, f[1 :8]) :
    5. ... u += unichr(c)
    6. ...
    7. >>> u
    8. u\xdc\xf1\xec\xe7\xf4d\xe9’
    9. >>> print u
    10. Üñìçôdé

    Conclusion
    Dealing with text encoding matters reminds me of dealing with integer endian-ness concerns. When you’re just dealing with one system, you probably don’t need to think too much about it because the system is usually handling everything consistently underneath the covers.

    However, when the data leaves one system and will be interpreted by another system, that’s when a programmer needs to be cognizant of matters such as integer endianness or text encoding.

  • Vedanti and Max Sound vs. Google

    14 août 2014, par Multimedia Mike — Legal/Ethical

    Vedanti Systems Limited (VSL) and Max Sound Coporation filed a lawsuit against Google recently. Ordinarily, I wouldn’t care about corporate legal battles. However, this one interests me because it’s multimedia-related. I’m curious to know how coding technology patents might hold up in a real court case.

    Here’s the most entertaining complaint in the lawsuit :

    Despite Google’s well-publicized Code of Conduct — “Don’t be Evil” — which it explains is “about doing the right thing,” “following the law,” and “acting honorably,” Google, in fact, has an established pattern of conduct which is the exact opposite of its claimed piety.

    I wonder if this is the first known case in which Google has been sued over its long-obsoleted “Don’t be evil” mantra ?

    Researching The Plaintiffs

    I think I made a mistake by assuming this lawsuit might have merit. My first order of business was to see what the plaintiff organizations have produced. I have a strong feeling that these might be run of the mill patent trolls.

    VSL currently has a blank web page. Further, the Wayback Machine only has pages reaching back to 2011. The earliest page lists these claims against a plain black background (I’ve highlighted some of the more boisterous claims and the passages that make it appear that Vedanti doesn’t actually produce anything but is strictly an IP organization) :

    The inventions key :
    The patent and software reduced any data content, without compressing, up to a 97% total reduction of the data which also produces a lossless result. This physics based invention is often called the Holy Grail.

    Vedanti Systems Intellectual Property
    Our strategic IP portfolio is granted in all of the world’s largest technology development and use countries. A major value indemnification of our licensee products is the early date of invention filing and subsequent Issue. Vedanti IP has an intrinsic 20 year patent protection and valuation in royalties and licensing. The original data transmission art has no prior art against it.

    Vedanti Systems invented among other firsts, The Slice and Partitioning of Macroblocks within a RGB Tri level region in a frame to select or not, the pixel.

    Vedanti Systems invention is used in nearly every wireless chipset and handset in the world

    Our original pixel selection system revolutionized wireless handset communications. An example of this system “Slice” and “Macroblock Partitioning” is used throughout Satellite channel expansion, Wireless partitioning, Telecom – Video Conferencing, Surveillance Cameras, and 2010 developing Media applications.

    Vedanti Systems is a Semiconductor based software, applications, and IP Continuations Intellectual Property company.

    Let’s move onto the other plaintiff, Max Sound. They have a significantly more substantive website. They also have an Android app named Spins HD Audio, which appears to be little more than a music player based on the screenshots.

    Max Sound also has a stock ticker symbol : MAXD. Something clicked into place when I looked up their ticker symbol : While worth only a few pennies, it was worth a few more pennies after this lawsuit was announced, which might be one of the motivations behind the lawsuit.

    Here’s a trick I learned when I was looking for a new tech job last year : When I first look at a company’s website and am trying to figure out what they really do, I head straight to their jobs/careers page. A lot of corporate websites have way too much blathering corporatese that can be tough to cut through. But when I see what mix of talent and specific skills they are hoping to hire, that gives me a much better portrait of what the company does.

    The reason I bring this up is because this tech company doesn’t seem to have jobs/careers page.

    The Lawsuit
    The core complaint centers around Patent 7974339 : Optimized data transmission system and method. It was filed in July 2004 (or possibly as early as January 2002), issued in July 2011, and assigned (purchased ?) by Vedanti in May 2012. The lawsuit alleges that nearly everything Google has ever produced (or, more accurately, purchased) leverages the patented technology.

    The patent itself has 5 drawings. If you’ve ever seen a multimedia codec patent, or any whitepaper on a multimedia codec, you’ve seen these graphs before. E.g., “Raw pixels come in here -> some analysis happens here -> more analysis happens over here -> entropy coding -> final bitstream”. The text of a patent document isn’t meant to be particularly useful. I’ve tried to understand this stuff before and it never goes well. Skimming the text, I just see a blur of the words data, transmission, pixel, and matrix.

    So I read the complaint to try to figure out what this is all about. To summarize the storyline as narrated by the lawsuit, some inventors were unhappy with the state of video compression in 2001 and endeavored to create something better. So they did, and called it the VSL codec. This codec is so far undocumented on the MultimediaWiki, so it probably has yet to be seen “in the wild”. Good luck finding hard technical data on it now since searches for “VSL codec” are overwhelmed by articles about this lawsuit. Also, the original codec probably wasn’t called VSL because VSL is apparently an IP organization formed much later.

    Then, the protagonists of the lawsuit patented the codec. Then, years later, Google wanted to purchase a video codec that they could open source and use to supplant H.264.

    The complaint goes on to allege that in 2010, Google specifically contacted VSL to possibly license or acquire this mysterious VSL technology. Google was allegedly allowed to study the technology, eventually decided not to continue discussions, and shipped back the proprietary materials.

    Here’s where things get weird. When Google shipped back the materials, they allegedly shipped back a bunch of Post-It notes. The notes are alleged to contain a ton of incriminating evidence. The lawsuit claims that the notes contained such tidbits as :

    • Google was concerned that its infringement could be considered “recklessness” (the standard applicable to willful infringement) ;
    • Google personnel should “try” to destroy incriminating emails ;
    • Google should consider a “design around” because it was facing a “risk of litigation.”

    Actually, given Google’s acquisition of On2, I can totally believe that last one (On2’s codecs have famously contained a lot of weirdness which is commonly suspected to be attributable to designing around known patents).

    Anyway, a lot of this case seems to hinge on the authenticity of these Post-It notes :

    “65. The Post-It notes are unequivocal evidence of Google’s knowledge of the ’339 Patent and infringement by Defendants”

    I wish I could find a stock photo of a stack of Post-It notes in an evidence bag.

    I’ve worked at big technology companies. Big tech companies these days are very diligent about indoctrinating employees about IP liability issues. The reason this Post-It situation strikes me as odd is because the alleged contents of the notes basically outline everything the corporate lawyers tell you NOT to do.

    Analysis
    I’m trying to determine what specific algorithms and coding techniques. I guess I was expecting to see a specific claim that, “Our patent outlines this specific coding technique and here is unequivocal proof that Google A) uses the same technique, and B) specifically did so after looking at our patent.” I didn’t find that (well, a bit of part B, c.f., the Post-It note debacle), but maybe that’s not how these patent lawsuits operate. I’ve never kept up before.

    Maybe it’s just a patent troll. Maybe it’s for the stock bump. I’m expecting to see pump-n-dump stock spam featuring the stock symbol MAXD anytime now.

    I’ve never been interested in following a lawsuit case carefully before. I suddenly find myself wondering if I can subscribe to the RSS feed for this case ? Too much to hope for. But I found this item through Pando and maybe they’ll stay on top of it.

  • Revision 3331 : On ajoute un accès à la configuration dans les menus

    25 avril 2010, par kent1 — Log

    On ajoute un accès à la configuration dans les menus