
Recherche avancée
Autres articles (84)
-
Les autorisations surchargées par les plugins
27 avril 2010, parMediaspip core
autoriser_auteur_modifier() afin que les visiteurs soient capables de modifier leurs informations sur la page d’auteurs -
Personnaliser les catégories
21 juin 2013, parFormulaire de création d’une catégorie
Pour ceux qui connaissent bien SPIP, une catégorie peut être assimilée à une rubrique.
Dans le cas d’un document de type catégorie, les champs proposés par défaut sont : Texte
On peut modifier ce formulaire dans la partie :
Administration > Configuration des masques de formulaire.
Dans le cas d’un document de type média, les champs non affichés par défaut sont : Descriptif rapide
Par ailleurs, c’est dans cette partie configuration qu’on peut indiquer le (...) -
Des sites réalisés avec MediaSPIP
2 mai 2011, parCette page présente quelques-uns des sites fonctionnant sous MediaSPIP.
Vous pouvez bien entendu ajouter le votre grâce au formulaire en bas de page.
Sur d’autres sites (11409)
-
Revision 2c93a8ed5c : Give VP9 a different sync code from VP8 The new code is 0x49, 0x83, 0x42 There
19 mai 2013, par Jingning HanChanged Paths :
Modify /vp9/decoder/vp9_decodframe.c
Modify /vp9/encoder/vp9_bitstream.c
Modify /vp9/vp9_dx_iface.c
Give VP9 a different sync code from VP8The new code is 0x49, 0x83, 0x42
There is nothing particularly special about this code bitstream wise.
Its derivation is the word "sync" coded using 4x6bit alphabetic indices.Change-Id : Ie2430a854af32ddc5a5c25a6c1c90cf6497ba647
-
Investigating Steam for Linux
1er mars 2013, par Multimedia Mike — Game HackingValve recently released the final, public version of their Steam client for Linux, and the Linux world rejoiced. At least, it probably did. The announcement was 2 weeks ago on Valentine’s Day and I had other things on my mind, so I missed any fanfare. When framed in this manner, the announcement timing becomes suspect– it’s as though Linux enthusiasts would have plenty of time that day or something.
Taming the Frontier
Speculation about a Linux Steam client had been kicking around for nearly as long as Steam has existed. However, sometime last year, the rumors became more substantive.I naturally wondered how to port something like Steam to Linux. I have some experience with trying to make a necessarily binary-only program that runs on Linux. I’m fairly well-versed in the assorted technical challenges that one might face when attempting such a feat. Because of this, whenever I hear rumors that a company might be entertaining the notion of porting a major piece of proprietary software to Linux, my instinctive reflex is, “What ?! Why, you fools ?! Save yourselves !”
At least, that’s how it used to be. The proposal of developing a proprietary binary for Linux has been rendered considerably less insane by a few developments, for example :
- The rise of Ubuntu Linux as a quasi de facto standard for desktop Linux computing
- The increasing homogeneity in personal desktop computing technology
What I would like to know is how the Steam client runs on Linux. Does it rely on any libraries being present on the system ? Or does it bring its own ? The latter is a trick that proprietary programs can use– transport all of the shared libraries that the main program binary depends upon, install them someplace out of the way on the filesystem, probably in /opt, and then make the main program a shell script which sets a preload path to rely on the known quantity libraries instead of the copies already on the system.
Downloading and Installing the Client
For this exercise, I installed x86_64 desktop Ubuntu 12.04 Linux on a l33t gaming rig that was totally top of the line about 5 years ago, and that someone didn’t want anymore and handed down to me recently. So it should be ideal for this project.At first, I was blown away– the Linux client is in a .deb package that is less than 2 MB large. I unpacked the steam.deb file and found a bunch of support libraries — mostly X11 and standard C/C++ runtimes. Just as I suspected. Still, I can’t believe how small the thing is. However, my amazement quickly abated when I actually ran Steam and saw this :
So it turns out steam.db is just the installer program which immediately proceeds to download an additional 160+ MB of data. So there’s actually a lot more information to possibly sift through.
Another component of the installation is to basically run a big ‘apt-get install’ command to make sure a bunch of required packages are installed :
After all these installation steps, the client was ready to run. However, whenever I tried to do so, I got this dialog which would cause Steam to close when the dialog was dismissed.
Not a huge deal ; later NVIDIA drivers are fairly straightforward to install on Ubuntu Linux. After a few minutes of downloading, installing and restarting X, Steam ran with minimal complaint (it still had some issue regarding the video drivers but didn’t seem to consider it a deal-breaker).
Using Steam on Linux
So here’s Steam running on Linux :
If you have experience with using Steam on Windows or Mac, you might observe that it looks exactly the same. I don’t have a very expansive library of games (I only started using Steam because purchasing a few computer components a few years ago entitled me to some free Steam downloads of some of the games on the list in the screenshot). I didn’t really expect any of the games to have Linux versions yet, but it turns out that the indie darling FTL : Faster Than Light has been ported to Linux. FTL was a much-heralded Kickstarter success story and sounded like something I wanted to support. I purchased this from Steam shortly after its release last year and was able to download the Linux version at no additional cost with a single click.
It runs natively on Linux (note the Ubuntu desktop window decorations) :
You might notice from the main Steam client that, despite purchasing FTL about a 1/2 year ago and starting it up at least a 1/2 dozen times, I haven’t really invested a whole lot of time into it. I only managed to get about 2 minutes further this time :
What can I say ? This game just bores me to tears. It’s frustrating because I know that this is one of the cool games that all real gamers are supposed to like, but I practically catch myself nodding off every time I try to run through the tutorial. It’s strange to think that I’ve invested far more time into games that offer considerably less stimulation. That’s probably because I had far more free time compared to gaming options during those times.
But that’s neither here nor there. We’ll file this under “games that aren’t for me.” I’m glad that people like FTL and a little indie underdog has met with such success. And I’m pleased that Steam on Linux works. It’s native and the games are also native, which is all quite laudable (there was speculation that everything would just be running on top of a Wine layer).
Deeper Analysis
So I set out wondering how Steam was able to create a proprietary program that would satisfy a large enough cross-section of Linux users (i.e., on different platforms and distros). Answer : well, they didn’t, per the stated requirements. The installation is only tuned to work on Ubuntu 12.04. However, it works on both 32- and 64-bit platforms, the only 2 desktop CPU platforms that matter these days (unless ARM somehow makes inroads on the desktop). The Steam client is quite clearly an x86_32 binary– look at the terminal screenshot above and observe that it’s downloading all :i386 support libraries.The file /usr/bin/steam isn’t a binary but a launcher shell script (something you’ll also see if you investigate /usr/bin/firefox on a Linux system). Here’s an interesting tidbit :
function detect_platform() # Maybe be smarter someday # Right now this is the only platform we have a bootstrap for, so hard-code it. echo ubuntu12_32
I wager that it’s possible to get Steam running on other distributions, it probably just takes a little more effort (assuming that Steam doesn’t put too much effort into thwarting such attempts).
As for the FTL game, it comes with binaries and libraries for both x86_32 and x86_64. So, good work to the dev team for creating and testing both versions. FTL also distributes versions of the libraries it expects to work with.
I suspect that the Steam client overall is largely a WWW rendering engine underneath the covers. That would help explain how Valve is able to achieve such a consistent look and feel, not only across OS platforms, but also through a web browser. When I browse the Steam store through Google Chrome, it looks and feels exactly like the native desktop client. When I first thought of how someone could port Steam to Linux, I immediately wondered about how they would do the UI.
A little Googling for “steam uses webkit” (just a hunch) confirms my hypothesis.
-
Method For Crawling Google
28 mai 2011, par Multimedia Mike — Big DataI wanted to crawl Google in order to harvest a large corpus of certain types of data as yielded by a certain search term (we’ll call it “term” for this exercise). Google doesn’t appear to offer any API to automatically harvest their search results (why would they ?). So I sat down and thought about how to do it. This is the solution I came up with.
FAQ
Q : Is this legal / ethical / compliant with Google’s terms of service ?
A : Does it look like I care ? Moving right along…Manual Crawling Process
For this exercise, I essentially automated the task that would be performed by a human. It goes something like this :- Search for “term”
- On the first page of results, download each of the 10 results returned
- Click on the next page of results
- Go to step 2, until Google doesn’t return anymore pages of search results
Google returns up to 1000 results for a given search term. Fetching them 10 at a time is less than efficient. Fortunately, the search URL can easily be tweaked to return up to 100 results per page.
Expanding Reach
Problem : 1000 results for the “term” search isn’t that many. I need a way to expand the search. I’m not aiming for relevancy ; I’m just searching for random examples of some data that occurs around the internet.My solution for this is to refine the search using the “site” wildcard. For example, you can ask Google to search for “term” at all Canadian domains using “site :.ca”. So, the manual process now involves harvesting up to 1000 results for every single internet top level domain (TLD). But many TLDs can be more granular than that. For example, there are 50 sub-domains under .us, one for each state (e.g., .ca.us, .ny.us). Those all need to be searched independently. Same for all the sub-domains under TLDs which don’t allow domains under the main TLD, such as .uk (search under .co.uk, .ac.uk, etc.).
Another extension is to combine “term” searches with other terms that are likely to have a rich correlation with “term”. For example, if “term” is relevant to various scientific fields, search for “term” in conjunction with various scientific disciplines.
Algorithmically
My solution is to create an SQLite database that contains a table of search seeds. Each seed is essentially a “site :” string combined with a starting index.Each TLD and sub-TLD is inserted as a searchseed record with a starting index of 0.
A script performs the following crawling algorithm :
- Fetch the next record from the searchseed table which has not been crawled
- Fetch search result page from Google
- Scrape URLs from page and insert each into URL table
- Mark the searchseed record as having been crawled
- If the results page indicates there are more results for this search, insert a new searchseed for the same seed but with a starting index 100 higher
Digging Into Sites
Sometimes, Google notes that certain sites are particularly rich sources of “term” and offers to let you search that site for “term”. This basically links to another search for ‘term site:somesite”. That site gets its own search seed and the program might harvest up to 1000 URLs from that site alone.Harvesting the Data
Armed with a database of URLs, employ the following algorithm :- Fetch a random URL from the database which has yet to be downloaded
- Try to download it
- For goodness sake, have a mechanism in place to detect whether the download process has stalled and automatically kill it after a certain period of time
- Store the data and update the database, noting where the information was stored and that it is already downloaded
This step is easy to parallelize by simply executing multiple copies of the script. It is useful to update the URL table to indicate that one process is already trying to download a URL so multiple processes don’t duplicate work.
Acting Human
A few factors here :- Google allegedly doesn’t like automated programs crawling its search results. Thus, at the very least, don’t let your script advertise itself as an automated program. At a basic level, this means forging the User-Agent : HTTP header. By default, Python’s urllib2 will identify itself as a programming language. Change this to a well-known browser string.
- Be patient ; don’t fire off these search requests as quickly as possible. My crawling algorithm inserts a random delay of a few seconds in between each request. This can still yield hundreds of useful URLs per minute.
- On harvesting the data : Even though you can parallelize this and download data as quickly as your connection can handle, it’s a good idea to randomize the URLs. If you hypothetically had 4 download processes running at once and they got to a point in the URL table which had many URLs from a single site, the server might be configured to reject too many simultaneous requests from a single client.
Conclusion
Anyway, that’s just the way I would (and did) do it. What did I do with all the data ? That’s a subject for a different post.