
Other articles (92)

  • Improvements to the base version

    13 September 2013

    Nicer multiple selection
    The Chosen plugin improves the ergonomics of multiple-select fields; compare the two images below.
    To use it, simply enable the Chosen plugin (general site configuration > plugin management), then configure it (Templates > Chosen) by enabling Chosen on the public site and specifying the form elements to enhance, for example select[multiple] for multiple-selection lists (...)

  • User profiles

    12 April 2011

    Each user has a profile page from which they can edit their personal information. In the default top-of-page menu, a menu item is created automatically when MediaSPIP is initialized, visible only when the visitor is logged in to the site.
    The user can also reach the profile editor from their author page; a "Modifier votre profil" ("Edit your profile") link in the navigation is (...)

  • PHP5-specific configuration

    4 February 2011

    PHP5 is required; you can install it by following this dedicated tutorial.
    It is recommended to disable safe_mode at first; however, if safe_mode is properly configured and the required binaries are accessible, MediaSPIP should work correctly with it enabled.
    Specific modules
    Certain PHP modules must be installed, either through your distribution's package manager or manually: php5-mysql for connectivity with the (...)

On other sites (5680)

  • Installing "ffmpeg" package from setup.py in Apache Beam pipeline running on Google Cloud Dataflow

    17 April 2019, by John Allard

    I'm trying to run an Apache Beam pipeline on Google Cloud Dataflow that uses FFmpeg to perform transcoding operations. As I understand it, since ffmpeg is not a Python package (available through pip), I need to install it from setup.py using the following lines:

    # The output of custom commands (including failures) will be logged in the
    # worker-startup log.
    CUSTOM_COMMANDS = [
       ['apt-get', 'update'],
       ['apt-get', 'install', '-y', 'ffmpeg']]

    Unfortunately, this is not working. My pipeline stalls, and when I examine the logs I see this:


    RuntimeError: Command ['apt-get', 'install', '-y', 'ffmpeg'] failed: exit code: 100

    It appears to be unable to find the package 'ffmpeg'. I'm curious why this is, since ffmpeg is a standard package that should be available through apt-get.
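    For reference, the usual way those CUSTOM_COMMANDS get executed is from a setup.py that registers a custom build step; the sketch below follows the pattern of Apache Beam's example setup.py, with illustrative package and class names, and is not the asker's actual file.

    # A rough sketch of the setup.py custom-command pattern (names are illustrative).
    import subprocess
    from distutils.command.build import build as _build

    import setuptools

    CUSTOM_COMMANDS = [
        ['apt-get', 'update'],
        ['apt-get', 'install', '-y', 'ffmpeg'],
    ]

    class CustomCommands(setuptools.Command):
        """Runs each command on the Dataflow worker during package installation."""
        user_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run(self):
            for command in CUSTOM_COMMANDS:
                print('Running command: %s' % command)
                # check_call raises CalledProcessError on a non-zero exit code
                # (such as the "exit code: 100" seen in the worker-startup log).
                subprocess.check_call(command)

    class build(_build):
        # Run the custom commands as part of the normal build step.
        sub_commands = _build.sub_commands + [('CustomCommands', None)]

    setuptools.setup(
        name='my_pipeline',        # illustrative package name
        version='0.0.1',
        packages=setuptools.find_packages(),
        cmdclass={'build': build, 'CustomCommands': CustomCommands},
    )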

  • Making an ffmpeg screen capture on Mac OS X using the YUV 4:2:0 planar color model

    30 May 2019, by Bass

    I make screen recordings with ffmpeg, using avfoundation on Mac OS X, x11grab on Linux and gdigrab on Windows.

    The resulting files should be compatible with modern web browsers (<video></video>), so I use the H.264 codec and request the YUV 4:2:0 planar pixel format.

    On Mac OS X, however (unlike Linux and Windows), I receive the following log output:

    /usr/local/bin/ffmpeg -y -v error -f avfoundation -threads 0 -hide_banner -i 1:none -f mp4 -vcodec h264 -pix_fmt yuv420p -r 25/1 -qscale:v 1 -vf scale=-1:1080 target.mp4
    [avfoundation @ 0x7fdba2003a00] Selected pixel format (yuv420p) is not supported by the input device.
    [avfoundation @ 0x7fdba2003a00] Supported pixel formats:
    [avfoundation @ 0x7fdba2003a00]   uyvy422
    [avfoundation @ 0x7fdba2003a00]   yuyv422
    [avfoundation @ 0x7fdba2003a00]   nv12
    [avfoundation @ 0x7fdba2003a00]   0rgb
    [avfoundation @ 0x7fdba2003a00]   bgr0

    Still, according to mplayer, the resulting MP4 file seems to have the YUV 4:2:0 planar color model:

    [h264 @ 0x1048a8ac0]Format yuv420p chosen by get_format().
    [h264 @ 0x1048a8ac0]Reinit context to 1728x1088, pix_fmt: yuv420p
    [h264 @ 0x1048a8ac0]Format yuv420p chosen by get_format().
    [h264 @ 0x1048a8ac0]Reinit context to 1728x1088, pix_fmt: yuv420p
    [swscaler @ 0x1048c3cc0]bicubic scaler, from yuv420p to yuyv422 using MMXEXT
    *** [scale] Exporting mp_image_t, 1728x1080x12bpp YUV planar, 2799360 bytes
    *** [vo] Allocating mp_image_t, 1728x1080x16bpp YUV packed, 3732480 bytes

    The same is confirmed by ffmpeg:

    $ ffmpeg -i target.mp4 -hide_banner
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'target.mp4':
     Metadata:
       major_brand     : isom
       minor_version   : 512
       compatible_brands: isomiso2avc1mp41
       encoder         : Lavf58.20.100
     Duration: 00:00:04.72, start: 0.000000, bitrate: 201 kb/s
       Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1728x1080, 197 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
       Metadata:
         handler_name    : VideoHandler

    Questions:

    1. Can someone explain the ffmpeg logging above?
    2. If I still need to convert the avfoundation video stream to yuv420p, how do I do that on the fly (in a single ffmpeg pass)?

  • Method For Crawling Google

    28 May 2011, by Multimedia Mike — Big Data

    I wanted to crawl Google in order to harvest a large corpus of certain types of data as yielded by a certain search term (we'll call it “term” for this exercise). Google doesn't appear to offer any API to automatically harvest their search results (why would they?). So I sat down and thought about how to do it. This is the solution I came up with.

    FAQ
    Q: Is this legal / ethical / compliant with Google's terms of service?
    A: Does it look like I care? Moving right along…

    Manual Crawling Process
    For this exercise, I essentially automated the task that would be performed by a human. It goes something like this:

    1. Search for “term”
    2. On the first page of results, download each of the 10 results returned
    3. Click on the next page of results
    4. Go to step 2, until Google doesn't return any more pages of search results

    Google returns up to 1000 results for a given search term. Fetching them 10 at a time is less than efficient. Fortunately, the search URL can easily be tweaked to return up to 100 results per page.
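
    As a rough illustration of that tweak: the per-page count and offset were historically controlled by the num and start query parameters. Those parameter names are an assumption on my part (based on Google's classic search URL format), not something stated in the post.

    # Illustrative only: build a results-page URL asking for 100 results per page.
    # The "num" and "start" parameter names are assumptions, not from the post.
    from urllib import quote_plus   # Python 2-era, matching the post; urllib.parse on Python 3

    def results_url(term, start_index, per_page=100):
        return ('http://www.google.com/search?q=%s&num=%d&start=%d'
                % (quote_plus(term), per_page, start_index))

    print(results_url('term site:.ca', 0))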

    Expanding Reach
    Problem: 1000 results for the “term” search isn't that many. I need a way to expand the search. I'm not aiming for relevance; I'm just searching for random examples of some data that occurs around the internet.

    My solution for this is to refine the search using the “site:” modifier. For example, you can ask Google to search for “term” at all Canadian domains using “site:.ca”. So the manual process now involves harvesting up to 1000 results for every single internet top-level domain (TLD). But many TLDs can be more granular than that. For example, there are 50 sub-domains under .us, one for each state (e.g., .ca.us, .ny.us). Those all need to be searched independently. The same goes for the sub-domains under TLDs which don't allow domains directly under the main TLD, such as .uk (search under .co.uk, .ac.uk, etc.).

    Another extension is to combine “term” searches with other terms that are likely to have a rich correlation with “term”. For example, if “term” is relevant to various scientific fields, search for “term” in conjunction with various scientific disciplines.

    Algorithmically
    My solution is to create an SQLite database that contains a table of search seeds. Each seed is essentially a “site:” string combined with a starting index.

    Each TLD and sub-TLD is inserted as a searchseed record with a starting index of 0.
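
    A minimal sketch of that seeding step follows; the table and column names are my own guesses, not taken from the post.

    # Guessing at the schema: one row per "site:" seed plus the next start index,
    # and a url table for the harvested links mentioned below.
    import sqlite3

    conn = sqlite3.connect('crawl.db')
    conn.execute('''CREATE TABLE IF NOT EXISTS searchseed (
                      id INTEGER PRIMARY KEY,
                      site TEXT,                 -- e.g. ".ca", ".co.uk", ".ny.us"
                      start INTEGER,             -- result offset to request next
                      crawled INTEGER DEFAULT 0)''')
    conn.execute('''CREATE TABLE IF NOT EXISTS url (
                      id INTEGER PRIMARY KEY,
                      address TEXT,
                      downloaded INTEGER DEFAULT 0,
                      path TEXT)''')

    TLDS = ['.ca', '.co.uk', '.ac.uk', '.ny.us']   # abbreviated; a real run covers every TLD/sub-TLD
    for tld in TLDS:
        conn.execute('INSERT INTO searchseed (site, start) VALUES (?, 0)', (tld,))
    conn.commit()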

    A script performs the following crawling algorithm (sketched in code after the list):

    • Fetch the next record from the searchseed table which has not been crawled
    • Fetch search result page from Google
    • Scrape URLs from page and insert each into URL table
    • Mark the searchseed record as having been crawled
    • If the results page indicates there are more results for this search, insert a new searchseed for the same seed but with a starting index 100 higher
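
    A sketch of that loop, reusing the database from the previous snippet; fetch_results_page(), extract_urls() and has_more_results() are hypothetical helpers (fetching itself is sketched under "Acting Human" below).

    # Sketch of the crawling loop; the three helper functions are hypothetical.
    import sqlite3

    conn = sqlite3.connect('crawl.db')

    while True:
        row = conn.execute('SELECT id, site, start FROM searchseed '
                           'WHERE crawled = 0 LIMIT 1').fetchone()
        if row is None:
            break
        seed_id, site, start = row

        html = fetch_results_page('term site:%s' % site, start)
        for address in extract_urls(html):
            conn.execute('INSERT INTO url (address) VALUES (?)', (address,))
        conn.execute('UPDATE searchseed SET crawled = 1 WHERE id = ?', (seed_id,))

        # More results for this seed: queue another page, 100 results further on.
        if has_more_results(html):
            conn.execute('INSERT INTO searchseed (site, start) VALUES (?, ?)',
                         (site, start + 100))
        conn.commit()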

    Digging Into Sites
    Sometimes, Google notes that certain sites are particularly rich sources of “term” and offers to let you search that site for “term”. This basically links to another search for “term site:somesite”. That site gets its own search seed and the program might harvest up to 1000 URLs from that site alone.

    Harvesting the Data
    Armed with a database of URLs, employ the following algorithm (a sketch follows the list):

    • Fetch a random URL from the database which has yet to be downloaded
    • Try to download it
    • For goodness sake, have a mechanism in place to detect whether the download process has stalled and automatically kill it after a certain period of time
    • Store the data and update the database, noting where the information was stored and that it is already downloaded
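
    A sketch of that harvesting loop follows. The stall protection here is just a socket timeout on the request, which is one simple way to get the behaviour described; the post itself does not say how it was implemented, and the column names continue the guesses made above.

    # Sketch of a download worker; safe to run in several copies at once.
    import sqlite3
    import urllib2                    # Python 2-era, matching the post

    conn = sqlite3.connect('crawl.db')

    while True:
        row = conn.execute('SELECT id, address FROM url WHERE downloaded = 0 '
                           'ORDER BY RANDOM() LIMIT 1').fetchone()
        if row is None:
            break
        url_id, address = row

        # Claim the URL first so parallel copies of the script don't duplicate work.
        conn.execute('UPDATE url SET downloaded = 1 WHERE id = ?', (url_id,))
        conn.commit()

        try:
            data = urllib2.urlopen(address, timeout=60).read()   # crude stall protection
        except Exception:
            continue

        path = 'data/%d' % url_id
        with open(path, 'wb') as f:
            f.write(data)
        conn.execute('UPDATE url SET path = ? WHERE id = ?', (path, url_id))
        conn.commit()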

    This step is easy to parallelize by simply executing multiple copies of the script. It is useful to update the URL table to indicate that one process is already trying to download a URL so multiple processes don’t duplicate work.

    Acting Human
    A few factors here (a sketch of the first two follows the list):

    • Google allegedly doesn't like automated programs crawling its search results. Thus, at the very least, don't let your script advertise itself as an automated program. At a basic level, this means forging the User-Agent: HTTP header. By default, Python's urllib2 identifies itself as Python (its User-Agent begins with "Python-urllib"). Change this to a well-known browser string.
    • Be patient; don't fire off these search requests as quickly as possible. My crawling algorithm inserts a random delay of a few seconds between each request. This can still yield hundreds of useful URLs per minute.
    • On harvesting the data : Even though you can parallelize this and download data as quickly as your connection can handle, it’s a good idea to randomize the URLs. If you hypothetically had 4 download processes running at once and they got to a point in the URL table which had many URLs from a single site, the server might be configured to reject too many simultaneous requests from a single client.

    Conclusion
    Anyway, that's just the way I would (and did) do it. What did I do with all the data? That's a subject for a different post.
