Other articles (84)

  • Authorizations overridden by plugins

    27 April 2010

    MediaSPIP core
    autoriser_auteur_modifier(), so that visitors can edit their own information on the authors page

  • Sites built with MediaSPIP

    2 May 2011

    This page presents some of the sites running MediaSPIP.
    You can of course add your own using the form at the bottom of the page.

  • Configuring language support

    15 November 2010

    Accessing the configuration and adding supported languages
    To configure support for new languages, go to the "Administer" section of the site.
    From there, the navigation menu gives access to a "Language management" section where you can enable support for new languages.
    Each newly added language can still be disabled as long as no object has been created in that language. Once an object exists, the language is greyed out in the configuration and (...)

On other sites (9017)

  • Bootstrapping an AI UGC system — video generation is expensive, APIs are limiting, and I need help navigating it all [closed]

    24 June, by Barack _ Ouma

    I’m building a solo AI-powered UGC (User-Generated Content) platform: something that automates the creation of short-form content using AI avatars, voices, visuals, and scripts. But I’ve hit a wall with video generation and API limitations.

    So far, I’ve integrated TTS and voice cloning (using ElevenLabs), and I’ve gotten image generation working. But video generation (especially talking avatars) has been a nightmare, both financially and technically.

    🛠️ Features I’m trying to build:

    - AI avatars (face + lip-syncing)
    - Script generation (LLM-driven)
    - Image generation
    - Video composition

    I’m trying to build an AI faceless content creation automation platform, an alternative to Makeugc.com, Reelfarm.org, or postbridge.com. For now I’m just trying to create a working pipeline for automated content.

    ❌ Challenges so far:

    - Services like D-ID, Synthesia, Magic Hour, and Luma are either paywalled, have no trials, or are very expensive.
    - D-ID does support avatar creation, but you need to pay upfront to even access those features. There’s no easy or free entry point.
    - Tools like Google Veo 3 are powerful but clearly not accessible for indie builders.
    - I’ve looked into open-source models like WAN 2.1, CogVideo, etc., but I have no clue how to run them or what infra is needed.

    Now I’m torn between buying my own GPU and renting compute power to self-host these models.

    💸 Cost is a huge blocker

    I’ve been looking through Replicate’s pricing, and while some models (especially image gen) are manageable, video models get expensive fast. Even GPU rental rates stack up quickly, especially if you’re testing often or experimenting with pipelines. Plus, idle time billing doesn’t help.

    💭 What I could really use help with:

    - Has anyone successfully stitched together APIs (voice, avatar, video) into a working UGC pipeline?
    - Should I use separate services (e.g. ElevenLabs + Synthesia + WAN) or try to host my own end-to-end system?
    - Is it cheaper (long term) to buy a used GPU like a 4090 and run things locally? Or better to rent compute short-term?
    - Any open-source solutions that are beginner-friendly or have minimal setup?
    - Any existing frameworks or wrappers for UGC media pipelines that make all this easier?

    I’ve spent weeks researching, testing APIs, and hitting walls, and while I’ve learned a lot, I’d really appreciate any guidance from folks who’ve been here before.
    Thanks in advance 🙏

    And good luck to everyone else trying to build with AI on a budget: this stuff isn’t as plug-and-play as it looks in launch videos 💀
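
    The buy-versus-rent question above comes down to a break-even calculation. The sketch below is only an illustration of that arithmetic: the function name `break_even_hours` and every price in the example call are placeholder assumptions, not real quotes for a 4090, for Replicate, or for any GPU rental provider.

```python
def break_even_hours(gpu_price, rental_per_hour, power_kw=0.0, electricity_per_kwh=0.0):
    """Hours of GPU use after which buying becomes cheaper than renting.

    All inputs are supplied by the caller; nothing here is a real price.
    Resale value, idle-time billing, and setup time are ignored.
    """
    local_cost_per_hour = power_kw * electricity_per_kwh
    saving_per_hour = rental_per_hour - local_cost_per_hour
    if saving_per_hour <= 0:
        # Running locally costs at least as much per hour as renting,
        # so the purchase never pays for itself.
        return float("inf")
    return gpu_price / saving_per_hour


# Placeholder figures only: a 1600-unit card, 0.50 per rented hour,
# 0.45 kW draw at 0.30 per kWh.
hours = break_even_hours(1600.0, 0.50, power_kw=0.45, electricity_per_kwh=0.30)
print(f"break-even after about {hours:.0f} hours of use")
```

    With real prices plugged in, comparing the result against the hours of generation expected per month gives a rough payback period. It says nothing about whether a given open-source model fits in the card's memory.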

  • Error recording an RTSP stream without transcoding

    23 August 2017, by Matt

    I’m trying to use FFmpeg to record RTSP streams from several security cameras. I have been successfully transcoding each stream for months now, but since this requires considerable CPU power, I’d like to simply copy each stream to disk in its original H.264 format.

    Whenever I try this, I receive an error similar to this (the "current" value varies):

    Non-monotonous DTS in output stream 0:0; previous: 0, current: -62743;

    I’ve stripped most of the options I was using, although I really do want to keep -xerror so that FFmpeg quits when it encounters an error:

    ffmpeg.exe -xerror -i rtsp://admin:admin@192.168.1.135 -an -vcodec copy test.mp4

    And I still get this:

        ffmpeg version 3.3.3 Copyright (c) 2000-2017 the FFmpeg developers
        built with gcc 7.1.0 (GCC)
        configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
        libavutil      55. 58.100 / 55. 58.100
        libavcodec     57. 89.100 / 57. 89.100
        libavformat    57. 71.100 / 57. 71.100
        libavdevice    57.  6.100 / 57.  6.100
        libavfilter     6. 82.100 /  6. 82.100
        libswscale      4.  6.100 /  4.  6.100
        libswresample   2.  7.100 /  2.  7.100
        libpostproc    54.  5.100 / 54.  5.100
        [udp @ 0000000002533b60] 'circular_buffer_size' option was set but it is not supported on this build (pthread support is required)
        [udp @ 0000000000ec97a0] 'circular_buffer_size' option was set but it is not supported on this build (pthread support is required)
        Input #0, rtsp, from 'rtsp://admin:admin@192.168.1.135':
          Metadata:
            title           : RTSP Session/2.0
          Duration: N/A, start: 0.837144, bitrate: N/A
            Stream #0:0: Video: h264 (High), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 7 fps, 25 tbr, 90k tbn, 14 tbc
        Output #0, mp4, to 'test.mp4':
          Metadata:
            title           : RTSP Session/2.0
            encoder         : Lavf57.71.100
            Stream #0:0: Video: h264 (High) ([33][0][0][0] / 0x0021), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 7 fps, 25 tbr, 90k tbn, 90k tbc
        Stream mapping:
          Stream #0:0 -> #0:0 (copy)
        Press [q] to stop, [?] for help
        [mp4 @ 00000000036b8340] Non-monotonous DTS in output stream 0:0; previous: 0, current: -62743; aborting.
        Conversion failed!

    Can anyone explain what the problem is and/or suggest the appropriate flags to handle this?
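
    To make the error message concrete: a decoding timestamp (DTS) sequence is monotonic when each packet's DTS is greater than the one before it. The sketch below is a simplified model of that check, not FFmpeg's actual code. The function name `first_dts_violation` is hypothetical, and the sample list is made up to mirror the "previous: 0, current: -62743" pair from the log.

```python
from typing import Optional, Sequence, Tuple


def first_dts_violation(
    dts_values: Sequence[int], strict: bool = True
) -> Optional[Tuple[int, int, int]]:
    """Return (index, previous, current) for the first out-of-order DTS, or None.

    With strict=True each DTS must be greater than the previous one;
    with strict=False equal neighbours are tolerated.
    """
    for index in range(1, len(dts_values)):
        previous, current = dts_values[index - 1], dts_values[index]
        out_of_order = current <= previous if strict else current < previous
        if out_of_order:
            return index, previous, current
    return None


# Made-up packet timestamps shaped like the log: the second packet's DTS
# (-62743) is lower than the first packet's (0).
print(first_dts_violation([0, -62743, 12857]))  # (1, 0, -62743)
```

    In stream-copy mode the camera's timestamps reach the MP4 muxer unchanged, and -xerror turns the resulting complaint into a fatal error. FFmpeg options that exist for influencing input timestamps or transport include -rtsp_transport tcp, -use_wallclock_as_timestamps 1, and -fflags +genpts. Whether any of them helps with this particular camera is not established by the question.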

  • swresample/resample: speed up build_filter by 50%

    4 November 2015, by Ganesh Ajjanagadde

    This speeds up build_filter by 50%. This gain should be pretty
    consistent across all architectures and platforms.

    Essentially, this relies on an observation that the filters have some
    even/odd symmetry that may be exploited during the construction of the
    polyphase filter bank. In particular, phases (scaled to [0, 1]) in [0.5, 1] are
    easily derived from those in [0, 0.5], so expensive reevaluation of function
    points is unnecessary. This requires some rather annoying even/odd
    bookkeeping as can be seen from the patch.

    I vaguely recall from signal processing theory more general symmetries allowing even greater
    optimization of the construction. At a high level, "even functions"
    correspond to 2, and one can imagine variations. Nevertheless, for the sake
    of some generality and because of existing filters, this is all that is
    being exploited.

    Currently, this patch relies on phase_count being even or (trivially) 1,
    though this is not an inherent limitation to the approach. This
    assumption is safe as phase_count is 1 << phase_bits, and is hence a
    power of two. There is no way for user API to set it to a nontrivial odd
    number. This assumption has been placed as an assert in the code.

    To repeat, this assumes even symmetry of the filters, which is the most common
    way to get generalized linear phase anyway and is true of all currently
    supported filters.
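
    The symmetry argument can be illustrated with a toy polyphase bank. This is a sketch under stated assumptions, not the code in libswresample/resample.c. The Hann-windowed sinc prototype and the names `prototype`, `phase_taps`, `build_bank_direct`, and `build_bank_symmetric` are all hypothetical. The tap count is assumed even so that mirroring is a plain reversal, which sidesteps the even/odd bookkeeping the message mentions.

```python
import math


def prototype(x, half_width):
    """Even prototype filter: sinc(x) under a Hann window of the given half width."""
    if abs(x) >= half_width:
        return 0.0
    sinc = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
    window = 0.5 + 0.5 * math.cos(math.pi * x / half_width)
    return sinc * window


def phase_taps(ph, tap_count, phase_count):
    """Evaluate every tap of one phase (the expensive step)."""
    center = (tap_count - 1) // 2
    half_width = tap_count / 2
    return [prototype(i - center - ph / phase_count, half_width)
            for i in range(tap_count)]


def build_bank_direct(tap_count, phase_count):
    """Reference construction: evaluate the prototype for every phase."""
    return [phase_taps(ph, tap_count, phase_count) for ph in range(phase_count)]


def build_bank_symmetric(tap_count, phase_count):
    """Evaluate phases 0..phase_count/2 and mirror the remaining ones."""
    assert tap_count % 2 == 0, "sketch assumes an even number of taps"
    assert phase_count == 1 or phase_count % 2 == 0
    half = phase_count // 2
    bank = [phase_taps(ph, tap_count, phase_count) for ph in range(half + 1)]
    for ph in range(half + 1, phase_count):
        # Because the prototype is even, phase ph is phase
        # (phase_count - ph) with its taps in reverse order.
        bank.append(bank[phase_count - ph][::-1])
    return bank
```

    In this toy version only phase_count / 2 + 1 of the phase_count phases call the prototype function, which is where a saving of roughly half would come from.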

    As a side note, accuracy should be identical or perhaps slightly better
    due to this "forcing" filter symmetries leading to a better phase
    characteristic. As before, I can’t test this claim easily, though it may
    be of interest.

    Patch tested with FATE.

    Sample benchmark (x86-64, Haswell, GNU/Linux):

    test: swr-resample-dblp-44100-2626

    new:
    527376779 decicycles in build_filter(loop 1000), 256 runs, 0 skips
    524361765 decicycles in build_filter(loop 1000), 512 runs, 0 skips
    516552574 decicycles in build_filter(loop 1000), 1024 runs, 0 skips

    old:
    974178658 decicycles in build_filter(loop 1000), 256 runs, 0 skips
    972794408 decicycles in build_filter(loop 1000), 512 runs, 0 skips
    954350046 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
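
    The benchmark figures above can be turned into ratios with a few lines of arithmetic. The sketch below uses only the decicycle counts quoted in the message; the helper name `compare` is hypothetical. By this calculation the new code takes roughly 46% less time, a speedup of about 1.85x, at all three run counts.

```python
# Decicycle counts copied from the benchmark above, keyed by number of runs.
new = {256: 527376779, 512: 524361765, 1024: 516552574}
old = {256: 974178658, 512: 972794408, 1024: 954350046}


def compare(old_cycles, new_cycles):
    """Return (speedup factor, fraction of time saved)."""
    return old_cycles / new_cycles, 1.0 - new_cycles / old_cycles


for runs in sorted(new):
    speedup, saved = compare(old[runs], new[runs])
    print(f"{runs:5d} runs: {speedup:.2f}x faster, {saved:.1%} less time")
```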

    Note that lower-level optimizations are entirely possible; I focused on
    getting the high-level semantics correct. In any case, this should
    provide a good foundation.

    Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
    Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>

    • [DH] libswresample/resample.c