
Media (3)
-
Example of action buttons for a collaborative collection
27 February 2013
Updated: March 2013
Language: French
Type: Image
-
Example of action buttons for a personal collection
27 February 2013
Updated: February 2013
Language: English
Type: Image
-
Collections - Quick creation form
19 February 2013
Updated: February 2013
Language: French
Type: Image
Other articles (84)
-
Permissions overridden by plugins
27 April 2010, by Mediaspip core
autoriser_auteur_modifier() so that visitors can edit their own information on the authors page
-
Sites built with MediaSPIP
2 May 2011
This page presents some of the sites running MediaSPIP.
You can of course add your own using the form at the bottom of the page.
-
Configuring language support
15 November 2010
Accessing the configuration and adding supported languages
To configure support for new languages, go to the "Administer" section of the site.
From there, the navigation menu gives access to a "Language management" section where support for new languages can be enabled.
Each newly added language can still be deactivated as long as no object has been created in that language. In that case, it becomes greyed out in the configuration and (...)
On other sites (9017)
-
Bootstrapping an AI UGC system: video generation is expensive, APIs are limiting, and I need help navigating it all [closed]
24 June, by Barack _ Ouma
I’m building a solo AI-powered UGC (User-Generated Content) platform that automates the creation of short-form content using AI avatars, voices, visuals, and scripts. But I’ve hit a wall with video generation and API limitations.


So far, I’ve integrated TTS and voice cloning (using ElevenLabs), and I’ve gotten image generation working. But video generation (especially talking avatars) has been a nightmare — both financially and technically.


🛠️ Features I’m trying to build:


AI avatars (face + lip-syncing)
Script generation (LLM-driven)
Image generation
Video composition


I’m trying to build a faceless AI content-creation automation platform as an alternative to Makeugc.com, Reelfarm.org, or postbridge.com. For now I’m just trying to get a working pipeline for automated content.


❌ Challenges so far:


Services like D-ID, Synthesia, Magic Hour, and Luma are either paywalled, have no trials, or are very expensive.


D-ID does support avatar creation, but you need to pay upfront to even access those features. There's no easy/free entry point.


Tools like Google Veo 3 are powerful but clearly not accessible for indie builders.
I’ve looked into open-source models like WAN 2.1, CogVideo, etc., but I have no clue how to run them or what infra is needed.


Now I’m torn between buying my own GPU or renting compute power to self-host these models.


💸 Cost is a huge blocker


I’ve been looking through Replicate’s pricing, and while some models (especially image gen) are manageable, video models get expensive fast. Even GPU rental rates stack up quickly, especially if you’re testing often or experimenting with pipelines. Plus, idle time billing doesn’t help.


💭 What I could really use help with:


Has anyone successfully stitched together APIs (voice, avatar, video) into a working UGC pipeline?


Should I use separate services (e.g. ElevenLabs + Synthesia + WAN) or try to host my own end-to-end system?


Is it cheaper (long term) to buy a used GPU like a 4090 and run things locally? Or better to rent compute short-term?


Any open-source solutions that are beginner-friendly or have minimal setup?
Any existing frameworks or wrappers for UGC media pipelines that make all this easier?
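
The buy-versus-rent question above comes down to a break-even calculation. The following is only a rough sketch in Python: the function name `break_even_hours`, its parameters, and every number in the usage line are hypothetical placeholders, not real prices for any GPU or provider. It ignores idle-time billing, setup time, and hardware failure.

```python
def break_even_hours(gpu_price, rental_rate_per_hour, power_kw,
                     electricity_per_kwh, resale_value=0.0):
    """Return the number of GPU-hours after which buying beats renting.

    All inputs are assumptions supplied by the caller:
      gpu_price            - purchase price of the card
      rental_rate_per_hour - what the rented GPU costs per hour of use
      power_kw             - average draw of the local machine under load
      electricity_per_kwh  - local electricity price
      resale_value         - what the card could be sold for later
    Returns None when renting is never more expensive per hour.
    """
    local_cost_per_hour = power_kw * electricity_per_kwh
    saving_per_hour = rental_rate_per_hour - local_cost_per_hour
    if saving_per_hour <= 0:
        return None  # running locally costs at least as much per hour
    return (gpu_price - resale_value) / saving_per_hour


# Placeholder figures only; substitute real quotes before drawing conclusions.
hours = break_even_hours(gpu_price=1600.0, rental_rate_per_hour=0.5,
                         power_kw=0.4, electricity_per_kwh=0.25)
print(f"break-even after roughly {hours:.0f} GPU-hours")
```

If the expected number of GPU-hours over the life of the project is well below the break-even figure, renting is likely the cheaper option under these assumptions.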


I’ve spent weeks researching, testing APIs, and hitting walls — and while I’ve learned a lot, I’d really appreciate any guidance from folks who’ve been here before.
Thanks in advance 🙏


And good luck to everyone else trying to build with AI on a budget. This stuff isn’t as plug-and-play as it looks in launch videos 💀


-
Error recording an RTSP stream without transcoding
23 August 2017, by Matt
I’m trying to use FFmpeg to record RTSP streams from several security cameras. I have been successfully transcoding each stream for months now, but since this requires considerable CPU power, I’d like to simply copy each stream to disk in its original H.264 format.
Whenever I try this, I receive an error similar to this (the "current" value varies):
Non-monotonous DTS in output stream 0:0; previous: 0, current: -62743;
I’ve stripped most of the options I was using, although I really do want to keep -xerror so that FFmpeg quits when it encounters an error:
ffmpeg.exe -xerror -i rtsp://admin:admin@192.168.1.135 -an -vcodec copy test.mp4
And I still get this:
ffmpeg version 3.3.3 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.1.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
  libavutil 55. 58.100 / 55. 58.100
  libavcodec 57. 89.100 / 57. 89.100
  libavformat 57. 71.100 / 57. 71.100
  libavdevice 57. 6.100 / 57. 6.100
  libavfilter 6. 82.100 / 6. 82.100
  libswscale 4. 6.100 / 4. 6.100
  libswresample 2. 7.100 / 2. 7.100
  libpostproc 54. 5.100 / 54. 5.100
[udp @ 0000000002533b60] 'circular_buffer_size' option was set but it is not supported on this build (pthread support is required)
[udp @ 0000000000ec97a0] 'circular_buffer_size' option was set but it is not supported on this build (pthread support is required)
Input #0, rtsp, from 'rtsp://admin:admin@192.168.1.135':
  Metadata:
    title: RTSP Session/2.0
  Duration: N/A, start: 0.837144, bitrate: N/A
    Stream #0:0: Video: h264 (High), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 7 fps, 25 tbr, 90k tbn, 14 tbc
Output #0, mp4, to 'test.mp4':
  Metadata:
    title: RTSP Session/2.0
    encoder: Lavf57.71.100
    Stream #0:0: Video: h264 (High) ([33][0][0][0] / 0x0021), yuvj420p(pc, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 7 fps, 25 tbr, 90k tbn, 90k tbc
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[mp4 @ 00000000036b8340] Non-monotonous DTS in output stream 0:0; previous: 0, current: -62743; aborting.
Conversion failed!
Can anyone explain what the problem is and/or suggest the appropriate flags to handle this?
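
As background to the error message in the question above: the MP4 muxer expects the decoding timestamps (DTS) of a stream to increase from one packet to the next, and the log shows a packet with DTS -62743 arriving after one with DTS 0. The Python sketch below is a simplified model of that check, not FFmpeg's code. The name `enforce_monotonic_dts` and its parameters are hypothetical, and bumping an offending value to just past the previous one is only one possible repair, shown for illustration.

```python
def enforce_monotonic_dts(dts_values, strict=True):
    """Detect and repair non-increasing DTS values in a packet sequence.

    dts_values - iterable of integer DTS values in arrival order
    strict     - if True each DTS must exceed the previous one;
                 if False equal values are tolerated
    Returns (fixed, adjusted) where `adjusted` lists
    (index, original_dts, replacement_dts) for every packet changed.
    """
    fixed = []
    adjusted = []
    last = None
    for index, dts in enumerate(dts_values):
        if last is not None:
            lowest_allowed = last + 1 if strict else last
            if dts < lowest_allowed:
                # This is the situation the log reports as
                # "Non-monotonous DTS ... previous: 0, current: -62743".
                adjusted.append((index, dts, lowest_allowed))
                dts = lowest_allowed
        fixed.append(dts)
        last = dts
    return fixed, adjusted


# The values 0 and -62743 come from the log; 3600 is a made-up follow-up packet.
fixed, adjusted = enforce_monotonic_dts([0, -62743, 3600])
print(fixed)
print(adjusted)
```

With -xerror set, the log in the question shows FFmpeg aborting at the first offending packet rather than continuing.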
-
swresample/resample: speed up build_filter by 50%
4 November 2015, by Ganesh Ajjanagadde
This speeds up build_filter by 50%. This gain should be pretty
consistent across all architectures and platforms.

Essentially, this relies on an observation that the filters have some
even/odd symmetry that may be exploited during the construction of the
polyphase filter bank. In particular, phases (scaled to [0, 1]) in [0.5, 1] are
easily derived from [0, 0.5] and expensive reevaluation of function
points is unnecessary. This requires some rather annoying even/odd
bookkeeping as can be seen from the patch.

I vaguely recall from signal processing theory more general symmetries allowing even greater
optimization of the construction. At a high level, "even functions"
correspond to 2, and one can imagine variations. Nevertheless, for the sake
of some generality and because of existing filters, this is all that is
being exploited.

Currently, this patch relies on phase_count being even or (trivially) 1,
though this is not an inherent limitation to the approach. This
assumption is safe as phase_count is 1 << phase_bits, and is hence a
power of two. There is no way for user API to set it to a nontrivial odd
number. This assumption has been placed as an assert in the code.

To repeat, this assumes even symmetry of the filters, which is the most common
way to get generalized linear phase anyway and is true of all currently
supported filters.

As a side note, accuracy should be identical or perhaps slightly better
due to this "forcing" filter symmetries leading to a better phase
characteristic. As before, I can’t test this claim easily, though it may
be of interest.

Patch tested with FATE.
Sample benchmark (x86-64, Haswell, GNU/Linux):
test: swr-resample-dblp-44100-2626

new:
527376779 decicycles in build_filter(loop 1000), 256 runs, 0 skips
524361765 decicycles in build_filter(loop 1000), 512 runs, 0 skips
516552574 decicycles in build_filter(loop 1000), 1024 runs, 0 skips

old:
974178658 decicycles in build_filter(loop 1000), 256 runs, 0 skips
972794408 decicycles in build_filter(loop 1000), 512 runs, 0 skips
954350046 decicycles in build_filter(loop 1000), 1024 runs, 0 skips

Note that lower level optimizations are entirely possible, I focussed on
getting the high level semantics correct. In any case, this should
provide a good foundation.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
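
The symmetry argument in the commit message above can be illustrated with a small Python sketch. This is not the patch's C code. It assumes an even tap count, a Hann-windowed sinc as a stand-in for the real filters, and no per-phase normalization. The names `kernel`, `direct_bank`, and `symmetric_bank` are hypothetical. Under those assumptions, the taps for phase phase_count - ph are the taps for phase ph in reverse order, so only phases 0 through phase_count / 2 need the expensive kernel evaluation.

```python
import math


def kernel(t, half_width):
    """Even windowed-sinc kernel (Hann window); a stand-in for the real filters."""
    if abs(t) >= half_width:
        return 0.0
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return sinc * window


def phase_taps(ph, tap_count, phase_count):
    """Evaluate the kernel at every tap position for one fractional phase."""
    center = (tap_count - 1) // 2
    half_width = tap_count / 2
    return [kernel(i - center - ph / phase_count, half_width)
            for i in range(tap_count)]


def direct_bank(tap_count, phase_count):
    """Reference construction: evaluate the kernel for every phase."""
    return [phase_taps(ph, tap_count, phase_count) for ph in range(phase_count)]


def symmetric_bank(tap_count, phase_count):
    """Evaluate phases in [0, 0.5] only and mirror them to get (0.5, 1)."""
    assert tap_count % 2 == 0, "this sketch assumes an even tap count"
    assert phase_count == 1 or phase_count % 2 == 0
    bank = [None] * phase_count
    for ph in range(phase_count // 2 + 1):
        bank[ph] = phase_taps(ph, tap_count, phase_count)
    for ph in range(phase_count // 2 + 1, phase_count):
        # Even kernel: phase (phase_count - ph) reversed gives phase ph.
        bank[ph] = bank[phase_count - ph][::-1]
    return bank


reference = direct_bank(tap_count=8, phase_count=16)
mirrored = symmetric_bank(tap_count=8, phase_count=16)
same = all(math.isclose(a, b, abs_tol=1e-12)
           for row_a, row_b in zip(reference, mirrored)
           for a, b in zip(row_a, row_b))
print("mirrored bank matches direct bank:", same)
```

In this sketch the mirrored construction calls the kernel for phase_count / 2 + 1 phases instead of phase_count, which is in line with the roughly halved build_filter timings in the benchmark.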