
Other articles (47)
-
Sites built with MediaSPIP
2 May 2011
This page presents some of the sites running MediaSPIP. You can of course add your own via the form at the bottom of the page.
-
HTML5 audio and video support
10 April 2011
MediaSPIP uses the HTML5 video and audio tags to play multimedia documents, taking advantage of the latest W3C innovations supported by modern browsers.
For older browsers, the Flowplayer Flash player is used.
The HTML5 player was created specifically for MediaSPIP: its appearance is fully customizable to match a chosen theme.
These technologies make it possible to deliver video and sound both on conventional computers (...)
-
HTML5 audio and video support
13 April 2011
MediaSPIP uses HTML5 video and audio tags to play multimedia files, taking advantage of the latest W3C innovations supported by modern browsers.
The player has been created specifically for MediaSPIP and can easily be adapted to fit a specific theme.
For older browsers, the Flowplayer Flash fallback is used.
MediaSPIP allows for media playback on major mobile platforms with the above (...)
On other sites (4570)
-
lavu/tx: implement 32 bit fixed point FFT and MDCT
9 February 2020, by Lynne
lavu/tx: implement 32 bit fixed point FFT and MDCT
Required minimal changes to the code, so it made sense to implement.
FFT and MDCT tested, the output of both was properly rounded.
Fun fact: the non-power-of-two fixed-point FFT and MDCT are the fastest
non-power-of-two fixed-point FFT and MDCT ever written.
This can replace the power-of-two integer MDCTs in aac and ac3 if the
MIPS optimizations are ported across.
Unfortunately the ac3 encoder uses a 16-bit fixed-point forward transform,
unlike the decoder, which uses a 32-bit inverse transform, so some modifications
might be required there.
The 3-point FFT is somewhat less accurate than it otherwise could be,
having minor rounding errors with bigger transforms. However, this
could be improved later, and the way it's currently written is the way one
would write assembly for it.
Similar rounding errors can also be found throughout the power-of-two FFTs,
though those are more difficult to correct.
Despite this, the integer transforms are more than accurate enough.
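For reference, the forward MDCT of length N that these transforms compute is, by the standard definition (not part of the commit message):
$$X_k = \sum_{n=0}^{2N-1} x_n \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right], \qquad k = 0, \dots, N-1$$
In a fixed-point implementation the samples x_n and the cosine twiddle factors are scaled integers, so every stage has to round its products, which is presumably the rounding the commit message refers to.
-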
In ffmpeg: Can't get peak bitrate values within 10% error tolerance for HLS playlist files
5 October 2019, by ALS20394
I'm using ffmpeg to produce several HLS variant playlists from an .mp4 file. When I check the master.m3u8 file with mediastreamvalidator I get:
Error: Measured peak bitrate compared to master playlist declared value exceeds error tolerance
I understand that the error percentage needs to be less than 10%, and I got one variant within tolerance but not the other three. I've spent quite a bit of time adjusting the -maxrate and the -bufsize, but the change in the error percentage is minimal. I'm beginning to wonder if I'm misunderstanding something?
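Judging by the percentages in the validator output below (my reading, not Apple's documentation), the check applied per variant appears to be
$$\frac{|\text{declared} - \text{measured peak}|}{\text{declared}} \le 0.10$$
where "declared" is the BANDWIDTH value written into the master playlist and "measured peak" is what mediastreamvalidator observes in the segments.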
This is the latest of what I enter:
ffmpeg -i FHVid.mp4 \
-b:v:0 5000k -maxrate 5250k -bufsize 5500k -profile:v main -c:v h264 -crf 20 -sc_threshold 0 -g 48 \
-b:v:1 2800k -maxrate 2940k -bufsize 3100k -profile:v main -c:v h264 -crf 20 -sc_threshold 0 -g 48 \
-b:v:2 1400k -maxrate 1540k -bufsize 1700k -profile:v main -c:v h264 -crf 20 -sc_threshold 0 -g 48 \
-b:v:3 800k -maxrate 840k -bufsize 1050k -profile:v main -c:v h264 -crf 20 -sc_threshold 0 -g 48 \
-b:a:0 192k \
-b:a:1 128k \
-b:a:2 128k \
-b:a:3 96k \
-c:a aac -ar 48000 -keyint_min 48 -map 0:v -map 0:a -map 0:v -map 0:a -map 0:v -map 0:a -map 0:v -map 0:a \
-f hls -var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2 v:3,a:3" \
-master_pl_name FHVidmaster.m3u8 -hls_time 4 -hls_playlist_type vod \
-hls_segment_filename 'file_%v_%03d.ts' out_%v.m3u8
but I've also tried maxrates that seem to be more standard:
-b:v:0 5000k -maxrate 5500k -bufsize 6500k
-b:v:1 2800k -maxrate 3080k -bufsize 3200k
-b:v:2 1400k -maxrate 1540k -bufsize 1900k
-b:v:3 800k -maxrate 880k -bufsize 1050k
The latest error message:
Error: Measured peak bitrate compared to master playlist declared value exceeds error tolerance
--> Detail: Measured: 1111.54 kb/s, Master playlist: 1680.80 kb/s, Error: 33.87%
--> Source: /Users/Bun/Documents/CODING/CosmicPerspectiveAssets/01-FalconHeavy/HLS/FHVidmaster.m3u8
--> Compare: out_2.m3u8
--> Detail: Measured: 1178.21 kb/s, Master playlist: 5711.20 kb/s, Error: 79.37%
--> Source: /Users/Bun/Documents/CODING/CosmicPerspectiveAssets/01-FalconHeavy/HLS/FHVidmaster.m3u8
--> Compare: out_0.m3u8
--> Detail: Measured: 1109.03 kb/s, Master playlist: 3220.80 kb/s, Error: 65.57%
--> Source: /Users/Bun/Documents/CODING/CosmicPerspectiveAssets/01-FalconHeavy/HLS/FHVidmaster.m3u8
--> Compare: out_1.m3u8
Any help would be greatly appreciated on the -maxrate and -bufsize for variant playlists 0, 1, and 2. No adjustment I make seems to make any difference.
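As a sanity check, the reported percentages match the relative-error formula above; for out_2, for example:
$$\frac{1680.80 - 1111.54}{1680.80} \approx 0.3387 = 33.87\%$$
Also worth noting from the log: the measured peaks are nearly identical (roughly 1.1 Mb/s) across all three failing variants even though their declared values differ widely.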
-
Bootstrapping an AI UGC system — video generation is expensive, APIs are limiting, and I need help navigating it all [closed]
24 June, by Barack _ Ouma
I'm building a solo AI-powered UGC (User-Generated Content) platform: something that automates the creation of short-form content using AI avatars, voices, visuals, and scripts. But I've hit a wall with video generation and API limitations.


So far, I’ve integrated TTS and voice cloning (using ElevenLabs), and I’ve gotten image generation working. But video generation (especially talking avatars) has been a nightmare — both financially and technically.


🛠️ Features I'm trying to build:


AI avatars (face + lip-syncing)
Script generation (LLM-driven)
Image generation
Video composition


I'm trying to build a faceless AI content-creation automation platform, an alternative to Makeugc.com, Reelfarm.org, or postbridge.com: just a working pipeline for automated content.


❌ Challenges so far:


Services like D-ID, Synthesia, Magic Hour, and Luma are either paywalled, lack trials, or are very expensive.


D-ID does support avatar creation, but you need to pay upfront even to access those features. There's no easy or free entry point.


Tools like Google Veo 3 are powerful but clearly not accessible for indie builders.
I’ve looked into open-source models like WAN 2.1, CogVideo, etc., but I have no clue how to run them or what infra is needed.


Now I’m torn between buying my own GPU or renting compute power to self-host these models.


💸 Cost is a huge blocker


I’ve been looking through Replicate’s pricing, and while some models (especially image gen) are manageable, video models get expensive fast. Even GPU rental rates stack up quickly, especially if you’re testing often or experimenting with pipelines. Plus, idle time billing doesn’t help.


💭 What I could really use help with:


Has anyone successfully stitched together APIs (voice, avatar, video) into a working UGC pipeline? (For the composition step I have in mind, see the ffmpeg sketch after this list.)


Should I use separate services (e.g. ElevenLabs + Synthesia + WAN) or try to host my own end-to-end system?


Is it cheaper long-term to buy a used GPU like a 4090 and run things locally, or better to rent compute short-term?


Any open-source solutions that are beginner-friendly or have minimal setup?
Any existing frameworks or wrappers for UGC media pipelines that make all this easier?
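For the composition step alone, plain ffmpeg can already stitch a still avatar frame and a TTS track into a clip, with no paid API involved. A minimal sketch (the file names are hypothetical placeholders, not outputs from any of the services above):

# avatar.png : a frame from the image-generation step (hypothetical name)
# voice.mp3  : the ElevenLabs TTS output (hypothetical name)
# -loop 1 repeats the still image as a video stream; -shortest stops when the audio ends
ffmpeg -loop 1 -i avatar.png -i voice.mp3 \
  -c:v libx264 -tune stillimage -pix_fmt yuv420p \
  -c:a aac -b:a 192k \
  -shortest talking_head.mp4

This gives a static image rather than a lip-synced avatar; an open-source lip-sync model (e.g. Wav2Lip) would slot in between the image step and this mux.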


I’ve spent weeks researching, testing APIs, and hitting walls — and while I’ve learned a lot, I’d really appreciate any guidance from folks who’ve been here before.
Thanks in advance 🙏


And good luck to everyone else trying to build with AI on a budget — this stuff isn’t as plug-and-play as it looks on launch videos 💀