  • Improving Performance and Quality of Screen Streaming (Headless) with FFMPEG and NVIDIA Hardware Acceleration [closed]

    April 24, by Dhairya Verma

    I am attempting to stream my screen to an RTMP URL using FFMPEG, with X11 for screen capture and NVIDIA hardware acceleration to improve performance. Despite the NVIDIA acceleration, the stream still lags and the output quality is low. I've noticed that FFMPEG uses only about 100MB of GPU memory, which seems low. Here's the command I'm currently using:


    ffmpeg -hwaccel cuvid -f x11grab -s 1920x1080 -i :1 -f pulse -i VirtualSink.monitor -c:v h264_nvenc -preset:v p1 -b:v 2500k -maxrate 2500k -bufsize 5000k -vf "fps=30,crop=1280:720:320:180,format=yuv420p" -g 60 -c:a aac -b:a 128k -ar 44100 -f flv rtmps://[RTMP_URL]


    Questions:

    1. Are there any specific settings or tweaks I should consider to fully utilize the GPU for better performance and video quality?

    2. Is there a better way to configure the bitrate or buffer size to improve the stream quality without increasing lag?

    3. Would adjusting the preset or using different ffmpeg flags help in reducing the load and improving the output?

    4. Any advice on optimizing FFMPEG for smoother streaming with hardware acceleration would be greatly appreciated!
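
    As a starting point for questions 1 to 3, here is a minimal sketch of one possible variant of the command, built as a Python argument list. It rests on two assumptions that are not confirmed in the question: x11grab delivers raw frames, so there is nothing for `-hwaccel` to decode and the flag is dropped; and grabbing only the 1280x720 region at offset 320,180 is equivalent to grabbing 1920x1080 and cropping afterwards. The function name `build_stream_cmd` and its parameters are hypothetical.

```python
def build_stream_cmd(display=":1", rtmp_url="rtmps://[RTMP_URL]"):
    """Return an ffmpeg argument list that grabs a 1280x720 region at 30 fps.

    Assumption: the region matches crop=1280:720:320:180 from the original
    command, expressed here as x11grab input options instead of a filter.
    """
    return [
        "ffmpeg",
        # x11grab input options: frame rate, capture size, and +x,y offset
        "-f", "x11grab", "-framerate", "30", "-video_size", "1280x720",
        "-i", f"{display}+320,180",
        # Audio input, unchanged from the original command
        "-f", "pulse", "-i", "VirtualSink.monitor",
        # Encoder settings, unchanged from the original command
        "-c:v", "h264_nvenc", "-preset:v", "p1",
        "-b:v", "2500k", "-maxrate", "2500k", "-bufsize", "5000k",
        # Only the pixel format conversion is left in the filter chain
        "-vf", "format=yuv420p", "-g", "60",
        "-c:a", "aac", "-b:a", "128k", "-ar", "44100",
        "-f", "flv", rtmp_url,
    ]


if __name__ == "__main__":
    print(" ".join(build_stream_cmd()))
```

    Capturing a smaller region means fewer pixels are copied and converted on the CPU before the frames reach NVENC, which is where a screen-capture pipeline like this one tends to spend its time.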




  • FFMPEG - converting interlaced 720x480 MPEG2 to scaled up progressive 1440x1080 HEVC (Nvidia available)

    February 6, by PCSO SAR COMM

    I am trying to improve the video of a ripped DVD. The current file is encoded as interlaced MPEG2 at 720x480. I am trying to see whether, with a bit of conversion, it can be made to look better, possibly by upscaling to near-HD. My computer does have an NVIDIA card (1050 Ti), so I was thinking that may help.


    I used


    ffmpeg -hwaccel cuda -i "input.mkv" -c:v hevc_nvenc -preset p7 -profile:v 1 -level 0 -tier 1 -tune 1 -vf yadif -vf "scale=1440x1080:flags=lanczos" -c:a aac -c:s srt output.mkv


    The "-vf yadif" portion was intended to de-interlace, based on ideas I found on de-interlacing, but the output still ends up interlaced. The interlacing artifacts are quite annoying when viewed on my computer.
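
    One thing worth checking: when `-vf` is given twice for the same output, ffmpeg uses only the last one, so in the command above the `scale` filter replaces `yadif` rather than running after it. A minimal sketch of a single combined filter chain, built as a Python argument list, follows. The function name `build_deinterlace_upscale_cmd` is hypothetical, and the `-profile:v`, `-level`, `-tier` and `-tune` options from the original are left out for brevity, not because they are known to be wrong.

```python
def build_deinterlace_upscale_cmd(src="input.mkv", dst="output.mkv"):
    """Return an ffmpeg argument list with one -vf chain: yadif, then scale."""
    # Filters in one chain are comma-separated and applied left to right:
    # de-interlace first, then upscale the progressive frames.
    filters = ",".join(["yadif", "scale=1440:1080:flags=lanczos"])
    return [
        "ffmpeg", "-hwaccel", "cuda", "-i", src,
        "-vf", filters,
        "-c:v", "hevc_nvenc", "-preset", "p7",
        "-c:a", "aac", "-c:s", "srt",
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(build_deinterlace_upscale_cmd()))
```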


  • Low GPU Utilization NVIDIA / FFMPEG

    September 16, 2023, by parakeetdev

    I'm trying to run a Docker container to offload media transcoding to serverless GPUs. The container image is based on "nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04".


    Here's the configuration for FFMPEG in my Dockerfile:


    git clone && \
    make install -C ./nv-codec-headers && \
    git clone ffmpeg_source/ && \
    /ffmpeg_source/configure --prefix=/usr --ld="g++" --enable-nonfree --enable-gpl --enable-gnutls --enable-cuda-nvcc --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-ffnvcodec --enable-libnpp --enable-libmp3lame --enable-libx264 --enable-libx265 --enable-libvpx --enable-libfreetype --enable-libvorbis --enable-libfdk-aac --enable-libopus --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-shared --disable-stripping


    I have the GPU capability enabled in my compose.yml:


          - driver: nvidia
            count: 1
            capabilities: [gpu]


    I see the CUDA startup banner when the container deploys. It's running on an RTX A6000, which supports NVIDIA hardware-accelerated encoding and decoding in ffmpeg.


    My FFMPEG command is as follows:


    command = "ffmpeg -y -hwaccel cuda -hwaccel_output_format cuda -i - "
    command += f"-vf scale_npp=1920:1080 -c:v h264_nvenc -b:v 5M -preset p2 -tune ll -f mp4 -bufsize 5M -maxrate 10M -qmin 0 -g 250 -bf 3 -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 -i_qfactor 0.75 -b_qfactor 1.1 {} "
    command += f"-vf scale_npp=1280:720 -c:v h264_nvenc -b:v 3M -preset p2 -tune ll -f mp4 -bufsize 3M -maxrate 6M -qmin 0 -g 250 -bf 3 -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 -i_qfactor 0.75 -b_qfactor 1.1 {} "
    command += f"-vf scale_npp=640:480 -c:v h264_nvenc -b:v 1M -preset p2 -tune ll -f mp4 -bufsize 1M -maxrate 2M -qmin 0 -g 250 -bf 3 -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 -i_qfactor 0.75 -b_qfactor 1.1 {}"


    I'm using Python and piping the input to ffmpeg's stdin as bytes.


    The CPU stays at 100%, while I'm lucky if the GPU ever leaves 0%. I think I've seen it hit at most about 4% utilization, while the CPU is completely maxed out.


    I've tried simpler commands. I thought maybe it was due to the audio, so I dropped the audio, but it didn't change anything.


    I've tried different CUDA images: 11.8, 12.0, 12.1 and 12.2.


    I've tried both the runtime and devel images for each of those versions.


    The drivers are up to date.


    It clearly taps into the GPU, because utilization briefly bumps up to a few percent before going back down to zero. On top of this, the output is also wrong or corrupted: no video player will open the file, and each reports that it can't be played.


    I have also swapped "-hwaccel cuda" for "-hwaccel nvdec".


    No errors are thrown and nothing changes. I have also tried hevc_nvenc as the encoder (H.265), which made no difference either.


    I'm not sure what I'm doing wrong. Maybe this can't be done via piping?
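
    To make a setup like this easier to debug, the command can be built as an argument list and run with `subprocess`, capturing stderr so that ffmpeg's own messages are visible. The following is a minimal sketch, not the author's code: the helper names (`rendition_args`, `build_cmd`, `transcode`), the `LADDER` table and the output paths are hypothetical, and the encoder options are copied from the command above without any claim that they are the right ones.

```python
import subprocess

# (width, height, video bitrate in Mbit/s), taken from the three outputs above
LADDER = [(1920, 1080, 5), (1280, 720, 3), (640, 480, 1)]

# Encoder options shared by every rendition in the original command
COMMON = [
    "-qmin", "0", "-g", "250", "-bf", "3", "-b_ref_mode", "middle",
    "-temporal-aq", "1", "-rc-lookahead", "20",
    "-i_qfactor", "0.75", "-b_qfactor", "1.1",
]


def rendition_args(width, height, mbit, out_path):
    """Arguments for one output: bufsize equals the bitrate, maxrate is 2x."""
    return [
        "-vf", f"scale_npp={width}:{height}",
        "-c:v", "h264_nvenc", "-b:v", f"{mbit}M",
        "-preset", "p2", "-tune", "ll", "-f", "mp4",
        "-bufsize", f"{mbit}M", "-maxrate", f"{2 * mbit}M",
        *COMMON, out_path,
    ]


def build_cmd(out_paths):
    """Full argument list: one stdin input, one output per ladder entry."""
    cmd = ["ffmpeg", "-y", "-hwaccel", "cuda",
           "-hwaccel_output_format", "cuda", "-i", "-"]
    for (width, height, mbit), path in zip(LADDER, out_paths):
        cmd += rendition_args(width, height, mbit, path)
    return cmd


def transcode(data, out_paths):
    """Pipe `data` (bytes) to ffmpeg's stdin and raise with stderr on failure."""
    proc = subprocess.run(build_cmd(out_paths), input=data,
                          stderr=subprocess.PIPE)
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.decode(errors="replace"))
```

    A call such as `transcode(video_bytes, ["1080.mp4", "720.mp4", "480.mp4"])` would then either finish or raise with ffmpeg's log, which also shows whether the decoder actually in use is the hardware one.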