Newest 'ffmpeg' Questions - Stack Overflow
Articles published on the site
-
Output image with correct aspect with ffmpeg
11 February, by koichirose. I have an mkv video with the following properties (obtained with mediainfo):
Width : 718 pixels
Height : 432 pixels
Display aspect ratio : 2.35:1
Original display aspect ratio : 2.35:1
I'd like to take screenshots of it at certain times:
ffmpeg -ss 4212 -i filename.mkv -frames:v 1 -q:v 2 out.jpg
This will produce a 718x432 jpg image, but the aspect ratio is wrong (the image is "squeezed" horizontally). AFAIK, the output image should be 1015x432 (width = height × DAR). Is this calculation correct?
Is there a way to have ffmpeg output images with the correct size/AR for all videos (i.e. no "hardcoded" values)? I tried playing with the setdar/setsar filters without success.
Also, out of curiosity, trying to obtain SAR and DAR with ffmpeg produces:
Stream #0:0(eng): Video: h264 (High), yuv420p(tv, smpte170m/smpte170m/bt709, progressive), 718x432 [SAR 64:45 DAR 2872:1215], SAR 155:109 DAR 55645:23544, 24.99 fps, 24.99 tbr, 1k tbn, 49.98 tbc (default)
2872/1215 ≈ 2.363, a slightly different value than what mediainfo reported. Does anyone know why?
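For what it's worth, the numbers can be cross-checked: the stream-level DAR is exactly width × SAR / height, so the 2.363 figure follows directly from the SAR 64:45 that ffmpeg reports, while mediainfo shows the rounded container value 2.35:1 (which is where 1015 comes from). A quick check in Python, using the values from the ffmpeg output above:

```python
from fractions import Fraction

width, height = 718, 432
sar = Fraction(64, 45)            # stream SAR reported by ffmpeg

# DAR = (width * SAR) / height; reduces exactly to 2872:1215
dar = Fraction(width) * sar / height
print(dar)                        # 2872/1215, i.e. ~2.363

# Display width = height * DAR (equivalently width * SAR)
display_width = round(height * float(dar))
print(display_width)              # 1021 (432 * 2.35 would give ~1015)
```

As for a non-hardcoded way to get correctly sized screenshots, the usual suggestion is to let ffmpeg apply the SAR itself with a filter such as `-vf "scale=iw*sar:ih"` before grabbing the frame, so the output width is computed per-video.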
-
FFmpeg - convert 2 audio tracks from video to 5.1 audio, (play video with different languages to different devices) [closed]
10 February, by Sandre. How can I watch a movie with one language playing through the speakers and another through the headphones?
Disclaimer: I know nothing about audio conversion and don't want to study ffmpeg. I spent a few hours searching for how to do this, actually much more than I wanted. I found a bunch of questions from different people and not a single working solution, so I made a clunky but working one. If someone helps me make it more elegant, I'll be happy. If my question just gets downvoted like most ffmpeg newbie questions, it probably deserves it. And I hope my question can help people who want to enjoy video in two different languages.
A clumsy but working solution.
Set up an Aggregate Audio Device to play 2 channels of the 5.1 through the speakers and 2 through Bluetooth headphones. (The screenshot shows Audio MIDI Setup on macOS.)
Use ffmpeg to convert 2 audio tracks into 5.1 audio.
Play video with new external audio track.
# print the list of streams
ffprobe INPUT.mkv 2>&1 >/dev/null | grep Stream
--- sample output ---
Stream #0:0(eng): Video: h264 (High), yuv420p(progressive), 1280x544, SAR 1:1 DAR 40:17, 23.98 fps, 23.98 tbr, 1k tbn (default)
Stream #0:1(rus): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s (default)
Stream #0:2(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 640 kb/s

# extract the audio tracks
ffmpeg -i INPUT.mkv -map 0:1 -acodec copy ru.ac3
ffmpeg -i INPUT.mkv -map 0:2 -acodec copy en.ac3

# extract only the front-center (speech) channel from each 5.1 track
ffmpeg -i en.ac3 -filter_complex "channelsplit=channel_layout=5.1:channels=FC[center]" -map "[center]" en_front_center.wav
ffmpeg -i ru.ac3 -filter_complex "channelsplit=channel_layout=5.1:channels=FC[center]" -map "[center]" ru_front_center.wav

# join back to 5.1 (EN on the four front/LFE channels, RU on the two back channels)
ffmpeg -i en_front_center.wav -i ru_front_center.wav -filter_complex "[0:a][0:a][0:a][0:a][1:a][1:a]join=inputs=6:channel_layout=5.1[a]" -map "[a]" output.wav
Is it possible to avoid re-encoding the audio and copying the same channel many times to reduce the file size?
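A hedged suggestion rather than a tested answer: the extract/split/join pipeline above can in principle be collapsed into a single ffmpeg invocation using the pan and amerge filters, which removes the intermediate files and the repeated [0:a] inputs. (Re-encoding of the audio itself cannot be avoided, since the channels are being remixed.) The sketch below just assembles the command in Python for readability; the stream indices 0:1/0:2 are taken from the ffprobe output above:

```python
import subprocess  # only needed if you actually run the command

# Build the filtergraph: take each track's front-center channel as mono,
# merge the two monos, then spread them into a 5.1 layout the same way the
# original join did: EN on FL/FR/FC/LFE, RU on BL/BR.
filtergraph = (
    "[0:1]pan=mono|c0=FC[ru];"
    "[0:2]pan=mono|c0=FC[en];"
    "[en][ru]amerge=inputs=2,"
    "pan=5.1|FL=c0|FR=c0|FC=c0|LFE=c0|BL=c1|BR=c1[a]"
)

cmd = [
    "ffmpeg", "-i", "INPUT.mkv",
    "-filter_complex", filtergraph,
    "-map", "[a]",
    "output.wav",
]
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to execute
```

Since only the two mono center channels carry unique data, encoding the output with a lossy codec (e.g. `-c:a ac3`) instead of WAV should also shrink the file considerably.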
-
OpenCV & RTSP - Python errors
10 February, by Midhun M. I'm working on a Python script that reads multiple RTSP streams using OpenCV, detects motion, and saves frames when motion is detected. Initially, I faced issues with ash-colored frames due to the H.265 codec, which OpenCV doesn't support by default. After switching the camera codecs to H.264, the ash-colored frames issue was resolved. However, I'm now encountering decoding errors and glitching frames.
System Specifications:
Processor: Intel Core i3-6100 CPU @ 3.70GHz
RAM: 8GB
Resource usage: CPU 35-45%, RAM 1GB max
Network speed: 7.4 Mb/s
Disk usage: 2.3 MB/s
Here’s the Python script I’m using:
import cv2
import os
import datetime
import threading
import time
import argparse
from cryptography.fernet import Fernet  # unused while encrypt_text is a stub

def encrypt_text(text):
    return text

class MotionDetector:
    def __init__(self, base_dir="motion_frames"):
        self.base_dir = base_dir
        self.output_dirs = [os.path.join(self.base_dir, str(i)) for i in range(1, 4)]
        for dir_path in self.output_dirs:
            os.makedirs(dir_path, exist_ok=True)
        self.fgbg_dict = {}

    def initialize_fgbg(self, camera_name):
        if camera_name not in self.fgbg_dict:
            self.fgbg_dict[camera_name] = cv2.createBackgroundSubtractorMOG2(
                history=500, varThreshold=35, detectShadows=True)

    def detect_motion(self, frame, camera_name):
        self.initialize_fgbg(camera_name)
        fgmask = self.fgbg_dict[camera_name].apply(frame)
        thresh = cv2.threshold(fgmask, 200, 255, cv2.THRESH_BINARY)[1]
        thresh = cv2.dilate(thresh, None, iterations=2)
        contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return any(cv2.contourArea(contour) > 500 for contour in contours)

    def save_frame(self, frame, camera_name, count):
        folder_index = (count - 1) % 3  # rotate between folders 0, 1 and 2
        output_dir = self.output_dirs[folder_index]
        timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        # no extension here; ".jpg" is appended once when building the path
        filename = f"{camera_name}_Entireframe_object_{count}_{timestamp}"
        name = encrypt_text(filename)
        pathname = os.path.join(output_dir, f'{name}.jpg')
        cv2.imwrite(pathname, frame)
        # print(f"Motion detected: {pathname}")

def process_camera_stream(rtsp_url, camera_name, detector, stop_event):
    cap = cv2.VideoCapture(rtsp_url, cv2.CAP_FFMPEG)
    print("Started camera:", camera_name)
    count = 0
    while not stop_event.is_set():
        ret, frame = cap.read()
        if not ret:
            print(f"Connection lost: {camera_name}. Reconnecting...")
            cap.release()
            # keep the FFmpeg backend when reconnecting
            cap = cv2.VideoCapture(rtsp_url, cv2.CAP_FFMPEG)
            continue
        if detector.detect_motion(frame, camera_name):
            count += 1
            detector.save_frame(frame, camera_name, count)
    cap.release()

def main():
    parser = argparse.ArgumentParser(description='RTSP Motion Detection')
    parser.add_argument('--output', type=str, default="motion_frames", help='Output directory')
    args = parser.parse_args()

    rtsp_urls = {
        "Camera1": "rtsp://admin:J**@884@192.168.1.103:554/cam/realmonitor?channel=1&subtype=1&protocol=TCP",
        "Camera2": "rtsp://admin:J***@884@192.168.1.105:554/cam/realmonitor?channel=1&subtype=1&protocol=TCP",
        "Camera3": "rtsp://admin:J***@884@192.168.1.104:554/cam/realmonitor?channel=1&subtype=1&protocol=TCP",
        "Camera4": "rtsp://admin:J@884@192.168.1.101:554/cam/realmonitor?channel=1&subtype=1&protocol=TCP",
        "Camera5": "rtsp://admin:admin123@192.168.1.33:554/Streaming/Channels/301&protocol=TCP",
        "Camera6": "rtsp://admin:admin123@192.168.1.33:554/Streaming/Channels/401&protocol=TCP",
        "Camera7": "rtsp://admin:admin123@192.168.1.33:554/Streaming/Channels/701&protocol=TCP",
    }

    detector = MotionDetector(base_dir=args.output)
    stop_event = threading.Event()
    threads = []
    try:
        for camera_name, url in rtsp_urls.items():
            thread = threading.Thread(target=process_camera_stream,
                                      args=(url, camera_name, detector, stop_event))
            thread.start()
            threads.append(thread)
        while True:
            time.sleep(1)  # idle instead of a busy-wait "pass" loop
    except KeyboardInterrupt:
        print("Stopping...")
        stop_event.set()
        for thread in threads:
            thread.join()

if __name__ == "__main__":
    main()
Some of the saved images have glitches (I'm attaching a few examples). When running the script, I'm getting the following decoding errors in the terminal:
[h264 @ 00000185da541a80] error while decoding MB 26 1, bytestream -29
[h264 @ 00000185d8fefb00] error while decoding MB 23 31, bytestream -5
[h264 @ 00000185cedcc140] error while decoding MB 36 35, bytestream -7
[h264 @ 00000185d8ae73c0] cabac decode of qscale diff failed at 40 35
[h264 @ 00000185d8ae73c0] error while decoding MB 40 35, bytestream -5
[h264 @ 00000185da541a80] error while decoding MB 32 30, bytestream -11
[h264 @ 00000185e15f8500] error while decoding MB 16 34, bytestream -11
[h264 @ 00000185e15f9700] error while decoding MB 9 33, bytestream -9
[h264 @ 00000185e15fb680] error while decoding MB 6 32, bytestream -5
[h264 @ 00000185e15f8500] error while decoding MB 23 23, bytestream -7
[h264 @ 00000185e15fb680] error while decoding MB 28 23, bytestream -5
[h264 @ 00000185e15fa000] error while decoding MB 27 19, bytestream -37
[h264 @ 00000185e15fa900] error while decoding MB 6 27, bytestream -7
[h264 @ 00000185e15f8500] error while decoding MB 14 12, bytestream -5
[h264 @ 00000185e15f9280] error while decoding MB 22 35, bytestream -7
[h264 @ 00000185d8fefb00] error while decoding MB 31 32, bytestream -7
[h264 @ 00000185e15fb680] error while decoding MB 5 24, bytestream -5
[h264 @ 00000185d8ae7b00] error while decoding MB 29 26, bytestream -7
I have both IP and analog cameras; the errors from the IP cameras are far more frequent than from the analog ones.
Questions:
- What could be causing these decoding errors and glitching frames?
- Are there any specific settings or configurations I need to adjust in OpenCV or FFmpeg to handle H.264 streams more reliably?
- Could this be related to network latency, hardware limitations, or OpenCV’s handling of RTSP streams?
- Are there any alternative approaches or libraries I can use to improve the stability of RTSP stream processing?
The glitches appear mostly on people or moving objects, and I have no idea whether the errors and the glitches are related.
What I’ve Tried:
- Switched camera codecs from H.265 to H.264 (resolved the ash-colored frames issue).
- Tested the script with a single camera using a different object detection script, but the same errors occurred.
I want the frames saved without the glitches.
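Not a definitive fix, but the classic cause of `error while decoding MB ... bytestream` glitches is packet loss on RTSP-over-UDP: OpenCV's FFmpeg backend defaults to UDP transport, and the `protocol=TCP` suffix in the URLs above is most likely just part of the camera's path, not something FFmpeg interprets. One commonly suggested mitigation is to force TCP transport through the `OPENCV_FFMPEG_CAPTURE_OPTIONS` environment variable, which OpenCV's FFmpeg backend reads; it must be set before the first capture is opened:

```python
import os

# OpenCV's FFmpeg backend reads this variable when opening a capture;
# options are "key;value" pairs, multiple pairs separated by "|".
# Set it before any cv2.VideoCapture(...) call (ideally before importing cv2).
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "rtsp_transport;tcp"

# then, in the script above, open captures as before:
#   cap = cv2.VideoCapture(rtsp_url, cv2.CAP_FFMPEG)
```

TCP trades a little latency for reliable delivery, which usually eliminates the macroblock smearing on moving objects; if the errors persist over TCP, the bottleneck is more likely the camera's encoder or the CPU falling behind on seven simultaneous decodes.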
-
ffmpeg - force_original_aspect_ratio=increase - stretches video height but not width [closed]
10 February, by Rhys. I am trying to concat 3 videos, but the middle video only stretches in height.
The sources are test1.mp4, test2.mkv and test3.gif, which have been encoded into mp4s with ffmpeg:
test1.mp4
Stream #0:0[0x100]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, bt709, progressive), 720x1280 [SAR 1:1 DAR 9:16], 30 fps, 30 tbr, 90k tbn
test2.mp4
Stream #0:0[0x100]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, progressive), 538x662 [SAR 1:1 DAR 269:331], 30 fps, 30 tbr, 90k tbn
test3.mp4
Stream #0:0[0x100]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, progressive), 333x592 [SAR 1:1 DAR 9:16], 30 fps, 30 tbr, 90k tbn
When I attempt to concat them, test3.gif scales up to fit the screen, but test2.mkv only scales up its height to fit the screen; its width does not increase to match.
When using force_original_aspect_ratio=decrease, test2.mkv keeps the correct aspect ratio (it does not stretch), but it does not scale up to fill the screen (1280x720); it just sits in the middle of the screen like a thumbnail. Meanwhile, test3.gif correctly scales up to fit the screen.
When using force_original_aspect_ratio=increase, test2.mkv has the incorrect aspect ratio: only its height scales up to fit the screen, the width does not scale up, and the video appears squashed.
These are the commands I am using for both examples:
force_original_aspect_ratio=increase
ffmpeg -f concat -i mylist.txt -vf scale=force_original_aspect_ratio=increase,setsar=1 merged_video2.mp4
force_original_aspect_ratio=decrease
ffmpeg -f concat -i mylist.txt -vf "scale=720:1280:force_original_aspect_ratio=decrease:eval=frame,pad=720:1280:-1:-1:color=black" merged_video2.mp4
Question
How can I get test2.mkv's width to scale up along with its height, so the aspect ratio stays fixed?
test2.mkv has a DAR of 269:331 while the other two videos both have a DAR of 9:16; could this be causing the problem?
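The arithmetic suggests yes: with a 720x1280 target, `decrease` fits the whole 538x662 frame inside the target (hence the thumbnail-plus-padding look), while `increase` covers the target and is normally followed by a crop. Note also that the `increase` command above passes no target size to `scale` at all, so it has nothing to rescale to. A quick check of the two modes (`fitted_size` is a hypothetical helper mimicking the option, not ffmpeg's actual code):

```python
def fitted_size(src_w, src_h, dst_w, dst_h, mode):
    """Mimic scale=force_original_aspect_ratio: 'decrease' fits inside the
    target, 'increase' covers it; aspect ratio is preserved either way."""
    f = (min if mode == "decrease" else max)(dst_w / src_w, dst_h / src_h)
    return round(src_w * f), round(src_h * f)

# test2: 538x662 into a 720x1280 canvas
print(fitted_size(538, 662, 720, 1280, "decrease"))  # (720, 886) -> pad to 1280 tall
print(fitted_size(538, 662, 720, 1280, "increase"))  # (1040, 1280) -> crop to 720 wide
```

So to fill the frame without distortion, the usual pattern is `scale=720:1280:force_original_aspect_ratio=increase` followed by `crop=720:1280` (losing a bit of the sides) instead of `decrease` plus `pad`; there is no way to both fill a 9:16 canvas and keep a 269:331 picture intact.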
-
How can I build a custom version of opencv while enabling CUDA and GPL codecs ? [closed]
10 February, by Josh. I have a hard requirement of Python 3.7 for certain libraries (aeneas & afaligner). I've been using the regular opencv-python and ffmpeg libraries in my program and they've been working fine.
Recently I wanted to adjust my program to use h264 instead of mpeg4 and ran down a licensing rabbit hole: opencv-python ships a build of ffmpeg with GPL codecs turned off to avoid licensing issues. x264 is GPL-licensed, and is therefore disabled in the opencv-python library.
In order to solve this, I built a custom build of opencv against another custom build of ffmpeg, both with GPL components enabled. This allowed me to use the x264 encoder with the VideoWriter in my Python program.
Here's the dockerfile of how I've been running it:
FROM python:3.7-slim

# Set optimization flags and number of cores globally
ENV CFLAGS="-O3 -march=native -ffast-math -flto -fno-fat-lto-objects -ffunction-sections -fdata-sections" \
    CXXFLAGS="-O3 -march=native -ffast-math -flto -fno-fat-lto-objects -ffunction-sections -fdata-sections" \
    LDFLAGS="-flto -fno-fat-lto-objects -Wl,--gc-sections" \
    MAKEFLAGS="-j\$(nproc)"

# Combine all system dependencies in a single layer
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential cmake git wget unzip yasm pkg-config \
    libsm6 libxext6 libxrender-dev libglib2.0-0 \
    libavcodec-dev libavformat-dev libswscale-dev libavutil-dev libswresample-dev \
    nasm mercurial libnuma-dev espeak libespeak-dev \
    libtiff5-dev libjpeg62-turbo-dev libopenjp2-7-dev zlib1g-dev \
    libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
    libharfbuzz-dev libfribidi-dev libxcb1-dev python3-dev python3-setuptools \
    libsndfile1 libavdevice-dev libavfilter-dev libpostproc-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

# Build x264 with optimizations
RUN cd /tmp && \
    wget https://code.videolan.org/videolan/x264/-/archive/master/x264-master.tar.bz2 && \
    tar xjf x264-master.tar.bz2 && \
    cd x264-master && \
    ./configure --enable-shared --enable-pic --enable-asm --enable-lto --enable-strip \
        --enable-optimizations --bit-depth=8 --disable-avs --disable-swscale --disable-lavf \
        --disable-ffms --disable-gpac --disable-lsmash \
        --extra-cflags="-O3 -march=native -ffast-math -fomit-frame-pointer -flto -fno-fat-lto-objects" \
        --extra-ldflags="-O3 -flto -fno-fat-lto-objects" && \
    make && make install && \
    cd /tmp && \
    # Build FFmpeg with optimizations
    wget https://ffmpeg.org/releases/ffmpeg-7.1.tar.bz2 && \
    tar xjf ffmpeg-7.1.tar.bz2 && \
    cd ffmpeg-7.1 && \
    ./configure --enable-gpl --enable-libx264 --enable-shared --enable-nonfree --enable-pic \
        --enable-asm --enable-optimizations --enable-lto --enable-pthreads \
        --disable-debug --disable-static --disable-doc --disable-ffplay --disable-ffprobe \
        --disable-filters --disable-programs --disable-postproc \
        --extra-cflags="-O3 -march=native -ffast-math -fomit-frame-pointer -flto -fno-fat-lto-objects -ffunction-sections -fdata-sections" \
        --extra-ldflags="-O3 -flto -fno-fat-lto-objects -Wl,--gc-sections" \
        --prefix=/usr/local && \
    make && make install && ldconfig && \
    rm -rf /tmp/*

# Install Python dependencies first
RUN pip install --no-cache-dir --upgrade pip setuptools wheel && \
    pip install --no-cache-dir numpy py-spy

# Build OpenCV with optimized configuration
RUN cd /tmp && \
    # Download specific OpenCV version archives
    wget -O opencv.zip https://github.com/opencv/opencv/archive/4.8.0.zip && \
    wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.8.0.zip && \
    unzip opencv.zip && unzip opencv_contrib.zip && \
    mv opencv-4.8.0 opencv && mv opencv_contrib-4.8.0 opencv_contrib && \
    rm opencv.zip opencv_contrib.zip && \
    cd opencv && mkdir build && cd build && \
    cmake \
        -D CMAKE_BUILD_TYPE=RELEASE \
        -D CMAKE_C_FLAGS="-O3 -march=native -ffast-math -flto -fno-fat-lto-objects -ffunction-sections -fdata-sections" \
        -D CMAKE_CXX_FLAGS="-O3 -march=native -ffast-math -flto -fno-fat-lto-objects -ffunction-sections -fdata-sections -Wno-deprecated" \
        -D CMAKE_EXE_LINKER_FLAGS="-flto -fno-fat-lto-objects -Wl,--gc-sections" \
        -D CMAKE_SHARED_LINKER_FLAGS="-flto -fno-fat-lto-objects -Wl,--gc-sections" \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D ENABLE_FAST_MATH=ON \
        -D CPU_BASELINE_DETECT=ON \
        -D CPU_BASELINE=SSE3 \
        -D CPU_DISPATCH=SSE4_1,SSE4_2,AVX,AVX2,AVX512_SKX,FP16 \
        -D WITH_OPENMP=ON \
        -D OPENCV_ENABLE_NONFREE=ON \
        -D WITH_FFMPEG=ON \
        -D FFMPEG_ROOT=/usr/local \
        -D OPENCV_EXTRA_MODULES_PATH=/tmp/opencv_contrib/modules \
        -D PYTHON_EXECUTABLE=/usr/local/bin/python3.7 \
        -D PYTHON3_EXECUTABLE=/usr/local/bin/python3.7 \
        -D PYTHON3_INCLUDE_DIR=/usr/local/include/python3.7m \
        -D PYTHON3_LIBRARY=/usr/local/lib/libpython3.7m.so \
        -D PYTHON3_PACKAGES_PATH=/usr/local/lib/python3.7/site-packages \
        -D PYTHON3_NUMPY_INCLUDE_DIRS=/usr/local/lib/python3.7/site-packages/numpy/core/include \
        -D BUILD_opencv_python3=ON \
        -D INSTALL_PYTHON_EXAMPLES=OFF \
        -D BUILD_TESTS=OFF \
        -D BUILD_PERF_TESTS=OFF \
        -D BUILD_EXAMPLES=OFF \
        -D BUILD_DOCS=OFF \
        -D BUILD_opencv_apps=OFF \
        -D WITH_OPENCL=OFF \
        -D WITH_CUDA=OFF \
        -D WITH_IPP=OFF \
        -D WITH_TBB=OFF \
        -D WITH_V4L=OFF \
        -D WITH_QT=OFF \
        -D WITH_GTK=OFF \
        -D BUILD_LIST=core,imgproc,imgcodecs,videoio,python3 \
        .. && \
    make && make install && ldconfig && \
    rm -rf /tmp/*

# Set working directory and copy application code
WORKDIR /app
COPY requirements.txt .
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg
RUN pip install --no-cache-dir aeneas afaligner && \
    pip install --no-cache-dir -r requirements.txt
COPY . .

# Make entrypoint executable
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
My trouble now is that I've been considering running parts of my program on the GPU (it's creating graphics for a video, after all). I have no idea how to edit my Dockerfile to make the OpenCV build run with CUDA enabled; every combination I try leads to issues.
How can I tell which versions of CUDA, OpenCV and FFmpeg are compatible with Python 3.7? I've tried so many combinations and they all lead to different issues, and the various AI agents I've asked all flounder. Where can I find a reliable source of information about this?
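Not a full recipe, but one concrete starting point: the cmake invocation in the Dockerfile above explicitly passes `-D WITH_CUDA=OFF`, so no flag combination elsewhere can help until that changes. Assuming the build runs in a container that actually has the CUDA toolkit (e.g. an `nvidia/cuda` `-devel` base image with Python 3.7 installed on top, since `python:3.7-slim` ships no `nvcc`), the relevant CMake options are sketched below; `WITH_CUDA`, `CUDA_ARCH_BIN` and `BUILD_LIST` are real OpenCV CMake options, but the exact module list and architecture value depend on your GPU and on which `cv2.cuda` features you need:

```shell
# In the cmake call above, flip/add these options (then rebuild):
#   -D WITH_CUDA=ON
#   -D CUDA_ARCH_BIN=<your GPU's compute capability, e.g. 7.5>
# and extend BUILD_LIST with the CUDA modules you actually use
# (they live in opencv_contrib, whose path is already set), e.g.:
#   -D BUILD_LIST=core,imgproc,imgcodecs,videoio,cudev,cudaarithm,cudaimgproc,python3
```

After a successful build, `cv2.getBuildInformation()` should report `NVIDIA CUDA: YES`; checking that string is the quickest way to confirm a given CUDA/OpenCV/Python combination actually took effect.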