Newest 'ffmpeg' Questions - Stack Overflow
Articles published on the site
-
Are there any libraries for adding effects to audio, like phone, inner monologue, or man/woman voice sounds? [closed]
March 6, by Mathew
I'm trying to apply different audio effects, such as making audio sound like a phone call. Below is my current approach. As you can see, I'm using multiple filters and simple algorithms to achieve this effect, but the output quality isn't ideal.
Since I need to implement many sound effects/filters, are there any ready-to-use libraries that could help?
I've looked into FFmpeg filters and noticed mentions of LADSPA/LV2 plugins. Are these viable solutions? Any other suggestions would be greatly appreciated.
public static void applySceneEffect(String inputPath, String outputPath, int sceneType) {
    LOGGER.info("apply scene effect {} to {}", sceneType, inputPath);
    try (FFmpegFrameGrabber grabber = new FFmpegFrameGrabber(inputPath);
         FFmpegFrameRecorder recorder = new FFmpegFrameRecorder(outputPath, grabber.getAudioChannels())) {
        grabber.setOption("vn", "");
        grabber.start();

        recorder.setAudioCodec(avcodec.AV_CODEC_ID_PCM_S16LE);
        recorder.setSampleRate(grabber.getSampleRate());
        recorder.setAudioChannels(grabber.getAudioChannels());
        recorder.setAudioBitrate(grabber.getAudioBitrate());
        recorder.setFormat("wav");

        String audioFilter = String.join(",",
                "aresample=8000",
                "highpass=f=300, lowpass=f=3400",
                "acompressor=threshold=-15dB:ratio=4:attack=10:release=100",
                "volume=1.5",
                "aecho=0.9:0.4:10:0.6"
        );
        FFmpegFrameFilter f1 = new FFmpegFrameFilter(audioFilter, grabber.getAudioChannels());
        f1.setSampleRate(grabber.getSampleRate());
        f1.start();

        recorder.start();

        Random random = new Random();
        double noiseLevel = 0.02;

        while (true) {
            var frame = grabber.grabFrame(true, false, true, true);
            if (frame == null) {
                break;
            }
            ShortBuffer audioBuffer = (ShortBuffer) frame.samples[0];
            short[] audioData = new short[audioBuffer.remaining()];
            audioBuffer.get(audioData);

            applyElectricNoise(audioData, grabber.getSampleRate());
            audioData = applyDistortion(audioData, 1.5, 30000);

            audioBuffer.rewind();
            audioBuffer.put(audioData);
            audioBuffer.flip();

            f1.push(frame);
            Frame filteredFrame;
            while ((filteredFrame = f1.pull()) != null) {
                recorder.record(filteredFrame);
            }
        }

        recorder.stop();
        recorder.release();
        grabber.stop();
        grabber.release();
    } catch (FrameGrabber.Exception | FrameRecorder.Exception | FFmpegFrameFilter.Exception e) {
        throw new RuntimeException(e);
    }
}

private static final double NOISE_LEVEL = 0.005;
private static final int NOISE_FREQUENCY = 60;

public static void applyElectricNoise(short[] audioData, int sampleRate) {
    Random random = new Random();
    for (int i = 0; i < audioData.length; i++) {
        double noise = Math.sin(2 * Math.PI * NOISE_FREQUENCY * i / sampleRate);
        double electricNoise = random.nextGaussian() * NOISE_LEVEL * Short.MAX_VALUE + noise;
        audioData[i] = (short) Math.max(Math.min(audioData[i] + electricNoise, Short.MAX_VALUE), Short.MIN_VALUE);
    }
}

public static short[] applyTremolo(short[] audioData, int sampleRate, double frequency, double depth) {
    double phase = 0.0;
    double phaseIncrement = 2 * Math.PI * frequency / sampleRate;
    for (int i = 0; i < audioData.length; i++) {
        double modulator = 1.0 - depth + depth * Math.sin(phase);
        audioData[i] = (short) (audioData[i] * modulator);
        phase += phaseIncrement;
        if (phase > 2 * Math.PI) {
            phase -= 2 * Math.PI;
        }
    }
    return audioData;
}

public static short[] applyDistortion(short[] audioData, double gain, double threshold) {
    for (int i = 0; i < audioData.length; i++) {
        double sample = audioData[i] * gain;
        if (sample > threshold) {
            sample = threshold;
        } else if (sample < -threshold) {
            sample = -threshold;
        }
        audioData[i] = (short) sample;
    }
    return audioData;
}
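For reference, the band-limiting that the `highpass=f=300,lowpass=f=3400` filter pair performs can be reproduced outside FFmpeg with the standard RBJ (Audio EQ Cookbook) biquad formulas. This is not the asker's code, just a stdlib-only Python sketch of that one effect; the function names are made up for illustration, and real code would use a DSP library rather than a sample-by-sample Python loop:

```python
import math

def biquad_coeffs(kind, freq, sample_rate, q=0.707):
    """RBJ cookbook coefficients for a 2nd-order high-pass or low-pass filter."""
    w0 = 2 * math.pi * freq / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cw = math.cos(w0)
    if kind == "highpass":
        b0, b1, b2 = (1 + cw) / 2, -(1 + cw), (1 + cw) / 2
    elif kind == "lowpass":
        b0, b1, b2 = (1 - cw) / 2, 1 - cw, (1 - cw) / 2
    else:
        raise ValueError(kind)
    a0, a1, a2 = 1 + alpha, -2 * cw, 1 - alpha
    # Normalize by a0 so filtering is y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)

def biquad_filter(samples, coeffs):
    """Direct-form-I filtering of a list of float samples in -1.0 .. 1.0."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def telephone_band(samples, sample_rate):
    """Approximate a phone line by keeping roughly 300-3400 Hz."""
    hp = biquad_filter(samples, biquad_coeffs("highpass", 300, sample_rate))
    return biquad_filter(hp, biquad_coeffs("lowpass", 3400, sample_rate))
```

The same cookbook formulas are what many LADSPA/LV2 plugins and FFmpeg's own biquad-based filters implement, which is why chaining ready-made filters (as the question's code already does) is usually the better-sounding and less error-prone route than hand-rolled noise and clipping loops.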
-
Generating grey noise with FFmpeg
March 5, by Azat Khabibulin
I have the following sound configuration:
sub-bass:    -inf dBFS
low bass:    -inf dBFS
bass:        -inf dBFS
high bass:   -inf dBFS
low mids:       0 dBFS
mids:           0 dBFS
high mids:   -inf dBFS
low treble:  -inf dBFS
treble:      -inf dBFS
high treble: -inf dBFS
If you're wondering what that is, you can listen to this sound here.
I'd like to create an audio file from this sound configuration. FFmpeg filters seem like a good fit, but they are not a strict requirement. It may be any command-line tool that handles this kind of task well.
The problem is that I don't really have the necessary background in audio theory. I cannot choose the right FFmpeg filter (other than making generic white noise), I do not know how to filter frequencies in FFmpeg, and I cannot even convert this particular lexicon ("bass", "mids", etc.) into specific numeric frequencies.
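One way to approach this: the band names have no standard definition, but common mixing-chart conventions assign them approximate frequency ranges, and FFmpeg can then generate white noise (`anoisesrc`) and band-limit it with `highpass`/`lowpass`. The sketch below builds such a command; the band boundaries are my assumption, not an established standard, and the helper name is made up:

```python
# Rough band-to-frequency mapping (an assumption -- these names have no
# standard definition; the boundaries follow common mixing-chart conventions).
BANDS = {
    "sub-bass":    (20, 60),
    "low bass":    (60, 120),
    "bass":        (120, 250),
    "high bass":   (250, 500),
    "low mids":    (500, 1000),
    "mids":        (1000, 2000),
    "high mids":   (2000, 4000),
    "low treble":  (4000, 8000),
    "treble":      (8000, 12000),
    "high treble": (12000, 20000),
}

def noise_command(config, duration=10, out="grey_noise.wav"):
    """Build an ffmpeg command generating white noise limited to the
    bands that are not muted (-inf dBFS) in the configuration."""
    active = [BANDS[name] for name, level in config.items() if level != "-inf"]
    lo = min(f for f, _ in active)   # lowest edge of any active band
    hi = max(f for _, f in active)   # highest edge of any active band
    return [
        "ffmpeg",
        "-f", "lavfi",
        "-i", f"anoisesrc=color=white:duration={duration}",
        "-af", f"highpass=f={lo},lowpass=f={hi}",
        out,
    ]
```

With only "low mids" and "mids" active, this keeps roughly 500-2000 Hz of white noise. For per-band gains rather than a single contiguous window, FFmpeg's `firequalizer` or `anequalizer` filters would be the next thing to look at.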
-
Can't show image with opencv when importing av
March 5, by Flojomojo
When importing the PyAV module, I am unable to show an image with OpenCV using imshow()
Code without the PyAv module (works as expected)
import cv2

img = cv2.imread("test_image.jpeg")
cv2.imshow('image', img)
cv2.waitKey(0)
Code with the import (doesn't work, just hangs)
import cv2
import av

img = cv2.imread("test_image.jpeg")
cv2.imshow('image', img)
cv2.waitKey(0)
OS: Linux arch 5.18.3-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 09 Jun 2022 16:14:10 +0000 x86_64 GNU/Linux
Am I doing something wrong or is this a (un-)known issue?
-
How does FFmpeg determine the dispositions of an MP4 track?
March 5, by obskyr
The Issue
FFmpeg has a concept of “dispositions” – a property that describes the purpose of a stream in a media file. For example, here are the streams in a file I have lying around, with the dispositions emphasized:
Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 251 kb/s (default)
  Metadata:
    creation_time   : 2021-11-10T20:14:06.000000Z
    handler_name    : Core Media Audio
    vendor_id       : [0][0][0][0]
Stream #0:1[0x2](und): Video: mjpeg (Baseline) (jpeg / 0x6765706A), yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024, 0 kb/s, 0.0006 fps, 3.08 tbr, 600 tbn (default) (attached pic) (timed thumbnails)
  Metadata:
    creation_time   : 2021-11-10T20:14:06.000000Z
    handler_name    : Core Media Video
    vendor_id       : [0][0][0][0]
Stream #0:2[0x3](und): Data: bin_data (text / 0x74786574)
  Metadata:
    creation_time   : 2021-11-10T20:14:06.000000Z
    handler_name    : Core Media Text
Stream #0:3[0x0]: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024 [SAR 144:144 DAR 1:1], 90k tbr, 90k tbn (attached pic)
However, if I make any modification to this file’s chapter markers using the C++ library MP4v2 (even just re-saving the existing ones:
auto f = MP4Modify("test.m4a");
MP4Chapter_t* chapterList;
uint32_t chapterCount;
MP4GetChapters(f, &chapterList, &chapterCount);
MP4SetChapters(f, chapterList, chapterCount);
MP4Close(f);
), some of these dispositions are removed:

Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 251 kb/s (default)
  Metadata:
    creation_time   : 2021-11-10T20:14:06.000000Z
    handler_name    : Core Media Audio
    vendor_id       : [0][0][0][0]
Stream #0:1[0x2](und): Video: mjpeg (Baseline) (jpeg / 0x6765706A), yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024, 0 kb/s, 0.0006 fps, 3.08 tbr, 600 tbn (default) ← "attached pic" and "timed thumbnails" removed!
  Metadata:
    creation_time   : 2021-11-10T20:14:06.000000Z
    handler_name    : Core Media Video
    vendor_id       : [0][0][0][0]
Stream #0:2[0x0]: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024 [SAR 144:144 DAR 1:1], 90k tbr, 90k tbn (attached pic)
Stream #0:3[0x4](und): Data: bin_data (text / 0x74786574) ← This stream was moved to the end, but that's intended behavior. It contains chapter titles, and we just edited the chapters.
  Metadata:
    creation_time   : 2025-03-05T09:56:31.000000Z
It also renders the file unplayable in MPC-HC (but not in VLC!), which is apparently a bug in MP4v2. I’m currently investigating that bug to report and potentially fix it, but that’s a separate issue – in my journey there, I’m wracking my brain trying to understand what it is that MP4v2 changes to make FFmpeg stop reporting the “attached pic” and “timed thumbnails” dispositions. I’ve explored the before-and-afters in MP4 Box, and I can’t for the life of me find which atom it is that differs in a relevant way.
(I’d love to share the files, but unfortunately the contents are under copyright – if anyone knows of a way to remove the audio from an MP4 file without changing anything else, let me know and I’ll upload dummied-out versions. Without them, I can’t really ask about the issue directly. I can at least show you the files’ respective atom trees, but I’m not sure how relevant that is.)
The Question
I thought I’d read FFmpeg’s source code to find out how it determines dispositions for MP4 streams, but of course, FFmpeg is very complex. Could someone who’s more familiar with C and/or FFmpeg’s codebase help me sleuth out how FFmpeg determines dispositions for MP4 files (in particular, “attached pic” and “timed thumbnails”)?
Some Thoughts…
- I figure searching for “attached_pic” might be a good start?
- Could the MP4 muxer, movenc.c, be helpful?
- I'd imagine what we'd really like to look at is the MP4 demuxing process, as it's during demuxing that FFmpeg determines dispositions from the data in the file. After poring over the code for hours, however, I've been utterly unable to find where that happens.
-
Extracting frames from videos using ffmpeg, unpredictable behaviour [closed]
March 4, by Alex
I am using this ffmpeg command to generate a bunch of image captures from a video:
ffmpegCommand([
    "-i", inputPath,
    "-vf", `select='not(mod(n,${frame_interval}))',setpts='N/(${fps}*TB)'`,
    "-s", `320x200`,
    "-f", "image2",
    outputPath,
]);
It is the most frame-accurate method according to what I have researched on Google and SO.
This works well when the video is around 30 fps, with a frame_interval of around 250.
But when the video is 5 fps, the frame interval should obviously be lower, at around 70, because there are fewer frames in the video. But then I get a huge number of images.
// example for 30fps video
ffmpeg -i test30.mp4 -vf select='not(mod(n,250))',setpts='N/(29.97*TB) .....
// example for 5fps video
ffmpeg -i test5.mp4 -vf select='not(mod(n,70))',setpts='N/(4.907*TB) ......
What could be wrong here?
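One likely culprit worth checking: by default ffmpeg duplicates frames to keep a constant output rate, so frames dropped by select can come back as duplicated images, and at 5 fps the setpts rescaling makes that much worse. Passing -vsync vfr (or -fps_mode vfr on ffmpeg ≥ 5.1) tells the image2 muxer to write only the frames that survive select. A sketch of the argument list, in Python for illustration (the builder function is hypothetical, and dropping setpts in favor of -vsync vfr is my suggestion, not the original command):

```python
def build_capture_args(input_path, output_pattern, frame_interval):
    """ffmpeg arguments that keep every frame_interval-th frame and write
    only the selected frames as images, without padding duplicates."""
    return [
        "-i", input_path,
        "-vf", f"select='not(mod(n,{frame_interval}))'",
        "-vsync", "vfr",   # don't duplicate frames to fill a constant output rate
        "-s", "320x200",
        "-f", "image2",
        output_pattern,
    ]
```

With that in place, the number of output images should match the number of selected frames regardless of whether the source is 30 fps or 5 fps.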