Newest 'ffmpeg' Questions - Stack Overflow
Articles published on the site
-
Microphone and Camera Issues After Installing FFmpeg [migrated]
6 April, by Fang
After installing FFmpeg with Winget on my Windows laptop, my microphone started capturing a lot of noise and no longer works, and my camera began flickering. The only fix that worked was a full system reset via USB boot; a normal reset or uninstalling the drivers does not help. The issue occurred on Windows 10 and Windows 11 after installing FFmpeg. Note: I installed FFmpeg on my other laptop and everything works fine there.
-
Get the maximum frequency of an audio spectrum
6 April, by milahu
I want to detect the cutoff frequency of the AAC audio encoder used to compress an M4A audio file.
This cutoff frequency (or maximum frequency) is an indicator of audio quality. High-quality audio has a cutoff around 20KHz (fullband), medium-quality audio has a cutoff around 14KHz (superwideband), low-quality audio has a cutoff around 7KHz (wideband), super-low-quality audio has a cutoff around 3KHz (narrowband). See also: voice frequency
Example spectrum of a 2-hour movie, generated with sox, with a maximum frequency around 19.6KHz.
The program should ignore noise below a certain loudness, for example -80dB.
Here is a Python script generated by deepseek.com but it returns 0.2KHz instead of 19.6KHz.
#!/usr/bin/env python3

# get the maximum frequency
# of an audio spectrum
# as an indicator
# of the actual audio quality

# generated by deepseek.com

# prompt
"""
create a python script
to detect the maximum frequency
in an m4a audio file.
that maximum frequency is produced
by the lowpass filter
of the aac audio encoder.
high-quality audio
has a maximum frequency
around 20 KHz (fullband),
low-quality audio
has a maximum frequency
around 3 KHz (narrowband).
use ffmpeg to decode the audio
to pcm
in chunks of 10 seconds.
for each chunk:
detect the local maximum,
print the local maximum
and the chunk time
with the format
f"t={t}sec f={f}KHz",
update the global maximum.
to detect the local maximum,
remove the noise floor
around -110dB,
then find the maximum frequency
in the spectrum.
accept some command line options:
--ss n:
pass as "-ss n" to ffmpeg.
--to n:
pass as "-to n" to ffmpeg.
both -ss and -to args
must come before the -i arg
for ffmpeg input seeking.
print all frequencies in KHz.
add a shebang line before the script,
spaced by an empty line.
do not recode the audio with ffmpeg.
use ffprobe to get the input samplerate,
usually 48KHz or 44.1KHz.
create a python class,
so we dont have to pass all parameters to functions.
add a command line option to select the audio track id, by default zero.
"""

#!/usr/bin/env python3

import argparse
import numpy as np
import subprocess
import sys
from tempfile import NamedTemporaryFile


class AudioAnalyzer:
    def __init__(self, input_file, audio_track=0, start_time=None, end_time=None):
        self.input_file = input_file
        self.audio_track = audio_track
        self.start_time = start_time
        self.end_time = end_time
        self.sample_rate = self._get_sample_rate()
        self.global_max_freq = 0
        self.global_max_time = 0

    def _get_sample_rate(self):
        cmd = [
            'ffprobe',
            '-v', 'error',
            '-select_streams', f'a:{self.audio_track}',
            '-show_entries', 'stream=sample_rate',
            '-of', 'default=noprint_wrappers=1:nokey=1',
            self.input_file
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        return float(result.stdout.strip())

    def _get_ffmpeg_command(self):
        cmd = [
            'ffmpeg',
            '-hide_banner',
            '-loglevel', 'error',
        ]

        if self.start_time is not None:
            cmd.extend(['-ss', str(self.start_time)])
        if self.end_time is not None:
            cmd.extend(['-to', str(self.end_time)])

        cmd.extend([
            '-i', self.input_file,
            '-map', f'0:a:{self.audio_track}',
            '-ac', '1',  # convert to mono
            '-f', 'f32le',  # 32-bit float PCM
            '-'
        ])

        return cmd

    def analyze(self, chunk_size=10):
        ffmpeg_cmd = self._get_ffmpeg_command()

        with subprocess.Popen(ffmpeg_cmd, stdout=subprocess.PIPE) as process:
            chunk_samples = int(chunk_size * self.sample_rate)
            bytes_per_sample = 4  # 32-bit float
            chunk_bytes = chunk_samples * bytes_per_sample

            current_time = self.start_time if self.start_time is not None else 0

            while True:
                raw_data = process.stdout.read(chunk_bytes)
                if not raw_data:
                    break

                samples = np.frombuffer(raw_data, dtype=np.float32)
                if len(samples) == 0:
                    continue

                local_max_freq = self._analyze_chunk(samples)

                print(f"t={current_time:.1f}sec f={local_max_freq:.1f}KHz")

                if local_max_freq > self.global_max_freq:
                    self.global_max_freq = local_max_freq
                    self.global_max_time = current_time

                current_time += chunk_size

    def _analyze_chunk(self, samples):
        # Apply Hanning window
        window = np.hanning(len(samples))
        windowed_samples = samples * window

        # Compute FFT
        fft = np.fft.rfft(windowed_samples)
        magnitudes = np.abs(fft)

        # Convert to dB
        eps = 1e-10  # avoid log(0)
        magnitudes_db = 20 * np.log10(magnitudes + eps)

        # Frequency bins
        freqs = np.fft.rfftfreq(len(samples), 1.0 / self.sample_rate) / 1000  # in KHz

        # Remove noise floor (-110dB)
        threshold = -110
        valid_indices = magnitudes_db > threshold
        valid_freqs = freqs[valid_indices]
        valid_magnitudes = magnitudes_db[valid_indices]

        if len(valid_freqs) == 0:
            return 0

        # Find frequency with maximum magnitude
        max_idx = np.argmax(valid_magnitudes)
        max_freq = valid_freqs[max_idx]

        return max_freq


def main():
    parser = argparse.ArgumentParser(description='Detect maximum frequency in audio file')
    parser.add_argument('input_file', help='Input audio file (m4a)')
    parser.add_argument('--ss', type=float, help='Start time in seconds')
    parser.add_argument('--to', type=float, help='End time in seconds')
    parser.add_argument('--track', type=int, default=0, help='Audio track ID (default: 0)')
    args = parser.parse_args()

    analyzer = AudioAnalyzer(
        input_file=args.input_file,
        audio_track=args.track,
        start_time=args.ss,
        end_time=args.to
    )

    print(f"Analyzing audio file: {args.input_file}")
    print(f"Sample rate: {analyzer.sample_rate/1000:.1f} KHz")
    print(f"Audio track: {args.track}")
    if args.ss is not None:
        print(f"Start time: {args.ss} sec")
    if args.to is not None:
        print(f"End time: {args.to} sec")
    print("---")

    analyzer.analyze()

    print("---")
    print(f"Global maximum: t={analyzer.global_max_time:.1f}sec f={analyzer.global_max_freq:.1f}KHz")

    if analyzer.global_max_freq > 15:
        print("Quality: Fullband (high quality)")
    elif analyzer.global_max_freq > 5:
        print("Quality: Wideband (medium quality)")
    else:
        print("Quality: Narrowband (low quality)")


if __name__ == '__main__':
    main()
Similar question: How to find the max frequency at a certain db in a fft signal
edited by kesh
Here is an example PSD (power spectral density) indicating fullband quality, with a drop-off around 20 kHz.
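One possible reading of the 0.2KHz result: the posted `_analyze_chunk` uses `np.argmax`, which returns the loudest bin, not the highest-frequency bin above the noise threshold. The sketch below takes the last bin above the threshold instead. The function name `cutoff_frequency_khz` is hypothetical, and it assumes the threshold is measured relative to the strongest bin of the chunk, which is not how the posted script defines its -110dB floor.

```python
import numpy as np


def cutoff_frequency_khz(samples, sample_rate, threshold_db=-80.0):
    """Return the highest frequency (in KHz) whose level is above threshold_db.

    Assumption: threshold_db is relative to the strongest bin in this chunk,
    so 0 dB is the spectral peak. Returns 0.0 for empty or silent input.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.size == 0:
        return 0.0

    # Window the chunk to limit spectral leakage, then take the magnitude spectrum.
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(samples.size)))
    peak = spectrum.max()
    if peak <= 0:
        return 0.0

    # Level of every bin in dB relative to the peak (tiny offset avoids log(0)).
    level_db = 20 * np.log10(spectrum / peak + 1e-12)
    freqs_khz = np.fft.rfftfreq(samples.size, 1.0 / sample_rate) / 1000.0

    # Take the LAST bin above the threshold, not the loudest one.
    above = np.nonzero(level_db > threshold_db)[0]
    if above.size == 0:
        return 0.0
    return float(freqs_khz[above[-1]])
```

Called with one decoded chunk and the sample rate reported by ffprobe, this could stand in for the body of `_analyze_chunk`. A real encoder's lowpass is not a brick wall, so the threshold likely needs tuning per file.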
-
Command-line streaming webcam with audio from Ubuntu server in WebM format
6 April, by mjtb
I am trying to stream video and audio from my webcam connected to my headless Ubuntu server (running Maverick 10.10). I want to be able to stream in WebM format (VP8 video + OGG). Bandwidth is limited, so the stream must stay below 1Mbps.
I have tried using FFmpeg. I am able to record WebM video from the webcam with the following:
ffmpeg -s 640x360 \
    -f video4linux2 -i /dev/video0 -isync -vcodec libvpx -vb 768000 -r 10 -vsync 1 \
    -f alsa -ac 1 -i hw:1,0 -acodec libvorbis -ab 32000 -ar 11025 \
    -f webm /var/www/telemed/test.webm
However, despite experimenting with all manner of vsync and async options, I get either out-of-sync audio or Benny Hill style fast-forward video with matching fast audio. I have also been unable to get this actually working with ffserver (by replacing the test.webm path and filename with the relevant feed filename).
The objective is to get a live, audio + video feed which is viewable in a modern browser, in a tight bandwidth, using only open-source components. (None of that MP3 format legal chaff)
My questions are therefore: How would you go about streaming WebM from a webcam via Linux with in-sync audio? What software do you use?
Have you succeeded in encoding webm from a webcam with in-sync audio via FFmpeg? If so, what command did you issue?
Is it worth persevering with FFmpeg + FFserver, or are there other more suitable command-line tools around (e.g. VLC which doesn't seem too well built for encoding)?
Is something like GStreamer + Flumotion configurable from the command line? If so, where do I find command-line documentation? The Flumotion docs are rather light on command-line details.
Thanks in advance!
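As a quick sanity check on the 1Mbps limit, the target bitrates from the command above (`-vb 768000`, `-ab 32000`) can be added up. This is only a sketch: the helper name `total_bitrate_bps` is hypothetical and the 5% container overhead is an assumed figure, not a measured one. The encoders treat these values as targets, so a live stream may briefly exceed them.

```python
def total_bitrate_bps(video_bps, audio_bps, overhead_fraction=0.05):
    """Estimate stream bitrate: video plus audio plus an ASSUMED container overhead."""
    return int(round((video_bps + audio_bps) * (1 + overhead_fraction)))


budget_bps = 1_000_000  # the 1Mbps limit from the question
estimate = total_bitrate_bps(768_000, 32_000)
print(f"estimated {estimate} bps, within budget: {estimate < budget_bps}")
```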
-
Generate thumbnail for text file
6 April, by Sophivorus
Suppose a user uploads a .txt or .php file, and I want to generate a .png thumbnail for it. Is there a simple way of doing it that doesn't require me to open the file and write its contents into a new .png? I have ImageMagick and FFmpeg available, so there must be a way to take advantage of that, but I've been looking a lot and have had no luck yet.
Thanks in advance.
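One approach that might fit is ImageMagick's `caption:` coder, which renders text into an image and can read the text from a file via the `@` prefix. The sketch below only builds and runs such a command; the helper names, the 320x240 size, and the colors are illustrative assumptions. Some installations block `@file` reads through the ImageMagick security policy, and ImageMagick 7 installs the tool as `magick` rather than `convert`.

```python
import subprocess


def thumbnail_command(text_path, png_path, size="320x240", pointsize=12):
    """Build an ImageMagick command that renders a text file into a PNG.

    Assumes the legacy `convert` executable is on PATH.
    """
    return [
        "convert",
        "-size", size,               # canvas size; caption: wraps text to the width
        "-background", "white",
        "-fill", "black",
        "-pointsize", str(pointsize),
        f"caption:@{text_path}",     # '@' tells ImageMagick to read the text from the file
        png_path,
    ]


def make_thumbnail(text_path, png_path):
    """Run the command; raises CalledProcessError if ImageMagick reports failure."""
    subprocess.run(thumbnail_command(text_path, png_path), check=True)
```

For example, `make_thumbnail("upload.txt", "upload.png")` would hand the file to ImageMagick without the calling code reading its contents.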
-
ffmpeg: error while loading shared libraries: libopenh264.so.5 [closed]
6 April, by ESZ
I am using ffmpeg and getting this error:
ffmpeg: error while loading shared libraries: libopenh264.so.5: cannot open shared object file: No such file or directory
I have already checked if the library exists and it does. I added it to /etc/ld.so.conf as mentioned in this previous question but it doesn't work.
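A directory added to /etc/ld.so.conf only takes effect once `ldconfig` has been run to rebuild the linker cache, so that may be worth checking first. To see whether the dynamic linker can now find the library, a small probe along these lines could help. The helper name `can_load` is hypothetical; `ctypes.CDLL` goes through the same dynamic loader as a normal program start, so the result should roughly mirror what ffmpeg sees.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic linker can locate and load the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing from the search path or cannot be loaded.
        return False
    return True


print("libopenh264.so.5 loadable:", can_load("libopenh264.so.5"))
```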