
Other articles (60)
-
Media quality after processing
21 June 2013 — Properly configuring the software that processes media is important for balancing the interests involved (the host's bandwidth, media quality for the editor and the visitor, accessibility for the visitor). How should you set the quality of your media?
The higher the media quality, the more bandwidth is used, and a visitor on a low-bandwidth connection will have to wait longer. Conversely, the lower the quality, the more degraded the media becomes (...) -
Requesting the creation of a channel
12 March 2010 — Depending on how the platform is configured, a user may have two different ways to request the creation of a channel: the first at sign-up time, the second after signing up, by filling in a request form.
Both methods ask for the same information and work in much the same way: the prospective user fills in a series of form fields that first of all give the administrators information about (...) -
Changing the graphic theme
22 February 2011 — The graphic theme does not alter the actual layout of elements on the page; it only changes their appearance.
The placement can indeed be modified, but that modification is purely visual and does not affect the page's semantic structure.
Changing the active graphic theme
To change the graphic theme in use, the zen-garden plugin must be enabled on the site.
Then simply go to the configuration area of the (...)
On other sites (6955)
-
Get the maximum frequency of an audio spectrum
6 April, by milahu — I want to detect the cutoff frequency of the AAC audio encoder used to compress an M4A audio file.


This cutoff frequency (or maximum frequency) is an indicator of audio quality.
High-quality audio has a cutoff around 20 kHz (fullband),
medium-quality audio has a cutoff around 14 kHz (superwideband),
low-quality audio has a cutoff around 7 kHz (wideband),
super-low-quality audio has a cutoff around 3 kHz (narrowband).
See also: voice frequency


Example spectrum of a 2-hour movie, generated with sox, with a maximum frequency around 19.6 kHz:



The program should ignore noise below a certain loudness, for example -80dB.


Here is a Python script generated by deepseek.com, but it returns 0.2 kHz instead of 19.6 kHz.


#!/usr/bin/env python3

# get the maximum frequency
# of an audio spectrum
# as an indicator
# of the actual audio quality

# generated by deepseek.com

# prompt
"""
create a python script
to detect the maximum frequency 
in an m4a audio file.
that maximum frequency is produced
by the lowpass filter
of the aac audio encoder.
high-quality audio
has a maximum frequency
around 20 KHz (fullband),
low-quality audio
has a maximum frequency
around 3 KHz (narrowband).
use ffmpeg to decode the audio
to pcm
in chunks of 10 seconds.
for each chunk:
detect the local maximum,
print the local maximum
and the chunk time
with the format
f"t={t}sec f={f}KHz",
update the global maximum.
to detect the local maximum,
remove the noise floor
around -110dB,
then find the maximum frequency
in the spectrum.
accept some command line options:
--ss n:
pass as "-ss n" to ffmpeg.
--to n:
pass as "-to n" to ffmpeg.
both -ss and -to args
must come before the -i arg
for ffmpeg input seeking.
print all frequencies in KHz.
add a shebang line before the script,
spaced by an empty line.
do not recode the audio with ffmpeg.
use ffprobe to get the input samplerate,
usually 48KHz or 44.1KHz.
create a python class,
so we dont have to pass all parameters to functions.
add a command line option to select the audio track id, by default zero.
"""

#!/usr/bin/env python3

import argparse
import numpy as np
import subprocess
import sys
from tempfile import NamedTemporaryFile


class AudioAnalyzer:
    def __init__(self, input_file, audio_track=0, start_time=None, end_time=None):
        self.input_file = input_file
        self.audio_track = audio_track
        self.start_time = start_time
        self.end_time = end_time
        self.sample_rate = self._get_sample_rate()
        self.global_max_freq = 0
        self.global_max_time = 0

    def _get_sample_rate(self):
        cmd = [
            'ffprobe',
            '-v', 'error',
            '-select_streams', f'a:{self.audio_track}',
            '-show_entries', 'stream=sample_rate',
            '-of', 'default=noprint_wrappers=1:nokey=1',
            self.input_file
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        return float(result.stdout.strip())

    def _get_ffmpeg_command(self):
        cmd = [
            'ffmpeg',
            '-hide_banner',
            '-loglevel', 'error',
        ]

        # input seeking: -ss/-to must come before -i
        if self.start_time is not None:
            cmd.extend(['-ss', str(self.start_time)])
        if self.end_time is not None:
            cmd.extend(['-to', str(self.end_time)])

        cmd.extend([
            '-i', self.input_file,
            '-map', f'0:a:{self.audio_track}',
            '-ac', '1',     # convert to mono
            '-f', 'f32le',  # 32-bit float PCM
            '-'
        ])

        return cmd

    def analyze(self, chunk_size=10):
        ffmpeg_cmd = self._get_ffmpeg_command()

        with subprocess.Popen(ffmpeg_cmd, stdout=subprocess.PIPE) as process:
            chunk_samples = int(chunk_size * self.sample_rate)
            bytes_per_sample = 4  # 32-bit float
            chunk_bytes = chunk_samples * bytes_per_sample

            current_time = self.start_time if self.start_time is not None else 0

            while True:
                raw_data = process.stdout.read(chunk_bytes)
                if not raw_data:
                    break

                samples = np.frombuffer(raw_data, dtype=np.float32)
                if len(samples) == 0:
                    continue

                local_max_freq = self._analyze_chunk(samples)

                print(f"t={current_time:.1f}sec f={local_max_freq:.1f}KHz")

                if local_max_freq > self.global_max_freq:
                    self.global_max_freq = local_max_freq
                    self.global_max_time = current_time

                current_time += chunk_size

    def _analyze_chunk(self, samples):
        # Apply Hanning window
        window = np.hanning(len(samples))
        windowed_samples = samples * window

        # Compute FFT
        fft = np.fft.rfft(windowed_samples)
        magnitudes = np.abs(fft)

        # Convert to dB
        eps = 1e-10  # avoid log(0)
        magnitudes_db = 20 * np.log10(magnitudes + eps)

        # Frequency bins
        freqs = np.fft.rfftfreq(len(samples), 1.0 / self.sample_rate) / 1000  # in KHz

        # Remove noise floor (-110dB)
        threshold = -110
        valid_indices = magnitudes_db > threshold
        valid_freqs = freqs[valid_indices]
        valid_magnitudes = magnitudes_db[valid_indices]

        if len(valid_freqs) == 0:
            return 0

        # Find frequency with maximum magnitude
        max_idx = np.argmax(valid_magnitudes)
        max_freq = valid_freqs[max_idx]

        return max_freq


def main():
    parser = argparse.ArgumentParser(description='Detect maximum frequency in audio file')
    parser.add_argument('input_file', help='Input audio file (m4a)')
    parser.add_argument('--ss', type=float, help='Start time in seconds')
    parser.add_argument('--to', type=float, help='End time in seconds')
    parser.add_argument('--track', type=int, default=0, help='Audio track ID (default: 0)')
    args = parser.parse_args()

    analyzer = AudioAnalyzer(
        input_file=args.input_file,
        audio_track=args.track,
        start_time=args.ss,
        end_time=args.to
    )

    print(f"Analyzing audio file: {args.input_file}")
    print(f"Sample rate: {analyzer.sample_rate/1000:.1f} KHz")
    print(f"Audio track: {args.track}")
    if args.ss is not None:
        print(f"Start time: {args.ss} sec")
    if args.to is not None:
        print(f"End time: {args.to} sec")
    print("---")

    analyzer.analyze()

    print("---")
    print(f"Global maximum: t={analyzer.global_max_time:.1f}sec f={analyzer.global_max_freq:.1f}KHz")

    if analyzer.global_max_freq > 15:
        print("Quality: Fullband (high quality)")
    elif analyzer.global_max_freq > 5:
        print("Quality: Wideband (medium quality)")
    else:
        print("Quality: Narrowband (low quality)")


if __name__ == '__main__':
    main()



Similar question:
How to find the max frequency at a certain db in a fft signal





Here is an example PSD indicating fullband quality, with a PSD drop-off around 20 kHz.
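For reference, a minimal sketch of how the cutoff could be measured: the script above returns the frequency of the *strongest* bin (`np.argmax` over magnitudes), which is likely why it reports ~0.2 kHz (the dominant energy of music/speech sits near 200 Hz). Scanning from the top of the spectrum down for the highest bin still above the noise floor gives the drop-off edge instead. The function name and the peak-relative threshold are my own assumptions, not part of the original script.

```python
import numpy as np

def detect_cutoff_khz(samples, sample_rate, threshold_db=-80.0):
    """Highest frequency (kHz) whose level is within `threshold_db`
    of the strongest bin, i.e. the top edge of the occupied spectrum."""
    window = np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(samples * window))
    spectrum_db = 20 * np.log10(spectrum + 1e-12)
    spectrum_db -= spectrum_db.max()  # normalize: 0 dB = strongest bin
    freqs = np.fft.rfftfreq(len(samples), 1.0 / sample_rate)
    above = np.nonzero(spectrum_db > threshold_db)[0]
    if above.size == 0:
        return 0.0
    return freqs[above[-1]] / 1000.0  # last bin above the floor
```

Dropping this in place of `_analyze_chunk`'s argmax step should make the per-chunk values track the encoder's lowpass edge rather than the loudest note.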




-
HLS script has been lost to time; previous content was made in a specific format, attempting to recreate it using FFmpeg primitives
28 February, by Wungo — Looking to add this video to a stitched playlist. The variants, encoding, and everything must match exactly. We have no access to how things were done previously, so I am literally vibing through this as best I can.


I recommend using a 30-second clip of Big Buck Bunny, or the original Big Buck Bunny 1080p video.


#!/bin/bash
ffmpeg -i bbb_30s.mp4 -filter_complex "
[0:v]split=7[v1][v2][v3][v4][v5][v6][v7];
[v1]scale=416:234[v1out];
[v2]scale=416:234[v2out];
[v3]scale=640:360[v3out];
[v4]scale=768:432[v4out];
[v5]scale=960:540[v5out];
[v6]scale=1280:720[v6out];
[v7]scale=1920:1080[v7out]
" \
-map "[v1out]" -c:v:0 libx264 -b:v:0 200k -maxrate 361k -bufsize 400k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v baseline -level 3.0 \
-map "[v2out]" -c:v:1 libx264 -b:v:1 500k -maxrate 677k -bufsize 700k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v baseline -level 3.0 \
-map "[v3out]" -c:v:2 libx264 -b:v:2 1000k -maxrate 1203k -bufsize 1300k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 3.1 \
-map "[v4out]" -c:v:3 libx264 -b:v:3 1800k -maxrate 2057k -bufsize 2200k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 3.1 \
-map "[v5out]" -c:v:4 libx264 -b:v:4 2500k -maxrate 2825k -bufsize 3000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v main -level 4.0 \
-map "[v6out]" -c:v:5 libx264 -b:v:5 5000k -maxrate 5525k -bufsize 6000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v high -level 4.1 \
-map "[v7out]" -c:v:6 libx264 -b:v:6 8000k -maxrate 9052k -bufsize 10000k -r 29.97 -g 60 -keyint_min 60 -sc_threshold 0 -preset veryfast -profile:v high -level 4.2 \
-map a:0 -c:a:0 aac -b:a:0 128k -ar 48000 -ac 2 \
-f hls -hls_time 6 -hls_playlist_type vod -hls_flags independent_segments \
-hls_segment_type fmp4 \
-hls_segment_filename "output_%v_%03d.mp4" \
-master_pl_name master.m3u8 \
-var_stream_map "v:0,name:layer-416x234-200k v:1,name:layer-416x234-500k v:2,name:layer-640x360-1000k v:3,name:layer-768x432-1800k v:4,name:layer-960x540-2500k v:5,name:layer-1280x720-5000k v:6,name:layer-1920x1080-8000k a:0,name:layer-audio-128k" \
output_%v.m3u8




Above is what I've put together over the past few days.


I consistently run into the same issues:


- My variants must match identically; the bit rate etc. must match exactly, no excuses. No variance allowed.
- When I did it a different way previously, it became impossible to sync the variants' timing, making the project not stitchable and the asset useless. The variants are encoded to last longer than the master.m3u8 says they will, so the asset is rejected downstream.
- I end up either with variants mismatched in timing, or with no audio / audio channels not synced properly.

Here is what the master.m3u8 should look like:








#EXTM3U
#EXT-X-VERSION:7

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=333000,BANDWIDTH=361000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d400d,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=416x234
placeholder.m3u8
#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=632000,BANDWIDTH=677000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d400d,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=416x234
placeholder2.m3u8
#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=1133000,BANDWIDTH=1203000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401e,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=640x360
placeholder3.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=1933000,BANDWIDTH=2057000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=768x432
placeholder4.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=2633000,BANDWIDTH=2825000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=960x540
placeholder5.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=5134000,BANDWIDTH=5525000,CLOSED-CAPTIONS="cc1",CODECS="avc1.4d401f,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=1280x720
placeholder6.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=8135000,BANDWIDTH=9052000,CLOSED-CAPTIONS="cc1",CODECS="avc1.640028,mp4a.40.2",FRAME-RATE=29.97,RESOLUTION=1920x1080
placeholder7.m3u8

#EXT-X-STREAM-INF:AUDIO="aac",AVERAGE-BANDWIDTH=129000,BANDWIDTH=130000,CLOSED-CAPTIONS="cc1",CODECS="mp4a.40.2"
placeholder8.m3u8

#EXT-X-MEDIA:AUTOSELECT=YES,CHANNELS="2",DEFAULT=YES,GROUP-ID="aac",LANGUAGE="en",NAME="English",TYPE=AUDIO,URI="placeholder8.m3u8"
#EXT-X-MEDIA:AUTOSELECT=YES,DEFAULT=YES,GROUP-ID="cc1",INSTREAM-ID="CC1",LANGUAGE="en",NAME="English",TYPE=CLOSED-CAPTIONS



Underlying playlist clips should be *.mp4, not *.m4s or anything like that. Audio must be on a single channel by itself; closed captions are handled by a remote server and aren't a concern.


As mentioned above:


- I have tried transcoding separately and then combining manually later. Here is an example of that:




#!/bin/bash
set -e

# Input file
INPUT_FILE="bbb_30.mp4"

# Output directory
OUTPUT_DIR="hls_output"
mkdir -p "$OUTPUT_DIR"

# First, extract exact duration from master.m3u8 (if it exists)
MASTER_M3U8="master.m3u8" # Change if needed

echo "Extracting exact duration from the source MP4..."
EXACT_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$INPUT_FILE")
echo "Using exact duration: $EXACT_DURATION seconds"
# Create a reference file with exact duration from the source
echo "Creating reference file with exact duration..."
ffmpeg -y -i "$INPUT_FILE" -c copy -t "$EXACT_DURATION" "$OUTPUT_DIR/exact_reference.mp4"

# Calculate exact GOP size for segment alignment (for 6-second segments at 29.97fps)
FPS=29.97
SEGMENT_DURATION=6
GOP_SIZE=$(echo "$FPS * $SEGMENT_DURATION" | bc | awk '{print int($1)}')
echo "Using GOP size of $GOP_SIZE frames for $SEGMENT_DURATION-second segments at $FPS fps"

# Function to encode a variant with exact duration
encode_variant() {
 local resolution="$1"
 local bitrate="$2"
 local maxrate="$3"
 local bufsize="$4"
 local profile="$5"
 local level="$6"
 local audiorate="$7"
 local name_suffix="$8"
 
 echo "Encoding $resolution variant with video bitrate $bitrate kbps and audio bitrate ${audiorate}k..."
 
 # Step 1: Create an intermediate file with exact duration and GOP alignment
 ffmpeg -y -i "$OUTPUT_DIR/exact_reference.mp4" \
 -c:v libx264 -profile:v "$profile" -level "$level" \
 -x264-params "bitrate=$bitrate:vbv-maxrate=$maxrate:vbv-bufsize=$bufsize:keyint=$GOP_SIZE:min-keyint=$GOP_SIZE:no-scenecut=1" \
 -s "$resolution" -r "$FPS" \
 -c:a aac -b:a "${audiorate}k" \
 -vsync cfr -start_at_zero -reset_timestamps 1 \
 -map 0:v:0 -map 0:a:0 \
 -t "$EXACT_DURATION" \
 -force_key_frames "expr:gte(t,n_forced*6)" \
 "$OUTPUT_DIR/temp_${name_suffix}.mp4"
 
 # Step 2: Create HLS segments with exact boundaries from the intermediate file.
 ffmpeg -y -i "$OUTPUT_DIR/temp_${name_suffix}.mp4" \
 -c copy \
 -f hls \
 -hls_time "$SEGMENT_DURATION" \
 -hls_playlist_type vod \
 -hls_segment_filename "$OUTPUT_DIR/layer-${name_suffix}-segment-%03d.mp4" \
 -hls_flags independent_segments+program_date_time+round_durations \
 -hls_list_size 0 \
 "$OUTPUT_DIR/layer-${name_suffix}.m3u8"
 
 # Verify duration
 VARIANT_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$OUTPUT_DIR/temp_${name_suffix}.mp4")
 echo "Variant $name_suffix duration: $VARIANT_DURATION (target: $EXACT_DURATION, diff: $(echo "$VARIANT_DURATION - $EXACT_DURATION" | bc))"
 
 # Clean up temporary file
 rm "$OUTPUT_DIR/temp_${name_suffix}.mp4"
}

# Process each variant with exact duration matching
# Format: resolution, bitrate, maxrate, bufsize, profile, level, audio bitrate, name suffix
encode_variant "416x234" "333" "361" "722" "baseline" "3.0" "64" "416x234-200k"
encode_variant "416x234" "632" "677" "1354" "baseline" "3.0" "64" "416x234-500k"
encode_variant "640x360" "1133" "1203" "2406" "main" "3.0" "96" "640x360-1000k"
encode_variant "768x432" "1933" "2057" "4114" "main" "3.1" "96" "768x432-1800k"
encode_variant "960x540" "2633" "2825" "5650" "main" "3.1" "128" "960x540-2500k"
encode_variant "1280x720" "5134" "5525" "11050" "main" "3.1" "128" "1280x720-5000k"
encode_variant "1920x1080" "8135" "9052" "18104" "high" "4.0" "128" "1920x1080-8000k"

# 8. Audio-only variant
echo "Creating audio-only variant..."


# ffmpeg -y -i "$INPUT_FILE" \
# -vn -map 0:a \
# -c:a aac -b:a 128k -ac 2 \ 
# -t "$EXACT_DURATION" \
# -f hls \
# -hls_time "$SEGMENT_DURATION" \
# -hls_playlist_type vod \
# -hls_flags independent_segments+program_date_time+round_durations \
# -hls_segment_filename "$OUTPUT_DIR/layer-audio-128k-segment-%03d.ts" \
# -hls_list_size 0 \
# "$OUTPUT_DIR/layer-audio-128k.m3u8"

ffmpeg -y -i "$INPUT_FILE" \
 -vn \
 -map 0:a \
 -c:a aac -b:a 128k \
 -t "$EXACT_DURATION" \
 -f hls \
 -hls_time "$SEGMENT_DURATION" \
 -hls_playlist_type vod \
 -hls_segment_type fmp4 \
 -hls_flags independent_segments+program_date_time+round_durations \
 -hls_list_size 0 \
 -hls_segment_filename "$OUTPUT_DIR/layer-audio-128k-segment-%03d.m4s" \
 "$OUTPUT_DIR/layer-audio-128k.m3u8"


# Create master playlist
cat > "$OUTPUT_DIR/master.m3u8" << EOF
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-STREAM-INF:BANDWIDTH=361000,AVERAGE-BANDWIDTH=333000,CODECS="avc1.4d400d,mp4a.40.2",RESOLUTION=416x234,FRAME-RATE=29.97
layer-416x234-200k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=677000,AVERAGE-BANDWIDTH=632000,CODECS="avc1.4d400d,mp4a.40.2",RESOLUTION=416x234,FRAME-RATE=29.97
layer-416x234-500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1203000,AVERAGE-BANDWIDTH=1133000,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=640x360,FRAME-RATE=29.97
layer-640x360-1000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2057000,AVERAGE-BANDWIDTH=1933000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=768x432,FRAME-RATE=29.97
layer-768x432-1800k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2825000,AVERAGE-BANDWIDTH=2633000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=960x540,FRAME-RATE=29.97
layer-960x540-2500k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5525000,AVERAGE-BANDWIDTH=5134000,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,FRAME-RATE=29.97
layer-1280x720-5000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=9052000,AVERAGE-BANDWIDTH=8135000,CODECS="avc1.640028,mp4a.40.2",RESOLUTION=1920x1080,FRAME-RATE=29.97
layer-1920x1080-8000k.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=130000,AVERAGE-BANDWIDTH=129000,CODECS="mp4a.40.2"
layer-audio-128k.m3u8
EOF

# Verify all durations match
cat > "$OUTPUT_DIR/verify_all.sh" << 'EOF'
#!/bin/bash

# Get exact reference duration from the exact reference file
REFERENCE_DURATION=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "exact_reference.mp4")
echo "Reference duration: $REFERENCE_DURATION seconds"

# Check each segment's duration
echo -e "\nChecking individual segments..."
for seg in layer-*-segment-*.mp4 layer-audio-128k-segment-*.ts; do
 dur=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "$seg")
 echo "$seg: $dur seconds"
done

# Get total duration for each variant by summing segment EXTINF durations from each playlist
echo -e "\nChecking combined variant durations..."
for variant in layer-*.m3u8; do
 total=0
 while read -r line; do
 if [[ $line == "#EXTINF:"* ]]; then
 dur=$(echo "$line" | sed 's/#EXTINF:\([0-9.]*\).*/\1/')
 total=$(echo "$total + $dur" | bc)
 fi
 done < "$variant"
 echo "$variant: $total seconds (reference: $REFERENCE_DURATION, diff: $(echo "$total - $REFERENCE_DURATION" | bc))"
done
EOF

chmod +x "$OUTPUT_DIR/verify_all.sh"

echo "HLS packaging complete with exact duration matching."
echo "Master playlist available at: $OUTPUT_DIR/master.m3u8"
echo "Run $OUTPUT_DIR/verify_all.sh to verify durations."
rm "$OUTPUT_DIR/exact_reference.mp4"




I end up with weird audio in VLC, which can't be right, and I also end up with variants being longer than the master.m3u8 playlist, which is wonky.


I tried using AI to fix the audio sync issue, and honestly I'm more confused than when I started.
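One way to pin down the "variants longer than the master says" problem before handing the asset downstream is to sum each variant playlist's EXTINF durations and compare them against the reference duration, much like `verify_all.sh` above does in shell. A small sketch of the same check in Python (the function name is my own, not from any of the scripts above):

```python
import re

def playlist_duration(m3u8_text):
    """Sum the #EXTINF segment durations (in seconds) of a media playlist."""
    return sum(float(d) for d in re.findall(r"#EXTINF:([0-9.]+)", m3u8_text))
```

Any variant whose total differs from the others by more than one segment's rounding (a frame or two at 29.97 fps) is the one that will desync the stitch.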


-
My stitched frames' colors look very different from my original video, causing the video to not stitch back together properly [closed]
27 May 2024, by Wer Wer — I am trying to extract some frames from my video to do a form of steganography. I accidentally used a 120 fps video, causing the files to be too big when I extract every single frame. To fix this, I decided to calculate how many frames are needed to hide the bits (LSB replacement for every 8 bits) and then extract only that many frames. This means:


- if I only need 1 frame, I'll extract frame0.png
- I'll remove frame0 from the original video
- encode my data into frame0.png
- stitch frame0 back into an FFV1 video
- concatenate the frame0 video to the rest of the video, frame0 video in front.












I can do the extraction and remove frame0 from the video. However, when looking at frame0.mkv and the original.mkv, I realised the colors seem to be different.
Frame0.mkv
original.mkv


This causes a glitch when stitching the videos together: the end of the video has some corrupted pixels. Not only that, the video stops where frame0 ends. I think those corrupted pixels were supposed to be original.mkv pixels, but they did not concatenate properly.
results.mkv


I use an ffmpeg subcommand to extract frames and stitch them:


def split_into_frames(self, ffv1_video, hidden_text_length):
    if not ffv1_video.endswith(".mkv"):
        ffv1_video += ".mkv"

    ffv1_video_path = os.path.join(self.here, ffv1_video)
    ffv1_video = cv2.VideoCapture(ffv1_video_path)

    currentframe = 0
    total_frame_bits = 0
    frames_to_remove = []

    while True:
        ret, frame = ffv1_video.read()
        if ret:
            name = os.path.join(self.here, "data", f"frame{currentframe}.png")
            print("Creating..." + name)
            cv2.imwrite(name, frame)

            current_frame_path = os.path.join(
                self.here, "data", f"frame{currentframe}.png"
            )

            if os.path.exists(current_frame_path):
                binary_data = self.read_frame_binary(current_frame_path)

                if (total_frame_bits // 8) >= hidden_text_length:
                    print("Complete")
                    break
                total_frame_bits += len(binary_data)
                frames_to_remove.append(currentframe)
            currentframe += 1
        else:
            print("Complete")
            break

    ffv1_video.release()

    # Remove the extracted frames from the original video
    self.remove_frames_from_video(ffv1_video_path, frames_to_remove)



This code splits the video into the required number of frames. It checks whether the total number of frame bits is enough to encode the hidden text.


def remove_frames_from_video(self, input_video, frames_to_remove):
    if not input_video.endswith(".mkv"):
        input_video += ".mkv"

    input_video_path = os.path.join(self.here, input_video)

    # Create a filter string to exclude specific frames
    filter_str = (
        "select='not("
        + "+".join([f"eq(n\\,{frame})" for frame in frames_to_remove])
        + ")',setpts=N/FRAME_RATE/TB"
    )

    # Temporary output video path
    output_video_path = os.path.join(self.here, "temp_output.mkv")

    command = [
        "ffmpeg", "-y",
        "-i", input_video_path,
        "-vf", filter_str,
        "-c:v", "ffv1",
        "-level", "3",
        "-coder", "1",
        "-context", "1",
        "-g", "1",
        "-slices", "4",
        "-slicecrc", "1",
        "-an",  # Remove audio
        output_video_path,
    ]

    try:
        subprocess.run(command, check=True)
        print(f"Frames removed. Temporary video created at {output_video_path}")

        # Replace the original video with the new video
        os.replace(output_video_path, input_video_path)
        print(f"Original video replaced with updated video at {input_video_path}")

        # Re-add the trimmed audio to the new video
        self.trim_audio_and_add_to_video(input_video_path, frames_to_remove)
    except subprocess.CalledProcessError as e:
        print(f"An error occurred: {e}")
        if os.path.exists(output_video_path):
            os.remove(output_video_path)

def trim_audio_and_add_to_video(self, video_path, frames_to_remove):
    # Calculate the new duration based on the remaining frames
    fps = 60  # Assuming the framerate is 60 fps
    total_frames_removed = len(frames_to_remove)
    original_duration = self.get_video_duration(video_path)
    new_duration = original_duration - (total_frames_removed / fps)

    # Extract and trim the audio
    audio_path = os.path.join(self.here, "trimmed_audio.aac")
    command_extract_trim = [
        "ffmpeg", "-y",
        "-i", video_path,
        "-t", str(new_duration),
        "-q:a", "0",
        "-map", "a",
        audio_path,
    ]
    try:
        subprocess.run(command_extract_trim, check=True)
        print(f"Audio successfully trimmed and extracted to {audio_path}")

        # Add the trimmed audio back to the video
        final_video_path = video_path.replace(".mkv", "_final.mkv")
        command_add_audio = [
            "ffmpeg", "-y",
            "-i", video_path,
            "-i", audio_path,
            "-c:v", "copy",
            "-c:a", "aac",
            "-strict", "experimental",
            final_video_path,
        ]
        subprocess.run(command_add_audio, check=True)
        print(f"Final video with trimmed audio created at {final_video_path}")

        # Replace the original video with the final video
        os.replace(final_video_path, video_path)
        print(f"Original video replaced with final video at {video_path}")
    except subprocess.CalledProcessError as e:
        print(f"An error occurred: {e}")

def get_video_duration(self, video_path):
    command = [
        "ffprobe",
        "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]
    try:
        result = subprocess.run(
            command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
        duration = float(result.stdout.decode().strip())
        return duration
    except subprocess.CalledProcessError as e:
        print(f"An error occurred while getting video duration: {e}")
        return 0.0



Here I'll remove all the frames that have been extracted from the video.


def stitch_frames_to_video(self, ffv1_video, framerate=60):
    # this command is another ffmpeg subcommand.
    # it takes every single frame from the data1 directory and stitches them back into an FFV1 video
    if not ffv1_video.endswith(".mkv"):
        ffv1_video += ".mkv"

    output_video_path = os.path.join(self.here, ffv1_video)

    command = [
        "ffmpeg", "-y",
        "-framerate", str(framerate),
        "-i", os.path.join(self.frames_directory, "frame%d.png"),
        "-c:v", "ffv1",
        "-level", "3",
        "-coder", "1",
        "-context", "1",
        "-g", "1",
        "-slices", "4",
        "-slicecrc", "1",
        output_video_path,
    ]

    try:
        subprocess.run(command, check=True)
        print(f"Video successfully created at {output_video_path}")
    except subprocess.CalledProcessError as e:
        print(f"An error occurred: {e}")



After encoding the frames, I'll try to stitch them back into an FFV1 video.


def concatenate_videos(self, video1_path, video2_path, output_path):
    if not video1_path.endswith(".mkv"):
        video1_path += ".mkv"
    if not video2_path.endswith(".mkv"):
        video2_path += ".mkv"
    if not output_path.endswith(".mkv"):
        output_path += ".mkv"

    video1_path = os.path.join(self.here, video1_path)
    video2_path = os.path.join(self.here, video2_path)
    output_video_path = os.path.join(self.here, output_path)

    # Create a text file with the paths of the videos to concatenate
    concat_list_path = os.path.join(self.here, "concat_list.txt")
    with open(concat_list_path, "w") as f:
        f.write(f"file '{video1_path}'\n")
        f.write(f"file '{video2_path}'\n")

    command = [
        "ffmpeg", "-y",
        "-f", "concat",
        "-safe", "0",
        "-i", concat_list_path,
        "-c", "copy",
        output_video_path,
    ]

    try:
        subprocess.run(command, check=True)
        print(f"Videos successfully concatenated into {output_video_path}")
        os.remove(concat_list_path)  # Clean up the temporary file
    except subprocess.CalledProcessError as e:
        print(f"An error occurred: {e}")



Now I try to concatenate the frames video with the original video, but it is corrupting as the colors are different.


This code does the other processing: removing all the extracted frames from the video, as well as trimming the audio (though I think I'll remove the audio trimming, as I realised it is not needed at all).


I think it's because .png frames lose color information when they get extracted. The only workaround I know is to extract every single frame, but that makes the program run too long: for a 12-second video, I would extract 700++ frames. Is there a way to fix this?
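One common cause of this kind of color shift is a pixel-format mismatch rather than PNG itself losing color: the PNGs are RGB, so the re-encoded frame0.mkv may carry a different pix_fmt (e.g. an RGB variant) than the original FFV1 stream (typically yuv420p), and concatenating streams with different pixel formats via `-c copy` glitches. A quick diagnostic, sketched here with hypothetical helper names of my own, is to compare the two files with ffprobe:

```python
import json
import subprocess

def video_pix_fmt(probe_json):
    """Pick the first video stream's pix_fmt out of ffprobe's JSON output."""
    for stream in json.loads(probe_json).get("streams", []):
        if stream.get("codec_type") == "video":
            return stream.get("pix_fmt")
    return None

def probe_pix_fmt(path):
    """Run ffprobe (must be on PATH) and return the video pix_fmt of `path`."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return video_pix_fmt(out)

# If probe_pix_fmt("frame0.mkv") != probe_pix_fmt("original.mkv"),
# forcing -pix_fmt to the original's value when stitching the PNGs
# should keep the concat from glitching.
```

This is only a diagnostic sketch; the fix itself would be matching the pixel format in the stitching command.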


My full code:


import json
import os
import shutil
import magic
import ffmpeg
import cv2
import numpy as np
import subprocess
from PIL import Image
import glob




class FFV1Steganography:
    def __init__(self):
        self.here = os.path.dirname(os.path.abspath(__file__))

        # Create a folder to save the frames
        self.frames_directory = os.path.join(self.here, "data")
        try:
            if not os.path.exists(self.frames_directory):
                os.makedirs(self.frames_directory)
        except OSError:
            print("Error: Creating directory of data")

    def read_hidden_text(self, filename):
        file_path_txt = os.path.join(self.here, filename)
        # Read the content of the file in binary mode
        with open(file_path_txt, "rb") as f:
            hidden_text_content = f.read()
        return hidden_text_content

    def calculate_length_of_hidden_text(self, filename):
        hidden_text_content = self.read_hidden_text(filename)
        # Convert each byte to its binary representation and join them
        return len("".join(format(byte, "08b") for byte in hidden_text_content))

    def find_raw_video_file(self, filename):
        file_extensions = [".mp4", ".mkv", ".avi"]
        for ext in file_extensions:
            file_path = os.path.join(self.here, filename + ext)
            if os.path.isfile(file_path):
                return file_path
        return None

    def convert_video(self, input_file, ffv1_video):
        # this function is the same as running this command line
        # ffmpeg -i video.mp4 -t 12 -c:v ffv1 -level 3 -coder 1 -context 1 -g 1 -slices 4 -slicecrc 1 -c:a copy output.mkv

        # in order to run any ffmpeg subprocess, you have to have ffmpeg installed on the computer.
        # https://ffmpeg.org/download.html

        # WARNING:
        # the ffmpeg you should download is not the same as the ffmpeg library for python.
        # you need to download the exe from the link above, then add the ffmpeg bin directory to the system variables
        output_file = os.path.join(self.here, ffv1_video)

        if not output_file.endswith(".mkv"):
            output_file += ".mkv"

        command = [
            "ffmpeg", "-y",
            "-i", input_file,
            "-t", "12",
            "-c:v", "ffv1",
            "-level", "3",
            "-coder", "1",
            "-context", "1",
            "-g", "1",
            "-slices", "4",
            "-slicecrc", "1",
            "-c:a", "copy",
            output_file,
        ]

        try:
            subprocess.run(command, check=True)
            print(f"Conversion successful: {output_file}")
            return output_file
        except subprocess.CalledProcessError as e:
            print(f"Error during conversion: {e}")

    def extract_audio(self, ffv1_video, audio_path):
        # Ensure the audio output file has the correct extension
        if not audio_path.endswith(".aac"):
            audio_path += ".aac"

        # Full path to the extracted audio file
        extracted_audio = os.path.join(self.here, audio_path)

        if not ffv1_video.endswith(".mkv"):
            ffv1_video += ".mkv"

        command = [
            "ffmpeg",
            "-i", ffv1_video,
            "-q:a", "0",
            "-map", "a",
            extracted_audio,
        ]
        try:
            result = subprocess.run(
                command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
            )
            print(f"Audio successfully extracted to {extracted_audio}")
            print(result.stdout.decode())
            print(result.stderr.decode())
        except subprocess.CalledProcessError as e:
            print(f"An error occurred: {e}")
            print(e.stdout.decode())
            print(e.stderr.decode())

    def read_frame_binary(self, frame_path):
        # Open the image and convert to binary
        with open(frame_path, "rb") as f:
            binary_content = f.read()
        binary_string = "".join(format(byte, "08b") for byte in binary_content)
        return binary_string

 def remove_frames_from_video(self, input_video, frames_to_remove):
 if not input_video.endswith(".mkv"):
 input_video += ".mkv"

 input_video_path = os.path.join(self.here, input_video)

 # Create a filter string to exclude specific frames
 filter_str = (
 "select='not("
            + "+".join([f"eq(n\\,{frame})" for frame in frames_to_remove])
 + ")',setpts=N/FRAME_RATE/TB"
 )

 # Temporary output video path
 output_video_path = os.path.join(self.here, "temp_output.mkv")

 command = [
 "ffmpeg",
 "-y",
 "-i",
 input_video_path,
 "-vf",
 filter_str,
 "-c:v",
 "ffv1",
 "-level",
 "3",
 "-coder",
 "1",
 "-context",
 "1",
 "-g",
 "1",
 "-slices",
 "4",
 "-slicecrc",
 "1",
 "-an", # Remove audio
 output_video_path,
 ]

 try:
 subprocess.run(command, check=True)
 print(f"Frames removed. Temporary video created at {output_video_path}")

 # Replace the original video with the new video
 os.replace(output_video_path, input_video_path)
 print(f"Original video replaced with updated video at {input_video_path}")

 # Re-add the trimmed audio to the new video
 self.trim_audio_and_add_to_video(input_video_path, frames_to_remove)
 except subprocess.CalledProcessError as e:
 print(f"An error occurred: {e}")
 if os.path.exists(output_video_path):
 os.remove(output_video_path)

 def trim_audio_and_add_to_video(self, video_path, frames_to_remove):
 # Calculate the new duration based on the remaining frames
 fps = 60 # Assuming the framerate is 60 fps
 total_frames_removed = len(frames_to_remove)
 original_duration = self.get_video_duration(video_path)
 new_duration = original_duration - (total_frames_removed / fps)

 # Extract and trim the audio
 audio_path = os.path.join(self.here, "trimmed_audio.aac")
 command_extract_trim = [
 "ffmpeg",
 "-y",
 "-i",
 video_path,
 "-t",
 str(new_duration),
 "-q:a",
 "0",
 "-map",
 "a",
 audio_path,
 ]
 try:
 subprocess.run(command_extract_trim, check=True)
 print(f"Audio successfully trimmed and extracted to {audio_path}")

 # Add the trimmed audio back to the video
 final_video_path = video_path.replace(".mkv", "_final.mkv")
 command_add_audio = [
 "ffmpeg",
 "-y",
 "-i",
 video_path,
 "-i",
 audio_path,
 "-c:v",
 "copy",
 "-c:a",
 "aac",
 "-strict",
 "experimental",
 final_video_path,
 ]
 subprocess.run(command_add_audio, check=True)
 print(f"Final video with trimmed audio created at {final_video_path}")

 # Replace the original video with the final video
 os.replace(final_video_path, video_path)
 print(f"Original video replaced with final video at {video_path}")
 except subprocess.CalledProcessError as e:
 print(f"An error occurred: {e}")

 def get_video_duration(self, video_path):
 command = [
 "ffprobe",
 "-v",
 "error",
 "-show_entries",
 "format=duration",
 "-of",
 "default=noprint_wrappers=1:nokey=1",
 video_path,
 ]
 try:
 result = subprocess.run(
 command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
 )
 duration = float(result.stdout.decode().strip())
 return duration
 except subprocess.CalledProcessError as e:
 print(f"An error occurred while getting video duration: {e}")
 return 0.0

 def split_into_frames(self, ffv1_video, hidden_text_length):
 if not ffv1_video.endswith(".mkv"):
 ffv1_video += ".mkv"

 ffv1_video_path = os.path.join(self.here, ffv1_video)
 ffv1_video = cv2.VideoCapture(ffv1_video_path)

 currentframe = 0
 total_frame_bits = 0
 frames_to_remove = []

 while True:
 ret, frame = ffv1_video.read()
 if ret:
 name = os.path.join(self.here, "data", f"frame{currentframe}.png")
 print("Creating..." + name)
 cv2.imwrite(name, frame)

 current_frame_path = os.path.join(
 self.here, "data", f"frame{currentframe}.png"
 )

 if os.path.exists(current_frame_path):
 binary_data = self.read_frame_binary(current_frame_path)

 if (total_frame_bits // 8) >= hidden_text_length:
 print("Complete")
 break
 total_frame_bits += len(binary_data)
 frames_to_remove.append(currentframe)
 currentframe += 1
 else:
 print("Complete")
 break

 ffv1_video.release()

 # Remove the extracted frames from the original video
 self.remove_frames_from_video(ffv1_video_path, frames_to_remove)

 def stitch_frames_to_video(self, ffv1_video, framerate=60):
        # This ffmpeg command takes every frame from the frames directory
        # and stitches them back into an FFV1 video.
 if not ffv1_video.endswith(".mkv"):
 ffv1_video += ".mkv"

 output_video_path = os.path.join(self.here, ffv1_video)

 command = [
 "ffmpeg",
 "-y",
 "-framerate",
 str(framerate),
 "-i",
 os.path.join(self.frames_directory, "frame%d.png"),
 "-c:v",
 "ffv1",
 "-level",
 "3",
 "-coder",
 "1",
 "-context",
 "1",
 "-g",
 "1",
 "-slices",
 "4",
 "-slicecrc",
 "1",
 output_video_path,
 ]

 try:
 subprocess.run(command, check=True)
 print(f"Video successfully created at {output_video_path}")
 except subprocess.CalledProcessError as e:
 print(f"An error occurred: {e}")

 def add_audio_to_video(self, encoded_video, audio_path, final_video):
        # The audio is lost during splitting and restitching.
        # That is why we separated the audio from the video earlier and saved it as AAC.
        # Now we can put the audio back into the video, again with an ffmpeg command.

 if not encoded_video.endswith(".mkv"):
 encoded_video += ".mkv"

 if not final_video.endswith(".mkv"):
 final_video += ".mkv"

 if not audio_path.endswith(".aac"):
 audio_path += ".aac"

 final_output_path = os.path.join(self.here, final_video)

 command = [
 "ffmpeg",
 "-y",
 "-i",
 os.path.join(self.here, encoded_video),
 "-i",
 os.path.join(self.here, audio_path),
 "-c:v",
 "copy",
 "-c:a",
 "aac",
 "-strict",
 "experimental",
 final_output_path,
 ]
 try:
 subprocess.run(command, check=True)
 print(f"Final video with audio created at {final_output_path}")
 except subprocess.CalledProcessError as e:
 print(f"An error occurred: {e}")

 def concatenate_videos(self, video1_path, video2_path, output_path):
 if not video1_path.endswith(".mkv"):
 video1_path += ".mkv"
 if not video2_path.endswith(".mkv"):
 video2_path += ".mkv"
 if not output_path.endswith(".mkv"):
 output_path += ".mkv"

 video1_path = os.path.join(self.here, video1_path)
 video2_path = os.path.join(self.here, video2_path)
 output_video_path = os.path.join(self.here, output_path)

 # Create a text file with the paths of the videos to concatenate
 concat_list_path = os.path.join(self.here, "concat_list.txt")
 with open(concat_list_path, "w") as f:
 f.write(f"file '{video1_path}'\n")
 f.write(f"file '{video2_path}'\n")

 command = [
 "ffmpeg",
 "-y",
 "-f",
 "concat",
 "-safe",
 "0",
 "-i",
 concat_list_path,
 "-c",
 "copy",
 output_video_path,
 ]

 try:
 subprocess.run(command, check=True)
 print(f"Videos successfully concatenated into {output_video_path}")
 os.remove(concat_list_path) # Clean up the temporary file
 except subprocess.CalledProcessError as e:
 print(f"An error occurred: {e}")

 def cleanup(self, files_to_delete):
 # Delete specified files
 for file in files_to_delete:
 file_path = os.path.join(self.here, file)
 if os.path.exists(file_path):
 os.remove(file_path)
 print(f"Deleted file: {file_path}")
 else:
 print(f"File not found: {file_path}")

 # Delete the frames directory and its contents
 if os.path.exists(self.frames_directory):
 shutil.rmtree(self.frames_directory)
 print(f"Deleted directory and its contents: {self.frames_directory}")
 else:
 print(f"Directory not found: {self.frames_directory}")


if __name__ == "__main__":
 stego = FFV1Steganography()

 # original video (mp4,mkv,avi)
 original_video = "video"
 # converted ffv1 video
 ffv1_video = "output"
 # extracted audio
 extracted_audio = "audio"
 # encoded video without sound
 encoded_video = "encoded"
 # final result video, encoded, with sound
 final_video = "result"

 # region --hidden text processing --
 hidden_text = stego.read_hidden_text("hiddentext.txt")
 hidden_text_length = stego.calculate_length_of_hidden_text("hiddentext.txt")
 # endregion

 # region -- raw video locating --
 raw_video_file = stego.find_raw_video_file(original_video)
 if raw_video_file:
 print(f"Found video file: {raw_video_file}")
 else:
        print(f"{original_video} not found.")
 # endregion

 # region -- video processing INPUT--
 # converted_video_file = stego.convert_video(raw_video_file, ffv1_video)
 # if converted_video_file and os.path.exists(converted_video_file):
 # stego.extract_audio(converted_video_file, extracted_audio)
 # else:
 # print(f"Conversion failed: {converted_video_file} not found.")

 # stego.split_into_frames(ffv1_video, hidden_text_length * 50000)
 # endregion

 # region -- video processing RESULT --
 # stego.stitch_frames_to_video(encoded_video)
 stego.concatenate_videos(encoded_video, ffv1_video, final_video)
 # stego.add_audio_to_video(final_video, extracted_audio, final_video)
 # endregion

 # region -- cleanup --
 files_to_delete = [
 extracted_audio + ".aac",
 encoded_video + ".mkv",
 ffv1_video + ".mkv",
 ]

 stego.cleanup(files_to_delete)
 # endregion
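Side note on the `fps = 60` assumption in `trim_audio_and_add_to_video`: instead of hard-coding it, the real frame rate could be queried with ffprobe. A minimal sketch, assuming `ffprobe` is on PATH (the helper names are mine):

```python
import subprocess


def parse_frame_rate(ratio: str) -> float:
    """Convert an ffprobe rate string such as '60/1' or '30000/1001' to a float."""
    num, den = ratio.strip().split("/")
    return float(num) / float(den)


def get_video_fps(video_path: str) -> float:
    """Query the average frame rate of the first video stream via ffprobe."""
    command = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=avg_frame_rate",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]
    result = subprocess.run(
        command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return parse_frame_rate(result.stdout.decode())
```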

Edit (expected result):
I don't know if there is a way to match the exact color encoding between the stitched PNG frames and the rest of the FFV1 video. Is there a way I can extract the frames so that the color space, pixel format, or anything else I may not know about FFV1 matches the original FFV1 video?
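For context, this is how far I got: I can at least read the pixel format of the original FFV1 stream with ffprobe, and I suspect forcing the same value with `-pix_fmt` in `stitch_frames_to_video` is part of the answer, though I am not sure it covers everything. A sketch, assuming `ffprobe` is on PATH (the helper names are mine):

```python
import subprocess


def pix_fmt_probe_command(video_path: str) -> list:
    """Build the ffprobe command that prints only the pixel format
    of the first video stream (e.g. 'yuv420p')."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=pix_fmt",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]


def get_pix_fmt(video_path: str) -> str:
    """Run the probe and return the pixel format string."""
    result = subprocess.run(
        pix_fmt_probe_command(video_path),
        check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    return result.stdout.decode().strip()


# The stitch command could then force the same format before the output path:
# command = ["ffmpeg", "-y", "-framerate", "60", "-i", "frame%d.png",
#            "-c:v", "ffv1", ..., "-pix_fmt", get_pix_fmt("output.mkv"),
#            output_video_path]
```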