
Recherche avancée
Médias (3)
-
Elephants Dream - Cover of the soundtrack
17 octobre 2011, par
Mis à jour : Octobre 2011
Langue : English
Type : Image
-
Valkaama DVD Label
4 octobre 2011, par
Mis à jour : Février 2013
Langue : English
Type : Image
-
Publier une image simplement
13 avril 2011, par ,
Mis à jour : Février 2012
Langue : français
Type : Video
Autres articles (48)
-
Ajouter notes et légendes aux images
7 février 2011, parPour pouvoir ajouter notes et légendes aux images, la première étape est d’installer le plugin "Légendes".
Une fois le plugin activé, vous pouvez le configurer dans l’espace de configuration afin de modifier les droits de création / modification et de suppression des notes. Par défaut seuls les administrateurs du site peuvent ajouter des notes aux images.
Modification lors de l’ajout d’un média
Lors de l’ajout d’un média de type "image" un nouveau bouton apparait au dessus de la prévisualisation (...) -
Encoding and processing into web-friendly formats
13 avril 2011, parMediaSPIP automatically converts uploaded files to internet-compatible formats.
Video files are encoded in MP4, Ogv and WebM (supported by HTML5) and MP4 (supported by Flash).
Audio files are encoded in MP3 and Ogg (supported by HTML5) and MP3 (supported by Flash).
Where possible, text is analyzed in order to retrieve the data needed for search engine detection, and then exported as a series of image files.
All uploaded files are stored online in their original format, so you can (...) -
Submit bugs and patches
13 avril 2011Unfortunately a software is never perfect.
If you think you have found a bug, report it using our ticket system. Please to help us to fix it by providing the following information : the browser you are using, including the exact version as precise an explanation as possible of the problem if possible, the steps taken resulting in the problem a link to the site / page in question
If you think you have solved the bug, fill in a ticket and attach to it a corrective patch.
You may also (...)
Sur d’autres sites (9430)
-
Computer crashing when using python tools in same script
5 février 2023, par SL1997I am attempting to use the speech recognition toolkit VOSK and the speech diarization package Resemblyzer to transcibe audio and then identify the speakers in the audio.


Tools :


https://github.com/alphacep/vosk-api

https://github.com/resemble-ai/Resemblyzer

I can do both things individually but run into issues when trying to do them when running the one python script.


I used the following guide when setting up the diarization system :




Computer specs are as follows :


Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz, 3912 Mhz, 2 Core(s), 4 Logical Processor(s)

32GB RAM

The following is my code, I am not to sure if using threading is appropriate or if I even implemented it correctly, how can I best optimize this code as to achieve the results I am looking for and not crash.


from vosk import Model, KaldiRecognizer
from pydub import AudioSegment
import json
import sys
import os
import subprocess
import datetime
from resemblyzer import preprocess_wav, VoiceEncoder
from pathlib import Path
from resemblyzer.hparams import sampling_rate
from spectralcluster import SpectralClusterer
import threading
import queue
import gc



def recognition(queue, audio, FRAME_RATE):

 model = Model("Vosk_Models/vosk-model-small-en-us-0.15")

 rec = KaldiRecognizer(model, FRAME_RATE)
 rec.SetWords(True)

 rec.AcceptWaveform(audio.raw_data)
 result = rec.Result()

 transcript = json.loads(result)#["text"]

 #return transcript
 queue.put(transcript)



def diarization(queue, audio):

 wav = preprocess_wav(audio)
 encoder = VoiceEncoder("cpu")
 _, cont_embeds, wav_splits = encoder.embed_utterance(wav, return_partials=True, rate=16)
 print(cont_embeds.shape)

 clusterer = SpectralClusterer(
 min_clusters=2,
 max_clusters=100,
 p_percentile=0.90,
 gaussian_blur_sigma=1)

 labels = clusterer.predict(cont_embeds)

 def create_labelling(labels, wav_splits):

 times = [((s.start + s.stop) / 2) / sampling_rate for s in wav_splits]
 labelling = []
 start_time = 0

 for i, time in enumerate(times):
 if i > 0 and labels[i] != labels[i - 1]:
 temp = [str(labels[i - 1]), start_time, time]
 labelling.append(tuple(temp))
 start_time = time
 if i == len(times) - 1:
 temp = [str(labels[i]), start_time, time]
 labelling.append(tuple(temp))

 return labelling

 #return
 labelling = create_labelling(labels, wav_splits)
 queue.put(labelling)



def identify_speaker(queue1, queue2):

 transcript = queue1.get()
 labelling = queue2.get()

 for speaker in labelling:

 speakerID = speaker[0]
 speakerStart = speaker[1]
 speakerEnd = speaker[2]

 result = transcript['result']
 words = [r['word'] for r in result if speakerStart < r['start'] < speakerEnd]
 #return
 print("Speaker",speakerID,":",' '.join(words), "\n")





def main():

 queue1 = queue.Queue()
 queue2 = queue.Queue()

 FRAME_RATE = 16000
 CHANNELS = 1

 podcast = AudioSegment.from_mp3("Podcast_Audio/Film-Release-Clip.mp3")
 podcast = podcast.set_channels(CHANNELS)
 podcast = podcast.set_frame_rate(FRAME_RATE)

 first_thread = threading.Thread(target=recognition, args=(queue1, podcast, FRAME_RATE))
 second_thread = threading.Thread(target=diarization, args=(queue2, podcast))
 third_thread = threading.Thread(target=identify_speaker, args=(queue1, queue2))

 first_thread.start()
 first_thread.join()
 gc.collect()

 second_thread.start()
 second_thread.join()
 gc.collect()

 third_thread.start()
 third_thread.join()
 gc.collect()

 # transcript = recognition(podcast,FRAME_RATE)
 #
 # labelling = diarization(podcast)
 #
 # print(identify_speaker(transcript, labelling))


if __name__ == '__main__':
 main()



When I say crash I mean everything freezes, I have to hold down the power button on the desktop and turn it back on again. No blue/blank screen, just frozen in my IDE looking at my code. Any help in resolving this issue would be greatly appreciated.


-
ARM inline asm secrets
Although I generally recommend against using GCC inline assembly, preferring instead pure assembly code in separate files, there are occasions where inline is the appropriate solution. Should one, at a time like this, turn to the GCC documentation for guidance, one must be prepared for a degree of disappointment. As it happens, much of the inline asm syntax is left entirely undocumented. This article attempts to fill in some of the blanks for the ARM target.
Constraints
Each operand of an inline asm block is described by a constraint string encoding the valid representations of the operand in the generated assembly. For example the “r” code denotes a general-purpose register. In addition to the standard constraints, ARM allows a number of special codes, only some of which are documented. The full list, including a brief description, is available in the constraints.md file in the GCC source tree. The following table is an extract from this file consisting of the codes which are meaningful in an inline asm block (a few are only useful in the machine description itself).
f Legacy FPA registers f0-f7. t The VFP registers s0-s31. v The Cirrus Maverick co-processor registers. w The VFP registers d0-d15, or d0-d31 for VFPv3. x The VFP registers d0-d7. y The Intel iWMMX co-processor registers. z The Intel iWMMX GR registers. l In Thumb state the core registers r0-r7. h In Thumb state the core registers r8-r15. j A constant suitable for a MOVW instruction. (ARM/Thumb-2) b Thumb only. The union of the low registers and the stack register. I In ARM/Thumb-2 state a constant that can be used as an immediate value in a Data Processing instruction. In Thumb-1 state a constant in the range 0 to 255. J In ARM/Thumb-2 state a constant in the range -4095 to 4095. In Thumb-1 state a constant in the range -255 to -1. K In ARM/Thumb-2 state a constant that satisfies the I constraint if inverted. In Thumb-1 state a constant that satisfies the I constraint multiplied by any power of 2. L In ARM/Thumb-2 state a constant that satisfies the I constraint if negated. In Thumb-1 state a constant in the range -7 to 7. M In Thumb-1 state a constant that is a multiple of 4 in the range 0 to 1020. N Thumb-1 state a constant in the range 0 to 31. O In Thumb-1 state a constant that is a multiple of 4 in the range -508 to 508. Pa In Thumb-1 state a constant in the range -510 to +510 Pb In Thumb-1 state a constant in the range -262 to +262 Ps In Thumb-2 state a constant in the range -255 to +255 Pt In Thumb-2 state a constant in the range -7 to +7 G In ARM/Thumb-2 state a valid FPA immediate constant. H In ARM/Thumb-2 state a valid FPA immediate constant when negated. Da In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with two Data Processing insns. Db In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with three Data Processing insns. Dc In ARM/Thumb-2 state a const_int, const_double or const_vector that can be generated with four Data Processing insns. This pattern is disabled if optimizing for space or when we have load-delay slots to fill. Dn In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov immediate instruction. Dl In ARM/Thumb-2 state a const_vector which can be used with a Neon vorr or vbic instruction. DL In ARM/Thumb-2 state a const_vector which can be used with a Neon vorn or vand instruction. Dv In ARM/Thumb-2 state a const_double which can be used with a VFP fconsts instruction. Dy In ARM/Thumb-2 state a const_double which can be used with a VFP fconstd instruction. Ut In ARM/Thumb-2 state an address valid for loading/storing opaque structure types wider than TImode. Uv In ARM/Thumb-2 state a valid VFP load/store address. Uy In ARM/Thumb-2 state a valid iWMMX load/store address. Un In ARM/Thumb-2 state a valid address for Neon doubleword vector load/store instructions. Um In ARM/Thumb-2 state a valid address for Neon element and structure load/store instructions. Us In ARM/Thumb-2 state a valid address for non-offset loads/stores of quad-word values in four ARM registers. Uq In ARM state an address valid in ldrsb instructions. Q In ARM/Thumb-2 state an address that is a single base register. Operand codes
Within the text of an inline asm block, operands are referenced as %0, %1 etc. Register operands are printed as rN, memory operands as [rN, #offset], and so forth. In some situations, for example with operands occupying multiple registers, more detailed control of the output may be required, and once again, an undocumented feature comes to our rescue.
Special code letters inserted between the % and the operand number alter the output from the default for each type of operand. The table below lists the more useful ones.
c An integer or symbol address without a preceding # sign B Bitwise inverse of integer or symbol without a preceding # L The low 16 bits of an immediate constant m The base register of a memory operand M A register range suitable for LDM/STM H The highest-numbered register of a pair Q The least significant register of a pair R The most significant register of a pair P A double-precision VFP register p The high single-precision register of a VFP double-precision register q A NEON quad register e The low doubleword register of a NEON quad register f The high doubleword register of a NEON quad register h A range of VFP/NEON registers suitable for VLD1/VST1 A A memory operand for a VLD1/VST1 instruction y S register as indexed D register, e.g. s5 becomes d2[1] -
Need help configuring FFMPEG to work with a webcams h264 stream
9 août 2020, par The Welsh DragonI have been trying to get a H264 stream from a H264 usb webcam working but I am not making much progress so I'm hoping someone knows FFMPEG better than me !


There are dozens of questions/answers on SO but none solve my problem.


In short, I get a very pixelated (or sometimes mostly green) screen. I am using VLC to test the stream which is coming via an RTSP server. I am using FFMPEG to copy the webcam stream to the local RTSP server.


The webcam also supports YUYV which I can get working - it is just the h264 stream causing me problems.


So this is how the device is presented :


H264 USB Camera: USB Camera (usb-20980000.usb-1):
 /dev/video0
 /dev/video1
 /dev/video2
 /dev/video3



/dev/video0 is the YUYV and MPEG stream
/dev/video2 is the h264 stream that has the following capabilities :


ioctl: VIDIOC_ENUM_FMT
 Type: Video Capture

 [0]: 'H264' (H.264, compressed)
 Size: Discrete 1920x1080
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 1280x720
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 800x600
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 640x480
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 640x360
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 352x288
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 320x240
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Size: Discrete 1920x1080
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)
 Interval: Discrete 0.033s (30.000 fps)
 Interval: Discrete 0.040s (25.000 fps)
 Interval: Discrete 0.067s (15.000 fps)



I have tried various resolutions, the smaller giving slightly less pixelated images but none are usable and definitely dont compare to the YUYV high resolution results.


This (YUYV) command works :


ffmpeg -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -c:v h264_omx -r 10 -b:v 2M -an -f rtsp rtsp://localhost:80/live/stream



These two h264 options dont work :


ffmpeg -input_format h264 -f video4linux2 -video_size 1920x1080 -framerate 30 -i /dev/video0 -c:v copy -an -f rtsp rtsp://localhost:80/live/stream



ffmpeg -re -i /dev/video2 -video_size 800x600 -framerate 15 -pix_fmt yuv420p -tune zerolatency -c:v copy -an -f rtsp rtsp://localhost:80/live/stream



For that last command the FFMPEG output looks like this :


ffmpeg version git-2020-08-07-6fdf3cc Copyright (c) 2000-2020 the FFmpeg developers
 built with gcc 8 (Raspbian 8.3.0-6+rpi1)
 configuration: --extra-ldflags=-latomic --arch=armel --target-os=linux --enable-gpl --enable-omx --enable-omx-rpi --enable-nonfree --enable-libfreetype --enable-libx264 --enable-libmp3lame --enable-mmal --enable-indev=alsa --enable-outdev=alsa
 libavutil 56. 58.100 / 56. 58.100
 libavcodec 58.100.100 / 58.100.100
 libavformat 58. 50.100 / 58. 50.100
 libavdevice 58. 11.101 / 58. 11.101
 libavfilter 7. 87.100 / 7. 87.100
 libswscale 5. 8.100 / 5. 8.100
 libswresample 3. 8.100 / 3. 8.100
 libpostproc 55. 8.100 / 55. 8.100
Input #0, video4linux2,v4l2, from '/dev/video2':
 Duration: N/A, start: 1353.265049, bitrate: N/A
 Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1920x1080, 30 fps, 30 tbr, 1000k tbn, 2000k tbc
[udp @ 0x38c29f0] attempted to set receive buffer to size 393216 but it only ended up set as 360448
[udp @ 0x38d7b50] attempted to set receive buffer to size 393216 but it only ended up set as 360448
Output #0, rtsp, to 'rtsp://localhost:80/live/stream':
 Metadata:
 encoder : Lavf58.50.100
 Stream #0:0: Video: h264 (Main), yuv420p(progressive), 1920x1080, q=2-31, 30 fps, 30 tbr, 90k tbn, 1000k tbc
Stream mapping:
 Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
[rtsp @ 0x38fd890] Timestamps are unset in a packet for stream 0. This is deprecated and will stop working in the future. Fix your code to set the timestamps properly
[rtsp @ 0x38fd890] Non-monotonous DTS in output stream 0:0; previous: 0, current: 0; changing to 1. This may result in incorrect timestamps in the output file.
frame= 348 fps= 18 q=-1.0 size=N/A time=00:00:21.03 bitrate=N/A speed=1.09x



The issue looks like it is bandwidth related or the lack of processing power in the device being used BUT the YUYV works at a high resolution and (taking a completely different approach i.e. not using FFMPEG) I can get a very decent MPEG stream working on the same device.


So any FFMPEG experts out there who can help me with getting the correct parameters for a h264 stream ?