
Media (91)
-
Spitfire Parade - Crisis
15 May 2011
Updated: September 2011
Language: English
Type: Audio
-
Wired NextMusic
14 May 2011
Updated: February 2012
Language: English
Type: Video
-
Video d’abeille en portrait
14 May 2011
Updated: February 2012
Language: French
Type: Video
-
Sintel MP4 Surround 5.1 Full
13 May 2011
Updated: February 2012
Language: English
Type: Video
-
Carte de Schillerkiez
13 May 2011
Updated: September 2011
Language: English
Type: Text
-
Publier une image simplement
13 April 2011
Updated: February 2012
Language: French
Type: Video
Other articles (72)
-
Personalize by adding your logo, banner or background image
5 September 2013
Some themes support three customization elements: adding a logo, adding a banner, and adding a background image.
-
Publishing on MediaSPIP
13 June 2013
Can I post content from an iPad tablet? Yes, if your MediaSPIP installation is at version 0.2 or higher. If needed, contact your MediaSPIP administrator to find out.
-
Adding notes and captions to images
7 February 2011
To add notes and captions to images, the first step is to install the "Légendes" plugin.
Once the plugin is activated, you can configure it in the configuration area to change the rights for creating, modifying and deleting notes. By default, only site administrators can add notes to images.
Changes when adding a media item
When adding a media item of type "image", a new button appears above the preview (...)
On other sites (8992)
-
How to Read DJI H264 FPV Feed as OpenCV Mat Object?
29 May 2019, by Walter Morawa
TL;DR: All DJI developers would benefit from being able to decode raw H264 video stream byte arrays into a format compatible with OpenCV.
I’ve spent a lot of time looking for a solution to reading DJI’s FPV feed as an OpenCV Mat object. I am probably overlooking something fundamental, since I am not too familiar with Image Encoding/Decoding.
Future developers who come across this will likely run into many of the same issues I did. It would be great if DJI developers could use OpenCV directly without needing a third-party library.
I'm willing to use FFmpeg or JavaCV if necessary, but that's quite a hurdle for most Android developers, since we would have to deal with C++, the NDK, the terminal for testing, and so on. That seems like overkill, and both options are quite time-consuming. This JavaCV H264 conversion seems unnecessarily complex; I found it via this related question.
I believe the issue lies in the fact that we need to decode both the byte array of length 6 (info array) and the byte array with current frame info simultaneously.
Basically, DJI’s FPV feed comes in a number of formats.
- Raw H264 (MPEG4) in VideoFeeder.VideoDataListener
// The callback for receiving the raw H264 video data for the camera live view
mReceivedVideoDataListener = new VideoFeeder.VideoDataListener() {
    @Override
    public void onReceive(byte[] videoBuffer, int size) {
        //Log.d("BytesReceived", Integer.toString(videoStreamFrameNumber));
        if (videoStreamFrameNumber++ % 30 == 0) {
            // convert the video buffer to an OpenCV array
            OpenCvAndModelAsync openCvAndModelAsync = new OpenCvAndModelAsync();
            openCvAndModelAsync.execute(videoBuffer);
        }
        if (mCodecManager != null) {
            mCodecManager.sendDataToDecoder(videoBuffer, size);
        }
    }
};
- DJI also has its own Android decoder sample that uses FFmpeg to convert to YUV format.
@Override
public void onYuvDataReceived(final ByteBuffer yuvFrame, int dataSize, final int width, final int height) {
    // In this demo, we test the YUV data by saving it into JPG files.
    //DJILog.d(TAG, "onYuvDataReceived " + dataSize);
    if (count++ % 30 == 0 && yuvFrame != null) {
        final byte[] bytes = new byte[dataSize];
        yuvFrame.get(bytes);
        AsyncTask.execute(new Runnable() {
            @Override
            public void run() {
                if (bytes.length >= width * height) {
                    Log.d("MatWidth", "Made it");
                    YuvImage yuvImage = saveYuvDataToJPEG(bytes, width, height);
                    Bitmap rgbYuvConvert = convertYuvImageToRgb(yuvImage, width, height);
                    Mat yuvMat = new Mat(height, width, CvType.CV_8UC1);
                    yuvMat.put(0, 0, bytes);
                    // OpenCV stuff
                }
            }
        });
    }
}
Edit: For those who want to see DJI's YUV-to-JPEG function, here it is from the sample application:
private YuvImage saveYuvDataToJPEG(byte[] yuvFrame, int width, int height) {
    byte[] y = new byte[width * height];
    byte[] u = new byte[width * height / 4];
    byte[] v = new byte[width * height / 4];
    byte[] nu = new byte[width * height / 4];
    byte[] nv = new byte[width * height / 4];
    System.arraycopy(yuvFrame, 0, y, 0, y.length);
    Log.d("MatY", y.toString());
    for (int i = 0; i < u.length; i++) {
        v[i] = yuvFrame[y.length + 2 * i];
        u[i] = yuvFrame[y.length + 2 * i + 1];
    }
    int uvWidth = width / 2;
    int uvHeight = height / 2;
    for (int j = 0; j < uvWidth / 2; j++) {
        for (int i = 0; i < uvHeight / 2; i++) {
            byte uSample1 = u[i * uvWidth + j];
            byte uSample2 = u[i * uvWidth + j + uvWidth / 2];
            byte vSample1 = v[(i + uvHeight / 2) * uvWidth + j];
            byte vSample2 = v[(i + uvHeight / 2) * uvWidth + j + uvWidth / 2];
            nu[2 * (i * uvWidth + j)] = uSample1;
            nu[2 * (i * uvWidth + j) + 1] = uSample1;
            nu[2 * (i * uvWidth + j) + uvWidth] = uSample2;
            nu[2 * (i * uvWidth + j) + 1 + uvWidth] = uSample2;
            nv[2 * (i * uvWidth + j)] = vSample1;
            nv[2 * (i * uvWidth + j) + 1] = vSample1;
            nv[2 * (i * uvWidth + j) + uvWidth] = vSample2;
            nv[2 * (i * uvWidth + j) + 1 + uvWidth] = vSample2;
        }
    }
    //nv21test
    byte[] bytes = new byte[yuvFrame.length];
    System.arraycopy(y, 0, bytes, 0, y.length);
    for (int i = 0; i < u.length; i++) {
        bytes[y.length + (i * 2)] = nv[i];
        bytes[y.length + (i * 2) + 1] = nu[i];
    }
    Log.d(TAG,
            "onYuvDataReceived: frame index: "
                    + DJIVideoStreamDecoder.getInstance().frameIndex
                    + ", array length: "
                    + bytes.length);
    YuvImage yuver = screenShot(bytes, Environment.getExternalStorageDirectory() + "/DJI_ScreenShot", width, height);
    return yuver;
}
/**
 * Save the buffered data into a JPG image file
 */
private YuvImage screenShot(byte[] buf, String shotDir, int width, int height) {
    File dir = new File(shotDir);
    if (!dir.exists() || !dir.isDirectory()) {
        dir.mkdirs();
    }
    YuvImage yuvImage = new YuvImage(buf,
            ImageFormat.NV21,
            width,
            height,
            null);
    OutputStream outputFile = null;
    final String path = dir + "/ScreenShot_" + System.currentTimeMillis() + ".jpg";
    try {
        outputFile = new FileOutputStream(new File(path));
    } catch (FileNotFoundException e) {
        Log.e(TAG, "test screenShot: new bitmap output file error: " + e);
        //return;
    }
    if (outputFile != null) {
        yuvImage.compressToJpeg(new Rect(0, 0, width, height), 100, outputFile);
    }
    try {
        outputFile.close();
    } catch (IOException e) {
        Log.e(TAG, "test screenShot: compress yuv image error: " + e);
        e.printStackTrace();
    }
    runOnUiThread(new Runnable() {
        @Override
        public void run() {
            displayPath(path);
        }
    });
    return yuvImage;
}
- DJI also appears to have a "getRgbaData" function, but there is literally not a single example online or from DJI. Go ahead and Google "DJI getRgbaData"... There's only the reference in the API documentation, which explains the self-explanatory parameters and return values but nothing else. I couldn't figure out where to call this, and there doesn't appear to be a callback function as there is with YUV. You can't call it from the H264 byte array directly, but perhaps you can get it from the YUV data.
Option 1 is much preferable to option 2, since the YUV format has quality issues. Option 3 would also likely involve a decoder.
Here’s a screenshot that DJI’s own YUV conversion produces.
I've looked at a number of ways to improve the YUV output, removing the green and yellow tint and so on, but at this point, if DJI can't do it right, I don't want to invest resources there.
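That said, once an NV21 buffer exists (either straight from the callback or after the rearranging that saveYuvDataToJPEG does), the JPEG round trip shouldn't be necessary. As an untested sketch, something like the helper below ought to wrap it as an OpenCV Mat directly; the class is mine, not part of the DJI SDK or its samples:

import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Hypothetical helper: wraps an NV21 byte[] (full Y plane followed by an
// interleaved V/U plane, i.e. width * height * 3 / 2 bytes) as an RGBA Mat.
public final class Nv21ToMat {
    public static Mat toRgba(byte[] nv21, int width, int height) {
        // One-channel Mat tall enough for the Y plane plus the half-height VU plane.
        Mat yuv = new Mat(height + height / 2, width, CvType.CV_8UC1);
        yuv.put(0, 0, nv21);
        Mat rgba = new Mat();
        // Swap in COLOR_YUV2RGBA_NV12 if the plane order turns out to be U/V.
        Imgproc.cvtColor(yuv, rgba, Imgproc.COLOR_YUV2RGBA_NV21);
        yuv.release();
        return rgba;
    }
}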
Regarding option 1, I know FFmpeg and JavaCV seem like good options if I have to go the video-decoding route.
Moreover, from what I understand, OpenCV can't read or write video files without FFmpeg, but I'm not trying to read a video file; I'm trying to read an H264/MPEG4 byte[] array. The following code seems to produce positive results.
/* Async OpenCV Code */
private class OpenCvAndModelAsync extends AsyncTask<byte[], Void, double[]> {
    @Override
    protected double[] doInBackground(byte[]... params) { // background code; don't touch any UI components
        // get the FPV feed and convert the bytes to a Mat
        Mat videoBufMat = new Mat(4, params[0].length, CvType.CV_8UC4);
        videoBufMat.put(0, 0, params[0]);
        // if I add this in, it says the bytes are empty:
        //Mat videoBufMat = Imgcodecs.imdecode(encodeVideoBuf, Imgcodecs.IMREAD_ANYCOLOR);
        //encodeVideoBuf.release();
        Log.d("MatRgba", videoBufMat.toString());
        for (int i = 0; i < videoBufMat.rows(); i++) {
            for (int j = 0; j < videoBufMat.cols(); j++) {
                double[] rgb = videoBufMat.get(i, j);
                Log.i("Matrix", "red: " + rgb[0] + " green: " + rgb[1] + " blue: " + rgb[2] + " alpha: "
                        + rgb[3] + " Length: " + rgb.length + " Rows: "
                        + videoBufMat.rows() + " Columns: " + videoBufMat.cols());
            }
        }
        double[] center = openCVThingy(videoBufMat);
        return center;
    }

    @Override
    protected void onPostExecute(double[] center) {
        // handle UI or another async task if necessary
    }
}
Rows = 4, columns > 30k. I get lots of RGB values that seem valid, such as red = 113, green = 75, blue = 90, alpha = 220 (a made-up example); however, I also get a ton of 0,0,0,0 values. That should be somewhat okay, since black is 0,0,0 (although I would have expected the alpha to be higher) and I have a black object in my image. I also don't seem to get any white values (255, 255, 255), even though there is plenty of white area. I'm not logging the entire byte array, so they could be there, but I have yet to see them.
However, when I try to compute the contours from this image, I almost always find that the moments (center x, y) are exactly in the center of the image. This error has nothing to do with my color filter or contour algorithm: I wrote a script in Python and verified that I implemented it correctly in Android by reading a still image and getting exactly the same number of contours, positions, etc. in both Python and Android.
I noticed it has something to do with the videoBuffer byte size (bonus points if you can explain why every other length is 6):
2019-05-23 21:14:29.601 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 2425
2019-05-23 21:14:29.802 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 2659
2019-05-23 21:14:30.004 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:30.263 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6015
2019-05-23 21:14:30.507 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:30.766 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4682
2019-05-23 21:14:31.005 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:31.234 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 2840
2019-05-23 21:14:31.433 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4482
2019-05-23 21:14:31.664 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:31.927 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4768
2019-05-23 21:14:32.174 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:32.433 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4700
2019-05-23 21:14:32.668 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:32.864 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4740
2019-05-23 21:14:33.102 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 6
2019-05-23 21:14:33.365 21431-22086/com.dji.simulatorDemo D/VideoBufferSize: 4640
My questions:
I. Is this the correct way to read an H264 byte[] as a Mat?
Assuming the format is RGBA, that means rows = 4, columns = byte[].length, and CvType.CV_8UC4. Do I have the height and width correct? Something tells me the YUV height and width are off. I was getting some meaningful results, but the contours were exactly in the center, just like with the H264.
II. Does OpenCV handle MP4 in Android like this? If not, do we need to use FFmpeg or JavaCV?
III. Does the int size have something to do with it? Why is the int size occasionally 6, and other times 2400 to 6000? I've heard about the difference between this frame's information and information about the next frame, but I'm simply not knowledgeable enough to know how to apply that here.
I'm starting to think this is where the issue lies. Since I need the 6-byte array with info about the next frame, perhaps my modulo 30 is incorrect. Should I pass the 29th or 31st buffer as a format byte array for each frame? How is that done in OpenCV, or are we doomed to use the complicated FFmpeg? How would I go about joining the neighboring frames/byte arrays?
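Purely as an untested sketch of what I mean by joining them (it assumes the short 6-byte buffers are codec parameter/delimiter data that belongs with the frame that follows, which I have not confirmed), the accumulator below stitches the buffers together; it is my own code, not part of the DJI SDK:

import java.io.ByteArrayOutputStream;

// Hypothetical helper: joins the short "info" buffers onto the larger frame
// buffer that follows them, so a decoder sees one contiguous byte array per frame.
public final class FrameAssembler {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Call from VideoFeeder.VideoDataListener#onReceive. Returns a joined
    // buffer once a full-sized frame arrives, or null while still accumulating.
    public synchronized byte[] onBuffer(byte[] videoBuffer, int size) {
        pending.write(videoBuffer, 0, size);
        if (size <= 6) {
            return null; // looks like one of the short info buffers; keep buffering
        }
        byte[] joined = pending.toByteArray();
        pending.reset();
        return joined;
    }
}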
IV. Can I fix this using Imgcodecs? I was hoping OpenCV would natively handle whether a buffer holds color data for this frame or info about the next frame. I added the code below, but I am getting an empty array:
Mat videoBufMat = Imgcodecs.imdecode(new MatOfByte(params[0]), Imgcodecs.IMREAD_UNCHANGED);
This is also empty:
Mat encodeVideoBuf = new Mat(4, params[0].length, CvType.CV_8UC4);
encodeVideoBuf.put(0,0, params[0]);
Mat videoBufMat = Imgcodecs.imdecode(encodeVideoBuf, Imgcodecs.IMREAD_UNCHANGED);
V. Should I try converting the bytes into an Android JPEG and then importing it? Why does DJI's YUV decoder look so complicated? It makes me wary of trying FFmpeg or JavaCV and inclined to just stick with the Android decoder or an OpenCV decoder.
VI. At what stage should I resize the frames to speed up calculations?
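For VI, I am assuming something as simple as the snippet below, applied right after a valid frame Mat is produced and before any per-pixel work (untested; rgba and the 0.5 scale factor are placeholders of mine):

import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Hypothetical helper: downscale a decoded frame before heavier processing.
public final class FrameScaler {
    public static Mat downscale(Mat rgba) {
        Mat small = new Mat();
        // INTER_AREA is the usual choice when shrinking.
        Imgproc.resize(rgba, small, new Size(), 0.5, 0.5, Imgproc.INTER_AREA);
        return small;
    }
}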
Edit: DJI support got back to me and confirmed they don't have any samples for doing what I've described. This is a chance for us, the community, to make this available to everyone!
Upon further research, I don't think OpenCV will be able to handle this, as OpenCV's Android SDK has no functionality for video files/URLs (apart from a homegrown MJPEG codec).
So is there a way in Android to convert to MJPEG or similar in order to read it? In my application I only need 1 or 2 frames per second, so perhaps I can save each image as a JPEG.
But for real-time applications we will likely need to write our own decoder. Please help so that we can make this available to everyone! This question seems promising:
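(Separately, here is the rough, untested sketch of the homegrown decoder path I have in mind, using Android's MediaCodec. The width and height, the need to feed the short parameter buffers before the first frame, and the YUV_420_888 output format are all assumptions on my part; none of the class or method names below come from the DJI SDK.)

import android.media.Image;
import android.media.MediaCodec;
import android.media.MediaFormat;

import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical sketch: feed the raw buffers from VideoFeeder into the platform
// H.264 decoder and hand decoded frames to a callback, which can then repack
// them into an OpenCV Mat.
public final class RawH264Decoder {

    public interface FrameCallback {
        void onFrame(Image image);
    }

    private final MediaCodec codec;

    public RawH264Decoder(int width, int height) throws IOException {
        codec = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_VIDEO_AVC);
        MediaFormat format = MediaFormat.createVideoFormat(
                MediaFormat.MIMETYPE_VIDEO_AVC, width, height);
        // No output Surface, so the decoder exposes ByteBuffer/Image output.
        codec.configure(format, null, null, 0);
        codec.start();
    }

    // Feed one buffer exactly as it arrives from onReceive(); the short
    // parameter ("info") buffers must reach the decoder before the first frame.
    public void feed(byte[] data, int size, long presentationTimeUs) {
        int inIndex = codec.dequeueInputBuffer(10_000);
        if (inIndex < 0) {
            return; // decoder busy; a real implementation would queue and retry
        }
        ByteBuffer input = codec.getInputBuffer(inIndex);
        input.clear();
        input.put(data, 0, size);
        codec.queueInputBuffer(inIndex, 0, size, presentationTimeUs, 0);
    }

    // Poll for one decoded frame; returns false if none is ready yet.
    public boolean drainOneFrame(FrameCallback callback) {
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int outIndex = codec.dequeueOutputBuffer(info, 10_000);
        if (outIndex < 0) {
            return false; // covers "try again later" and "format changed"
        }
        Image image = codec.getOutputImage(outIndex);
        try {
            if (image != null) {
                callback.onFrame(image); // copy the planes if they are needed later
            }
        } finally {
            if (image != null) {
                image.close();
            }
            codec.releaseOutputBuffer(outIndex, false);
        }
        return true;
    }

    public void release() {
        codec.stop();
        codec.release();
    }
}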
-
Performance issue in streaming desktop out of a Raspberry with FFMPEG
30 April 2021, by skynet
I'm quite a newbie with FFmpeg, and I apologize in advance for any inaccuracies I may write.


My goal is to stream over UDP the Full HD (1920x1080) desktop of a Raspberry Pi 4 with 4 GB RAM.


I made many attempts, and this is currently the best-performing setup I have found (I drew on the knowledge in https://www.willusher.io/general/2020/11/15/hw-accel-encoding-rpi4).


- I installed the 64-bit Raspbian OS on the Pi from this link: https://www.raspberrypi.org/forums/viewtopic.php?t=275370

This is because the hardware H264 encoder (h264_v4l2m2m) is faster than the 32-bit encoder h264_omx.


- I downloaded and installed FFmpeg 4.4, using this configuration:

./configure --prefix="$HOME/ffmpeg_build" --pkg-config-flags="pkg-config --static" --extra-cflags="-I$HOME/ffmpeg_build/include" --extra-ldflags="-L$HOME/ffmpeg_build/lib" --extra-libs="-lpthread -lm" --bindir="$HOME/bin" --enable-gpl --enable-gnutls --disable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --disable-libx265 --enable-nonfree --arch=aarch64 --disable-libxml2 --enable-libwebp --enable-libdrm

I used FFmpeg 4.4 because of compatibility issues between h264_v4l2m2m and the shipped FFmpeg version 4.1.4.


- I use this command:

ffmpeg -f x11grab -probesize 42M -s 1920x1080 -i :0.0 -c:a copy -c:v h264_v4l2m2m -num_output_buffers 32 -num_capture_buffers 16 -b:v 8M -minrate 8M -maxrate 8M -pix_fmt rgb24 -f mpegts udp://239.255.90.60:5004?pkt_size=1316






The -pix_fmt rgb24 flag is needed because x11grab on the 64-bit OS uses the BGR0 pixel format... which means the red and blue colours are swapped!


This is the output:


ffmpeg version 4.4 Copyright (c) 2000-2021 the FFmpeg developers 
 built with gcc 8 (Debian 8.3.0-6) 
 configuration: --prefix=/home/pi/ffmpeg_build --pkg-config-flags='pkg-config --static' --extra-cflags=-I/home/pi/ffmpeg_build/include --extra-ldflags=-L/home/pi/ffmpeg_build/lib --extra-libs='-lpthread -lm' --enable-gpl --enable-gnutls --disable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --disable-libx265 --enable-nonfree --arch=aarch64 --disable-libxml2 --enable-libwebp --enable-libdrm 
 libavutil 56. 70.100 / 56. 70.100 
 libavcodec 58.134.100 / 58.134.100 
 libavformat 58. 76.100 / 58. 76.100 
 libavdevice 58. 13.100 / 58. 13.100 
 libavfilter 7.110.100 / 7.110.100 
 libswscale 5. 9.100 / 5. 9.100 
 libswresample 3. 9.100 / 3. 9.100 
 libpostproc 55. 9.100 / 55. 9.100 
Input #0, x11grab, from ':0.0': 
 Duration: N/A, start: 1619778505.990981, bitrate: 1988667 kb/s 
 Stream #0:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1920x1080, 1988667 kb/s, 29.97 fps, 56 tbr, 1000k tbn, 1000k tbc 
Stream mapping: 
 Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (h264_v4l2m2m)) 
Press [q] to stop, [?] for help 
[h264_v4l2m2m @ 0x5598d14070] Using device /dev/video11 
[h264_v4l2m2m @ 0x5598d14070] driver 'bcm2835-codec' on card 'bcm2835-codec-encode' in mplane mode 
[h264_v4l2m2m @ 0x5598d14070] requesting formats: output=RGB3 capture=H264 
[h264_v4l2m2m @ 0x5598d14070] Failed to set gop size: Invalid argument 
Output #0, mpegts, to 'udp://239.255.90.60:5004?pkt_size=1316': 
 Metadata: 
 encoder : Lavf58.76.100 
 Stream #0:0: Video: h264, rgb24(pc, progressive), 1920x1080, q=2-31, 8000 kb/s, 56 fps, 90k tbn 
 Metadata: 
 encoder : Lavc58.134.100 h264_v4l2m2m 
[mpegts @ 0x5598d12c10] Non-monotonous DTS in output stream 0:0; previous: 0, current: 0; changing to 1. This may result in incorrect timestamps in the output file. 
frame= 799 fps= 16 q=-0.0 size= 16180kB time=00:00:49.19 bitrate=2694.2kbits/s speed= 1x 



As you can see, my issue is that the fps value is about 15 when streaming an HD video with VLC in full-screen mode (the fps value depends on what is displayed on screen, which I find odd given that the encoding should be done in hardware).


So the question is: is there any hope of getting close to 25 fps, so as to have smooth playback on the receiver? Either by using a better FFmpeg command or by tweaking the Raspberry Pi?


Thanks for any help!


-
Computer crashing when using Python tools in the same script
5 February 2023, by SL1997
I am attempting to use the speech recognition toolkit VOSK and the speaker diarization package Resemblyzer to transcribe audio and then identify the speakers in the audio.


Tools:

https://github.com/alphacep/vosk-api

https://github.com/resemble-ai/Resemblyzer

I can do both things individually, but I run into issues when trying to do them in a single Python script.


I used the following guide when setting up the diarization system:




Computer specs are as follows:

Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz, 3912 MHz, 2 Core(s), 4 Logical Processor(s)

32 GB RAM

The following is my code. I am not too sure whether using threading is appropriate or whether I even implemented it correctly. How can I best optimize this code to achieve the results I am looking for without crashing?


from vosk import Model, KaldiRecognizer
from pydub import AudioSegment
import json
import sys
import os
import subprocess
import datetime
from resemblyzer import preprocess_wav, VoiceEncoder
from pathlib import Path
from resemblyzer.hparams import sampling_rate
from spectralcluster import SpectralClusterer
import threading
import queue
import gc


def recognition(queue, audio, FRAME_RATE):

    model = Model("Vosk_Models/vosk-model-small-en-us-0.15")

    rec = KaldiRecognizer(model, FRAME_RATE)
    rec.SetWords(True)

    rec.AcceptWaveform(audio.raw_data)
    result = rec.Result()

    transcript = json.loads(result)  # ["text"]

    # return transcript
    queue.put(transcript)


def diarization(queue, audio):

    wav = preprocess_wav(audio)
    encoder = VoiceEncoder("cpu")
    _, cont_embeds, wav_splits = encoder.embed_utterance(wav, return_partials=True, rate=16)
    print(cont_embeds.shape)

    clusterer = SpectralClusterer(
        min_clusters=2,
        max_clusters=100,
        p_percentile=0.90,
        gaussian_blur_sigma=1)

    labels = clusterer.predict(cont_embeds)

    def create_labelling(labels, wav_splits):

        times = [((s.start + s.stop) / 2) / sampling_rate for s in wav_splits]
        labelling = []
        start_time = 0

        for i, time in enumerate(times):
            if i > 0 and labels[i] != labels[i - 1]:
                temp = [str(labels[i - 1]), start_time, time]
                labelling.append(tuple(temp))
                start_time = time
            if i == len(times) - 1:
                temp = [str(labels[i]), start_time, time]
                labelling.append(tuple(temp))

        return labelling

    # return
    labelling = create_labelling(labels, wav_splits)
    queue.put(labelling)


def identify_speaker(queue1, queue2):

    transcript = queue1.get()
    labelling = queue2.get()

    for speaker in labelling:

        speakerID = speaker[0]
        speakerStart = speaker[1]
        speakerEnd = speaker[2]

        result = transcript['result']
        words = [r['word'] for r in result if speakerStart < r['start'] < speakerEnd]
        # return
        print("Speaker", speakerID, ":", ' '.join(words), "\n")


def main():

    queue1 = queue.Queue()
    queue2 = queue.Queue()

    FRAME_RATE = 16000
    CHANNELS = 1

    podcast = AudioSegment.from_mp3("Podcast_Audio/Film-Release-Clip.mp3")
    podcast = podcast.set_channels(CHANNELS)
    podcast = podcast.set_frame_rate(FRAME_RATE)

    first_thread = threading.Thread(target=recognition, args=(queue1, podcast, FRAME_RATE))
    second_thread = threading.Thread(target=diarization, args=(queue2, podcast))
    third_thread = threading.Thread(target=identify_speaker, args=(queue1, queue2))

    first_thread.start()
    first_thread.join()
    gc.collect()

    second_thread.start()
    second_thread.join()
    gc.collect()

    third_thread.start()
    third_thread.join()
    gc.collect()

    # transcript = recognition(podcast, FRAME_RATE)
    #
    # labelling = diarization(podcast)
    #
    # print(identify_speaker(transcript, labelling))


if __name__ == '__main__':
    main()



When I say crash, I mean everything freezes and I have to hold down the power button on the desktop and turn it back on again. There is no blue/blank screen; it just freezes in my IDE while I'm looking at my code. Any help in resolving this issue would be greatly appreciated.