
Media (1)
-
Rennes Emotion Map 2010-11
19 October 2011
Updated: July 2013
Language: French
Type: Text
Other articles (60)
-
Customize by adding your logo, banner, or background image
5 September 2013. Some themes support three customization elements: adding a logo; adding a banner; adding a background image.
-
Writing a news item
21 June 2013. Present the changes in your MediaSPIP, or news about your projects, on your MediaSPIP using the news section.
In MediaSPIP's default spipeo theme, news items are displayed at the bottom of the main page, below the editorials.
You can customize the news creation form.
News creation form: for a document of type news, the default fields are: publication date (customize the publication date) (...)
-
Publishing on MédiaSpip
13 June 2013. Can I post content from an iPad tablet?
Yes, if your installed MédiaSpip is version 0.2 or higher. If needed, contact your MédiaSpip administrator to find out.
On other sites (11517)
-
Google Speech API "Sample rate in request does not match FLAC header"
13 February 2017, by kjdion84. I'm trying to convert an mp4 video clip into a FLAC audio file and then have Google Speech spit out the words from the video so that I can detect whether specific words were said.
I have everything working except that I am getting an error from the Speech API:
{
  "error": {
    "code": 400,
    "message": "Sample rate in request does not match FLAC header.",
    "status": "INVALID_ARGUMENT"
  }
}

I am using FFmpeg to convert the mp4 into a FLAC file. I am specifying that the FLAC file should be 16 bit in the command, but when I right-click on the FLAC file, Windows tells me it is 302 kbps.
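Worth noting: the 302 kbps figure Windows shows is the bitrate, not the sample rate or bit depth. To see what sample rate actually ended up in the FLAC header, ffprobe can read it directly. A minimal sketch, shown in Python for brevity and assuming ffprobe.exe sits next to ffmpeg.exe in the same bin directory:

import subprocess

# Print the sample rate recorded in the FLAC header; if it is not
# 16000, it will not match the sampleRate declared in the request.
out = subprocess.check_output([
    "C:/wamp/www/ffmpeg/bin/ffprobe.exe",
    "-v", "error",
    "-select_streams", "a:0",
    "-show_entries", "stream=sample_rate",
    "-of", "default=noprint_wrappers=1",
    "C:/wamp/www/test.flac",
])
print(out.decode())  # e.g. sample_rate=44100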
Here is my PHP code:
// convert mp4 video to 16 bit flac audio file
$cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -c:a flac -sample_fmt s16 C:/wamp/www/test.flac';
exec($cmd, $output);

// convert flac to text so we can detect if certain words were said
$data = array(
    "config" => array(
        "encoding" => "FLAC",
        "sampleRate" => 16000,
        "languageCode" => "en-US"
    ),
    "audio" => array(
        "content" => base64_encode(file_get_contents("test.flac")),
    )
);

$json_data = json_encode($data);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=MY_API_KEY');
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $json_data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
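The likely cause: the ffmpeg command above never sets an output sample rate, so the FLAC keeps the source's rate (often 44100 Hz for mp4 audio) while the request declares 16000. Adding -ar 16000 makes the header match the request; -ac 1 (mono) is an assumption, but it is the usual recommendation for speech recognition. A sketch of the corrected conversion, shown in Python for brevity; the PHP $cmd string needs the same two extra flags:

import subprocess

# Re-encode to FLAC at 16 kHz mono so the header matches the
# sampleRate of 16000 declared in the Speech API request.
subprocess.run([
    "C:/wamp/www/ffmpeg/bin/ffmpeg.exe",
    "-i", "C:/wamp/www/test.mp4",
    "-c:a", "flac",
    "-sample_fmt", "s16",   # 16-bit samples, as in the original command
    "-ar", "16000",         # resample so the FLAC header says 16000 Hz
    "-ac", "1",             # mono; an assumption, typical for speech
    "C:/wamp/www/test.flac",
], check=True)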
-
Opencv VideoCapture not streaming RTSP link and returns "no frame!"
6 September 2023, by Asadullah Naeem. I am trying to stream my HikVision IP camera through Python. I am using
cv2.VideoCapture("rtsp_link")
which works fine on my laptop, but when I try to run the same Python script with the same OpenCV and FFmpeg versions on other devices, it gives me the following error:

Error:

[h264 @ 000002124c7f9a40] missing picture in access unit with size 47
[h264 @ 000002124c7f9a40] no frame!



So far I have tried to run this script on 5 computer devices, but it gives the same error. I am using the following Python script; my OpenCV version is 4.6.0.66 and my FFmpeg version is 2022-06-20-git-56419428a8-essentials_build-www.gyan.dev:

Python Script:


import cv2

# RTSP stream URL
rtsp_url = "rtsp://username:password@ip_address:port/Streaming/Channels/501"

# Open the RTSP stream
cap = cv2.VideoCapture(rtsp_url)

# Check if the stream was successfully opened
if not cap.isOpened():
    print("Failed to open RTSP stream.")
    exit()

# Read and display frames from the stream
while True:
    # Read a frame from the stream
    ret, frame = cap.read()

    # Check if the frame was successfully read
    if not ret:
        print("Failed to read frame from RTSP stream.")
        break

    # Display the frame
    cv2.imshow("RTSP Stream", frame)

    # Exit if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the resources
cap.release()
cv2.destroyAllWindows()




Update:


The code runs on a laptop on both wifi and mobile internet (4G), but on the other devices the RTSP link is accessible only over mobile internet (4G).
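Those h264 "missing picture / no frame!" messages are typically what the FFmpeg decoder prints when packets are lost, which fits the pattern of the stream working on some networks and not others: RTSP defaults to UDP, and forcing the transport to TCP is a common workaround. A sketch, assuming OpenCV's FFmpeg capture backend (which reads the OPENCV_FFMPEG_CAPTURE_OPTIONS environment variable):

import os
import cv2

# Must be set before the VideoCapture is created; tells the FFmpeg
# backend to carry the RTSP stream over TCP instead of UDP.
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "rtsp_transport;tcp"

rtsp_url = "rtsp://username:password@ip_address:port/Streaming/Channels/501"
cap = cv2.VideoCapture(rtsp_url, cv2.CAP_FFMPEG)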


-
Google Speech API + Go - Transcribing Audio Stream of Unknown Length
14 February 2018, by Josh. I have an RTMP stream of a video call and I want to transcribe it. I have created 2 services in Go and I'm getting results, but they're not very accurate and a lot of data seems to get lost.
Let me explain.
I have a transcode service: I use ffmpeg to transcode the video to Linear16 audio and place the output bytes onto a PubSub queue for a transcribe service to handle. Obviously there is a limit to the size of a PubSub message, and I want to start transcribing before the end of the video call. So I chunk the transcoded data into 3-second clips (not a fixed length, it just seems about right) and put them onto the queue.

The data is transcoded quite simply:
var stdout bytes.Buffer
cmd := exec.Command("ffmpeg", "-i", url, "-f", "s16le", "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", "-")
cmd.Stdout = &stdout
if err := cmd.Start(); err != nil {
    log.Fatal(err)
}

ticker := time.NewTicker(3 * time.Second)
for {
    select {
    case <-ticker.C:
        bytesConverted := stdout.Len()
        log.Infof("Converted %d bytes", bytesConverted)
        // Send the data we converted, even if there are no bytes.
        topic.Publish(ctx, &pubsub.Message{
            Data: stdout.Bytes(),
        })
        stdout.Reset()
    }
}

The transcribe service pulls messages from the queue at a rate of one every 3 seconds, which processes the audio data at about the same rate as it's being created. There are limits on the Speech API stream: it can't be longer than 60 seconds, so I stop the old stream and start a new one every 30 seconds and we never hit the limit, no matter how long the video call lasts.

This is how I'm transcribing it:
stream := prepareNewStream()
clipLengthTicker := time.NewTicker(30 * time.Second)
chunkLengthTicker := time.NewTicker(3 * time.Second)
cctx, cancel := context.WithCancel(context.TODO())
err := subscription.Receive(cctx, func(ctx context.Context, msg *pubsub.Message) {
    select {
    case <-clipLengthTicker.C:
        log.Infof("Clip length reached.")
        log.Infof("Closing stream and starting over")
        err := stream.CloseSend()
        if err != nil {
            log.Fatalf("Could not close stream: %v", err)
        }
        go getResult(stream)
        stream = prepareNewStream()
    case <-chunkLengthTicker.C:
        log.Infof("Chunk length reached.")
        bytesConverted := len(msg.Data)
        log.Infof("Received %d bytes\n", bytesConverted)
        if bytesConverted > 0 {
            if err := stream.Send(&speechpb.StreamingRecognizeRequest{
                StreamingRequest: &speechpb.StreamingRecognizeRequest_AudioContent{
                    AudioContent: msg.Data,
                },
            }); err != nil {
                resp, _ := stream.Recv()
                log.Errorf("Could not send audio: %v", resp.GetError())
            }
        }
        msg.Ack()
    }
})

I think the problem is that my 3-second chunks don't necessarily line up with the starts and ends of phrases or sentences. I suspect the Speech API is a recurrent neural network trained on full sentences rather than individual words, so starting a clip in the middle of a sentence loses some data: it can't figure out the first few words up to the natural end of the phrase. I also lose some data when switching from an old stream to a new one; some context is lost. I guess overlapping clips might help with this.
I have a couple of questions:
1) Does this architecture seem appropriate for my constraints (unknown length of audio stream, etc.)?
2) What can I do to improve accuracy and minimise lost data?
(Note: I've simplified the examples for readability. Point out if anything doesn't make sense because I've been heavy-handed in cutting the examples down.)
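On question 2, one cheap experiment is the overlapping-clips idea from above: let each chunk repeat the tail of the previous one, so words cut at a boundary arrive whole in the next clip, and then de-duplicate the repeated words in the transcripts. A language-agnostic sketch of the windowing, shown in Python for brevity; the 3 s window and 1 s overlap are arbitrary assumptions:

# Linear16 at 16 kHz mono: 2 bytes per sample.
BYTES_PER_SECOND = 16000 * 2

def overlapping_chunks(pcm: bytes, window_s: int = 3, overlap_s: int = 1):
    """Yield window_s-second chunks that each repeat the final
    overlap_s seconds of the previous chunk."""
    size = window_s * BYTES_PER_SECOND
    step = (window_s - overlap_s) * BYTES_PER_SECOND
    for start in range(0, len(pcm), step):
        yield pcm[start:start + size]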