
Media (1)
-
Rennes Emotion Map 2010-11
19 October 2011
Updated: July 2013
Language: French
Type: Text
Other articles (48)
-
MediaSPIP v0.2
21 June 2013
MediaSPIP 0.2 is the first stable version of MediaSPIP.
Its official release date is 21 June 2013 and it is announced here.
The zip file provided here contains only the MediaSPIP sources in the standalone version.
As with the previous version, all of the software dependencies must be installed manually on the server.
If you want to use this archive for a farm-mode installation, you will also need to make further modifications (...)
-
MediaSPIP version 0.1 Beta
16 April 2011
MediaSPIP 0.1 beta is the first version of MediaSPIP declared "usable".
The zip file provided here contains only the MediaSPIP sources in the standalone version.
For a working installation, all of the software dependencies must be installed manually on the server.
If you want to use this archive for a farm-mode installation, you will also need to make further modifications (...)
-
List of compatible distributions
26 April 2011
The table below lists the Linux distributions compatible with the automated installation script of MediaSPIP.

Distribution name    Version name            Version number
Debian               Squeeze                 6.x.x
Debian               Wheezy                  7.x.x
Debian               Jessie                  8.x.x
Ubuntu               The Precise Pangolin    12.04 LTS
Ubuntu               The Trusty Tahr         14.04

If you want to help us improve this list, you can provide us access to a machine whose distribution is not mentioned above or send the necessary fixes to add (...)
On other sites (4906)
-
Convert MediaRecorder blobs to a type that Google Speech-to-Text can transcribe
5 January 2021, by Manesha Ramesh
I am making an app where the user's browser records the user speaking and sends the audio to the server, which then passes it on to the Google Speech-to-Text interface. I am using MediaRecorder to get 1-second blobs which are sent to a server. On the server side, I send these blobs over to the Google Speech-to-Text interface. However, I am getting empty transcriptions.



I know what the issue is. MediaRecorder's default MIME type is audio/webm;codecs=opus, which is not accepted by Google's Speech-to-Text API. After doing some research, I realize I need to use ffmpeg to convert the blobs to LINEAR16. However, ffmpeg only accepts audio FILES and I want to be able to convert BLOBS. Then I can send the resulting converted blobs over to the API interface.
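
For reference, ffmpeg is not actually limited to files on disk: it can read from standard input ("pipe:0") and write to standard output ("pipe:1"), so a Buffer received over the WebSocket can be piped through it entirely in memory. Below is a minimal, untested sketch of that idea in Node.js; the function name webmToLinear16 and the exact flag values are illustrative assumptions, not part of the original question. One caveat: MediaRecorder chunks after the first do not repeat the WebM container header, so individual 1-second blobs generally have to be decoded as one continuous stream rather than converted one at a time.

// Hypothetical sketch: pipe a WebM/Opus buffer through ffmpeg and get raw
// 16 kHz mono LINEAR16 (signed 16-bit PCM) back, without touching the disk.
const { spawn } = require('child_process');

function webmToLinear16(webmBuffer) {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn('ffmpeg', [
      '-i', 'pipe:0',         // read the WebM/Opus bytes from stdin
      '-f', 's16le',          // output raw signed 16-bit little-endian PCM
      '-acodec', 'pcm_s16le',
      '-ar', '16000',         // sample rate expected by the recognize config
      '-ac', '1',             // mono
      'pipe:1',               // write the converted audio to stdout
    ]);

    const chunks = [];
    ffmpeg.stdout.on('data', chunk => chunks.push(chunk));
    ffmpeg.on('error', reject);
    ffmpeg.on('close', code => {
      if (code === 0) resolve(Buffer.concat(chunks));
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });

    ffmpeg.stdin.write(webmBuffer);
    ffmpeg.stdin.end();
  });
}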



server.js



wsserver.on('connection', socket => {
  console.log("Listening on port 3002")
  const audio = {
    content: null
  }
  socket.on('message', function (message) {
    // const buffer = new Int16Array(message, 0, Math.floor(data.byteLength / 2));
    // console.log(`received from a client: ${new Uint8Array(message)}`);
    // console.log(message);
    audio.content = message.toString('base64')
    console.log(audio.content);
    livetranscriber.createRequest(audio).then(request => {
      livetranscriber.recognizeStream(request);
    });
  });
});




livetranscriber



// Assumes the Google Cloud Speech client library: npm install @google-cloud/speech
const speech = require('@google-cloud/speech');
const client = new speech.SpeechClient();

module.exports = {
  createRequest: function (audio) {
    const encoding = 'LINEAR16';
    const sampleRateHertz = 16000;
    const languageCode = 'en-US';
    return new Promise((resolve, reject) => {
      const request = {
        audio: audio,
        config: {
          encoding: encoding,
          sampleRateHertz: sampleRateHertz,
          languageCode: languageCode,
        },
        interimResults: false, // If you want interim results, set this to true
      };
      resolve(request);
    });
  },

  recognizeStream: async function (request) {
    const [response] = await client.recognize(request);
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    console.log(`Transcription: ${transcription}`);
    // console.log(message);
    // message.pipe(recognizeStream);
  },
};




client



recorder.ondataavailable = function (e) {
  console.log('Data', e.data);

  var ws = new WebSocket('ws://localhost:3002/websocket');
  ws.onopen = function () {
    console.log("opening connection");

    // const stream = websocketStream(ws)
    // const duplex = WebSocket.createWebSocketStream(ws, { encoding: 'utf8' });
    var blob = new Blob([e.data], { 'type': 'audio/wav' });
    ws.send(blob);
    // e.data).pipe(stream);
    // console.log(e.data);
    console.log("Sent the message")
  };

  // chunks.push(e.data);
  // socket.emit('data', e.data);
}



-
Revision b50e518ab6: Require webm when explicitly requested
31 January 2015, by Johann
Changed Paths:
Modify /vpxenc.c
Require webm when explicitly requested
https://code.google.com/p/webm/issues/detail?id=906
Change-Id: I72841078ff81152d21d84ccf4d5548e757685a6d
-
What's the most desirable way to capture system display and audio in the form of individual encoded audio and video packets in Go (language)? [closed]
11 January 2023, by Tiger Yang
Question (read the context below first):


For those of you familiar with the capabilities of Go, is there a better way to go about all this? Since ffmpeg is so ubiquitous, I'm sure it's been optimized to perfection, but what's the best way to capture system display and audio in the form of individual encoded audio and video packets in Go, so that they can then be sent via webtransport-go? I wish for it to prioritize efficiency and low latency, and ideally capture and encode the framebuffer directly like ffmpeg does.


Thanks! I have many other questions about this, but I think it's best to ask as I go.


Context and what I've done so far:


I'm writing remote desktop software for my personal use because of grievances with the current solutions out there. At the moment, it consists of a web app that uses the WebTransport API to send input datagrams and receive AV packets on two dedicated unidirectional streams, and the WebCodecs API to decode these packets. On the server side, I originally planned to use Python with the aioquic library as a WebTransport server. Upon connection and authentication, the server would start ffmpeg as a subprocess with this command:


ffmpeg -init_hw_device d3d11va -filter_complex ddagrab=video_size=1920x1080:framerate=60 -vcodec hevc_nvenc -tune ll -preset p7 -spatial_aq 1 -temporal_aq 1 -forced-idr 1 -rc cbr -b:v 400K -no-scenecut 1 -g 216000 -f hevc -


What I really appreciate about this is that it uses Windows' Desktop Duplication API to copy the framebuffer of my GPU and hand it directly to the on-die hardware encoder with zero round trips to the CPU. I think it's about as efficient and elegant a solution as I can manage. It then outputs the encoded stream to stdout, which Python can read and send to the client.


As for the audio, there is another ffmpeg instance:


ffmpeg -f dshow -channels 2 -sample_rate 48000 -sample_size 16 -audio_buffer_size 15 -i audio="RD Audio (High Definition Audio Device)" -acodec libopus -vbr on -application audio -mapping_family 0 -apply_phase_inv true -b:a 25K -fec false -packet_loss 0 -map 0 -f data -


which listens to a physical loopback interface, which is literally just a short wire bridging the front panel headphone and microphone jacks (I'm aware of the quality loss of converting to analog and back, but the audio is then crushed down to 25 kbps so it's fine).


Unfortunately, aioquic was not easy to work with in my opinion, and I found webtransport-go (https://github.com/adriancable/webtransport-go), which was a hell of a lot better in both simplicity and documentation. However, now I'm dealing with a whole new language, and I wanted to ask the question above.


EDIT: Here's the code for my server so far:




package main

import (
	"bytes"
	"context"
	"fmt"
	"log"
	"net/http"
	"os/exec"
	"time"

	"github.com/adriancable/webtransport-go"
)

func warn(str string) {
	fmt.Printf("\n===== WARNING ===================================================================================================\n %s\n=================================================================================================================\n", str)
}

func main() {

	password := []byte("abc")

	videoString := []string{
		"ffmpeg",
		"-init_hw_device", "d3d11va",
		"-filter_complex", "ddagrab=video_size=1920x1080:framerate=60",
		"-vcodec", "hevc_nvenc",
		"-tune", "ll",
		"-preset", "p7",
		"-spatial_aq", "1",
		"-temporal_aq", "1",
		"-forced-idr", "1",
		"-rc", "cbr",
		"-b:v", "500K",
		"-no-scenecut", "1",
		"-g", "216000",
		"-f", "hevc", "-",
	}

	audioString := []string{
		"ffmpeg",
		"-f", "dshow",
		"-channels", "2",
		"-sample_rate", "48000",
		"-sample_size", "16",
		"-audio_buffer_size", "15",
		"-i", "audio=RD Audio (High Definition Audio Device)",
		"-acodec", "libopus",
		"-mapping_family", "0",
		"-b:a", "25K",
		"-map", "0",
		"-f", "data", "-",
	}

	connected := false

	http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
		session := request.Body.(*webtransport.Session)

		session.AcceptSession()
		fmt.Println("\nAccepted incoming WebTransport connection.")
		fmt.Println("Awaiting authentication...")

		authData, err := session.ReceiveMessage(session.Context()) // Waits here till first datagram
		if err != nil { // if client closes connection before sending anything
			fmt.Println("\nConnection closed:", err)
			return
		}

		if len(authData) >= 2 && bytes.Equal(authData[2:], password) {
			if connected {
				session.CloseSession()
				warn("Client has authenticated, but a session is already taking place! Connection closed.")
				return
			} else {
				connected = true
				fmt.Println("Client has authenticated!\n")
			}
		} else {
			session.CloseSession()
			warn("Client has failed authentication! Connection closed. (" + string(authData[2:]) + ")")
			return
		}

		videoStream, _ := session.OpenUniStreamSync(session.Context())

		// Start ffmpeg for video capture and forward its stdout to the video stream.
		videoCmd := exec.Command(videoString[0], videoString[1:]...)
		go func() {
			videoOut, _ := videoCmd.StdoutPipe()
			videoCmd.Start()

			buffer := make([]byte, 15000)
			for {
				n, err := videoOut.Read(buffer)
				if err != nil {
					break
				}
				if n > 0 {
					videoStream.Write(buffer[:n])
				}
			}
		}()

		time.Sleep(50 * time.Millisecond)

		audioStream, _ := session.OpenUniStreamSync(session.Context())

		// Start ffmpeg for audio capture and forward its stdout to the audio stream.
		audioCmd := exec.Command(audioString[0], audioString[1:]...)
		go func() {
			audioOut, _ := audioCmd.StdoutPipe()
			audioCmd.Start()

			buffer := make([]byte, 15000)
			for {
				n, err := audioOut.Read(buffer)
				if err != nil {
					break
				}
				if n > 0 {
					audioStream.Write(buffer[:n])
				}
			}
		}()

		// Read input datagrams from the client until the connection closes.
		for {
			data, err := session.ReceiveMessage(session.Context())
			if err != nil {
				videoCmd.Process.Kill()
				audioCmd.Process.Kill()

				connected = false

				fmt.Println("\nConnection closed:", err)
				break
			}

			if len(data) > 0 && data[0] == byte(0) {
				fmt.Printf("Received mouse datagram: %s\n", data)
			}
		}

	})

	server := &webtransport.Server{
		ListenAddr: ":1024",
		TLSCert:    webtransport.CertFile{Path: "SSL/fullchain.pem"},
		TLSKey:     webtransport.CertFile{Path: "SSL/privkey.pem"},
		QuicConfig: &webtransport.QuicConfig{
			KeepAlive:      false,
			MaxIdleTimeout: 3 * time.Second,
		},
	}

	fmt.Println("Launching WebTransport server at", server.ListenAddr)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if err := server.Run(ctx); err != nil {
		log.Fatal(err)
	}

}