
Other articles (51)
-
Automatic backup of SPIP channels
1 April 2010
When setting up an open platform, it is important for hosts to have reasonably regular backups to guard against any problem that may arise.
This task relies on two SPIP plugins: Saveauto, which makes a regular backup of the database as a MySQL dump (usable in phpMyAdmin); mes_fichiers_2, which creates a zip archive of the site's important data (the documents, the elements (...) -
MediaSPIP v0.2
21 June 2013
MediaSPIP 0.2 is the first stable version of MediaSPIP.
Its official release date is 21 June 2013 and it is announced here.
The zip file provided here contains only the MediaSPIP sources in the standalone version.
As with the previous version, all software dependencies must be installed manually on the server.
If you want to use this archive for a farm-mode installation, you will also need to make other modifications (...) -
Making files available
14 April 2011
By default, when it is initialized, MediaSPIP does not let visitors download files, whether they are originals or the result of their transformation or encoding. It only lets visitors view them.
However, it is possible and easy to give visitors access to these documents, in several forms.
All of this is done on the template configuration page. Go to the channel's administration area and choose in the navigation (...)
On other sites (7770)
-
Is there a case where I cannot create a file if I run a lot of threads? (feat. FFmpeg, thread)
26 June 2018, by Junburg
I am building an app that lets the user sing a duet with the singer. It uses a thread that plays the background music (mAudioPlayer), a vAudioPlayer that plays the singer's voice, a mRecordThread that records the user's voice, and a thread that joins and merges the resulting files (mp3ConcatThread).
While the user is recording, the singer's voice is muted and the background music keeps playing. When the user is not recording, the singer's voice is played. Each section therefore has to be written to an mp3 file and then merged into a single file. However, the recorded and merged files often come out wrong.
Audio processing is done with FFmpeg. I suspect the following message might be the reason.
06-26 21:37:11.084 13017-13017/com.softcode.kihnoplay I/Choreographer: Skipped 72 frames! The application may be doing too much work on its main thread.
Could this kind of error prevent a file from being generated?
If you know the answer to this question, please answer. Thank you.
The related code is below. For more information, please leave a comment.
Because the code is too long, I have included only the parts that seem relevant.
Record Thread.class
public class Record_Thread {
private static final String LOG_TAG = Record_Thread.class.getSimpleName();
private static final int SAMPLE_RATE = 44100;
private int bufferSize = 0;
private String currentOutFile = null;
private Context context;
byte RECORDER_BPP = 16;
public Record_Thread(Record_interface listener) {
mListener = listener;
Player.currentCreateFileName = SmcInfo.APPDIRPATH + "/ucc/" + Player.getCurrentTime(false);
currentOutFile = Player.currentCreateFileName + ".pcm";
}
public Record_Thread(Record_interface listener, Context context) {
mListener = listener;
RecordActivity.currentCreateFileName = SmcInfo.APPDIRPATH + "/ucc/" + RecordActivity.getCurrentTime(false);
currentOutFile = RecordActivity.currentCreateFileName + ".pcm";
this.context = context;
}
private boolean isSampleTranspo;
private boolean isRecording;
public boolean isSharding = false;
private Record_interface mListener;
private Thread mThread;
public boolean recording() {
return mThread != null;
}
public void setSampleTranspo(boolean booleanValue) {
this.isSampleTranspo = booleanValue;
}
public boolean getSampleTranspo() {
return this.isSampleTranspo;
}
long startpoint = 0;
boolean posWrite = false;
public void startRecording() {
if (mThread != null)
return;
isRecording = true;
mThread = new Thread(new Runnable() {
@Override
public void run() {
record();
}
});
mThread.start();
}
public void stopRecording() {
if (mThread == null)
return;
isRecording = false;
mThread = null;
posWrite = false;
startpoint = 0;
}
public void startFileWrite(long startpoint) {
this.startpoint = startpoint;
this.posWrite = true;
}
public void stopFileWrite() {
this.posWrite = false;
}
private void record() {
try {
Log.v(LOG_TAG, "Start");
android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_AUDIO);
bufferSize = AudioRecord.getMinBufferSize(SAMPLE_RATE, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT);
if (bufferSize == AudioRecord.ERROR || bufferSize == AudioRecord.ERROR_BAD_VALUE) {
bufferSize = SAMPLE_RATE * 2;
}
short[] audioBuffer = new short[bufferSize];
short[] audioZero = new short[bufferSize];
AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLE_RATE, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
if (record.getState() != AudioRecord.STATE_INITIALIZED) {
Log.e(LOG_TAG, "Audio Record can't initialize!");
return;
}
record.startRecording();
Log.v(LOG_TAG, "Start recording");
long shortsRead = 0;
int readsize = 0;
File tempFile = new File(currentOutFile);
if (tempFile.exists())
tempFile.delete();
FileOutputStream fos = new FileOutputStream(currentOutFile);
byte[] audiodata = new byte[bufferSize];
while (isRecording && record != null) {
readsize = record.read(audiodata, 0, audiodata.length);
if (AudioRecord.ERROR_INVALID_OPERATION != readsize && fos != null) {
try {
if (readsize > 0 && readsize <= audiodata.length) {
fos.write(audiodata, 0, readsize);//TypeCast.shortToByte(audioBuffer)
fos.flush();
}
} catch (Exception ex) {
Log.e("AudioRecorder", ex.getMessage());
}
}
ShortBuffer sb = ByteBuffer.wrap(audiodata).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
short[] samples = new short[sb.limit()];
sb.get(samples);
if (isSampleTranspo) {
mListener.onAudioDataReceived(samples);
} else {
mListener.onAudioDataReceived(audioZero);
}
if (posWrite && startpoint > 0 && readsize > 0) {
// Append this chunk to the per-section file; try-with-resources closes the stream even if the write fails.
String baseName = (context instanceof RecordActivity) ? RecordActivity.currentCreateFileName : Player.currentCreateFileName;
String sectionPath = baseName.replaceAll("/ucc/", "/tmp/") + "_" + caltime(String.valueOf((int) (startpoint / 1000)), false) + "_uv.pcm";
try (FileOutputStream pos = new FileOutputStream(sectionPath, true)) { // true = append to the existing file
pos.write(audiodata, 0, readsize); // write only the bytes actually read
pos.flush();
Log.d(LOG_TAG, "record: " + sectionPath);
} catch (IOException e) {
Log.e(LOG_TAG, "Could not append to " + sectionPath, e);
}
}
}
if (fos != null)
fos.close();
mListener.onRecordEnd();
record.stop();
record.release();
} catch (IOException e) {
Log.e("AudioRecorder", e.getMessage());
}
}
private String caltime(String sMillis, boolean timeFormat) {
double dMillis = 0;
int minutes = 0;
int seconds = 0;
int millis = 0;
String sTime;
try {
dMillis = Double.parseDouble(sMillis);
} catch (Exception e) {
System.out.println(e.getMessage());
}
seconds = (int) (dMillis / 1000) % 60;
millis = (int) (dMillis % 1000);
if (seconds > 0) {
minutes = (int) (dMillis / 1000 / 60) % 60;
if (minutes > 0) {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", minutes, seconds, millis);
else
sTime = String.format("%02d%02d%d", minutes, seconds, millis);
} else {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", 0, seconds, millis);
else
sTime = String.format("%02d%02d%d", 0, seconds, millis);
}
} else {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", 0, 0, millis);
else
sTime = String.format("%02d%02d%d", 0, 0, millis);
}
return sTime;
}
}
RecordActivity.class
public class RecordActivity extends AppCompatActivity implements Player_interface, SeekBar.OnSeekBarChangeListener {
private static final String TAG = "RecordActivity";
public Context context = this;
private LinearLayout recordLayout;
private RelativeLayout recordBtn, saveBtn;
private CircleImageView userImg, artistImg;
private TextView songTitleTxt, playTimeTxt, progressTimeTxt;
private BlurBitmap blurBitmap;
private SeekBar seekBar;
private ImageView micBg1, micBg2;
private String assPath;
private String ampPath;
private int deviceWidth, deviceHeight;
public static AssRenderView assView;
public static LinearLayout lyricsLayout;
public static int lyricsWidth, lyricsHeight, layoutWidth;
public static LinearLayout.LayoutParams assViewParams;
public static String currentCreateFileName = null;
public static String mrPath;
public static String voicePath;
private String recMusicPath;
Player_Thread mAudioPlayer = null, vAudioPlayer = null, testPlayer = null;
private Record_Thread mRecordThread;
public static Mp3Concat_Thread mMp3ConcatThread;
long lastDuration = 0L;
private boolean isSeekbarTouch = false;
private ArrayList<Long> combineList;
CNetProgressdialog createMp3Dialog;
int bufferSize = 7104;
int SAMPLE_RATE = 44100;
int RECORDER_SAMPLERATE = 44100;
byte RECORDER_BPP = 16;
@Override
protected void onCreate(@Nullable Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
removeDir(SmcInfo.APPDIRPATH + "/tmp");
setContentView(R.layout.activity_record_phone);
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.KITKAT) {
Window window = getWindow();
window.addFlags(WindowManager.LayoutParams.FLAG_LAYOUT_NO_LIMITS);
window.addFlags(WindowManager.LayoutParams.FLAG_TRANSLUCENT_NAVIGATION);
}
recordLayout = (LinearLayout) findViewById(R.id.record_layout);
userImg = (CircleImageView) findViewById(R.id.user_img);
artistImg = (CircleImageView) findViewById(R.id.artist_img);
songTitleTxt = (TextView) findViewById(R.id.song_title_txt);
progressTimeTxt = (TextView) findViewById(R.id.progress_time_txt);
playTimeTxt = (TextView) findViewById(R.id.play_time_txt);
recordBtn = (RelativeLayout) findViewById(R.id.record_btn);
saveBtn = (RelativeLayout) findViewById(R.id.save_btn);
seekBar = (SeekBar) findViewById(R.id.seek_bar);
micBg1 = (ImageView) findViewById(R.id.mic_bg_small);
micBg2 = (ImageView) findViewById(R.id.mic_bg_big);
createMp3Dialog = new CNetProgressdialog(this);
GradientDrawable drawable = new GradientDrawable();
drawable.setColors(new int[]{
Color.parseColor("#32c49b"),
Color.parseColor("#19b2c3")
});
Intent intent = getIntent();
final String artistImgPath = intent.getStringExtra("artistImgPath");
final String songTitle = intent.getStringExtra("songTitle");
assPath = intent.getStringExtra("assPath");
ampPath = intent.getStringExtra("ampPath");
String playTime = intent.getStringExtra("playTime");
blurBitmap = new BlurBitmap();
songTitleTxt.setText(songTitle);
playTimeTxt.setText(playTime);
final Bitmap artistImgBitmap = blurBitmap.toBitmap(artistImgPath);
final Bitmap userImgBitmap = BitmapFactory.decodeResource(context.getResources(), R.drawable.dummy_artist_2);
final Bitmap userBlurImg = blurBitmap.blurRenderScript(this, userImgBitmap, 25);
final Bitmap artistBlurImg = blurBitmap.blurRenderScript(this, artistImgBitmap, 25);
artistImg.setImageBitmap(artistImgBitmap);
userImg.setImageBitmap(userBlurImg);
drawable.setGradientType(GradientDrawable.LINEAR_GRADIENT);
drawable.setOrientation(GradientDrawable.Orientation.TOP_BOTTOM);
recordLayout.setBackground(drawable);
play(ampToMp3(ampPath));
mRecordThread = new Record_Thread(new Record_interface() {
@Override
public void onAudioDataReceived(short[] data) {
}
@Override
public void onRecordEnd() {
}
}, context);
mMp3ConcatThread = new Mp3Concat_Thread(new Mp3Concat_interface() {
@Override
public void onAudioDataReceived(short[] data) {
}
@Override
public void onRecordEnd() {
createMp3Dialog.dismiss();
startPrelisteningActivity(recMusicPath, songTitle);
}
}, this);
if (!mRecordThread.recording()) {
mRecordThread.startRecording();
}
final Animation animZoomIn = AnimationUtils.loadAnimation(this, R.anim.zoom_in);
final Animation animZoomOut = AnimationUtils.loadAnimation(this, R.anim.zoom_out);
final Animation animMic1 = AnimationUtils.loadAnimation(this, R.anim.bg_mic_anim_1_phone);
final Animation animMic2 = AnimationUtils.loadAnimation(this, R.anim.bg_mic_anim_2_phone);
artistImg.startAnimation(animZoomIn);
combineList = new ArrayList<Long>();
recordBtn.setOnTouchListener(new View.OnTouchListener() {
@Override
public boolean onTouch(View view, MotionEvent motionEvent) {
switch (motionEvent.getAction()) {
case MotionEvent.ACTION_DOWN: {
long currentDuration = vAudioPlayer.getCurrentDuration();
// Start recording (when the size of combineList is even)
if (mRecordThread != null) {
if (combineList.size() % 2 == 0) {
mRecordThread.startFileWrite(currentDuration);
combineList.add(currentDuration);
}
vAudioPlayer.setSampleTranspo(true);
mRecordThread.setSampleTranspo(true);
}
}
micBg1.setVisibility(View.VISIBLE);
micBg2.setVisibility(View.VISIBLE);
micBg1.startAnimation(animMic1);
micBg2.startAnimation(animMic2);
userImg.setImageBitmap(userImgBitmap);
userImg.startAnimation(animZoomIn);
artistImg.setImageBitmap(artistBlurImg);
artistImg.startAnimation(animZoomOut);
break;
case MotionEvent.ACTION_UP: {
long currentDuration = vAudioPlayer.getCurrentDuration();
if (mRecordThread != null) {
if (combineList.size() % 2 == 1) {
mRecordThread.startRecording();
mRecordThread.stopFileWrite();
File waveFile = new File(RecordActivity.currentCreateFileName.replaceAll("/ucc/", "/tmp/")
+ "_" + caltime(combineList.get(combineList.size() - 1) / 1000, false) + "_uv.pcm");
if (waveFile.exists()) {
copyWaveFile(RecordActivity.currentCreateFileName.replaceAll("/ucc/", "/tmp/") + "_" + caltime(combineList.get(combineList.size() - 1) / 1000, false) + "_uv.pcm",
RecordActivity.currentCreateFileName.replaceAll("/ucc/", "/tmp/") + "_" + caltime(combineList.get(combineList.size() - 1) / 1000, false) + "_u0.wav");
Log.d(TAG, "onTouch: " + currentCreateFileName);
if (mMp3ConcatThread != null) {
mMp3ConcatThread.startCombine(null, 3333333333333333333L, combineList.get(combineList.size() - 1), currentDuration);
}
}
combineList.add(currentDuration);
Log.d(TAG, "onTouch: " + combineList.size());
if (combineList.size() == 2) {
mMp3ConcatThread.startCombine(null, 0, combineList.get(combineList.size() - 2), currentDuration);
} else {
mMp3ConcatThread.startCombine(null, combineList.get(combineList.size() - 3), combineList.get(combineList.size() - 2), currentDuration);
}
}
vAudioPlayer.setSampleTranspo(false);
mRecordThread.setSampleTranspo(false);
}
}
micBg1.setVisibility(View.GONE);
micBg2.setVisibility(View.GONE);
micBg1.clearAnimation();
micBg2.clearAnimation();
userImg.setImageBitmap(userBlurImg);
userImg.startAnimation(animZoomOut);
artistImg.setImageBitmap(artistImgBitmap);
artistImg.startAnimation(animZoomIn);
break;
}
return false;
}
});
saveBtn.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View view) {
createMp3Dialog.show();
vAudioPlayer.setSampleTranspo(true);
vAudioPlayer.setlistenerStop(true);
if (assView != null)
assView.Destroy();
if (lyricsLayout != null) {
lyricsLayout.removeAllViews();
}
seekBar.setProgress(0);
seekBar.setMax(100);
Log.d(TAG, "donep3: " + "done");
if (mMp3ConcatThread != null) {
try {
mMp3ConcatThread.startCombine(combineList, 7777777777777777777L, combineList.get(combineList.size() - 1), lastDuration);
} catch (ArrayIndexOutOfBoundsException e) {
e.getMessage();
finish();
}
}
releaseAudioPlayer();
recMusicPath = SmcInfo.APPDIRPATH + "/ucc/" + currentCreateFileName.substring(currentCreateFileName.lastIndexOf('/') + 1, currentCreateFileName.length()) + ".mp3";
}
});
DisplayMetrics displayMetrics = new DisplayMetrics();
getWindowManager().getDefaultDisplay().getMetrics(displayMetrics);
deviceWidth = displayMetrics.widthPixels;
deviceHeight = displayMetrics.heightPixels;
lyricsWidth = deviceWidth;
lyricsHeight = deviceHeight;
Log.d(TAG, "onCreate: " + lyricsWidth + "/" + lyricsHeight);
layoutWidth = lyricsWidth * 2 / 3;
int parentAssViewHeight = ((lyricsHeight * 50) / 91) - 2;
if (layoutWidth > parentAssViewHeight)
layoutWidth = (parentAssViewHeight * 8) / 10;
assViewParams = new LinearLayout.LayoutParams(new ViewGroup.LayoutParams(layoutWidth * 2, layoutWidth));
assViewParams.gravity = Gravity.CENTER;
lyricsLayout = (LinearLayout)
findViewById(R.id.lyrics_layout);
if (assView != null) {
assView.Destroy();
}
if (lyricsLayout != null) {
lyricsLayout.removeAllViews();
}
assView = new AssRenderView(getApplicationContext(), layoutWidth * 13 / 10, layoutWidth);
File assFile = new File(assPath);
if (assFile.exists()) {
assView.ReadASSFile(assFile.toString(), true, layoutWidth * 2, layoutWidth * 5 / 7);
}
lyricsLayout.addView(assView, assViewParams);
lyricsLayout.setGravity(Gravity.CENTER);
assView.ShowASS(true);
seekBar.setOnSeekBarChangeListener(this);
seekBar.setProgress(0);
seekBar.setMax(100);
}
private void startPrelisteningActivity(String recMusicPath, String songTitle) {
Intent intent = new Intent(RecordActivity.this, PrelisteningActivity.class);
intent.putExtra("recMusicPath", recMusicPath);
intent.putExtra("songTitle", songTitle);
startActivityForResult(intent, 1);
}
private String[] ampToMp3(String ampPath) {
String[] pathArray = new String[2];
try {
File ampFile = new File(ampPath);
String ampName = ampFile.getName();
int size;
BufferedInputStream buf = null;
FileInputStream fis = null;
size = (int) ampFile.length();
byte[] bytes = new byte[size];
fis = new FileInputStream(ampFile);
buf = new BufferedInputStream(fis, 8 * 1024);
buf.read(bytes, 0, bytes.length);
byte[] vocalbytes = AMPFileUtility.getByteData(bytes, "voice");
byte[] mrbytes = AMPFileUtility.getByteData(bytes, "mr");
voicePath = SmcInfo.APPDIRPATH + "/audio/" + ampName.replaceAll(".amp", "") + "_voice.mp3";
mrPath = SmcInfo.APPDIRPATH + "/audio/" + ampName.replaceAll(".amp", "") + "_mr.mp3";
BufferedOutputStream bosVocal = new BufferedOutputStream(new FileOutputStream(voicePath));
bosVocal.write(vocalbytes);
bosVocal.flush();
bosVocal.close();
BufferedOutputStream bosMr = new BufferedOutputStream(new FileOutputStream(mrPath));
bosMr.write(mrbytes);
bosMr.flush();
bosMr.close();
} catch (Exception e) {
e.getMessage();
}
pathArray[0] = voicePath;
pathArray[1] = mrPath;
return pathArray;
}
private void play(String[] pathArray) {
releaseAudioPlayer();
String voicePath = pathArray[0];
String mrPath = pathArray[1];
mAudioPlayer = new Player_Thread();
mAudioPlayer.setOnAudioStreamInterface(this);
mAudioPlayer.setUrlString(mrPath);
mAudioPlayer.setlistenerStop(true);
vAudioPlayer = new Player_Thread();
vAudioPlayer.setOnAudioStreamInterface(this);
vAudioPlayer.setUrlString(voicePath);
vAudioPlayer.setlistenerStop(false);
try {
mAudioPlayer.play();
vAudioPlayer.play();
} catch (IOException e) {
e.printStackTrace();
}
}
private void releaseAudioPlayer() {
if (mAudioPlayer != null) {
mAudioPlayer.stop();
mAudioPlayer.release();
mAudioPlayer = null;
}
if (vAudioPlayer != null) {
vAudioPlayer.stop();
vAudioPlayer.release();
vAudioPlayer = null;
}
if (mRecordThread != null) {
mRecordThread.stopRecording();
}
}
public static String getCurrentTime(boolean dateForm) {
SimpleDateFormat dateFormat;
if (dateForm)
dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS"); // SSS is the milliseconds field
else
dateFormat = new SimpleDateFormat("yyyyMMdd_HHmmssSSS");
Calendar calendar = Calendar.getInstance();
return dateFormat.format(calendar.getTime());
}
private String caltime(long sMillis, boolean timeFormat) {
double dMillis = 0;
int minutes = 0;
int seconds = 0;
int millis = 0;
String sTime;
try {
dMillis = Double.parseDouble(String.valueOf(sMillis));
} catch (Exception e) {
System.out.println(e.getMessage());
}
seconds = (int) (dMillis / 1000) % 60;
millis = (int) (dMillis % 1000);
if (seconds > 0) {
minutes = (int) (dMillis / 1000 / 60) % 60;
if (minutes > 0) {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", minutes, seconds, millis);
else
sTime = String.format("%02d%02d%d", minutes, seconds, millis);
} else {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", 0, seconds, millis);
else
sTime = String.format("%02d%02d%d", 0, seconds, millis);
}
} else {
if (timeFormat)
sTime = String.format("%02d:%02d.%d", 0, 0, millis);
else
sTime = String.format("%02d%02d%d", 0, 0, millis);
}
Log.d(TAG, "caltime: " + sTime);
return sTime;
}
public void copyWaveFile(String inFilename, String outFilename) {
FileInputStream in = null;
FileOutputStream out = null;
long totalAudioLen = 0;
long totalDataLen = totalAudioLen + 36;
long longSampleRate = SAMPLE_RATE;
int channels = 2; // Byte storage works perfectly with 1. AudioFormat.CHANNEL_IN_MONO: channels = 1; AudioFormat.CHANNEL_IN_STEREO: channels = 2;
long byteRate = RECORDER_BPP * SAMPLE_RATE * channels / 8;
try {
in = new FileInputStream(inFilename);
out = new FileOutputStream(outFilename);
byte[] data = new byte[bufferSize];
totalAudioLen = in.getChannel().size();
totalDataLen = totalAudioLen + 36;
AppLog.logString("File size: " + totalDataLen);
WriteWaveFileHeader(out, totalAudioLen, totalDataLen, longSampleRate, channels, byteRate);
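int bytesRead;
while ((bytesRead = in.read(data)) != -1) {
out.write(data, 0, bytesRead); // write only the bytes read, so the data size matches the header
}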
in.close();
out.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
public void WriteWaveFileHeader(FileOutputStream out, long totalAudioLen, long totalDataLen, long longSampleRate, int channels, long byteRate) throws IOException {
byte[] header = new byte[44];
header[0] = 'R';
header[1] = 'I';
header[2] = 'F';
header[3] = 'F';
header[4] = (byte) (totalDataLen & 0xff);
header[5] = (byte) ((totalDataLen >> 8) & 0xff);
header[6] = (byte) ((totalDataLen >> 16) & 0xff);
header[7] = (byte) ((totalDataLen >> 24) & 0xff);
header[8] = 'W';
header[9] = 'A';
header[10] = 'V';
header[11] = 'E';
header[12] = 'f';
header[13] = 'm';
header[14] = 't';
header[15] = ' ';
header[16] = 16;
header[17] = 0;
header[18] = 0;
header[19] = 0;
header[20] = 1;
header[21] = 0;
header[22] = (byte) channels;
header[23] = 0;
header[24] = (byte) (longSampleRate & 0xff);
header[25] = (byte) ((longSampleRate >> 8) & 0xff);
header[26] = (byte) ((longSampleRate >> 16) & 0xff);
header[27] = (byte) ((longSampleRate >> 24) & 0xff);
header[28] = (byte) (byteRate & 0xff);
header[29] = (byte) ((byteRate >> 8) & 0xff);
header[30] = (byte) ((byteRate >> 16) & 0xff);
header[31] = (byte) ((byteRate >> 24) & 0xff);
header[32] = (byte) (2 * 16 / 8);
header[33] = 0;
header[34] = RECORDER_BPP;
header[35] = 0;
header[36] = 'd';
header[37] = 'a';
header[38] = 't';
header[39] = 'a';
header[40] = (byte) (totalAudioLen & 0xff);
header[41] = (byte) ((totalAudioLen >> 8) & 0xff);
header[42] = (byte) ((totalAudioLen >> 16) & 0xff);
header[43] = (byte) ((totalAudioLen >> 24) & 0xff);
out.write(header, 0, 44);
}
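The PCM-to-WAV step in the code above can be sketched in plain Java as follows. This is a minimal illustration rather than the app's actual implementation: the class name PcmToWav and its method names are hypothetical, and it assumes uncompressed 16-bit PCM input like the recorder produces. It builds the 44-byte header with a little-endian ByteBuffer and copies only the bytes that each read call returns, so the data length declared in the header matches what is written.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the original app. */
final class PcmToWav {

    /** Builds a canonical 44-byte RIFF/WAVE header for uncompressed PCM. */
    static byte[] header(long pcmBytes, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt((int) (pcmBytes + 36));               // file size minus the first 8 bytes
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                  // size of the fmt chunk
        b.putShort((short) 1);                         // audio format 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt((int) pcmBytes);                      // size of the data chunk
        return b.array();
    }

    /** Writes the header, then streams the PCM data, honouring short reads. */
    static void copy(InputStream pcm, long pcmBytes, int sampleRate, int channels,
                     int bitsPerSample, OutputStream wav) throws IOException {
        wav.write(header(pcmBytes, sampleRate, channels, bitsPerSample));
        byte[] buffer = new byte[8192];
        int n;
        while ((n = pcm.read(buffer)) != -1) {
            wav.write(buffer, 0, n);                   // only the bytes actually read
        }
    }
}
```

Because the sketch touches only java.io streams, it can be called from a background thread so that the file copy does not run on the main thread.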
-
What role does bit rate play in the accuracy of Google Speech To Text transcription?
11 November 2020, by Jash Shah
I am helping a client convert a video file using ffmpeg. They originally used -b:a 64k while transcoding their video to audio at a sampling rate of 44100 (the -ar 44100 argument in ffmpeg). Their objective is to generate the most accurate transcriptions using the Google Cloud Speech To Text API.

While combing through their documentation I did not find anything on how bit rate impacts the accuracy of the transcription. So my question is: would using a higher bit rate such as 128k help me get better transcriptions, or does it not matter?
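
To put the two figures in the question in perspective, the arithmetic below compares them with the bit rate of uncompressed PCM at the same sampling rate. It is only a sketch: the 16-bit sample size and the mono channel count are assumptions that the question does not state, the class name BitRates is hypothetical, and the numbers say nothing by themselves about transcription accuracy.

```java
/** Hypothetical helper for illustration only. */
final class BitRates {

    /** Bit rate of uncompressed PCM in bits per second. */
    static long pcmBitsPerSecond(int sampleRate, int bitsPerSample, int channels) {
        return (long) sampleRate * bitsPerSample * channels;
    }

    public static void main(String[] args) {
        // Assumed: 16-bit mono at the 44100 Hz sampling rate from the question.
        long pcm = pcmBitsPerSecond(44100, 16, 1);
        System.out.println("Uncompressed PCM: " + pcm + " bit/s");
        // Compression ratio of each encoded bit rate relative to raw PCM.
        System.out.println("Ratio at 64k:  " + (double) pcm / 64_000);
        System.out.println("Ratio at 128k: " + (double) pcm / 128_000);
    }
}
```

Under these assumptions raw PCM needs 705600 bit/s, so -b:a 64k keeps roughly one eleventh of the raw data rate and 128k roughly one fifth or one sixth.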

-
Watson NarrowBand Speech to Text not accepting ogg file
19 January 2017, by Bob Dill
A NodeJS app uses ffmpeg to create ogg files from mp3 and mp4. If the source file is broadband, Watson Speech to Text accepts the file with no issues. If the source file is narrowband, Watson Speech to Text fails to read the ogg file. I have tested the output from ffmpeg, and the narrowband ogg file has the same audio content as the mp3 file (e.g. I can listen to it and hear the same people). And yes, I am changing the call to Watson to correctly specify the model and content_type. Code follows:
exports.createTranscript = function(req, res, next)
{ var _name = getNameBase(req.body.movie);
var _type = getType(req.body.movie);
var _voice = (_type == "mp4") ? "en-US_BroadbandModel" : "en-US_NarrowbandModel" ;
var _contentType = (_type == "mp4") ? "audio/ogg" : "audio/basic" ;
var _audio = process.cwd()+"/HTML/movies/"+_name+'ogg';
var transcriptFile = process.cwd()+"/HTML/movies/"+_name+'json';
speech_to_text.createSession({model: _voice}, function(error, session) {
if (error) {console.log('error:', error);}
else
{
var params = { content_type: _contentType, continuous: true,
audio: fs.createReadStream(_audio),
session_id: session.session_id
};
speech_to_text.recognize(params, function(error, transcript) {
if (error) {console.log('error:', error);}
else
{ fs.writeFile(transcriptFile, JSON.stringify(transcript), function(err) {if (err) {console.log(err);}});
res.send(transcript);
}
});
}
});
}
_type is either mp3 (narrowband from phone recording) or mp4 (broadband)
model: _voice has been traced to ensure correct setting
content_type: _contentType has been traced to ensure correct setting
Any ogg file submitted to Speech to Text with narrowband settings fails with
Error: No speech detected for 30s.
Tested with both real narrowband files and asking Watson to read a broadband ogg file (created from mp4) as narrowband. Same error message. What am I missing?
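
One thing worth checking in the code above is that _contentType becomes "audio/basic" whenever the source is not mp4, although the file being uploaded is an ogg file in both cases. The sketch below separates the two decisions, choosing the model from the source type and the content type from the file that is actually sent. It is written in Java purely for illustration: the class and method names are hypothetical and not part of any Watson SDK, and it assumes that "audio/basic" denotes raw 8 kHz mu-law audio rather than an Ogg container, which should be verified against the service documentation.

```java
import java.util.Locale;

/** Hypothetical helper; the names are illustrative and not part of any SDK. */
final class SttRequestSettings {

    /** Mirrors the model choice in the question: mp4 sources are broadband, the rest narrowband. */
    static String modelFor(String sourceType) {
        return "mp4".equals(sourceType) ? "en-US_BroadbandModel" : "en-US_NarrowbandModel";
    }

    /** Derives the content type from the uploaded file, independently of the model. */
    static String contentTypeFor(String audioFileName) {
        String name = audioFileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".ogg")) {
            return "audio/ogg";
        }
        if (name.endsWith(".wav")) {
            return "audio/wav";
        }
        if (name.endsWith(".flac")) {
            return "audio/flac";
        }
        throw new IllegalArgumentException("Unsupported audio file: " + audioFileName);
    }
}
```

With this split, a phone recording converted to ogg would be sent with the narrowband model and the audio/ogg content type.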