I have a TTS (text-to-speech) system that produces audio as a numpy array whose data type is np.float32. This system is running in the backend and I want to transfer the data from the backend to the frontend to be played when a certain event happens.
The obvious solution to this problem is to write the audio data to disk as a wav file and then pass the path to the frontend to be played. This worked fine, but I don't want to do that for administrative reasons. I want to transfer only the audio data (the numpy array) to the frontend.
What I have done till now is the following:
text = "Hello"
wav, sr = tts_model.synthesize(text)
# a dict needs a colon here; {"snd", ...} would build a set instead
data = {"snd": wav.tolist()}
flask_response = app.response_class(response=flask.json.dumps(data),
                                    status=200,
                                    mimetype='application/json')
# then return flask_response
// gets wav from backend
let arrayData = new Float32Array(wav);
let blob = new Blob([ arrayData ]);
let url = URL.createObjectURL(blob);
let snd = new Audio(url);
snd.play()
That is what I have done so far, but the JavaScript throws the following error:
Uncaught (in promise) DOMException: Failed to load because no supported source was found.
This is the gist of what I'm trying to do. I'm sorry that you can't reproduce the error, as you don't have the TTS system, so this is an audio file generated by it which you can use to see what I'm doing wrong.
I have also tried:
Converting the audio data to np.int8 and np.int16, to be cast in the JavaScript by Int8Array() and Int16Array() respectively.
Passing different types when creating the blob, such as {"type": "application/text;charset=utf-8;"} and {"type": "audio/ogg; codecs=opus;"}.
I have been struggling with this issue for a long time, so any help is appreciated!
Your sample as-is does not work out of the box (it does not play).
However, it does play with the following:
Flask
from flask import Flask, render_template, json
import base64

app = Flask(__name__)

with open("sample_16.wav", "rb") as binary_file:
    # Read the whole file at once
    data = binary_file.read()
    wav_file = base64.b64encode(data).decode('UTF-8')

@app.route('/wav')
def hello_world():
    data = {"snd": wav_file}
    res = app.response_class(response=json.dumps(data),
                             status=200,
                             mimetype='application/json')
    return res

@app.route('/')
def stat():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug=True)
js
<audio controls></audio>
<script>
  ;(async _ => {
    const res = await fetch('/wav')
    let {snd: b64buf} = await res.json()
    document.querySelector('audio').src = "data:audio/wav;base64, " + b64buf;
  })()
</script>
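The key difference from the question's attempt is that the /wav route sends the bytes of a complete WAV file, while a Float32Array wrapped in a Blob holds only bare samples with no container header for the browser to recognise. The following is a minimal sketch of that difference using only the Python standard library. The sample rate and the sine tone are hypothetical stand-ins for the TTS output, not values from the original post.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # hypothetical rate, chosen only for illustration

# 0.1 s of a 440 Hz tone as 16-bit integers (stand-in for the TTS output)
samples = [int(32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE // 10)]

# Bare little-endian samples: roughly what a typed array in a Blob contains
raw_bytes = struct.pack("<%dh" % len(samples), *samples)

# The same samples wrapped in a WAV container, written to memory
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)           # mono
    wav_file.setsampwidth(2)           # 2 bytes per sample = 16-bit
    wav_file.setframerate(SAMPLE_RATE)
    wav_file.writeframes(raw_bytes)
wav_bytes = buffer.getvalue()

print(wav_bytes[:4])                    # b'RIFF'
print(len(wav_bytes) - len(raw_bytes))  # 44
```

The 44 extra bytes are the RIFF/WAVE header that records the channel count, sample width and sample rate. Without it, the browser has no way to tell what the bytes mean.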
So, what I ended up doing (using this solution) that solved my problem was to:
1. Convert the audio data from np.float32 to np.int16:
wav = (wav * np.iinfo(np.int16).max).astype(np.int16)
2. Write the data to a temporary wav file using scipy.io.wavfile:
from scipy.io import wavfile
wavfile.write(".tmp.wav", sr, wav)
3. Read the bytes back from the temporary file:
with open(".tmp.wav", "rb") as fin:
    wav = fin.read()
4. Delete the temporary file:
import os
os.remove(".tmp.wav")
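One caveat about the float-to-int16 step: scaling by the int16 maximum assumes every sample lies within [-1.0, 1.0]. The post does not say whether the TTS model guarantees that range, and samples outside it can overflow when cast. The sketch below shows the same conversion with clamping, in plain Python. The helper name float_to_pcm16 is hypothetical. With NumPy, calling np.clip(wav, -1.0, 1.0) before scaling should have the same effect.

```python
import struct

INT16_MAX = 32767  # same value as np.iinfo(np.int16).max

def float_to_pcm16(samples):
    """Convert floats (nominally in [-1.0, 1.0]) to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))       # clamp out-of-range samples
        ints.append(int(clipped * INT16_MAX))  # truncate toward zero
    return struct.pack("<%dh" % len(ints), *ints)

pcm = float_to_pcm16([0.0, 0.5, -1.0, 1.7])
print(struct.unpack("<4h", pcm))  # (0, 16383, -32767, 32767)
```

The last input, 1.7, is clamped to full scale instead of wrapping around to an unrelated value.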
Right after synthesis, you can convert the numpy array wav to a bytes object and then encode it with base64.
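import base64
import io
from scipy.io.wavfile import write

# write the WAV file into an in-memory buffer instead of onto disk
byte_io = io.BytesIO()
write(byte_io, sr, wav)
# getvalue() returns the whole buffer regardless of the stream position
wav_bytes = byte_io.getvalue()
audio_data = base64.b64encode(wav_bytes).decode('UTF-8')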
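Before handing the audio_data string to the frontend, it can help to check that it decodes to a readable WAV file. The following is a small sketch using the standard-library wave module. The helper name describe_wav_payload and the silent stand-in payload are hypothetical. Note that the wave module reads integer PCM data, so this check assumes the samples were converted to an integer type such as np.int16 before writing. A float32 WAV would be rejected here even if a browser could play it.

```python
import base64
import io
import wave

def describe_wav_payload(audio_data):
    """Decode a base64 string and return (channels, sample_width, frame_rate, n_frames)."""
    wav_bytes = base64.b64decode(audio_data)
    # wave.open raises wave.Error if the bytes are not a readable PCM WAV file
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth(),
                wav_file.getframerate(), wav_file.getnframes())

# Build a tiny stand-in payload: 100 frames of 16-bit mono silence at 22050 Hz
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(22050)
    out.writeframes(b"\x00\x00" * 100)
audio_data = base64.b64encode(buffer.getvalue()).decode("UTF-8")

print(describe_wav_payload(audio_data))  # (1, 2, 22050, 100)
```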
This can be used directly as the source of an HTML audio tag (with Flask):
<audio controls src="data:audio/wav;base64, {{ audio_data }}"></audio>
So, all you need is to convert wav, sr to audio_data representing the raw .wav file, and pass it as a parameter of render_template in your Flask app. (This is the solution without sending.)
Or, if you send audio_data, then in the .js file where you accept the response, use audio_data to construct the URL (it would be placed as the src attribute, like in the HTML):
// get audio_data from response
let snd = new Audio("data:audio/wav;base64, " + audio_data);
snd.play()
because:
Audio(url)
Return value: A new HTMLAudioElement object, configured to be used for playing back the audio from the file specified by url. The new object's preload property is set to auto and its src property is set to the specified URL or null if no URL is given. If a URL is specified, the browser begins to asynchronously load the media resource before returning the new object.