I have a Python Flask application with the method shown below. In this method, I use Azure text-to-speech to synthesize speech from text.
@app.route("/retrieve_speech", methods=['POST'])
def retrieve_speech():
    text = request.form.get('text')
    start = time.time()
    speech_key = "my key"
    speech_region = "my region"
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
    speech_config.endpoint_id = "my endpoint"
    speech_config.speech_synthesis_voice_name = "voice name"
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio24Khz160KBitRateMonoMp3)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
    result = synthesizer.speak_text_async(text=text).get()
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        # Convert to wav
        audio = AudioSegment.from_file(io.BytesIO(result.audio_data))
        duration = audio.duration_seconds
        data = io.BytesIO()
        audio.export(data, format='wav')
        data.seek(0)
        # Convert binary data to base64 string
        data = base64.b64encode(data.read()).decode('utf-8')
        speech_timing = time.time() - start
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))
    return jsonify(audio_data=data, speech_timing=str(speech_timing), other="other strings")
I call the Flask method from my front end (a web page) with JavaScript, as follows:
$.post("/retrieve_speech", { text: "This is a test" }).done(function (data) {
    var audio_data = data.audio_data;
    var speech_timing = data.speech_timing;
    var other = data.other;
    // Decode base64 string to binary
    var binaryData = atob(audioData);
    // Create an array of 8-bit unsigned integers
    var byteArray = new Uint8Array(binaryData.length);
    for (var i = 0; i < binaryData.length; i++) {
        byteArray[i] = binaryData.charCodeAt(i);
    }
    // Create a blob object from the byte array
    var blob = new Blob([byteArray], {type: 'audio/wav'});
    // Create a URL for the blob object
    var url = URL.createObjectURL(blob);
    // Play the audio
    var audio = new Audio(url);
    audio.play();
});
The problem is that the audio does not play. In addition, the Flask application logs the following message:
Numba: Attempted to fork from a non-main thread, the TBB library may be in an invalid state in the child process.
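The synthesized speech itself is valid, so the problem must be in the conversion to WAV or to a string in the Flask application, and/or in decoding the string in JavaScript.
Is there something wrong with my code?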
base64.b64encode takes raw bytes as input, not a string, so pass it the audio bytes directly:
data = base64.b64encode(result.audio_data).decode('utf-8')
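To make the bytes-versus-string point concrete, here is a minimal sketch that uses only the standard library. The helper names make_silent_wav and encode_audio are made up for illustration, and the silent WAV is just a stand-in for result.audio_data. It shows that b64encode rejects a str, and that decoding the base64 text gives back exactly the original bytes, which is what the atob step in the browser should recover.

```python
import base64
import io
import wave


def make_silent_wav(num_frames: int = 2400, sample_rate: int = 24000) -> bytes:
    """Build a small mono 16-bit WAV in memory (stand-in for synthesized audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def encode_audio(audio_bytes: bytes) -> str:
    """Base64-encode raw audio bytes into a JSON-safe text string."""
    return base64.b64encode(audio_bytes).decode("utf-8")


wav_bytes = make_silent_wav()
encoded = encode_audio(wav_bytes)       # str, safe to put in a JSON response
decoded = base64.b64decode(encoded)     # what the client should get back

# Passing a str instead of bytes is rejected by b64encode.
try:
    base64.b64encode("not bytes")
    str_rejected = False
except TypeError:
    str_rejected = True
```

If decoded differs from the bytes you encoded, the problem is on the server side; if it matches, look at the client-side decoding instead.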
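Use AudioSegment.from_wav, since you are exporting the data in WAV format. Note that from_wav expects WAV input, so this only applies if the synthesizer returns WAV rather than MP3 data:
audio = AudioSegment.from_wav(io.BytesIO(result.audio_data))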
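Since the right decoder depends on what the bytes actually contain, it can help to check the leading magic bytes before choosing between WAV and MP3 handling. The following is only a heuristic sketch: sniff_audio_format and MIME_TYPES are hypothetical names, not part of pydub or the Azure SDK, and the MP3 frame-sync check can also match some other formats.

```python
import io
import wave


def sniff_audio_format(data: bytes) -> str:
    """Guess the container from leading magic bytes: 'wav', 'mp3', or 'unknown'."""
    # WAV files are RIFF containers with a 'WAVE' form type at offset 8.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 files often start with an ID3v2 tag...
    if data[:3] == b"ID3":
        return "mp3"
    # ...or directly with a frame header whose first 11 bits are all set.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


# MIME types a front end could pass to the Blob constructor.
MIME_TYPES = {"wav": "audio/wav", "mp3": "audio/mpeg"}

# Build a tiny in-memory WAV to exercise the WAV branch.
_buf = io.BytesIO()
with wave.open(_buf, "wb") as _wav:
    _wav.setnchannels(1)
    _wav.setsampwidth(2)
    _wav.setframerate(24000)
    _wav.writeframes(b"\x00\x00" * 100)
sample_wav = _buf.getvalue()

wav_kind = sniff_audio_format(sample_wav)
mp3_kind = sniff_audio_format(b"\xff\xfb\x90\x00")
```

Running a check like this on result.audio_data would tell you whether the WAV decoder or the 'audio/wav' Blob type matches what the service returned.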
result = synthesizer.speak_text_async(text=text).get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    # Convert MP3 data to base64 string
    data = base64.b64encode(result.audio_data).decode('utf-8')
    speech_timing = time.time() - start
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        logging.error("Azure speech synthesis failed: {}".format(cancellation_details.error_details))
Output: