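I have a minimal Azure text-to-speech example that works on some computers and fails on others. All of the computers run macOS 14.5, Python 3.11.8, and azure-cognitiveservices-speech==1.41.1. There are no other differences between the machines running the code.
Some computers work immediately and produce the audio file, while others never work: they time out with the following error:
Error details: USP error: timeout waiting for the first audio chunk
Error: File:/Users/runner/work/1/s/external/azure-c-shared-utility/pal/ios-osx/tlsio_appleios.c Func:tlsio_appleios_destroy Line:196 tlsio_appleios_destroy called while not in TLSIO_STATE_CLOSED.
There is an open issue on GitHub, although it only mentions the TLS error, which I suspect is secondary: https://github.com/azure/azure-c-shared-utility/issues/658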
import os
import azure.cognitiveservices.speech as speechsdk

# credentials, logger and TEMP_AZURE_AUDIO_PATH are defined elsewhere in the project.

def text_to_speech(text, voice_name='zh-CN-YunfengNeural'):
    if not os.path.exists(TEMP_AZURE_AUDIO_PATH):
        os.makedirs(TEMP_AZURE_AUDIO_PATH)
    output_file = os.path.join(TEMP_AZURE_AUDIO_PATH, f"{text[:10]}--{voice_name}.wav")
    speech_config = speechsdk.SpeechConfig(subscription=credentials.azure_speech_key, region=credentials.azure_service_region)
    speech_config.speech_synthesis_voice_name = voice_name
    audio_config = speechsdk.audio.AudioOutputConfig(filename=output_file)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    result = synthesizer.speak_text_async(text).get()
    # Check result status
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        logger.info("Speech synthesis completed.")
        # Verify that the file was created successfully
        if os.path.exists(output_file):
            print(f"File '{output_file}' was created successfully.")
        else:
            print(f"File '{output_file}' was not created.")
            return None
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print(f"Speech synthesis canceled: {cancellation_details.reason}")
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                print(f"Error details: {cancellation_details.error_details}")
        return None
    return output_file
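I expect it to work on every computer, given the same environment, OS, Python packages, and credentials. Instead, 5 of the 8 computers produce the error, and the others work every time.
Does anyone have any suggestions?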
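Since the package versions and credentials are identical, one thing worth comparing between the working and failing machines is whether each of them can reach the Speech endpoint at all. The sketch below is only a guess about where to look, not a confirmed cause. It uses Python's standard ssl module (which is not the native macOS TLS layer the SDK uses) to attempt a handshake with the regional host, so a failure here would point at the network, a proxy, or a firewall rather than at the SDK. The helper names speech_host and probe_tls are hypothetical.

```python
import socket
import ssl
import time


def speech_host(region: str) -> str:
    """Regional TTS host, following the URL pattern used by the REST endpoint."""
    return f"{region}.tts.speech.microsoft.com"


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Attempt a TCP connect plus TLS handshake and report what happened."""
    started = time.monotonic()
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return {
                    "ok": True,
                    "tls_version": tls.version(),
                    "seconds": round(time.monotonic() - started, 3),
                    "error": None,
                }
    except OSError as exc:
        # OSError covers DNS failures, refused connections, timeouts,
        # and ssl.SSLError (certificate or handshake problems).
        return {
            "ok": False,
            "tls_version": None,
            "seconds": round(time.monotonic() - started, 3),
            "error": f"{type(exc).__name__}: {exc}",
        }


# Example (performs a real network call, so it is left commented out):
# print(probe_tls(speech_host("<speech_region>")))
```

Running the commented-out call on a working machine and on a failing one, and comparing the "ok", "seconds", and "error" fields, may show whether the failing machines sit behind a different network path.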
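Below is the complete modified code, which works around the error in two ways.
If the SDK call times out or fails, the code automatically falls back to the REST API for speech synthesis, so that network or TLS problems in the SDK do not block the feature.
A try-except block catches exceptions early, reports the problem, and switches to the REST API, so the function still completes when network or SDK problems occur.
Code: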
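One detail matters for the fallback: the error text in the question is prefixed with "Error details:", which suggests the timeout arrives as a canceled result rather than as a raised exception. A fallback therefore probably needs to treat a None return the same way as an exception. The following is a minimal sketch of that pattern on its own, with no SDK involved. The name synthesize_with_fallback and the backend signature (text, voice_name) -> path or None are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# A backend takes (text, voice_name) and returns an output path, or None on failure.
Synthesizer = Callable[[str, str], Optional[str]]


def synthesize_with_fallback(
    text: str, voice_name: str, backends: Sequence[Synthesizer]
) -> Optional[str]:
    """Try each backend in order and return the first path produced."""
    for backend in backends:
        name = getattr(backend, "__name__", repr(backend))
        try:
            path = backend(text, voice_name)
        except Exception as exc:
            # A raised error counts as a failure of this backend only.
            print(f"{name} raised {type(exc).__name__}: {exc}")
            continue
        if path is not None:
            return path
        # A None result (for example a canceled synthesis) also triggers the fallback.
        print(f"{name} returned no file, trying the next backend")
    return None
```

With an SDK-based function and a REST-based function that both follow this signature, the call would look like synthesize_with_fallback(text, voice_name, [sdk_backend, rest_backend]).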
import os
import logging
from xml.sax.saxutils import escape

import azure.cognitiveservices.speech as speechsdk
import requests

logging.basicConfig(level=logging.DEBUG)

AZURE_SPEECH_KEY = "<speech_key>"
AZURE_SERVICE_REGION = "<speech_region>"
TEMP_AZURE_AUDIO_PATH = "./azure_audio_output"


def text_to_speech(text, voice_name='zh-CN-YunfengNeural'):
    """Synthesize with the Speech SDK and fall back to the REST API on failure."""
    os.makedirs(TEMP_AZURE_AUDIO_PATH, exist_ok=True)
    output_file = os.path.join(TEMP_AZURE_AUDIO_PATH, f"{text[:10]}-{voice_name}.wav")
    try:
        speech_config = speechsdk.SpeechConfig(subscription=AZURE_SPEECH_KEY, region=AZURE_SERVICE_REGION)
        speech_config.speech_synthesis_voice_name = voice_name
        audio_config = speechsdk.audio.AudioOutputConfig(filename=output_file)
        synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
        result = synthesizer.speak_text_async(text).get()
        if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            print("Speech synthesis completed.")
            if os.path.exists(output_file):
                print(f"File '{output_file}' created successfully.")
            else:
                print("File not created.")
                return None
        elif result.reason == speechsdk.ResultReason.Canceled:
            print(f"Synthesis canceled: {result.cancellation_details.reason}")
            if result.cancellation_details.error_details:
                print(f"Error details: {result.cancellation_details.error_details}")
            # A timeout is reported as a canceled result, not as an exception,
            # so the fallback has to be triggered here as well.
            print("Attempting to use REST API fallback...")
            return text_to_speech_rest(text, voice_name)
        return output_file
    except Exception as e:
        print(f"Exception occurred: {e}")
        print("Attempting to use REST API fallback...")
        return text_to_speech_rest(text, voice_name)


def text_to_speech_rest(text, voice_name='zh-CN-YunfengNeural'):
    """Synthesize through the REST endpoint and write the response to a WAV file."""
    url = f"https://{AZURE_SERVICE_REGION}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        'Ocp-Apim-Subscription-Key': AZURE_SPEECH_KEY,
        'Content-Type': 'application/ssml+xml',
        'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm'
    }
    # Escape &, < and > so arbitrary input text cannot break the SSML document.
    ssml = f"""
    <speak version='1.0' xml:lang='en-US'>
        <voice xml:lang='zh-CN' name='{voice_name}'>
            {escape(text)}
        </voice>
    </speak>"""
    try:
        # The timeout keeps the fallback from hanging on the same network problem.
        response = requests.post(url, headers=headers, data=ssml.encode('utf-8'), timeout=30)
        if response.status_code == 200:
            os.makedirs(TEMP_AZURE_AUDIO_PATH, exist_ok=True)
            output_file = os.path.join(TEMP_AZURE_AUDIO_PATH, f"{text[:10]}-{voice_name}-rest.wav")
            with open(output_file, "wb") as audio_file:
                audio_file.write(response.content)
            print(f"REST API: File '{output_file}' created successfully.")
            return output_file
        else:
            print(f"REST API Error: {response.status_code}, {response.text}")
            return None
    except Exception as e:
        print(f"REST API exception: {e}")
        return None


if __name__ == "__main__":
    text = "你好, 欢迎使用微软的语音服务。"
    voice_name = "zh-CN-YunfengNeural"
    output = text_to_speech(text, voice_name)
    if output:
        print(f"Audio file generated: {output}")
    else:
        print("Failed to generate audio.")
Output:
The code above ran successfully and produced speech output from the text input.
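Checking that the file exists only shows that something was written, not that it holds audio. As an optional extra check, the sketch below reads the WAV header with Python's standard wave module and returns the duration. It assumes the file is uncompressed PCM with a standard RIFF header whose chunk sizes are accurate, which I have not verified for every output format. The helper name wav_duration_seconds is hypothetical.

```python
import wave
from typing import Optional


def wav_duration_seconds(path: str) -> Optional[float]:
    """Return the duration of a PCM WAV file, or None if it cannot be parsed."""
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
    except (wave.Error, EOFError, OSError):
        # Not a RIFF/WAVE file, a truncated header, or a missing file.
        return None
    if rate <= 0:
        return None
    return frames / rate
```

A result of None, or a duration close to zero, for a non-empty input text would be a hint that the synthesis did not really succeed even though a file was created.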