Azure Speech Service continuous speech recognition


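I am new to the Azure Speech service. I am using Twilio/Plivo to connect a phone number to Azure speech-to-text (STT), and I process the transcription further once it comes back.

My problem: while I am speaking, detection works well. When I stop speaking or stay silent, the service still processes the silence and returns a result with empty transcription text, roughly every 10-15 seconds. Recognition keeps running on its own, and I do not stop continuous recognition until the call ends.

Has anyone experienced something similar, or is there a speech configuration setting I can change? Please let me know.

I am using the Azure SDK and have tried the initial silence and segmentation silence timeouts, but nothing changed. I am using this in real time, so I cannot add more than one second of delay.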

azure speech-recognition azure-sdk-python azure-speech
1 Answer

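I tried sample code for continuous speech recognition that converts speech to text and avoids processing empty transcriptions caused by silence or noise.

I used InitialSilenceTimeoutMs and EndSilenceTimeoutMs to manage silence, last_recognition_time to filter valid recognitions, and evt.result.text.strip() to skip empty transcriptions.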
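As a rough sketch of the empty-text check on its own, the helper below returns a transcript only for final results that contain non-whitespace text. The name extract_final_text, the recognized_reason parameter, and the stand-in events are hypothetical and exist only so the snippet runs without the SDK; in a real handler the reason would be speechsdk.ResultReason.RecognizedSpeech.

```python
from types import SimpleNamespace


def extract_final_text(evt, recognized_reason):
    """Return the stripped transcript for a final result, or None to skip it.

    `evt` is any object shaped like the SDK event (evt.result.reason and
    evt.result.text). `recognized_reason` is the value that marks recognized
    speech; it is passed in so this sketch does not depend on the SDK.
    """
    result = evt.result
    if result.reason != recognized_reason:
        return None  # e.g. NoMatch or a canceled result
    text = (result.text or "").strip()
    return text or None  # empty or whitespace-only text is treated as silence


# Stand-in events for illustration only; real events come from the recognizer.
RECOGNIZED, NO_MATCH = "RecognizedSpeech", "NoMatch"
events = [
    SimpleNamespace(result=SimpleNamespace(reason=RECOGNIZED, text="Hello there.")),
    SimpleNamespace(result=SimpleNamespace(reason=RECOGNIZED, text="   ")),
    SimpleNamespace(result=SimpleNamespace(reason=NO_MATCH, text="")),
]

# Only the first event survives the filter.
kept = [t for t in (extract_final_text(e, RECOGNIZED) for e in events) if t]
print(kept)
```

Keeping the check in one function means downstream processing only ever sees text worth handling, whatever the service sends during silence.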

Code:

import azure.cognitiveservices.speech as speechsdk
import time

SUBSCRIPTION_KEY = "<speechKey>"
REGION = "<speechRegion>"

speech_config = speechsdk.SpeechConfig(subscription=SUBSCRIPTION_KEY, region=REGION)
speech_config.speech_recognition_language = "en-US" 

# Pass the silence timeouts (in milliseconds) to the service as URI query parameters.
speech_config.set_service_property(name="InitialSilenceTimeoutMs", value="1000", channel=speechsdk.ServicePropertyChannel.UriQueryParameter)
speech_config.set_service_property(name="EndSilenceTimeoutMs", value="1000", channel=speechsdk.ServicePropertyChannel.UriQueryParameter)
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
last_recognition_time = time.time()

def recognizing_handler(evt):
    """Handles partial recognition results."""
    if evt.result.text.strip():
        print(f"Recognizing: {evt.result.text}")

def recognized_handler(evt):
    """Handles final recognition results."""
    global last_recognition_time
    if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
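        # Keep only non-empty text. Note that the 2-second check also drops any
        # result arriving within 2 s of the previously accepted one (or of startup).
        if evt.result.text.strip() and (time.time() - last_recognition_time > 2):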
            print(f"Recognized: {evt.result.text}")
            last_recognition_time = time.time()
    elif evt.result.reason == speechsdk.ResultReason.NoMatch:
        print("No speech recognized.")

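def canceled_handler(evt):
    """Handles recognition cancellation events."""
    # The reason and error text are exposed on the event's cancellation details.
    details = evt.cancellation_details
    print(f"Recognition canceled: {details.reason}")
    if details.reason == speechsdk.CancellationReason.Error:
        print(f"Error details: {details.error_details}")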

def session_started_handler(evt):
    """Handles session start events."""
    print("Session started.")

def session_stopped_handler(evt):
    """Handles session stop events."""
    print("Session stopped.")

recognizer.recognizing.connect(recognizing_handler)
recognizer.recognized.connect(recognized_handler)
recognizer.canceled.connect(canceled_handler)
recognizer.session_started.connect(session_started_handler)
recognizer.session_stopped.connect(session_stopped_handler)

print("Starting continuous recognition...")
recognizer.start_continuous_recognition()

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("Stopping recognition...")
    recognizer.stop_continuous_recognition()
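
The last_recognition_time check in the code above discards any result that arrives within 2 seconds of the previous one. If dropping closely spaced utterances is not wanted, one possible alternative is to use the timestamp only to measure how long the caller has been silent. The sketch below is an assumption-laden illustration, not part of the answer's code: SilenceTracker, on_final_text, silent_for, and the fake clock are hypothetical names, and nothing here calls the SDK.

```python
import time


class SilenceTracker:
    """Tracks how long it has been since the last non-empty transcript."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the behavior can be checked without waiting.
        self._clock = clock
        self._last_speech = clock()

    def on_final_text(self, text):
        """Return the stripped text if non-empty, else None (treated as silence)."""
        text = (text or "").strip()
        if not text:
            return None
        self._last_speech = self._clock()
        return text

    def silent_for(self):
        """Seconds elapsed since the last non-empty transcript (or construction)."""
        return self._clock() - self._last_speech


# Fake clock for illustration; a real handler would use the default clock.
now = [0.0]
tracker = SilenceTracker(clock=lambda: now[0])

now[0] = 0.5
first = tracker.on_final_text("Hi.")            # kept
now[0] = 1.0
second = tracker.on_final_text("How are you?")  # kept, although only 0.5 s later
now[0] = 12.0
third = tracker.on_final_text("")               # empty result during silence
print(first, second, third, tracker.silent_for())
```

Inside recognized_handler, the text passed in would be evt.result.text, and silent_for() could feed whatever the call flow does after a long pause.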

