Azure Speech Service continuous speech recognition

Question

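I'm new to Azure Speech Service. I'm using Twilio/Plivo to connect a phone number to Azure speech-to-text (STT), and I process the transcription further afterwards.

My problem: while I'm speaking, detection works well. But when I stop speaking or stay silent, it automatically processes the empty audio and returns a result with empty transcription text. This happens every 10-15 seconds. Speech is detected automatically, and I don't stop continuous recognition until the call ends.

Has anyone run into something similar, or is there a speech configuration setting I can change? Please let me know.

I used the Azure SDK and tried the initial silence and speech segmentation timeouts, but nothing changed. I'm using this in real time, so I can't add more than one second of delay.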

Answer 1

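I tried sample code for continuous speech recognition that converts speech to text while avoiding the processing of empty transcriptions caused by silence or noise.

I used InitialSilenceTimeoutMs and EndSilenceTimeoutMs to manage silence, last_recognition_time to filter valid recognitions, and evt.result.text.strip() to skip empty transcriptions.

Code: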

import azure.cognitiveservices.speech as speechsdk
import time

SUBSCRIPTION_KEY = "<speechKey>"
REGION = "<speechRegion>"

speech_config = speechsdk.SpeechConfig(subscription=SUBSCRIPTION_KEY, region=REGION)
speech_config.speech_recognition_language = "en-US"

# Silence timeouts in milliseconds, passed to the service as URI query parameters.
speech_config.set_service_property(name="InitialSilenceTimeoutMs", value="1000", channel=speechsdk.ServicePropertyChannel.UriQueryParameter)
speech_config.set_service_property(name="EndSilenceTimeoutMs", value="1000", channel=speechsdk.ServicePropertyChannel.UriQueryParameter)

# Read audio from the default microphone.
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# Time of the last accepted final result; used by recognized_handler below.
last_recognition_time = time.time()

def recognizing_handler(evt):
    """Handles partial recognition results."""
    if evt.result.text.strip():
        print(f"Recognizing: {evt.result.text}")

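def recognized_handler(evt):
    """Handles final recognition results."""
    global last_recognition_time
    if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
        # Skip empty or whitespace-only text. Note that the 2-second gap check
        # also drops non-empty results that arrive within 2 seconds of the
        # previous accepted result (or of startup).
        if evt.result.text.strip() and (time.time() - last_recognition_time > 2):
            print(f"Recognized: {evt.result.text}")
            last_recognition_time = time.time()
    elif evt.result.reason == speechsdk.ResultReason.NoMatch:
        # Silence or noise that could not be matched to speech.
        print("No speech recognized.")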

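def canceled_handler(evt):
    """Handles recognition cancellation events."""
    # In the Python SDK the reason and error text live on cancellation_details.
    details = evt.cancellation_details
    print(f"Recognition canceled: {details.reason}")
    if details.reason == speechsdk.CancellationReason.Error:
        print(f"Error details: {details.error_details}")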

def session_started_handler(evt):
    """Handles session start events."""
    print("Session started.")

def session_stopped_handler(evt):
    """Handles session stop events."""
    print("Session stopped.")

recognizer.recognizing.connect(recognizing_handler)
recognizer.recognized.connect(recognized_handler)
recognizer.canceled.connect(canceled_handler)
recognizer.session_started.connect(session_started_handler)
recognizer.session_stopped.connect(session_stopped_handler)

print("Starting continuous recognition...")
recognizer.start_continuous_recognition()

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("Stopping recognition...")
    recognizer.stop_continuous_recognition()

Output:

[Screenshot of the console output; image not available]
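
If the 2-second gap check is too aggressive for a live call, the empty-result filtering can be separated from the timing logic. The following is only a minimal sketch of that idea, not part of the Speech SDK: clean_transcript and TranscriptForwarder are hypothetical names introduced for illustration, and the sketch assumes you pass in evt.result.text from the recognized handler yourself.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def clean_transcript(text: Optional[str]) -> Optional[str]:
    """Return the stripped text, or None if nothing usable remains."""
    if text is None:
        return None
    stripped = text.strip()
    return stripped or None


@dataclass
class TranscriptForwarder:
    """Hypothetical helper: forwards non-empty final results to a callback.

    on_text is whatever downstream processing you run on a transcript.
    dropped counts the empty results that were filtered out.
    """

    on_text: Callable[[str], None]
    dropped: int = 0

    def handle(self, text: Optional[str]) -> bool:
        """Forward the text if it is non-empty; return True if forwarded."""
        cleaned = clean_transcript(text)
        if cleaned is None:
            self.dropped += 1
            return False
        self.on_text(cleaned)
        return True
```

In the recognized handler you would then call something like forwarder.handle(evt.result.text) only when evt.result.reason is ResultReason.RecognizedSpeech, and ignore NoMatch results entirely. This drops the empty results produced during silence without discarding real speech that happens to follow closely after the previous utterance.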
