Can voice activation trigger speech recognition?


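So, I have a speech recognition app in Unity that uses Microsoft Azure and invokes speech recognition when a button is clicked. As far as I know, you need something to trigger speech recognition: either a button click or key press for RecognizeOnceAsync(), or, in the case of StartContinuousRecognitionAsync(), a keyword that stops recognition and starts processing. My question is: is it possible to just speak into the microphone and have the speech data sent for analysis, as close to a natural conversation as possible? Alternatively, is it possible to activate speech recognition for a set time, stop recording after ten seconds, receive a response, and then start listening again?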

    await speechRecognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
    speechRecognizer.Recognized += (s, e) =>
    {
        var result = e.Result;
        Debug.Log(result);
    };

Single-shot version:

    var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
    string newMessage = string.Empty;
    if (result.Reason == ResultReason.RecognizedSpeech)
    {
        Debug.Log(result.Text);
    }

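Right now I have a working version that waits for a keyword and then sends the data, and a version that sends the speech data on a button click. I would like to know whether it is possible to just speak into the microphone and receive a TTS response, as close to a real conversation as possible.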

azure unity-game-engine async-await speech-recognition azure-cognitive-services
1 Answer

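Yes, you can create a more natural conversational experience with Azure Cognitive Services, where the system continuously listens for speech input and processes it.

The following code performs continuous speech recognition and synthesis with Cognitive Services. It speaks each recognized phrase back and ends when you say "stop".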

    using System;
    using System.Text;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;

    class Program
    {
        // Assumption: the key and region are read from environment variables.
        // Replace with however you store your Speech resource credentials.
        static readonly string speechKey = Environment.GetEnvironmentVariable("SPEECH_KEY");
        static readonly string speechRegion = Environment.GetEnvironmentVariable("SPEECH_REGION");

        static async Task Main()
        {
            var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);

            using var recognizer = new SpeechRecognizer(speechConfig);
            var recognizedTextBuffer = new StringBuilder();

            // Completed when the user says "stop" or the session ends.
            var stopRequested = new TaskCompletionSource<bool>(
                TaskCreationOptions.RunContinuationsAsynchronously);

            // Subscribe to events
            recognizer.Recognizing += (s, e) =>
            {
                // Partial (interim) hypothesis while the user is still speaking
                Console.WriteLine($"Recognizing: {e.Result.Text}");
            };

            recognizer.Recognized += async (s, e) =>
            {
                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                {
                    recognizedTextBuffer.Append(e.Result.Text + " ");

                    // Example action: Check for a specific keyword
                    if (e.Result.Text.ToLowerInvariant().Contains("stop"))
                    {
                        Console.WriteLine("Stopping recognition...");
                        // Signal Main rather than stopping here, which avoids
                        // stopping the recognizer from inside its own callback.
                        stopRequested.TrySetResult(true);
                    }
                    else
                    {
                        // Synthesize the recognized text
                        await SynthesizeSpeechAsync(e.Result.Text);
                    }
                }
                else if (e.Result.Reason == ResultReason.NoMatch)
                {
                    Console.WriteLine("No speech could be recognized.");
                }
            };

            recognizer.SessionStopped += (s, e) =>
            {
                Console.WriteLine("Session stopped.");
                stopRequested.TrySetResult(true);
            };

            // Start continuous recognition
            await recognizer.StartContinuousRecognitionAsync();

            Console.WriteLine("Say something... Say 'stop' or press any key to end the recognition.");
            var keyPressed = Task.Run(() => Console.ReadKey(intercept: true));
            await Task.WhenAny(stopRequested.Task, keyPressed);

            // Stop recognition before exiting the application
            await recognizer.StopContinuousRecognitionAsync();
            Console.WriteLine($"Transcript: {recognizedTextBuffer}");
        }

        static async Task SynthesizeSpeechAsync(string text)
        {
            var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
            speechConfig.SpeechSynthesisVoiceName = "en-US-AvaMultilingualNeural";

            using var synthesizer = new SpeechSynthesizer(speechConfig);

            var result = await synthesizer.SpeakTextAsync(text);

            switch (result.Reason)
            {
                case ResultReason.SynthesizingAudioCompleted:
                    Console.WriteLine($"Speech synthesized for text: [{text}]");
                    break;
                case ResultReason.Canceled:
                    var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                    Console.WriteLine($"Synthesis canceled: Reason={cancellation.Reason}");

                    if (cancellation.Reason == CancellationReason.Error)
                    {
                        Console.WriteLine($"ErrorCode={cancellation.ErrorCode}");
                        Console.WriteLine($"ErrorDetails={cancellation.ErrorDetails}");
                    }
                    break;
            }
        }
    }

Output:

![enter image description here](https://i.imgur.com/NHkjzTI.png)
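The question also asks about a fixed listening window: record for about ten seconds, stop, respond, then listen again. A minimal sketch of that pattern follows, built only on `StartContinuousRecognitionAsync`, `StopContinuousRecognitionAsync`, and the `Recognized` event. The `TimedListener` class, `ListenForAsync`, `RunAsync`, and the `respond` delegate are hypothetical names introduced here for illustration and are not part of the Speech SDK. A phrase still in progress when the window closes may be cut off, so treat the ten-second value as something to tune.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static class TimedListener
{
    // Hypothetical helper (not part of the Speech SDK): listens for a fixed
    // window and returns the text recognized during it.
    public static async Task<string> ListenForAsync(SpeechRecognizer recognizer, TimeSpan window)
    {
        if (window <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(window));

        var heard = new StringBuilder();

        // Final results arrive on an SDK thread, so guard the buffer.
        void OnRecognized(object sender, SpeechRecognitionEventArgs e)
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                lock (heard) { heard.Append(e.Result.Text).Append(' '); }
            }
        }

        recognizer.Recognized += OnRecognized;
        try
        {
            await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
            await Task.Delay(window).ConfigureAwait(false);
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
        }
        finally
        {
            // Unsubscribe so repeated windows do not stack handlers.
            recognizer.Recognized -= OnRecognized;
        }

        lock (heard) { return heard.ToString().Trim(); }
    }

    // Hypothetical conversation loop: listen, respond, listen again.
    // `respond` is whatever produces the reply, for example a TTS call.
    public static async Task RunAsync(SpeechRecognizer recognizer, Func<string, Task> respond)
    {
        while (true)
        {
            string text = await ListenForAsync(recognizer, TimeSpan.FromSeconds(10))
                .ConfigureAwait(false);

            if (text.Length == 0) continue;                       // nothing heard in this window
            if (text.ToLowerInvariant().Contains("stop")) break;  // spoken exit command

            // Recognition is stopped here, so the reply is not picked up
            // by the recognizer while it plays.
            await respond(text).ConfigureAwait(false);
        }
    }
}
```

With the answer's code, this could be started as `await TimedListener.RunAsync(recognizer, SynthesizeSpeechAsync);`. In Unity, the recognizer's callbacks run off the main thread, so anything that touches Unity objects should be handed back to the main thread, for example from `Update()`.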
