How do I recognize audio with recognize_once() in Azure when providing a list of more than 4 languages?


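The Azure Speech SDK has a limitation: in "DetectAudioAtStart" mode it can detect at most 4 candidate languages at a time. To work around this, I split my languages_to_detect list into batches of 4 languages and try to detect the language with each batch. But recognition does not work and gives me the wrong answer: I pass in a Bengali audio file and it reports Hindi, which is wrong. Here is the code for reference: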

import azure.cognitiveservices.speech as speechsdk

subscription_key = "00000000000000000000000000"
service_region = "westus"
audio_file_path = "C:\\yogesh_folder\\speech_bangla.wav"

# List of all languages to detect
languages_to_detect = ["en-US", "ml-IN", "ta-IN", "te-IN", "gu-IN", "kn-IN", "mr-IN", "pa-IN", "bn-IN", "hi-IN"]

# Configure speech recognition
speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region=service_region)

# Audio configuration
audio_config = speechsdk.audio.AudioConfig(filename=audio_file_path)

# Initialize detected language
detected_language = None

# Iterate through batches of 4 languages
for i in range(0, len(languages_to_detect), 4):
    # Slice the batch of languages
    batch_languages = languages_to_detect[i:i+4]

    # Configure auto-detection of source language for current batch
    auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
        languages=batch_languages
    )

    # Create a speech recognizer instance for current batch
    speech_recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        auto_detect_source_language_config=auto_detect_source_language_config,
        audio_config=audio_config
    )

    # Perform recognition
    print(f"Detecting speech in languages: {batch_languages}")
    result = speech_recognizer.recognize_once()

    # Check result
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        detected_language = result.properties.get(speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult)
        print(f"Detected language: {detected_language}")
        break  # Exit loop if language is detected

# If no language is detected, provide feedback
if detected_language is None:
    print("No language detected.")
python azure speech-recognition azure-cognitive-services azure-speech
1 answer

I am passing a Bengali audio file and it says it is Hindi. This is wrong. Here is the code for your reference:

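  • Log the intermediate results to see what each batch detects. Make sure the audio file is clear and of good quality, and check that you are not hitting any API rate limits or other service limits.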
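
One way to check the "clear and good quality" part before calling the service is to inspect the WAV header. The sketch below uses only the standard library. The helper names `describe_wav` and `format_warnings` are hypothetical, and the target of 8 kHz or 16 kHz, 16-bit, mono PCM is an assumption based on the default WAV input the Speech SDK documents; adjust it if your setup uses a different input format.

```python
import wave


def describe_wav(path):
    """Read basic PCM parameters from a WAV header (hypothetical helper).

    wave.open raises wave.Error for files that are not uncompressed PCM WAV.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate if rate else 0.0,
        }


def format_warnings(info):
    """List differences from the assumed target: 8/16 kHz, 16-bit, mono PCM."""
    warnings = []
    if info["channels"] != 1:
        warnings.append(f"expected mono, got {info['channels']} channels")
    if info["sample_rate_hz"] not in (8000, 16000):
        warnings.append(f"unexpected sample rate: {info['sample_rate_hz']} Hz")
    if info["bits_per_sample"] != 16:
        warnings.append(f"expected 16-bit samples, got {info['bits_per_sample']}-bit")
    return warnings
```

Calling `format_warnings(describe_wav(audio_file_path))` before the batch loop and printing the result gives a quick indication of whether the file should be converted first. An empty list only means the header matches the assumed format; it says nothing about noise or how clearly the speaker can be heard.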

App.py:

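    # Check result (inside the batch loop from the question)
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        detected_language = result.properties.get(speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult)
        print(f"Detected language: {detected_language}")

        # Log recognized text
        recognized_text = result.text
        print(f"Recognized text: {recognized_text}")

        break  # Exit loop if language is detected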

    elif result.reason == speechsdk.ResultReason.NoMatch:
        print("No speech could be recognized")
    elif result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print(f"Speech Recognition canceled: {cancellation_details.reason}")
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            print(f"Error details: {cancellation_details.error_details}")

# If no language is detected, provide feedback
if detected_language is None:
    print("No language detected.")
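
Note that the snippet above still stops at the first batch that recognizes any speech, so later batches are never logged. As far as I understand the language identification documentation, the service picks among the candidate languages it is given, so a batch that does not contain the spoken language may still return a match. A minimal sketch of logging every batch instead is shown here. `detect_all_batches` and the `detect_batch` callable are hypothetical names, not part of the Speech SDK: `detect_batch` stands for a function you would write around the recognizer code from the question, returning `(detected_language, recognized_text)` or `(None, None)`.

```python
def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def detect_all_batches(languages, detect_batch, batch_size=4):
    """Run detect_batch on every batch and return one record per batch.

    detect_batch is a hypothetical callable supplied by the caller. It takes
    a list of locale codes and returns (detected_language, recognized_text),
    or (None, None) when nothing was recognized for that batch.
    """
    records = []
    for batch in chunk(languages, batch_size):
        language, text = detect_batch(batch)
        records.append({"batch": batch, "language": language, "text": text})
        # Log every batch instead of breaking on the first match.
        print(f"Batch {batch}: language={language!r}, text={text!r}")
    return records
```

With all records in hand, you can compare the recognized text across batches and judge which one is plausible, rather than trusting whichever batch happened to come first in the list.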

Result: (console output screenshot from the original post; image not preserved)
