Speech recognition duration setting issue in Python

Question  Votes: 3  Answers: 1

I have an audio file in WAV format that I want to transcribe.

My code is:

import speech_recognition as sr
harvard = sr.AudioFile('speech_file.wav')
with harvard as source:
    try:
        audio = r.listen(source)
        #print("Done")
    except sr.UnknownValueError:
        exec()

r.recognize_google(audio)

I do get output:

Out[20]: 'thank you for calling my name is Denise who I have a pleasure speaking with hi my name is Mary Jane. Good afternoon Mary Jane I do have your account open with your email'

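However, a lot more is said after this point. I think it only captures this part of the speech because there is a pause in the audio file right after the word "email". I tried setting a duration, but I got an error: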

import speech_recognition as sr
harvard = sr.AudioFile('speech_file.wav')
with harvard as source:
    try:
        audio = r.listen(source,duration = 200)
        #print("Done")
    except sr.UnknownValueError:
        exec()


r.recognize_google(audio)
Traceback (most recent call last):

  File "<ipython-input-24-30fb65edc627>", line 5, in <module>
    audio = r.listen(source,duration = 200)

TypeError: listen() got an unexpected keyword argument 'duration'

What can I do to make my code transcribe the entire audio file, without stopping when there is a pause?

python nlp speech-recognition pyaudio
1 Answer
0
votes

You can use timeout instead of duration, like this:

audio = r.listen(source, timeout=2)

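This means the recognizer will wait at most two seconds for a phrase to start before giving up and raising a speech_recognition.WaitTimeoutError exception. With timeout=None there is no wait timeout, so it waits indefinitely for speech to begin. Which one you want depends on your situation.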
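Note that timeout only bounds how long listen() waits for speech to start; listen() still returns once it detects a pause at the end of a phrase, which matches what the question describes. For an audio file, a minimal sketch is to read it with Recognizer.record() instead, which reads to the end of the file and does accept a duration argument. The function name transcribe_file below is hypothetical, and the import is done inside the function only so the sketch loads without the library installed:

```python
def transcribe_file(path, duration=None):
    """Transcribe an audio file read with record() rather than listen().

    `transcribe_file` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so this sketch can be loaded without the library.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # record() reads until the end of the file, or for `duration`
        # seconds if given; it does not stop at pauses like listen().
        audio = r.record(source, duration=duration)
    try:
        return r.recognize_google(audio)
    except sr.UnknownValueError:
        # Raised by recognize_google() when no speech could be made out.
        return ""
```

Called as transcribe_file('speech_file.wav'), this sends the whole file in one request, so any length limit on the service side still applies.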

Edit

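All recognize_google() does is call the Google Speech API and return the result. When I used the provided audio file, I got a transcript of the first 30 seconds of the recording. This is because the free version of the Google Speech API is limited; it has nothing to do with the code.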
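If the limit really is on the order of 30 seconds per request, one possible workaround is to send the file in shorter pieces using the offset and duration arguments of record(). The following is only a sketch under that assumption: the helper names and the 25-second chunk size are hypothetical, the standard-library wave module only reads uncompressed PCM WAV files, and a chunk boundary may cut a word in half.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def chunk_windows(total_seconds, chunk_seconds=25.0):
    """Return (offset, duration) pairs that cover total_seconds."""
    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it ends with the file.
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows


def transcribe_in_chunks(path, chunk_seconds=25.0):
    """Transcribe a WAV file piece by piece and join the results."""
    # Imported lazily so the helpers above work without the library.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    total = wav_duration_seconds(path)
    for offset, duration in chunk_windows(total, chunk_seconds):
        with sr.AudioFile(path) as source:
            # Skip `offset` seconds, then read `duration` seconds.
            audio = r.record(source, offset=offset, duration=duration)
        try:
            pieces.append(r.recognize_google(audio))
        except sr.UnknownValueError:
            # Nothing intelligible in this chunk (e.g. silence); skip it.
            continue
    return " ".join(pieces)
```

For a 60-second file, chunk_windows(60) yields three windows starting at 0, 25, and 50 seconds, so each request stays under the assumed limit.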
