I have an audio file in WAV format that I want to transcribe.
My code is:
import speech_recognition as sr

r = sr.Recognizer()  # the recognizer instance used below
harvard = sr.AudioFile('speech_file.wav')
with harvard as source:
    try:
        audio = r.listen(source)
        #print("Done")
    except sr.UnknownValueError:
        exec()
r.recognize_google(audio)
I do get output:
Out[20]: 'thank you for calling my name is Denise who I have a pleasure speaking with hi my name is Mary Jane. Good afternoon Mary Jane I do have your account open with your email'
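However, a lot more is said after this. I think it only captures this part of the speech because there is a pause in the audio file right after the word "email". I tried setting a duration, but I got an error: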
import speech_recognition as sr
harvard = sr.AudioFile('speech_file.wav')
with harvard as source:
    try:
        audio = r.listen(source,duration = 200)
        #print("Done")
    except sr.UnknownValueError:
        exec()
r.recognize_google(audio)
Traceback (most recent call last):
  File "<ipython-input-24-30fb65edc627>", line 5, in <module>
    audio = r.listen(source,duration = 200)
TypeError: listen() got an unexpected keyword argument 'duration'
What can I do so that my code transcribes the entire audio file and does not stop when there is a pause?
You can use timeout instead of duration, like this:
audio = r.listen(source, timeout=2)
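This means the recognizer will wait at most two seconds for a phrase to start before giving up and raising a speech_recognition.WaitTimeoutError exception. If timeout=None, there is no wait timeout, so it waits indefinitely. Which one you want depends on your situation.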
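For what it's worth, duration is a parameter of record(), not of listen(). listen() captures a single phrase and stops at the first long enough pause, while record() keeps reading until the file ends (or until duration seconds have been read, if given). A minimal sketch of reading the whole file that way follows. The helper name transcribe_file is made up for illustration, and it assumes you pass in a Recognizer and an AudioFile:

```python
def transcribe_file(recognizer, audio_file):
    """Read the whole audio file with record() and transcribe it.

    `recognizer` is expected to be an sr.Recognizer and `audio_file`
    an sr.AudioFile; the helper itself is only an illustration.
    """
    with audio_file as source:
        # record() reads to the end of the file; unlike listen(),
        # it does not stop at a pause in the speech.
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)


# Usage, assuming the SpeechRecognition package is installed:
#   import speech_recognition as sr
#   print(transcribe_file(sr.Recognizer(), sr.AudioFile('speech_file.wav')))
```

This removes the cut-off at the pause, but it does not lift any length limit on the recognition service itself.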
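All recognize_google() does is call the Google Speech API and fetch the result. When I used the audio file you provided, I got a transcript of the first 30 seconds of the recording. This is due to a limitation of the free version of the Google Speech API and has nothing to do with the code.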
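If the limit really is around 30 seconds per request, as observed above, one possible workaround is to read the file in shorter pieces and transcribe each piece separately. The sketch below is only that, a sketch: transcribe_in_chunks and the 25-second chunk size are my own choices, not something the library prescribes. It relies on consecutive record(source, duration=...) calls inside one with block continuing where the previous call stopped, and on AudioFile.DURATION being available inside the context.

```python
import math


def transcribe_in_chunks(recognizer, audio_file, chunk_seconds=25):
    """Transcribe a long file piece by piece and join the results.

    `recognizer` is expected to be an sr.Recognizer and `audio_file`
    an sr.AudioFile; the 25-second default is an assumption chosen to
    stay under the roughly 30-second limit reported above.
    """
    pieces = []
    with audio_file as source:
        # DURATION (length in seconds) is only set inside the `with` block.
        n_chunks = math.ceil(source.DURATION / chunk_seconds)
        for _ in range(n_chunks):
            # Each call continues from where the previous one stopped.
            audio = recognizer.record(source, duration=chunk_seconds)
            pieces.append(recognizer.recognize_google(audio))
    return " ".join(pieces)


# Usage, assuming the SpeechRecognition package is installed:
#   import speech_recognition as sr
#   print(transcribe_in_chunks(sr.Recognizer(), sr.AudioFile('speech_file.wav')))
```

Words that straddle a chunk boundary may be cut or misrecognized, and a chunk containing only silence can make recognize_google() raise sr.UnknownValueError, so you may want to catch that per chunk.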