I want to run speech recognition on a wav file. To do that, I split the wav into several chunks, export each chunk, and then use the SpeechRecognition library.
    from pydub import AudioSegment
    import speech_recognition as sr

    r = sr.Recognizer()
    for i in range(5):
        audio = AudioSegment.from_wav("some_wav.wav")
        # pydub slice bounds are in milliseconds
        audio_chunk = audio[int(i*1000):int(i*3000)]
        # write the chunk to disk so that sr.AudioFile can read it back
        audio_chunk.export('test.wav', format='wav')
        detection = sr.AudioFile('test.wav')
        with detection as source:
            audio = r.record(source)
        word = r.recognize_google(audio, language='ro-RO')
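The problem is that this isn't ideal. I'd like to get rid of the step that exports a wav file: I want to convert audio_chunk to bytes and then use those in-memory bytes in sr.AudioFile().
Is there a way to convert an AudioSegment to bytes?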
A year late, but try this:
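    import io

    from pydub import AudioSegment
    import speech_recognition as sr

    r = sr.Recognizer()
    # load the source file once instead of on every iteration
    audio = AudioSegment.from_wav("some_wav.wav")
    for i in range(5):
        # slice bounds are in milliseconds, kept as in the question (the i = 0 slice is empty)
        audio_chunk = audio[int(i*1000):int(i*3000)]
        # export the chunk into an in-memory buffer instead of a file on disk
        buffer = io.BytesIO()
        audio_chunk.export(buffer, format="wav")
        # buffer.getvalue() returns the WAV data as a bytes object if you need it
        buffer.seek(0)  # make sure reading starts at the beginning of the WAV data
        detection = sr.AudioFile(buffer)
        with detection as source:
            recorded = r.record(source)
        word = r.recognize_google(recorded, language='ro-RO')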
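If it helps to see why an in-memory buffer can stand in for a file here, below is a minimal sketch that uses only the standard library. It is not pydub's implementation. The helper name make_wav_bytes and the silent 16 kHz mono test signal are made up for illustration. It writes a WAV into an io.BytesIO, pulls the bytes out, and reads them back through a second buffer, which is roughly the round trip a file-like-object consumer performs.

```python
import io
import wave


def make_wav_bytes(n_frames=16000, sample_rate=16000):
    """Build a mono 16-bit WAV of silence in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_out:
        wav_out.setnchannels(1)            # mono
        wav_out.setsampwidth(2)            # 2 bytes per sample (16-bit PCM)
        wav_out.setframerate(sample_rate)
        wav_out.writeframes(b"\x00\x00" * n_frames)
    # wave does not close a file object it did not open, so the buffer is still usable
    return buffer.getvalue()


wav_bytes = make_wav_bytes()

# Wrap the bytes in a fresh buffer and read them back like a file on disk
with wave.open(io.BytesIO(wav_bytes), "rb") as wav_in:
    frame_count = wav_in.getnframes()
    frame_rate = wav_in.getframerate()

print(type(wav_bytes).__name__, frame_count, frame_rate)
```

The same idea applies to the pydub buffer: once the chunk has been exported into a BytesIO, you can either hand the buffer itself to a reader or call getvalue() on it to get plain bytes.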