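I have this code that transcribes the audio stream of a YouTube video given its link. As you may notice, it is slow, because I first have to download the stream as .mp4, then convert it to .wav with moviepy, then record the audio, and only then transcribe it.
I would like exactly the same functionality, but without downloading the stream to disk first: I want to write the stream's data into an in-memory buffer instead.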
from pytube import YouTube
import speech_recognition as sr
from moviepy.editor import AudioFileClip
video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)
recognizer = sr.Recognizer()
audio_stream.download(filename="mp4_output.mp4")
audio = AudioFileClip("mp4_output.mp4")
audio.write_audiofile("wav_output.wav")
with sr.AudioFile("./wav_output.wav") as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
transcript = recognizer.recognize_sphinx(audio_data=audio_data)
I tried the following:
import io
from pytube import YouTube
import speech_recognition as sr
video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)
buffer = io.BytesIO()
audio_stream.stream_to_buffer(buffer)
recognizer = sr.Recognizer()
with sr.AudioFile(buffer) as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
transcript = recognizer.recognize_sphinx(audio_data=audio_data)
I get the following error:
audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
Is there a way to convert the buffer into one of these formats?
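For context, the error is consistent with the buffer holding MP4 data rather than WAV, AIFF, or FLAC: itag 140 is generally an AAC audio track in an MP4 container. A minimal sketch of how one might confirm this by looking at the first bytes of the buffer is shown below. The helper name sniff_container is hypothetical, the signature check is deliberately simplistic, and the demo bytes are a hand-made stand-in for a real download.

```python
import io


def sniff_container(buffer):
    """Guess the container of a seekable binary buffer from its first 12 bytes."""
    start = buffer.tell()
    head = buffer.read(12)
    buffer.seek(start)  # leave the cursor where we found it

    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == b"FORM" and head[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if head[:4] == b"fLaC":
        return "flac"
    # MP4 files typically start with a box whose type field (bytes 4-8) is "ftyp".
    if head[4:8] == b"ftyp":
        return "mp4"
    return "unknown"


# Hand-made stand-in for the start of an MP4/M4A file (not real audio).
fake_mp4 = io.BytesIO(b"\x00\x00\x00\x20ftypM4A " + b"\x00" * 20)
detected = sniff_container(fake_mp4)
print(detected)
```

If the check reports "mp4" for the buffer filled by stream_to_buffer, the data has to be decoded and re-encoded before sr.AudioFile can read it.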
You can use the pydub library, which helps with audio format conversion:
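pip install pydub
(pydub calls the ffmpeg executable to decode MP4, so ffmpeg itself must be installed and on your PATH; the ffmpeg-python package does not provide it.)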
Then modify your script to convert the audio stream in the buffer to WAV before passing it to the speech recognition library:
from pytube import YouTube
import speech_recognition as sr
import io
from pydub import AudioSegment
# Your YouTube link
video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)
buffer = io.BytesIO()
audio_stream.stream_to_buffer(buffer)
# Move back to the start of the BytesIO object
buffer.seek(0)
# Convert the audio stream to WAV format using pydub
audio_segment = AudioSegment.from_file(buffer, format="mp4")
wav_buffer = io.BytesIO()
audio_segment.export(wav_buffer, format="wav")
# Reset the buffer position to the start
wav_buffer.seek(0)
recognizer = sr.Recognizer()
with sr.AudioFile(wav_buffer) as audio_file:
    # Pass duration=100 here to transcribe only the first 100 seconds, as in the original script
    audio_data = recognizer.record(audio_file)
transcript = recognizer.recognize_sphinx(audio_data=audio_data)
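The two seek(0) calls matter because a BytesIO object keeps a single cursor, and after writing, that cursor sits at the end of the data. The sketch below illustrates this using only the standard library: it writes a short WAV of silence into memory with the wave module, in place of the pytube and pydub steps. The helper make_wav_buffer and its parameters are illustrative assumptions, not part of either library.

```python
import io
import wave


def make_wav_buffer(num_frames=1600, sample_rate=16000):
    """Write num_frames of 16-bit mono silence into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_out:
        wav_out.setnchannels(1)   # mono
        wav_out.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_out.setframerate(sample_rate)
        wav_out.writeframes(b"\x00\x00" * num_frames)
    return buf


buf = make_wav_buffer()

# After writing, the cursor is at the end, so a plain read returns nothing.
position_after_write = buf.tell()
leftover = buf.read()

# Rewinding makes the whole file visible to the next reader.
buf.seek(0)
with wave.open(buf, "rb") as wav_in:
    frames_read = wav_in.getnframes()

print(position_after_write, len(leftover), frames_read)
```

The same reasoning applies to buffer before AudioSegment.from_file and to wav_buffer before sr.AudioFile: without the rewind, the reader starts at the end of the data and finds no valid header.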