I'm trying to integrate voice isolation with speech recognition in Swift. My goal is to get better quality out of iOS's built-in speech recognition, since we all know how inaccurate it can be, especially against a noisy background. Right now my speech recognition works fine, and I've created an audio unit that enables voice processing for voice isolation, but I don't know how to go from there and integrate the audio unit with the speech recognition task. I've tried piecing things together from the internet, but I'm still a Swift beginner and really don't know where to go next. Also, any other suggestions for improving the speech recognition would be great. Here is the code I have so far:
For speech recognition, this is what I have, and it works fine:
import Speech
import AVFoundation

let audioEngineSpeech = AVAudioEngine()
let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
var recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?

let inputNode = audioEngineSpeech.inputNode
inputNode.reset()
inputNode.removeTap(onBus: 0)
// true bypasses voice processing; this only takes effect once voice
// processing has been enabled on the node.
inputNode.isVoiceProcessingBypassed = true

// Feed the microphone buffers straight into the recognition request.
let format = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    recognitionRequest.append(buffer)
}

audioEngineSpeech.prepare()
try audioEngineSpeech.start() // start() throws, so it needs try

recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
    guard let result = result else { return } // result is nil when an error occurs
    let transcription = result.bestTranscription.formattedString
    // using transcription here...
}
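For completeness, before any of this runs I request authorization and configure the audio session, and I've collected a few request options that are supposed to help accuracy. A minimal sketch of that setup; the session category/mode values and the contextual strings are just my assumptions, not anything prescribed for this exact case:

// Ask for speech-recognition permission before starting a task.
SFSpeechRecognizer.requestAuthorization { status in
    guard status == .authorized else {
        print("Speech recognition not authorized:", status.rawValue)
        return
    }
}

// Configure the shared audio session for recording; .measurement minimizes
// system-supplied signal processing (assumption: suitable for a
// recognition-only app).
let session = AVAudioSession.sharedInstance()
try session.setCategory(.record, mode: .measurement, options: .duckOthers)
try session.setActive(true, options: .notifyOthersOnDeactivation)

// Options on the request itself that can improve results:
recognitionRequest.shouldReportPartialResults = true        // stream interim text
recognitionRequest.taskHint = .dictation                    // hint at free-form speech
recognitionRequest.contextualStrings = ["voice isolation"]  // bias toward domain words
if speechRecognizer.supportsOnDeviceRecognition {
    recognitionRequest.requiresOnDeviceRecognition = true   // offline, lower latency
}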
I have also created the audio unit:
import AudioToolbox

var desc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_VoiceProcessingIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0
)

guard let component = AudioComponentFindNext(nil, &desc) else {
    fatalError("Unable to find audio component")
}

var audioUnit: AudioUnit?
let osErr = AudioComponentInstanceNew(component, &audioUnit)
print("OSStatus for audio unit creation:", osErr)

var enable: UInt32 = 1
var disable: UInt32 = 0

// Turn on automatic gain control.
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_VoiceProcessingEnableAGC,
                     kAudioUnitScope_Global,
                     0,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

// Enable microphone input; on an I/O unit the input (microphone) side
// lives on element 1, not element 0.
AudioUnitSetProperty(audioUnit!,
                     kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input,
                     1,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

// Note: kAudioUnitSubType_AUSoundIsolation is a component subtype, not a
// property ID, so it cannot be passed to AudioUnitSetProperty.

// Pass 0 so voice processing is NOT bypassed and stays active.
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_BypassVoiceProcessing,
                     kAudioUnitScope_Global,
                     0,
                     &disable,
                     UInt32(MemoryLayout<UInt32>.size))

// Let the unit allocate its own output buffers.
AudioUnitSetProperty(audioUnit!,
                     kAudioUnitProperty_ShouldAllocateBuffer,
                     kAudioUnitScope_Output,
                     0,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

let result = AudioUnitInitialize(audioUnit!)
print("OSStatus for AudioUnitInitialize:", result)
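As for the actual integration question, the direction I'm currently leaning toward (untested, pieced together from the AVAudioEngine docs) is to skip the hand-wired unit entirely: the engine's input node can apparently enable the same voice processing itself, so the tap would already deliver voice-isolated buffers to the recognition request. Something like:

// Enable Apple's voice processing directly on the engine's input node
// (iOS 13+); the system-wide Voice Isolation microphone mode (chosen via
// the system UI shown below) should then apply to these buffers.
let isolatedInput = audioEngineSpeech.inputNode
try isolatedInput.setVoiceProcessingEnabled(true)
isolatedInput.isVoiceProcessingBypassed = false // keep the processing active

// Install the tap after enabling, since voice processing can change the format.
let isolatedFormat = isolatedInput.inputFormat(forBus: 0)
isolatedInput.installTap(onBus: 0, bufferSize: 1024, format: isolatedFormat) { buffer, _ in
    // These buffers have already been through voice processing.
    recognitionRequest.append(buffer)
}

audioEngineSpeech.prepare()
try audioEngineSpeech.start()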
In my view I also show the system user interface to prompt the user to pick the Voice Isolation microphone mode:
AVCaptureDevice.showSystemUserInterface(.microphoneModes)
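To double-check what the user actually picked (both properties are read-only as far as I can tell, so the system UI above is the only way to change them):

// iOS 15+: .voiceIsolation means the user selected Voice Isolation.
// activeMicrophoneMode only reflects the choice once voice-processed
// audio I/O is actually running.
switch AVCaptureDevice.preferredMicrophoneMode {
case .voiceIsolation:
    print("User prefers Voice Isolation")
default:
    print("Voice Isolation not selected; active mode:",
          AVCaptureDevice.activeMicrophoneMode.rawValue)
}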