I'm trying to integrate voice isolation with speech recognition in Swift. My goal is to get better quality out of iOS's built-in speech recognition, since we all know how inaccurate it can be, especially against a noisy background. Right now my speech recognition works fine, and I've created an audio unit that enables voice processing for voice isolation, but I don't know how to go from there and integrate the audio unit with the speech recognition task. I've tried piecing things together from the internet, but I'm still a Swift beginner and really don't know where to go next. Also, any other suggestions for improving the speech recognition would be great. Here is the code I have so far:
For speech recognition, this is what I have, and it works fine:
import Speech
import AVFoundation

let audioEngineSpeech = AVAudioEngine()
let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
var recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?

let inputNode = audioEngineSpeech.inputNode
inputNode.reset()
inputNode.removeTap(onBus: 0)
// true bypasses voice processing; this only takes effect once voice
// processing has been enabled on the node.
inputNode.isVoiceProcessingBypassed = true

// Feed the microphone buffers straight into the recognition request.
let format = inputNode.inputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    recognitionRequest.append(buffer)
}

audioEngineSpeech.prepare()
try audioEngineSpeech.start() // start() throws, so it needs try

recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
    guard let result = result else { return } // result is nil when an error occurs
    let transcription = result.bestTranscription.formattedString
    // using transcription here...
}
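For completeness, before any of this runs I request authorization and configure the audio session, and I've collected a few request options that are supposed to help accuracy. A minimal sketch of that setup; the session category/mode values and the contextual strings are just my assumptions, not anything prescribed for this exact case:

// Ask for speech-recognition permission before starting a task.
SFSpeechRecognizer.requestAuthorization { status in
    guard status == .authorized else {
        print("Speech recognition not authorized:", status.rawValue)
        return
    }
}

// Configure the shared audio session for recording; .measurement minimizes
// system-supplied signal processing (assumption: suitable for a
// recognition-only app).
let session = AVAudioSession.sharedInstance()
try session.setCategory(.record, mode: .measurement, options: .duckOthers)
try session.setActive(true, options: .notifyOthersOnDeactivation)

// Options on the request itself that can improve results:
recognitionRequest.shouldReportPartialResults = true        // stream interim text
recognitionRequest.taskHint = .dictation                    // hint at free-form speech
recognitionRequest.contextualStrings = ["voice isolation"]  // bias toward domain words
if speechRecognizer.supportsOnDeviceRecognition {
    recognitionRequest.requiresOnDeviceRecognition = true   // offline, lower latency
}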
I have also created the audio unit:
import AudioToolbox

var desc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_VoiceProcessingIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0
)

guard let component = AudioComponentFindNext(nil, &desc) else {
    fatalError("Unable to find audio component")
}

var audioUnit: AudioUnit?
let osErr = AudioComponentInstanceNew(component, &audioUnit)
print("OSStatus for audio unit creation:", osErr)

var enable: UInt32 = 1
var disable: UInt32 = 0

// Turn on automatic gain control.
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_VoiceProcessingEnableAGC,
                     kAudioUnitScope_Global,
                     0,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

// Enable microphone input; on an I/O unit the input (microphone) side
// lives on element 1, not element 0.
AudioUnitSetProperty(audioUnit!,
                     kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input,
                     1,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

// Note: kAudioUnitSubType_AUSoundIsolation is a component subtype, not a
// property ID, so it cannot be passed to AudioUnitSetProperty.

// Pass 0 so voice processing is NOT bypassed and stays active.
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_BypassVoiceProcessing,
                     kAudioUnitScope_Global,
                     0,
                     &disable,
                     UInt32(MemoryLayout<UInt32>.size))

// Let the unit allocate its own output buffers.
AudioUnitSetProperty(audioUnit!,
                     kAudioUnitProperty_ShouldAllocateBuffer,
                     kAudioUnitScope_Output,
                     0,
                     &enable,
                     UInt32(MemoryLayout<UInt32>.size))

let result = AudioUnitInitialize(audioUnit!)
print("OSStatus for AudioUnitInitialize:", result)
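As for the actual integration question, the direction I'm currently leaning toward (untested, pieced together from the AVAudioEngine docs) is to skip the hand-wired unit entirely: the engine's input node can apparently enable the same voice processing itself, so the tap would already deliver voice-isolated buffers to the recognition request. Something like:

// Enable Apple's voice processing directly on the engine's input node
// (iOS 13+); the system-wide Voice Isolation microphone mode (chosen via
// the system UI shown below) should then apply to these buffers.
let isolatedInput = audioEngineSpeech.inputNode
try isolatedInput.setVoiceProcessingEnabled(true)
isolatedInput.isVoiceProcessingBypassed = false // keep the processing active

// Install the tap after enabling, since voice processing can change the format.
let isolatedFormat = isolatedInput.inputFormat(forBus: 0)
isolatedInput.installTap(onBus: 0, bufferSize: 1024, format: isolatedFormat) { buffer, _ in
    // These buffers have already been through voice processing.
    recognitionRequest.append(buffer)
}

audioEngineSpeech.prepare()
try audioEngineSpeech.start()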
In my view I also show the system user interface to prompt the user to pick the Voice Isolation microphone mode:
AVCaptureDevice.showSystemUserInterface(.microphoneModes)
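To double-check what the user actually picked (both properties are read-only as far as I can tell, so the system UI above is the only way to change them):

// iOS 15+: .voiceIsolation means the user selected Voice Isolation.
// activeMicrophoneMode only reflects the choice once voice-processed
// audio I/O is actually running.
switch AVCaptureDevice.preferredMicrophoneMode {
case .voiceIsolation:
    print("User prefers Voice Isolation")
default:
    print("Voice Isolation not selected; active mode:",
          AVCaptureDevice.activeMicrophoneMode.rawValue)
}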