AVAudioPlayerNode causing distortion

Question · Votes: 0 · Answers: 3

I have an AVAudioPlayerNode attached to an AVAudioEngine. Sample buffers are provided to the playerNode via the scheduleBuffer() method.

However, the playerNode appears to be distorting the audio. Rather than simply "passing through" the buffers, the output is distorted and contains static (though it remains mostly audible).

Relevant code:

let myBufferFormat = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 2)

// Configure player node
let playerNode = AVAudioPlayerNode()
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: myBufferFormat)

// Provide audio buffers to playerNode
for await buffer in mySource.streamAudio() {
    await playerNode.scheduleBuffer(buffer)
}

In the example above, mySource.streamAudio() supplies audio in real time from a ScreenCaptureKit SCStreamDelegate. The audio buffers arrive as CMSampleBuffers, are converted to AVAudioPCMBuffers, and are then passed through an AsyncStream to the audio engine above. I have verified that the converted buffers are valid.
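
For context, the bridging from the delegate callback into the engine looks roughly like the sketch below. This is a minimal illustration, not the actual code: the AudioSource name and the stored continuation are assumptions about how mySource might be wired up, and the conversion function is stubbed out (the real one appears later in this question).

import AVFoundation
import ScreenCaptureKit

// Hypothetical sketch of how mySource could bridge the SCStreamOutput
// callback into an AsyncStream<AVAudioPCMBuffer>. Names are illustrative.
class AudioSource: NSObject, SCStreamOutput {
    private var continuation: AsyncStream<AVAudioPCMBuffer>.Continuation?

    // Consumed by `for await buffer in mySource.streamAudio()`.
    func streamAudio() -> AsyncStream<AVAudioPCMBuffer> {
        AsyncStream { continuation in
            self.continuation = continuation
        }
    }

    func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
        guard type == .audio,
              let pcmBuffer = createPCMBuffer(from: sampleBuffer) else { return }
        // Push each converted buffer into the async stream.
        continuation?.yield(pcmBuffer)
    }

    func createPCMBuffer(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
        // CMSampleBuffer -> AVAudioPCMBuffer conversion; see the full
        // implementation later in this question.
        nil
    }
}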

Perhaps the buffers aren't arriving fast enough? This chart of roughly 25,000 frames suggests that the inputNode is periodically inserting stretches of "zero" frames:

[Chart omitted: waveform of ~25,000 frames showing periodic runs of zero-valued samples]

The distortion appears to be the result of these empty frames.
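
One way to verify that theory is to scan each buffer for runs of all-zero frames before scheduling it. A minimal diagnostic sketch, assuming deinterleaved Float32 buffers (the standard format used above); longestZeroRun is a hypothetical helper name:

import AVFoundation

// Diagnostic: length of the longest run of consecutive all-zero frames.
// Assumes a deinterleaved Float32 buffer (the "standard" AVAudioFormat layout).
func longestZeroRun(in buffer: AVAudioPCMBuffer) -> Int {
    guard let channels = buffer.floatChannelData else { return 0 }
    let frameCount = Int(buffer.frameLength)
    let channelCount = Int(buffer.format.channelCount)
    var longest = 0
    var current = 0
    for frame in 0..<frameCount {
        // A frame counts as "zero" only if every channel is zero at this index.
        let isZero = (0..<channelCount).allSatisfy { channels[$0][frame] == 0 }
        if isZero {
            current += 1
            longest = max(longest, current)
        } else {
            current = 0
        }
    }
    return longest
}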

Edit:

Even if we remove the AsyncStream from the pipeline and process the buffers immediately in the ScreenCaptureKit callback, the distortion persists. Here is an end-to-end example that can be run as-is (the important part is didOutputSampleBuffer):

import AVFoundation
import ScreenCaptureKit

class Recorder: NSObject, SCStreamOutput {
    
    private let audioEngine = AVAudioEngine()
    private let playerNode = AVAudioPlayerNode()
    private var stream: SCStream?
    private let queue = DispatchQueue(label: "sampleQueue", qos: .userInitiated)
    
    func setupEngine() {
        let format = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 2)
        audioEngine.attach(playerNode)
        // playerNode --> mainMixerNode --> outputNode --> speakers
        audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format)
        audioEngine.prepare()
        try? audioEngine.start()
        playerNode.play()
    }
    
    func startCapture() async {
        // Capture audio from Safari
        let availableContent = try! await SCShareableContent.excludingDesktopWindows(true, onScreenWindowsOnly: false)
        let display = availableContent.displays.first!
        let app = availableContent.applications.first(where: {$0.applicationName == "Safari"})!
        let filter = SCContentFilter(display: display, including: [app], exceptingWindows: [])
        let config = SCStreamConfiguration()
        config.capturesAudio = true
        config.sampleRate = 48000
        config.channelCount = 2
        stream = SCStream(filter: filter, configuration: config, delegate: nil)
        try! stream!.addStreamOutput(self, type: .audio, sampleHandlerQueue: queue)
        try! stream!.addStreamOutput(self, type: .screen, sampleHandlerQueue: queue) // To prevent warnings
        try! await stream!.startCapture()
    }
    
    func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
        switch type {
        case .audio:
            let pcmBuffer = createPCMBuffer(from: sampleBuffer)!
            playerNode.scheduleBuffer(pcmBuffer, completionHandler: nil)
        default:
            break // Ignore video frames
        }
    }
    
    func createPCMBuffer(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
        var ablPointer: UnsafePointer<AudioBufferList>?
        try? sampleBuffer.withAudioBufferList { audioBufferList, blockBuffer in
            ablPointer = audioBufferList.unsafePointer
        }
        guard let audioBufferList = ablPointer,
              let absd = sampleBuffer.formatDescription?.audioStreamBasicDescription,
              let format = AVAudioFormat(standardFormatWithSampleRate: absd.mSampleRate, channels: absd.mChannelsPerFrame) else { return nil }
        return AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: audioBufferList)
    }
    
}

let recorder = Recorder()
recorder.setupEngine()
Task {
    await recorder.startCapture()
}
Tags: swift · macos · audio · avfoundation · avaudioengine
3 Answers
1 vote

Your "write buffer to file: distortion!" block is almost certainly doing something slow and blocking (like writing to a file). You're being called every ~170 ms (8192/48k). The tap block had better not take longer than that to execute, or you'll fall behind and drop buffers.

It is possible to keep up while writing to a file, but it depends on how you go about it. If you're doing something very inefficient (like reopening and flushing the file for every buffer), you may not be able to keep up.

If this theory is correct, the live speaker output should be free of static; only your output file should have it.
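
For illustration, a minimal sketch of one common way to keep the tap block cheap: copy the buffer inside the tap, then do the slow write on a background queue. installFileWritingTap and outputFile are hypothetical names; only the queue hop is the point.

import AVFoundation

// Keep file I/O off the tap block. `outputFile` is assumed to be an
// already-opened AVAudioFile whose processing format matches the tap's.
func installFileWritingTap(on engine: AVAudioEngine, writingTo outputFile: AVAudioFile) {
    let writerQueue = DispatchQueue(label: "fileWriter", qos: .utility)
    engine.mainMixerNode.installTap(onBus: 0, bufferSize: 8192, format: nil) { buffer, _ in
        // Copy the buffer so the render thread can reuse its memory immediately.
        guard let copy = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: buffer.frameLength) else { return }
        copy.frameLength = buffer.frameLength
        for channel in 0..<Int(buffer.format.channelCount) {
            if let src = buffer.floatChannelData?[channel],
               let dst = copy.floatChannelData?[channel] {
                dst.update(from: src, count: Int(buffer.frameLength))
            }
        }
        // The slow, blocking write happens on a background queue, not in the tap.
        writerQueue.async { try? outputFile.write(from: copy) }
    }
}

Using a serial queue has the side benefit of keeping the writes in order, so the file's frames stay in the same sequence the tap delivered them.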


1 vote

The culprit is the createPCMBuffer() function. Replace it with this, and everything runs smoothly:

func createPCMBuffer(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
    let numSamples = AVAudioFrameCount(sampleBuffer.numSamples)
    let format = AVAudioFormat(cmAudioFormatDescription: sampleBuffer.formatDescription!)
    let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: numSamples)!
    pcmBuffer.frameLength = numSamples
    // Copy the PCM data out of the CMSampleBuffer, so the AVAudioPCMBuffer
    // owns its own memory rather than pointing into the sample buffer.
    CMSampleBufferCopyPCMDataIntoAudioBufferList(sampleBuffer, at: 0, frameCount: Int32(numSamples), into: pcmBuffer.mutableAudioBufferList)
    return pcmBuffer
}

The original function in my question was taken directly from Apple's ScreenCaptureKit sample project. It technically works, and the audio sounds fine when written to a file, but apparently it isn't fast enough for real-time audio.

Edit: Actually, this may have nothing to do with speed, since the new function is on average 2-3x slower due to copying the data. More likely, when the AVAudioPCMBuffer is created from a pointer, the underlying data is being deallocated.
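
That theory lines up with the ownership rules of the no-copy initializer: AVAudioPCMBuffer(pcmFormat:bufferListNoCopy:deallocator:) wraps memory it does not own, and nothing keeps the source of that memory alive on its behalf. A sketch making this explicit via the optional deallocator parameter; makeWrappedBuffer is a hypothetical helper:

import AVFoundation

// Hypothetical helper illustrating the no-copy ownership rule: the returned
// buffer points into `abl` without retaining whatever owns that memory.
func makeWrappedBuffer(format: AVAudioFormat, abl: UnsafePointer<AudioBufferList>) -> AVAudioPCMBuffer? {
    AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: abl) { _ in
        // Runs when the AVAudioPCMBuffer itself is released. If the memory
        // behind `abl` was freed earlier (e.g. along with its CMSampleBuffer),
        // the engine has already read from a dangling pointer by this point.
        print("wrapped buffer released")
    }
}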


0 votes

I have also seen the createPCMBuffer implementation return nil in some cases, breaking both our microphone audio-waveform code and our ScreenCaptureKit recording implementation.

It seems that copying the unsafe pointer and using it outside the block passed to withAudioBufferList is not a good idea. We adjusted our implementation accordingly, and it resolved the problem:

func createPCMBuffer(for sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
    try? sampleBuffer.withAudioBufferList { audioBufferList, _ -> AVAudioPCMBuffer? in
        guard let absd = sampleBuffer.formatDescription?.audioStreamBasicDescription else { return nil }
        guard let format = AVAudioFormat(standardFormatWithSampleRate: absd.mSampleRate, channels: absd.mChannelsPerFrame) else { return nil }
        // The buffer is created while the audio buffer list is still valid,
        // i.e. inside the withAudioBufferList block rather than after it.
        return AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: audioBufferList.unsafePointer)
    }
}

This solution is closer to Apple's original ScreenCaptureKit sample, and it shouldn't suffer from the performance hit Hundley described above.
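
For completeness, here is how this version might be consumed in the stream callback from the question, with the nil case handled rather than force-unwrapped (a sketch reusing the question's names):

func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
    guard type == .audio else { return }
    // Drop a malformed sample buffer instead of crashing on a force-unwrap.
    guard let pcmBuffer = createPCMBuffer(for: sampleBuffer) else { return }
    playerNode.scheduleBuffer(pcmBuffer, completionHandler: nil)
}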
