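I am trying to record my microphone, compress the recorded audio buffer, transmit the compressed buffer as bytes to another device, decompress the received bytes into a new buffer, and finally play that buffer.
I use the following code to record the microphone and to test whether compression and decompression work correctly (so the data transmission part is omitted):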
// ... code to setup audioEngines for recording and playback
tappableInputNode?.installTap(onBus: bus, bufferSize: 1024, format: format!) { (buffer, when) in
    guard let compressedData = compressPCMBuffer(buffer) else {
        print("Failed to compress buffer")
        return
    }
    guard let decompressedBuffer = decompressDataToPCMBuffer(compressedData) else {
        print("Failed to decompress data")
        return
    }
    let renderTime =
        playerNode.playerTime(
            forNodeTime: playerNode.lastRenderTime ?? AVAudioTime(hostTime: mach_absolute_time()))
        ?? AVAudioTime(hostTime: mach_absolute_time())
    let delayInSeconds = 0.1
    let futureHostTime = renderTime.hostTime + UInt64(delayInSeconds)
    let scheduledTime = AVAudioTime(hostTime: futureHostTime)
    playerNode.scheduleBuffer(decompressedBuffer, at: scheduledTime, completionCallbackType: .dataPlayedBack) { (type) in
        print("Buffer played back")
    }
    playerNode.play()
}
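Everything works when I play the original buffer delivered by the tap. But when I try to play the decompressed buffer (as in the code above), all I hear is rapid clicking.
To compress the AVAudioPCMBuffer I wrote the following function: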
public func compressPCMBuffer(_ buffer: AVAudioPCMBuffer) -> Data? {
    var error: NSError?
    var opusDesc = AudioStreamBasicDescription()
    opusDesc.mFormatID = kAudioFormatOpus
    opusDesc.mChannelsPerFrame = buffer.format.channelCount
    opusDesc.mSampleRate = buffer.format.sampleRate
    guard let audioFormat = AVAudioFormat(streamDescription: &opusDesc) else {
        print("Failed to create AVAudioFormat")
        return nil
    }
    guard let converter = AVAudioConverter(from: buffer.format, to: audioFormat) else {
        print("Failed to initialize AVAudioConverter")
        return nil
    }
    let outputBuffer = AVAudioCompressedBuffer(
        format: audioFormat, packetCapacity: 1, maximumPacketSize: 512
    )
    let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
        outStatus.pointee = .haveData
        return buffer
    }
    let status = converter.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
    if status == .error || error != nil {
        print("Audio conversion failed: \(error?.localizedDescription ?? "Unknown error")")
        return nil
    }
    let packetCount = Int(outputBuffer.packetCount)
    if packetCount == 0 {
        print("No packets were produced during conversion")
        return nil
    }
    let data = outputBuffer.toData()
    return data
}
The returned data appears to be 160 to 180 bytes in size, which is ideal for my use case.
Then, to decompress the data into an AVAudioPCMBuffer, I have this function:
public func decompressDataToPCMBuffer(_ rawData: Data) -> AVAudioPCMBuffer? {
    var opusDesc = AudioStreamBasicDescription()
    opusDesc.mFormatID = kAudioFormatOpus
    opusDesc.mChannelsPerFrame = 1
    opusDesc.mSampleRate = 48000
    guard let opusFormat = AVAudioFormat(streamDescription: &opusDesc) else {
        print("Failed to create Opus AVAudioFormat")
        return nil
    }
    let inputBuffer = AVAudioCompressedBuffer(
        format: opusFormat, packetCapacity: 1, maximumPacketSize: 512
    )
    inputBuffer.packetCount = 1
    inputBuffer.byteLength = UInt32(rawData.count)
    rawData.withUnsafeBytes { (rawBufferPointer: UnsafeRawBufferPointer) in
        let rawPointer = rawBufferPointer.baseAddress!
        inputBuffer.audioBufferList.pointee.mBuffers.mData!.copyMemory(
            from: rawPointer, byteCount: rawData.count
        )
    }
    guard
        let pcmFormat = AVAudioFormat(
            commonFormat: .pcmFormatInt16, sampleRate: opusDesc.mSampleRate,
            channels: AVAudioChannelCount(opusDesc.mChannelsPerFrame), interleaved: false
        )
    else {
        print("Failed to create PCM format")
        return nil
    }
    guard let converter = AVAudioConverter(from: opusFormat, to: pcmFormat) else {
        print("Failed to create AVAudioConverter")
        return nil
    }
    guard let pcmBuffer = AVAudioPCMBuffer(pcmFormat: pcmFormat, frameCapacity: 4800) else {
        print("Failed to create PCM buffer")
        return nil
    }
    var error: NSError? = nil
    let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
        outStatus.pointee = .haveData
        return inputBuffer
    }
    converter.convert(to: pcmBuffer, error: &error, withInputFrom: inputBlock)
    if let error = error {
        print("Error during conversion: \(error.localizedDescription)")
        return nil
    }
    return pcmBuffer
}
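Both functions run without any warnings or errors, but the audio that plays is corrupted/malformed, with the clicking I described earlier.
I have also tried other formats such as AAC, but I get exactly the same clicking.
Any help would be appreciated. Thanks!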
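AVAudioConverter promises to convert between many audio codecs and their formats. Codecs differ in complexity and in how much input data they need before they produce output, and the design of AVAudioConverter reflects this. The header file documentation touches on this, distinguishing between "simple" conversions (no callback) and "complex" conversions (with a callback). Simple conversions involve no rate conversion or codecs, so they are probably limited to plain LPCM format changes, such as float to integer.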
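To make the "simple" case concrete, here is a minimal sketch of a callback-free LPCM conversion using `convert(to:from:)`. The formats (48 kHz, mono, Float32 to Int16) and the four sample values are assumptions chosen for illustration, not taken from the question.

```swift
import AVFoundation

// Assumed formats for illustration: 48 kHz mono, Float32 in and Int16 out.
// Force-unwrapped for brevity; real code should handle the nil cases.
let floatFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                sampleRate: 48_000, channels: 1, interleaved: false)!
let int16Format = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                sampleRate: 48_000, channels: 1, interleaved: false)!

// Fill a tiny input buffer with a few made-up samples.
let input = AVAudioPCMBuffer(pcmFormat: floatFormat, frameCapacity: 4)!
input.frameLength = 4
let samples: [Float] = [0.0, 0.5, -0.5, 0.25]
for (index, sample) in samples.enumerated() {
    input.floatChannelData![0][index] = sample
}

// The output buffer needs at least as much capacity as the input has frames.
let output = AVAudioPCMBuffer(pcmFormat: int16Format, frameCapacity: input.frameLength)!
let converter = AVAudioConverter(from: floatFormat, to: int16Format)!

do {
    // Simple conversion: same sample rate, no codec, so no input callback.
    try converter.convert(to: output, from: input)
    print("Converted \(output.frameLength) frames")
} catch {
    print("Simple conversion failed: \(error.localizedDescription)")
}
```

Anything involving Opus or AAC falls outside this path and goes through `convert(to:error:withInputFrom:)` instead, as the question's code already does.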
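The two most obvious problems in the code are:
1. The AVAudioConverter is recreated for every audio buffer. This destroys AVAudioConverter's ability to maintain state between buffers and takes away its freedom to choose when, and how many, compressed or decompressed frames it gives you, which is the whole point of complex conversions with a callback.
2. The code consumes a fixed amount of output from each converter (as posted, a single packet of at most 512 bytes when compressing and a single buffer of at most 4800 frames when decompressing). That may well be too little, and by letting the AVAudioConverter go out of scope you never get the remaining data. In addition, some codecs give you some extra silence at the start of the stream. Any of these could explain the clicking.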
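On the encoding side, the two points above could be addressed roughly as follows. This is a sketch, not a tested drop-in: `StreamingOpusEncoder` and its `encode(_:)` method are hypothetical names, the packet capacity of 8 per call is an arbitrary choice, and the Opus format is built the same way as in the question. The converter lives as long as the object, and each call keeps asking for output until the converter reports that it needs more input.

```swift
import AVFoundation

/// Hypothetical helper: owns one encoder for the whole stream.
final class StreamingOpusEncoder {
    private let converter: AVAudioConverter
    private let outputFormat: AVAudioFormat

    init?(inputFormat: AVAudioFormat) {
        var desc = AudioStreamBasicDescription()
        desc.mFormatID = kAudioFormatOpus
        desc.mSampleRate = inputFormat.sampleRate
        desc.mChannelsPerFrame = inputFormat.channelCount
        guard let format = AVAudioFormat(streamDescription: &desc),
              let converter = AVAudioConverter(from: inputFormat, to: format) else { return nil }
        self.outputFormat = format
        self.converter = converter
    }

    /// Feeds one PCM buffer and returns every packet the encoder is ready to emit.
    /// The result may be empty, or hold several packets, depending on codec state.
    func encode(_ pcm: AVAudioPCMBuffer) -> [Data] {
        var packets: [Data] = []
        var inputDelivered = false
        while true {
            let out = AVAudioCompressedBuffer(format: outputFormat, packetCapacity: 8,
                                              maximumPacketSize: converter.maximumOutputPacketSize)
            var error: NSError?
            let status = converter.convert(to: out, error: &error) { _, inputStatus in
                if inputDelivered {
                    // Nothing more for now; the stream itself has not ended.
                    inputStatus.pointee = .noDataNow
                    return nil
                }
                inputDelivered = true
                inputStatus.pointee = .haveData
                return pcm
            }
            if status == .error {
                print("Encode failed: \(error?.localizedDescription ?? "unknown error")")
                break
            }
            packets.append(contentsOf: Self.packets(in: out))
            // .haveData means the output buffer was filled, so more may be pending.
            if status != .haveData || out.packetCount == 0 { break }
        }
        return packets
    }

    /// Splits a compressed buffer into one Data value per packet.
    private static func packets(in buffer: AVAudioCompressedBuffer) -> [Data] {
        guard let descriptions = buffer.packetDescriptions else {
            // No per-packet descriptions: hand back the payload as one blob.
            return buffer.byteLength > 0
                ? [Data(bytes: buffer.data, count: Int(buffer.byteLength))] : []
        }
        return (0..<Int(buffer.packetCount)).map { index in
            let d = descriptions[index]
            return Data(bytes: buffer.data + Int(d.mStartOffset), count: Int(d.mDataByteSize))
        }
    }
}
```

The encoder would be created once, outside the tap block, and `encode(_:)` called from inside it. Because one tap buffer can yield zero, one, or several packets, each returned `Data` value would be sent as its own message.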
So create the AVAudioConverter only once and consume all of its available output. There may be other problems, but I'll stop here.
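For the receiving side, a long-lived decoder might look like the sketch below. Again, the names (`StreamingOpusDecoder`, `decode(_:)`) are hypothetical, and several details are assumptions: 48 kHz mono as in the question, the standard Float32 PCM format as the output, one Opus packet per `Data` value, and an output capacity of 5760 frames (120 ms at 48 kHz, the longest duration an Opus packet can carry). It also fills in the packet description for the incoming packet, which the question's code does not do; whether that matters for the clicking is untested here.

```swift
import AVFoundation

/// Hypothetical helper: owns one decoder for the whole stream.
final class StreamingOpusDecoder {
    private let converter: AVAudioConverter
    private let opusFormat: AVAudioFormat
    private let pcmFormat: AVAudioFormat

    init?(sampleRate: Double = 48_000, channels: AVAudioChannelCount = 1) {
        var desc = AudioStreamBasicDescription()
        desc.mFormatID = kAudioFormatOpus
        desc.mSampleRate = sampleRate
        desc.mChannelsPerFrame = channels
        guard let opus = AVAudioFormat(streamDescription: &desc),
              let pcm = AVAudioFormat(standardFormatWithSampleRate: sampleRate,
                                      channels: channels),
              let converter = AVAudioConverter(from: opus, to: pcm) else { return nil }
        self.opusFormat = opus
        self.pcmFormat = pcm
        self.converter = converter
    }

    /// Decodes one received packet. Returns nil when the decoder produced no
    /// frames for this call (for example, empty input or a conversion error).
    func decode(_ packet: Data) -> AVAudioPCMBuffer? {
        guard !packet.isEmpty,
              let output = AVAudioPCMBuffer(pcmFormat: pcmFormat, frameCapacity: 5760)
        else { return nil }

        // Wrap the received bytes in a one-packet compressed buffer.
        let input = AVAudioCompressedBuffer(format: opusFormat, packetCapacity: 1,
                                            maximumPacketSize: packet.count)
        packet.withUnsafeBytes { bytes in
            if let base = bytes.baseAddress {
                input.data.copyMemory(from: base, byteCount: packet.count)
            }
        }
        input.byteLength = UInt32(packet.count)
        input.packetCount = 1
        // Tell the converter where the packet starts and how long it is.
        input.packetDescriptions?.pointee = AudioStreamPacketDescription(
            mStartOffset: 0, mVariableFramesInPacket: 0, mDataByteSize: UInt32(packet.count))

        var delivered = false
        var error: NSError?
        let status = converter.convert(to: output, error: &error) { _, inputStatus in
            if delivered {
                inputStatus.pointee = .noDataNow
                return nil
            }
            delivered = true
            inputStatus.pointee = .haveData
            return input
        }
        guard status != .error, output.frameLength > 0 else {
            if let error = error { print("Decode failed: \(error.localizedDescription)") }
            return nil
        }
        return output
    }
}
```

As with the encoder, the decoder would be created once and reused for every received packet. If the player node is connected with a different format than the decoder's output, that connection would need to match before the buffers are scheduled.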