We are trying to normalize a UIImage so that it can be passed correctly into a CoreML model.
The way we retrieve the RGB values from each pixel is by first initializing a [CGFloat] array called rawData with values for each pixel, so that there is a position for the red, green, blue, and alpha values. In bitmapInfo we get the raw pixel values from the original UIImage itself and operate on them; this is used to fill the bitmapInfo parameter of the CGContext variable. We later use the context variable to draw the CGImage, and eventually convert the normalized CGImage back into a UIImage.
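For reference, this is the indexing convention such a buffer uses: components are stored row by row, four per pixel, so the first (red) component of pixel (x, y) lives at (y * width + x) * 4. A minimal sketch of our own, purely for illustration (it is not part of the code below):

func componentIndex(x: Int, y: Int, width: Int) -> Int {
    // Row offset (y * width pixels) plus column offset (x pixels),
    // times 4 components (R, G, B, A) per pixel.
    return (y * width + x) * 4
}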
Using a nested for loop iterating through the x and y coordinates, we find the minimum and maximum pixel color values among all colors (found through the raw-data array of CGFloat) across all the pixels. A bound variable is set to terminate the for loop; otherwise it would raise an out-of-range error.
range indicates the range of possible RGB values (i.e. the difference between the maximum color value and the minimum).
We use this equation to normalize each pixel value:

A = Image
curPixel = current pixel (R, G, B, or alpha)
NormalizedPixel = (curPixel - minPixel(A)) / range

and a similarly designed nested for loop to the one above parses through the rawData array and modifies each pixel's colors according to this normalization.
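As a concrete example of the formula (our own numbers, purely illustrative): if the smallest component value in the image is 51 and the largest is 255, then range = 204, and a raw value of 153 normalizes to (153 - 51) / 204 = 0.5. A minimal sketch:

import CoreGraphics

/// Min-max normalization of one component value, per the formula above.
func normalizedPixel(_ curPixel: CGFloat, minPixel: CGFloat, maxPixel: CGFloat) -> CGFloat {
    let range = maxPixel - minPixel
    return (curPixel - minPixel) / range
}

print(normalizedPixel(153, minPixel: 51, maxPixel: 255))   // 0.5
print(normalizedPixel(51,  minPixel: 51, maxPixel: 255))   // 0.0 (the minimum maps to 0)
print(normalizedPixel(255, minPixel: 51, maxPixel: 255))   // 1.0 (the maximum maps to 1)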
Most of our code comes from: https://gist.github.com/pimpapare/e8187d82a3976b851fc12fe4f8965789

We use CGFloat instead of UInt8 because the normalized pixel values should be real numbers between 0 and 1, not just 0 or 1.

We expect the pixel values to range from 0 to 255 before normalization, and from 0 to 1 after normalization.
The normalization formula is able to normalize pixel values to values between 0 and 1. But when we try to print out the pixel values before normalization (simply adding print statements while looping through the pixel values) to verify that we are getting the raw pixel values right, we find that the range of those values is off. For example, one pixel has a value of 3.506e+305 (larger than 255). We think we are getting the raw pixel values wrong at the beginning.
We are not familiar with image processing in Swift, and we are not sure whether the whole normalization process is right. Any help would be appreciated!
Here is our code:
func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = Int(size.width)
    let height = Int(size.height)

    var rawData = [CGFloat](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bytesPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue & CGBitmapInfo.alphaInfoMask.rawValue

    let context = CGContext(data: &rawData,
                            width: width,
                            height: height,
                            bitsPerComponent: bytesPerComponent,
                            bytesPerRow: bytesPerRow,
                            space: colorSpace,
                            bitmapInfo: bitmapInfo)

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context?.draw(cgImage, in: drawingRect)

    let bound = rawData.count

    // find minimum and maximum
    var minPixel: CGFloat = 1.0
    var maxPixel: CGFloat = 0.0

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel
            if byteIndex > bound - 4 {
                break
            }
            minPixel = min(CGFloat(rawData[byteIndex]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 1]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 2]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 3]), minPixel)

            maxPixel = max(CGFloat(rawData[byteIndex]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 1]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 2]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 3]), maxPixel)
        }
    }

    let range = maxPixel - minPixel
    print("minPixel: \(minPixel)")
    print("maxPixel : \(maxPixel)")
    print("range: \(range)")

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel
            if byteIndex > bound - 4 {
                break
            }
            rawData[byteIndex] = (CGFloat(rawData[byteIndex]) - minPixel) / range
            rawData[byteIndex + 1] = (CGFloat(rawData[byteIndex + 1]) - minPixel) / range
            rawData[byteIndex + 2] = (CGFloat(rawData[byteIndex + 2]) - minPixel) / range
            rawData[byteIndex + 3] = (CGFloat(rawData[byteIndex + 3]) - minPixel) / range
        }
    }

    let cgImage0 = context!.makeImage()
    return UIImage.init(cgImage: cgImage0!)
}
A couple of observations:

Your rawData is a floating-point (CGFloat) array, but your context isn't populating it with floating-point data; it is filling it with UInt8 data. If you want a floating-point buffer, build a floating-point context with CGBitmapInfo.floatComponents and tweak the context parameters accordingly. E.g.:
func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = cgImage.width
    let height = cgImage.height

    var rawData = [Float](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 16
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 32

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.floatComponents.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo) else { return nil }

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context.draw(cgImage, in: drawingRect)

    var maxValue: Float = 0
    var minValue: Float = 1

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            let value = rawData[offset]
            if value > maxValue { maxValue = value }
            if value < minValue { minValue = value }
        }
    }

    let range = maxValue - minValue
    guard range > 0 else { return nil }

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            rawData[offset] = (rawData[offset] - minValue) / range
        }
    }

    return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
}
Alternatively, you could just retrieve the UInt8 data, perform the floating-point math on those values, and then update the UInt8 buffer and create the image from that. Thus:

func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = cgImage.width
    let height = cgImage.height

    var rawData = [UInt8](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue

    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo) else { return nil }

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context.draw(cgImage, in: drawingRect)

    var maxValue: UInt8 = 0
    var minValue: UInt8 = 255

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            let value = rawData[offset]
            if value > maxValue { maxValue = value }
            if value < minValue { minValue = value }
        }
    }

    let range = Float(maxValue - minValue)
    guard range > 0 else { return nil }

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            rawData[offset] = UInt8(Float(rawData[offset] - minValue) / range * 255)
        }
    }

    return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
}
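Both renditions reference self members such as cgImage, scale, and imageOrientation, so they are written as if they live in a UIImage extension. A call site might look like this (the extension wrapper and the asset name "input" are our assumptions, not part of the answer):

import UIKit

extension UIImage {
    // ... paste one of the normalize() implementations from above here ...
}

// Hypothetical usage:
let normalized = UIImage(named: "input")?.normalize()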
It just depends upon whether you really need this floating-point buffer for your ML model (in which case you might return the array of floats in the first example, rather than creating a new image) or whether the goal was just to create the normalized UIImage.

I benchmarked this: it was a tad faster than the floating-point rendition on an iPhone XS Max, but takes a quarter of the memory (e.g. a 2000 × 2000 px image takes 16 MB with UInt8, but 64 MB with Float).
Finally, I should mention that vImage has a highly optimized function, vImageContrastStretch_ARGB8888, that does something very similar to what we've done above. Just import Accelerate and then you can do something like:
func normalize3() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else { return nil }

    var format = vImage_CGImageFormat(bitsPerComponent: UInt32(cgImage.bitsPerComponent),
                                      bitsPerPixel: UInt32(cgImage.bitsPerPixel),
                                      colorSpace: Unmanaged.passRetained(colorSpace),
                                      bitmapInfo: cgImage.bitmapInfo,
                                      version: 0,
                                      decode: nil,
                                      renderingIntent: cgImage.renderingIntent)

    var source = vImage_Buffer()
    var result = vImageBuffer_InitWithCGImage(
        &source,
        &format,
        nil,
        cgImage,
        vImage_Flags(kvImageNoFlags))

    guard result == kvImageNoError else { return nil }

    defer { free(source.data) }

    var destination = vImage_Buffer()
    result = vImageBuffer_Init(
        &destination,
        vImagePixelCount(cgImage.height),
        vImagePixelCount(cgImage.width),
        32,
        vImage_Flags(kvImageNoFlags))

    guard result == kvImageNoError else { return nil }

    result = vImageContrastStretch_ARGB8888(&source, &destination, vImage_Flags(kvImageNoFlags))
    guard result == kvImageNoError else { return nil }

    defer { free(destination.data) }

    return vImageCreateCGImageFromBuffer(&destination, &format, nil, nil, vImage_Flags(kvImageNoFlags), nil).map {
        UIImage(cgImage: $0.takeRetainedValue(), scale: scale, orientation: imageOrientation)
    }
}
While this employs a slightly different algorithm, it's worth considering, because in my benchmarking it was five times as fast as the floating-point rendition on my iPhone XS Max.

A few unrelated observations: note that the code above doesn't use the width and height of the UIImage, but rather the values from cgImage. This is an important distinction in case your images might have a scale other than 1.
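To make that concrete, here is a minimal helper of our own (the numbers in the comments are purely illustrative):

import UIKit

/// Returns the true bitmap dimensions of an image, taken from its CGImage.
func pixelDimensions(of image: UIImage) -> (width: Int, height: Int)? {
    guard let cgImage = image.cgImage else { return nil }
    // For a 2x image whose bitmap is 2000 × 2000 pixels, image.size is
    // 1000 × 1000 (points, not pixels), so a buffer sized from `size`
    // would be a quarter as large as the context actually needs.
    return (cgImage.width, cgImage.height)
}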