MLKit: Removing the background from a video capture using MLKSegmentationMask

Problem description

I am using ML Kit for iOS to do selfie segmentation. In their sample project they apply a colored mask to identify the background. What I need instead is to remove the background entirely, using the MLKSegmentationMask together with the CVImageBufferRef.

https://developers.google.com/ml-kit/vision/selfie-segmentation
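For reference, producing the mask from a camera frame looks roughly like this. This is a minimal sketch based on the documentation linked above; the module imports, the orientation handling, and the helper method name are my assumptions, not code from the question:

@import MLKitVision;              // module names assume the standard
@import MLKitSegmentationSelfie;  // GoogleMLKit CocoaPods setup

// Hypothetical helper: runs the selfie segmenter synchronously on one frame,
// e.g. from the AVCaptureVideoDataOutput delegate queue. For brevity the
// segmenter is created inline; in a real capture pipeline you would create
// it once and reuse it across frames.
- (nullable MLKSegmentationMask *)maskForSampleBuffer:(CMSampleBufferRef)sampleBuffer {
  MLKSelfieSegmenterOptions *options = [[MLKSelfieSegmenterOptions alloc] init];
  options.segmenterMode = MLKSegmenterModeStream;  // stream mode is meant for video
  MLKSegmenter *segmenter = [MLKSegmenter segmenterWithOptions:options];

  MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
  image.orientation = UIImageOrientationUp;  // adjust for camera/device rotation

  NSError *error = nil;
  MLKSegmentationMask *mask = [segmenter resultsInImage:image error:&error];
  if (mask == nil) {
    NSLog(@"Selfie segmentation failed: %@", error.localizedDescription);
  }
  return mask;
}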

Below is the code that takes the segmentation mask from ML Kit together with the image buffer (i.e., the actual frame). The point now is that I need to set the alpha of the background pixels to 0. The segmentation mask contains confidence values ranging from 0 to 1.

+ (void)applySegmentationMask:(MLKSegmentationMask *)mask
                toImageBuffer:(CVImageBufferRef)imageBuffer
          withBackgroundColor:(nullable UIColor *)backgroundColor
              foregroundColor:(nullable UIColor *)foregroundColor {
  NSAssert(CVPixelBufferGetPixelFormatType(imageBuffer) == kCVPixelFormatType_32BGRA,
           @"Image buffer must have 32BGRA pixel format type");
  size_t width = CVPixelBufferGetWidth(mask.buffer);
  size_t height = CVPixelBufferGetHeight(mask.buffer);
  NSAssert(CVPixelBufferGetWidth(imageBuffer) == width, @"Width must match");
  NSAssert(CVPixelBufferGetHeight(imageBuffer) == height, @"Height must match");

  if (backgroundColor == nil && foregroundColor == nil) {
    return;
  }

  CVPixelBufferLockBaseAddress(imageBuffer, 0);
  CVPixelBufferLockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);

  float *maskAddress = (float *)CVPixelBufferGetBaseAddress(mask.buffer);
  size_t maskBytesPerRow = CVPixelBufferGetBytesPerRow(mask.buffer);

  unsigned char *imageAddress = (unsigned char *)CVPixelBufferGetBaseAddress(imageBuffer);
  size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
  static const int kBGRABytesPerPixel = 4;

  foregroundColor = foregroundColor ?: UIColor.clearColor;
  backgroundColor = backgroundColor ?: UIColor.clearColor;
  CGFloat redFG, greenFG, blueFG, alphaFG;
  CGFloat redBG, greenBG, blueBG, alphaBG;
  [foregroundColor getRed:&redFG green:&greenFG blue:&blueFG alpha:&alphaFG];
  [backgroundColor getRed:&redBG green:&greenBG blue:&blueBG alpha:&alphaBG];

  static const float kMaxColorComponentValue = 255.0f;

  for (int row = 0; row < height; ++row) {
    for (int col = 0; col < width; ++col) {
      int pixelOffset = col * kBGRABytesPerPixel;
      int blueOffset = pixelOffset;
      int greenOffset = pixelOffset + 1;
      int redOffset = pixelOffset + 2;
      int alphaOffset = pixelOffset + 3;

      float maskValue = maskAddress[col];
      float backgroundRegionRatio = 1.0f - maskValue;
      float foregroundRegionRatio = maskValue;

      float originalPixelRed = imageAddress[redOffset] / kMaxColorComponentValue;
      float originalPixelGreen = imageAddress[greenOffset] / kMaxColorComponentValue;
      float originalPixelBlue = imageAddress[blueOffset] / kMaxColorComponentValue;
      float originalPixelAlpha = imageAddress[alphaOffset] / kMaxColorComponentValue;

      float redOverlay = redBG * backgroundRegionRatio + redFG * foregroundRegionRatio;
      float greenOverlay = greenBG * backgroundRegionRatio + greenFG * foregroundRegionRatio;
      float blueOverlay = blueBG * backgroundRegionRatio + blueFG * foregroundRegionRatio;
      float alphaOverlay = alphaBG * backgroundRegionRatio + alphaFG * foregroundRegionRatio;

      // Calculate composite color component values.
      // Derived from https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
      float compositeAlpha = ((1.0f - alphaOverlay) * originalPixelAlpha) + alphaOverlay;
      float compositeRed = 0.0f;
      float compositeGreen = 0.0f;
      float compositeBlue = 0.0f;
      // Only perform rgb blending calculations if the output alpha is > 0. A zero-value alpha
      // means none of the color channels actually matter, and would introduce division by 0.
      if (fabs(compositeAlpha) > FLT_EPSILON) {
        compositeRed = (((1.0f - alphaOverlay) * originalPixelAlpha * originalPixelRed) +
                        (alphaOverlay * redOverlay)) /
                       compositeAlpha;
        compositeGreen = (((1.0f - alphaOverlay) * originalPixelAlpha * originalPixelGreen) +
                          (alphaOverlay * greenOverlay)) /
                         compositeAlpha;
        compositeBlue = (((1.0f - alphaOverlay) * originalPixelAlpha * originalPixelBlue) +
                         (alphaOverlay * blueOverlay)) /
                        compositeAlpha;
      }

      imageAddress[blueOffset] = compositeBlue * kMaxColorComponentValue;
      imageAddress[greenOffset] = compositeGreen * kMaxColorComponentValue;
      imageAddress[redOffset] = compositeRed * kMaxColorComponentValue;
      imageAddress[alphaOffset] = compositeAlpha * kMaxColorComponentValue;
    }
    imageAddress += bytesPerRow / sizeof(unsigned char);
    maskAddress += maskBytesPerRow / sizeof(float);
  }

  CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
  CVPixelBufferUnlockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);
}
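
Note that the method above cannot produce transparency on its own: passing UIColor.clearColor for both colors makes alphaOverlay zero, so every composite value reduces to the original pixel and the buffer is left unchanged. To actually knock out the background, you can skip the compositing and write the mask confidence straight into the alpha channel. Here is a minimal sketch of that variant (my own adaptation of the sample method, with a made-up name; it assumes the same 32BGRA format and matching dimensions asserted above):

// Hypothetical variant: writes the mask confidence directly into the alpha
// channel, so background pixels (confidence near 0) become transparent.
+ (void)removeBackgroundWithSegmentationMask:(MLKSegmentationMask *)mask
                               toImageBuffer:(CVImageBufferRef)imageBuffer {
  NSAssert(CVPixelBufferGetPixelFormatType(imageBuffer) == kCVPixelFormatType_32BGRA,
           @"Image buffer must have 32BGRA pixel format type");
  size_t width = CVPixelBufferGetWidth(mask.buffer);
  size_t height = CVPixelBufferGetHeight(mask.buffer);

  CVPixelBufferLockBaseAddress(imageBuffer, 0);
  CVPixelBufferLockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);

  float *maskAddress = (float *)CVPixelBufferGetBaseAddress(mask.buffer);
  size_t maskBytesPerRow = CVPixelBufferGetBytesPerRow(mask.buffer);
  unsigned char *imageAddress = (unsigned char *)CVPixelBufferGetBaseAddress(imageBuffer);
  size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);

  for (size_t row = 0; row < height; ++row) {
    for (size_t col = 0; col < width; ++col) {
      // BGRA layout: alpha is the 4th byte of each 4-byte pixel.
      imageAddress[col * 4 + 3] = (unsigned char)(maskAddress[col] * 255.0f);
    }
    // Advance both pointers by their row stride, not by width * bytes-per-pixel.
    imageAddress += bytesPerRow;
    maskAddress += maskBytesPerRow / sizeof(float);
  }

  CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
  CVPixelBufferUnlockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);
}

Keep in mind that camera frames arrive fully opaque, and many display paths may treat 32BGRA buffers as opaque regardless of their alpha channel, so the transparency only becomes visible once you composite the buffer over other content (for example with Core Image or Metal).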
ios objective-c deep-learning google-mlkit semantic-segmentation
1 Answer

You haven't posted what you have tried so far, nor whether what you tried worked. That makes it hard for us to help you, because it looks as if you want someone to do the work for you.

That said, I can suggest looking at the following resources, which appear to explain how to achieve what you are after:

https://github.com/tbchen/BackgroundRemovalWithCoreMLSample
https://medium.com/macoclock/remove-the-image-background-in-swift-using-core-ml-8646ed3a1c14

Regards,
