What algorithms or approaches other than Haar cascades can be used for custom object detection?

Question

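I need to do a computer vision task that detects water bottles or soda cans. I will be given "frontal" images of bottles, soda cans, or any other random item (one at a time), and my algorithm should determine whether it is a bottle, a can, or neither.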

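Some details about the object detection scenario:

As mentioned, I will test a single object per image/video frame.
Not all water bottles are the same. The plastic, cap, or label may vary in color. Some may have no label or cap.
The same variations apply to soda cans. No crushed cans will be tested.
There may be small size variations between objects.
I can have a green (or any custom color) background.
I will apply whatever filters are needed to the image.
This will run on a Raspberry Pi.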

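Just in case, an example of each:

I have tested the OpenCV face detection algorithms a few times and I know they work well, but with this approach I would need to obtain a special Haar cascade features XML file for each custom object I want to detect.

So, the different alternatives I have in mind are:

I would like a simple algorithm, and I don't think creating a custom Haar classifier is even necessary. What would you suggest?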

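Update

I strongly considered the shape/aspect ratio approach.

However, I think I am facing some issues, since bottles come in different sizes and shapes. But this led me to think about, or settle on, the following considerations:

I am applying a threshold using the THRESH_BINARY method. (Thanks to the answers.)
I will use a white background for detection.
Soda cans are all the same size.
So a high-precision bounding box for soda cans should be enough to identify a can.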

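What I have achieved:

Thresholding really helped me. I noticed that in white-background tests I get this for cans:

And this is what I get for bottles:

So it is evident that darker areas dominate for cans. There are some cases with cans where this could turn into false negatives. For bottles, light and angle may lead to inconsistent results, but I really think this could be a shorter approach.

So now I am quite confused about how I should evaluate that dominance of dark areas. I have read that findContours leads to it, but I am quite lost on how to make use of that function. For example, in the case of soda cans it may find several contours, so I am not sure what to evaluate.

Note: I am open to testing any other algorithm or library besides OpenCV.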

Answer 1

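I see a few basic ideas here:

Check the aspect ratio of the object (to be precise, of the object's bounding rect), taken as height divided by width. For a can it is about 2-2.5; for a bottle I think it will be over 3. This is very simple, should be easy and quick to test, and I think it should have fairly good accuracy. For values in between, such as 2.75 (assuming the values I gave are correct, which most likely they are not), you can fall back to a different algorithm.
Check whether your object contains glass/transparent regions. If it does, it is definitely a bottle. Here you can read more about it.
Use the GrabCut algorithm to get the object mask (a more precise shape) and check whether the width of the shape at the top is similar to its width at the bottom. If it is, it's a can; if not, it's a bottle (bottles have a screw cap at the top).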
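The first and third ideas above can be sketched in plain Python on a binary mask stored as a list of rows. This is only a minimal sketch: the function names, the `band` parameter, and the default thresholds are hypothetical, and the thresholds simply reuse the rough 2-2.5 and 3 guesses from this answer. In practice the mask would come from a segmentation step such as GrabCut or thresholding.

```python
def bounding_box(mask):
    """Return (x, y, width, height) of the nonzero pixels in a 2D mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1


def classify_by_aspect_ratio(width, height, can_range=(2.0, 2.5), bottle_min=3.0):
    """Classify from height / width. The thresholds are rough guesses."""
    ratio = height / width
    if can_range[0] <= ratio <= can_range[1]:
        return "can"
    if ratio > bottle_min:
        return "bottle"
    return "unknown"  # e.g. 2.75: fall back to another test


def top_bottom_width_ratio(mask, band=0.15):
    """Mean object width in the top band divided by that in the bottom band.

    Close to 1.0 suggests a can; clearly below 1.0 suggests a narrow neck.
    """
    x, y, w, h = bounding_box(mask)
    n = max(1, int(h * band))  # number of rows in each band

    def mean_width(rows):
        return sum(sum(1 for v in row if v) for row in rows) / len(rows)

    return mean_width(mask[y:y + n]) / mean_width(mask[y + h - n:y + h])
```

A result of `"unknown"` from the ratio test is the case where the top/bottom width comparison (or the transparency check) would be worth running as a tie-breaker.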

Answer 2

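Since you want to recognize can vs. bottle rather than Pepsi vs. Coke, shape matching is probably the way to go, compared with Haar and with features2d matchers such as SIFT/SURF/ORB.

A unique background color will make things easier.

First, create a histogram from an image of the background only: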

int channels[] = {0,1,2}; // use all the channels
int rgb_bins = 32; // quantize to 32 colors per channel
int histSize[] = {rgb_bins, rgb_bins, rgb_bins};
float _range[] = {0,256}; // the upper bound is exclusive
const float* ranges[] = {_range, _range, _range};

cv::SparseMat bghist;
cv::calcHist(&bg_image, 1, channels, cv::noArray(), bghist, 3, histSize, ranges);

Then use calcBackProject to create a mask of what is background and what is not:

cv::MatND temp_ND;
cv::calcBackProject( &bottle_image, 1, channels, bghist, temp_ND, ranges );

cv::Mat bottle_mask, bottle_backproj;
if( feeling_lazy ){
    cv::normalize(temp_ND, bottle_backproj, 0, 255, cv::NORM_MINMAX, CV_8U);
    //a small blur here could work nicely
    threshold( bottle_backproj, bottle_mask, 0, 255, THRESH_OTSU );
    bottle_mask = cv::Scalar(255) - bottle_mask; //invert the mask
} else {
    //finding just the right value here might be better than the above method
    int magic_threshold = 64; 
    temp_ND.convertTo( bottle_backproj, CV_8U, 255.); 
    //I expect temp_ND to be CV_32F ranging from 0-1, but I might be wrong.
    threshold( bottle_backproj, bottle_mask, magic_threshold, 255, THRESH_BINARY_INV );
}
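
To make the histogram and back-projection steps above more concrete, here is a minimal pure-Python sketch of the same idea on tiny images stored as lists of (b, g, r) tuples. It is not the OpenCV implementation: the function names are hypothetical, and the default `threshold=0.25` is an assumption that roughly mirrors `magic_threshold = 64` on a 0-255 scale.

```python
def quantize(pixel, bins=32):
    """Map an 8-bit (b, g, r) pixel to its histogram bin index per channel."""
    step = 256 // bins
    return tuple(c // step for c in pixel)


def background_histogram(bg_pixels, bins=32):
    """Sparse 3D color histogram, scaled so the most common bin is 1.0."""
    counts = {}
    for p in bg_pixels:
        key = quantize(p, bins)
        counts[key] = counts.get(key, 0) + 1
    peak = max(counts.values())
    return {k: v / peak for k, v in counts.items()}


def foreground_mask(image, hist, threshold=0.25, bins=32):
    """Back-project the histogram and invert it into an object mask.

    Pixels whose color is common in the background get 0; all others get 255.
    """
    return [
        [0 if hist.get(quantize(p, bins), 0.0) >= threshold else 255 for p in row]
        for row in image
    ]
```

Colors that never appear in the background image have no entry in the sparse histogram, so they back-project to 0.0 and end up in the object mask.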

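Then, either:

Use matchTemplate with a confidence threshold to compare bottle_mask or bottle_backproj against a couple of sample bottle masks/back-projections and decide whether it is a match.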

matchTemplate(bottle_mask, bottle_template, result, CV_TM_CCORR_NORMED);
double confidence; minMaxLoc( result, NULL, &confidence);
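
For intuition, the normalized cross-correlation score can be sketched in pure Python for the special case where the mask and the template have the same size, so the result is a single number rather than a map. The helper names and the `min_confidence=0.9` default are hypothetical, chosen only for illustration.

```python
import math


def ccorr_normed(image, template):
    """Normalized cross-correlation of two equally sized 2D arrays."""
    num = sum_ii = sum_tt = 0.0
    for img_row, tpl_row in zip(image, template):
        for i, t in zip(img_row, tpl_row):
            num += i * t
            sum_ii += i * i
            sum_tt += t * t
    denom = math.sqrt(sum_ii * sum_tt)
    return num / denom if denom else 0.0


def best_match(mask, templates, min_confidence=0.9):
    """Return (label, score) for the best template, or (None, score) if too weak."""
    label, score = max(
        ((name, ccorr_normed(mask, tpl)) for name, tpl in templates.items()),
        key=lambda pair: pair[1],
    )
    return (label if score >= min_confidence else None), score
```

Keeping one or more templates per class and taking the best score is one way to apply the confidence threshold mentioned above.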

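Or use matchShapes, though I have never gotten it to work properly.

//matchShapes takes a fourth (currently unused) parameter and returns a dissimilarity: lower means more similar
double dissimilarity = matchShapes(bottle_mask, bottle_template, CV_CONTOURS_MATCH_I3, 0);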

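Or use linemod, which is hard to set up but works well for images like this, where the shape is not very complex. Aside from the linked file, I have not found any working samples of this method, so here is what I did.

First, create/train the detector with some sample images: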

//some magic numbers
std::vector<int> T_at_level;
T_at_level.push_back(4); 
T_at_level.push_back(8);

//add some padding so linemod doesn't scream at you
const int T = 32;
int width = bottle_mask.cols;
if( width % T != 0)
    width += T - width % T;

int height = bottle_mask.rows;
if( height % T != 0)
    height += T - height % T;

//in this case template_backproj is created specifically from a sample bottle_backproj
cv::Rect padded_roi( (width - template_backproj.cols)/2, (height - template_backproj.rows)/2, template_backproj.cols, template_backproj.rows);
//cv::Mat::zeros takes (rows, cols, type), i.e. height first
cv::Mat padded_backproj = cv::Mat::zeros( height, width, template_backproj.type());
template_backproj.copyTo( padded_backproj( padded_roi ) ); //plain assignment would not copy the pixels

cv::Mat padded_mask = cv::Mat::zeros( height, width, template_mask.type());
template_mask.copyTo( padded_mask( padded_roi ) );
//you might need to erode padded_mask by a few pixels.

//initialize detector
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back( cv::makePtr<cv::linemod::ColorGradient>() ); //for those that don't have a kinect
cv::Ptr<cv::linemod::Detector> new_detector = cv::makePtr<cv::linemod::Detector>(modalities, T_at_level);

//add sample images to the detector
std::vector<cv::Mat> template_images;
template_images.push_back( padded_backproj );
cv::Rect ignore_me;
const std::string class_id = "bottle";
int template_id = new_detector->addTemplate(template_images, class_id, padded_mask, &ignore_me);

Then do some matching:

std::vector<cv::Mat> sources_vec;
sources_vec.push_back( padded_backproj );
//padded_backproj doesn't need to be the same size as the trained template images, but it does need to be padded the same way.
float matching_threshold = 0.8; //a higher number makes the algorithm faster
std::vector<cv::linemod::Match> matches;
std::vector<cv::String> class_ids;

new_detector->match(sources_vec, matching_threshold, matches,class_ids);
float confidence = matches.size() > 0? matches[0].similarity : 0;

Answer 3

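As cyriel said, the aspect ratio (width/height) might be a useful measure. Here is some OpenCV Python code that finds contours (hopefully including the outline of the bottle or can) and gives you the aspect ratio and some other measurements: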

    import cv2

    # src image should have already had some contrast enhancement (such as
    # cv2.threshold) and edge finding (such as cv2.Canny)
    contours, hierarchy = cv2.findContours(src, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours:
        num_points = len(contour)
        if num_points < 5:
            # The contour has too few points to fit an ellipse. Skip it.
            continue

        # We could use area to help determine the type of object.
        # Small contours are probably false detections (not really a whole object).
        area = cv2.contourArea(contour)

        bounding_ellipse = cv2.fitEllipse(contour)
        center, radii, angle_degrees = bounding_ellipse

        # Let's define an ellipse's normal orientation to be landscape (width > height).
        # We must ensure that the ellipse's measurements match this orientation.
        if radii[0] < radii[1]:
            radii = (radii[1], radii[0])
            angle_degrees -= 90.0

        # We could use the angle to help determine the type of object.
        # A bottle or can's angle is probably approximately a multiple of 90 degrees,
        # assuming that it is at rest and not falling.

        # Calculate the aspect ratio (long axis / short axis).
        # After the normalization above this is always >= 1. For example, 2.0
        # means the object's long side is 2 times its short side.
        # A bottle is probably more elongated than a can.
        aspect_ratio = radii[0] / radii[1]

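To check for transparency, you can compare the picture against the known background, using histogram analysis or background subtraction.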
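
One possible reading of the background-subtraction idea, sketched in pure Python on grayscale images stored as lists of rows. The heuristic itself is an assumption, not something from the answer: inside the bounding box of the changed pixels, an opaque object should change nearly everything, while a transparent one leaves much of its interior looking like the background. The function name and `diff_threshold=30` are hypothetical.

```python
def transparency_score(image, background, diff_threshold=30):
    """Fraction of pixels inside the changed region's bounding box that
    still look like the background. Returns None if nothing changed."""
    changed = [
        [abs(a - b) > diff_threshold for a, b in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]
    rows = [y for y, row in enumerate(changed) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(changed[0])) if any(row[x] for row in changed)]
    box = [row[cols[0]:cols[-1] + 1] for row in changed[rows[0]:rows[-1] + 1]]
    total = sum(len(row) for row in box)
    unchanged = sum(1 for row in box for v in row if not v)
    return unchanged / total
```

Note that a bottle's neck also leaves background-colored pixels inside the bounding box, so this score mixes shape and transparency; restricting it to a region inside the object's body would be a cleaner measurement.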

The contour's moments can be used to determine its centroid (center of gravity):

    moments = cv2.moments(contour)
    m00 = moments['m00']
    m01 = moments['m01']
    m10 = moments['m10']
    centroid = (m10 / m00, m01 / m00)

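You can compare it with the center. If the object is bigger ("heavier") at one end, the centroid will be closer to that end than the center is.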
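
The centroid-versus-center comparison can be sketched in pure Python on a filled binary mask. Two caveats: this version counts pixels, whereas cv2.moments on a contour works from the polygon outline, so the numbers will differ slightly; and the function names are hypothetical.

```python
def mask_centroid(mask):
    """Centroid (x, y) of a binary mask from its raw moments m00, m10, m01."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                m00 += 1
                m10 += x
                m01 += y
    if m00 == 0:
        return None  # empty mask: the centroid is undefined
    return m10 / m00, m01 / m00


def vertical_offset(mask):
    """Centroid y minus bounding-box center y, as a fraction of object height.

    Positive means the mass sits toward the bottom (image y grows downward),
    as expected for an upright bottle with a narrow neck.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    top, bottom = rows[0], rows[-1]
    center_y = (top + bottom) / 2
    _, centroid_y = mask_centroid(mask)
    return (centroid_y - center_y) / (bottom - top + 1)
```

A symmetric shape such as a can should give an offset near zero, so a small tolerance around zero could separate the two classes.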

Answer 4

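So, my main approach to detection was based on:

Bottles are transparent and cans are opaque

The general algorithm consisted of:

Take a grayscale picture.

Apply a binary threshold.

Select a convenient ROI from it.

Obtain its color mean and even its standard deviation.

Differentiate.

The implementation basically reduces to this function (CAN and BOTTLE were defined earlier):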

int detector(int x, int y, int width, int height, int thresholdValue, CvCapture* capture) {

  Mat img;
  Rect r;
  vector<Mat> channels;
  r = Rect(x,y,width,height);

  if ( !capture ) {
        fprintf( stderr, "ERROR: capture is NULL \n" );
        getchar();
        return -1;
                   }

  img = Mat(cvQueryFrame( capture ));
  cvtColor(img,img,CV_BGR2GRAY); // frames captured by OpenCV are in BGR order
  threshold(img, img, 127, 255, THRESH_BINARY);

  // ROI
  Mat roiImage = img(r);
  split(roiImage,  channels);
  Scalar m = mean(channels[0]);
  float media = m[0];
  printf("Media: %f\n", media);

  if (media < thresholdValue) {

    return CAN;
  }

  else {
    return BOTTLE;
  }
}

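As can be seen, a THRESH_BINARY threshold was applied, and a plain white background was used. However, the main and crucial problem I faced with this whole approach and algorithm was changes in ambient lighting, even minor ones.

Sometimes I noticed that THRESH_BINARY_INV might help, but I wonder whether using certain threshold parameters or applying other filters could eliminate ambient lighting as a problem.

I really appreciate the approach of calculating the aspect ratio from the bounding box or from found contours, but I found this one to be straightforward and simple once the conditions were adjusted.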
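
The algorithm above mentions the standard deviation of the ROI, while the C++ function only uses the mean. As a minimal pure-Python sketch of both statistics on an already thresholded image stored as a list of rows: the `CAN` and `BOTTLE` values and the function names are hypothetical placeholders, and the decision rule copies the one in the function above.

```python
import math

CAN, BOTTLE = 0, 1  # hypothetical labels, standing in for the earlier definitions


def roi_stats(gray, x, y, width, height):
    """Mean and population standard deviation of a rectangular ROI."""
    values = [v for row in gray[y:y + height] for v in row[x:x + width]]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def classify_roi(binary, x, y, width, height, threshold_value):
    """Dark ROI (low mean) is treated as a can, bright ROI as a bottle."""
    mean, _ = roi_stats(binary, x, y, width, height)
    return CAN if mean < threshold_value else BOTTLE
```

On a binary image the standard deviation peaks when the ROI is half black and half white, so a high value could flag ambiguous frames instead of forcing a decision.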

Answer 5

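I would use deep learning, based on transfer learning.

The idea is this: given a highly complex, well-trained neural network that was trained on a similar classification task (on a large public dataset such as ImageNet), you can freeze most of its weights and train only the last layers. There are plenty of tutorials out there. You do not need a background in deep learning.

There is an almost out-of-the-box TensorFlow tutorial here, and here is another one based on Keras.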
