How to recognize histograms with a specific shape in OpenCV / Python

Question

I want to segment images (from magazines) into text and image parts. I have several histograms for several ROIs in my picture. I use OpenCV with Python (cv2).

I want to recognize histograms that look like this

http://matplotlib.sourceforge.net/users/image_tutorial-6.png

as it is a typical shape for a text region. How can I do that?

Edit: Thank you for your help so far.

I compared the histograms I got from my ROIs to a sample histogram I provided:

hist = cv2.calcHist(roi,[0,1], None, [180,256],ranges)
compareValue = cv2.compareHist(hist, samplehist, cv.CV_COMP_CORREL)
print "ROI: {0}, compareValue: {1}".format(i,compareValue)

Assuming ROI 0, 1, 4 and 5 are text regions and ROI is an image region, I get output like this:

ROI: 0, compareValue: 1.0
ROI: 1, compareValue: -0.000195522081574 <--- wrongly classified
ROI: 2, compareValue: 0.0612670248952
ROI: 3, compareValue: -0.000517370176887
ROI: 4, compareValue: 1.0
ROI: 5, compareValue: 1.0

What can I do to avoid wrong classification? For some images, the misclassification rate is about 30%, which is way too high.

(I also tried CV_COMP_CHISQR, CV_COMP_INTERSECT, CV_COMP_BHATTACHARYYA and (hist*samplehist).sum(), but they also give wrong compareValues.)

Answer 1

(See the EDIT at the end in case I misunderstood the question):

If you are looking to draw the histograms, I submitted a Python sample to OpenCV, and you can get it from here:

http://code.opencv.org/projects/opencv/repository/entry/trunk/opencv/samples/python2/hist.py

It is used to draw two kinds of histograms. The first one is applicable to both color and grayscale images, as shown here: http://opencvpython.blogspot.in/2012/04/drawing-histogram-in-opencv-python.html

The second one is exclusively for grayscale images, which is the same kind as the image in your question.

I will show the second one and a modification of it.

Consider a full image as below:

[image]

We need to draw a histogram like the one you have shown. Check the code below:

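import cv2
import numpy as np

img = cv2.imread('messi5.jpg')
mask = cv2.imread('mask.png', 0)              # load the mask as grayscale
ret, mask = cv2.threshold(mask, 127, 255, 0)  # binarize: white = use, black = ignore

def hist_lines(im, mask):
    # Canvas for the plot: 300 rows, one column per gray level
    h = np.zeros((300, 256, 3))
    if len(im.shape) != 2:
        print("hist_lines applicable only for grayscale images")
        # so convert the image to grayscale for representation
        im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    # The upper bound of the range is exclusive, so use 256 to include value 255
    hist_item = cv2.calcHist([im], [0], mask, [256], [0, 256])
    cv2.normalize(hist_item, hist_item, 0, 255, cv2.NORM_MINMAX)
    # calcHist returns shape (256, 1); flatten it so each y is a scalar
    hist = np.int32(np.around(hist_item)).flatten()
    for x, y in enumerate(hist):
        cv2.line(h, (x, 0), (x, int(y)), (255, 255, 255))
    y = np.flipud(h)  # flip so that the bars grow upwards
    return y

histogram = hist_lines(img, None)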

And below is the histogram we got. Remember, it is the histogram of the full image. For that, we passed None as the mask.

[image]

Now I want to find the histogram of some part of the image. The OpenCV histogram function has a mask facility for that. For a normal histogram, you should set it to None. Otherwise you have to specify the mask.

The mask is an 8-bit image, where white denotes that a region should be used for the histogram calculation, and black means it should not.

So I used a mask like the one below (created using Paint; you have to create your own mask for your purposes).

[image]

I changed the last line of code as below:

histogram = hist_lines(img,mask)

Now see the difference below:

[image]

(Remember, the values are normalized to the range 0 to 255, so the values shown are not actual pixel counts. Change the normalization as you like.)

EDIT:

I think I misunderstood your question. You need to compare histograms, right?

If that is what you wanted, you can use the cv2.compareHist function.

There is an official tutorial about this in C++. You can find its corresponding Python code here.

Answer 2

You can use a simple correlation metric.

Make sure that the histogram you compute and your reference are normalized (i.e. represent probabilities).
For each histogram, compute (given that myRef and myHist are numpy arrays):
metric = (myRef * myHist).sum()
This metric is a measure of how much the histogram looks like your reference.
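The steps above can be sketched in plain numpy. The function names and the tiny 4-bin histograms are only illustrative assumptions; real histograms would have 256 bins.

```python
import numpy as np

def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (probabilities)."""
    hist = np.asarray(hist, dtype=np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist

def overlap_metric(my_ref, my_hist):
    """Sum of element-wise products of the two normalized histograms."""
    return float((normalize(my_ref) * normalize(my_hist)).sum())

# Toy 4-bin histograms: a few dark pixels, mostly bright ones
ref = [1, 0, 0, 9]
similar = [2, 0, 0, 8]    # same two-peak shape
different = [3, 3, 2, 2]  # spread over all bins

metric_similar = overlap_metric(ref, similar)      # 0.1*0.2 + 0.9*0.8 = 0.74
metric_different = overlap_metric(ref, different)  # 0.1*0.3 + 0.9*0.2 = 0.21
```

A larger value means the histogram puts its mass in the same bins as the reference. Note that this raw product is not bounded to a fixed scale across different references, so scores are best compared against the same reference.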

2 answers

Answer 1
9 2012-06-22 20:52:16

Answer 2
3 2012-06-22 16:29:46