I am using Python 2.7.12 and OpenCV 3.0.0-rc1.
I am working on a text recognition project.
This is what I have so far. Original image after findContours, line 34
As you can see, the image contains a lot of 'boxes', and the text is inside them.
My approach is to find these boxes, cut them out into separate images, and feed them to Tesseract OCR.
The program treats the whole image as one contour. How can I find the smaller ones inside it?
Or, if you have an alternative approach, suggestions are welcome.
Code:
import cv2

def threshold(im, method):
    # make it grayscale (cv2.imread loads images in BGR order)
    im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    if method == 'fixed':
        # cv2.threshold returns (retval, image); keep only the image
        _, threshed_im = cv2.threshold(im_gray, 128, 255, cv2.THRESH_BINARY)
    elif method == 'mean':
        threshed_im = cv2.adaptiveThreshold(im_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 5, 10)
    elif method == 'gaussian':
        threshed_im = cv2.adaptiveThreshold(im_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, 7)
    else:
        return None
    return threshed_im

image = cv2.imread('demo4.jpg')
# threshold it
thresh = threshold(image, 'mean')
# find contours
_, cnts, hierarchy = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(len(cnts))
# draw the contours on the original image
cv2.drawContours(image, cnts, -1, (0, 255, 0), 20)
cv2.imshow('contours', image)
cv2.waitKey()
# draw the contours on the thresholded image
cv2.drawContours(thresh, cnts, -1, (0, 255, 0), 20)
cv2.imshow('contours', thresh)
cv2.waitKey()
You are only getting the outermost contour because you specified cv2.RETR_EXTERNAL. To get all the contours in the image, you should call the function like this:
cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
Take a look at the OpenCV documentation for findContours to see how the retrieval modes work.