How do I isolate or crop only the handwritten text in an image using OpenCV and Python?
I have tried to use cv2.findContours, but because of the noise (the background and dirt on the paper) I can't isolate just the paper.
How do I do this?
To smooth noisy images, a typical approach is to apply some type of blurring filter. For instance, cv2.GaussianBlur(), cv2.medianBlur(), or cv2.bilateralFilter() can be used to reduce noise; the median filter is the usual choice for salt-and-pepper noise. After blurring, we can threshold to obtain a binary image and then perform morphological operations. From there, we can find contours and filter them using aspect ratio or contour area. To crop the ROI, we can use NumPy slicing.
[Image: detected text]
[Image: extracted ROI]
Code
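import cv2

# Load the image, convert to grayscale, and median-blur to suppress speckle noise
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.medianBlur(gray, 5)

# Inverted adaptive threshold: ink becomes white on a black background
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 8)

# Dilate so the individual strokes merge into one connected blob
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
dilate = cv2.dilate(thresh, kernel, iterations=6)

# Find external contours (the return value differs between OpenCV versions)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)

# Crop the bounding box of the largest contour using NumPy slicing
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    cv2.imwrite('ROI.png', ROI)
    break

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('ROI', ROI)
cv2.waitKey()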
Apply a morphological close (MORPH_CLOSE). Here you should experiment with the kernel (most likely a 3x3 ellipse) and with the number of iterations (5-10 is usually fine).
kernel = cv2.getStructuringElement(shape=cv2.MORPH_ELLIPSE, ksize=(3, 3))
image = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel, iterations=7)
stats will hold bounding boxes either for each whole word or, if you omit step #2, for each group of connected characters. PS: Let me know if you need a full code example.