I am trying to remove the ruled lines and a background smiley face from multiple notebook pages before performing text detection and recognition on the handwritten text.
An earlier thread offers helpful hints, but my problem is different in several respects.
I'm thinking of using OpenCV for this task, but I'm open to using ImageMagick or command-line GIMP so long as I can process the entire batch at once. Since I have never used any of these tools before, any advice would be welcome. Thank you.
Here's a simple approach using color thresholding with cv2.inRange(), under the assumption that the text is blue.
We begin by converting the image to HSV and creating a mask that isolates the characters:
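import cv2
import numpy as np

image = cv2.imread('1.png')
result = image.copy()  # keep an untouched BGR copy for the final masking step
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# HSV bounds: keep pixels with hue >= 21 and value <= 209
lower = np.array([21, 0, 0])
upper = np.array([179, 255, 209])
mask = cv2.inRange(image, lower, upper)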
Next we apply a morphological close to clean up small noise, such as pinholes and gaps, in the mask:
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2,2))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=1)
We now have the desired text outlines, so we can isolate the characters by whitening every pixel of the original image that lies outside the mask:
result[close==0] = (255,255,255)
Finally, to prepare the image for OCR with Tesseract, we turn the characters black:
retouch_mask = (result <= [250.,250.,250.]).all(axis=2)
result[retouch_mask] = [0,0,0]
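The retouch step relies on NumPy broadcasting: the comparison is applied per channel, and .all(axis=2) keeps only pixels where every channel is at or below 250. A tiny sketch on a made-up 2x2 image (the pixel values are assumptions for illustration) shows which pixels get blackened:

```python
import numpy as np

# Made-up 2x2 BGR image used only to illustrate the comparison.
demo = np.array([
    [[255, 255, 255], [180, 90, 60]],    # pure white, blue-ish ink
    [[251, 250, 250], [250, 250, 250]],  # one channel above 250, exactly at the limit
], dtype=np.uint8)

# True only where all three channels are <= 250.
dark = (demo <= [250, 250, 250]).all(axis=2)

# Paint the selected pixels black; the others are left untouched.
demo[dark] = [0, 0, 0]
print(dark)
```

Note that a pixel with even one channel above 250 is left unchanged, so near-white pixels survive the retouch.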
Full code
import numpy as np
import cv2

# Load the page and keep an untouched BGR copy for the final masking step
image = cv2.imread('1.png')
result = image.copy()

# Threshold in HSV to isolate the (assumed blue) handwriting
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([21, 0, 0])
upper = np.array([179, 255, 209])
mask = cv2.inRange(image, lower, upper)

# Morphological close to clean up small noise in the mask
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2, 2))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=1)

# Whiten everything outside the mask
result[close == 0] = (255, 255, 255)
cv2.imshow('cleaned', result)

# Turn the remaining characters black for OCR
retouch_mask = (result <= [250., 250., 250.]).all(axis=2)
result[retouch_mask] = [0, 0, 0]

cv2.imshow('mask', mask)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.waitKey()