
Optical Braille recognition using OpenCV

I am trying to recognize Braille characters in a document, with the goal of converting a Braille document into plain text. I am using OpenCV with Java for the image processing.

First, I loaded an image of a Braille document:

[Image: original Braille document]

Then I applied some image processing to binarize the original image. I have read that the important steps are:

  • Convert the image into gray levels
  • Reduce the noise
  • Enhance the edge contrast
  • Binarize the image

Here is the code that I used:

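import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// The class name is arbitrary
public class BrailleBinarization {

    public static void main(String[] args) {
        // Load the OpenCV native library before any OpenCV call
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat imgGrayscale = new Mat();
        Mat image = Imgcodecs.imread("C:/Users/original_braille.jpg", 1);

        // Convert the image into gray levels
        Imgproc.cvtColor(image, imgGrayscale, Imgproc.COLOR_BGR2GRAY);

        // Reduce the noise, then binarize locally (inverted: pixels darker than their neighbourhood become white)
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.adaptiveThreshold(imgGrayscale, imgGrayscale, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY_INV, 5, 4);

        // Remove isolated specks, then binarize again with Otsu's threshold
        Imgproc.medianBlur(imgGrayscale, imgGrayscale, 3);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_OTSU);

        // Smooth and binarize once more
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_OTSU);

        Imgcodecs.imwrite("C:/Users/Jean-Baptiste/Desktop/Reconnaissance_de_formes/result.jpg", imgGrayscale);
    }
}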

I obtained the following result for this step:

[Image: binarized image]

I think the quality of this image could be improved for better results, but I'm not experienced with the different image processing techniques. How can I improve my filters?

After that, I would like to segment the image in order to detect the individual characters of the document, so that I can convert them into text.

For instance, I have drawn the separation lines manually:

[Image: separation lines]

But I haven't found a solution for this step. Is it possible to do the same with OpenCV?

Here is a small script that finds the horizontal separation lines in your image. It's in Python because I don't have a Java version of OpenCV installed, but I think you can get the idea of the algorithm anyway.

Finding the vertical lines is harder, because the spacing between the dots depends on which letters follow each other. You could try template matching with some common letters. Since at this point you already know the height of the letters, it shouldn't be too hard.
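To give an idea of what template matching means here, below is a rough sketch that is separate from the row-finding script further down. It slides a small letter image across one band of text and scores each horizontal offset by the sum of absolute differences. OpenCV offers cv2.matchTemplate for this; the sketch uses plain NumPy on toy data so it runs without an image. The function name match_scores and the toy "letter" are made up for illustration.

```python
import numpy as np

def match_scores(band, template):
    """Slide `template` left to right across `band` (same height) and return,
    for each x offset, the sum of absolute pixel differences (0 = perfect match)."""
    band = np.asarray(band, dtype=np.int32)
    template = np.asarray(template, dtype=np.int32)
    h, w = template.shape
    if band.shape[0] != h or band.shape[1] < w:
        raise ValueError("template must be as tall as the band and no wider")
    n_offsets = band.shape[1] - w + 1
    return np.array([np.abs(band[:, x:x + w] - template).sum()
                     for x in range(n_offsets)])

# Toy data (hypothetical): a 5x4 "letter" pasted into an empty 5x20 band at x = 7
letter = np.zeros((5, 4), dtype=np.uint8)
letter[0, 0] = letter[0, 3] = letter[2, 0] = 255
band = np.zeros((5, 20), dtype=np.uint8)
band[:, 7:11] = letter

scores = match_scores(band, letter)
best_x = int(np.argmin(scores))  # offset with the lowest difference
print(best_x, scores[best_x])
```

On a real page the band would be one text line cut out between two separation rows, and the template a crop of a known letter at the same height; a threshold on the score would then be needed to reject poor matches.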

Of course, this whole approach assumes that the document is not rotated.
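If the page is slightly rotated, one possible way to estimate the angle first is a projection-profile search: try a range of small rotations and keep the one for which the dot centres pile up most sharply into rows. This is only a sketch under my own assumptions (dot centres already extracted, for instance from connected-component centroids, and a skew within about ±10 degrees). The function name, the parameters and the toy grid are invented for illustration.

```python
import numpy as np

def estimate_deskew_angle(points, max_angle=10.0, step=0.25, bin_size=1.0):
    """Return the rotation (in degrees) that best aligns the points into rows.

    points: (N, 2) array of (x, y) dot centres.
    Each candidate angle is scored by rotating the points, histogramming the
    rotated y coordinates and summing the squared bin counts; the score is
    highest when many points share the same row.
    """
    pts = np.asarray(points, dtype=float)
    angles = np.arange(-max_angle, max_angle + step / 2, step)
    best_angle, best_score = 0.0, -1.0
    for a in angles:
        t = np.deg2rad(a)
        # y coordinate of each point after rotating it by angle a
        y_rot = pts[:, 0] * np.sin(t) + pts[:, 1] * np.cos(t)
        bins = np.round(y_rot / bin_size).astype(int)
        counts = np.bincount(bins - bins.min())
        score = float(np.sum(counts.astype(float) ** 2))
        if score > best_score:
            best_angle, best_score = float(a), score
    return best_angle

# Toy data (hypothetical): a 6 x 40 grid of dot centres, skewed by +3 degrees
xs, ys = np.meshgrid(np.arange(40) * 10.0, np.arange(6) * 30.0)
grid = np.column_stack([xs.ravel(), ys.ravel()])
phi = np.deg2rad(3.0)
rot = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
skewed = grid @ rot.T

angle = estimate_deskew_angle(skewed)
print(angle)
```

The returned value is the correction to apply, so on the toy grid it should come out close to -3 degrees. The image could then be rotated by that angle before looking for empty rows.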

import numpy as np
import cv2

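# This is a port of your preprocessing to Python. Note that it uses THRESH_BINARY
# rather than THRESH_BINARY_INV, which is why the later steps work on 255-th3.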
img      = cv2.imread('L1ZzA.jpg')
gray     = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur     = cv2.GaussianBlur(gray,(3,3),0)
thres    = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,5,4)
blur2    = cv2.medianBlur(thres,3)
ret2,th2 = cv2.threshold(blur2,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
blur3    = cv2.GaussianBlur(th2,(3,3),0)
ret3,th3 = cv2.threshold(blur3,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

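# Find connected components and extract their mean height and width.
# connectivity and ltype are passed by keyword: positionally, the 2nd and 3rd
# parameters of the Python binding are the output arrays (labels, stats).
output = cv2.connectedComponentsWithStats(255-th3, connectivity=8, ltype=cv2.CV_32S)
stats  = output[2][1:]   # skip label 0, which is the background
mean_h = np.mean(stats[:,cv2.CC_STAT_HEIGHT])
mean_w = np.mean(stats[:,cv2.CC_STAT_WIDTH])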

# Find empty rows, defined as having fewer than mean_h/2 foreground pixels
# (count the pixels rather than summing their 0/255 values)
empty_rows = []
for i in range(th3.shape[0]):
  if np.count_nonzero(255-th3[i,:]) < mean_h/2.0:
    empty_rows.append(i)

# Group rows by labels
d = np.ediff1d(empty_rows, to_begin=1)

good_rows   = []
good_labels = []
label       = 0

# 1: assign labels to each row
# consecutive empty rows (diff == 1) share a label; a jump (diff > 1) starts a new one
for i in range(len(empty_rows)):
  if d[i] > 1:
    label += 1
  good_labels.append(label)
  good_rows.append(empty_rows[i])

# 2: find the mean row value associated with each label, and color that line in green in the original image
if good_rows:
  for i in range(label+1):
    frow = np.mean(np.asarray(good_rows)[np.asarray(good_labels) == i])
    img[int(frow),:,1] = 255

# Display the image with the green rows
cv2.imshow('test',img)
cv2.waitKey(0)
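The grouping of empty rows can also be written without an explicit labelling loop. The sketch below is one possible NumPy-only version: it finds each run of consecutive empty rows and returns the middle row of the run. The min_run parameter is my own addition, on the assumption that the gaps between text lines are taller than the gaps between the dot rows inside a character, so that short runs can be dropped. The function names and the toy array are hypothetical.

```python
import numpy as np

def empty_row_runs(is_empty):
    """Return (start, end) pairs, end inclusive, one per run of consecutive True values."""
    is_empty = np.asarray(is_empty, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate(([False], is_empty, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return list(zip(starts.tolist(), ends.tolist()))

def separator_rows(binary, min_run=1):
    """binary: 2D array where nonzero pixels belong to dots.
    Returns the middle row of every run of at least `min_run` empty rows."""
    is_empty = ~np.any(np.asarray(binary) != 0, axis=1)
    return [(s + e) // 2
            for s, e in empty_row_runs(is_empty)
            if e - s + 1 >= min_run]

# Toy data (hypothetical): 10 rows, with dots on rows 3, 4 and 7
toy = np.zeros((10, 5), dtype=np.uint8)
toy[3:5, 1] = 255
toy[7, 2] = 255

print(separator_rows(toy))             # middle of every empty run
print(separator_rows(toy, min_run=3))  # only the runs at least 3 rows tall
```

With the script above, the input would be 255-th3, and each returned row index could be drawn in green in the same way.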
