
Optical Braille recognition using OpenCV

I am trying to recognize Braille characters in a document, with the goal of converting a Braille document into plain text. I am using OpenCV with Java for the image processing.

First, I imported an image of a Braille document:

Image of the original Braille document

Then I applied some image processing to binarize the original image. I have read that the important steps are:

  • Convert the image to grayscale
  • Reduce the noise
  • Enhance the edge contrast
  • Binarize the image

Here is the code that I used:

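import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// The class name is illustrative; the original post shows only main()
public class BrailleBinarization {

    public static void main(String args[]) {
        // Load the OpenCV native library before calling any OpenCV function
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat imgGrayscale = new Mat();
        // Read the source image as a 3-channel BGR image
        Mat image = Imgcodecs.imread("C:/Users/original_braille.jpg", Imgcodecs.IMREAD_COLOR);

        // Convert the image to grayscale
        Imgproc.cvtColor(image, imgGrayscale, Imgproc.COLOR_BGR2GRAY);

        // Reduce the noise, then binarize with an inverted adaptive mean threshold
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.adaptiveThreshold(imgGrayscale, imgGrayscale, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY_INV, 5, 4);

        // Remove isolated speckles and binarize again with Otsu's method
        Imgproc.medianBlur(imgGrayscale, imgGrayscale, 3);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);

        // Smooth the dot edges and binarize once more
        Imgproc.GaussianBlur(imgGrayscale, imgGrayscale, new Size(3, 3), 0);
        Imgproc.threshold(imgGrayscale, imgGrayscale, 0, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);

        Imgcodecs.imwrite("C:/Users/Jean-Baptiste/Desktop/Reconnaissance_de_formes/result.jpg", imgGrayscale);
    }
}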

I obtained the following result for this step:

Binarized image

I think the quality of this image could be improved to get better results, but I am not experienced with the different image processing techniques. Can I improve the quality of my filters?

After that, I would like to segment the image in order to detect the individual characters of the document, so that each one can be converted into text.
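
For the final conversion step, once a cell has been isolated and its raised dots detected, the mapping to text is a table lookup. The following is a minimal sketch, assuming the standard 6-dot cell numbering (1-2-3 down the left column, 4-5-6 down the right) and the basic letters a to z only. The names `LETTER_DOTS`, `decode_cell` and `decode_cells` are hypothetical, and numbers, punctuation and contractions are not handled.

```python
# Dot numbering in a standard 6-dot Braille cell:
#   1 4
#   2 5
#   3 6
# Basic letters a-z written as the digits of their raised dots.
LETTER_DOTS = {
    "a": "1", "b": "12", "c": "14", "d": "145", "e": "15",
    "f": "124", "g": "1245", "h": "125", "i": "24", "j": "245",
    "k": "13", "l": "123", "m": "134", "n": "1345", "o": "135",
    "p": "1234", "q": "12345", "r": "1235", "s": "234", "t": "2345",
    "u": "136", "v": "1236", "w": "2456", "x": "1346", "y": "13456",
    "z": "1356",
}

# Reverse lookup: set of raised dots -> letter
DOTS_TO_LETTER = {
    frozenset(int(c) for c in dots): letter
    for letter, dots in LETTER_DOTS.items()
}


def decode_cell(cell):
    """Decode one cell given as 3 rows x 2 columns (truthy = raised dot).

    Returns a space for an empty cell and "?" for an unknown pattern.
    """
    dots = set()
    for row in range(3):
        for col in range(2):
            if cell[row][col]:
                # Left column holds dots 1-3, right column holds dots 4-6
                dots.add(row + 1 + 3 * col)
    if not dots:
        return " "
    return DOTS_TO_LETTER.get(frozenset(dots), "?")


def decode_cells(cells):
    """Decode a sequence of cells into a string."""
    return "".join(decode_cell(cell) for cell in cells)
```

How each cell is turned into a 3 x 2 grid (for example by checking whether a connected component falls into each of the six expected dot positions) depends on the segmentation and is left open here.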

For instance, I have drawn the separation lines of the document manually:

Separation lines

But I did not find a solution for this step. Is it possible to do the same with OpenCV?

Here is a small script that finds the horizontal lines in your image. It is in Python, since I don't have a Java version of OpenCV installed, but I think you can get the idea of the algorithm anyway.

Finding the vertical lines is not as easy, because the spacing between the dots depends on which letters follow each other. You could try a template matching algorithm with some common letters. Since at this point you know the height of the letters, it shouldn't be too hard.

Of course, this whole approach assumes that the document is not rotated.

import numpy as np
import cv2

# This is just your code translated to Python
img      = cv2.imread('L1ZzA.jpg')
gray     = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur     = cv2.GaussianBlur(gray,(3,3),0)
thres    = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,5,4)
blur2    = cv2.medianBlur(thres,3)
ret2,th2 = cv2.threshold(blur2,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
blur3    = cv2.GaussianBlur(th2,(3,3),0)
ret3,th3 = cv2.threshold(blur3,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

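# Find connected components (the dots) and extract their mean height and width.
# 255-th3 inverts the image so that the dots become the white foreground.
# connectivity and ltype are passed by keyword: positionally, the 2nd and 3rd
# arguments of the Python binding are the output arrays, not these options.
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(255-th3, connectivity=8, ltype=cv2.CV_32S)
# Row 0 of stats describes the background component, so it is skipped
mean_h = np.mean(stats[1:,cv2.CC_STAT_HEIGHT])
mean_w = np.mean(stats[1:,cv2.CC_STAT_WIDTH])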

# Find empty rows, defined as having fewer than mean_h/2 dot pixels.
# count_nonzero counts pixels; np.sum would add up the pixel values (255 each).
empty_rows = []
for i in range(th3.shape[0]):
  if np.count_nonzero(255-th3[i,:]) < mean_h/2.0:
    empty_rows.append(i)

# Group consecutive empty rows: a difference greater than 1 between two
# successive empty rows starts a new group
rows   = np.asarray(empty_rows)
groups = np.split(rows, np.where(np.diff(rows) > 1)[0] + 1)

# Find the mean row of each group, and color that line in green in the original image
for group in groups:
  if group.size > 0:
    img[int(np.mean(group)),:] = (0, 255, 0)

# Display the image with the green rows
cv2.imshow('test',img)
cv2.waitKey(0)
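
The complement of the empty rows gives the bands that actually contain dots, which is what you would crop before looking for columns. The following is a minimal sketch of that step using NumPy only; the function name `dot_row_bands` and the `min_pixels` parameter are hypothetical, and the input is assumed to be a binary image in which dot pixels are non-zero (for instance `255-th3` from the script above).

```python
import numpy as np


def dot_row_bands(binary, min_pixels=1):
    """Return (top, bottom) row ranges that contain dots, bottom exclusive.

    binary: 2-D array in which dot pixels are non-zero.
    min_pixels: rows with fewer dot pixels than this count as empty
                (an assumed noise tolerance, to be tuned per image).
    """
    filled = np.count_nonzero(binary, axis=1) >= min_pixels
    # Pad with False so that every band has a rising and a falling edge
    padded = np.concatenate(([False], filled, [False]))
    edges = np.diff(padded.astype(np.int8))
    tops = np.where(edges == 1)[0]      # first row of each band
    bottoms = np.where(edges == -1)[0]  # first empty row after each band
    return list(zip(tops.tolist(), bottoms.tolist()))
```

Depending on the resolution, each band is likely to be a single row of dots rather than a full line of text, since a Braille cell is three dots high. Grouping bands three at a time, or merging bands separated by a small gap, would then be needed to recover whole text lines.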
