
Difference between opencv BGR2GRAY and Pillow convert functions

I'm trying to OCR an image that contains both numbers and characters, using the Tesseract library with OpenCV and C++. Before calling Tesseract, I convert the image to grayscale with OpenCV:

cvtColor(roiImg,roiImg,CV_BGR2GRAY);

This is the grayscale image I received.

The OCR results for this image weren't 100% accurate.

Then the same image was tested with the Pillow library in Python. The original image was converted to grayscale using the following method:

gray = image.convert('L')

This is the grayscale image I received with the Pillow library.

The latter grayscale image gave 100% accurate results.

When I searched the internet, I found it mentioned that both OpenCV's BGR2GRAY and Pillow's img.convert method use the same luma transform algorithm.

What is the reason for the two different OCR results?

Thanks in advance

Pillow can only read 3x8-bit pixels for color images.

Here is a quick test to see how both libraries round the values:

  • OpenCV code:

     #include <iostream>
     #include <opencv2/opencv.hpp>

     int main() {
         // Two test pixels, stored in BGR order as OpenCV expects.
         cv::Mat img(2, 1, CV_8UC3), img_gray;
         img.at<cv::Vec3b>(0, 0) = cv::Vec3b(248, 249, 249); // BGR
         img.at<cv::Vec3b>(1, 0) = cv::Vec3b(249, 248, 248); // BGR
         cv::cvtColor(img, img_gray, cv::COLOR_BGR2GRAY);
         std::cout << "img:\n" << img << std::endl;
         std::cout << "img_gray:\n" << img_gray << std::endl;

         // Weighted sums for the same pixels, written in RGB order.
         float val1 = 249*0.299f + 249*0.587f + 248*0.114f; // RGB
         float val2 = 248*0.299f + 248*0.587f + 249*0.114f; // RGB
         std::cout << "val1=" << val1 << std::endl;
         std::cout << "val2=" << val2 << std::endl;
         return 0;
     }

    Output:

     img:
     [248, 249, 249;
      249, 248, 248]
     img_gray:
     [249;
      248]
     val1=248.886
     val2=248.114

  • Python code:

     import numpy as np
     from PIL import Image

     # The same two pixels, this time in RGB order as Pillow expects.
     rgbArray = np.zeros((2, 1, 3), 'uint8')
     rgbArray[0, 0, 0] = 249  # R
     rgbArray[0, 0, 1] = 249  # G
     rgbArray[0, 0, 2] = 248  # B
     rgbArray[1, 0, 0] = 248  # R
     rgbArray[1, 0, 1] = 248  # G
     rgbArray[1, 0, 2] = 249  # B

     img = Image.fromarray(rgbArray)
     imgGray = img.convert('L')

     print("rgbArray:\n", rgbArray)
     print("imgGray:\n", np.asarray(imgGray))
     print("np.asarray(imgGray).dtype: ", np.asarray(imgGray).dtype)

    Output:

     rgbArray:
     [[[249 249 248]]
      [[248 248 249]]]
     imgGray:
     [[248]
      [248]]
     np.asarray(imgGray).dtype: uint8
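
The two outputs above are consistent with OpenCV rounding the weighted sum to the nearest integer (248.886 becomes 249) and Pillow truncating it (248.886 becomes 248). The sketch below only reproduces that arithmetic in plain Python; it is not either library's actual implementation. The helper names `weighted_sum`, `gray_rounded` and `gray_truncated` are made up for this illustration, and other Pillow or OpenCV versions may round differently.

```python
# Luma weights 0.299 / 0.587 / 0.114 scaled by 1000 so that the
# arithmetic stays in exact integers (hypothetical helper names).
W_R, W_G, W_B = 299, 587, 114


def weighted_sum(rgb):
    """Return 1000 * (0.299*R + 0.587*G + 0.114*B) as an integer."""
    r, g, b = rgb
    return W_R * r + W_G * g + W_B * b


def gray_rounded(rgb):
    """Round to the nearest integer (assumed to mimic the OpenCV output)."""
    return (weighted_sum(rgb) + 500) // 1000


def gray_truncated(rgb):
    """Drop the fractional part (assumed to mimic the Pillow output)."""
    return weighted_sum(rgb) // 1000


# The two test pixels from the answer, in RGB order.
pixels = [(249, 249, 248), (248, 248, 249)]

rounded = [gray_rounded(p) for p in pixels]
truncated = [gray_truncated(p) for p in pixels]

print("rounded:  ", rounded)
print("truncated:", truncated)
```

A difference of one gray level looks small, but if the OCR pipeline thresholds the image, pixels near the threshold can end up on different sides of it, which may be enough to change the recognition result.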
