Why does cv2.imread return 255 for every RGB value when reading an image that Wand created from a PDF file?
I am trying to identify blobs of text in a PDF file. For example, an academic paper has different sections, and I want to identify the title as one section, the authors and addresses as another, and the abstract as a third.
One solution I am considering is to use cv2. I first convert the PDF to an image with Wand, using the following code:
from wand.color import Color
from wand.image import Image as Img
with Img(filename='./files/paper.pdf', resolution=300) as img:
    img.background_color = Color("white")
    img.alpha_channel = 'remove'
    img.save(filename='test_file.jpg')
However, when I try to open the JPG file in cv2 with:
import cv2
image = cv2.imread('test_file.jpg')
print(image)
the printout shows 255 for every channel of every pixel it lists:
array([[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
...,
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
...,
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]]], dtype=uint8)
And then, when I try to use cv2.dnn.blobFromImage(), it does not give the right result.
What's going on? Is it because the PDF was not converted correctly into an image? But I tried
import pytesseract
from PIL import Image
text = pytesseract.image_to_string(Image.open('test_file.jpg'))
and it returned all of the text to me...
See all the dots? The printout shows only a few pixels of the image. Assuming you have a PDF text document with a white background, it is safe to assume that all the edge pixels are white. The printout will typically show you only the corners of the image.
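One way to confirm this without opening a window is to summarize the whole array instead of printing it. The sketch below is only an illustration: `summarize_pixels` is a hypothetical helper name, and the page is a synthetic numpy array standing in for the result of cv2.imread('test_file.jpg'), assumed to be a 3-channel uint8 image.

```python
import numpy as np


def summarize_pixels(image):
    """Report value range and how many pixels are not pure white.

    Assumes a 3-channel (height, width, 3) uint8 array, which is what
    cv2.imread returns for a color JPG.
    """
    image = np.asarray(image)
    # A pixel counts as non-white if any of its channels is below 255.
    non_white = np.any(image < 255, axis=-1)
    return {
        "shape": image.shape,
        "min": int(image.min()),
        "max": int(image.max()),
        "non_white_pixels": int(non_white.sum()),
    }


# Synthetic stand-in for a page: white background with a dark block of
# "text" in the middle, far away from the corners that print() shows.
page = np.full((200, 150, 3), 255, dtype=np.uint8)
page[90:100, 60:80] = 0

print(repr(page)[:80])          # truncated view: only white corner pixels
summary = summarize_pixels(page)
print(summary)                  # the minimum and the count reveal the dark block
```

If the minimum is below 255 or the non-white count is above zero for the real image, the conversion worked and the printout was simply abbreviated.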
To show the image, use:
image = cv2.imread('test_file.jpg')
cv2.imshow('Image', image)
cv2.waitKey(0)
This displays the image in a window and waits for you to press a key before closing it.
Wand images are not numpy arrays, so they cannot simply be opened in cv2. In Wand 0.5.3, there will be a way to import and export Wand images to and from numpy arrays.
In Wand 0.5.2, you can use import_pixels to convert a numpy array to a Wand image. In Wand 0.5.3, you can export a Wand image to a numpy array that you should be able to use in cv2:
import numpy as np
from wand.image import Image
with Image(filename='rose.png') as img:
    matrix = np.array(img)
matrix will be a numpy array that you should then be able to use in OpenCV.
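One detail to watch when handing that array to OpenCV is channel order: OpenCV functions expect BGR, whereas the array exported from Wand is assumed here to be laid out as (height, width, channels) in RGB or RGBA order. The sketch below shows one possible conversion under that assumption. `rgb_to_bgr` and `load_as_bgr` are hypothetical helper names, not part of Wand or OpenCV, and it is worth checking the shape and dtype of your own array before relying on it.

```python
import numpy as np


def rgb_to_bgr(matrix):
    """Return a BGR copy of an (H, W, 3) RGB or (H, W, 4) RGBA array."""
    matrix = np.asarray(matrix)
    if matrix.ndim != 3 or matrix.shape[2] not in (3, 4):
        raise ValueError("expected an (H, W, 3) or (H, W, 4) array")
    rgb = matrix[:, :, :3]            # drop the alpha channel if present
    # Reverse the channel axis (R, G, B -> B, G, R) and make the result
    # contiguous, which OpenCV functions generally expect.
    return np.ascontiguousarray(rgb[:, :, ::-1])


def load_as_bgr(path):
    """Read an image with Wand and return it in OpenCV's channel order.

    Assumption: np.array(img) yields RGB(A) channel order.
    """
    from wand.image import Image      # lazy import; needs Wand and ImageMagick
    with Image(filename=path) as img:
        return rgb_to_bgr(np.array(img))


# Self-check that needs only numpy: a single pure-red RGB pixel.
red_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)
red_bgr = rgb_to_bgr(red_rgb)
print(red_bgr.tolist())
```

With Wand installed, something like image = load_as_bgr('rose.png') should give an array that can be passed to cv2.imshow or cv2.dnn.blobFromImage in the same way as the result of cv2.imread.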