Pass PIL Image to google cloud vision without saving and reading
UPDATE BELOW

Is there a way to pass a PIL Image to google cloud vision?
I tried to use io.Bytes, io.String and Image.tobytes() but I always get:
Traceback (most recent call last):
"C:\Users\...\vision_api.py", line 20, in get_text
image = vision.Image(content)
File "C:\...\venv\lib\site-packages\proto\message.py", line 494, in __init__
raise TypeError(
TypeError: Invalid constructor input for Image:b'Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x80Ma\x81La\x81Ma\x81Ma\x81Ma\x80Ma\x81Ma\x81Ma\x81Ma\x8 ...
or this if I pass the PIL-Image directly:
TypeError: Invalid constructor input for Image: <PIL.Image.Image image mode=RGB size=480x300 at 0x1D707131DC0>
This is my code:
image = Image.open(path).convert('RGB') # Opening the saved image
cropped_image = image.crop((30, 900, 510, 1200)) # Cropping the image
vision_image = vision.Image(...) # Here I need to pass the image, but I don't know how (I tried the options above)
client = vision.ImageAnnotatorClient()
response = client.text_detection(image=vision_image) # Text detection using google-vision-api
FOR CLARITY:
I want google text detection to only analyse a certain part of an image saved on my disk. So my idea was to crop the image using PIL and then pass the cropped image to google-vision. But it is not possible to pass a PIL-Image to vision.Image, as I get the error above.
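For reference, the pipeline the question is aiming at (crop with PIL, then hand the result to Vision without touching the disk) can be sketched as below. The input image here is a hypothetical stand-in for the screenshot on disk, and the Vision call itself is left as a comment since it needs installed client libraries and credentials:

```python
from io import BytesIO
from PIL import Image

# Hypothetical input: a blank RGB image stands in for the screenshot on disk
image = Image.new('RGB', (600, 1300), (255, 255, 255))
cropped = image.crop((30, 900, 510, 1200))  # same region as in the question, 480x300

# Encode the crop as PNG entirely in memory
buffer = BytesIO()
cropped.save(buffer, format='PNG')
content = buffer.getvalue()  # raw PNG bytes, which is what the content field expects

print(content[:8])  # the fixed PNG signature: b'\x89PNG\r\n\x1a\n'

# With credentials configured, the bytes would then be passed by keyword:
# vision_image = vision.Image(content=content)
# response = vision.ImageAnnotatorClient().text_detection(image=vision_image)
```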
The documentation from Google. This can be found in the vision.Image class:
Attributes:
content (bytes):
Image content, represented as a stream of bytes. Note: As
with all ``bytes`` fields, protobuffers use a pure binary
representation, whereas JSON representations use base64.
Currently, this field only works for BatchAnnotateImages
requests. It does not work for AsyncBatchAnnotateImages
requests.
A working option is to save the PIL-Image as a PNG/JPG on my disk and load it using:
with io.open(file_name, 'rb') as image_file:
content = image_file.read()
vision_image = vision.Image(content=content)
But this is slow and seems unnecessary. And the whole point for me behind using google-vision-api is the speed compared to open-cv.
UPDATE as of 25/9/2021
from PIL import Image
from io import BytesIO
from google.cloud import vision

with open('images/screenshots/screenshot.png', 'rb') as image_file:
    data = image_file.read()

try:
    image = vision.Image(content=data)
    print('worked')
except TypeError:
    print('failed')

im = Image.open('images/screenshots/screenshot.png')
buffer = BytesIO()
im.save(buffer, format='PNG')

try:
    image = vision.Image(buffer.getvalue())
    print('worked')
except TypeError:
    print('failed')
The first version works as expected, but I can't get the second one to work as @Mark Setchell recommended. The first few characters (~50) are the same, the rest is completely different.
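As an aside, the byte streams differing after the first ~50 characters is expected rather than a sign of a broken image: PNG compression output depends on the encoder and its settings, so two valid encodings of the same pixels generally do not match byte-for-byte, and either stream is acceptable as content bytes. A small self-contained check, assuming Pillow (the file path is a temporary placeholder):

```python
import os
import tempfile
from io import BytesIO
from PIL import Image

im = Image.new('RGB', (64, 64), (0, 128, 255))

# Encode the same image twice: once in memory, once via disk
buffer = BytesIO()
im.save(buffer, format='PNG')
mem_bytes = buffer.getvalue()

fd, path = tempfile.mkstemp(suffix='.png')
os.close(fd)
im.save(path, format='PNG')
with open(path, 'rb') as fh:
    disk_bytes = fh.read()
os.remove(path)

# Both streams start with the fixed 8-byte PNG signature ...
assert mem_bytes[:8] == disk_bytes[:8] == b'\x89PNG\r\n\x1a\n'
# ... and decode back to identical pixels, even when the compressed
# bytes in between are produced by different encoders
assert Image.open(BytesIO(mem_bytes)).tobytes() == Image.open(BytesIO(disk_bytes)).tobytes()
```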
UPDATE as of 26/9/2021
Both inputs are of type <class 'bytes'>. The complete error stack can be seen at the top of the question.
Using this code:
print(input_data[:200])
print(type(input_data))
I get the following output:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x048\x00\x00\x07\x80\x08\x06\x00\x00\x00+a\xe7\n\x00\x00\x00\x04sBIT\x08\x08\x08\x08|\x08d\x88\x00\x00 \x00IDATx\x9c\xec\xbdy\xd8-\xc7Y\x1f\xf8\xab\xea>\xe7\xdb\xef\xaa\xbbk\xb3%\xcb\x8b\x16[\x12\xc6\xc8\xbb,\x1b\x03\x06\xc6\x8111\x93@2y\xc2381\x8b1\x90\x10\x9e\xf18\x93\x10\x0811\x84\x192\x0c3\x9e\x1020\x03\x03\xc3\xb0\x04\xf0C0\xc6\x96m\xc9\x96m\xed\xb2dI\x96\xaetu\xf7\xed\xdb\xcf\xe9\xae\x9a?j\xe9\xea\xbd\xba\xbb\xbaO\x9f\xef\x9e\xd7\xd6\xfd\xfat\xbf\xf5Vu-o\xbd\xf5\xeb\xb7\xde"\xef\xff\xc7\'8\x1c\x13\x07\x00\xd2\x82\xcc6\xe5\xc6\xa8B&'
<class 'bytes'>
for the working input. And:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x048\x00\x00\x07\x80\x08\x06\x00\x00\x00+a\xe7\n\x00\x01\x00\x00IDATx\x9c\xec\xbdw\x80$\xc7u\x1f\xfc\xab\xea\xeeI\x9bw/\'\x1cr\xce\x04@\x10\x04A\x82`\x84\x95%J"\x95,\xcb\x1f%\x91T\xb0$*}\x1fM\xd9\x96\x95EY\x94(\xc9\xb6\x92i+\x90\x12\x83(3)0\x82\x08$rN\x07\\\xce\xb7\xb7yBw\xd5\xf7G\x85\xaeN3\xdd=\xdd\xb3\xb3{\xfb\xc8\xc3\xceLW\xbd\xca\xaf\xde\xfb\xf5\xabW\xe4{\xdeu\x84\xa3`\xe2\x00@J\xe0Y&\xdf\x00e($\x94\x94\'p\xcc\xc3\xda\xe7Y\x0c\xf1Te\x13\xbf\xcc>\xfa:]Y=x\x84\x7f\xe8\xc23u\x1f\x91l\xfd\x99'
<class 'bytes'>
for the failing input.
As far as I can tell, you start off with a PIL Image and you want to obtain a PNG image in memory without going to disk. So you need this:
#!/usr/bin/env python3
from PIL import Image
from io import BytesIO
# Create PIL Image like you have - filled with red
im = Image.new('RGB', (320,240), (255,0,0))
# Create in-memory PNG - like you want for Google Cloud Vision
buffer = BytesIO()
im.save(buffer, format="PNG")
# Look at first few bytes
PNG = buffer.getvalue()
print(PNG[:20])
It prints this, which is exactly what you would get if you wrote the image to disk as a PNG and then read it back as binary - except this does it in memory without going to disk:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01@'
It would be good to have the whole error stack and a more accurate code snippet. But from the presented information this seems to be a confusion of two different "Images". Probably some copy/paste error, as the tutorials have exactly the same line:
response = client.text_detection(image=image)
But in the mentioned tutorials image is created by vision.Image(), so I think in the presented code this should be:
response = client.text_detection(image=vision_image)
As, at least if I understand the code snippet correctly, image is a PIL Image, while vision_image is the Vision Image that should be passed to the text_detection method. So whatever is done in vision.Image() does not have an effect on the error message.
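Tying the updates and answers together: note that the failing call in the 25/9 update passes the bytes positionally, vision.Image(buffer.getvalue()), while the working call names the field, vision.Image(content=data). The vision.Image class is a proto-plus message, and those constructors generally reject positional primitive values, which matches the "Invalid constructor input" TypeError above. A sketch of the full in-memory pipeline under that assumption; the Vision call is wrapped in a function and not executed here, since it needs google-cloud-vision and credentials:

```python
from io import BytesIO
from PIL import Image


def png_bytes(pil_image):
    """Encode a PIL image as PNG bytes entirely in memory."""
    buffer = BytesIO()
    pil_image.save(buffer, format='PNG')
    return buffer.getvalue()


def detect_text_in_region(path, box):
    """Crop a region from the image at `path` and OCR it with Cloud Vision.

    Requires google-cloud-vision and configured credentials; not run here.
    """
    from google.cloud import vision
    cropped = Image.open(path).convert('RGB').crop(box)
    # The key detail: pass the bytes by keyword, not positionally
    vision_image = vision.Image(content=png_bytes(cropped))
    client = vision.ImageAnnotatorClient()
    return client.text_detection(image=vision_image)


# Local, offline check of the encoding step only
demo = Image.new('RGB', (480, 300), (0, 0, 0))
assert png_bytes(demo)[:8] == b'\x89PNG\r\n\x1a\n'
```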