Pass PIL Image to Google Cloud Vision without saving and reading

UPDATE BELOW

Is there a way to pass a PIL Image to Google Cloud Vision?

I tried io.BytesIO, io.StringIO and Image.tobytes(), but I always get:

Traceback (most recent call last):
  File "C:\Users\...\vision_api.py", line 20, in get_text
    image = vision.Image(content)
  File "C:\...\venv\lib\site-packages\proto\message.py", line 494, in __init__
    raise TypeError(
TypeError: Invalid constructor input for Image:b'Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x80Ma\x81La\x81Ma\x81Ma\x81Ma\x80Ma\x81Ma\x81Ma\x81Ma\x8 ...

or this if I pass the PIL Image directly:

TypeError: Invalid constructor input for Image: <PIL.Image.Image image mode=RGB size=480x300 at 0x1D707131DC0>

This is my code:

image = Image.open(path).convert('RGB')   # Opening the saved image
cropped_image = image.crop((30, 900, 510, 1200))   # Cropping the image

vision_image = vision.Image(...)   # Placeholder: I tried the different options here, but I don't know what to pass
client = vision.ImageAnnotatorClient()
response = client.text_detection(image=vision_image)   # Text detection using google-vision-api

FOR CLARITY:

I want Google text detection to analyse only a certain part of an image saved on my disk. So my idea was to crop the image using PIL and then pass the cropped image to google-vision. But it is not possible to pass a PIL Image to vision.Image, as I get the error above.

The documentation from Google.

This can be found in the vision.Image class:

Attributes:
        content (bytes):
            Image content, represented as a stream of bytes. Note: As
            with all ``bytes`` fields, protobuffers use a pure binary
            representation, whereas JSON representations use base64.

            Currently, this field only works for BatchAnnotateImages
            requests. It does not work for AsyncBatchAnnotateImages
            requests.
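
To illustrate the docstring's note on representations: the Python client takes raw bytes in content, while a JSON request body would carry the same bytes as base64 text. The sketch below uses only the standard library and sends nothing. The payload layout is my understanding of the Vision REST annotate request and should be treated as an assumption, and the 8 bytes are just a stand-in for real image data.

```python
import base64
import json

# Stand-in for real image bytes (this is only the 8-byte PNG signature).
raw = b"\x89PNG\r\n\x1a\n"

# JSON cannot hold raw bytes, so they are carried as base64 text.
encoded = base64.b64encode(raw).decode("ascii")

# Assumed request layout for a JSON/REST call; not sent anywhere here.
payload = {
    "requests": [
        {
            "image": {"content": encoded},
            "features": [{"type": "TEXT_DETECTION"}],
        }
    ]
}
body = json.dumps(payload)

# Round trip: decoding the base64 text gives back the original bytes.
decoded = base64.b64decode(json.loads(body)["requests"][0]["image"]["content"])
print(decoded == raw)
```

With the protobuf-based Python client no base64 step is needed; the raw bytes go in directly.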

A working option is to save the PIL Image as a PNG/JPG on my disk and load it using:

with io.open(file_name, 'rb') as image_file:
    content = image_file.read()

vision_image = vision.Image(content=content)

But this is slow and seems unnecessary. And the whole point of using google-vision-api, for me, is the speed compared to open-cv.

UPDATE as of 25/9/2021

from PIL import Image
from io import BytesIO
from google.cloud import vision


with open('images/screenshots/screenshot.png', 'rb') as image_file:
    data = image_file.read()
    try:
        image = vision.Image(content=data)
        print('worked')

    except TypeError:
        print('failed')


im = Image.open('images/screenshots/screenshot.png')
buffer = BytesIO()
im.save(buffer, format='PNG')
try:
    image = vision.Image(buffer.getvalue())
    print('worked')

except TypeError:
    print('failed')

The first version works as expected, but I can't get the second one to work as @Mark Setchell recommended. The first few characters (~50) are the same, the rest is completely different.

UPDATE as of 26/9/2021

Both inputs are of type <class 'bytes'>. The complete error stack can be seen at the top of the question.

Using this code:

print(input_data[:200])
print(type(input_data))

I get the following output:

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x048\x00\x00\x07\x80\x08\x06\x00\x00\x00+a\xe7\n\x00\x00\x00\x04sBIT\x08\x08\x08\x08|\x08d\x88\x00\x00 \x00IDATx\x9c\xec\xbdy\xd8-\xc7Y\x1f\xf8\xab\xea>\xe7\xdb\xef\xaa\xbbk\xb3%\xcb\x8b\x16[\x12\xc6\xc8\xbb,\x1b\x03\x06\xc6\x8111\x93@2y\xc2381\x8b1\x90\x10\x9e\xf18\x93\x10\x0811\x84\x192\x0c3\x9e\x1020\x03\x03\xc3\xb0\x04\xf0C0\xc6\x96m\xc9\x96m\xed\xb2dI\x96\xaetu\xf7\xed\xdb\xcf\xe9\xae\x9a?j\xe9\xea\xbd\xba\xbb\xbaO\x9f\xef\x9e\xd7\xd6\xfd\xfat\xbf\xf5Vu-o\xbd\xf5\xeb\xb7\xde"\xef\xff\xc7\'8\x1c\x13\x07\x00\xd2\x82\xcc6\xe5\xc6\xa8B&'
<class 'bytes'>

for the working input. And:

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x048\x00\x00\x07\x80\x08\x06\x00\x00\x00+a\xe7\n\x00\x01\x00\x00IDATx\x9c\xec\xbdw\x80$\xc7u\x1f\xfc\xab\xea\xeeI\x9bw/\'\x1cr\xce\x04@\x10\x04A\x82`\x84\x95%J"\x95,\xcb\x1f%\x91T\xb0$*}\x1fM\xd9\x96\x95EY\x94(\xc9\xb6\x92i+\x90\x12\x83(3)0\x82\x08$rN\x07\\\xce\xb7\xb7yBw\xd5\xf7G\x85\xaeN3\xdd=\xdd\xb3\xb3{\xfb\xc8\xc3\xceLW\xbd\xca\xaf\xde\xfb\xf5\xabW\xe4{\xdeu\x84\xa3`\xe2\x00@J\xe0Y&\xdf\x00e($\x94\x94\'p\xcc\xc3\xda\xe7Y\x0c\xf1Te\x13\xbf\xcc>\xfa:]Y=x\x84\x7f\xe8\xc23u\x1f\x91l\xfd\x99'
<class 'bytes'>

for the failing input.
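
Both dumps begin with the 8-byte PNG signature and an identical IHDR chunk, so both are valid PNG streams describing an image of the same size. Differences further on are expected when a different encoder re-compresses the same pixels, and they do not by themselves mean the data is wrong. The sketch below shows one way to check this with Pillow. The helper names png_bytes and same_pixels are made up for illustration, and a generated gradient stands in for the screenshot.

```python
from io import BytesIO

from PIL import Image


def png_bytes(img, **save_options):
    """Encode a PIL image as PNG in memory and return the bytes."""
    buf = BytesIO()
    img.save(buf, format="PNG", **save_options)
    return buf.getvalue()


def same_pixels(png_a, png_b):
    """Decode two PNG byte strings and compare mode, size and pixel data."""
    a = Image.open(BytesIO(png_a))
    b = Image.open(BytesIO(png_b))
    return a.mode == b.mode and a.size == b.size and a.tobytes() == b.tobytes()


# Stand-in image: a 64x64 RGB gradient built from raw bytes.
pixels = bytes(
    v for y in range(64) for x in range(64) for v in (x * 4, y * 4, (x + y) * 2)
)
img = Image.frombytes("RGB", (64, 64), pixels)

# Two encodings of the same pixels with different compression settings.
fast = png_bytes(img, compress_level=1)
small = png_bytes(img, compress_level=9)

print(fast[:8] == small[:8])     # both start with the PNG signature
print(same_pixels(fast, small))  # the decoded pixels are compared, not the files
```

If same_pixels returns True for the file read from disk and the bytes from buffer.getvalue(), the difference in the dumps comes from the encoding only, and the cause of the TypeError has to be elsewhere.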

As far as I can tell, you start off with a PIL Image and you want to obtain a PNG image in memory without going to disk. So you need this:

#!/usr/bin/env python3

from PIL import Image
from io import BytesIO

# Create PIL Image like you have - filled with red
im = Image.new('RGB', (320,240), (255,0,0))

# Create in-memory PNG - like you want for Google Cloud Vision
buffer = BytesIO()
im.save(buffer, format="PNG")

# Look at first few bytes
PNG = buffer.getvalue()
print(PNG[:20])

It prints this, which is exactly what you would get if you wrote the image to disk as a PNG and then read it back as binary, except that this happens in memory without going to disk:

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01@'
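
Applied to the crop from the question, the same in-memory approach might look like the sketch below. The helper names crop_to_png_bytes and to_vision_image are hypothetical, and a red 1080x1920 image stands in for the screenshot on disk. The Vision import is deferred into the helper so that the encoding part runs without google-cloud-vision installed. Note that the working calls in the question pass the bytes as content=..., while the failing ones pass them positionally. The sketch assumes the keyword form is what vision.Image expects.

```python
from io import BytesIO

from PIL import Image


def crop_to_png_bytes(image, box):
    """Crop a PIL image and return the crop encoded as PNG bytes, in memory."""
    buffer = BytesIO()
    image.crop(box).save(buffer, format="PNG")
    return buffer.getvalue()


def to_vision_image(png_bytes):
    """Wrap PNG bytes for the Vision client (needs google-cloud-vision)."""
    from google.cloud import vision  # deferred import, see the note above

    # Keyword argument, as in the question's working example.
    return vision.Image(content=png_bytes)


# Stand-in for Image.open(path).convert('RGB') from the question.
source = Image.new("RGB", (1080, 1920), (255, 0, 0))
content = crop_to_png_bytes(source, (30, 900, 510, 1200))
print(type(content), content[:8])

# With the client library installed and credentials configured:
# vision_image = to_vision_image(content)
# response = vision.ImageAnnotatorClient().text_detection(image=vision_image)
```

The crop box is the one from the question and yields a 480x300 image, matching the size shown in the error message there.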

It would be good to have the whole error stack and a more accurate code snippet. But from the presented information this seems to be a confusion of two different "Images". Probably a copy/paste error, as the tutorials have exactly the same line:

response = client.text_detection(image=image)

But in the mentioned tutorials, image is created by vision.Image(), so I think in the presented code this should be:

response = client.text_detection(image=vision_image)

At least if I understand the code snippet correctly, image is the PIL Image, while vision_image is the Vision Image that should be passed to the text_detection method. So whatever is done in vision.Image() has no effect on the error message.
