Opening a file from GCS with the PIL module

Question

I am a beginner in programming, and this is my first small project. I've hit a roadblock and would like to ask for help. Any advice is welcome. Thank you in advance!

Here is what I want to do:

I want to build a text detection application and extract the text for further use (for instance, to map it to other relevant information in a dataset). I divided the task into two steps: 1. detect the text; 2. extract the text and use regular expressions to rearrange it for the data mapping.

For the first step, I use the Google Vision API, so I have no problem reading the image from Google Cloud Storage (code reference 1).

However, when it comes to step two, I need the PIL module to open the file so that I can draw the text on it. The method Image.open() requires a path. My question is: how do I specify the path? (code reference 2)

code reference 1:

from google.cloud import vision

image_uri = 'gs://img_platecapture/img_001.jpg'
client = vision.ImageAnnotatorClient()
image = vision.Image()
image.source.image_uri = image_uri  ##  <- THE PATH  ##

response = client.text_detection(image=image)
for text in response.text_annotations:
    print('=' * 30)
    print(text.description)
    vertices = ['(%s,%s)' % (v.x, v.y) for v in text.bounding_poly.vertices]
    print('bounds:', ",".join(vertices))

if response.error.message:
    raise Exception(
        '{}\nFor more info on error messages, check: '
        'https://cloud.google.com/apis/design/errors'.format(
            response.error.message))

code reference 2:

from PIL import Image, ImageDraw
from PIL import ImageFont
import re

img = Image.open(?)  ##  <- THE PATH  ##
draw = ImageDraw.Draw(img)
font = ImageFont.truetype("simsun.ttc", 18)

# Skip the first annotation, which holds the full detected text.
for text in response.text_annotations[1:]:
    ocr = text.description
    bound = text.bounding_poly
    draw.text((bound.vertices[0].x - 25, bound.vertices[0].y - 25), ocr, fill=(255, 0, 0), font=font)

    # Outline the bounding box in yellow, with no fill.
    draw.polygon(
        [
            bound.vertices[0].x,
            bound.vertices[0].y,
            bound.vertices[1].x,
            bound.vertices[1].y,
            bound.vertices[2].x,
            bound.vertices[2].y,
            bound.vertices[3].x,
            bound.vertices[3].y,
        ],
        None,
        'yellow',
    )

texts = response.text_annotations

a = str(texts[0].description.split())
b = re.sub(u"([^\u4e00-\u9fa5\u0030-\u0039])", "", a)
b1 = "".join(b)

regex1 = re.search(r"\D{1,2}Dist.", b)
if regex1:
    regex1 = "{}".format(regex1.group(0))

     .........
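
To illustrate what the filtering step in code reference 2 is meant to do, here is a minimal standalone sketch. It assumes the intent is to keep only Chinese characters (U+4E00 to U+9FA5) and ASCII digits; the helper name `keep_cjk_and_digits` and the sample string are made up for illustration and are not part of the original code.

```python
import re

# Matches every character that is NOT a CJK ideograph in U+4E00..U+9FA5
# and NOT an ASCII digit (U+0030..U+0039).
_NOT_CJK_OR_DIGIT = re.compile(r"[^\u4e00-\u9fa5\u0030-\u0039]")


def keep_cjk_and_digits(text):
    """Return text with everything except CJK ideographs and digits removed."""
    return _NOT_CJK_OR_DIGIT.sub("", text)


# Hypothetical OCR output, used only as an example input.
sample = "台北市 ABC-1234"
print(keep_cjk_and_digits(sample))  # prints 台北市1234
```

Note that Latin letters and punctuation are removed by this filter, so any later pattern that looks for them would have to run on the unfiltered text.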

Answer 1

PIL does not have a built-in ability to open files from GCS automatically. You will need to either:

Download the file to local storage and point PIL to that file, or

Give PIL a BlobReader which it can use to access the data:

from PIL import Image
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.bucket('img_platecapture')
# Use get_blob to fix the generation number, so we don't get corruption
# if the blob is overwritten while we read it.
blob = bucket.get_blob('img_001.jpg')
# Open in binary mode ("rb"); blob.open() defaults to text mode, which PIL cannot read.
with blob.open("rb") as file:
    img = Image.open(file)
    #...
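
For the first option (download to local storage), a minimal sketch could look like the following. It assumes the `google-cloud-storage` and `Pillow` packages are installed and that credentials are already configured. The helper names `split_gcs_uri` and `open_image_via_download` are hypothetical, introduced here only for illustration. The first helper turns the `gs://` URI from the question into the bucket and object names that the storage client expects.

```python
import os
import tempfile
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket_name, blob_name)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def open_image_via_download(uri):
    """Download the object to a temporary local file and open it with PIL."""
    # Third-party imports are kept local so split_gcs_uri works without them.
    from google.cloud import storage
    from PIL import Image

    bucket_name, blob_name = split_gcs_uri(uri)
    blob = storage.Client().bucket(bucket_name).blob(blob_name)

    # Reserve a temporary path that keeps the original file extension.
    suffix = os.path.splitext(blob_name)[1]
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        local_path = tmp.name
    try:
        blob.download_to_filename(local_path)
        img = Image.open(local_path)
        img.load()  # read the pixel data now, before the temp file is removed
        return img
    finally:
        os.remove(local_path)
```

With this in place, the missing line in code reference 2 could become `img = open_image_via_download('gs://img_platecapture/img_001.jpg')`. Compared with the BlobReader approach, this costs a temporary file on disk but needs nothing beyond a plain local path for PIL.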

Answer 1: score 0, accepted, 2022-12-04 22:32:05
