How do I define my Google Cloud Platform service account's key in a Python program?
I'm trying to use Google Cloud Vision's OCR as a substitute for Pytesseract. Part of this is defining the account key. The way Google suggests doing this is by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the key file. I use VSCode, so running the $env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\User_1\key.json" command in PowerShell works just fine. However, the program I am writing will be used on my boss's computer, and I can't enter the PowerShell command every time. The example code Google gives to check that this worked is:
def implicit():
    from google.cloud import storage

    # If you don't specify credentials when constructing the client, the
    # client library will look for credentials in the environment.
    storage_client = storage.Client()

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)

implicit()
Is there another way to pass the key to GCP without doing this? I know that with the Google Sheets API, I use:
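import os

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow

def auth():
    creds = None
    token_file = 'token.json'
    SCOPES = ['https://www.googleapis.com/auth/drive', 'https://www.googleapis.com/auth/spreadsheets']
    # Reuse the cached user token if one was saved by an earlier run.
    if os.path.exists(token_file):
        creds = Credentials.from_authorized_user_file(token_file, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # No usable token: run the browser-based OAuth flow.
            flow = InstalledAppFlow.from_client_secrets_file(
                client_secrets_file='credentials.json', scopes=SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the token for the next run.
        with open(token_file, 'w') as token:
            token.write(creds.to_json())
    return creds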
However, I am under the impression that this is not applicable here. Or, if it is, I'm not sure how to apply it correctly. Are there any alternatives to running the PowerShell command every time?
The only alternative I've been able to think of is to find a way to run the PowerShell command through Python, but the program will end up being a standalone .exe on my boss's computer (built with PyInstaller), so I do not believe this will work.
I think you have missed the documentation on passing the key explicitly. From the docs:
from google.cloud import storage
# Explicitly use service account credentials by specifying the private key file.
storage_client = storage.Client.from_service_account_json('path_to_json_file')
# Make an authenticated API request
buckets = list(storage_client.list_buckets())
print(buckets)
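Since the question is about Cloud Vision rather than Cloud Storage, the same explicit approach can be sketched for the Vision client. This is a minimal sketch, assuming the google-cloud-vision package is installed. The helper names load_key_info and make_vision_client are hypothetical and used only for illustration. The check on the "type" field relies on service account key files containing "type": "service_account".

```python
import json
from pathlib import Path


def load_key_info(key_path):
    """Read the key file and check that it looks like a service account key."""
    info = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if info.get("type") != "service_account":
        raise ValueError(f"{key_path} is not a service account key file")
    return info


def make_vision_client(key_path):
    """Build a Vision client from an explicit key file (hypothetical helper)."""
    # Imported here so the rest of the module loads without the package.
    from google.cloud import vision

    load_key_info(key_path)  # fail early with a readable error
    return vision.ImageAnnotatorClient.from_service_account_json(str(key_path))
```

With this in place, something like client = make_vision_client('path_to_json_file') would replace the implicit, environment-based lookup.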
Another simple way of doing this is to set the GOOGLE_APPLICATION_CREDENTIALS environment variable from within Python itself, before the client is created:
import os
from google.cloud import storage
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
storage_client = storage.Client()
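For a PyInstaller build, the key path has to be found at run time rather than hard-coded. The following is a rough sketch, assuming the key file is shipped in the same folder as the .exe. The names app_dir and configure_credentials are hypothetical. It relies on PyInstaller setting sys.frozen in the bundled program, in which case sys.executable points at the .exe.

```python
import os
import sys
from pathlib import Path


def app_dir():
    """Return the folder of the bundled .exe, or of the launched script."""
    if getattr(sys, "frozen", False):
        # Running from a PyInstaller bundle: sys.executable is the .exe.
        return Path(sys.executable).resolve().parent
    return Path(sys.argv[0]).resolve().parent


def configure_credentials(key_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at key_path for this process."""
    key_path = Path(key_path)
    if not key_path.is_file():
        raise FileNotFoundError(f"Service account key not found: {key_path}")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_path)
    return key_path
```

Calling configure_credentials(app_dir() / "key.json") before constructing any client would then take the place of the PowerShell command. The variable is only set for the running process, so nothing changes on the machine itself.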
The solution posted above by Sayan Bhattacharya is great. However, I've found an even better one: don't use Google Vision.
I ended up using an OCR service called "Space OCR". A free account allows up to 500 API calls per day and up to 25,000 per month (how you would reach 25,000 at a rate of 500 per day, I'm not sure, but those are the stated limits). This OCR works great and is easy to use. You do need an account and an API key, but once you have those, the documentation is very easy to follow (and by that I mean ctrl+c, ctrl+v, and swap out the default API key for your own).
It seems to be mostly accurate. I would like to assume that the few misses it has had are because I took the pictures with my phone instead of scanning them in through iOS Notes or Genius Scan, but I haven't tested this. Space OCR does have a 2000-5000 ms wait between making the initial API call and receiving the results. This isn't a problem for me, as I'll probably only scan 40-50 documents per day, and the program I wrote looks through an entire directory and then processes all of the images in it, so I can just walk away for a few minutes.
Thanks all for the help.
EDIT
I did try using pytesseract, but with the exact same images I used with Space OCR it was so inaccurate that it was almost laughable.