Provide BigQuery credentials in an Apache Beam pipeline written in Python

I'm trying to read data from BigQuery in my Beam pipeline using the Cloud Dataflow runner. I want to provide credentials to access the project.

I've seen examples in Java but none in Python.

The only option I found is the --service_account_email argument. But what if I want to supply the .json key file in the code itself, alongside the other options, for example: google_cloud_options.service_account = '/path/to/credential.json'

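import apache_beam as beam
from apache_beam.options.pipeline_options import (
    GoogleCloudOptions, PipelineOptions, StandardOptions)

# argv is the list of command-line flags passed in by the caller.
options = PipelineOptions(flags=argv)
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = 'project_name'
google_cloud_options.job_name = 'job_name'
google_cloud_options.staging_location = 'gs://bucket'
google_cloud_options.temp_location = 'gs://bucket'
options.view_as(StandardOptions).runner = 'DataflowRunner'

with beam.Pipeline(options=options) as pipeline:
    # Read the query text and close the file before building the source.
    with open('query.sql', 'r') as query_file:
        query = query_file.read()
    bq_source = beam.io.BigQuerySource(query=query, use_standard_sql=True)
    main_table = (
        pipeline
        | 'ReadAccountViewAll' >> beam.io.Read(bq_source)
    )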

Java has a getGcpCredential method, but I can't find an equivalent in Python.

Any ideas?

Using --service_account_email is the recommended approach, as mentioned here. Downloading the key and storing it locally or on GCE is not recommended.

For cases where the code must point to a different path for the JSON file, you can try the following Python authentication workarounds:

from google.cloud.bigquery import Client
client = Client.from_service_account_json('/path/to/keyfile.json')

or

client = Client(credentials=credentials)

Here is an example of creating custom credentials from a file:

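from google.oauth2 import service_account

key_path = '/path/to/keyfile.json'  # placeholder path to the key file
credentials = service_account.Credentials.from_service_account_file(
    key_path,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)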
