
BigQuery API using Python API

I am trying to automate table creation in BQ by reading the raw files in a bucket (based on the bucket file name it should create matching tables), using a YAML file as the configuration. Can anyone provide a lead on how to write this, with some code sample?
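A minimal sketch of what such a YAML configuration and lookup could look like (the tables.yml layout, key names and resolve_table helper are only placeholders, assuming PyYAML is available for parsing):

import fnmatch
import yaml

# tables.yml (hypothetical layout):
#
#   dataset: raw_layer
#   tables:
#     - pattern: "sales_*.csv"
#       table: sales
#     - pattern: "customers_*.csv"
#       table: customers

def resolve_table(filename, config_path="tables.yml"):
    """Return 'dataset.table' for a bucket file name, or None if nothing matches."""
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    for entry in cfg["tables"]:
        if fnmatch.fnmatch(filename, entry["pattern"]):
            return cfg["dataset"] + "." + entry["table"]
    return None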

I am doing something similar. It also depends on how you want to read the bucket "raw files"; in my case it is a GCS notification + Pub/Sub.
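For context on where `event` and the file name come from in this setup, here is a rough sketch of a Pub/Sub-triggered background Cloud Function, assuming the standard `bucketId`/`objectId`/`eventType` attributes that GCS notifications attach to the message (the `load_to_bigquery` helper is only a placeholder for the load code below):

def on_gcs_notification(event, context):
    """Pub/Sub-triggered Cloud Function; `event` is the Pub/Sub message dict."""
    attrs = event['attributes']
    bucket = attrs['bucketId']     # bucket that emitted the notification
    filename = attrs['objectId']   # path of the uploaded object

    # Only react to newly finalized CSV uploads
    if attrs.get('eventType') != 'OBJECT_FINALIZE' or not filename.endswith('.csv'):
        return

    load_to_bigquery(bucket, filename)   # hypothetical helper wrapping the code below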

Very simple example:

uri = "gs://" + event['attributes']['bucketId'] + "/" + filename
table = os.path.splitext(filename)[0]

#Format the filename to only numbers and letters (no special characters)
table = re.sub('[^A-Za-z0-9]+', '', table)

# Construct a BigQuery client object.
client = bigquery.Client()

# Name of the table which will be created automatically by BigQuery
table_id = project_id + "." + dataset_id + "." + table

job_config = bigquery.LoadJobConfig(
    autodetect=True,
    source_format=bigquery.SourceFormat.CSV,
)

load_job = client.load_table_from_uri(
        uri, table_id, job_config=job_config
)

The BigQuery job will create the table automatically if it does not exist.
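If you also want to block until the load finishes and confirm the row count, the client library lets you wait on the job and then fetch the table; a small follow-up sketch, reusing the `client`, `load_job` and `table_id` from above:

load_job.result()  # wait for the load job to complete; raises on failure

destination = client.get_table(table_id)
print("Loaded {} rows into {}".format(destination.num_rows, table_id))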
