How do I list my scheduled queries via the Python Google client API?

Question

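I have set up my service account, and I can run queries on BigQuery using client.query().

I could rewrite all of my scheduled queries in this new client.query() format, but I already have a lot of scheduled queries, so I would like to know whether there is a way to get/list the scheduled queries and then use that information to run them from a script.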

Answer 1

Yes, you can use the API. Here is a tip for when you don't know which one to use. Run the command proposed by @Yev:

bq ls --transfer_config --transfer_location=US --format=prettyjson

but log the API calls. To do so, use the --apilog <logfile name> parameter:

bq --apilog ./log ls --transfer_config --transfer_location=US --format=prettyjson

And, magically, you can find the API called by the command: https://bigquerydatatransfer.googleapis.com/v1/projects/<PROJECT-ID>/locations/US/transferConfigs?alt=json

Then a simple Google search leads you to the right documentation.
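If you would rather call the endpoint from the API log directly, the following is a minimal sketch, not an official client. The helper names are hypothetical, and the `transferConfigs`, `nextPageToken`, and `pageToken` field names are my understanding of the REST reference, so verify them against the documentation before relying on this.

```python
BASE_URL = "https://bigquerydatatransfer.googleapis.com/v1"


def transfer_configs_url(project_id, location="US"):
    """Build the list endpoint that shows up in the bq API log."""
    return f"{BASE_URL}/projects/{project_id}/locations/{location}/transferConfigs"


def list_transfer_configs_rest(session, project_id, location="US"):
    """Collect transfer configs from every page of the REST response.

    `session` is any object with a requests-style get(url, params=...) method.
    """
    url = transfer_configs_url(project_id, location)
    configs = []
    page_token = None
    while True:
        # Only send pageToken once the previous page has handed one back.
        params = {"pageToken": page_token} if page_token else {}
        response = session.get(url, params=params)
        response.raise_for_status()
        payload = response.json()
        configs.extend(payload.get("transferConfigs", []))
        page_token = payload.get("nextPageToken")
        if not page_token:
            return configs
```

For real calls, one option is to pass an `AuthorizedSession` from `google.auth.transport.requests`, built from the credentials returned by `google.auth.default()`. That requires the google-auth package, which is an assumption beyond what the answer uses.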

In Python, add this dependency to your requirements.txt: google-cloud-bigquery-datatransfer, and use this code:

from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = client.common_project_path("<PROJECT-ID>")
resp = client.list_transfer_configs(parent=parent)
print(resp)
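To go from the listing to the SQL itself, you could filter the returned configs down to scheduled queries and read the `query` entry from their params. This is a sketch under assumptions: `scheduled_query_sql` is a hypothetical helper, and it only relies on each config exposing `name`, `display_name`, `data_source_id`, and a mapping-like `params`.

```python
def scheduled_query_sql(configs):
    """Return (resource name, display name, SQL) for each scheduled query."""
    queries = []
    for config in configs:
        # Transfer configs cover other data sources too; keep scheduled queries only.
        if config.data_source_id != "scheduled_query":
            continue
        params = config.params
        if "query" in params:
            queries.append((config.name, config.display_name, params["query"]))
    return queries
```

With the client above, `scheduled_query_sql(client.list_transfer_configs(parent=parent))` should give you the SQL strings, which you can then pass to client.query() on your BigQuery client. Note that scheduled queries may use the `@run_time` and `@run_date` parameters and a destination table setting; a plain client.query() call will not fill those in for you.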

Answer 2

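With the Cloud SDK there is a command that gets you what you want and more: bq ls --transfer_config --transfer_location=US --format=prettyjson. More about it here: List Scheduled Queries in BigQuery

Running it at a command prompt (assuming the Google Cloud SDK is installed) prints JSON output that includes the SQL of each scheduled query (highlighted in red in the original screenshot).

After that, you can run it as a shell subprocess in Python and parse the output: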

import pandas as pd
import json
from subprocess import PIPE, run, call

response = run('bq ls --transfer_config --transfer_location=US --format=prettyjson', 
               stdout=PIPE, 
               stderr=PIPE, 
               universal_newlines=True, 
               shell=True)

response

Here are the first few lines of what the above produces:

CompletedProcess(args='bq ls --transfer_config --transfer_location=US --format=prettyjson', returncode=0, stdout='[\n  {\n    "dataSourceId": "scheduled_query",\...

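Then, to get the SQL, access the output via response.stdout and parse it as JSON. From there you can either walk the resulting dictionaries to the value you need or convert the data to a pandas DataFrame and continue from that, like so: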

data = json.loads(response.stdout)
df = pd.json_normalize(data)

df.columns =

dataSourceId
datasetRegion
destinationDatasetId
disabled
displayName
name
schedule
state
updateTime
userId
emailPreferences.enableFailureEmail
params.destination_table_name_template

### sql located in this one
params.query

params.write_disposition
scheduleOptions.startTime
params.overwrite_destination_table
params.source_dataset_id
params.source_project_id
scheduleOptions.endTime
nextRunTime
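If you only need the SQL and would rather skip pandas, here is a minimal sketch that reads the same JSON with the standard library. The function names are hypothetical, `run_bq_ls` assumes `bq` is an executable on your PATH, and the keys used (`dataSourceId`, `displayName`, `params.query`) are the ones from the column list above.

```python
import json
import subprocess


def run_bq_ls(location="US"):
    """Run bq and return its stdout; raises CalledProcessError on a non-zero exit."""
    completed = subprocess.run(
        ["bq", "ls", "--transfer_config", f"--transfer_location={location}",
         "--format=prettyjson"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def queries_from_bq_json(stdout):
    """Return (displayName, SQL) pairs for the scheduled queries in bq's JSON output."""
    queries = []
    for config in json.loads(stdout):
        sql = config.get("params", {}).get("query")
        # Skip transfer configs that are not scheduled queries or carry no SQL.
        if config.get("dataSourceId") == "scheduled_query" and sql is not None:
            queries.append((config.get("displayName"), sql))
    return queries
```

Calling `queries_from_bq_json(run_bq_ls())` would then give you pairs you can loop over. Passing the arguments as a list avoids `shell=True`, and `check=True` surfaces a failed `bq` call instead of silently parsing an empty output.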

Answer 3

Maybe it's just me, but I had a hard time finding Python documentation on using the BigQuery Data Transfer API (= scheduled queries).

In Python you can do this:

from google.cloud import bigquery_datatransfer

bq_datatransfer_client = bigquery_datatransfer.DataTransferServiceClient()
request_datatransfers = bigquery_datatransfer.ListTransferConfigsRequest(
    # if US, you can just do parent='projects/YOUR_PROJECT_ID'
    parent='projects/YOUR_PROJECT_ID/locations/EU',  
)

# this method will also deal with pagination
response_datatransfers = bq_datatransfer_client.list_transfer_configs(
    request=request_datatransfers)

# to convert the response to a list of scheduled queries
datatransfers = list(response_datatransfers)

Here are some useful resources about the API:

Specifically on the .list_transfer_configs() method:

https://cloud.google.com/python/docs/reference/bigquerydatatransfer/latest/google.cloud.bigquery_datatransfer_v1.services.data_transfer_service.DataTransferServiceClient#google_cloud_bigquery_datatransfer_v1_services_data_transfer_service_DataTransferServiceClient_list_transfer_configs

On the ListTransferConfigsRequest class:

https://cloud.google.com/python/docs/reference/bigquerydatatransfer/latest/google.cloud.bigquery_datatransfer_v1.types.ListTransferConfigsRequest

Code snippets and examples of how to use the Python API:

https://github.com/googleapis/python-bigquery-datatransfer/tree/main/samples/snippets

Some official documentation on using the API:

https://cloud.google.com/python/docs/reference/bigquerydatatransfer/latest/

Answer 4

With Python you can do this:

from google.cloud import bigquery_datatransfer

bq_datatransfer_client = bigquery_datatransfer.DataTransferServiceClient()

parent = 'projects/YOUR_PROJECT_ID/locations/EU'
resp_datatransfers = bq_datatransfer_client.list_transfer_configs(parent=parent)

datatransfers = list(resp_datatransfers)

Documentation link: .list_transfer_configs()

Answer 5

To get specific columns:

bq ls --transfer_config --transfer_location=US --format=prettyjson|jq -r ".[]|[.name,.displayName,.dataSourceId,.state,.userId]|@csv"|tr -d "\""
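If jq is not available, roughly the same column selection can be done in Python on the JSON that bq prints. This is a sketch, with `configs_to_csv` and `FIELDS` as hypothetical names; the field list mirrors the one in the jq filter above.

```python
import csv
import io
import json

FIELDS = ["name", "displayName", "dataSourceId", "state", "userId"]


def configs_to_csv(stdout, fields=FIELDS):
    """Turn bq's JSON output into CSV rows holding only the chosen fields."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for config in json.loads(stdout):
        # Missing keys become empty cells rather than raising KeyError.
        writer.writerow([config.get(field, "") for field in fields])
    return buffer.getvalue()
```

One difference from the shell pipeline: `tr -d "\""` strips every double quote, while the csv module quotes only the fields that need it, for example a display name containing a comma.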

5 answers

Answer 1: score 5, accepted, 2022-03-10 17:39:08
Answer 2: score 2, 2022-03-10 16:26:51
Answer 3: score 0, 2022-12-19 14:45:49
Answer 4: score 0, 2022-12-20 09:17:51
Answer 5: score 0, 2023-01-02 11:57:38
