
How to limit execution time in bigquery client query using python

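I am trying to run a BQ query from Python, and it sometimes takes about 6 hours to complete. I want to limit the execution time to 2 hours. That is, if the query has not finished within 2 hours, it should stop running with a status of "failed". Ideally, of course, it finishes within 2 hours.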

Code in utility.py:

def execute_select_bq_query(query, location, jobconfig):
    from google.cloud import bigquery

    # Construct a BigQuery client object.
    client = bigquery.Client(project=queryexcproject)
    job_config = bigquery.QueryJobConfig()
    dest_table = bq_table_reference(client, jobconfig['destination'])
    job_config.destination = dest_table
    job_config.create_disposition = jobconfig['create_disposition']
    job_config.write_disposition = jobconfig['write_disposition']
    query_job = client.query(query, location=location, job_config=job_config)
    results = query_job.result()
    response = "Data loaded into BQ table {}".format(jobconfig['destination'])
    return response

BQ job configuration in main.py:

from utility import execute_select_bq_query

def get_data(request):

    job_config = {"destination": bq_project_details,
                  "create_disposition": "CREATE_IF_NEEDED",
                  "write_disposition": tbl_write_disposition}
    response = execute_select_bq_query(query, bq_location, job_config)
    print(response)

    # response is a plain string, so return it directly rather than calling it
    return response

You can set a timeout in the query_job.result() function:

def execute_select_bq_query(query, location, jobconfig):
    from google.cloud import bigquery
    
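    # Timeout in seconds (float). Example value: 2 hours, the limit asked for
    # in the question; replace it with your own limit.
    timeout = 2 * 60 * 60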

    # Construct a BigQuery client object.
    client = bigquery.Client(project=queryexcproject)
    job_config = bigquery.QueryJobConfig()
    dest_table = bq_table_reference(client, jobconfig['destination'])
    job_config.destination = dest_table
    job_config.create_disposition = jobconfig['create_disposition']
    job_config.write_disposition = jobconfig['write_disposition']
    query_job = client.query(query, location=location, job_config=job_config)
    results = query_job.result(
           timeout=timeout,
           job_retry=None
    )
    response = "Data loaded into BQ table {}".format(jobconfig['destination'])
    return response
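
One caveat worth sketching: when the wait in result() times out, the client stops waiting, but the job itself may keep running on the BigQuery side. The helper below is a minimal sketch, not part of the original answer, that asks BigQuery to cancel the job in that case. The name wait_or_cancel is hypothetical, and the sketch assumes that result() signals a timeout by raising concurrent.futures.TimeoutError, as the library's documentation describes. Cancellation is best-effort, so the job may still finish before the cancel request takes effect.

```python
import concurrent.futures


def wait_or_cancel(query_job, timeout_seconds):
    """Wait up to timeout_seconds for the job; request cancellation on timeout.

    Hypothetical helper: query_job is expected to behave like a BigQuery
    QueryJob, i.e. to offer result(timeout=...) and cancel().
    """
    try:
        return query_job.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # The client gave up waiting, but the job may still be running
        # on the server, so ask BigQuery to stop it (best-effort).
        query_job.cancel()
        # Re-raise so the caller can treat the run as failed.
        raise
```

In execute_select_bq_query this could stand in for the direct query_job.result(...) call, for example results = wait_or_cancel(query_job, timeout).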

The documentation says:

timeout (Optional[float]):
                The number of seconds to wait for the underlying HTTP transport
                before using ``retry``.
                If multiple requests are made under the hood, ``timeout``
                applies to each individual request.

job_retry (Optional[google.api_core.retry.Retry]):
                How to retry failed jobs.  The default retries
                rate-limit-exceeded errors. Passing ``None`` disables
                job retry.

                Not all jobs can be retried.  If ``job_id`` was
                provided to the query that created this job, then the
                job returned by the query will not be retryable, and
                an exception will be raised if non-``None``
                non-default ``job_retry`` is also provided.

To put a time limit on the query from the client side, pass the timeout parameter to result(); if you do not want the job to be retried, pass job_retry=None.
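
As a possible alternative to a client-side wait, recent versions of google-cloud-bigquery expose a job_timeout_ms setting on the job configuration, which asks BigQuery itself to stop a job that exceeds the limit. The sketch below is not from the original answer and assumes that your installed version supports this property; check the library's reference before relying on it. The helper names hours_to_ms and build_job_config are hypothetical.

```python
def hours_to_ms(hours):
    """Convert a duration in hours to whole milliseconds."""
    return int(hours * 60 * 60 * 1000)


def build_job_config(max_hours):
    """Build a QueryJobConfig carrying a server-side time limit (sketch)."""
    # Imported lazily so hours_to_ms can be used without the library installed.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig()
    # Assumption: the installed google-cloud-bigquery version exposes
    # job_timeout_ms; BigQuery may try to stop the job once it is exceeded.
    job_config.job_timeout_ms = hours_to_ms(max_hours)
    return job_config
```

For the 2-hour limit in the question, build_job_config(2) would yield a configuration on which destination, create_disposition and write_disposition can then be set as before.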
