
Google Dataflow API Filter by Job Name

Is there a way to filter Dataflow jobs by job name with the REST API? I am looking for a way to get a list of job details filtered by job name. Currently, I am able to do it through the Cloud Dataflow console, but not via the Dataflow REST API.

GET /v1b3/projects/{projectId}/jobs

The filtering performed in the Dataflow console is not part of the API (the console requests the job list from the Dataflow API, but the frontend layer is what performs the filtering).

Therefore, you can replicate it with the following steps:

1- To list all jobs across all regions, use projects.jobs.aggregated (GET /v1b3/projects/{projectId}/jobs:aggregated). Additionally, this method allows you to pre-filter jobs by a specified job state.

projects.jobs.list (GET /v1b3/projects/{projectId}/jobs) is not recommended, as it only returns the jobs running in us-central1.

2- Both methods above return a JSON ListJobsResponse object, which contains a list of Job objects. You can therefore iterate over this list in a language such as Python and filter the jobs by matching a regex against the job name:

import json
import re

# Regex to match against job names, e.g. r'^my-pipeline-.*'
desired_name = 'REGEX_STRING'

filtered_jobs = []

# Load a saved ListJobsResponse (the JSON body returned by the API)
with open('ListJobsResponse.json') as json_file:
    response_dict = json.load(json_file)

    # 'jobs' may be absent when the project has no jobs
    jobs = response_dict.get('jobs', [])

    for job in jobs:
        if re.search(desired_name, job['name']):
            filtered_jobs.append(job)

print(filtered_jobs)
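Note that ListJobsResponse can be paginated via nextPageToken, so a complete listing has to follow the token across requests. A minimal sketch of that loop, where fetch_page is a hypothetical callable standing in for the actual HTTP request (it takes a page token, or None for the first page, and returns the decoded JSON dict):

```python
import re

def iter_jobs(fetch_page):
    """Yield every job across all pages of a paginated ListJobsResponse."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get('jobs', [])
        token = page.get('nextPageToken')
        if not token:
            break  # no more pages

# Demo with canned pages standing in for real API responses.
pages = {
    None: {'jobs': [{'name': 'etl-daily'}], 'nextPageToken': 'p2'},
    'p2': {'jobs': [{'name': 'etl-hourly'}, {'name': 'backfill'}]},
}
matching = [j['name'] for j in iter_jobs(pages.get)
            if re.search(r'^etl-', j['name'])]
print(matching)  # ['etl-daily', 'etl-hourly']
```

This keeps the regex filter from the snippet above but applies it across all pages rather than a single saved response.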

You can filter Dataflow jobs by name using the API below.

API Reference: https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs/aggregated

API Methods:

GET https://dataflow.googleapis.com/v1b3/projects/{projectId}/jobs:aggregated
or
GET https://dataflow.googleapis.com/v1b3/projects/{projectId}/jobs

Query parameter (works for both methods):
    name: the name of the Dataflow job
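For example, the request URL with the name parameter can be assembled with the standard library; the project id and job name below are placeholders:

```python
from urllib.parse import urlencode

BASE = 'https://dataflow.googleapis.com/v1b3'

def aggregated_jobs_url(project_id, job_name=None):
    """Build the projects.jobs:aggregated URL, optionally filtering by job name."""
    url = f'{BASE}/projects/{project_id}/jobs:aggregated'
    if job_name:
        url += '?' + urlencode({'name': job_name})
    return url

print(aggregated_jobs_url('my-project', 'my-dataflow-job'))
# https://dataflow.googleapis.com/v1b3/projects/my-project/jobs:aggregated?name=my-dataflow-job
```

The actual request must also carry an Authorization: Bearer header with a valid OAuth access token.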

