
GCP Dataflow JOB REST response add displayData object with { "key":"datasetName", ...}

Why doesn't this line of code generate a displayData object with { "key":"datasetName", ...}, and how can I generate it if it doesn't come by default when using the BigQuery source from Apache Beam?

bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(project=project, query=get_java_query))

[UPDATE] Adding the result that I am trying to produce:

"displayData": [
                    {
                        "key": "table",
                        "namespace": "....",
                        "strValue": "..."
                    },          
                    {
                        "key": "datasetName",
                        "strValue": "..."
                    }
]

From reading the implementation of display_data() for a BigQuerySource in the most recent version of Beam: it does not extract the table and dataset from the query, which is what your example uses. More significantly, it does not create any field specifically named datasetName.

I would recommend writing a subclass of _BigQuerySource that adds the fields you need to the display data while preserving all the other behavior, as in the sketch below.
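Here is a minimal sketch of that approach. It assumes a Beam version where the native source class is exposed as _BigQuerySource in apache_beam.io.gcp.bigquery (older releases expose the same class directly as beam.io.BigQuerySource); the regex that pulls the dataset and table out of the query text is a hypothetical helper you would adapt to how your own SQL references tables.

import re

import apache_beam as beam
from apache_beam.io.gcp.bigquery import _BigQuerySource
from apache_beam.transforms.display import DisplayDataItem

class AnnotatedBigQuerySource(_BigQuerySource):
    """A BigQuery source that also reports datasetName in its display data."""

    def display_data(self):
        # Keep everything the base source already reports (query,
        # validation flags, ...), then add our own items on top.
        items = super().display_data()

        # Hypothetical extraction: pull `dataset.table` out of the SQL text.
        # Adjust the pattern to match how your queries name tables.
        match = re.search(r'FROM\s+`?([\w\-]+)\.(\w+)`?',
                          self.query or '', re.IGNORECASE)
        if match:
            items['datasetName'] = DisplayDataItem(match.group(1),
                                                   label='Dataset Name')
            items['tableName'] = DisplayDataItem(match.group(2),
                                                 label='Table Name')
        return items

You would then read with it exactly as in your original pipeline:

bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(
    AnnotatedBigQuerySource(project=project, query=get_java_query))

Because display_data() starts from the base class's dictionary, all the fields Beam already emits (such as the query itself) are preserved, and the new items appear alongside them in the job's REST response.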

