GCP Dataflow job REST response: adding a displayData object with {"key": "datasetName", ...}

Question

Why doesn't this line of code generate a displayData object with { "key": "datasetName", ... }, and how can I generate it if it is not produced by default when using the BigQuery source from Apache Beam?

bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(project=project,query=get_java_query))

[UPDATE] Adding the result that I am trying to produce:

"displayData": [
    {
        "key": "table",
        "namespace": "....",
        "strValue": "..."
    },
    {
        "key": "datasetName",
        "strValue": "..."
    }
]

Answer 1

Reading the implementation of display_data() for a BigQuerySource in the most recent version of Beam shows that it does not extract the table and dataset from a query, which is what your example uses. More significantly, it does not create any field specifically named datasetName.

I would recommend writing a subclass of _BigQuerySource that adds the fields you need to the display data while preserving all other behavior.
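As a rough illustration of that subclassing pattern, here is a minimal sketch. It uses a stand-in base class so that it runs without Beam installed; `StandInBigQuerySource`, `BigQuerySourceWithDataset`, and the `dataset_name` argument are hypothetical names, not part of Beam. In a real pipeline the base class would be Beam's `_BigQuerySource`, and since the dataset is not extracted from the query, the sketch assumes you pass its name in explicitly.

```python
class StandInBigQuerySource:
    """Hypothetical stand-in for Beam's _BigQuerySource (assumption, not Beam code)."""

    def __init__(self, project=None, query=None, validate=False):
        self.project = project
        self.query = query
        self.validate = validate

    def display_data(self):
        # Roughly what a query-based source reports: the query, but no dataset.
        return {"query": self.query, "validation": self.validate}


class BigQuerySourceWithDataset(StandInBigQuerySource):
    """Adds a datasetName entry while keeping the parent's display data."""

    def __init__(self, dataset_name, **kwargs):
        super().__init__(**kwargs)
        # Supplied by the caller, because it is not parsed out of the query.
        self.dataset_name = dataset_name

    def display_data(self):
        data = super().display_data()  # keep every field the parent emits
        data["datasetName"] = self.dataset_name
        return data


source = BigQuerySourceWithDataset(
    "my_dataset", project="my-project", query="SELECT 1"
)
display = source.display_data()
```

Calling `super().display_data()` first and then adding keys keeps the existing fields intact, so the only change is the extra `datasetName` entry. Whether that entry then appears in the Dataflow REST response in exactly the shape shown in the question would need to be checked against a real job.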

Solution 1
1 · accepted · 2021-09-25 01:05:30
