GCP Dataflow job REST response: add a displayData object with { "key":"datasetName", ...}
Why doesn't this line of code generate a displayData object with { "key":"datasetName", ...}, and how can I generate it if it is not produced by default when using the BigQuery source from Apache Beam?
bigqcollection = p | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(project=project,query=get_java_query))
[UPDATE] Adding the result that I am trying to produce:
"displayData": [
{
"key": "table",
"namespace": "....",
"strValue": "..."
},
{
"key": "datasetName",
"strValue": "..."
}
]
Reading the implementation of display_data() for BigQuerySource in the most recent version of Beam, I see that it does not extract the table and dataset from a query, which is what your example uses. More significantly, it does not create any field specifically named datasetName.
I would recommend writing a subclass of _BigQuerySource that adds the fields you need to the display data while preserving all the other behavior.
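The override pattern suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: StandInBigQuerySource is a hypothetical stand-in for Beam's _BigQuerySource so that the snippet runs without Beam installed, and DatasetAwareSource and its dataset_name parameter are names invented for this example. In a real pipeline you would presumably subclass the Beam class instead, and whether Dataflow surfaces the extra key in the job REST response is not verified here.

```python
class StandInBigQuerySource:
    """Hypothetical stand-in for Beam's _BigQuerySource (assumption, not Beam code)."""

    def __init__(self, project=None, query=None):
        self.project = project
        self.query = query

    def display_data(self):
        # Mimics a source that reports only the query, with no dataset field.
        return {'query': self.query}


class DatasetAwareSource(StandInBigQuerySource):
    """Adds a datasetName entry to the display data, keeping the parent's entries."""

    def __init__(self, dataset_name, **kwargs):
        super().__init__(**kwargs)
        # The dataset name is passed in explicitly rather than parsed from the query.
        self.dataset_name = dataset_name

    def display_data(self):
        # Copy the parent's display data so its existing keys are preserved.
        data = dict(super().display_data())
        data['datasetName'] = self.dataset_name
        return data


source = DatasetAwareSource(
    dataset_name='my_dataset',
    project='my-project',
    query='SELECT 1')
print(source.display_data())
```

Passing the dataset name explicitly avoids having to parse it out of the SQL text, which the original display_data() does not attempt either.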