How do I fetch the latest repeated entry of a record in BigQuery?
I have a table where every update is pushed as a new entry. The table's schema looks like this:
[
  {
    "id": "221212",
    "fieldsData": [
      {
        "key": "someDate",
        "value": "12-12-2022"
      },
      {
        "key": "someString",
        "value": "ABCDEF"
      }
    ],
    "name": "Sample data 1",
    "createdOn": "12-11-2022",
    "insertedDate": "14-11-2022",
    "updatedOn": "14-11-2022"
  },
  {
    "id": "221212",
    "fieldsData": [
      {
        "key": "someDate",
        "value": "12-12-2022"
      },
      {
        "key": "someString",
        "value": "ABCDEF"
      },
      {
        "key": "someMoreString",
        "value": "12qwwe122"
      }
    ],
    "name": "Sample data 1",
    "createdOn": "12-11-2022",
    "insertedDate": "15-11-2022",
    "updatedOn": "15-11-2022"
  }
]
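The table is partitioned by month on the createdOn field. The fieldsData field is generic and can hold any number of records/fields as separate rows.
How do I fetch the latest entry for id = 221212, and get the repeated records of that latest entry only?
I know I can use flatten, but flatten queries all the records, which defeats the purpose of having a partitioned table.
The query I have so far is: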
SELECT * FROM
(
  SELECT
    id, createdAt, createdBy, fields.key, fields.value,
    DENSE_RANK() OVER (PARTITION BY id ORDER BY insertedDate DESC) AS Rank1
  FROM `mytableName`, UNNEST(fieldsData) AS fields
  WHERE createdAt IS NULL OR DATE(createdAt) = CURRENT_DATE()
)
WHERE Rank1 = 1
PS: Close to 10k records are pushed to this table every day.
Let me know if this serves your purpose.
SELECT AS VALUE ARRAY_AGG(t ORDER BY insertedDate DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY id
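The query above returns the latest whole row for every id. To get only id = 221212 with the repeated fieldsData of that latest row expanded into separate rows, the same pattern can be wrapped in a CTE and unnested afterwards. This is a minimal sketch, not a tested query. It assumes that `project.dataset.table` stands in for the real table, that id is a STRING, and that insertedDate has a type that sorts chronologically (DATE or TIMESTAMP).

```sql
-- Sketch only: the table name is a placeholder, column names come from the sample rows.
WITH latest AS (
  -- Keep just the most recently inserted row for the one id of interest.
  SELECT AS VALUE ARRAY_AGG(t ORDER BY insertedDate DESC LIMIT 1)[OFFSET(0)]
  FROM `project.dataset.table` AS t
  WHERE id = '221212'  -- assumes id is stored as a STRING
  -- If the partition range is known, a filter on createdOn here would limit the partitions scanned.
  GROUP BY id
)
SELECT
  latest.id,
  latest.name,
  latest.insertedDate,
  field.key,
  field.value
FROM latest
-- Only the single surviving row's array is expanded, not every row in the table.
CROSS JOIN UNNEST(latest.fieldsData) AS field;
```

If insertedDate is actually stored as a DD-MM-YYYY string, as in the sample rows, a plain ORDER BY would sort it lexically rather than by date. Ordering by PARSE_DATE('%d-%m-%Y', insertedDate) instead should give the intended result.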