Query that counts total records per day and total records with the same timestamp and id per day in BigQuery
I have time series data like this:
time | id | value |
---|---|---|
2018-04-25 22:00:00 UTC | A | 1 |
2018-04-25 23:00:00 UTC | A | 2 |
2018-04-25 23:00:00 UTC | A | 2.1 |
2018-04-25 23:00:00 UTC | B | 1 |
2018-04-26 23:00:00 UTC | B | 1.3 |
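How do I write a query that produces an output table with the columns date, records, and records_conflicting_time_id, where records_conflicting_time_id is the number of records for which the combination of time and id is not unique? In the sample data above, the two records for id == A at 2018-04-25 23:00:00 UTC would be counted for the date 2018-04-25, so the output of the query should be: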
date | records | records_conflicting_time_id |
---|---|---|
2018-04-25 | 4 | 2 |
2018-04-26 | 1 | 0 |
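Getting records is easy: I just truncate time to get the date and then group by date. But I am really struggling to produce a column that counts the number of records where id + time is not unique within that date...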
with YOUR_DATA as
(
select cast('2018-04-25 22:00:00 UTC' as timestamp) as `time`, 'A' as id, 1.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.0 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.1 as value
union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.0 as value
union all select cast('2018-04-26 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.3 as value
),
-- Attach to every row the number of rows that share its (time, id) pair.
COUNTED as
(
select `time`, id, count(*) over (partition by `time`, id) as same_time_id
from YOUR_DATA
)
select cast(timestamp_trunc(t1.`time`, day) as date) as `date`,
count(*) as records,
-- Count every row whose (time, id) pair occurs more than once,
-- so a day with several duplicated pairs is still counted correctly.
countif(t1.same_time_id > 1) as records_conflicting_time_id
from COUNTED t1
group by cast(timestamp_trunc(t1.`time`, day) as date)
;
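As a sanity check on the expected output in the question, the same per-day counts can be reproduced outside BigQuery. The following is only a minimal sketch in plain Python, not part of the original answer: the function name `summarize` is hypothetical, and it assumes the timestamps are UTC strings so that the first 10 characters give the date. It counts how many rows share each (time, id) pair and treats every row in a pair that occurs more than once as conflicting.

```python
from collections import Counter

# Sample rows from the question: (time, id, value).
rows = [
    ("2018-04-25 22:00:00 UTC", "A", 1.0),
    ("2018-04-25 23:00:00 UTC", "A", 2.0),
    ("2018-04-25 23:00:00 UTC", "A", 2.1),
    ("2018-04-25 23:00:00 UTC", "B", 1.0),
    ("2018-04-26 23:00:00 UTC", "B", 1.3),
]


def summarize(rows):
    """Return {date: (records, records_conflicting_time_id)}."""
    # Number of rows sharing each (time, id) pair.
    pair_sizes = Counter((time, id_) for time, id_, _ in rows)
    summary = {}
    for (time, _id), n in pair_sizes.items():
        day = time[:10]  # assumption: UTC timestamp strings, date = first 10 chars
        records, conflicting = summary.get(day, (0, 0))
        # Every row of a pair that occurs more than once is conflicting.
        summary[day] = (records + n, conflicting + (n if n > 1 else 0))
    return summary


print(summarize(rows))
# {'2018-04-25': (4, 2), '2018-04-26': (1, 0)}
```

A useful extra test case for any query is a day with two different duplicated pairs (for example A twice at 22:00 and B twice at 23:00): all four rows conflict, so the expected value for that day is 4.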