Skip rows in BigQuery based on difference from value in previous row

Assuming the table below is ordered by value (DESC), how can I return only those rows where the difference between the current value and the value in the previous row is less than some number x (e.g. 2), and discard all remaining rows as soon as that condition fails for the first time?

I.e. return only rows 1 and 2 below: the difference between the values of rows 2 and 3 (9.0 - 4.0 = 5.0) is greater than 2, so rows 3 and 4 are skipped.

with table as (
    select 1 as id, "a" as name, 10.0 as value UNION ALL
    select 2, "b", 9.0 UNION ALL 
    select 3, "c", 4.0 UNION ALL 
    select 4, "d", 1.0
)

Expected output:

id, name, value
1,   a,   10.0
2,   b,   9.0

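We can use lag() to compute each row's difference from the previous row (the first row has no predecessor, so coalesce turns its difference into 0). Then join each row to itself and every earlier row (a.id >= b.id) and keep only the rows where max(difference) <= 2 over that range. Because the maximum covers all earlier rows, every row after the first large gap is dropped too, even if its own difference is small.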

with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL 
    select 3, 'c', 4.0 UNION ALL 
    select 4, 'd', 1.0  
)

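select 
  a.id, a.name, a.value,
  max(b.value_diff) max_diff  -- largest gap seen up to and including this row
from t1 a
-- b holds, per row, the absolute difference from the previous row (0 for the first row)
join (select id, abs(coalesce(value - lag(value) over (order by id),0)) as value_diff from t1 )b
on  a.id >= b.id  -- pair each row with itself and all earlier rows
group by a.id, a.name, a.value
having max(b.value_diff) <= 2;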
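
As a possible alternative, the same "running maximum of the gap" idea can be sketched without the self-join by using a second window function. This is only a sketch under a few assumptions: the CTE name diffs and the column running_max_diff are made up for illustration, rows are ordered by value DESC as in the question (with id added as a tie-breaker, which the question does not specify), and the threshold is kept at <= 2 to match the query above.

```sql
with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL
    select 3, 'c', 4.0 UNION ALL
    select 4, 'd', 1.0
),
-- diffs (hypothetical name): gap between each row and the previous one;
-- the first row has no predecessor, so its gap is treated as 0
diffs as (
    select
      id, name, value,
      abs(coalesce(value - lag(value) over (order by value desc, id), 0)) as value_diff
    from t1
)
select id, name, value
from (
    select
      *,
      -- running_max_diff (hypothetical name): largest gap seen so far,
      -- including the current row
      max(value_diff) over (
        order by value desc, id
        rows between unbounded preceding and current row
      ) as running_max_diff
    from diffs
)
where running_max_diff <= 2
order by value desc, id;
```

On the sample data the gaps are 0, 1, 5, 3 and the running maximum is 0, 1, 5, 5, so only ids 1 and 2 pass the filter. Each row is read once per window instead of being joined to all earlier rows, which may matter on larger tables.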
