Skip rows in BigQuery based on difference from value in previous row

Assuming the table below is ordered by value (DESC), how can I return only those rows where the difference between the current value and the value in the previous row is less than some number x (e.g. 2), and also discard all following rows once this condition fails for the first time?

I.e. return only rows 1 and 2 below: the difference between the values of rows 2 and 3 (9.0 - 4.0 = 5.0) is greater than 2, so rows 3 and 4 are skipped.

with table as (
    select 1 as id, "a" as name, 10.0 as value UNION ALL
    select 2, "b", 9.0 UNION ALL 
    select 3, "c", 4.0 UNION ALL 
    select 4, "d", 1.0
)

Expected output:

id, name, value
1,   a,   10.0
2,   b,   9.0

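We can use lag() to compute each row's difference from the previous row, then self-join on a.id >= b.id so that every row sees the differences of all rows up to and including itself. Keeping only the rows with max(difference) <= 2 drops the first row that exceeds the threshold and every row after it.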

with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL 
    select 3, 'c', 4.0 UNION ALL 
    select 4, 'd', 1.0  
)

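select
  a.id, a.name, a.value,
  -- largest row-to-row difference among rows up to and including a.id
  max(b.value_diff) as max_diff
from t1 a
join (
  -- absolute difference from the previous row (by id); 0 for the first row
  select
    id,
    abs(coalesce(value - lag(value) over (order by id), 0)) as value_diff
  from t1
) b
on a.id >= b.id
group by a.id, a.name, a.value
-- rows at or after the first large gap have max_diff > 2 and are dropped
having max(b.value_diff) <= 2;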
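
As a possible alternative to the self-join above, the same filter can be sketched with window functions only: one pass computes the gap to the previous row, a second pass keeps a running maximum of those gaps. This is a sketch, not part of the original answer. The CTE names diffs and flagged and the column max_diff_so_far are made up for illustration, rows are ordered by value DESC as in the question (with id as an assumed tie-breaker), and the threshold 2 is hard-coded.

```sql
with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL
    select 3, 'c', 4.0 UNION ALL
    select 4, 'd', 1.0
),
diffs as (
    select
      id, name, value,
      -- gap to the previous row in descending value order; 0 for the first row
      coalesce(lag(value) over (order by value desc, id) - value, 0) as value_diff
    from t1
),
flagged as (
    select
      id, name, value,
      -- largest gap seen from the first row up to the current row
      max(value_diff) over (
        order by value desc, id
        rows between unbounded preceding and current row
      ) as max_diff_so_far
    from diffs
)
select id, name, value
from flagged
-- once a gap exceeds 2, the running maximum stays above 2 for all later rows
where max_diff_so_far <= 2
order by value desc, id;
```

On the sample data the gaps are 0, 1, 5 and 3, so the running maximum is 0, 1, 5, 5 and only rows 1 and 2 are returned. Two CTEs are used because a window function cannot be nested directly inside another window function.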

