
Skip rows in BigQuery based on difference from value in previous row

Assuming the table below is ordered by value (DESC), how can I return only those rows where the difference between the current value and the value in the previous row is less than some number x (e.g. 2), and discard all remaining rows as soon as that difference first reaches or exceeds x?

I.e. return only rows 1 and 2 below: the difference between the values of rows 2 and 3 (9.0 - 4.0 = 5.0) is greater than 2, so rows 3 and 4 are skipped.

with table as (
    select 1 as id, "a" as name, 10.0 as value UNION ALL
    select 2, "b", 9.0 UNION ALL
    select 3, "c", 4.0 UNION ALL
    select 4, "d", 1.0
)


Expected output:

id, name, value
1,   a,   10.0
2,   b,   9.0

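We can use lag() to compute each row's difference from the previous row, then self-join on a.id >= b.id so that every row sees the differences of all rows up to and including itself. Keeping only the rows where max(value_diff) <= 2 drops the first row that breaks the threshold and everything after it.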

with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL 
    select 3, 'c', 4.0 UNION ALL 
    select 4, 'd', 1.0
)
select
  a.id, a.name, a.value,
  max(b.value_diff) max_diff
from t1 a
join (select id, abs(coalesce(value - lag(value) over (order by id),0)) as value_diff from t1 )b
on  a.id >= b.id
group by a.id, a.name, a.value
having max(b.value_diff) <= 2;
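
As a possible alternative, the same "maximum difference so far" idea can be sketched without the self-join by taking a running max() over a window. This is only a sketch under a few assumptions: the CTE name diffs and the column max_diff_so_far are made up for illustration, and the rows are ordered by value desc (as the question describes) rather than by id. The two window functions sit in separate query levels because one analytic function cannot be nested inside another.

```sql
with t1 as (
    select 1 as id, 'a' as name, 10.0 as value UNION ALL
    select 2, 'b', 9.0 UNION ALL
    select 3, 'c', 4.0 UNION ALL
    select 4, 'd', 1.0
),
-- hypothetical helper CTE: absolute difference from the previous row
-- (0 for the first row, which has no predecessor)
diffs as (
    select
        id, name, value,
        abs(coalesce(value - lag(value) over (order by value desc), 0)) as value_diff
    from t1
)
select id, name, value
from (
    select
        *,
        -- largest difference seen from the first row up to the current one
        max(value_diff) over (
            order by value desc
            rows between unbounded preceding and current row
        ) as max_diff_so_far
    from diffs
)
where max_diff_so_far <= 2
order by value desc;
```

For the sample data, value_diff is 0, 1, 5, 3 and the running maximum is 0, 1, 5, 5, so only ids 1 and 2 pass the filter. If the threshold must be strict ("less than x"), change <= 2 to < 2.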

