How to get cumulative data? I am trying to get the bucketing below in SQL/PySpark, but I am not getting the desired result

I want values that fall in the <30 bucket not to be shown in the <60 bucket, and so on. How can I achieve this?

    case when DATEDIFF((to_date('2022-01-07', 'yyyy-MM-dd')),max(to_date((column), 'yyyy-MM-dd'))) between 0 and 29 then '<30'
         when DATEDIFF((to_date('2022-01-07', 'yyyy-MM-dd')),max(to_date((column), 'yyyy-MM-dd'))) between 0 and 59 then '<60'
         when DATEDIFF((to_date('2022-01-07', 'yyyy-MM-dd')),max(to_date((column), 'yyyy-MM-dd'))) between 0 and 89 then '<90'
         when DATEDIFF((to_date('2022-01-07', 'yyyy-MM-dd')),max(to_date((column), 'yyyy-MM-dd'))) between 0 and 179 then '<180'
         when DATEDIFF((to_date('2022-01-07', 'yyyy-MM-dd')),max(to_date((column), 'yyyy-MM-dd'))) > 180 then '>180'
         else 'bad_date'
    end

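Just change your BETWEEN conditions so that the ranges no longer overlap:

    ... between 0 and 29 then '<30'
    ... between 30 and 59 then '<60'
    ... between 60 and 89 then '<90'
    ... between 90 and 179 then '<180'
    ... >= 180 then '>180'

The last condition uses `>= 180` rather than `> 180` so that a difference of exactly 180 days also lands in a bucket instead of matching none of them.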
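
To check the boundaries without running a query, the non-overlapping ranges can be sketched in plain Python. This is only an illustration, not PySpark code: the names `bucket_for` and `REFERENCE_DATE` are made up here, the reference date is the fixed `2022-01-07` from the question, and the last bucket is assumed to start at 180 days inclusive.

```python
from datetime import date

# Hypothetical constant: the fixed reference date used in the question's query.
REFERENCE_DATE = date(2022, 1, 7)


def bucket_for(last_date_str):
    """Return the age bucket for a 'yyyy-MM-dd' date string, or 'bad_date'."""
    try:
        last_date = date.fromisoformat(last_date_str)
    except (TypeError, ValueError):
        # Missing or unparseable input plays the role of the ELSE branch.
        return "bad_date"

    # Days from last_date to the reference date, like DATEDIFF(end, start).
    days = (REFERENCE_DATE - last_date).days

    if 0 <= days <= 29:
        return "<30"
    if 30 <= days <= 59:
        return "<60"
    if 60 <= days <= 89:
        return "<90"
    if 90 <= days <= 179:
        return "<180"
    if days >= 180:  # assumption: exactly 180 days belongs to the last bucket
        return ">180"
    return "bad_date"  # negative difference: the date is after the reference date


if __name__ == "__main__":
    for d in ["2021-12-09", "2021-12-08", "2021-07-11", "2022-02-01"]:
        print(d, bucket_for(d))
```

Each date ends up in exactly one bucket, so a value counted under `<30` cannot also appear under `<60`.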
