How can I add a count to rank null values in SQL Hive?

This is what I have right now:

| time  | car_id | order | in_order |
|-------|--------|-------|----------|
| 12:31 | 32     | null  | 0        |
| 12:33 | 32     | null  | 0        |
| 12:35 | 32     | null  | 0        |
| 12:37 | 32     | 123   | 1        |
| 12:38 | 32     | 123   | 1        |
| 12:39 | 32     | 123   | 1        |
| 12:41 | 32     | 123   | 1        |
| 12:43 | 32     | 123   | 1        |
| 12:45 | 32     | null  | 0        |
| 12:47 | 32     | null  | 0        |
| 12:49 | 32     | 321   | 1        |
| 12:51 | 32     | 321   | 1        |

I'm trying to rank orders, including those that have null values, within each car_id in this case. This is the result I'm looking for:

| time  | car_id | order | in_order | row |
|-------|--------|-------|----------|-----|
| 12:31 | 32     | null  | 0        | 1   |
| 12:33 | 32     | null  | 0        | 1   |
| 12:35 | 32     | null  | 0        | 1   |
| 12:37 | 32     | 123   | 1        | 2   |
| 12:38 | 32     | 123   | 1        | 2   |
| 12:39 | 32     | 123   | 1        | 2   |
| 12:41 | 32     | 123   | 1        | 2   |
| 12:43 | 32     | 123   | 1        | 2   |
| 12:45 | 32     | null  | 0        | 3   |
| 12:47 | 32     | null  | 0        | 3   |
| 12:49 | 32     | 321   | 1        | 4   |
| 12:51 | 32     | 321   | 1        | 4   |

I just don't know how to manage a count for the null values. Thanks!

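You can treat this as a gaps-and-islands problem: compare each row's order with the previous row's using the null-safe operator <=> (so that NULL matches NULL), flag the rows where the value changes, and take a running sum of those flags:

select t.*,
       sum(case when rn = 1 or not (`order` <=> prev_order) then 1 else 0 end)
           over (partition by car_id order by `time`) as `row`
from (select t.*,
             -- previous row's order within the same car; NULL on the first row
             lag(`order`) over (partition by car_id order by `time`) as prev_order,
             -- rn = 1 marks the first row per car, so it always starts group 1
             row_number() over (partition by car_id order by `time`) as rn
      from t
     ) t;

order, time and row are quoted with backticks because they can be reserved words in Hive. The result also carries the helper columns prev_order and rn; list the columns explicitly instead of using t.* if you want to drop them.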
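The numbering the question asks for (a new number each time order changes between consecutive rows of the same car_id, with NULL treated as an ordinary value) can be sanity-checked outside Hive. Below is a minimal plain-Python sketch of that expected result, using the sample rows from the question. It is not Hive code, and `number_runs` is a hypothetical helper name introduced only for illustration.

```python
# Sample rows from the question: (time, car_id, order); None stands for NULL.
rows = [
    ("12:31", 32, None), ("12:33", 32, None), ("12:35", 32, None),
    ("12:37", 32, 123), ("12:38", 32, 123), ("12:39", 32, 123),
    ("12:41", 32, 123), ("12:43", 32, 123),
    ("12:45", 32, None), ("12:47", 32, None),
    ("12:49", 32, 321), ("12:51", 32, 321),
]


def number_runs(rows):
    """Number consecutive runs of equal `order` values per car_id.

    Returns (time, car_id, order, row) tuples. None is treated as equal
    to None, which mirrors a null-safe comparison.
    """
    result = []
    state = {}  # car_id -> (previous order, current run number)
    # Sort by car_id, then time (HH:MM strings sort chronologically).
    for time, car_id, order in sorted(rows, key=lambda r: (r[1], r[0])):
        if car_id not in state:
            run = 1  # first row of this car always starts run 1
        else:
            prev_order, run = state[car_id]
            if order != prev_order:  # None != None is False in Python
                run += 1
        state[car_id] = (order, run)
        result.append((time, car_id, order, run))
    return result


ranked = number_runs(rows)
for r in ranked:
    print(r)
# The last field of each tuple is 1,1,1,2,2,2,2,2,3,3,4,4 in order.
```

Comparing the last field against the `row` column of the desired table is a quick way to confirm that any SQL you end up with produces the same grouping.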
