Hive logic to get min time, max time and other columns

I have data in the following format:

+---------------------+-------------------------+-------------------------+-----------+------+
|         id          |       start time        |        end time         | direction | name |
+---------------------+-------------------------+-------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 15:10:28.677 | 2015-06-02 15:32:22.677 |         3 | xyz  |
| 9202340753368000000 | 2015-06-02 14:55:37.353 | 2015-06-02 15:12:18.84  |         1 | xyz  |
+---------------------+-------------------------+-------------------------+-----------+------+

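I need output with the minimum start time, the maximum end time, and the direction and name values from the row that has the minimum start time: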

+---------------------+-------------------------+------------------------+-----------+------+
|         id          |       start time        |        end time        | direction | name |
+---------------------+-------------------------+------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 14:55:37.353 | 2015-06-02 15:32:22.677|         1 | xyz  |
+---------------------+-------------------------+------------------------+-----------+------+

I tried using:

select x.id, min(x.start_time) as mintime, max(x.end_time) as maxtime, y.direction, y.name
from dir_samp x
inner join (
  select id, start_time, end_time, name, direction,
         rank() over (partition by id order by start_time asc) as r
  from dir_samp
) y on x.id = y.id
where y.r = 1
group by x.id, y.direction, y.name

Is there a more efficient way to do this? Please suggest one.

Thanks

You don't need the inner join:

select y.id, min(y.start_time) as mintime, 
       max(y.end_time) maxtime , 
       max(case when y.r=1 then y.direction end) as direction, 
       max(case when y.r=1 then y.name end) as name 
from
( 
 select id, start_time,  end_time, name, direction ,  
   rank() over ( partition by id order by start_time asc) as r 
   from dir_samp 
) y 
group by y.id;

Another option skips the window function: take min() over a struct whose first field is start_time, so that direction and name are carried along with the earliest start time:

select      id
           ,min_vals.start_time
           ,end_time
           ,min_vals.direction
           ,min_vals.name

from       (select      id  
                       ,min(named_struct('start_time',start_time,'direction',direction,'name',name)) as min_vals
                       ,max(end_time)                                                                as end_time

            from        dir_samp

            group by    id
            ) t
;

+---------------------+----------------------------+----------------------------+-----------+------+
| id                  | start_time                 | end_time                   | direction | name |
+---------------------+----------------------------+----------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 14:55:37.353000 | 2015-06-02 15:32:22.677000 | 1         | xyz  |
+---------------------+----------------------------+----------------------------+-----------+------+
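
The window-function answer above can be sanity-checked locally without a Hive cluster. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module, SQLite 3.25 or later for window functions) evaluates this particular query the same way Hive does. The table name dir_samp and its columns come from the question; the helper run_query, the text column types, and the in-memory database are illustrative assumptions, not part of the original answers.

```python
import sqlite3

# Sample rows copied from the question. Timestamps are stored as text
# (an assumption for this sketch); for these values string comparison
# matches chronological order.
ROWS = [
    ("9202340753368000000", "2015-06-02 15:10:28.677", "2015-06-02 15:32:22.677", 3, "xyz"),
    ("9202340753368000000", "2015-06-02 14:55:37.353", "2015-06-02 15:12:18.84", 1, "xyz"),
]

# The query from the first answer: rank rows per id by start_time, then
# pick direction and name from the rank-1 row with conditional aggregation.
QUERY = """
select y.id,
       min(y.start_time) as mintime,
       max(y.end_time)   as maxtime,
       max(case when y.r = 1 then y.direction end) as direction,
       max(case when y.r = 1 then y.name end)      as name
from (
    select id, start_time, end_time, name, direction,
           rank() over (partition by id order by start_time asc) as r
    from dir_samp
) y
group by y.id
"""


def run_query(rows):
    """Load rows into an in-memory dir_samp table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table dir_samp ("
        "id text, start_time text, end_time text, direction integer, name text)"
    )
    conn.executemany("insert into dir_samp values (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_query(ROWS):
        print(row)
```

Because rank() assigns 1 to every row that ties for the earliest start_time, the max(case ...) expressions pick the largest direction and name among the tied rows; if ties matter, row_number() with an explicit tie-breaker in the order by is a possible alternative.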
