
Hive query select one column depending on another column during group by

There are similar questions out there, but their solutions don't quite solve my problem. Consider the following table:

id type time
1  a    1
1  a    2
1  b    3
2  b    1
2  b    2

For each id, I want the row with the smallest time, together with the type associated with that time, so the result should be:

id type time
1  a    1
2  b    1

(If there is a tie in time between different types, choosing any of the types is acceptable.)

My current query looks like:

SELECT id, type, min(time) FROM t GROUP BY id, type;

which fails because an id with more than one type still returns one row per type. Is there a query that achieves this? Many thanks.

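Instead of GROUP BY, use row_number(). It numbers the rows within each id in ascending order of time, so filtering on seqnum = 1 keeps the whole row that has the smallest time: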

select t.*
from (select t.*,
             row_number() over (partition by id order by time) as seqnum
      from t
     ) t
where seqnum = 1;
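
To check the behaviour on the sample data, here is a minimal sketch that runs both the original GROUP BY attempt and the row_number() query. It uses Python's built-in sqlite3 module as a stand-in for Hive, which is an assumption made only so the example is self-contained; window functions require SQLite 3.25 or newer. The extra ORDER BY clauses are added just to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for Hive here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type TEXT, time INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 1), (1, "a", 2), (1, "b", 3), (2, "b", 1), (2, "b", 2)],
)

# Original attempt: grouping by (id, type) keeps one row per distinct pair,
# so id 1 appears twice (once for 'a', once for 'b').
grouped = conn.execute(
    "SELECT id, type, min(time) FROM t GROUP BY id, type ORDER BY id, type"
).fetchall()

# row_number() approach: number rows within each id by time, keep the first.
ranked = conn.execute(
    """
    SELECT id, type, time
    FROM (SELECT t.*,
                 row_number() OVER (PARTITION BY id ORDER BY time) AS seqnum
          FROM t) AS s
    WHERE seqnum = 1
    ORDER BY id
    """
).fetchall()

print(grouped)  # three rows: id 1 is listed once per type
print(ranked)   # one row per id
```

On this data the first query returns three rows and the second returns exactly the two rows asked for in the question.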

-- A subquery join can also achieve this.
-- Note: the example below uses SQL Server (T-SQL) syntax (#Temp, [brackets]), not Hive syntax.

CREATE TABLE #Temp (
    id INT
    ,[type] CHAR(1)
    ,[time] INT
    )

INSERT INTO #Temp VALUES 
(1,'a',1),
(1,'a',2),
(1,'b',3),
(2,'b',1),
(2,'b',2)

SELECT DISTINCT T.id
    ,T.[type]
    ,DT.MinTime
FROM #Temp T
INNER JOIN (
    -- smallest time per id (group by id, not by type)
    SELECT id
        ,MIN([time]) AS MinTime
    FROM #Temp
    GROUP BY id
    ) AS DT ON T.id = DT.id
    AND T.[time] = DT.MinTime
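
One thing to keep in mind with the join-on-minimum idea is how it behaves on ties. The sketch below computes the minimum time per id and joins back on both id and time, again using Python's sqlite3 module as a stand-in database (an assumption for illustration). The row (2, 'c', 1) is a hypothetical tie added to the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type TEXT, time INTEGER)")
# Sample rows from the question plus one hypothetical tie: (2, 'c', 1).
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 1), (1, "a", 2), (1, "b", 3),
     (2, "b", 1), (2, "b", 2), (2, "c", 1)],
)

# Find the smallest time per id, then join back to pick up the type.
joined = conn.execute(
    """
    SELECT DISTINCT t.id, t.type, dt.min_time
    FROM t
    INNER JOIN (SELECT id, min(time) AS min_time
                FROM t
                GROUP BY id) AS dt
        ON t.id = dt.id AND t.time = dt.min_time
    ORDER BY t.id, t.type
    """
).fetchall()

print(joined)  # id 2 appears twice because 'b' and 'c' tie at time 1
```

With tied minimum times the join returns every tied row, so an id can appear more than once. If exactly one row per id is required, a row_number() filter avoids this because it assigns 1 to only one row in each partition.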

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address.

 