Hive query select one column depending on another column during group by

Question

There are similar questions out there, but their solutions don't quite solve my problem. Consider the following table:

id type time
1  a    1
1  a    2
1  b    3
2  b    1
2  b    2

For each id, I want the smallest time and the type associated with that time, so the result should be:

id type time
1  a    1
2  b    1

(If there is a tie in time between different types, choosing any of those types is acceptable.) My current query looks like:

SELECT id, type, min(time) FROM t GROUP BY id, type;

This returns one row per (id, type) pair, so id 1 appears twice, once for each type. Is there a query that achieves the result above? Many thanks.

Answer 1

Instead of group by, use row_number():

select t.*
from (select t.*,
             row_number() over (partition by id order by time) as seqnum
      from t
     ) t
where seqnum = 1;
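
As a rough sketch of the same idea, the variant below lists the output columns explicitly (so seqnum is not returned) and adds type as a second sort key, which makes the choice repeatable when two types share the smallest time. It assumes the table t(id, type, time) from the question; the alias ranked is only illustrative, and depending on the Hive version the column names time and type may need backticks if they are treated as reserved words.

```sql
-- Sketch only: assumes the table t(id, type, time) from the question.
SELECT id, type, time
FROM (
    SELECT id, type, time,
           -- Number the rows within each id, smallest time first.
           -- Sorting by type as well breaks ties the same way on every run.
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time, type) AS seqnum
    FROM t
) ranked
WHERE seqnum = 1;
```

Because the numbering restarts for every id, keeping only seqnum = 1 leaves exactly one row per id, whereas grouping by id and type keeps one row for every (id, type) pair.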

Answer 2

-- A subquery can also achieve this. (The example uses SQL Server syntax: #Temp is a temporary table.)

CREATE TABLE #Temp (
    id INT
    ,[type] CHAR(1)
    ,[time] INT
    )

INSERT INTO #Temp VALUES 
(1,'a',1),
(1,'a',2),
(1,'b',3),
(2,'b',1),
(2,'b',2)

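SELECT DISTINCT T.id
    ,T.[type]
    ,DT.MinTime
FROM #Temp T
INNER JOIN (
    -- smallest time per id (not per type)
    SELECT id
        ,MIN([time]) AS MinTime
    FROM #Temp
    GROUP BY id
    ) AS DT ON T.id = DT.id
    AND T.[time] = DT.MinTime
-- Note: if several types share an id's smallest time, one row per type is returned.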

2 answers

Answer 1: score 1, accepted, 2017-05-11 02:47:00

Answer 2: score -1, 2017-05-11 07:46:41
