Value from previous row in GROUP BY as column

I have this table:

+----------+-------------+-------------------+------------------+
|    userId|       testId|               date|              note|
+----------+-------------+-------------------+------------------+
| 123123123|            1|2019-01-22 02:03:00|               aaa|
| 123123123|            1|2019-02-22 02:03:00|               bbb|
| 123456789|            2|2019-03-23 02:03:00|               ccc|
| 123456789|            2|2019-04-23 02:03:00|               ddd|
| 321321321|            3|2019-05-23 02:03:00|               eee|
+----------+-------------+-------------------+------------------+

I would like to get the newest note (the whole row) for each (userId, testId) group:

SELECT
    n.userId,
    n.testId,
    n.date,
    n.note
FROM 
    notes n
INNER JOIN (
    SELECT 
        userId,
        testId,
        MAX(date) as maxDate
    FROM 
        notes
    GROUP BY 
        userId,
        testId
) temp ON n.userId = temp.userId AND n.testId = temp.testId AND n.date = temp.maxDate

It works.
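
As a quick check, the join above can be run against the sample rows. The sketch below uses an in-memory SQLite database from Python's standard library purely for illustration; the helper name `latest_notes` and the added `ORDER BY` (for stable output) are assumptions, not part of the original query.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (123123123, 1, "2019-01-22 02:03:00", "aaa"),
    (123123123, 1, "2019-02-22 02:03:00", "bbb"),
    (123456789, 2, "2019-03-23 02:03:00", "ccc"),
    (123456789, 2, "2019-04-23 02:03:00", "ddd"),
    (321321321, 3, "2019-05-23 02:03:00", "eee"),
]

# The question's query; ORDER BY is added only to make the output deterministic.
LATEST_NOTE_SQL = """
SELECT n.userId, n.testId, n.date, n.note
FROM notes n
INNER JOIN (
    SELECT userId, testId, MAX(date) AS maxDate
    FROM notes
    GROUP BY userId, testId
) temp
  ON n.userId = temp.userId
 AND n.testId = temp.testId
 AND n.date = temp.maxDate
ORDER BY n.userId
"""


def latest_notes(rows):
    """Load rows into a throwaway table and return the newest row per group."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE notes (userId INTEGER, testId INTEGER, date TEXT, note TEXT)"
        )
        conn.executemany("INSERT INTO notes VALUES (?, ?, ?, ?)", rows)
        return conn.execute(LATEST_NOTE_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_notes(ROWS):
        print(row)
```

Note that if two rows in a group share the same maximum date, this join returns both of them.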

But now I'd also like to have the previous note in each row:

+----------+-------------+-------------------+-------------+------------+
|    userId|       testId|               date|         note|previousNote|
+----------+-------------+-------------------+-------------+------------+
| 123123123|            1|2019-02-22 02:03:00|          bbb|         aaa|
| 123456789|            2|2019-04-23 02:03:00|          ddd|         ccc|
| 321321321|            3|2019-05-23 02:03:00|          eee|        null|
+----------+-------------+-------------------+-------------+------------+

I have no idea how to do it. I heard that the LAG() function might be useful, but I found no good examples for my case.

I'd like to use it on a DataFrame in PySpark (in case that matters).

Use the lag() and row_number() analytic functions:

select userid,testid,date,note,previous_note
from
(select userid,testid,date,note,
lag(note) over(partition by userid,testid order by date) as previous_note,
row_number() over(partition by userid,testid order by date desc) rn
from table_name
) a where a.rn=1

Alternatively, use lead() over a descending sort on date:

select userid,testid,date,note,previous_note
from
(select userid,testid,date,note,
lead(note) over(partition by userid,testid order by date desc) as previous_note,
row_number() over(partition by userid,testid order by date desc) srno
from table_name
) a where a.srno=1

This should give you the result you want: for each group, the row with the latest date, with the note from the previous date as previous_note.
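
To see that both window-function variants agree, here is a minimal sketch that runs them against the question's sample rows. It assumes an in-memory SQLite database (window functions need SQLite 3.25 or newer) rather than Spark, a table named `notes`, and a hypothetical helper `run_query`; the `ORDER BY userId` is added only for stable output.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (123123123, 1, "2019-01-22 02:03:00", "aaa"),
    (123123123, 1, "2019-02-22 02:03:00", "bbb"),
    (123456789, 2, "2019-03-23 02:03:00", "ccc"),
    (123456789, 2, "2019-04-23 02:03:00", "ddd"),
    (321321321, 3, "2019-05-23 02:03:00", "eee"),
]

# Variant 1: LAG over ascending dates, keep the newest row via ROW_NUMBER.
LAG_SQL = """
SELECT userId, testId, date, note, previous_note
FROM (
    SELECT userId, testId, date, note,
           LAG(note) OVER (PARTITION BY userId, testId ORDER BY date) AS previous_note,
           ROW_NUMBER() OVER (PARTITION BY userId, testId ORDER BY date DESC) AS rn
    FROM notes
) a
WHERE a.rn = 1
ORDER BY userId
"""

# Variant 2: LEAD over descending dates looks at the same neighbouring row.
LEAD_SQL = """
SELECT userId, testId, date, note, previous_note
FROM (
    SELECT userId, testId, date, note,
           LEAD(note) OVER (PARTITION BY userId, testId ORDER BY date DESC) AS previous_note,
           ROW_NUMBER() OVER (PARTITION BY userId, testId ORDER BY date DESC) AS srno
    FROM notes
) a
WHERE a.srno = 1
ORDER BY userId
"""


def run_query(sql, rows):
    """Load rows into a throwaway notes table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE notes (userId INTEGER, testId INTEGER, date TEXT, note TEXT)"
        )
        conn.executemany("INSERT INTO notes VALUES (?, ?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query(LAG_SQL, ROWS):
        print(row)
```

A group with a single row, such as userId 321321321, has no earlier note, so previous_note comes back as NULL (None in Python).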
