简体   繁体   中英

Select column based on max date timestamp

I'm trying to select the row with the max(date_added) paying respect to only duplicate values in col_a and col_b with log_id being unique. I've seen solutions with DISTINCT ON and using window functions, but I'm unable to get the syntax correct to get me the result I want.

From this table:

+--------+-------+-------+-------+-------+-------+-------------------------+
| log_id | col_a | col_b | col_c | col_d | col_e |       date_added        |
+--------+-------+-------+-------+-------+-------+-------------------------+
|      1 | ACME  | West  |    14 |    27 | A     | 2016-01-01 12:45:00.453 |
|      2 | ACME  | West  |    17 |    31 | A     | 2016-01-02 07:33:24.551 |
|      3 | ACME  | East  |    13 |    27 | B     | 2016-01-01 11:43:21.223 |
|      4 | ACME  | East  |    17 |    28 | A     | 2016-01-03 09:15:50.113 |
+--------+-------+-------+-------+-------+-------+-------------------------+

How can I return this:

+--------+-------+-------+-------+-------+-------+-------------------------+
| log_id | col_a | col_b | col_c | col_d | col_e |       date_added        |
+--------+-------+-------+-------+-------+-------+-------------------------+
|      2 | ACME  | West  |    17 |    31 | A     | 2016-01-02 07:33:24.551 |
|      4 | ACME  | East  |    17 |    28 | A     | 2016-01-03 09:15:50.113 |
+--------+-------+-------+-------+-------+-------+-------------------------+

one approach is to use window functions... something like...

with cte as (SELECTlog_id,col_a,col_b,col_c,col_d, col_e, date_added,
                    row_number() over (partition by col_A, col_B order by date_added desc) rn
             FROM tableName )
SELECT log_id,col_a,col_b,col_c,col_d,col_e,date_added 
FROM cte 
WHERE rn = 1;

What this does is assign a row number to each grouping of col_A and col_B (reseting row number to 1 for each A, b group) in the descending order of date added and then return only those records with a row number of 1 (having the max date)

or prior to window'd functions... we get a set of data for max dates in each col_a and Col_B group and then join back to the base set to limit the data.

Select A.* 
FROM TableName A
INNER JOIN (SELECT max(date_added) mda, col_A, col_B
            FROM tableName) B
 on B.MDA = A.Date_Added
and B.Col_A = A.Col_A
and B.Col_B = A.Col_B

Using distinct on it is simple:

select distinct on (col_a, col_b)
  *
from
  yourtable
order by
  col_a, col_b, -- columns from `distinct on` should be in the `order by` clause
  -- and here is a max value of date_added for every distinct (col_a, col_b) pairs 
  -- because of descending order
  date_added desc;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM