
How can I get sum(value) at the latest gather_time per group (name, col1) in PostgreSQL?

I got a good answer to a similar question in the thread below, but I need a solution for a different data set.

How to get the latest 2 rows ( PostgreSQL )

The data set contains historical data, and I just want sum(value) for each group at its latest gather_time. The final result should look like this:

 name  | col1 |     gather_time     | sum
-------+------+---------------------+-----
 first | 100  | 2016-01-01 23:12:49 |   6
 first | 200  | 2016-01-01 23:11:13 |   4

However, the query below returns rows for only one group (first-100) and nothing for the second group (first-200). I need exactly one row per group, and the number of groups can vary.

select name,col1,gather_time,sum(value) 
from testtable
group by name,col1,gather_time
order by gather_time desc
limit 2;

 name  | col1 |     gather_time     | sum
-------+------+---------------------+-----
 first | 100  | 2016-01-01 23:12:49 |   6
 first | 100  | 2016-01-01 23:11:19 |   6
(2 rows)

Can you advise me on how to accomplish this?

Data set

create table testtable
(
name varchar(30),
col1 varchar(30),
col2 varchar(30),
gather_time timestamp,
value integer
);


insert into testtable values('first','100','q1','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q2','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q3','2016-01-01 23:11:19',2);
insert into testtable values('first','200','t1','2016-01-01 23:11:13',2);
insert into testtable values('first','200','t2','2016-01-01 23:11:13',2);
insert into testtable values('first','100','q1','2016-01-01 23:11:11',2);
insert into testtable values('first','100','q1','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q2','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q3','2016-01-01 23:12:49',2);

select * 
from testtable 
order by name,col1,gather_time;

 name  | col1 | col2 |     gather_time     | value
-------+------+------+---------------------+-------
 first | 100  | q1   | 2016-01-01 23:11:11 |     2
 first | 100  | q2   | 2016-01-01 23:11:19 |     2
 first | 100  | q3   | 2016-01-01 23:11:19 |     2
 first | 100  | q1   | 2016-01-01 23:11:19 |     2
 first | 100  | q3   | 2016-01-01 23:12:49 |     2
 first | 100  | q1   | 2016-01-01 23:12:49 |     2
 first | 100  | q2   | 2016-01-01 23:12:49 |     2
 first | 200  | t2   | 2016-01-01 23:11:13 |     2
 first | 200  | t1   | 2016-01-01 23:11:13 |     2

One option is to join your original table to a derived table containing only the latest gather_time for each (name, col1) group. Then you can take the sum of the value column for each group to get the result set you want.

SELECT t1.name, t1.col1, MAX(t1.gather_time) AS gather_time, SUM(t1.value) AS sum
FROM testtable t1 INNER JOIN
(
    -- latest gather_time for each (name, col1) group
    SELECT name, col1, MAX(gather_time) AS maxTime
    FROM testtable
    GROUP BY name, col1
) t2
ON t1.name = t2.name AND t1.col1 = t2.col1 AND
    t1.gather_time = t2.maxTime
GROUP BY t1.name, t1.col1;

If you would rather use a subquery in the WHERE clause, as you attempted in your original post, to restrict the rows to those with the latest gather_time, you could try the following:

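SELECT name, col1, gather_time, SUM(value) AS sum
FROM testtable t1
WHERE gather_time =
(
    -- latest gather_time within the current row's (name, col1) group
    SELECT MAX(gather_time)
    FROM testtable t2
    WHERE t1.name = t2.name AND t1.col1 = t2.col1
)
-- gather_time is selected without an aggregate, so PostgreSQL requires it in GROUP BY
GROUP BY name, col1, gather_time;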
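
A window function may also work here. The following is only a sketch, assuming the testtable definition from the question; the derived-table alias ranked and the column alias rnk are names made up for illustration. RANK() gives every row that shares the newest gather_time within a (name, col1) group the rank 1, so all of those rows are kept and summed.

```sql
-- Sketch: rank each row's gather_time within its (name, col1) group,
-- newest first, then keep and sum only the rows at the newest timestamp.
SELECT name, col1, gather_time, SUM(value) AS sum
FROM (
    SELECT name, col1, gather_time, value,
           RANK() OVER (PARTITION BY name, col1
                        ORDER BY gather_time DESC) AS rnk  -- ties share rank 1
    FROM testtable
) ranked
WHERE rnk = 1
GROUP BY name, col1, gather_time
ORDER BY name, col1;
```

Against the sample data in the question, this should return one row per group: sum 6 for first/100 at 23:12:49 and sum 4 for first/200 at 23:11:13. ROW_NUMBER() would not be a drop-in replacement, because it would keep only a single row per group instead of all rows sharing the latest timestamp.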
