
How to aggregate over unique values in Postgres with window functions

My program deals with insecticide sprayers attempting to spray a number of actual units based on an original target. For example, the sprayer Joe was supposed to spray 10 targets but he actually sprayed 7.

I am given a table that summarizes all planned targets (column = target) and actual targets (column = actual), along with some other data, including the sprayer's name. Here's the schema:

CREATE TABLE spray_summary
(
 id character varying(1),
 target integer,
 ref_id character varying(1),
 actual integer,
 sprayer character varying(25)
)

The data is a denormalized join between (id, target) and (ref_id, actual, sprayer), but for now this table is all I have to work with. Here are the full results:

SELECT * FROM spray_summary
+----+--------+--------+--------+---------+
| id | target | ref_id | actual | sprayer |
+----+--------+--------+--------+---------+
| a  |      1 | "l"    |     10 | "Joe"   |
| a  |      1 | "m"    |     10 | "Joe"   |
| a  |      1 | "p"    |     10 | "Joe"   |
| c  |      3 | "n"    |     10 | "Joe"   |
| c  |      3 | "o"    |     10 | "Joe"   |
+----+--------+--------+--------+---------+
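
To reproduce this locally, the rows above can be loaded with something like the following. This is a sketch that assumes the double quotes shown in the output are added by the client and are not part of the stored values:

-- Load the five sample rows shown above into spray_summary.
INSERT INTO spray_summary (id, target, ref_id, actual, sprayer) VALUES
    ('a', 1, 'l', 10, 'Joe'),
    ('a', 1, 'm', 10, 'Joe'),
    ('a', 1, 'p', 10, 'Joe'),
    ('c', 3, 'n', 10, 'Joe'),
    ('c', 3, 'o', 10, 'Joe');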

You can see that id value "a" is repeated three times due to the one-to-many join and "c" repeats twice. Given this, what I want to do is show the SUM of target values, the SUM of "actual" values, and the sprayer. I used the following query:

SELECT SUM(target) targets, SUM(actual) actuals, sprayer FROM spray_summary GROUP BY sprayer

Which returned the result:

+---------+---------+---------+
| targets | actuals | sprayer |
+---------+---------+---------+
|       9 |      50 | "Joe"   |
+---------+---------+---------+

While the sum of the actual values is correct (5 * 10 = 50), the sum of the target values is inflated because the dataset is denormalized: each target is counted once per joined row (3 * 1 + 2 * 3 = 9). I want "target" to be counted only once per id and sprayer, so I tried a window function:

SELECT SUM(target) OVER(PARTITION BY sprayer, id),
sprayer,
SUM(actual)
FROM spray_summary
GROUP BY sprayer, target, id

This gives me the results:

+--------+--------+---------+
| target | actual | sprayer |
+--------+--------+---------+
|      1 |     30 | "Joe"   |
|      3 |     20 | "Joe"   |
+--------+--------+---------+

Still not right! The right solution would give the following:

+--------+--------+---------+
| target | actual | sprayer |
+--------+--------+---------+
|      4 |     50 | "Joe"   |
+--------+--------+---------+
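
The 4 in the expected output is the sum of target over distinct (id, target, sprayer) combinations. A quick way to check which rows should contribute, as a sketch against the sample data above:

-- Each (id, target, sprayer) combination appears once here,
-- regardless of how many ref_id rows it was joined to.
SELECT DISTINCT id, target, sprayer
FROM spray_summary
ORDER BY id;

With the five sample rows this leaves one row for id "a" (target 1) and one for id "c" (target 3), which add up to 4.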

But no matter how much I tweak the window function, the result is split into multiple rows, because I have to GROUP BY target, which breaks up the rows. Any ideas? I know this can be rewritten by joining the table with itself, once for SUM target and once for SUM actual, but I don't have that option. I can modify the column definitions.

Thanks in advance.

EDIT: I know this can be solved by pushing some of the logic into a sub-query, but I'm looking for a column-level solution if that's possible. The SQL is auto-generated, so I don't have much control over the structure, but I can modify the column definitions, hence I'm looking for a column-level solution, like a window function.

Worst case scenario is Postgres can't solve this at the column level and I'll have to re-write the SQL generators.

SQL Fiddle

select sum(target) as target, sum(actual) as actual, sprayer
from (
    select
        target,
        sum(actual) as actual,
        sprayer
    from spray_summary
    group by id, target, sprayer
) s
group by sprayer
order by sprayer
;
 target | actual | sprayer 
--------+--------+---------
      4 |     50 | joe
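
If the outer query really cannot be restructured and only the column expressions can change, one possible column-level variant is a correlated scalar subquery in the select list. This is a sketch rather than a tested drop-in: it assumes the generated query selects from spray_summary without an alias and groups by sprayer alone, and the aliases s2 and d are made up for illustration.

SELECT
    -- Sum each (id, target) pair once for the current sprayer.
    -- spray_summary.sprayer refers to the outer query's grouped column,
    -- because the inner scan of the table is aliased as s2.
    (SELECT SUM(d.target)
     FROM (SELECT DISTINCT s2.id, s2.target
           FROM spray_summary s2
           WHERE s2.sprayer = spray_summary.sprayer) d) AS targets,
    SUM(actual) AS actuals,
    sprayer
FROM spray_summary
GROUP BY sprayer;

Against the sample data this should produce 4 and 50 for Joe. A window function cannot do this job inside the aggregate, because Postgres does not allow a window function call within an aggregate's arguments. The subquery rescans the table for each group, so the derived-table form above is likely the better choice whenever the query structure can be changed.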
