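postgresql - Top 5 values in magnitude for multiple columns, alongside sum of values
I'm looking for advice on the best (most efficient) way to calculate the top 5 values in magnitude for multiple columns, where I also need to calculate the sum of another column.
Say I have data with the headers (person, daydate, month, dailyqty, dailymax1, dailymax2), where for each day and each person I have the total quantity, the max quantity (measure 1) and the max quantity (measure 2).
What I want to do is calculate, for each person and each month: (1) the sum of dailyqty, (2) the top 5 values of dailymax1, and (3) the top 5 values of dailymax2. There may not even be 5 daily values in a month, in which case I would like null to be returned.
I can't think of a way to do this without joins, as I'm an SQL novice. I know that the sum of dailyqty will be repeated for each of the top 5 values in the group; that's fine.
Some dummy data: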
CREATE TABLE test (
    person    varchar(50),
    daydate   date,
    month     integer,
    dailyqty  double precision,
    dailymax1 double precision,
    dailymax2 double precision
);
INSERT INTO test(person, daydate, month, dailyqty, dailymax1, dailymax2)
VALUES
    ('A', '2015-01-01', 1, 5, 0.5, 4),
    ('A', '2015-01-02', 1, 8, 3, 4),
    ('A', '2015-01-03', 1, 7, 1, 3),
    ('A', '2015-01-04', 1, 1, 2, 2),
    ('A', '2015-01-05', 1, 9, 6, 8),
    ('A', '2015-01-06', 1, 7, 2.5, 7),
    ('A', '2015-01-07', 1, 2, 4, 7),
    ('A', '2015-01-08', 1, 5, 1, 3),
    ('B', '2015-01-01', 1, 20, 8, 1),
    ('B', '2015-01-02', 1, 22, 9, 2);
Desired result
Thanks! A
This query reproduces the desired result posted in the question:
SELECT xt1.person, xt1.month, xt1.monthlyqty, xt3.max1, xt4.max2
FROM (
    -- one row per person and month with the monthly total
    SELECT SUM(COALESCE(t.dailyqty, 0)) as monthlyqty, t.person, t.month
    FROM test t
    GROUP by t.person, t.month
) xt1
CROSS JOIN (
    -- the five rank slots that every person/month must show
    VALUES (1), (2), (3), (4), (5)
) xt2
LEFT OUTER JOIN (
    -- dailymax1 values ranked from largest to smallest
    SELECT t.person, t.month, t.dailymax1 as max1
         , ROW_NUMBER() OVER (PARTITION BY t.person, t.month ORDER BY t.dailymax1 DESC NULLS LAST) as colnumber
    FROM test t
) xt3 ON xt2.column1 = xt3.colnumber AND xt1.person = xt3.person AND xt1.month = xt3.month
LEFT OUTER JOIN (
    -- dailymax2 values ranked from largest to smallest
    SELECT t.person, t.month, t.dailymax2 as max2
         , ROW_NUMBER() OVER (PARTITION BY t.person, t.month ORDER BY t.dailymax2 DESC NULLS LAST) as colnumber
    FROM test t
) xt4 ON xt2.column1 = xt4.colnumber AND xt1.person = xt4.person AND xt1.month = xt4.month;
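A few things to consider that might change the query... First, you could check whether the columns dailyqty, dailymax1 and dailymax2 can really be null (as your table definition allows). If they cannot, you could simplify COALESCE(t.dailyqty, 0) to t.dailyqty, and both occurrences of DESC NULLS LAST to DESC.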
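Second, you could consider replacing the CROSS JOIN to xt2 with a join to a call to generate_series, such as: CROSS JOIN generate_series (1, 5) xt2, and then replacing each occurrence of xt2.column1 with just xt2. I'm not sure which approach is more efficient; perhaps both do something similar, but it is worth checking them against your actual data in case there is a significant difference.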
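For reference, here is a rough sketch of how the generate_series variant might look. The column alias n is my own addition (it is not in the query above) so that the series value can be referenced unambiguously as xt2.n, and the final ORDER BY is only there to make the output easier to read.

```sql
SELECT xt1.person, xt1.month, xt1.monthlyqty, xt3.max1, xt4.max2
FROM (
    -- one row per person and month with the monthly total
    SELECT SUM(COALESCE(t.dailyqty, 0)) AS monthlyqty, t.person, t.month
    FROM test t
    GROUP BY t.person, t.month
) xt1
-- rank slots 1..5; "n" is an illustrative column alias
CROSS JOIN generate_series(1, 5) AS xt2(n)
LEFT OUTER JOIN (
    SELECT t.person, t.month, t.dailymax1 AS max1
         , ROW_NUMBER() OVER (PARTITION BY t.person, t.month ORDER BY t.dailymax1 DESC NULLS LAST) AS colnumber
    FROM test t
) xt3 ON xt2.n = xt3.colnumber AND xt1.person = xt3.person AND xt1.month = xt3.month
LEFT OUTER JOIN (
    SELECT t.person, t.month, t.dailymax2 AS max2
         , ROW_NUMBER() OVER (PARTITION BY t.person, t.month ORDER BY t.dailymax2 DESC NULLS LAST) AS colnumber
    FROM test t
) xt4 ON xt2.n = xt4.colnumber AND xt1.person = xt4.person AND xt1.month = xt4.month
ORDER BY xt1.person, xt1.month, xt2.n;
```

Prefixing both versions with EXPLAIN ANALYZE on your real table should show whether the plans differ in any way that matters.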
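Finally, you said you want to calculate per person and per month, but "month" could refer either to the month column or to the month of the daydate column. I chose the first option because it was easier to write :), but with a few modifications the query can be adapted to the other column.
This query gives the output you need: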
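As a hedged illustration of the second option, the monthly total could be grouped on the month derived from daydate instead of the month column. The alias month_start is made up for this sketch, and I am assuming that truncating to the first day of the month is what you want (it also keeps the same month in different years apart).

```sql
-- Monthly total per person, with the month taken from daydate
SELECT t.person,
       date_trunc('month', t.daydate)::date AS month_start,  -- illustrative alias
       SUM(COALESCE(t.dailyqty, 0)) AS monthlyqty
FROM test t
GROUP BY t.person, date_trunc('month', t.daydate)::date
ORDER BY t.person, month_start;
```

The same expression would then have to replace t.month in the PARTITION BY clauses and in the join conditions of the ranking subqueries.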
WITH FilledData AS
(
    WITH Filler AS
    (
        WITH t1 AS
        (
            SELECT DISTINCT person, month FROM test
        ),
        t2 AS
        (
            SELECT generate_series as order FROM generate_series(1, 4)
        )
        -- four all-NULL filler rows per person and month, so every group ends up with at least five rows
        SELECT t1.person, t1.month,
               CAST(NULL AS double precision) AS dailyqty,
               CAST(NULL AS double precision) AS dailymax1,
               CAST(NULL AS double precision) AS dailymax2
        FROM t1 CROSS JOIN t2
    )
    SELECT person, month, dailyqty, dailymax1, dailymax2 FROM test UNION ALL
    SELECT person, month, dailyqty, dailymax1, dailymax2 FROM Filler ORDER BY person, month
),
monthlyqty AS
(
    SELECT person, month, SUM(dailyqty) AS monthlyqty FROM test GROUP BY person, month
),
dailymax1_table AS
(
    -- top five dailymax1 values per person and month (fillers sort last)
    SELECT person, month, dailymax1, dailymax1_order
    FROM (
        SELECT *, row_number() over (partition by person, month order by dailymax1 desc NULLS LAST) as dailymax1_order
        FROM FilledData
    ) t1 WHERE dailymax1_order <= 5
),
dailymax2_table AS
(
    -- top five dailymax2 values per person and month (fillers sort last)
    SELECT person, month, dailymax2, dailymax2_order
    FROM (
        SELECT *, row_number() over (partition by person, month order by dailymax2 desc NULLS LAST) as dailymax2_order
        FROM FilledData
    ) t2 WHERE dailymax2_order <= 5
)
SELECT dailymax1_table.person, dailymax1_table.month, monthlyqty.monthlyqty, dailymax1_table.dailymax1 as max1, dailymax2_table.dailymax2 as max2
FROM dailymax1_table JOIN monthlyqty
    ON monthlyqty.person = dailymax1_table.person AND
       monthlyqty.month = dailymax1_table.month
JOIN dailymax2_table ON
    dailymax1_table.person = dailymax2_table.person AND
    dailymax1_table.month = dailymax2_table.month AND
    dailymax1_table.dailymax1_order = dailymax2_table.dailymax2_order;
You can use window functions to combine them:
select . . .
from (select t.*,
             sum(dailyqty) over (partition by person, date_trunc('month', datecol)) as monthqty,
             row_number() over (partition by person, date_trunc('month', datecol) order by dailyqty desc) as seqnum
      from t
     ) t
where seqnum <= 5;
You can pull the columns you need out of the subquery.
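Applied to the test table from the question, a concrete version might look like the sketch below. It assumes the existing month column as the grouping key and ranks by dailymax1 only; the subquery alias s is made up. Note that, unlike the join-based queries, it returns fewer than five rows for a person and month with fewer than five days of data, rather than padding with nulls.

```sql
SELECT s.person, s.month, s.monthqty, s.dailymax1 AS max1, s.seqnum
FROM (
    SELECT t.*,
           -- monthly total repeated on every row of the group
           SUM(t.dailyqty) OVER (PARTITION BY t.person, t.month) AS monthqty,
           -- rank of each day's dailymax1 within the group, largest first
           ROW_NUMBER() OVER (PARTITION BY t.person, t.month
                              ORDER BY t.dailymax1 DESC NULLS LAST) AS seqnum
    FROM test t
) s
WHERE s.seqnum <= 5
ORDER BY s.person, s.month, s.seqnum;
```

Getting the top five of dailymax2 alongside would need a second ROW_NUMBER and a join on the rank, much as in the earlier answers.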