In Postgres, how do I write a SQL query to select distinct values overall but aggregated over a set time period?
What I mean by this is: if I have a table called payments with a created_at column and a user_id column, I want to select the count of purchases aggregated weekly (it can be any interval I want), but counting only first-time purchases. For example, if a user purchased for the first time in week 1, they would be counted, but if they purchased again in week 2, they would not be counted.
created_at | user_id
---|---
timestamp | 1
timestamp | 1
This is the query I came up with. The issue is that if a user purchases multiple times, all of those purchases are included. How can I improve this?
WITH dates AS
(
SELECT *
FROM generate_series(
'2022-07-22T15:30:06.687Z'::DATE,
'2022-11-21T17:04:59.457Z'::DATE,
'1 week'
) date
)
SELECT
dates.date::DATE AS date,
COALESCE(COUNT(DISTINCT(user_id)), 0) AS registrations
FROM
dates
LEFT JOIN
payment ON created_at::DATE BETWEEN dates.date AND dates.date::date + '1 ${dateUnit}'::INTERVAL
GROUP BY
dates.date
ORDER BY
dates.date DESC;
You want to count only first purchases. So get those first purchases in a first step and work with them.
WITH dates AS
(
SELECT *
FROM generate_series(
'2022-07-22T15:30:06.687Z'::DATE,
'2022-11-21T17:04:59.457Z'::DATE,
'1 week'
) date
)
, first_purchases AS
(
SELECT user_id, MIN(created_at::DATE) AS purchase_date
FROM payment
GROUP BY user_id
)
SELECT
d.date,
COALESCE(COUNT(p.purchase_date), 0) AS registrations
FROM
dates d
LEFT JOIN
first_purchases p ON p.purchase_date >= d.date
AND p.purchase_date < d.date + '1 ${dateUnit}'::INTERVAL
GROUP BY
d.date
ORDER BY
d.date DESC;
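To check the idea without touching real data, the same approach can be run against a few inline rows. This is only a sketch: the sample rows are made up, the interval is fixed at one week in place of the `${dateUnit}` placeholder, the series column is renamed week_start for readability, and the series bounds are written as timestamps so that both sides of the join compare the same type.

```sql
-- Hypothetical sample rows standing in for the real payment table.
WITH payment (user_id, created_at) AS
(
  VALUES
    (1, TIMESTAMP '2022-07-23 10:00:00'),  -- user 1: first purchase
    (1, TIMESTAMP '2022-08-01 09:30:00'),  -- user 1: repeat purchase, must not be counted
    (2, TIMESTAMP '2022-07-30 12:00:00'),  -- user 2: first purchase
    (3, TIMESTAMP '2022-07-31 18:45:00')   -- user 3: first purchase
)
, dates AS
(
  -- One row per week start: 2022-07-22, 2022-07-29, 2022-08-05.
  SELECT week_start
  FROM generate_series(
    TIMESTAMP '2022-07-22',
    TIMESTAMP '2022-08-05',
    INTERVAL '1 week'
  ) AS week_start
)
, first_purchases AS
(
  -- Exactly one row per user: the earliest purchase.
  SELECT user_id, MIN(created_at) AS purchase_date
  FROM payment
  GROUP BY user_id
)
SELECT
  d.week_start::DATE AS week_start,
  COUNT(p.user_id) AS registrations  -- counts only matched rows, so empty weeks give 0
FROM
  dates d
LEFT JOIN
  first_purchases p ON p.purchase_date >= d.week_start
                   AND p.purchase_date < d.week_start + INTERVAL '1 week'
GROUP BY
  d.week_start
ORDER BY
  d.week_start;
-- Expected result with the sample rows above:
--   2022-07-22 | 1   (user 1)
--   2022-07-29 | 2   (users 2 and 3; user 1's repeat purchase is ignored)
--   2022-08-05 | 0
```

Because first_purchases holds one row per user, each user can match at most one week, which is what keeps repeat purchases out of the counts. The half-open range (>= start, < start + interval) also avoids counting a purchase on a boundary date in two adjacent weeks.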