Postgresql - Query the List of customers who have bought grocery at least once every month for 3 continuous months
I have the following data:
purchase_date             | customer_id
2015-05-25 03:24:09+05:30 | 15
2015-06-21 06:33:35+05:30 | 15
2015-07-02 02:03:32+05:30 | 17
2015-07-25 10:20:31+05:30 | 15
2015-07-25 10:20:31+05:30 | 12
2015-07-26 01:20:31+05:30 | 17
2015-08-26 03:24:09+05:30 | 17
2015-08-21 03:21:21+05:30 | 14
I want to get the list of customer_ids that have at least one entry in each of 3 consecutive months.
I am using PostgreSQL 10.14.
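To try the queries on this page against the sample rows, a setup script along these lines may help. It is only a sketch: the column types (timestamptz, integer) are assumptions, since the question does not state them, and the view t is a hypothetical convenience so that queries written against either mytable or t can run.

```sql
-- Assumed schema: the question does not give the column types.
CREATE TABLE mytable (
    purchase_date timestamptz NOT NULL,
    customer_id   integer     NOT NULL
);

-- The eight sample rows from the question.
INSERT INTO mytable (purchase_date, customer_id) VALUES
    ('2015-05-25 03:24:09+05:30', 15),
    ('2015-06-21 06:33:35+05:30', 15),
    ('2015-07-02 02:03:32+05:30', 17),
    ('2015-07-25 10:20:31+05:30', 15),
    ('2015-07-25 10:20:31+05:30', 12),
    ('2015-07-26 01:20:31+05:30', 17),
    ('2015-08-26 03:24:09+05:30', 17),
    ('2015-08-21 03:21:21+05:30', 14);

-- Hypothetical alias: some answers query a table named t.
CREATE VIEW t AS
SELECT purchase_date, customer_id
FROM mytable;
```

With this data, customer 15 is the only one with entries in three consecutive months (May, June, and July 2015), so that is the only customer_id a correct query should return.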
SELECT
    customer_id
FROM (
    SELECT
        *,
        age( -- 5
            month,
            -- 4
            first_value(month) OVER (PARTITION BY customer_id ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
        ) AS months,
        -- 3
        COUNT(*) OVER (PARTITION BY customer_id ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
    FROM (
        SELECT DISTINCT -- 2
            date_trunc('month', purchase_date) AS month, -- 1
            customer_id
        FROM mytable
    ) s
) s
WHERE months = interval '2 months' AND count = 3 -- 6
Examples for every step can be seen in the fiddle linked above!
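1. date_trunc('month', ...) normalizes each date to the first day of its month, so all dates within one month fall into the same group regardless of the actual day.
2. DISTINCT eliminates duplicate records, so a customer with two purchases in the same month is counted only once for that month.
3. COUNT() as a window function considers only the current record and the two preceding rows of the group. The first record of a customer_id group returns 1, and the third (if there is one) returns 3.
4. COUNT() alone is not enough: a customer could have purchases in months 1, 3, and 5, which are clearly not consecutive. So we also need the first month of the current rolling window, which first_value() provides.
5. age() gives the distance between the current month and that first month of the window.
6. If the first month of the window is exactly two months before the current one and the window holds three rows, we can be sure the window contains 3 consecutive months: two months ago, one month ago, and the current month.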
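The intermediate values described in the steps can be inspected with a query like the following. It is a sketch that reuses the same subquery but exposes every computed column instead of filtering; the named window w and the column name window_start are introduced here only for readability and are not part of the original answer.

```sql
SELECT
    customer_id,
    month,
    first_value(month) OVER w             AS window_start, -- step 4
    age(month, first_value(month) OVER w) AS months,       -- step 5
    COUNT(*) OVER w                       AS count         -- step 3
FROM (
    SELECT DISTINCT                                        -- step 2
        date_trunc('month', purchase_date) AS month,       -- step 1
        customer_id
    FROM mytable
) s
WINDOW w AS (
    PARTITION BY customer_id
    ORDER BY month
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
ORDER BY customer_id, month;
```

On the sample data from the question, the July row of customer 15 is the only row where months is 2 months and count is 3, which is exactly what the filter in step 6 selects.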
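If you don't mind upgrading to at least Postgres 11, you can drop the first_value() and age() handling of steps 4 and 5 and the interval check in step 6, because from that version on a window frame can be defined by a date range (RANGE) instead of a number of rows (ROWS). So we no longer need to handle the date range separately; only the count = 3 filter remains.
The result looks much tidier, doesn't it?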
SELECT
    customer_id
FROM (
    SELECT
        *,
        COUNT(*) OVER (PARTITION BY customer_id ORDER BY month RANGE BETWEEN interval '2 months' PRECEDING AND CURRENT ROW)
    FROM (
        SELECT DISTINCT
            date_trunc('month', purchase_date) AS month,
            customer_id
        FROM mytable
    ) s
) s
WHERE count = 3
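To see what the RANGE frame changes compared with ROWS, the two counts can be put side by side on a small made-up data set. This is a sketch for PostgreSQL 11 or later; the CTE sample_months and its rows are hypothetical and stand in for the already de-duplicated months. Customer 1 has months 1, 3, and 5; customer 2 has months 1, 2, and 3.

```sql
WITH sample_months (customer_id, month_start) AS (
    VALUES
        (1, timestamp '2015-01-01'),
        (1, '2015-03-01'),
        (1, '2015-05-01'),
        (2, '2015-01-01'),
        (2, '2015-02-01'),
        (2, '2015-03-01')
)
SELECT
    customer_id,
    month_start,
    -- counts the current row and up to two preceding rows, gaps or not
    COUNT(*) OVER (PARTITION BY customer_id ORDER BY month_start
                   ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rows_count,
    -- counts only rows whose month lies within the last two months
    COUNT(*) OVER (PARTITION BY customer_id ORDER BY month_start
                   RANGE BETWEEN interval '2 months' PRECEDING AND CURRENT ROW) AS range_count
FROM sample_months
ORDER BY customer_id, month_start;
```

For customer 1 the ROWS count reaches 3 on the May row even though the months are not consecutive, while the RANGE count never exceeds 2. For customer 2 both counts reach 3 on the March row. That is why the RANGE version can filter on the count alone.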
You can use GROUP BY to reduce the data to one row per customer and month. Then just use lag(), once:
select distinct customer_id
from (select customer_id, date_trunc('month', purchase_date) as yyyymm,
             lag(date_trunc('month', purchase_date), 2) over (partition by customer_id order by min(purchase_date)) as prev2_yyyymm
      from t
      group by customer_id, date_trunc('month', purchase_date)
     ) t
where prev2_yyyymm = yyyymm - interval '2 month';
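Why a single lag(..., 2) is enough may be easier to see on a tiny example. The sketch below uses a hypothetical CTE sample_months with already de-duplicated months: because the months of a customer are distinct and ordered, the row two positions back can only be exactly two months earlier if the month in between is present as well. The column name ends_3_month_run is made up for this illustration.

```sql
WITH sample_months (customer_id, yyyymm) AS (
    VALUES
        (1, timestamp '2015-01-01'),
        (1, '2015-03-01'),
        (1, '2015-05-01'),
        (2, '2015-01-01'),
        (2, '2015-02-01'),
        (2, '2015-03-01')
)
SELECT
    customer_id,
    yyyymm,
    -- the month two rows earlier for the same customer, NULL if there is none
    lag(yyyymm, 2) OVER (PARTITION BY customer_id ORDER BY yyyymm) AS prev2_yyyymm,
    -- true only when that month is exactly two months before the current one
    lag(yyyymm, 2) OVER (PARTITION BY customer_id ORDER BY yyyymm)
        = yyyymm - interval '2 month' AS ends_3_month_run
FROM sample_months
ORDER BY customer_id, yyyymm;
```

Only the March row of customer 2 yields true. The May row of customer 1 yields false, because two rows back is January, not March, and the remaining rows yield NULL.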
Let's reduce the data to months and customers:
WITH mc AS (
    SELECT DISTINCT date_trunc('month', purchase_date) mo, customer_id
    FROM t
)
Then add a CTE that uses analytic functions to get the previous and next month:
WITH mc AS (
    SELECT DISTINCT date_trunc('month', purchase_date) mo, customer_id
    FROM t
), mcpn AS (
    SELECT
        -- ORDER BY mo makes "previous" and "next" well defined
        LAG(mo) OVER (PARTITION BY customer_id ORDER BY mo) AS prevmo,
        LEAD(mo) OVER (PARTITION BY customer_id ORDER BY mo) AS nextmo,
        mc.*
    FROM mc
)
Then select the rows where prevmo, mo, and nextmo are one month apart:
WITH mc AS (
    SELECT DISTINCT date_trunc('month', purchase_date) mo, customer_id
    FROM t
), mcpn AS (
    SELECT
        -- ORDER BY mo makes "previous" and "next" well defined
        LAG(mo) OVER (PARTITION BY customer_id ORDER BY mo) AS prevmo,
        LEAD(mo) OVER (PARTITION BY customer_id ORDER BY mo) AS nextmo,
        mc.*
    FROM mc
)
SELECT customer_id FROM mcpn
-- age(a, b) returns a - b, so the later month goes first
WHERE AGE(mo, prevmo) = interval '1 month' AND AGE(nextmo, mo) = interval '1 month'