SQL Retention Cohort Analysis
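I am trying to write a monthly retention query that calculates the percentage of users who come back in each month after their initial start month.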
TABLE: customer_order
fields
id
date
store_id
TABLE: customer
id
person_id
job_id
first_time (bool)
This gets me the initial monthly cohorts based on the first date:
SELECT first_job_month, COUNT(DISTINCT person_id) user_counts
FROM (
  SELECT DATE_TRUNC(MIN(CAST(date AS DATE)), month) first_job_month, person_id
  FROM customer_order cd
  INNER JOIN consumer co ON co.job_id = cd.id
  GROUP BY 2
  ORDER BY 1
) first_d
GROUP BY 1
ORDER BY 1
first_job_month user_counts
2018-04-01 36
2018-05-01 37
2018-06-01 39
2018-07-01 45
2018-08-01 38
I have tried a lot of things, but I can't figure out how to keep tracking the original cohort of users from their first month onward.
There are some alternatives, such as using window functions to do (1) and (2) in the same subquery, but the simplest option is this one:
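WITH
cohorts AS (
  -- (1) each person's cohort = the month of their first order
  SELECT person_id, DATE_TRUNC(MIN(CAST(date AS DATE)), MONTH) AS first_job_month
  FROM customer_order cd
  JOIN consumer co  -- listed as `customer` in the question's schema
  ON co.job_id = cd.id
  GROUP BY 1
)
,orders AS (
  -- (2) every order, tagged with the whole calendar months elapsed since the cohort month
  --     (BigQuery-style DATE_DIFF assumed, matching the DATE_TRUNC(..., MONTH) syntax above)
  SELECT
  c.person_id
  ,c.first_job_month
  ,DATE_DIFF(CAST(cd.date AS DATE), c.first_job_month, MONTH) AS months_since_first_order
  FROM cohorts c
  JOIN consumer co
  ON co.person_id = c.person_id  -- customer_order has no person_id, so join through consumer
  JOIN customer_order cd
  ON co.job_id = cd.id
)
-- (3) conditional aggregation per cohort
SELECT
first_job_month AS cohort
,COUNT(DISTINCT person_id) AS size
,COUNT(DISTINCT CASE WHEN months_since_first_order >= 1 THEN person_id END) AS m1
,COUNT(DISTINCT CASE WHEN months_since_first_order >= 2 THEN person_id END) AS m2
,COUNT(DISTINCT CASE WHEN months_since_first_order >= 3 THEN person_id END) AS m3
-- hardcode up to the number of months you want and the history you have
FROM orders
GROUP BY 1
ORDER BY 1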
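As you can see, you can use a CASE expression inside an aggregate function (such as COUNT) to pick out different subsets of rows to aggregate within the same group. This is one of the most important BI techniques in SQL.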
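To see the conditional aggregation in isolation, here is a minimal sketch that runs without any tables. It assumes BigQuery Standard SQL, and the inline rows are made-up sample data standing in for the `orders` CTE, not data from the question.

```sql
-- Hypothetical inline rows standing in for the `orders` CTE.
WITH orders AS (
  SELECT * FROM UNNEST([
    STRUCT(DATE '2018-04-01' AS first_job_month, 1 AS person_id, 0 AS months_since_first_order),
    (DATE '2018-04-01', 1, 1),  -- person 1 comes back one month later
    (DATE '2018-04-01', 2, 0),  -- person 2 never comes back
    (DATE '2018-05-01', 3, 0),
    (DATE '2018-05-01', 3, 2)   -- person 3 comes back two months later
  ])
)
SELECT
  first_job_month AS cohort,
  COUNT(DISTINCT person_id) AS size,
  -- CASE returns NULL for rows that fail the condition, and COUNT skips NULLs,
  -- so only people with a qualifying order are counted.
  COUNT(DISTINCT CASE WHEN months_since_first_order >= 1 THEN person_id END) AS m1
FROM orders
GROUP BY cohort
ORDER BY cohort
```

With these sample rows, the April cohort has size 2 and m1 = 1, and the May cohort has size 1 and m1 = 1.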
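Note that >=, not =, is used in the conditional aggregation, so that if, for example, a customer buys in m3 after m1 but does not buy in m2, they are still counted in m2. If you want to require a purchase in each specific month and/or see the actual retention for each month, and you are fine with later months possibly showing higher values than earlier ones, you can use = instead.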
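The difference between the two operators can be checked on a single made-up customer who orders in months 0, 1, and 3 but skips month 2. This is a sketch in BigQuery Standard SQL (an assumption); the column names `m2_cumulative` and `m2_exact` are illustrative only.

```sql
-- One hypothetical customer: active in months 0, 1 and 3, inactive in month 2.
WITH orders AS (
  SELECT * FROM UNNEST([
    STRUCT(1 AS person_id, 0 AS months_since_first_order),
    (1, 1),
    (1, 3)
  ])
)
SELECT
  -- ">=": the month-3 order satisfies the condition, so the customer counts for m2
  COUNT(DISTINCT CASE WHEN months_since_first_order >= 2 THEN person_id END) AS m2_cumulative,
  -- "=": only an order placed exactly in month 2 counts
  COUNT(DISTINCT CASE WHEN months_since_first_order = 2 THEN person_id END) AS m2_exact
FROM orders
```

Here `m2_cumulative` is 1 and `m2_exact` is 0, which is why the >= columns can never increase from one month to the next, while the = columns can.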
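Also, if you don't want a "triangle" view like the one this query produces, or you don't want to hardcode the "mX" columns, you can simply group by first_job_month and months_since_first_order and count distinct users. Some visualization tools can take this simple format and build the triangle view from it.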
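A sketch of that long format follows, again assuming BigQuery Standard SQL and made-up inline rows in place of the `orders` CTE. The `pct_of_cohort` column is an optional addition, not part of the answer above, aimed at the percentage the question asks for. It relies on month 0 containing every member of the cohort, which holds because the cohort month is defined by each person's first order.

```sql
-- Hypothetical inline rows standing in for the `orders` CTE.
WITH orders AS (
  SELECT * FROM UNNEST([
    STRUCT(DATE '2018-04-01' AS first_job_month, 1 AS person_id, 0 AS months_since_first_order),
    (DATE '2018-04-01', 1, 1),
    (DATE '2018-04-01', 2, 0),
    (DATE '2018-05-01', 3, 0),
    (DATE '2018-05-01', 3, 2)
  ])
),
activity AS (
  -- One row per cohort and month offset: no hardcoded mX columns.
  SELECT
    first_job_month,
    months_since_first_order,
    COUNT(DISTINCT person_id) AS active_users
  FROM orders
  GROUP BY first_job_month, months_since_first_order
)
SELECT
  first_job_month AS cohort,
  months_since_first_order,
  active_users,
  -- Month 0 is the first row of each cohort and equals the cohort size.
  ROUND(100 * SAFE_DIVIDE(
    active_users,
    FIRST_VALUE(active_users) OVER (
      PARTITION BY first_job_month
      ORDER BY months_since_first_order
    )
  ), 1) AS pct_of_cohort
FROM activity
ORDER BY cohort, months_since_first_order
```

Because each row counts users active in exactly that month, this corresponds to the = variant rather than the >= variant.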