User retention at a week level
I have subscription data as shown below.
+------+-----------------+-----------+-----------+----------+--------+
| user | subscription_id | start | end | wk_start | wk_end |
+------+-----------------+-----------+-----------+----------+--------+
| 1 | 1A | 6/1/2019 | 6/30/2019 | 22 | 27 |
| 2 | 2A | 6/1/2019 | 6/21/2019 | 22 | 25 |
| 3 | 3A | 6/1/2019 | 6/21/2019 | 22 | 25 |
| 4 | 4A | 6/1/2019 | 6/15/2019 | 22 | 24 |
| | | | | | |
| 1 | 1B | 7/4/2019 | 8/4/2019 | 27 | 32 |
| 2 | 2B | 7/1/2019 | 7/31/2019 | 27 | 31 |
| 3 | 3B | 6/24/2019 | 7/24/2019 | 26 | 30 |
+------+-----------------+-----------+-----------+----------+--------+
The data shows when a user bought a subscription. It has user_id, subscription_id, start date and end date. I want to find out the user retention.
I want to see how many of the users who bought a subscription for the first time in a particular week are active in the following weeks.
They could be active on their current subscription or on a new subscription bought after the current one expires.
The desired output is as below.
+----------+-------------+----------------+--+-----------------------------------------------------------------------------+
| start_wk | Rolling_wk | Retained Users | | Active User(Not a part of desired output) |
+----------+-------------+----------------+--+-----------------------------------------------------------------------------+
| 22 | 22 | 4 | | 1,2,3,4 |
| 22 | 23 | 4 | | 1,2,3,4 |
| 22 | 24 | 4 | | 1,2,3,4 |
| 22 | 25 | 3 | | 1,2,3 |
| 22 | 26 | 2 | | 1,3(with subscription_id = 3B) |
| 22 | 27 | 3 | | 1,2,3(1 is counted only once. He was active with subscription_id 1A and 1B) |
| 22 | 28 | 3 | | 1,2,3 |
| 22 | 29 | 3 | | 1,2,3 |
| 22 | 30 | 3 | | 1,2,3 |
+----------+-------------+----------------+--+-----------------------------------------------------------------------------+
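For reference, the counts in the table above can be reproduced with a brute-force check over the sample rows. This is only a sketch of the definition as stated in the question (first subscription starts in start_wk, any subscription covers the rolling week, each user counted once); the function name `retained_users` is hypothetical.

```python
# Sample rows from the question: (user, subscription_id, wk_start, wk_end).
subscriptions = [
    (1, "1A", 22, 27), (2, "2A", 22, 25), (3, "3A", 22, 25), (4, "4A", 22, 24),
    (1, "1B", 27, 32), (2, "2B", 27, 31), (3, "3B", 26, 30),
]


def retained_users(rows, start_wk, rolling_wk):
    """Count users whose first subscription began in start_wk and who hold
    any subscription covering rolling_wk (each user is counted once)."""
    # Earliest wk_start per user identifies the week of the first purchase.
    first_week = {}
    for user, _sub_id, wk_start, _wk_end in rows:
        first_week[user] = min(wk_start, first_week.get(user, wk_start))
    cohort = {user for user, wk in first_week.items() if wk == start_wk}
    # A set removes duplicates when two subscriptions cover the same week.
    active = {
        user
        for user, _sub_id, wk_start, wk_end in rows
        if user in cohort and wk_start <= rolling_wk <= wk_end
    }
    return len(active)


counts = [retained_users(subscriptions, 22, wk) for wk in range(22, 31)]
print(counts)  # [4, 4, 4, 3, 2, 3, 3, 3, 3]
```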
Note that Active User is not a part of the desired output. It is only there to explain how the number in the Retained Users column is obtained.
I want the columns start_wk, Rolling_wk and Retained Users as output.
I will have a large amount of data like this for each week and want the output for each week in a similar fashion. In each case start_wk will change and rolling_wk will start from start_wk.
+----------+------------+----------------+
| start_wk | rolling_wk | Retained_users |
+----------+------------+----------------+
| 22 | 22 | 100 |
| 22 | 23 | 80 |
| 22 | 24 | 50 |
| 22 | …… | …… |
| 22 | ……. | ……. |
| 23 | 23 | 150 |
| 23 | 24 | 120 |
| 23 | 25 | 110 |
| 23 | 26 | 94 |
| 23 | …… | …… |
| 23 | ……. | ……. |
| 23 | ……. | ……. |
| 24 | 24 | 78 |
| 24 | 25 | 56 |
| 24 | 26 | 43 |
| 24 | ……. | ……. |
| 24 | ……. | ……. |
+----------+------------+----------------+
Any help will be appreciated.
Your query should look something like the one below. It builds, per user, the number of subscriptions and the latest wk_end across their ongoing and upcoming plans, then counts the users whose latest wk_end has not yet passed for each rolling week wk_start + rownum (22+1, 22+2, ... in Oracle; row_number() in SQL Server, I guess).
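-- Sketch only (Oracle syntax). A user counts as retained in a week if that week
-- is no later than the end of their latest subscription, so gaps between
-- subscriptions are ignored. "subscriptions" stands in for the table name, and
-- "user" is quoted because it is a reserved word in many dialects.
SELECT r.wk_start,
       r.rolling_wk,
       (SELECT COUNT(*)
          FROM (SELECT "user",
                       COUNT(*)    AS no_of_subscriptions,
                       MAX(wk_end) AS wk_max
                  FROM subscriptions
                 GROUP BY "user") u
         WHERE r.rolling_wk <= u.wk_max) AS retained_users
  FROM (SELECT wk_start,
               wk_start + ROWNUM AS rolling_wk
          FROM subscriptions) r;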
I will make a helper table weeks with a single column week holding the entries 1 to 56; you can use a loop as well. Basically, the weeks table represents all possible week numbers.
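One way to populate such a table without a loop is a recursive CTE. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the question does not name the engine). The table name weeks, the column week and the 1 to 56 range come from the answer.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weeks (week INTEGER PRIMARY KEY)")

# The recursive CTE generates the numbers 1..56 in a single statement.
conn.execute(
    """
    WITH RECURSIVE seq(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 56
    )
    INSERT INTO weeks (week)
    SELECT n FROM seq
    """
)

week_numbers = [row[0] for row in conn.execute("SELECT week FROM weeks ORDER BY week")]
print(week_numbers[0], week_numbers[-1], len(week_numbers))  # 1 56 56
```

Other engines have their own generators for this (or a plain loop of inserts works too), so treat the CTE as one option rather than the required approach.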
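select
    w1.week as start_wk,
    w2.week as rolling_wk,
    count(distinct s1.user) as Retained_Users
from
    weeks w1, weeks w2, subscriptions s1
where
    w1.week <= w2.week and
    -- s1 is the user's first subscription and it starts in the cohort week w1
    s1.wk_start = w1.week and
    s1.wk_start <= ALL(
        select s2.wk_start
        from subscriptions s2
        where s2.user = s1.user
    )
    and
    -- the user holds some subscription that covers the rolling week w2
    exists (
        select 1
        from subscriptions s3
        where s3.user = s1.user and
              s3.wk_start <= w2.week and
              w2.week <= s3.wk_end
    )
group by w1.week, w2.week
order by w1.week, w2.week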
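To try the helper-table approach on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite is an assumption, and it forces one change: SQLite has no `<= ALL (...)`, so the first-subscription test is written with `MIN` instead. The sketch also ties w1.week to each user's first wk_start and counts each user once with `COUNT(DISTINCT ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subscriptions (
        user INTEGER, subscription_id TEXT, wk_start INTEGER, wk_end INTEGER
    );
    -- Sample rows from the question (week numbers only).
    INSERT INTO subscriptions VALUES
        (1, '1A', 22, 27), (2, '2A', 22, 25), (3, '3A', 22, 25), (4, '4A', 22, 24),
        (1, '1B', 27, 32), (2, '2B', 27, 31), (3, '3B', 26, 30);
    CREATE TABLE weeks (week INTEGER PRIMARY KEY);
    """
)
conn.executemany("INSERT INTO weeks VALUES (?)", [(w,) for w in range(1, 57)])

query = """
SELECT w1.week AS start_wk,
       w2.week AS rolling_wk,
       COUNT(DISTINCT s1.user) AS retained_users
FROM weeks w1
JOIN weeks w2 ON w1.week <= w2.week
JOIN subscriptions s1 ON s1.wk_start = w1.week
-- s1 must be the user's first subscription (MIN replaces <= ALL for SQLite)
WHERE s1.wk_start = (SELECT MIN(s2.wk_start) FROM subscriptions s2
                     WHERE s2.user = s1.user)
  -- the user holds some subscription covering the rolling week
  AND EXISTS (SELECT 1 FROM subscriptions s3
              WHERE s3.user = s1.user
                AND s3.wk_start <= w2.week AND w2.week <= s3.wk_end)
GROUP BY w1.week, w2.week
ORDER BY w1.week, w2.week
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For start_wk 22 this returns 4, 4, 4, 3, 2, 3, 3, 3, 3 for weeks 22 to 30, matching the desired output, followed by 2 and 1 for weeks 31 and 32. Weeks with no retained users produce no row at all; if zero rows are wanted, the cohort would need to be left-joined against the activity check.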