How to calculate retention month over month using SQL

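I'm trying to get a basic table that shows retention from one month to the next. If someone bought something last month and buys again the following month, that counts as a repeat.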

month, num_transactions, repeat_transactions, retention
2012-02, 5, 2, 40%
2012-03, 10, 3, 30%
2012-04, 15, 8, 53%

So if everyone who bought last month buys again the next month, you have 100% retention.

So far I can only calculate this manually. This gives me the buyers who were seen in both months:

select count(*) as num_repeat_buyers from 

(select distinct
  to_char(transaction.timestamp, 'YYYY-MM') as month,
  auth_user.email
from
  auth_user,
  transaction
where
  auth_user.id = transaction.buyer_id and
  to_char(transaction.timestamp, 'YYYY-MM') = '2012-03'
) as table1,


(select distinct
  to_char(transaction.timestamp, 'YYYY-MM') as month,
  auth_user.email
from
  auth_user,
  transaction
where
  auth_user.id = transaction.buyer_id and
  to_char(transaction.timestamp, 'YYYY-MM') = '2012-04'
) as table2
where table1.email = table2.email

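The following isn't right, but I feel like I can use some of Postgres' window functions. Keep in mind that window functions don't let you specify WHERE clauses. You mostly have access to the preceding and following rows: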

select month, count(*) as num_transactions, count(*) over (PARTITION BY month ORDER BY month)
from 
    (select distinct
      to_char(transaction.timestamp, 'YYYY-MM') as month,
      auth_user.email
    from
      auth_user,
      transaction
    where
      auth_user.id = transaction.buyer_id
    order by
      month
    ) as transactions_by_month
group by
    month

Given the following test table (which you should have provided):

CREATE TEMP TABLE transaction (buyer_id int, tstamp timestamp);
INSERT INTO transaction VALUES 
 (1,'2012-01-03 20:00')
,(1,'2012-01-05 20:00')
,(1,'2012-01-07 20:00')  -- multiple transactions this month
,(1,'2012-02-03 20:00')  -- next month
,(1,'2012-03-05 20:00')  -- next month
,(2,'2012-01-07 20:00')
,(2,'2012-03-07 20:00')  -- not next month
,(3,'2012-01-07 20:00')  -- just once
,(4,'2012-02-07 20:00'); -- just once

auth_user is not relevant to the problem.
I use tstamp as the column name because I don't use base type names as identifiers.

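I would use the window function lag() to identify repeat buyers. To keep it short, I combine aggregate and window functions in a single query level. Bear in mind that window functions are applied after aggregate functions.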

WITH t AS (
   SELECT buyer_id
         ,date_trunc('month', tstamp) AS month
         ,count(*) AS item_transactions
         ,lag(date_trunc('month', tstamp)) OVER (PARTITION BY  buyer_id
                                           ORDER BY date_trunc('month', tstamp)) 
          = date_trunc('month', tstamp) - interval '1 month'
            OR NULL AS repeat_transaction
   FROM   transaction
   WHERE  tstamp >= '2012-01-01'::date
   AND    tstamp <  '2012-05-01'::date -- time range of interest.
   GROUP  BY 1, 2
   )
SELECT to_char(month, 'YYYY-MM') AS month  -- format as shown in the result
      ,sum(item_transactions) AS num_trans
      ,count(*) AS num_buyers
      ,count(repeat_transaction) AS repeat_buyers
      ,round(
          CASE WHEN sum(item_transactions) > 0
             THEN count(repeat_transaction) / sum(item_transactions) * 100
             ELSE 0
          END, 2) AS buyer_retention_pct
FROM   t
GROUP  BY 1
ORDER  BY 1;

Result:

  month  | num_trans | num_buyers | repeat_buyers | buyer_retention_pct
---------+-----------+------------+---------------+--------------------
 2012-01 |         5 |          3 |             0 |               0.00
 2012-02 |         2 |          2 |             1 |              50.00
 2012-03 |         2 |          2 |             1 |              50.00

I extended your question to distinguish between the number of transactions and the number of buyers.
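
As a cross-check of the result table, the same counts can be reproduced outside the database. The following is a minimal Python sketch, not part of the original answer; the helper names `month_index` and `retention_by_month` are made up for illustration. It treats a buyer as a repeat buyer in a month if the same buyer also has at least one transaction in the month directly before it, which is what the lag() comparison tests for.

```python
from collections import Counter
from datetime import date

# (buyer_id, transaction date) pairs mirroring the test table above
transactions = [
    (1, date(2012, 1, 3)), (1, date(2012, 1, 5)), (1, date(2012, 1, 7)),
    (1, date(2012, 2, 3)), (1, date(2012, 3, 5)),
    (2, date(2012, 1, 7)), (2, date(2012, 3, 7)),
    (3, date(2012, 1, 7)),
    (4, date(2012, 2, 7)),
]


def month_index(d):
    """Map a date to a running month number so the previous month is index - 1."""
    return d.year * 12 + (d.month - 1)


# Transactions per (buyer, month): the equivalent of GROUP BY buyer_id, month
per_buyer_month = Counter((b, month_index(d)) for b, d in transactions)


def retention_by_month(counts):
    """Aggregate per-buyer rows into per-month totals (hypothetical helper)."""
    result = {}
    for (buyer, m), n in counts.items():
        row = result.setdefault(
            m, {"num_trans": 0, "num_buyers": 0, "repeat_buyers": 0}
        )
        row["num_trans"] += n
        row["num_buyers"] += 1
        # Repeat buyer: the same buyer was also active in the previous month
        if (buyer, m - 1) in counts:
            row["repeat_buyers"] += 1
    return result


summary = retention_by_month(per_buyer_month)
for m in sorted(summary):
    row = summary[m]
    label = f"{m // 12}-{m % 12 + 1:02d}"
    print(label, row["num_trans"], row["num_buyers"], row["repeat_buyers"])
```

For the sample data this yields 5 / 3 / 0 for 2012-01 and 2 / 2 / 1 for both 2012-02 and 2012-03, matching the first three numeric columns of the result.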

In the expression for repeat_transaction, OR NULL converts FALSE to NULL, so those rows are not counted by count() in the next step.
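
The effect of OR NULL on count() can be seen in isolation. This is a small sketch that runs on SQLite through Python's `sqlite3` module rather than on PostgreSQL, under the assumption that both follow the same three-valued logic here; the `flags` table is invented for the demonstration, and SQLite represents booleans as the integers 1 and 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical demo table: 1 = TRUE, 0 = FALSE, NULL = unknown
conn.execute("CREATE TABLE flags (flag INTEGER)")
conn.executemany(
    "INSERT INTO flags VALUES (?)", [(1,), (0,), (None,), (1,), (0,)]
)

# count(flag) counts every non-NULL value, FALSE included.
# flag OR NULL keeps TRUE as TRUE and turns FALSE into NULL,
# so count(flag OR NULL) counts only the TRUE rows.
plain, with_or_null = conn.execute(
    "SELECT count(flag), count(flag OR NULL) FROM flags"
).fetchone()

print(plain, with_or_null)
```

Here count(flag) returns 4 while count(flag OR NULL) returns 2, one per TRUE row.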


This uses CASE and EXISTS to get the repeat transactions:

SELECT
    *,
    CASE
        WHEN num_transactions = 0
        THEN 0
        ELSE round(100.0 * repeat_transactions / num_transactions, 2)
    END AS retention
FROM
    (
        SELECT
            to_char(timestamp, 'YYYY-MM') AS month,
            count(*) AS num_transactions,
            sum(CASE
                WHEN EXISTS (
                    SELECT 1
                    FROM transaction AS t
                    JOIN auth_user AS u
                    ON t.buyer_id = u.id
                    WHERE
                        date_trunc('month', transaction.timestamp)
                            + interval '1 month'
                            = date_trunc('month', t.timestamp)
                        AND auth_user.email = u.email
                )
                THEN 1
                ELSE 0
            END) AS repeat_transactions
        FROM
            transaction
            JOIN auth_user
            ON transaction.buyer_id = auth_user.id
        GROUP BY 1
    ) AS summary
ORDER BY 1;

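Edit: Changed from minus 1 month to plus 1 month after reading the question again. My understanding now is that if someone buys something in 2012-02 and then buys again in 2012-03, their transactions in 2012-02 count towards retention for that month.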
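
To see what this forward-looking definition produces, here is a rough sketch of the same EXISTS idea run against the sample rows from the other answer. It is an adaptation, not the query above: it runs on SQLite through Python's `sqlite3` module, drops the auth_user join and matches on buyer_id directly, assumes a tstamp text column, and replaces date_trunc with SQLite's strftime and its 'start of month' and '+1 month' modifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "transaction" (buyer_id INTEGER, tstamp TEXT)')
conn.executemany(
    'INSERT INTO "transaction" VALUES (?, ?)',
    [
        (1, "2012-01-03 20:00"), (1, "2012-01-05 20:00"),
        (1, "2012-01-07 20:00"), (1, "2012-02-03 20:00"),
        (1, "2012-03-05 20:00"), (2, "2012-01-07 20:00"),
        (2, "2012-03-07 20:00"), (3, "2012-01-07 20:00"),
        (4, "2012-02-07 20:00"),
    ],
)

# Inner query: flag each transaction whose buyer buys again in the next month.
# Outer query: total and flagged transactions per month.
query = """
SELECT month,
       count(*)      AS num_transactions,
       sum(has_next) AS repeat_transactions
FROM (
    SELECT strftime('%Y-%m', t.tstamp) AS month,
           EXISTS (
               SELECT 1
               FROM "transaction" AS n
               WHERE n.buyer_id = t.buyer_id
                 AND strftime('%Y-%m', n.tstamp)
                     = strftime('%Y-%m', t.tstamp, 'start of month', '+1 month')
           ) AS has_next
    FROM "transaction" AS t
) AS flagged
GROUP BY month
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for month, num_transactions, repeat_transactions in rows:
    pct = round(100.0 * repeat_transactions / num_transactions, 2)
    print(month, num_transactions, repeat_transactions, pct)
```

With this definition, 3 of the 5 transactions in 2012-01 count as repeats (buyer 1 buys again in 2012-02), 1 of 2 in 2012-02, and none in 2012-03, because the sample data contains no purchases in 2012-04.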
