
Repeat Customers Each Year (Retention)

I've been working on this and I don't think I'm doing it right. |D

Our database doesn't keep track of how many customers we retain, so we looked for an alternate method. It's outlined in this article. It suggests filling in this table:

Year  | Number of Customers | Number of Customers Retained in 2009 | Percent (%) Retained in 2009 | Number of Customers Retained in 2010 | Percent (%) Retained in 2010 | ...
2008  |
2009  |
2010  |
2011  |
2012  |
Total |

The table would go out to 2012 in the headers. I'm just saving space.

It tells you to find the total number of customers you had in your starting year. To do this, I used this query since our starting year is 2008:

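select YEAR(OrderDate) as 'Year', COUNT(distinct(billemail)) as Customers
from dbo.tblOrder
-- half-open range: if OrderDate has a time part, "<= '2008-12-31'" would drop orders placed later that day
where OrderDate >= '2008-01-01' and OrderDate < '2009-01-01'
group by YEAR(OrderDate)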

At the moment we just differentiate our customers by email address.

Then you have to search for the same names of customers who purchased again in later years (ours are 2009, 10, 11, and 12).

I came up with this. It should find people who purchased in both 2008 and 2009.

SELECT YEAR(OrderDate) as 'Year',COUNT(distinct(billemail)) as Customers
FROM dbo.tblOrder o with (nolock)
-- half-open ranges: BETWEEN would also match orders stamped exactly midnight on the following Jan 1
WHERE o.BillEmail IN (SELECT DISTINCT o1.BillEmail 
        FROM dbo.tblOrder o1 with (nolock)
        WHERE o1.OrderDate >= '2008-01-01' AND o1.OrderDate < '2009-01-01')
    AND o.BillEmail IN (SELECT DISTINCT o2.BillEmail 
        FROM dbo.tblOrder o2 with (nolock)
        WHERE o2.OrderDate >= '2009-01-01' AND o2.OrderDate < '2010-01-01')
    --AND o.OrderDate BETWEEN '2008-1-1' AND '2013-1-1' 
    AND o.BillEmail NOT LIKE '%@halloweencostumes.com'
    AND o.BillEmail NOT LIKE ''
GROUP BY YEAR(OrderDate)

So I'm just finding the customers who purchased in both those years. And then I'm doing an independent query to find those who purchased in 2008 and 2010, then 08 and 11, and then 08 and 12. This one finds 2008 and 2010 purchasers:

SELECT YEAR(OrderDate) as 'Year',COUNT(distinct(billemail)) as Customers
FROM dbo.tblOrder o with (nolock)
-- half-open ranges: BETWEEN would also match orders stamped exactly midnight on the following Jan 1
WHERE o.BillEmail IN (SELECT DISTINCT o1.BillEmail 
        FROM dbo.tblOrder o1 with (nolock)
        WHERE o1.OrderDate >= '2008-01-01' AND o1.OrderDate < '2009-01-01')
    AND o.BillEmail IN (SELECT DISTINCT o2.BillEmail 
        FROM dbo.tblOrder o2 with (nolock)
        WHERE o2.OrderDate >= '2010-01-01' AND o2.OrderDate < '2011-01-01')
    --AND o.OrderDate BETWEEN '2008-1-1' AND '2013-1-1' 
    AND o.BillEmail NOT LIKE '%@halloweencostumes.com'
    AND o.BillEmail NOT LIKE ''
GROUP BY YEAR(OrderDate)

So you see I have a different query for each year comparison. They're all unrelated. So in the end I'm just finding people who bought in 2008 and 2009, and then a potentially different group that bought in 2008 and 2010, and so on. For this to be accurate, do I have to use the same grouping of 2008 buyers each time? So they bought in 2009 and 2010 and 2011, and 2012?

This is where I'm worried and not sure how to proceed or even find such data.

Any advice would be appreciated! Thanks!

How about a cross-tab on a per customer basis to help you out...

From this, you can analyze in bulk by comparing each customer's activity in one year with the previous year, alongside a total customer count for each year. From those numbers you can compute whatever percentages you want in your final output.

This should get you a whole set of all years in question, and you can just keep adding years as need be for comparison. It should be very quick, especially if you have an index on ( BillEMail, OrderDate ).
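As a rough sketch of the index mentioned above: the index name below is made up for illustration, while the table and column names come from the queries in this thread. Because the cross-tab query only reads BillEmail and OrderDate, an index on those two columns should let SQL Server scan the narrower index rather than the whole table.

```sql
-- Hypothetical index name; dbo.tblOrder, BillEmail and OrderDate are taken from the question.
CREATE NONCLUSTERED INDEX IX_tblOrder_BillEmail_OrderDate
    ON dbo.tblOrder (BillEmail, OrderDate);
```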

The premise is that the inner query passes over all the records once and, per customer, sets a flag of 1 if there are ANY orders within the given year (via MAX()). It uses CASE/WHEN so that each year is detected separately for a customer. The outer query then rolls those flags up: for each customer it checks whether they had a sale in both one year and the prior year, and SUM() adds 1 if so and 0 otherwise, which gives you the retention counts.

SELECT
      SUM( case when PreQry.C2011 = 1 and PreQry.C2012 = 1 then 1 else 0 end ) as Retain2011_2012,
      SUM( case when PreQry.C2010 = 1 and PreQry.C2011 = 1 then 1 else 0 end ) as Retain2010_2011,
      SUM( case when PreQry.C2009 = 1 and PreQry.C2010 = 1 then 1 else 0 end ) as Retain2009_2010,
      SUM( case when PreQry.C2008 = 1 and PreQry.C2009 = 1 then 1 else 0 end ) as Retain2008_2009,
      SUM( PreQry.C2012 ) CustCount2012,
      SUM( PreQry.C2011 ) CustCount2011,
      SUM( PreQry.C2010 ) CustCount2010,
      SUM( PreQry.C2009 ) CustCount2009,
      SUM( PreQry.C2008 ) CustCount2008
   from 
      ( select 
              O.BillEMail as customer,
              MAX( CASE when YEAR( O.OrderDate ) = 2012 then 1 else 0 end ) as C2012, 
              MAX( CASE when YEAR( O.OrderDate ) = 2011 then 1 else 0 end ) as C2011, 
              MAX( CASE when YEAR( O.OrderDate ) = 2010 then 1 else 0 end ) as C2010, 
              MAX( CASE when YEAR( O.OrderDate ) = 2009 then 1 else 0 end ) as C2009, 
              MAX( CASE when YEAR( O.OrderDate ) = 2008 then 1 else 0 end ) as C2008
           from 
              dbo.tblOrder O
           where 
                  O.OrderDate >= '2008-01-01' 
              AND O.BillEmail NOT LIKE '%@halloweencostumes.com'
              AND O.BillEmail NOT LIKE ''
           group by 
              O.BillEMail ) as PreQry
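To turn those counts into percentages, one option is to divide a retention count by the base year's customer count. Here is a minimal sketch for the 2008-to-2009 pair only, reusing the same inner query as a CTE. The column name PctRetain2008_2009 is made up for illustration, and the other year pairs would follow the same pattern.

```sql
WITH PreQry AS (
    SELECT
        O.BillEmail AS customer,
        MAX(CASE WHEN YEAR(O.OrderDate) = 2008 THEN 1 ELSE 0 END) AS C2008,
        MAX(CASE WHEN YEAR(O.OrderDate) = 2009 THEN 1 ELSE 0 END) AS C2009
    FROM dbo.tblOrder O
    WHERE O.OrderDate >= '2008-01-01'
      AND O.OrderDate <  '2010-01-01'
      AND O.BillEmail NOT LIKE '%@halloweencostumes.com'
      AND O.BillEmail <> ''
    GROUP BY O.BillEmail
)
SELECT
    SUM(C2008) AS CustCount2008,
    SUM(CASE WHEN C2008 = 1 AND C2009 = 1 THEN 1 ELSE 0 END) AS Retain2008_2009,
    -- 100.0 forces decimal arithmetic; NULLIF avoids a divide-by-zero when there are no 2008 customers
    CAST(100.0 * SUM(CASE WHEN C2008 = 1 AND C2009 = 1 THEN 1 ELSE 0 END)
         / NULLIF(SUM(C2008), 0) AS decimal(5, 2)) AS PctRetain2008_2009
FROM PreQry;
```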

Now, if you wanted to detect how many were "NEW" for a given year, you could add columns that test for a previous-year sale flag of 0 and a current-year flag of 1, such as:

SUM( case when PreQry.C2011 = 0 and PreQry.C2012 = 1 then 1 else 0 end ) as NewIn2012,
SUM( case when PreQry.C2010 = 0 and PreQry.C2011 = 1 then 1 else 0 end ) as NewIn2011,
SUM( case when PreQry.C2009 = 0 and PreQry.C2010 = 1 then 1 else 0 end ) as NewIn2010,
SUM( case when PreQry.C2008 = 0 and PreQry.C2009 = 1 then 1 else 0 end ) as NewIn2009
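Note that these flags only look at the immediately preceding year, so a customer who bought in 2009, skipped 2010, and came back in 2011 would count as "new" in 2011. If a stricter definition is wanted, one possible sketch is to bucket each customer by the year of their first order ever. FirstYear, NewCustomers, and FirstOrders are illustrative names, and the query assumes that any history before 2008 in dbo.tblOrder should count as a prior purchase.

```sql
SELECT
    FirstYear,
    COUNT(*) AS NewCustomers          -- customers whose earliest order falls in this year
FROM (
    SELECT
        O.BillEmail,
        MIN(YEAR(O.OrderDate)) AS FirstYear
    FROM dbo.tblOrder O
    WHERE O.BillEmail NOT LIKE '%@halloweencostumes.com'
      AND O.BillEmail <> ''
    GROUP BY O.BillEmail
) AS FirstOrders
GROUP BY FirstYear
ORDER BY FirstYear;
```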

If I understand your problem right, it sounds like you've gotten mixed up in the details. It depends on what you want your definition of 'retain' to be. How about 'also bought in some previous year'? Then, for year X, a customer is retained if they also bought from you in a previous year.

For 2012, for example:

SELECT YEAR(o.OrderDate) as 'Year',COUNT(distinct(o.BillEmail)) as Customers
FROM dbo.tblOrder o with (nolock)
-- restrict the outer rows to 2012 so only that year is reported
WHERE o.OrderDate >= '2012-01-01' AND o.OrderDate < '2013-01-01'
    -- keep only customers who also ordered before 2012
    AND o.BillEmail IN (SELECT DISTINCT o2.BillEmail 
        FROM dbo.tblOrder o2 with (nolock)
        WHERE o2.OrderDate < '2012-01-01')
    AND o.BillEmail NOT LIKE '%@halloweencostumes.com'
    AND o.BillEmail NOT LIKE ''
GROUP BY YEAR(o.OrderDate)

Does this work?

Edit

You can take this a step further and abstract out the year so that one query will suffice:

SELECT YEAR(o.OrderDate) as 'Year',COUNT(distinct(o.BillEmail)) as Customers
FROM dbo.tblOrder o with (nolock)
-- correlated subquery: keep an order only if the same customer also ordered in an earlier year
-- (the "same year" subquery was always true for the row's own customer, so it is omitted)
WHERE o.BillEmail IN (SELECT DISTINCT o2.BillEmail 
        FROM dbo.tblOrder o2 with (nolock)
        WHERE year(o2.OrderDate) < year(o.OrderDate))
    AND o.BillEmail NOT LIKE '%@halloweencostumes.com'
    AND o.BillEmail NOT LIKE ''
GROUP BY YEAR(o.OrderDate)

This should give you, for each year in which you had orders, the count of distinct customers who also purchased in a previous year. It does not return the total number of customers per year, and it's not in the same format as the table you want to populate.
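For the table in the question, one possible sketch is to fix the 2008 buyers as a single cohort and count how many of them ordered again in each later year. That measures every year against the same group of 2008 buyers, without requiring a purchase in every intermediate year. Cohort2008, RetainedFrom2008, and PctRetainedFrom2008 are illustrative names, and whether this matches the article's definition of retention is an assumption.

```sql
WITH Cohort2008 AS (
    -- every distinct customer with at least one order in 2008
    SELECT DISTINCT O.BillEmail
    FROM dbo.tblOrder O
    WHERE O.OrderDate >= '2008-01-01' AND O.OrderDate < '2009-01-01'
      AND O.BillEmail NOT LIKE '%@halloweencostumes.com'
      AND O.BillEmail <> ''
)
SELECT
    YEAR(O.OrderDate) AS OrderYear,
    COUNT(DISTINCT O.BillEmail) AS RetainedFrom2008,
    -- share of the 2008 cohort that ordered again in this year
    CAST(100.0 * COUNT(DISTINCT O.BillEmail)
         / NULLIF((SELECT COUNT(*) FROM Cohort2008), 0) AS decimal(5, 2)) AS PctRetainedFrom2008
FROM dbo.tblOrder O
JOIN Cohort2008 C ON C.BillEmail = O.BillEmail
WHERE O.OrderDate >= '2009-01-01' AND O.OrderDate < '2013-01-01'
GROUP BY YEAR(O.OrderDate)
ORDER BY OrderYear;
```

Each result row corresponds to one "Retained in YYYY" pair of columns in the 2008 row of the table. Repeating the query with a different cohort year would fill in the other rows.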


 