I am struggling to output the correct SQL code on a simple query

There is a table named Order with the following columns: order_id, order_item_id, customer_id, date, product_id, revenue

I need to answer the following question: How many new customers did we acquire in 2018? (A customer counts as new on their first purchase, and only for that order.)

I wrote the following query:

Select Count(customer_id) as New_Customers From Order
Where date < 2019-01-01 AND NOT date < 2018-01-01
Group by customer_id

The person assigning this looked at it and told me that I need to create a filter for the new customers. I thought the "Where" clause accomplished this but I guess not.

Any help would be appreciated.

I think this is what you're trying to achieve.

select count(distinct customer_id) as New_Customers from Orders
Where customer_id not in (select customer_id from Orders Where date < '2018-01-01')
and year(date) = 2018

The solution is to find the very first purchase date of each customer and check whether it falls in the given date range.

  • To find the earliest purchase date, group by customer_id and take MIN(date).
  • The trick is that a WHERE clause won't work here: WHERE runs before GROUP BY, so it would discard each customer's earlier orders before MIN(date) is computed, and a returning customer would look new. Use a HAVING clause instead to filter the grouped results.

SELECT 
    customer_id AS New_Customers, 
    MIN(`date`) First_Purchase_Date
FROM `order` 
GROUP BY customer_id
-- half-open range: BETWEEN would also include 2019-01-01
HAVING MIN(`date`) >= '2018-01-01' AND MIN(`date`) < '2019-01-01';

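To see the WHERE vs. HAVING difference on actual rows, here is a minimal sketch using Python's built-in sqlite3 module. The table is named orders (to avoid the reserved word) and the sample rows are made up for illustration; none of this data comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, order_item_id INTEGER, "
    "customer_id INTEGER, date TEXT, product_id INTEGER, revenue REAL)"
)
# Hypothetical sample data: customer 1 first bought in 2017, customers 2 and 3
# first bought in 2018, customer 4 first bought on 2019-01-01.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "2017-05-10", 10, 20.0),
        (2, 1, 1, "2018-03-01", 11, 15.0),
        (3, 1, 2, "2018-02-14", 10, 20.0),
        (3, 2, 2, "2018-02-14", 12, 5.0),
        (4, 1, 2, "2018-09-09", 11, 15.0),
        (5, 1, 3, "2018-12-31", 10, 20.0),
        (6, 1, 4, "2019-01-01", 10, 20.0),
    ],
)

# WHERE filters rows before grouping, so customer 1's 2017 order is dropped
# first and customer 1 wrongly shows up as a 2018 customer.
where_first = conn.execute("""
    SELECT customer_id FROM orders
    WHERE date >= '2018-01-01' AND date < '2019-01-01'
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# HAVING filters after grouping, so MIN(date) sees every order of the customer.
having = conn.execute("""
    SELECT customer_id, MIN(date) AS first_purchase_date
    FROM orders
    GROUP BY customer_id
    HAVING MIN(date) >= '2018-01-01' AND MIN(date) < '2019-01-01'
    ORDER BY customer_id
""").fetchall()

# Wrap the grouped query to get a single number, which is what was asked for.
new_customer_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT customer_id FROM orders
        GROUP BY customer_id
        HAVING MIN(date) >= '2018-01-01' AND MIN(date) < '2019-01-01'
    ) AS first_purchases
""").fetchone()[0]

print(where_first)         # [(1,), (2,), (3,)]
print(having)              # [(2, '2018-02-14'), (3, '2018-12-31')]
print(new_customer_count)  # 2
```

Customer 4 is excluded because the upper bound is exclusive; with BETWEEN '2018-01-01' AND '2019-01-01' that customer would be counted as well.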
select count(distinct customer_id) as New_Customers from
(
Select
rank() over (partition by customer_id order by date asc) as purchase_rank
,customer_id
,date
From Orders -- Order is a reserved word, so quote it or use a name like Orders
) as ranked_orders
where date >= '2018-01-01' and date < '2019-01-01'
and purchase_rank = 1

I think you want something like the above. You create a rank that's partitioned by customer_id and ordered by date ascending, so that each customer's first purchase has a rank of 1. Then you keep only the 2018 orders where the rank is 1 and count those. Because rank() gives every row on a customer's earliest date a rank of 1, count distinct customer_id values if an order can span several rows.

Additionally, you can change the where clause to purchase_rank > 1 and count distinct customers with count(distinct customer_id) to get the number of customers who made purchases as returning customers (as opposed to new customers).
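As a rough check of the rank approach, here is a small sketch with Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later), a table named orders, and made-up sample rows; the run helper and QUERY template are only there for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, order_item_id INTEGER, "
    "customer_id INTEGER, date TEXT)"
)
# Hypothetical sample data; order 3 has two items, so customer 2 has two rows
# on their first purchase date.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2017-05-10"),
        (2, 1, 1, "2018-03-01"),
        (3, 1, 2, "2018-02-14"),
        (3, 2, 2, "2018-02-14"),
        (4, 1, 2, "2018-09-09"),
        (5, 1, 3, "2018-12-31"),
        (6, 1, 4, "2019-01-01"),
    ],
)

# Same shape as the ranked query above, with the count expression and the
# rank filter left as placeholders.
QUERY = """
    SELECT {count_expr}
    FROM (
        SELECT customer_id, date,
               RANK() OVER (PARTITION BY customer_id ORDER BY date ASC) AS purchase_rank
        FROM orders
    ) AS ranked_orders
    WHERE date >= '2018-01-01' AND date < '2019-01-01'
      AND purchase_rank {rank_filter}
"""

def run(count_expr, rank_filter):
    """Fill in the template and return the single count it produces."""
    sql = QUERY.format(count_expr=count_expr, rank_filter=rank_filter)
    return conn.execute(sql).fetchone()[0]

rank_one_rows = run("COUNT(customer_id)", "= 1")               # counts rows, ties included
new_customers = run("COUNT(DISTINCT customer_id)", "= 1")      # counts customers
returning_customers = run("COUNT(DISTINCT customer_id)", "> 1")

print(rank_one_rows, new_customers, returning_customers)  # 3 2 2
```

Note that customer 2 lands in both groups here: new on 2018-02-14 and returning on 2018-09-09, which matches the "new for that order only" wording in the question.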

The simplest query would be to use NOT EXISTS as follows:

select count(distinct O1.customer_id) as New_Customers 
from Orders O1
Where not exists 
  (select 1 
  from Orders O2 
  Where O2.date < date '2018-01-01'
  and O1.customer_id = O2.customer_id)
and extract(YEAR from O1.date) = 2018
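One practical difference from the NOT IN version earlier in the thread shows up when customer_id can be NULL. The sketch below uses Python's sqlite3 module with a table named orders and made-up rows, including a hypothetical guest order with no customer_id; SQLite has no extract(), so a date range stands in for the year filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1, "2017-05-10"),
        (2, 1, "2018-03-01"),
        (3, 2, "2018-02-14"),
        (4, 2, "2018-09-09"),
        (5, 3, "2018-12-31"),
        (6, 4, "2019-01-01"),
        (7, None, "2017-08-01"),  # hypothetical guest order, customer_id is NULL
    ],
)

# NOT EXISTS: a 2018 order counts if the same customer has no order before 2018.
not_exists_count = conn.execute("""
    SELECT COUNT(DISTINCT o1.customer_id)
    FROM orders o1
    WHERE NOT EXISTS (
        SELECT 1 FROM orders o2
        WHERE o2.date < '2018-01-01'
          AND o2.customer_id = o1.customer_id
    )
    AND o1.date >= '2018-01-01' AND o1.date < '2019-01-01'
""").fetchone()[0]

# NOT IN: the subquery returns a NULL, so the NOT IN test is never true
# and every row is filtered out.
not_in_count = conn.execute("""
    SELECT COUNT(DISTINCT customer_id)
    FROM orders
    WHERE customer_id NOT IN (SELECT customer_id FROM orders WHERE date < '2018-01-01')
      AND date >= '2018-01-01' AND date < '2019-01-01'
""").fetchone()[0]

print(not_exists_count)  # 2
print(not_in_count)      # 0
```

If customer_id is declared NOT NULL in the real table, the two forms should give the same answer.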

Cheers!!
