SAS start and end date from consecutive run
I have a dataset of customers buying items in multiple batches of consecutive days over the year, e.g. Customer A buys on the 1st of January, the 2nd of January and the 3rd of January, stops, then buys again on the 1st of February, the 2nd of February and the 3rd of February.
I'm looking to capture the first and last date of each consecutive batch for each customer (so the usual MIN / MAX per customer won't work, because it misses the batches in between).
I've experimented with RETAIN and LAG and I'm getting close, but it's not quite what I want.
How do I create a query that will display two rows for Customer A? That is, row 1 showing a start date of the 1st of January and an end date of the 3rd of January; row 2 showing a start date of the 1st of February and an end date of the 3rd of February.
You are asking to group the values based on the presence of a gap between the dates. So test for that and create a new group number variable. Then you can use that new grouping variable in your analysis.
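data want;
  set have;
  /* HAVE must already be sorted by id and sales_date */
  by id sales_date;
  /* Days since the previous row; DIF runs on every row so its queue stays in sync */
  dif_days = dif(sales_date);
  /* First row of a customer restarts the numbering */
  if first.id then group = 1;
  /* A gap of more than one day starts a new group (sum statement retains GROUP) */
  else if dif_days > 1 then group + 1;
run;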
You can adjust the number of days in the last IF statement to control how large a gap you allow while still treating the events as part of the same group.
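To get the one row per run that the question asks for, the new grouping variable can be collapsed with an ordinary MIN / MAX, now taken per customer and group rather than per customer. The following is only a sketch: it assumes the data step has produced WANT with the variables id, sales_date and group, and the output names runs, start_date and end_date are made up for illustration.

```sas
/* Sketch: one row per customer and consecutive run.              */
/* Assumes WANT contains id, sales_date (a SAS date) and group.   */
/* RUNS, START_DATE and END_DATE are illustrative names only.     */
proc sql;
  create table runs as
  select id,
         group,
         min(sales_date) as start_date format=date9.,
         max(sales_date) as end_date   format=date9.
  from want
  group by id, group
  order by id, group;
quit;
```

For the Customer A example this should give two rows, one spanning the 1st to the 3rd of January and one spanning the 1st to the 3rd of February.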