
SQL Server: grouping on rows

I have data like below:

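[Image: sample data table with columns visitor, visit_id, visit_time, purchase; the script further down reproduces it]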

I want to group the rows for the same visitors having purchase = 1 and all their previous visits where purchase = 0. For the above data example, the rows should be grouped as:

  • Rows 1 and 2 should be grouped together (because visit_id 1002 has purchase = 1 and visit_id 1001 is the previous visit before the purchase having purchase = 0)
  • Row 3 should be grouped alone (because visit_id 1003 has purchase = 1 and there is no previous visit to visit_id 1003 having purchase = 0) (visit_id 1001 cannot be considered as the previous visit of visit_id 1003 because visit_id 1002 occurred between 1001 and 1003 and it has purchase = 1)
  • Row 4 should be grouped alone (because visit_id 2001 does not have any previous visit)
  • Rows 5, 6, and 7 should be grouped together (because visit_id 2004 has purchase = 1 and visit_ids 2002 and 2003 are the previous visits, both with purchase = 0)

How could this be achieved? I am using SQL Server 2012.

I am expecting output similar to below:

[Image: expected output]

Code to generate the above data:

CREATE TABLE [#tmp_data]
(
    [visitor]       INT, 
    [visit_id]      INT, 
    [visit_time]    DATETIME, 
    [purchase]      BIT
);

INSERT INTO #tmp_data( visitor, visit_id, visit_time, purchase )
VALUES( 1, 1001, '2020-01-01 10:00:00', 0 ), 
( 1, 1002, '2020-01-02 11:00:00', 1 ), 
( 1, 1003, '2020-01-02 14:00:00', 1 ), 
( 2, 2001, '2020-01-01 10:00:00', 1 ), 
( 2, 2002, '2020-01-07 11:00:00', 0 ), 
( 2, 2003, '2020-01-08 14:00:00', 0 ), 
( 2, 2004, '2020-01-11 14:00:00', 1 );

I'm not sure what you mean by "grouped", but the grouping you describe amounts to counting, per visitor, the rows with purchase = 1 at or after each visit. A running sum taken in descending visit_time order assigns that value to every row:

select td.*,
       sum(case when purchase = 1 then 1 else 0 end) over (partition by visitor order by visit_time desc) as grouping
from #tmp_data td;
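
To see why the descending running sum matches the groups requested in the question, it can help to trace it on the sample rows. The sketch below repeats the same query with only the columns needed for the trace; the alias grp is an illustrative name chosen here, not one from the original post.

```sql
-- Same running sum as above, restricted to the columns needed for the trace.
-- "grp" is an illustrative alias (assumption), used instead of "grouping".
SELECT visitor, visit_id, purchase,
       SUM(CASE WHEN purchase = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY visitor ORDER BY visit_time DESC) AS grp
FROM #tmp_data
ORDER BY visitor, visit_time;

-- Expected result on the sample rows:
-- visitor  visit_id  purchase  grp
-- 1        1001      0         2
-- 1        1002      1         2
-- 1        1003      1         1
-- 2        2001      1         2
-- 2        2002      0         1
-- 2        2003      0         1
-- 2        2004      1         1
```

Rows that share the same (visitor, grp) pair form one group, which gives {1001, 1002}, {1003}, {2001} and {2002, 2003, 2004}. A visit with no later purchase for the same visitor would get grp = 0.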

This can be simplified to:

select td.*,
       sum( convert(int, purchase) ) over (partition by visitor order by visit_time desc) as grouping
from #tmp_data td
order by visitor, visit_time;

Note: This only assigns a "grouping" value to each row; rows with the same visitor and grouping value belong together. You can aggregate however you want after that.
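
As one possible follow-up, here is a minimal sketch of such an aggregation, assuming one summary row per group is wanted. The CTE name grouped and the aliases grp, first_visit_time, last_visit_time, visits_in_group and has_purchase are all hypothetical names introduced for illustration.

```sql
-- Sketch: collapse each (visitor, grp) pair into one summary row.
-- All aliases below are illustrative, not from the original post.
WITH grouped AS (
    SELECT td.*,
           SUM(CONVERT(int, purchase))
               OVER (PARTITION BY visitor ORDER BY visit_time DESC) AS grp
    FROM #tmp_data td
)
SELECT visitor,
       grp,
       MIN(visit_time)              AS first_visit_time,
       MAX(visit_time)              AS last_visit_time,
       COUNT(*)                     AS visits_in_group,
       -- MAX does not accept bit, so convert to int first
       MAX(CONVERT(int, purchase))  AS has_purchase
FROM grouped
GROUP BY visitor, grp
ORDER BY visitor, first_visit_time;
```

On the sample data this should return four rows, with visits_in_group equal to 2, 1, 1 and 3 in that order. The grp values count down over time within a visitor, so they work as group keys but not as chronological sequence numbers.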

Here is a db<>fiddle.
