
SQL query to find all combinations in a shopping cart and their occurrence

I have a table with two columns, [USER] and [ITEM]. Each item appears at most once per user.

An example of the table could be:

[USER]  [ITEM]
A        001
A        002
B        002
B        001
B        003
C        001

I would like to extract ALL sequences of items ever bought using SQL. In this case:

[SEQUENCE]    [OCCURRENCES]    [LENGTH SEQUENCE]
001            3                 1
002            2                 1
003            1                 1
001-002        2                 2
001-002-003    1                 3

I believe the best way to store the result in a table would be:

[SEQUENCE]    [ITEM]    [OCCURRENCES]  [LENGTH SEQUENCE]
1             001        3              1
2             002        2              1
3             003        1              1
4             001        2              2
4             002        2              2
5             001        1              3
5             002        1              3
5             003        1              3

I have found this post, "SQL Query For Most Popular Combination", but it only extracts combinations of 2 elements.

Do you have any idea on how to obtain such output? Thanks!

To do this kind of frequency analysis, you need a way to create all combinations of products purchased in each transaction (here, each user's set of items). For that, recursive SQL is the way to go.
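To make the target concrete before turning to SQL, here is a minimal Python sketch of the same idea: group the items per user, enumerate every combination of each user's items, and count how many users share each combination. The names `purchases`, `combo_frequencies` and `max_len` are illustrative and not part of the original answer; the sample rows are the ones from the question.

```python
from collections import Counter
from itertools import combinations

# Sample data from the question: (user, item) pairs.
purchases = [
    ("A", "001"), ("A", "002"),
    ("B", "002"), ("B", "001"), ("B", "003"),
    ("C", "001"),
]


def combo_frequencies(rows, max_len=5):
    """Count how many users bought each item combination (up to max_len items)."""
    # Collect each user's items; a set ignores accidental duplicates.
    baskets = {}
    for user, item in rows:
        baskets.setdefault(user, set()).add(item)

    counts = Counter()
    for items in baskets.values():
        # Sort first so every combination has exactly one spelling.
        ordered = sorted(items)
        for k in range(1, min(max_len, len(ordered)) + 1):
            counts.update(combinations(ordered, k))
    return counts


freq = combo_frequencies(purchases)
for combo, n in sorted(freq.items(), key=lambda kv: (-kv[1], -len(kv[0]), kv[0])):
    print(len(combo), ",".join(combo), n)
```

This is only a cross-check for small data. It holds every basket in memory and, like any exhaustive enumeration, grows quickly with basket size, which is why a length cap such as `max_len` is worth keeping.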

Starting with a table of purchases:

create table purchases (id varchar(6), product varchar(6));
insert into purchases 
values ('A','001')
      ,('A','002')
      ,('B','002')
      ,('B','001')
      ,('B','003')
      ,('C','001');

The following recursive query generates all purchase combinations per transaction, limited to at most 5 items per combination (you can change that limit if desired). The query that follows the recursive common table expression then performs the frequency analysis on the generated combinations:

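with recur(id, length, combo, lastitem) as (
  -- Anchor query: every purchased product is a combination of length 1.
  -- (varchar(max) and + for concatenation are SQL Server syntax.)
  select p.id, 1, cast(p.product as varchar(max)), p.product
    from purchases p

  union all
  -- Recursive part: extend each combination with a product from the same
  -- transaction that sorts after its last item, so each combination is
  -- generated once per transaction, in ascending product order.
  select r.id, r.length + 1, r.combo + ',' + p.product, p.product
    from recur r
    join purchases p
      on p.id = r.id
     and p.product > r.lastitem
   where r.length < 5  -- cap combinations at 5 items
)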
-- Output query
select length, combo, count(*) frequency
  from recur
 group by length, combo
 order by frequency desc
     , length desc
     , combo;

Yielding the following results for the given data:

length | combo       | frequency
-----: | :---------- | --------:
     1 | 001         |         3
     2 | 001,002     |         2
     1 | 002         |         2
     3 | 001,002,003 |         1
     2 | 001,003     |         1
     2 | 002,003     |         1
     1 | 003         |         1
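For readers without SQL Server at hand, the sketch below runs a port of the query through Python's built-in `sqlite3` module and then expands each combination into one row per item, which is the second layout asked for in the question. The port is an assumption on my part, not the original dialect: it uses `with recursive`, `||` for concatenation, and a column named `combo_len` instead of `length`. The names `QUERY`, `rows` and `per_item` are illustrative.

```python
import sqlite3

# SQLite port of the recursive query (assumed dialect changes:
# "with recursive", "||" for concatenation, "combo_len" for "length").
QUERY = """
with recursive recur(id, combo_len, combo, lastitem) as (
  select p.id, 1, p.product, p.product from purchases p
  union all
  select r.id, r.combo_len + 1, r.combo || ',' || p.product, p.product
    from recur r
    join purchases p
      on p.id = r.id
     and p.product > r.lastitem
   where r.combo_len < 5
)
select combo_len, combo, count(*) as frequency
  from recur
 group by combo_len, combo
 order by frequency desc, combo_len desc, combo
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table purchases (id varchar(6), product varchar(6))")
conn.executemany(
    "insert into purchases values (?, ?)",
    [("A", "001"), ("A", "002"), ("B", "002"),
     ("B", "001"), ("B", "003"), ("C", "001")],
)
rows = conn.execute(QUERY).fetchall()

# Expand each combination into one row per item:
# (sequence id, item, occurrences, sequence length).
per_item = [
    (seq_id, item, frequency, combo_len)
    for seq_id, (combo_len, combo, frequency) in enumerate(rows, start=1)
    for item in combo.split(",")
]

for row in per_item:
    print(row)
```

The sequence ids here simply follow the sort order of the query, so they will not match the ids in the question's example, and the output includes every combination (such as 001,003), not only the ones listed there.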
