I have a table with two columns, [USER] and [ITEM]. Each item appears at most once per user.
An example of the table could be:
[USER] [ITEM]
A 001
A 002
B 002
B 001
B 003
C 001
I would like to extract ALL sequences of items ever bought using SQL. In this case:
[SEQUENCE] [OCCURRENCES] [LENGTH SEQUENCE]
001 3 1
002 2 1
003 1 1
001-002 2 2
001-002-003 1 3
I believe the best way to store the result in a table would be:
[SEQUENCE] [ITEM] [OCCURRENCES] [LENGTH SEQUENCE]
1 001 3 1
2 002 2 1
3 003 1 1
4 001 2 2
4 002 2 2
5 001 1 3
5 002 1 3
5 003 1 3
I found the post "SQL Query For Most Popular Combination", but it only extracts combinations of two elements.
Do you have any idea on how to obtain such output? Thanks!
To do this kind of frequency analysis, you need a way to generate every combination of products purchased in each transaction (here, by each user). For that, recursive SQL is the way to go.
Starting with a table of purchases:
create table purchases (id varchar(6), product varchar(6));
insert into purchases
values ('A','001')
,('A','002')
,('B','002')
,('B','001')
,('B','003')
,('C','001');
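The following recursive query generates all purchase combinations per transaction, limited to at most 5 items per combination (you can change that limit if desired). The join condition p.product > r.lastitem extends a combination only with products that sort after its last item, so, given that each product appears at most once per id, every combination is generated exactly once, in ascending product order. The output query that follows the recursive common table expression then performs the frequency analysis on the generated combinations. Note that varchar(max) and string concatenation with + are SQL Server syntax: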
with recur(id, length, combo, lastitem) as (
-- Anchor Query
select p.id, 1, cast(product as varchar(max)), product from purchases p
union all -- Recursive Part
select r.id, length+1, combo+','+product, product
from recur r
join purchases p
on p.id = r.id
and p.product > r.lastitem
where r.length < 5
)
-- Output query
select length, combo, count(*) frequency
from recur
group by length, combo
order by frequency desc
, length desc
, combo;
Yielding the following results for the given data:
length | combo | frequency
-----: | :---------- | --------:
1 | 001 | 3
2 | 001,002 | 2
1 | 002 | 2
3 | 001,002,003 | 1
2 | 001,003 | 1
2 | 002,003 | 1
1 | 003 | 1
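If you want to check the query without a SQL Server instance, the following is a minimal sketch that runs a SQLite variant of it through Python's standard sqlite3 module. It is an adaptation, not the query above verbatim: it assumes SQLite's `with recursive` and `||` concatenation in place of the SQL Server syntax, and the column name `combo_len` and the helper `combo_frequencies` are hypothetical names chosen for this illustration.

```python
import sqlite3

# Sample data from the question: (id, product) pairs.
PURCHASES = [
    ("A", "001"), ("A", "002"),
    ("B", "002"), ("B", "001"), ("B", "003"),
    ("C", "001"),
]

# SQLite variant of the recursive query (assumption: "with recursive" and
# "||" replace the SQL Server syntax; "combo_len" stands in for "length").
QUERY = """
with recursive recur(id, combo_len, combo, lastitem) as (
    -- Anchor: every single purchase is a combination of length 1
    select p.id, 1, p.product, p.product from purchases p
    union all
    -- Recursive part: extend only with products sorting after the last item
    select r.id, r.combo_len + 1, r.combo || ',' || p.product, p.product
    from recur r
    join purchases p
      on p.id = r.id
     and p.product > r.lastitem
    where r.combo_len < 5
)
select combo_len, combo, count(*) as frequency
from recur
group by combo_len, combo
order by frequency desc, combo_len desc, combo
"""


def combo_frequencies(purchases):
    """Load the purchases into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table purchases (id text, product text)")
        conn.executemany("insert into purchases values (?, ?)", purchases)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in combo_frequencies(PURCHASES):
        print(row)
```

Each returned row is a (length, combo, frequency) tuple, so the output can be compared line by line with the result table above.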