简体   繁体   中英

Items appearing together in a basket

I am completely stuck - I would like to count the number of cases an item appears together with another item in a basket, ie some correlation:

   basket item
0       a    x
1       a    y
2       a    z
3       b    x
4       b    y
5       c    k
6       c    y
7       c    z
8       d    x
9       d    y
10      d    z
11      d    l

That is, items x , y , and z appear together in basket a , items x and y appear together in basket b , etc.

I am looking for a way to get here:

    x   y   z   k   l
x       3   2       1
y   3       3   1   1
z   2   3       1   1
k       1   1       
l   1   1   1       

The matrix is obviously symmetric. The first 3 in the top row means, that x and y appear 3 times together in one basket (namely a , b and d ), etc.

Thanks a bunch!

Use DataFrame.merge by same DataFrame with df , then remove rows with same item values and last count combinations by crosstab , last if need order of values like original add DataFrame.reindex :

df1 = df.merge(df, on='basket')
i = df['item'].unique()
df1 = df1[df1.item_x != df1.item_y]
df2 = pd.crosstab(df1.item_x, df1.item_y).reindex(index=i, columns=i)

print (df2)
item_y  x  y  z  k  l
item_x               
x       0  3  2  0  1
y       3  0  3  1  1
z       2  3  0  1  1
k       0  1  1  0  0
l       1  1  1  0  0

If need missing values use DataFrame.pivot_table :

df1 = df.merge(df, on='basket')
i = df['item'].unique()
df1 = df1[df1.item_x != df1.item_y]
df2 = df1.pivot_table(index='item_x', 
                      columns='item_y', 
                      aggfunc='size').reindex(index=i, columns=i)

print (df2)
item_y    x    y    z    k    l
item_x                         
x       NaN  3.0  2.0  NaN  1.0
y       3.0  NaN  3.0  1.0  1.0
z       2.0  3.0  NaN  1.0  1.0
k       NaN  1.0  1.0  NaN  NaN
l       1.0  1.0  1.0  NaN  NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM