
How to create a matrix when two values are in the same groupby column in pandas?

So I basically have a dataframe of products and orders:

product    order
 apple      111
 orange     111
 apple      121
 beans      121
 rice       131
 orange     131
 apple      141
 orange     141

What I need to do is group the products by order id and generate this matrix, where each value is the number of times the two products appeared together in the same order. I don't know of an efficient way to do this, so any help would be appreciated!

           apple   orange  beans rice
 apple       x        2      1     0
 orange      2        x      0     1
 beans       1        0      x     0
 rice        0        1      0     x

One option is to join the dataframe with itself on order and then calculate the co-occurrences using crosstab on the two product columns:

df.merge(df, on='order').pipe(lambda df: pd.crosstab(df.product_x, df.product_y))

product_y  apple  beans  orange  rice
product_x                            
apple          3      1       2     0
beans          1      1       0     0
orange         2      0       3     1
rice           0      0       1     1
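
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same self-merge idea. The dataframe is rebuilt from the sample in the question; the variable names pairs and cooc are my own and not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "product": ["apple", "orange", "apple", "beans",
                "rice", "orange", "apple", "orange"],
    "order": [111, 111, 121, 121, 131, 131, 141, 141],
})

# Self-join on order: every pair of products that share an order,
# including each product paired with itself
pairs = df.merge(df, on="order")  # columns: product_x, order, product_y

# Count how often each (product_x, product_y) pair occurs
cooc = pd.crosstab(pairs["product_x"], pairs["product_y"])
print(cooc)
```

Note that the diagonal is not empty here: because each product is also paired with itself, cooc.loc["apple", "apple"] is the number of orders containing apple, assuming a product appears at most once per order.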

Another way is to perform a crosstab between product and order, then do a matrix multiplication (@) with its transpose:

a_ = pd.crosstab(df['product'], df['order'])
res = a_@a_.T
print(res)
product  apple  beans  orange  rice
product                            
apple        3      1       2     0
beans        1      1       0     0
orange       2      0       3     1
rice         0      0       1     1

Or, using pipe to do it as a one-liner:

res = pd.crosstab(df['product'], df['order']).pipe(lambda x: x@x.T)
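
The question's desired matrix shows x on the diagonal, while the results above put each product's own order count there. If the diagonal should be blanked out, one possible sketch is below. It assumes each product appears at most once per order and uses 0 as the placeholder; the names incidence, values, and off_diag are mine, not from the answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "product": ["apple", "orange", "apple", "beans",
                "rice", "orange", "apple", "orange"],
    "order": [111, 111, 121, 121, 131, 131, 141, 141],
})

# Product-by-order incidence matrix (1 if the product is in the order)
incidence = pd.crosstab(df["product"], df["order"])

# Product-by-product co-occurrence counts
res = incidence @ incidence.T

# Work on a copy of the underlying array so res itself is left untouched,
# then zero the diagonal (self co-occurrence)
values = res.to_numpy().copy()
np.fill_diagonal(values, 0)
off_diag = pd.DataFrame(values, index=res.index, columns=res.columns)
print(off_diag)
```

Copying the array before calling np.fill_diagonal avoids relying on whether to_numpy() returns a writable view, which can differ between pandas versions and settings.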
