Correlation of every pandas row with another pandas dataframe as a new column

Assuming I have the following df:

Company   Apples   Mangoes   Oranges
Amazon      0.75      0.6       0.98
BellTM      0.23      0.75      0.14
Cadbury     0.4       0.44      0.86

and then another data frame called vendor:

Company   Apples   Mangoes   Oranges
Deere       0.11      0.3       0.79

I want to find the row-wise correlation of each company with the company Deere in the vendor data frame. I want the resulting correlation coefficient added to the original data frame df as a column called Corrcoef:

Company   Apples   Mangoes   Oranges   Corrcoef
Amazon      0.75      0.6       0.98     0.77955981
BellTM      0.23      0.75      0.14    -0.37694478
Cadbury     0.4       0.44      0.86     0.98092707

When I attempt the following:

df.iloc[:,1:].corrwith(vendor.iloc[:,1:], axis=1)

I get a list with NaN values. I obtained the Corrcoef values manually by saving each row as an array and using np.corrcoef(x1, y).

You need to pass a Series to corrwith, not a DataFrame.

You can use:

df.set_index('Company').corrwith(vendor.set_index('Company').loc['Deere'], axis=1)

Output:

Company
Amazon     0.779560
BellTM    -0.376945
Cadbury    0.980927
dtype: float64
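
To run this end to end, here is a minimal sketch that rebuilds both frames from the tables in the question and writes the result back as the Corrcoef column. The frame construction and the use of map to attach the values are my assumptions, not part of the answer above; the corrwith call itself is the same one.

```python
import pandas as pd

# Frames typed in by hand from the question's tables
df = pd.DataFrame({
    "Company": ["Amazon", "BellTM", "Cadbury"],
    "Apples": [0.75, 0.23, 0.4],
    "Mangoes": [0.6, 0.75, 0.44],
    "Oranges": [0.98, 0.14, 0.86],
})
vendor = pd.DataFrame({
    "Company": ["Deere"],
    "Apples": [0.11],
    "Mangoes": [0.3],
    "Oranges": [0.79],
})

# Deere's row as a float Series indexed by Apples/Mangoes/Oranges
deere = vendor.set_index("Company").loc["Deere"]

# Row-wise Pearson correlation of each company with Deere
corr = df.set_index("Company").corrwith(deere, axis=1)

# corr is indexed by company name, so map it back onto the Company column
df["Corrcoef"] = df["Company"].map(corr)
print(df)
```

Mapping on Company rather than assigning positionally keeps the values attached to the right rows even if df is later reordered.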

With your code (a row selected with iloc from a mixed-type frame has object dtype, hence the astype(float)):

df.iloc[:, 1:].corrwith(vendor.iloc[0,1:].astype(float), axis=1)

Output:

0    0.779560
1   -0.376945
2    0.980927
dtype: float64
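
As a sanity check, the same numbers can be reproduced with np.corrcoef, which is how the question says the expected values were obtained. This is only a sketch: the arrays are typed in by hand from the question's tables, and the variable names (rows, deere, manual) are hypothetical.

```python
import numpy as np

# One row per company, columns in the order Apples, Mangoes, Oranges
rows = np.array([
    [0.75, 0.60, 0.98],  # Amazon
    [0.23, 0.75, 0.14],  # BellTM
    [0.40, 0.44, 0.86],  # Cadbury
])
deere = np.array([0.11, 0.30, 0.79])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is Pearson's r
manual = np.array([np.corrcoef(row, deere)[0, 1] for row in rows])
print(manual)
```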
