Correlation of every pandas row with another pandas dataframe as a new column
Assuming I have the following df:
Company Apples Mangoes Oranges
Amazon 0.75 0.6 0.98
BellTM 0.23 0.75 0.14
Cadbury 0.4 0.44 0.86
and then another data frame called vendor:
Company Apples Mangoes Oranges
Deere 0.11 0.3 0.79
I want to find the row-wise correlation of each company in df with the company Deere in the vendor data frame. I want the resulting correlation coefficient added to the original data frame df as a column called Corrcoef:
Company Apples Mangoes Oranges Corrcoef
Amazon 0.75 0.6 0.98 0.77955981
BellTM 0.23 0.75 0.14 -0.37694478
Cadbury 0.4 0.44 0.86 0.98092707
When I attempt the following:
df.iloc[:,1:].corrwith(vendor.iloc[:,1:], axis=1)
I get a list with NaN values. I obtained the Corrcoef values shown above manually, by saving each row as an array and using np.corrcoef(x1,y).
You need to pass a Series, not a DataFrame, to corrwith.
You can use:
df.set_index('Company').corrwith(vendor.set_index('Company').loc['Deere'], axis=1)
output:
Company
Amazon 0.779560
BellTM -0.376945
Cadbury 0.980927
dtype: float64
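As to why the original attempt returns NaN: when corrwith is given a DataFrame, pandas aligns the two frames on their labels before correlating, so with axis=1 a row is only compared with the row of the other frame that carries the same index label. The sketch below rebuilds the question's data with the company names as the index (an assumption made for illustration; the frames in the question may be indexed differently) and contrasts the two calls.

```python
import pandas as pd

# Data from the question, with the company name used as the index (assumed layout).
df = pd.DataFrame(
    {
        "Apples": [0.75, 0.23, 0.40],
        "Mangoes": [0.60, 0.75, 0.44],
        "Oranges": [0.98, 0.14, 0.86],
    },
    index=["Amazon", "BellTM", "Cadbury"],
)
vendor = pd.DataFrame(
    {"Apples": [0.11], "Mangoes": [0.30], "Oranges": [0.79]},
    index=["Deere"],
)

# DataFrame vs DataFrame: rows are paired by index label.
# No label is shared here, so there is nothing to correlate.
frame_result = df.corrwith(vendor, axis=1)

# DataFrame vs Series: the Series is matched on the column labels
# and correlated with every row of df.
series_result = df.corrwith(vendor.loc["Deere"], axis=1)

print(frame_result)
print(series_result)
```

The first call yields no usable values, while the second returns one coefficient per company.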
With your code, select the vendor row as a Series and cast it to float:
df.iloc[:, 1:].corrwith(vendor.iloc[0,1:].astype(float), axis=1)
output:
0 0.779560
1 -0.376945
2 0.980927
dtype: float64
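To get the coefficients into df as the Corrcoef column the question asks for, the result can be assigned directly, since the positional version keeps df's own index. This is a minimal sketch on the question's data; the helper names fruit_cols and deere are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Company": ["Amazon", "BellTM", "Cadbury"],
        "Apples": [0.75, 0.23, 0.40],
        "Mangoes": [0.60, 0.75, 0.44],
        "Oranges": [0.98, 0.14, 0.86],
    }
)
vendor = pd.DataFrame(
    {"Company": ["Deere"], "Apples": [0.11], "Mangoes": [0.30], "Oranges": [0.79]}
)

# Columns that take part in the correlation (illustrative helper name).
fruit_cols = ["Apples", "Mangoes", "Oranges"]

# Pick the Deere row as a float Series indexed by the fruit columns.
deere = vendor.set_index("Company").loc["Deere", fruit_cols]

# The result is indexed like df (0, 1, 2), so plain assignment lines up row by row.
df["Corrcoef"] = df[fruit_cols].corrwith(deere, axis=1)

print(df)
```

If the set_index('Company') variant is used instead, the result is indexed by company name, so it would need to be mapped back onto the Company column (or converted with .to_numpy()) before assigning.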