In Python, how do I compute the correlation between multiple selected columns (more than 2 variables)?
I have a pandas DataFrame like so:
id cat1 cat2 cat3 num1 num2
1 0 WN 29 2003 98
2 1 TX 12 755 76
3 0 WY 11 845 32
4 1 IL 19 935 46
I want to find the correlation between cat1 and the columns cat3, num1 and num2; or between cat1 and num1 and num2; or between cat2 and cat1, cat3, num1, num2.
When I use df.corr(), it gives the correlation between all the columns in the DataFrame, but I want to see the correlation between just the selected columns detailed above.
How do I do that in Python pandas?
A thousand thanks in advance for your answers.
I tried the following and it worked:
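# Column names are case-sensitive: use 'cat1'/'cat2', not 'Cat1'/'Cat2'
features1 = ['cat1', 'cat2', 'cat3']
features2 = ['cat1', 'cat2', 'num1', 'num2']
# cat2 holds strings, so numeric_only=True (pandas >= 1.5) leaves it out
df[features1].corr(numeric_only=True)
df[features2].corr(numeric_only=True)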
This is a good way to select just the columns you need when your dataset has a very large number of variables.
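For the "one column against several others" case in the question, a minimal sketch might look like the following. It rebuilds the sample data from the question and assumes Pearson correlation (the pandas default) is what is wanted. The variable names subset, corr_matrix and cat1_vs_rest are made up for illustration. cat2 is left out because it holds strings, and Pearson correlation is only defined for numeric columns.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "cat1": [0, 1, 0, 1],
    "cat2": ["WN", "TX", "WY", "IL"],
    "cat3": [29, 12, 11, 19],
    "num1": [2003, 755, 845, 935],
    "num2": [98, 76, 32, 46],
})

# Correlation matrix restricted to a chosen subset of numeric columns
subset = ["cat1", "cat3", "num1", "num2"]
corr_matrix = df[subset].corr()

# Correlation of cat1 against each of the other chosen columns,
# returned as a Series indexed by column name
cat1_vs_rest = df[["cat3", "num1", "num2"]].corrwith(df["cat1"])

print(corr_matrix)
print(cat1_vs_rest)
```

The values in cat1_vs_rest are the same numbers as the cat1 row of corr_matrix, so corr_matrix.loc["cat1", ["cat3", "num1", "num2"]] is an equivalent way to pull them out if the full matrix has already been computed.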