How to use pandas to create the correlation matrix of a multivariate normal distribution?
In R, we can create the correlation matrix like this:
makecov <- function(rho, n) {
  m <- matrix(nrow = n, ncol = n)
  m <- ifelse(row(m) == col(m), 1, rho)
  return(m)
}
For a known correlation, the result is:
makecov(0.2,3)
#      [,1] [,2] [,3]
# [1,]  1.0  0.2  0.2
# [2,]  0.2  1.0  0.2
# [3,]  0.2  0.2  1.0
But in pandas, how can we create the same matrix efficiently? Here is my solution:
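import numpy as np

def makecov(rho, n):
    # Fill with rho/2 so that m + m.T equals rho in every cell.
    m = [rho / 2] * n * n
    m = np.array(m).reshape([n, n])
    return m + m.T - np.diag([rho] * n) + np.diag([1] * n)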
And the result is:
In [21]:makecov(0.2,3)
Out[21]:
array([[ 1. ,  0.2,  0.2],
       [ 0.2,  1. ,  0.2],
       [ 0.2,  0.2,  1. ]])
Is there a more elegant way to do that with pandas?
I would recommend using numpy's covariance matrix method instead: http://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html
In my experience, pandas is better suited to data cleaning and similar tasks. I usually let numpy do the heavy statistical lifting.
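Note that numpy.cov estimates a covariance matrix from observed data; it does not build one from a given rho. As a rough sketch of how it relates to the matrix in the question, the snippet below draws samples from a multivariate normal with that correlation structure and recovers an estimate. The seed, the sample size, and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

# Target correlation matrix: 1 on the diagonal, 0.2 elsewhere.
target = np.full((3, 3), 0.2)
np.fill_diagonal(target, 1.0)

# Draw samples from a zero-mean multivariate normal with unit variances,
# so the covariance matrix and the correlation matrix coincide.
samples = rng.multivariate_normal(mean=np.zeros(3), cov=target, size=20_000)

# rowvar=False: each column is a variable, each row an observation.
est_cov = np.cov(samples, rowvar=False)
est_corr = np.corrcoef(samples, rowvar=False)

print(np.round(est_corr, 2))
```

The estimates only approximate the target and vary with the sample, so this suits checking simulated data rather than constructing the matrix itself.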
It looks like you could do:
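import numpy

def makecov(rho, n):
    # eye(n) + rho has rho off the diagonal and 1 + rho on it
    out = numpy.eye(n) + rho
    # reset the diagonal to exactly 1
    numpy.fill_diagonal(out, 1)
    return out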
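Since the question asks about pandas specifically, one possible way to get a labeled result is to build the values with numpy and wrap them in a DataFrame. This is only a sketch: the helper name equicorr_frame and the column labels are hypothetical, and the construction uses the identity (1 - rho) * I + rho * J (J being the all-ones matrix) rather than either function shown above.

```python
import numpy as np
import pandas as pd


def equicorr_frame(rho, names):
    """Hypothetical helper: labeled matrix with 1 on the diagonal, rho elsewhere."""
    n = len(names)
    # (1 - rho) * I + rho * J gives 1 on the diagonal and rho off it.
    values = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
    return pd.DataFrame(values, index=names, columns=names)


corr = equicorr_frame(0.2, ["x", "y", "z"])
print(corr)
```

pandas adds only the row and column labels here; the numeric work is still done by numpy, which matches the advice above.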