How to use pandas to create a correlation matrix for a multivariate normal distribution?

In R, we could create the correlation matrix like this:

makecov <- function(rho,n) {
    m <- matrix(nrow=n,ncol=n)
    m <- ifelse(row(m)==col(m),1,rho)
    return(m)
}

Given a known correlation, the result would be:

makecov(0.2,3)
#     [,1] [,2] [,3]
#[1,]  1.0  0.2  0.2
#[2,]  0.2  1.0  0.2
#[3,]  0.2  0.2  1.0

But in pandas, how could we create the same matrix efficiently? Here is my solution:

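import numpy as np

def makecov(rho, n):
    # Start with rho/2 everywhere so that m + m.T gives rho in every cell
    m = np.array([rho / 2] * n * n).reshape([n, n])
    # Swap the diagonal from rho to 1
    return m + m.T - np.diag([rho] * n) + np.diag([1] * n)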

And the result would be:

In [21]:makecov(0.2,3)
Out[21]: 
array([[ 1. ,  0.2,  0.2],
       [ 0.2,  1. ,  0.2],
       [ 0.2,  0.2,  1. ]])

Is there a more elegant way to do this with pandas?

I would recommend using numpy's covariance matrix method instead: http://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html

In my experience, pandas is better suited to data cleaning and similar tasks. I usually let numpy do the heavy statistical lifting.
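For reference, here is a minimal sketch of how numpy.cov is typically called. Note that it estimates a covariance matrix from observed samples, so it applies when you already have data rather than a target rho. The sample data, the seed, and the variable names below are made up for illustration.

```python
import numpy as np

# Hypothetical data: 1000 observations of 3 independent standard normal variables
rng = np.random.default_rng(0)  # seed chosen arbitrarily
data = rng.normal(size=(1000, 3))

# rowvar=False tells numpy that each column is a variable and each row an observation
cov = np.cov(data, rowvar=False)

# numpy.corrcoef returns the normalised (correlation) version of the same estimate
corr = np.corrcoef(data, rowvar=False)

print(cov.shape)
print(corr.shape)
```

Both results are 3 x 3 symmetric matrices, and the diagonal of corr is 1 by construction.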

It looks like you could do:

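import numpy

def makecov(rho, n):
    # eye(n) + rho puts rho off the diagonal and 1 + rho on it
    out = numpy.eye(n) + rho
    numpy.fill_diagonal(out, 1)  # reset the diagonal to exactly 1
    return out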
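If the matrix is meant to feed a multivariate normal and end up in pandas, something along these lines might work. This is only a sketch: the column labels, the seed, the sample size, and the use of numpy.full in place of numpy.eye are my own choices for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

def makecov(rho, n):
    # Equicorrelation matrix: rho everywhere, then 1 on the diagonal
    out = np.full((n, n), float(rho))
    np.fill_diagonal(out, 1.0)
    return out

corr = makecov(0.2, 3)

# Wrap in a DataFrame to get labelled rows and columns (labels are arbitrary)
labels = ["a", "b", "c"]
corr_df = pd.DataFrame(corr, index=labels, columns=labels)

# Draw samples using corr as the covariance (unit variances assumed),
# then check that the empirical correlation is close to the target
rng = np.random.default_rng(42)  # seed chosen arbitrarily
samples = rng.multivariate_normal(mean=np.zeros(3), cov=corr, size=10_000)
sample_corr = pd.DataFrame(samples, columns=labels).corr()

print(corr_df)
print(sample_corr.round(2))
```

Because every variance is 1 here, the correlation matrix doubles as the covariance matrix; with other variances you would need to scale it first.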


 