
Covariance Matrix using For Loops in Python

I am trying to write code that computes the covariance matrix of a dataset using for loops instead of NumPy. The code I have so far generates an error:

def cov_naive(X):
    """Compute the covariance for a dataset of size (D,N) 
    where D is the dimension and N is the number of data points"""
    D, N = X.shape
    ### Edit the code below to compute the covariance matrix by iterating over the dataset.
    covariance = np.zeros((D, D))
    mean = np.mean(X, axis=1)
    for i in range(D):
        for j in range(D):
            covariance[i,j] += (X[:,i] - mean[i]) @ (X[:,j] - mean[j])

    return covariance/N

I am running the test below to validate that it works:

# Let's first test the functions on some hand-crafted dataset.
X_test = np.arange(6).reshape(2,3)
expected_test_mean = np.array([1., 4.]).reshape(-1, 1)
expected_test_cov = np.array([[2/3., 2/3.], [2/3.,2/3.]])

print('X:\n', X_test)
print('Expected mean:\n', expected_test_mean)
print('Expected covariance:\n', expected_test_cov)

np.testing.assert_almost_equal(mean(X_test), expected_test_mean)
np.testing.assert_almost_equal(mean_naive(X_test), expected_test_mean)
np.testing.assert_almost_equal(cov(X_test), expected_test_cov)
np.testing.assert_almost_equal(cov_naive(X_test), expected_test_cov)

and get the following error:

AssertionError: 
Arrays are not almost equal to 7 decimals
AssertionError                            Traceback (most recent call last)
<ipython-input-21-6a6498089109> in <module>()
 12 
 13 np.testing.assert_almost_equal(cov(X_test), expected_test_cov)
---> 14 np.testing.assert_almost_equal(cov_naive(X_test), expected_test_cov)

Any help would be greatly appreciated!

The mistake lies in this line:

mean = np.mean(X, axis=1)

It should be:

mean = np.mean(X, axis=0)

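because the loop reads columns (X[:, i]), so the mean has to be computed per column as well (i.e. one mean per dataset dimension). This treats each column of X as one dimension, i.e. an (N, D) layout. With the (D, N) layout from the docstring, where the test expects the per-row mean [1., 4.], axis=1 is already right and the loop should index rows (X[i, :], X[j, :]) instead of columns.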
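For the (D, N) layout described in the question's docstring, a minimal sketch of the loop could look like the following. It assumes each row of X holds the N observations of one dimension and that the covariance is normalized by N, as the expected values in the test suggest. The cross-check with np.cov(..., bias=True) is an addition for illustration and is not part of the original test.

```python
import numpy as np

def cov_naive(X):
    """Covariance of a dataset of shape (D, N): D dimensions, N data points."""
    D, N = X.shape
    covariance = np.zeros((D, D))
    # One mean per dimension, i.e. per row of X.
    mean = np.mean(X, axis=1)
    for i in range(D):
        for j in range(D):
            # X[i, :] holds the N observations of dimension i.
            covariance[i, j] = (X[i, :] - mean[i]) @ (X[j, :] - mean[j])
    # Normalize by N (not N - 1), matching the expected values in the test.
    return covariance / N

X_test = np.arange(6).reshape(2, 3)
expected_test_cov = np.array([[2/3., 2/3.], [2/3., 2/3.]])
np.testing.assert_almost_equal(cov_naive(X_test), expected_test_cov)
# Cross-check against NumPy: rows are variables by default, bias=True divides by N.
np.testing.assert_almost_equal(cov_naive(X_test), np.cov(X_test, bias=True))
```

Indexing rows keeps the mean and the slices on the same axis, which is the property the loop needs regardless of which layout is chosen.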
