
Use of np.linalg.norm for checking the eigenvectors in PCA

I was following a tutorial on PCA and came to the point of selecting the principal components.

This is the link for the tutorial on PCA:

I am stuck at this point in the code. I don't understand what it actually does.

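import numpy as np

# cor_mat2 is assumed to be the correlation matrix built earlier in the tutorial
eigen_values, eigen_vectors = np.linalg.eig(cor_mat2)

# np.linalg.eig returns the eigenvectors as columns, so iterate over the transpose
for ev in eigen_vectors.T:
    np.testing.assert_array_almost_equal(1.0, np.linalg.norm(ev))
print('Everything ok!')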

I would really appreciate it if anyone could help me understand.

What does np.linalg.norm check here?

As can be read in the np.linalg.norm documentation, when called with its default arguments on a 1-D array this function calculates the L2 (Euclidean) norm of the vector.
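As a minimal sketch of what that means in practice, the snippet below compares the default call with the same quantity computed by hand. The vector `v` is just an illustrative value, not something from the tutorial.

```python
import numpy as np

# Illustrative vector (an assumption, chosen so the result is easy to verify).
v = np.array([3.0, 4.0])

# Default call on a 1-D array: the Euclidean (L2) norm, sqrt(3^2 + 4^2).
l2 = np.linalg.norm(v)

# The same value computed by hand: square, sum, take the square root.
l2_manual = np.sqrt(np.sum(v ** 2))

print(l2, l2_manual)  # both are 5.0
```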

All this loop does is ensure that each eigenvector is of unit length, so that each eigenvector's importance for representing the data can be compared using the eigenvalues.
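The check can be tried on a small example. This is only a sketch: `cor_mat` below is a made-up 3x3 correlation matrix standing in for the one from the tutorial. Note that np.linalg.eig stores the eigenvectors as columns, so column `i` belongs to `eigen_values[i]`, and the norms are taken along `axis=0`.

```python
import numpy as np

# Hypothetical correlation matrix (symmetric, ones on the diagonal).
cor_mat = np.array([
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])

eigen_values, eigen_vectors = np.linalg.eig(cor_mat)

# One L2 norm per column, i.e. per eigenvector.
column_norms = np.linalg.norm(eigen_vectors, axis=0)
np.testing.assert_allclose(column_norms, 1.0)

# Each (eigenvalue, eigenvector) pair satisfies A v = lambda v.
for i in range(len(eigen_values)):
    v = eigen_vectors[:, i]
    np.testing.assert_allclose(cor_mat @ v, eigen_values[i] * v, atol=1e-12)

print('Everything ok!')
```

For a symmetric matrix such as a correlation matrix, the eigenvector matrix is orthogonal, so its rows happen to have unit length as well. That is not true for a general matrix, which is why checking the columns is the safer habit.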

Eigenvectors span a new basis for your projection, and as such they are of unit length (as described in the article). They wouldn't have to be, but it's easier that way. You can think of them as new x, y and z axes in 3-D: the canonical basis is always built from vectors that are zero in every dimension except one, where they are one, so x would be (1, 0, 0), y would be (0, 1, 0) and z would be (0, 0, 1).
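To illustrate why unit length is a convention rather than a requirement, here is a small sketch with a hand-picked 2x2 matrix `A` (an assumption for illustration only). Any nonzero multiple of an eigenvector is still an eigenvector, and dividing by the L2 norm picks the unit-length one.

```python
import numpy as np

# Hand-picked symmetric matrix (hypothetical example).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# (1, 1) is an eigenvector of A with eigenvalue 3, but its length is sqrt(2).
v = np.array([1.0, 1.0])
assert np.allclose(A @ v, 3.0 * v)

# Any nonzero multiple is still an eigenvector with the same eigenvalue.
scaled = 10.0 * v
assert np.allclose(A @ scaled, 3.0 * scaled)

# Dividing by the L2 norm gives the unit-length representative.
unit = v / np.linalg.norm(v)
assert np.isclose(np.linalg.norm(unit), 1.0)

# The canonical basis of 3-D space: the rows of the identity matrix,
# (1, 0, 0), (0, 1, 0), (0, 0, 1), each of unit length.
standard_basis = np.eye(3)
assert np.allclose(np.linalg.norm(standard_basis, axis=1), 1.0)
```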

In order to get the new directions that contain the most information about the data (in the linear sense at least, meaning the most variance) and perform dimensionality reduction to your desired size (say N), we have to compare their "influence" on the data. That's what the eigenvalues are used for, and the comparison is only meaningful when the eigenvectors are normalized to the same (unit) scale.
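The selection step could look roughly like the sketch below. Everything here is an assumption for illustration: the data `X` is random, `n_components` plays the role of N, and np.linalg.eigh is swapped in for np.linalg.eig because it is intended for symmetric matrices and always returns real eigenvalues.

```python
import numpy as np

# Synthetic data with correlated columns (hypothetical, not the tutorial's data).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Standardize each column, then build the correlation matrix.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
cor_mat = np.corrcoef(X_std, rowvar=False)

# eigh returns eigenvalues in ascending order, eigenvectors as columns.
eigen_values, eigen_vectors = np.linalg.eigh(cor_mat)

# Sort from largest to smallest eigenvalue, reordering the columns to match.
order = np.argsort(eigen_values)[::-1]
eigen_values = eigen_values[order]
eigen_vectors = eigen_vectors[:, order]

# Share of the total variance carried by each direction.
explained = eigen_values / eigen_values.sum()

# Keep the N most influential directions and project the data onto them.
n_components = 2
W = eigen_vectors[:, :n_components]
X_reduced = X_std @ W

# With unit-length eigenvectors, the variance of the data along each kept
# direction equals the corresponding eigenvalue.
np.testing.assert_allclose(X_reduced.var(axis=0), eigen_values[:n_components])
```

The final assertion is the reason the unit-length check matters: if an eigenvector were scaled by some factor, the variance along it would be scaled by that factor squared and would no longer match its eigenvalue.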
