
Doing PCA and Whitening with MATLAB

My task is to apply PCA and a whitening transform to a given set of 5000 two-dimensional data points.

My understanding of PCA is that it finds the main axis of the data from the eigenvectors of the covariance matrix and then rotates the data so that the main axis lies along the x axis.

So here's what I did.

[BtEvector,BtEvalue]=eig(MYCov);% Eigen value and vector using built-in function

I first calculated the eigenvalues and eigenvectors. The result was

BtEvalue=[4.027487815706757,0;0,8.903923357227459] 

and

BtEvector=[0.033937679569230,-0.999423951036524;-0.999423951036524,-0.033937679569230]

So I figured out that the main axis has the eigenvalue 8.903923357227459 and the eigenvector [-0.999423951036524,-0.033937679569230], which is the second eigenpair.
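As a small sketch of that selection step, using only the numbers quoted above: eig returns the eigenvectors as the columns of its first output, and MATLAB does not promise any particular ordering of the eigenvalues, so it is safer to sort the pairs than to read the position by eye. The names sortedVals, sortedVecs and principalAxis are just illustrative.

```matlab
% Eigen-decomposition results copied from the question.
BtEvalue  = [4.027487815706757, 0; 0, 8.903923357227459];
BtEvector = [ 0.033937679569230, -0.999423951036524;
             -0.999423951036524, -0.033937679569230];

% Sort eigenvalues in descending order and reorder the eigenvector
% columns the same way, so column 1 is always the main axis.
[sortedVals, order] = sort(diag(BtEvalue), 'descend');
sortedVecs = BtEvector(:, order);

% Eigenvector paired with the largest eigenvalue (a column vector).
principalAxis = sortedVecs(:, 1);
```

With the values above, principalAxis comes out as the second column of BtEvector, which agrees with the choice made in the question.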

After that, because the data is two-dimensional, I set cos(theta) = -0.9994... and sin(theta) = -0.033937. Because I thought the main axis of the data (eigenvector [-0.999423951036524,-0.033937679569230]) has to become the x axis, I built the rotation matrix R = [cos(-theta) -sin(-theta); sin(-theta) cos(-theta)]. With the original data set A of size 2*5000, I computed A*R to get the rotated data.
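For comparison, here is a minimal sketch of the rotation step that skips the angle and uses the sorted eigenvector matrix directly. It assumes the samples are stored as columns (2-by-N); the data below is a synthetic stand-in, since the real data set is not shown, and the mixing matrix is arbitrary. With that layout the 2-by-2 matrix has to multiply from the left, because A*R is not defined for a 2-by-5000 A.

```matlab
% Synthetic stand-in for the 2-by-5000 data set (assumption, not the real data).
N = 5000;
A = [3 0.5; 0.5 2] * randn(2, N);     % columns are samples

% Center each variable (row) before estimating the covariance.
Ac = A - mean(A, 2) * ones(1, N);
C  = cov(Ac');                        % cov expects samples in rows, gives 2x2

% Eigenvectors sorted so that column 1 is the main axis.
[V, D] = eig(C);
[~, order] = sort(diag(D), 'descend');
V = V(:, order);

% R = V' is orthogonal. It may include a reflection, which is harmless for PCA.
R = V';
Y = R * Ac;                           % main axis now lies along the x axis

% Covariance of the rotated data: diagonal, largest variance first.
Cy = cov(Y');
```

If the data were stored as N-by-2 instead (samples as rows), the equivalent step would be Ac * V.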

Also, for the whitening case, using Cholesky whitening, I set the whitening transformation matrix to inv(Covariance Matrix).
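One way to check this step numerically is sketched below, again on synthetic stand-in data with samples as columns (an assumption). Cholesky whitening is usually described as choosing W with W'*W = inv(C), that is, a Cholesky factor of the inverse covariance rather than the inverse itself. The sketch compares both choices.

```matlab
% Synthetic stand-in for the centered 2-by-N data (assumption, not the real data).
N  = 5000;
A  = [3 0.5; 0.5 2] * randn(2, N);
Ac = A - mean(A, 2) * ones(1, N);
C  = cov(Ac');

% chol returns an upper-triangular U with U'*U equal to its argument,
% so W satisfies W'*W = inv(C).
W  = chol(inv(C));
Z  = W * Ac;
Cz = cov(Z');                 % close to the 2x2 identity

% Using inv(C) itself as the transform: the resulting covariance is
% inv(C)*C*inv(C)' = inv(C), which is not the identity in general.
Zbad = inv(C) * Ac;
Cbad = cov(Zbad');
```

Comparing Cz with eye(2) and Cbad with inv(C) shows which of the two transforms actually whitens the data.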

Is there something wrong with my algorithm? Could someone point out any errors or misunderstandings? Thanks a lot in advance.

Since your data is two-dimensional, the covariance matrix that you calculated is not accurate. If you only calculate the covariance with respect to one axis (say x), you're assuming that the covariance along the y axis is the identity. This is obviously not true. Although you've attempted to address this, there's a sound procedure that you can use (explained below).

Unfortunately, this is a common mistake. Have a look at this paper, where it is explained exactly how the covariance should be calculated.

In summary, you can calculate the covariance along each axis (Sx and Sy). Then approximate the 2D covariance of the vectorized matrix as kron(Sx,Sy). This will be a better approximation of the 2D covariance.
