
How to get the 1st Principal Component by PCA using Python?

I have a set of 2D vectors stored as an n*2 matrix.

I wish to get the 1st principal component, i.e. the vector that indicates the direction with the largest variance.

I have found rather detailed documentation on this from Rice University.

Based on this, I have imported the data and done the following:

import numpy as np

# Convert the list-of-lists of data points (aListOfLists) into a numpy array.
dataMatrix = np.array(aListOfLists)
# Make a new PCA object from the numpy array.
myPCA = PCA(dataMatrix)

Then how may I get the 3D vector that is the 1st Principal Component?

PCA gives only 2D vectors from 2D data.

Look at the picture in the Wikipedia article on PCA:
starting with a point cloud (dataMatrix) like that, and using matplotlib.mlab.PCA,
myPCA.Wt[0] is the first PC, the long one in the picture.
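
Recent Matplotlib releases no longer include matplotlib.mlab.PCA, so here is a minimal NumPy-only sketch of the same idea. The function name first_principal_component and the sample data are hypothetical, made up for illustration. The sketch only mean-centers the data, so its result may differ from a routine that also rescales each column.

```python
import numpy as np

def first_principal_component(data):
    """Return a unit vector along the direction of largest variance."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)   # PCA works on mean-centered data
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value, so row 0 is the first principal component.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Hypothetical sample data: points scattered roughly along the line y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
points = np.column_stack([t, 2.0 * t + 0.05 * rng.normal(size=200)])

pc1 = first_principal_component(points)
print(pc1.shape)   # (2,)
```

The sign of the returned vector is arbitrary: pc1 and -pc1 describe the same direction.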

It isn't obvious from your example that you are using matplotlib.mlab.PCA, but if so, the documentation states that the returned object has an attribute Wt, which is "the weight vector for projecting a numdims point or array into PCA space".

PCA returns the eigenvalues in descending order (you can tell by looking at the fracs attribute of the returned object). So the first principal component (the first eigenvector) is the first row of Wt.
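
The ordering can also be reproduced by hand. This is a sketch under stated assumptions: it uses plain NumPy rather than matplotlib.mlab.PCA, the helper name pca_eig and the sample points are hypothetical, and the fractions array is only meant to play the role that fracs plays above.

```python
import numpy as np

def pca_eig(data):
    """Eigen-decompose the covariance matrix, largest variance first."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)      # 2x2 covariance for n*2 data
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals = eigvals[order]
    components = eigvecs[:, order].T          # one eigenvector per row
    fractions = eigvals / eigvals.sum()       # share of total variance per component
    return components, fractions

# Hypothetical sample data lying close to the line y = x.
sample = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])
components, fractions = pca_eig(sample)

first_pc = components[0]   # first row: direction of largest variance
```

Because the rows are sorted by descending eigenvalue, components[0] corresponds to the first row of Wt in the sense described above, up to sign and any rescaling the library applies.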

As noted by @denis, your eigenvectors will be 2D (not 3D) since your input data are 2D.
