
R equivalent to the SAS "BY" statement in PRINCOMP Procedure

I am using princomp in R for PCA. My dataset contains a factor variable, and I would like to run princomp separately for each level of that factor.

This can be done in SAS with the BY statement, which "performs BY group processing, which enables you to obtain separate analyses on grouped observations" (from https://support.sas.com/rnd/app/stat/procedures/princomp.html).

Can this be done by princomp in R or do I have to split my data into several datasets and run princomp on each?

All the best,

This is simple in R once you understand a bit about how lists work, so it is worth spending some time with an R tutorial that covers them. Using a data set that ships with R:

data(iris)
str(iris)
# 'data.frame': 150 obs. of  5 variables:
#  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

First, split the data frame into three separate data frames, one for each Species, and store them in a list. The Species column is left out because it is not used in the principal components. Then run the analysis on each group:

iris.spl <- split(iris[, 1:4], iris$Species)
iris.spl.pca <- lapply(iris.spl, prcomp, scale.=TRUE)

To preserve Species in each data frame in the list, you would use the following code:

iris.spl <- split(iris, iris$Species)
iris.spl.pca <- lapply(iris.spl, function(x) prcomp(x[, 1:4], scale.=TRUE))
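The closest base R analogue of the SAS BY statement is probably by(), which splits a data frame on a factor and applies a function to each piece in a single call. A minimal sketch, assuming the same iris columns as above; the names iris.by.pca and iris.ref.pca are only illustrative:

```r
# by() splits the four numeric columns by Species and runs prcomp on each piece;
# extra arguments (here scale. = TRUE) are passed on to prcomp
iris.by.pca <- by(iris[, 1:4], iris$Species, prcomp, scale. = TRUE)

# Reference result from the split()/lapply() approach
iris.ref.pca <- lapply(split(iris[, 1:4], iris$Species), prcomp, scale. = TRUE)

# The two approaches should give the same standard deviations for the first group
all.equal(iris.by.pca[[1]]$sdev, iris.ref.pca[[1]]$sdev)
```

The result of by() prints with a header for each group, which may feel closer to SAS output, while the plain list from lapply() tends to be easier to process further.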

To get the basic results:

iris.spl.pca

To get a particular result use:

iris.spl.pca[[1]] # or iris.spl.pca[["setosa"]]
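Because the results are stored in a list, the same piece can be pulled out of every group at once. The sketch below rebuilds the list so that it runs on its own; prop.var is just an illustrative name:

```r
iris.spl.pca <- lapply(split(iris[, 1:4], iris$Species), prcomp, scale. = TRUE)

# Proportion of variance explained by each component:
# one row per component, one column per species
prop.var <- sapply(iris.spl.pca, function(p) p$sdev^2 / sum(p$sdev^2))
round(prop.var, 3)

# Loadings and the first few scores for a single group
iris.spl.pca$setosa$rotation
head(iris.spl.pca$setosa$x)

# Standard summary table for every group
lapply(iris.spl.pca, summary)
```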

I used prcomp on the advice given in the Details section of the manual page for princomp, which notes that the SVD-based calculation used by prcomp is the preferred method. Using scale.=TRUE analyzes the correlation matrix; removing it would analyze the covariance matrix.
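If you would rather stay with princomp, its cor = TRUE argument plays the same role as scale.=TRUE, and the function can be passed to lapply in the same way. As a quick check on one group (a sketch only; setosa, pc.svd and pc.eig are illustrative names), both functions should reproduce the eigenvalues of the correlation matrix:

```r
setosa <- iris[iris$Species == "setosa", 1:4]

pc.svd <- prcomp(setosa, scale. = TRUE)  # SVD of the centred, scaled data
pc.eig <- princomp(setosa, cor = TRUE)   # eigen decomposition of the correlation matrix

# Component variances from prcomp versus eigenvalues of cor(setosa)
all.equal(pc.svd$sdev^2, eigen(cor(setosa))$values)

# prcomp and princomp standard deviations (princomp names its components)
all.equal(pc.svd$sdev, unname(pc.eig$sdev))
```

Note that without cor = TRUE the two functions will not agree exactly, because princomp divides by n while prcomp divides by n - 1 when computing variances.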
