
R covariance matrix and correlation matrix

Hello, I am using the dystrophy data from the ipred package. I've used subset to separate the carriers from the normals:

carrier = subset(dystrophy,dystrophy$Class == "carrier")
normal = subset(dystrophy,dystrophy$Class == "normal")

and I've reduced the data by selecting only the patients with 1 visit to the hospital:

carrier = subset(carrier,carrier$OBS == "1")
normal = subset(normal,normal$OBS == "1")

Now I would like to practice calculating the mean vector, the covariance matrix and the correlation matrix of the proteins, separately for each group (the Class factor).

I've tried cor and cov, but I think I am doing something wrong. Any help would be appreciated. Thanks!

This may get you started. Using your variables, you can get the means for each of the proteins using:

sapply(carrier[, 6:9], mean, na.rm = TRUE)
sapply(normal[, 6:9], mean, na.rm = TRUE)
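
If you would rather not keep a separate data frame per group, one possible alternative is to split on Class and take the column means of each piece. This is only a sketch: `toy` is a made-up stand-in for the filtered data, its values are invented, and the protein column names (CK, H, PK, LD) are what I assume columns 6:9 of dystrophy are called.

```r
# Toy stand-in for the filtered dystrophy data; all values are invented.
toy <- data.frame(
  CK    = c(50, 60, NA, 200, 300, 250),
  H     = c(80, 85, 90, 95, NA, 100),
  PK    = c(10, 12, 14, 20, 25, 30),
  LD    = c(150, 160, 170, 250, 260, 270),
  Class = factor(c("normal", "normal", "normal",
                   "carrier", "carrier", "carrier"))
)
proteins <- c("CK", "H", "PK", "LD")  # assumed protein column names

# Split the protein columns by Class: one data frame per factor level.
by_class <- split(toy[, proteins], toy$Class)

# Column means per group, skipping missing values.
group_means <- lapply(by_class, colMeans, na.rm = TRUE)

group_means$carrier
group_means$normal
```

Each element of `group_means` is a named numeric vector, i.e. the mean vector for that group. The same `lapply` pattern should work with `cov` or `cor` in place of `colMeans` (passing `use = "pairwise.complete.obs"` instead of `na.rm`).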

For the correlation and covariance you can use:

cor(carrier[,6:9], use="pairwise.complete.obs")
cor(normal[,6:9], use="pairwise.complete.obs")

cov(carrier[,6:9], use="pairwise.complete.obs")
cov(normal[,6:9], use="pairwise.complete.obs")

The 6:9 index restricts the computation to the protein columns and leaves out other variables such as age. The use="pairwise.complete.obs" argument handles the missing values: each entry of the matrix is computed from the rows where both variables of that pair are present.
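
To see what the pairwise option does, here is a small sketch comparing it with use="complete.obs", which drops every row that has a missing value in any column. The data frame `d` and its values are invented purely for illustration.

```r
# Invented values: one NA in x and one NA in y, on different rows.
d <- data.frame(
  x = c(1, 2, 3, 4, NA),
  y = c(2, 4, NA, 8, 10),
  z = c(5, 3, 4, 1, 2)
)

# Pairwise deletion: each entry uses the rows where that pair is complete.
cov_pair <- cov(d, use = "pairwise.complete.obs")

# Listwise deletion: only rows with no NA in any column are used.
cov_list <- cov(d, use = "complete.obs")

# z has no missing values, so its variance uses all 5 rows under
# pairwise deletion but only the 3 fully complete rows under listwise.
cov_pair["z", "z"]
cov_list["z", "z"]
```

Pairwise deletion keeps more data, but because different entries can come from different subsets of rows, the resulting matrix is not guaranteed to be positive semi-definite. If that matters for what you do next, "complete.obs" may be the safer choice.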

