
Linear regression when response values are high-dimensional

I am trying to fit a linear regression to some data I just got, but I do not know how to start. The problem is that each response (y) is multi-dimensional: a vector rather than a single number.

For example:

sample 1, y <- c(3,7,10,36,23), while x1 <- 3, x2 <- 2, x3 <- 12, ....
sample 2, y <- c(4,5,13,21,9), while x1 <- 4, x2 <- 5, x3 <- 7, ....
....

You can do this fairly easily in R.

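# toy data from the question: one row per sample, five responses and three predictors
df <- data.frame(y1=c(3,4), y2=c(7,5), y3=c(10,13), y4=c(36,21), y5=c(23,9),
                 x1=c(3,4), x2=c(2,5), x3=c(12,7))
# cbind() on the left-hand side makes lm() fit all five responses in one call
lmod1 <- lm(cbind(y1,y2,y3,y4,y5)~x1+x2+x3,data=df)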

# or, equivalently, pass the responses and the predictors as matrices
y <- matrix(c(df$y1,df$y2,df$y3,df$y4,df$y5),ncol=5)
x <- matrix(c(df$x1,df$x2,df$x3),ncol=3)
lmod2 <- lm(y~x)

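Note: in this example the coefficients for x2 and x3 come back as NA. That is only because two observations cannot identify four coefficients per response (an intercept plus three predictors); it should work on the real data set, provided it has more observations than coefficients.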
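To see the same call with enough rows to estimate every coefficient, here is a small sketch on simulated data. The data frame `sim`, the sample size and the "true" coefficients are all made up for illustration; only the `lm(cbind(...) ~ ...)` call mirrors the code above.

```r
set.seed(1)   # make the simulated data reproducible
n <- 20       # hypothetical sample size, well above the 4 coefficients per response

# Three simulated predictors
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Five responses, each a noisy linear function of the predictors (made-up coefficients)
for (j in 1:5) {
  sim[[paste0("y", j)]] <- j + 0.5 * sim$x1 - 0.3 * sim$x2 + 0.1 * sim$x3 + rnorm(n)
}

# Same formula as before: a matrix of responses on the left-hand side
fit <- lm(cbind(y1, y2, y3, y4, y5) ~ x1 + x2 + x3, data = sim)

# One column of coefficients per response, one row per term (intercept, x1, x2, x3)
coef_mat <- coef(fit)
print(coef_mat)
```

With more observations than coefficients, none of the entries in `coef_mat` should be NA.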

Edit: I should probably also add that this approach is equivalent to fitting a separate univariate regression for each response, so the coefficient estimates take no account of any correlation between y1, y2, y3, etc.
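The point about separate univariate models can be checked directly. The sketch below uses made-up simulated data (the names `d`, `joint`, `sep1`, `sep2` are just for illustration) and compares the joint fit with one `lm()` per response. It also computes the residual covariance between the responses, which is where any dependence between them would show up.

```r
set.seed(42)  # reproducible simulated data
n <- 30       # hypothetical sample size

d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y1 <-  1 + 2.0 * d$x1 - d$x2 + rnorm(n)   # made-up coefficients
d$y2 <- -1 + 0.5 * d$x1 + d$x2 + rnorm(n)

# Joint fit with a matrix response
joint <- lm(cbind(y1, y2) ~ x1 + x2, data = d)

# Separate univariate fits, one per response
sep1 <- lm(y1 ~ x1 + x2, data = d)
sep2 <- lm(y2 ~ x1 + x2, data = d)

# Each column of the joint coefficient matrix matches the corresponding separate fit
same_y1 <- isTRUE(all.equal(unname(coef(joint)[, "y1"]), unname(coef(sep1))))
same_y2 <- isTRUE(all.equal(unname(coef(joint)[, "y2"]), unname(coef(sep2))))

# Residual covariance between responses (2 x 2); the off-diagonal entry
# is the part the coefficient estimates ignore
resid_cov <- crossprod(residuals(joint)) / df.residual(joint)
```

If the off-diagonal entries of `resid_cov` are far from zero in the real data, the point estimates are still the same, but joint tests across responses (for example via `anova()` on the fitted object) would be the place to account for that correlation.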
