Linear Regression in For Loop

I get an error when running the code below. I have not figured out what I am doing wrong; sorry if it is obvious, I am new to R. The idea is to "generate" 100 regressions and output the estimated slope 100 times.

set.seed(21) 
x <- seq(1,40,1) 
for (i in 1:100 ) {
  y[i] = 2*x+1+5*rnorm(length(x))
  reg[i] <- lm(y[i]~x)
  slp[i] <-  coef(reg[i])[2]
  }

You need to create the objects y, reg, and slp before the loop in order to assign to position i, as in y[i] <- . Here only slp actually has to be indexed, so you can do something along these lines:

set.seed(21) 
x <- seq(1,40,1) 
slp <- numeric(100)
for (i in 1:100 ) {
  y <- 2*x+1+5*rnorm(length(x))
  reg <- lm(y~x)
  slp[i] <-  coef(reg)[2]
}

   > slp
  [1] 2.036344 1.953487 1.949170 1.961897 2.098186 2.027659 2.002638 2.107278
  [9] 2.036880 1.980800 1.893701 1.925230 1.927503 2.073176 2.101303 1.943719
      ...
 [97] 1.966039 2.041239 2.063801 2.066801
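To make the cause of the original error concrete, here is a minimal sketch of the two indexing problems in isolation. The names slp_demo, y_demo, and slp_ok are hypothetical and used only for illustration; tryCatch is there just so the script keeps running and the messages can be printed.

```r
# 1) Assigning into position i of an object that does not exist yet is an error.
#    (slp_demo is a hypothetical name that has not been created beforehand.)
res_missing <- tryCatch({
  slp_demo[1] <- 2.0
  "no error"
}, error = function(e) conditionMessage(e))
print(res_missing)

# 2) Assigning a whole vector (length 40) to a single position triggers a
#    warning, because one slot cannot hold 40 values.
y_demo <- numeric(100)
res_length <- tryCatch({
  y_demo[1] <- 1:40
  "no warning"
}, warning = function(w) conditionMessage(w))
print(res_length)

# 3) Pre-allocating first makes indexed assignment of a single value work.
slp_ok <- numeric(3)
slp_ok[2] <- 1.5
print(slp_ok)
```

This is why the corrected loop only indexes slp: it is the one object that holds a single number per iteration.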

There are several problems with the way you use indexing. It would probably help to go back through a short beginner's tutorial on R before "rushing" into loops and regressions...

In the end, you want a vector containing 100 slope values. You need to define this (empty) vector 'slp' before running the loop and then fill its i-th element at each iteration.

On the other hand: 1) at each iteration you don't fill the i-th element of y but create a whole new vector y with as many values as there are in x, so y should not be indexed; 2) you don't need to keep every regression, so you don't need to index your object reg either.

So here it is:

set.seed(21) 
x <- seq(1,40,1) 
slp=rep(NA,100)
for (i in 1:100) {
    y = 2*x+1+5*rnorm(length(x))
    reg <- lm(y~x)
    slp[i]<-coef(reg)[2]
}
print(slp)
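If you did want to keep every fitted model (for example to inspect residuals later), a list is the usual container, since a single vector element cannot hold an lm object. The following is only a sketch under that assumption; the names n_sim, fits, and slp_from_list are hypothetical and not part of the original question.

```r
set.seed(21)
x <- seq(1, 40, 1)
n_sim <- 100                        # hypothetical name: number of simulated regressions

# Pre-allocate a list with one slot per fitted model.
fits <- vector("list", n_sim)
for (i in seq_len(n_sim)) {
  y <- 2 * x + 1 + 5 * rnorm(length(x))
  fits[[i]] <- lm(y ~ x)            # double brackets store the whole lm object
}

# Extract the slope (second coefficient) from each stored fit.
slp_from_list <- vapply(fits, function(fit) unname(coef(fit)[2]), numeric(1))
print(head(slp_from_list))
```

Note the double brackets in fits[[i]]: single brackets would try to spread the components of the lm object over several list slots, which is essentially what reg[i] <- lm(...) attempted in the question.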

In addition to the other answers, there is a better (more efficient and easier) possibility: lm accepts a matrix as the response y and then fits one regression per column:

set.seed(21)
y <- matrix(rep(2*x + 1, 100) + 5 *rnorm(length(x) * 100), ncol = 100)
reg1 <- lm(y ~ x)
slp1 <- coef(reg1)[2,]
all.equal(slp, slp1)
#[1] TRUE
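As a sanity check on the matrix approach, the sketch below confirms that the coefficients come back as a matrix with one column per simulated response, and that a single column matches a separate lm call on that column alone. The names y_mat, fit_all, fit_k, and k are hypothetical, and the choice of column 7 is arbitrary.

```r
set.seed(21)
x <- seq(1, 40, 1)

# One simulated response per column: 40 rows, 100 columns.
y_mat <- matrix(rep(2 * x + 1, 100) + 5 * rnorm(length(x) * 100), ncol = 100)
fit_all <- lm(y_mat ~ x)

# Rows are the coefficients (intercept, slope); columns are the responses.
coef_dim <- dim(coef(fit_all))
print(coef_dim)

# Refit one arbitrary column on its own and compare the coefficients.
k <- 7
fit_k <- lm(y_mat[, k] ~ x)
same_k <- all.equal(unname(coef(fit_all)[, k]), unname(coef(fit_k)))
print(same_k)
```

Because the coefficient matrix has the slope in its second row, coef(reg1)[2,] in the answer above returns all 100 slopes at once.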

If you had a function other than lm and needed a loop, you should use replicate instead of a for loop:

set.seed(21) 
slp2 <- replicate(100, {
  y = 2*x+1+5*rnorm(length(x))
  reg <- lm(y~x)
  unname(coef(reg)[2])
})
all.equal(slp, slp2)
#[1] TRUE
