How do I save the results of this for loop as a vector rather than as a single value (in R)?

Question

I am new to for-loops and am having trouble saving the results of a for loop in the way that I want.

The loop I'm currently running looks like this:

# Setup objects
n = 100
R = (1:1000)
P = seq(-.9, .9, .1)
betahat_OLS = rep(NA, 1000)
Bhat_OLS = rep(NA, 19)

# Calculate betahat_OLS for each p in P and each r in R
for (p in P) {
  for (r in R) {
    # Simulate data
    v = rnorm(n)
    e = rnorm(n)
    z = rnorm(n)
    u = p*v+e
    x = z+v
    y = 0*x+u
    #Calculate betahat_OLS
    betahat_OLS[r] = sum(x*y)/sum(x^2)
  }
  #Calculate Bhat_OLS
  Bhat_OLS = sum(betahat_OLS)/1000-0
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(P, Bhat_OLS)

The loop seems to work correctly, except that I would like to end up with 19 values of Bhat_OLS and currently get only 1. I want a Bhat_OLS value for each value of p in P so that I can plot Bhat_OLS against p; I just don't know how to tell R to do that.

Any help would be greatly appreciated!

Answer 1

You can write your results into a data frame with two columns, P and Bhat_OLS. A new row is appended on each pass of the outer loop, so earlier results are not overwritten.

# Setup objects
n = 100
R = (1:1000)
P = seq(-.9, .9, .1)
betahat_OLS = rep(NA, 1000)
Bhat_OLS = rep(NA, 19)

# initialize result data frame
results <- data.frame(matrix(ncol = 2, nrow = 0, 
                      dimnames = list(NULL, c("P", "Bhat_OLS"))))

# Calculate betahat_OLS for each p in P and each r in R
for (p in P) {
    for (r in R) {
        # Simulate data
        v = rnorm(n)
        e = rnorm(n)
        z = rnorm(n)
        u = p*v+e
        x = z+v
        y = 0*x+u
        #Calculate betahat_OLS
        betahat_OLS[r] = sum(x*y)/sum(x^2)
    }
    #Calculate Bhat_OLS
    Bhat_OLS = sum(betahat_OLS)/1000-0
    
    # insert P and Bhat_OLS into results
    results[nrow(results) + 1,] = c(p, Bhat_OLS)
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(results$P, results$Bhat_OLS)
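Appending rows works fine for 19 values of p. As a hedged variant of the same idea, the data frame could also be created with one row per p up front and filled by position. The sketch below is an assumption-laden illustration, not the answerer's code: the name n_rep, the seed, and the reduced replication count (200 instead of 1000) are chosen here only to keep it quick.

```r
set.seed(1)      # arbitrary seed, only so the sketch is reproducible
n <- 100
n_rep <- 200     # assumption: fewer replications than the original 1000
P <- seq(-.9, .9, .1)

# One row per p; the result column starts out as NA and is filled in the loop
results <- data.frame(P = P, Bhat_OLS = NA_real_)

for (i in seq_len(nrow(results))) {
  betahat_OLS <- numeric(n_rep)
  for (r in seq_len(n_rep)) {
    # Simulate data
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- results$P[i] * v + e
    x <- z + v
    y <- 0 * x + u
    betahat_OLS[r] <- sum(x * y) / sum(x^2)
  }
  # Average over the replications and store it in row i
  results$Bhat_OLS[i] <- mean(betahat_OLS)
}
```

Calling plot(results$P, results$Bhat_OLS) afterwards gives the same kind of scatterplot as above.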

Answer 2

Looping over the values of P directly makes indexing difficult, because p is then a value such as -0.9 rather than a position in the result vector. You could loop over seq(P) instead and use P[i] inside the loop. Also, at the end you need to assign to Bhat_OLS[i]. Then it works.

# Setup objects
n <- 100
R <- (1:1000)
P <- seq(-.9, .9, .1)
betahat_OLS <- rep(NA, length(R))
Bhat_OLS <- rep(NA, length(P))

set.seed(42)  ## for sake of reproducibility

# Calculate betahat_OLS for each p in P and each r in R
for (i in seq(P)) {
  for (r in R) {
    # Simulate data
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- P[i]*v + e
    x <- z + v
    y <- 0*x + u
    #Calculate betahat_OLS
    betahat_OLS[r] <- sum(x*y)/sum(x^2)
  }
  #Calculate Bhat_OLS
  Bhat_OLS[i] <- sum(betahat_OLS)/1000 - 0
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(P, Bhat_OLS, xlim=c(-1, 1))
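To show the indexing point in isolation, here is a toy sketch. The doubling is a placeholder computation, not part of the simulation, and the name out is made up for illustration. It uses seq_along(P), which gives the same positions as seq(P) here and is the more defensive choice if P could ever have length one.

```r
P <- seq(-.9, .9, .1)
out <- rep(NA_real_, length(P))    # one slot per element of P

# `for (p in P)` would hand you the values -0.9, -0.8, ...,
# which cannot be used as positions in `out`.
# Looping over positions gives both the slot (i) and the value (P[i]):
for (i in seq_along(P)) {
  out[i] <- 2 * P[i]               # placeholder for the real computation
}

length(out)   # one result per p
```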

Alternative solution `vapply`

In a more R-ish way (right now it is more C-ish), you could define the simulation in a function sim() and use vapply for the outer loop. (vapply could also replace the inner loop, but I've tested it and it is faster this way.)

sim <- \(p, n=100, R=1:1000) {
  r <- rep(NA, max(R))
  for (i in R) {
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- p*v + e
    x <- z + v
    y <- 0*x + u
    r[i] <- sum(x*y)/sum(x^2)
  }
  return(sum(r/1000 - 0))
}

set.seed(42)
Bhat_OLS1 <- vapply(seq(-.9, .9, .1), \(p) sim(p), 0)

stopifnot(all.equal(Bhat_OLS, Bhat_OLS1))

Note: the `\(p)` shorthand for anonymous functions requires R 4.1.0 or later. Tested with:

R.version.string
# [1] "R version 4.1.2 (2021-11-01)"
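For comparison, the inner loop can also be written with replicate(). This is a rough sketch rather than a recommendation, since the answer above reports the explicit inner loop as faster. The helper name sim_one and the result name Bhat_OLS2 are made up for illustration, and function(p) is used instead of `\(p)` so that it also runs on older R versions.

```r
# Hypothetical helper: one simulated data set, returns a single betahat_OLS
sim_one <- function(p, n = 100) {
  v <- rnorm(n)
  e <- rnorm(n)
  z <- rnorm(n)
  u <- p * v + e
  x <- z + v
  y <- 0 * x + u
  sum(x * y) / sum(x^2)
}

set.seed(42)
P <- seq(-.9, .9, .1)

# For each p: repeat the simulation 1000 times and average the estimates
Bhat_OLS2 <- vapply(P, function(p) mean(replicate(1000, sim_one(p))), numeric(1))
```

The result has one element per value in P, so plot(P, Bhat_OLS2) works directly.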
