How to make the loop faster?

Question

My code looks like the following. Is there a better way to make it faster?

pos=NULL
row=data.frame(matrix(nrow=216,ncol=4))
colnames(row)=c("sub","subi","group","trial")
for (i in 1:100000){
  row$sub="Positive"
  row$subi=NA
  row$group=NA
  row$subi[1:144]=c(1:144)
  row$group[1:144]=1
  row$subi[145:216]=c(1:72)
  row$group[145:216]=2
  row$trial=i
  pos=rbind(pos,row)
}

Answer 1

No loop needed. You can build a data.frame or tibble (my example) on your own.

In case you want to adjust the number of rows later, it is kept in a variable:

library(dplyr)

n_rows <- 10000

tibble(
  trial = 1:n_rows,
  sub = "positive",
  # first 216 rows carry the subject index and group; the rest are padded with NA
  subi = c(1:144, 1:72, rep(NA, n_rows - 216)),
  group = c(rep(1, 144), rep(2, 72), rep(NA, n_rows - 216))
)

Output is:

# A tibble: 10,000 × 4
   trial sub       subi group
   <int> <chr>    <int> <dbl>
 1     1 positive     1     1
 2     2 positive     2     1
 3     3 positive     3     1
 4     4 positive     4     1
 5     5 positive     5     1
 6     6 positive     6     1
 7     7 positive     7     1
 8     8 positive     8     1
 9     9 positive     9     1
10    10 positive    10     1
# … with 9,990 more rows

Answer 2

It looks like you are trying to replicate this data frame 100,000 times, with each iteration of the frame having a different trial number.

data.frame(sub = rep("Positive", 216), 
           subi = c(1:144, 1:72), 
           group = rep(c(1, 2), c(144, 72)))

The replicate function is great for running static code multiple times. So one option is to create your copies (100,000 in your case; the example below uses n = 100 to keep it quick) and then update the trial number.

FrameList <- 
  replicate(n = 100, 
            {
              data.frame(sub = rep("Positive", 216), 
                         subi = c(1:144, 1:72), 
                         group = rep(c(1, 2), c(144, 72)), 
                         trial = rep(NA_real_, 216))
            }, 
            simplify = FALSE)

To update the trial number, you can go with a for loop

for (i in seq_along(FrameList)){
  # index the i-th frame; FrameList$trial would add a list element instead
  FrameList[[i]]$trial <- i
}

or you can try something fancy-pants, but taking a lot more code

FrameList <- mapply(function(FL, i){
                      FL$trial <- i 
                      FL
                    },
                    FrameList, 
                    seq_along(FrameList), 
                    SIMPLIFY = FALSE)

Whichever way you go, you can stack them all together with

Frame <- do.call("rbind", FrameList)

This certainly isn't the most elegant way to do this, so watch for others to give you other clever tricks. But this, I would guess, would be the basic process to follow.
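
The replicate-then-update steps above can also be collapsed into a single pass that sets the trial number while each frame is built. This is only a minimal sketch: the helper name make_frame and the small n_trials value are assumptions for illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): builds one 216-row
# block for a given trial number.
make_frame <- function(i) {
  data.frame(sub   = rep("Positive", 216),
             subi  = c(1:144, 1:72),
             group = rep(c(1, 2), c(144, 72)),
             trial = rep(i, 216))
}

n_trials <- 100  # assumed demo size; the question uses 100000

# Build every block with its trial number already set, then stack once.
FrameList <- lapply(seq_len(n_trials), make_frame)
Frame <- do.call("rbind", FrameList)

dim(Frame)  # 100 trials x 216 rows each, 4 columns
```

Stacking once at the end avoids growing the data frame inside the loop, which is what makes the code in the question slow.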

Answer 3

The only thing different in each pass of the loop is trial. rep is your friend. For the other columns, R will automatically recycle to match the longest column (here, it is trial with 21.6M rows).

pos <- data.frame(
  sub = "Positive",
  subi = c(1:144, 1:72),
  group = rep.int(1:2, c(144, 72)),
  trial = rep(1:1e5, each = 216)
)
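
To see the recycling at work, the vectorized construction can be compared against a loop on a small number of trials. This is a sketch under assumptions: n_trials, fast, slow and block are illustrative names, and the loop is a simplified stand-in for the code in the question rather than a copy of it.

```r
n_trials <- 5  # small demo size (assumption); the question uses 1e5

# Vectorized construction, relying on data.frame() recycling shorter columns.
fast <- data.frame(
  sub = "Positive",
  subi = c(1:144, 1:72),
  group = rep.int(1:2, c(144, 72)),
  trial = rep(seq_len(n_trials), each = 216)
)

# Loop-based construction in the style of the question, for comparison.
slow <- NULL
for (i in seq_len(n_trials)) {
  block <- data.frame(
    sub = "Positive",
    subi = c(1:144, 1:72),
    group = rep.int(1:2, c(144, 72)),
    trial = i
  )
  slow <- rbind(slow, block)
}

# Compare values only; row names and other attributes are ignored.
all.equal(fast, slow, check.attributes = FALSE)
```

If the comparison holds on a few trials, the same call with 1:1e5 should produce the full 21.6M-row result without any loop.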

3 answers

solution1 1 2022-05-12 10:26:15

solution2 0 2022-05-12 11:34:44

solution3 0 ACCEPTED 2022-05-12 12:31:25
