I have some lines of code with a for loop that look like this:
somevector2 <- c(length = somevector2_length)
for (string in somevector1) {
  df2 <- df1[df1$col1 == string, ]
  ff <- somefunction(df2$col2)
  somevector2 <- c(somevector2, ff)
}
From what I understood, initializing the vector with the correct length should make the loop faster, but it still takes quite some time, even though somefunction(df2$col2) only does some simple operations. somevector1 is just a vector of strings.
Is there a way to make this loop faster in R? Thank you very much.
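Sorry, but that's not how you are supposed to post a question on SO. :( You should provide a working example. Also, that's not the way to create a vector of a fixed length: c(length = n) builds a length-one vector whose single element is named "length", whereas numeric(n) or vector("numeric", n) allocates n elements.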
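To illustrate the point about fixed-length vectors, here is a minimal sketch. The value of n and the squaring loop are made up for illustration and are not taken from the question; the only point is the difference between c(length = n) and numeric(n).

```r
n <- 3  # hypothetical length, for illustration only

# c(length = n) does not preallocate: it builds a length-one vector
# whose single element is named "length"
not_preallocated <- c(length = n)
length(not_preallocated)
#> [1] 1

# numeric(n) allocates n zeros that a loop can overwrite by index
squares <- numeric(n)
for (i in seq_len(n)) {
  squares[i] <- i^2
}
squares
#> [1] 1 4 9
```

Note that preallocating only helps if the loop assigns by index; growing the vector with c(somevector2, ff) copies it on every iteration regardless of how it was initialized.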
Let's see a reproducible example of what you posted:
##### this makes your example reproducible
somevector1 <- unique(iris$Species)
df1 <- iris
names(df1) <- paste0("col", 5:1)
somefunction <- sum
somevector2_length <- 3
##### this is your code
# somevector2 <- c(length = somevector2_length) # <- this was wrong
somevector2 <- c()
for (string in somevector1) {
  df2 <- df1[df1$col1 == string, ]
  ff <- somefunction(df2$col2)
  somevector2 <- c(somevector2, ff)
}
So this is the final result:
somevector2
#> 12.3 66.3 101.3
What I suggest is to use the line of code below instead of your loop. You will get the same values, but as a NAMED numeric vector.
tapply(df1$col2, df1$col1, somefunction)
#> setosa versicolor virginica
#> 12.3 66.3 101.3
You can get rid of the names with unname().
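As a rough sketch of how this fits together, the block below rebuilds the same iris-based setup, drops the names from the tapply() result, and shows one possible alternative using split() with vapply(). The names result and result_vapply are introduced here for illustration only, and the vapply() variant is an additional option, not part of the original suggestion.

```r
# Same reproducible setup as above
df1 <- iris
names(df1) <- paste0("col", 5:1)
somefunction <- sum

# Grouped computation, then drop the group names
result <- tapply(df1$col2, df1$col1, somefunction)
unname(result)

# Alternative: split the values by group and apply the function to each piece;
# numeric(1) declares that each call must return a single number
result_vapply <- vapply(split(df1$col2, df1$col1), somefunction, numeric(1))
unname(result_vapply)
```

Both versions avoid subsetting the whole data frame once per string, which is where most of the time in the original loop is likely to go.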