I am trying to count how many points lie within 50 km of each center point, but this is currently done in a for loop. Is it possible to vectorize this? Below is a reproducible snippet. Thank you.
library(geosphere)
centers <- as.data.frame(matrix(rnorm(10, mean = 40, sd = .5), ncol = 2, byrow = TRUE))
points <- matrix(rnorm(100, mean = 40, sd = 1), ncol = 2, byrow = TRUE)
for (i in seq_len(nrow(centers))) {
  # Count the points that lie within 50 km of each center point
  centers[i, 3] <- sum(geosphere::distHaversine(points, centers[i, 1:2]) / 1000 < 50,
                       na.rm = TRUE)
}
I don't think you can really vectorise the function if it can process only one point at a time. You can replace the for loop with sapply and see if there is any performance improvement.
library(geosphere)
centers$total <- sapply(seq_len(nrow(centers)), function(i) {
  # Use only the two coordinate columns, in case centers already has extra columns
  sum(distHaversine(points, centers[i, 1:2]) / 1000 < 50, na.rm = TRUE)
})
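If a loop-free version is wanted, one option is to build the whole points-by-centers distance matrix at once and count per column. The following is only a minimal sketch in base R, not something taken from the answers here: haversine_matrix is a hypothetical helper, it assumes the columns are (longitude, latitude) in degrees, and it assumes the same Earth radius (6378137 m) that distHaversine uses by default.

```r
# Hypothetical helper (not part of geosphere): haversine distances in metres
# between every row of `a` and every row of `b`, both (lon, lat) in degrees.
haversine_matrix <- function(a, b, r = 6378137) {
  to_rad <- pi / 180
  lon1 <- a[, 1] * to_rad
  lat1 <- a[, 2] * to_rad
  lon2 <- b[, 1] * to_rad
  lat2 <- b[, 2] * to_rad
  dlat <- outer(lat1, lat2, "-")   # nrow(a) x nrow(b) latitude differences
  dlon <- outer(lon1, lon2, "-")   # nrow(a) x nrow(b) longitude differences
  h <- sin(dlat / 2)^2 + outer(cos(lat1), cos(lat2)) * sin(dlon / 2)^2
  h[h > 1] <- 1                    # guard against rounding just above 1
  2 * r * asin(sqrt(h))
}

set.seed(1)  # illustrative data, same shape as in the question
centers_xy <- matrix(rnorm(10, mean = 40, sd = 0.5), ncol = 2, byrow = TRUE)
points_xy  <- matrix(rnorm(100, mean = 40, sd = 1), ncol = 2, byrow = TRUE)

d <- haversine_matrix(points_xy, centers_xy)  # one row per point, one column per center
counts <- colSums(d / 1000 < 50)              # points within 50 km of each center
```

geosphere also provides distm(), which returns a similar distance matrix; the sketch avoids it only to stay dependency-free. Whether this is faster than sapply depends on the sizes involved, since the full matrix has to fit in memory.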
You can use split with row and sapply, followed by a colSums:
library(geosphere)
centers$res <- colSums(
  sapply(split(as.matrix(centers[, 1:2]), row(centers)[, 1:2]),
         distHaversine, p1 = points) / 1000 < 50, na.rm = TRUE)
It gives the same:
# compute the old result to compare with
for (i in seq_len(nrow(centers)))
  centers[i, 4] <- sum(geosphere::distHaversine(points, centers[i, 1:2]) / 1000 < 50,
                       na.rm = TRUE)
# gives the same
all.equal(centers$res, centers[, 4])
#R> [1] TRUE
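To see why split with row hands distHaversine one center at a time, here is a small sketch on a toy matrix. The matrix m and its values are made up purely for illustration.

```r
# Toy 3 x 2 matrix: column 1 plays the role of longitude, column 2 of latitude
m <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)

# row(m) has the same shape as m and holds the row index of every cell,
# so splitting on it groups the cells of each row together
rows <- split(m, row(m))

# Each list element is now one row of m as a plain length-2 vector
first_row <- rows[["1"]]
```

Each element of rows is a length-2 coordinate vector, which is the shape distHaversine accepts as a single point, so sapply can pass them on one by one.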
A possible alternative is:
dists <- tapply(as.matrix(centers[, 1:2]), row(centers[, 1:2]),
distHaversine, p1 = points)
centers$res <- colSums(simplify2array(dists) / 1000 < 50, na.rm = TRUE)
Or use an anonymous function. This is like Ronak Shah's answer, but with tapply:
centers$res <- c(tapply(
as.matrix(centers[, 1:2]), row(centers[, 1:2]), function(x)
sum(distHaversine(points, x) / 1000 < 50, na.rm = TRUE)))
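As a rough illustration of the tapply pattern above, the sketch below applies a function per row of a toy matrix. The matrix m is invented for the example, and sum stands in for the distance count.

```r
# Toy matrix; each row stands in for one center
m <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)

# Group the cells by row index and apply the function once per row
per_row <- tapply(m, row(m), function(x) sum(x))

# tapply returns a 1-d array; c() flattens it to an ordinary vector
flat <- c(per_row)
```

Here flat holds one value per row, which is why the result can be assigned directly to a data frame column.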