How to vectorize a for loop for coordinates calculation in R?

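I am trying to find out how many points fall within each center, but at the moment this is done in a for loop. Is it possible to vectorize it? A reproducible snippet is shown below. Thank you.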

require(geosphere)

centers <- as.data.frame(matrix(rnorm(10, mean = 40, sd = .5), ncol = 2, byrow = TRUE))
points <- matrix(rnorm(100, mean = 40, sd = 1), ncol = 2, byrow = TRUE)

for (i in seq_len(nrow(centers))) {
  # Count the points that lie within 50 km of each center point
  centers[i,3] <- sum(geosphere::distHaversine(points, 
                                               centers[i,c(1:2)]) /
                        1000 < 50, na.rm = TRUE)
}

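If the function can only handle one point at a time, I don't think you can really vectorize it. You can replace the for loop with sapply and see whether there is any performance improvement.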

library(geosphere)

centers$total <- sapply(seq_len(nrow(centers)), function(i) {
  # pass only the two coordinate columns, in case centers has gained extra columns
  sum(distHaversine(points, centers[i, 1:2]) / 1000 < 50, na.rm = TRUE)
})
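
If the loop itself turns out to be the bottleneck, one way around the one-center-at-a-time limitation is to skip distHaversine and compute the haversine formula for all point/center pairs at once with outer(). The following is only a sketch in base R: haversine_matrix is a hypothetical helper name, not part of geosphere, and it assumes (longitude, latitude) columns in degrees and the same default radius that distHaversine uses (6378137 m).

```r
# Hypothetical helper (not from geosphere): haversine distance in metres
# for every (point, center) pair, computed without an explicit loop.
# Assumes columns are (longitude, latitude) in degrees.
haversine_matrix <- function(p, q, r = 6378137) {
  p <- as.matrix(p) * pi / 180                 # points:  n x 2, in radians
  q <- as.matrix(q) * pi / 180                 # centers: m x 2, in radians
  dlon <- outer(p[, 1], q[, 1], "-")           # n x m longitude differences
  dlat <- outer(p[, 2], q[, 2], "-")           # n x m latitude differences
  a <- sin(dlat / 2)^2 +
    outer(cos(p[, 2]), cos(q[, 2])) * sin(dlon / 2)^2
  2 * r * asin(pmin(sqrt(a), 1))               # clamp to 1 against rounding error
}

set.seed(1)  # seed added here only to make the sketch reproducible
centers <- as.data.frame(matrix(rnorm(10, mean = 40, sd = .5), ncol = 2, byrow = TRUE))
points  <- matrix(rnorm(100, mean = 40, sd = 1), ncol = 2, byrow = TRUE)

d <- haversine_matrix(points, centers[, 1:2])   # one row per point, one column per center
within_50km <- colSums(d / 1000 < 50, na.rm = TRUE)
```

The trade-off is memory: the distance matrix has nrow(points) * nrow(centers) entries, so for very large inputs it may be necessary to process the centers in chunks.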

You can use split, row, and sapply followed by colSums:

library(geosphere)
centers$res <- colSums(
  sapply(split(as.matrix(centers[, 1:2]), row(centers)[, 1:2]), 
         distHaversine, p1 = points) / 1000 < 50, na.rm = TRUE)

It gives the same result:

# compute the old result to compare with
# (stored in a named column so it cannot overwrite centers$res)
for(i in 1:dim(centers)[1])
  centers[i, "old"] <- sum(geosphere::distHaversine(points, 
                                                    centers[i,c(1:2)]) /
                             1000 < 50, na.rm = TRUE)

# gives the same
all.equal(centers$res, centers$old)
#R> [1] TRUE

A possible alternative is:

dists <- tapply(as.matrix(centers[, 1:2]), row(centers[, 1:2]), 
                distHaversine, p1 = points)
centers$res <- colSums(simplify2array(dists) / 1000 < 50, na.rm = TRUE)

Or use an anonymous function. This is like Ronak Shah's answer, but with tapply:

centers$res <- c(tapply(
  as.matrix(centers[, 1:2]), row(centers[, 1:2]), function(x)
    sum(distHaversine(points, x) / 1000 < 50, na.rm = TRUE)))
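
The split/tapply variants above all build a points-by-centers distance matrix by hand. geosphere also ships distm(), which returns that matrix directly, so a shorter version might look like the sketch below. It assumes the same (longitude, latitude) column order as the question, and distm() may still loop internally, so treat it as a convenience rather than a guaranteed speed-up.

```r
library(geosphere)

set.seed(1)  # seed added here only to make the sketch reproducible
centers <- as.data.frame(matrix(rnorm(10, mean = 40, sd = .5), ncol = 2, byrow = TRUE))
points  <- matrix(rnorm(100, mean = 40, sd = 1), ncol = 2, byrow = TRUE)

# distm() gives one row per point and one column per center, in metres
d <- distm(points, as.matrix(centers[, 1:2]), fun = distHaversine)

# count, per center, the points closer than 50 km
centers$res <- colSums(d / 1000 < 50, na.rm = TRUE)
```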
