Spatial rolling functions (min, max, mean)
I'm currently working on a project where I need to calculate the rolling minimum over a spatial window of 30 meters (a square around the central point). In my data frame, each point has X and Y coordinates and the variable Z for which I'm trying to get the rolling minimum.
So far I have accomplished it using for loops with conditionals and data.table filtering. This takes some time, especially when the data sets have over a million points. I would really appreciate some tips on how to improve the performance of this code.
d = 1
attach(data)

#### OPTION 1 - CONDITIONAL ####
# For each point i, values outside the square window are replaced by Z[i],
# so they cannot lower the minimum.
op1 = NULL
for (i in 1:nrow(data)) {
  op1[i] <-
    min(
      ifelse(POINT_X >= POINT_X[i] - d,
        ifelse(POINT_X <= POINT_X[i] + d,
          ifelse(POINT_Y >= POINT_Y[i] - d,
            ifelse(POINT_Y <= POINT_Y[i] + d, Z, Z[i]), Z[i]), Z[i]), Z[i]),
      na.rm = TRUE)
}

#### OPTION 2 - SUBSET ####
library(data.table)
setDT(data)
local_min = function(i) {
  x = POINT_X[i]
  y = POINT_Y[i]
  # Keep only the rows inside the square window around point i
  base = data[POINT_X %inrange% c(x - d, x + d) &
              POINT_Y %inrange% c(y - d, y + d)]
  local_min = min(base$Z, na.rm = TRUE)
  return(local_min)
}
op2 = NULL
for (i in 1:nrow(data)) {
  op2[i] <- local_min(i)
}
I've tried other alternatives, but the most common rolling statistic functions in R are based on index windows rather than on the values of other variables. Here's some data for you to try the code above with d = 1. I would be really grateful if you could help me improve this process.
data = data.frame(POINT_X=rep(1:5, each =5),
POINT_Y=rep(1:5,5),
Z=1:25)
The desired output should look like this:
> op1
[1] 1 1 2 3 4 1 1 2 3 4 6 6 7 8 9 11 11 12 13 14 16 16 17 18 19
I think it's important to note that option 1 is currently faster than option 2. Thanks in advance for your attention. :)
You could use a non-equi join:
d = 1
data[, `:=`(xmin = POINT_X - d,
            xmax = POINT_X + d,
            ymin = POINT_Y - d,
            ymax = POINT_Y + d)]
data[data, on = .(POINT_X >= xmin,
                  POINT_X <= xmax,
                  POINT_Y >= ymin,
                  POINT_Y <= ymax)][
  , .(rollmin = min(Z)), by = .(POINT_X, POINT_Y)][
  , rollmin]
#[1] 1 1 2 3 4 1 1 2 3 4 6 6 7 8 9 11 11 12 13 14 16 16 17 18 19
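Since the title also asks about max and mean, here is a minimal sketch of how the same non-equi join might return several statistics in one pass. It swaps the second grouping step for `by = .EACHI`, which evaluates the summary once per row of the joined-in table. The helper table `win` and the output column names (`rollmin`, `rollmax`, `rollmean`) are illustrative choices, not part of the code above, and the sketch has only been checked against the small sample data from the question.

```r
library(data.table)

# Sample data from the question
data <- data.table(POINT_X = rep(1:5, each = 5),
                   POINT_Y = rep(1:5, 5),
                   Z = 1:25)
d <- 1

# One row per point holding the bounds of its square window
# ('win' is a hypothetical helper table)
win <- data[, .(xmin = POINT_X - d, xmax = POINT_X + d,
                ymin = POINT_Y - d, ymax = POINT_Y + d)]

# For each window (each row of 'win'), summarise Z over the points of
# 'data' that fall inside it; by = .EACHI keeps the order of 'win'
res <- data[win,
            on = .(POINT_X >= xmin, POINT_X <= xmax,
                   POINT_Y >= ymin, POINT_Y <= ymax),
            .(rollmin  = min(Z),
              rollmax  = max(Z),
              rollmean = mean(Z)),
            by = .EACHI]

res$rollmin
```

On the sample data, `res$rollmin` should match the desired output in the question. If Z can contain missing values, adding `na.rm = TRUE` to each summary call is probably what you want.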