Spatial filtering by proximity in R

I have occurrence points for a species, and I'd like to remove potential sampling bias (where some regions might have a much greater density of points than others). One way to do this would be to keep the largest subset of points that are all at least some distance X from each other. Essentially, I would prevent points from being too close to one another.

Are there any existing R functions to do this? I've searched through various spatial packages, but haven't found anything, and can't figure out exactly how to implement this myself.

An example occurrence point dataset can be downloaded here.

Thanks!

Following Josh O'Brien's advice, I looked at spatstat's rMaternI function and came up with the following. It seems to work pretty well.

The distance is in map units. It would be nice to incorporate one of R's distance functions that always returns distances in meters rather than in the input units, but I couldn't figure that out...

require(spatstat)
require(maptools)
occ <- readShapeSpatial('occurrence_example.shp')

filterByProximity <- function(occ, dist) {
    # convert to a spatstat point pattern, then keep only the points
    # whose nearest neighbour is farther than 'dist' away
    pts <- as.ppp.SpatialPoints(occ)
    d <- nndist(pts)
    z <- which(d > dist)
    return(occ[z,])
}

occ2 <- filterByProximity(occ, dist=0.2)
plot(occ)
plot(occ2, add=TRUE, col='blue', pch=20)

I've written a new version of this function that no longer really follows rMaternII. The input can be a SpatialPoints, SpatialPointsDataFrame, or matrix object.

Seems to work well, but suggestions welcome!

filterByProximity <- function(xy, dist, mapUnits = FALSE) {
    # xy can be a SpatialPoints or SpatialPointsDataFrame object, or a matrix
    # dist is in km if mapUnits = FALSE, in map units otherwise
    if (mapUnits) {
        d <- spDists(xy, longlat = FALSE)
    } else {
        d <- spDists(xy, longlat = TRUE)
    }
    diag(d) <- NA
    close <- (d <= dist)
    diag(close) <- NA
    closePts <- which(close, arr.ind = TRUE)   # all offending pairs (row, col)
    discard <- matrix(nrow = 2, ncol = 2)
    if (nrow(closePts) > 0) {
        while (nrow(closePts) > 0) {
            if ((!paste(closePts[1,1], closePts[1,2], sep='_') %in% paste(discard[,1], discard[,2], sep='_')) &
                (!paste(closePts[1,2], closePts[1,1], sep='_') %in% paste(discard[,1], discard[,2], sep='_'))) {
                # mark the first point of the offending pair for removal,
                # then drop every remaining pair that involves that point;
                # drop = FALSE keeps the result a matrix even with one row left
                discard <- rbind(discard, closePts[1,])
                closePts <- closePts[-union(which(closePts[,1] == closePts[1,1]),
                                            which(closePts[,2] == closePts[1,1])), , drop = FALSE]
            }
        }
        discard <- discard[complete.cases(discard), , drop = FALSE]
        return(xy[-discard[,1],])
    } else {
        return(xy)
    }
}

Let's test it:

require(rgeos)
require(sp)
pts <- readWKT("MULTIPOINT ((3.5 2), (1 1), (2 2), (4.5 3), (4.5 4.5), (5 5), (1 5))")

pts2 <- filterByProximity(pts, dist=2, mapUnits=TRUE)

plot(pts)
axis(1)
axis(2)
# draw a 2-unit buffer around each original point
apply(as.data.frame(pts), 1, function(x) plot(gBuffer(SpatialPoints(coords=matrix(c(x[1],x[2]),nrow=1)), width=2), add=TRUE))
plot(pts2, add=TRUE, col='blue', pch=20, cex=2)

[Figure: the test points with 2-unit buffers around each; the retained subset is shown in blue]

There is also an R package called spThin that performs spatial thinning on point data. It was developed for reducing the effects of sampling bias for species distribution models, and does multiple iterations for optimization. The function is quite easy to implement---the vignette can be found here. There is also a paper in Ecography with details about the technique.
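As a rough sketch of how a spThin call might look (not from the original answer): thin() works on a data frame rather than a Spatial* object, and the data frame occ_df, its column names, and the 10 km thinning distance below are all assumptions for illustration.

library(spThin)

# occ_df is a hypothetical data frame with columns SPECIES, LONG, LAT
thinned <- thin(loc.data = occ_df,
                lat.col = "LAT", long.col = "LONG", spec.col = "SPECIES",
                thin.par = 10,    # thinning distance in km
                reps = 100,       # number of independent thinning runs
                locs.thinned.list.return = TRUE,
                write.files = FALSE,
                write.log.file = FALSE)

# keep the repetition that retained the most points
best <- thinned[[which.max(sapply(thinned, nrow))]]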

Rather than removing data points, you might consider spatial declustering. This involves giving points in clusters a lower weight than outlying points. The two simplest ways to do this involve a polygonal segmentation, like a Voronoi diagram, or some arbitrary grid. Both methods will weight points in each region according to the area of the region.

For example, if we take the points in your test (1,1), (2,2), (4.5,4.5), (5,5), (1,5) and apply a regular 2-by-2 mesh, where each cell is three units on a side, then the five points fall into three cells. The points (1,1) and (2,2), falling into the cell [0,3]x[0,3], would each have weight 1 / (no. of points in the current cell TIMES total no. of occupied cells) = 1 / (2 * 3). The same goes for the points (4.5,4.5) and (5,5) in the cell (3,6]x(3,6]. The "outlier" (1,5) would have weight 1 / (1 * 3). The nice thing about this technique is that it is a quick way to generate a density-based weighting scheme.
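A quick base-R sketch of that arithmetic (my own illustration, not part of the original answer), assuming the same five points and 3-unit cells:

xy <- matrix(c(1,1, 2,2, 4.5,4.5, 5,5, 1,5), ncol = 2, byrow = TRUE)
cell <- paste(ceiling(xy[,1] / 3), ceiling(xy[,2] / 3))   # grid cell id for each point
nOccupied <- length(unique(cell))                         # 3 occupied cells
w <- 1 / (as.numeric(table(cell)[cell]) * nOccupied)      # 1/(pts in cell * occupied cells)
cbind(xy, weight = w)   # (1,1),(2,2) -> 1/6; (4.5,4.5),(5,5) -> 1/6; (1,5) -> 1/3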

A polygonal segmentation involves drawing a polygon around each point and using the area of that polygon to calculate the weight. Generally, the polygons completely cover the entire region, and the weights are calculated as the inverse of the area of each polygon. A Voronoi diagram is usually used for this, but polygonal segmentations may be calculated using other techniques, or may be specified by hand.
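A hedged sketch of the Voronoi variant using the deldir package (my choice of package, not the original author's): tile areas come from the dir.area column of the deldir summary, and the [0,6] x [0,6] bounding window is an arbitrary assumption; border tiles are clipped to it, which affects their areas and hence the weights.

library(deldir)
xy <- data.frame(x = c(1, 2, 4.5, 5, 1), y = c(1, 2, 4.5, 5, 5))
vor <- deldir(xy$x, xy$y, rw = c(0, 6, 0, 6))   # Voronoi tiles within [0,6] x [0,6]
w <- 1 / vor$summary$dir.area                   # inverse tile area
w / sum(w)                                      # normalized declustering weights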
