Removing overlapping polygons, too many points

I have many points of data. In fact, too many points. None of the points overlap, but some are quite near each other. I'd like to have fewer points, without moving any of the locations.

I'd like to end up with as many points as possible, but only points that are at least ~5.7 km apart from any other point. (A little bit of overlap is okay; an error of 0.5 km is acceptable.)

I tried to write an algorithm in R to accomplish this but there are quite a few unexpected results. I have some data that is about 300,000 points covering the earth. I have some other data that is a few million. When I execute the algorithm I can segment the data by country which could reduce those numbers into the 20,000 to 100,000 range. If the location of the points didn't matter then I would probably just make an interpolated raster and call it good but for this problem I need to keep the specific location intact.

Another thing that I tried is to make a regular grid of 0.028 degrees and run NNJoin to find the nearest data point. This worked a bit better than my R code but the results are a bit funny as you can probably imagine.

Another idea I had was to Buffer the points, then count how many of the points intersect with the Buffered layer. I'm still working on this one.

Is there an already established method for arriving at this result? I am comfortable working with PostGIS, QGIS, Python, or R if there is a package or library that can do this.

tl;dr how do I reduce dense clusters of points but maintain coverage with a reduced set of points?

Here is an approach.

Example data

x <- runif(10000, -180, 180)
y <- runif(10000, -90, 90)
pts <- cbind(x, y)

Solution

library(raster)
# coarse 10-degree cells for illustration; you will want much smaller cells than this
r <- raster(nrow=18, ncol=36, vals=1) 
# get cell numbers
cells <- cellFromXY(r, pts)
# pick one point per cell (here the first; to pick at random, sample row
# indices rather than sampling each column separately)
sel <- aggregate(pts, list(cells), function(i) i[1])

Let's see

plot(r)
points(pts, cex=.1)
points(sel[,2:3], pch=20, col="red")

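Note that this uses lon/lat, so the cell size in kilometres is not the same across latitudes (cells get narrower east-west toward the poles). Not sure if that matters; if it does, you could transform the points to a projected coordinate system first.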
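
As a rough illustration of the latitude issue, here is a base R sketch (no raster package) that keeps the cell height fixed in degrees but widens the longitude step by 1/cos(latitude), so cells stay roughly 5.7 km on each side. The names cell_km and km_per_deg are made up for this example, 111.32 km per degree is a spherical approximation, and the 5.7 km figure is taken from the question. Like the raster version, it keeps one point per cell, so two points in neighbouring cells can still be closer than the cell size.

```r
# Assumptions: spherical earth, ~111.32 km per degree of latitude,
# target cell size taken from the question (~5.7 km).
cell_km <- 5.7
km_per_deg <- 111.32

set.seed(1)
pts <- cbind(x = runif(10000, -180, 180), y = runif(10000, -90, 90))

# Latitude bands of equal height (in degrees)
dlat <- cell_km / km_per_deg
row <- floor((pts[, "y"] + 90) / dlat)

# Band-centre latitude, clamped away from the poles to avoid dividing by ~0
lat_mid <- pmin(pmax((row + 0.5) * dlat - 90, -89.9), 89.9)

# Longitude step grows toward the poles so the cell width in km stays similar
dlon <- dlat / cos(lat_mid * pi / 180)
col <- floor((pts[, "x"] + 180) / dlon)

# Keep the first point found in each (row, col) cell
key <- paste(row, col)
sel <- pts[!duplicated(key), , drop = FALSE]
```

Because whole rows of pts are kept, the selected coordinates are always original locations, never cell centres.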

Later:

There are several ways to create shifted variations of the grid: by changing the extent of an existing RasterLayer, or by setting a different extent when creating the RasterLayer. See ?raster and ?extent for more. You can also use shift:

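# add a row and a column (extent becomes -180..190, -100..90)
r1 <- raster(nrow=19, ncol=37, xmx=190, ymn=-100)
# shift by half a cell west and half a cell north, so the shifted grid
# (-185..185, -95..95) still covers the whole globe
r2 <- shift(r1, -.5*xres(r1), .5*yres(r1))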

plot(as(r1, "SpatialPolygons"))
lines(as(r2, "SpatialPolygons"), col="red")
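
One way to use a shifted grid, sketched here in base R so it runs without the raster package, is a second thinning pass: pick one point per cell on the original grid, then pick one point per cell again on a grid offset by half a cell, using only the survivors of the first pass. The helper thin_grid and its arguments are hypothetical names for this example, not part of any package. This reduces the number of close pairs that straddle a cell border, but it does not strictly guarantee a minimum spacing.

```r
# Hypothetical helper: keep the first point found in each cell of a regular
# lon/lat grid with cell size `res` (degrees) and lower-left origin (x0, y0).
thin_grid <- function(pts, res, x0 = -180, y0 = -90) {
  col <- floor((pts[, 1] - x0) / res)
  row <- floor((pts[, 2] - y0) / res)
  pts[!duplicated(paste(row, col)), , drop = FALSE]
}

set.seed(42)
pts <- cbind(x = runif(10000, -180, 180), y = runif(10000, -90, 90))
res <- 10  # degrees, matching the coarse 18 x 36 example grid

# Pass 1: the original grid
pass1 <- thin_grid(pts, res)

# Pass 2: the same grid shifted by half a cell, applied to the survivors
pass2 <- thin_grid(pass1, res, x0 = -180 - res / 2, y0 = -90 - res / 2)

# Number of points before and after each pass
c(all = nrow(pts), pass1 = nrow(pass1), pass2 = nrow(pass2))
```

For real data you would set res to something close to the target spacing and check the nearest-neighbour distances of the result before relying on it.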
