Finding the nearest neighbours of polygons in R

Question

I have a data frame of coordinates that I converted to an sf object in R, as shown below:

> head(df1)
  Cell_ID   Spot_ID       X       Y
1       0 600000000 193.722 175.733
2       0 600000001 192.895 176.727
3       0 600000002 193.828 177.462
4       8 600000003 178.173 178.220
5       7 600000004 187.065 178.285
6       0 600000005 190.754 178.186

> df1_sf <- st_as_sf(df1,
                     coords = c('X', 'Y')) %>%
    group_by(Cell_ID) %>%
    summarise() %>%
    ungroup() %>%  
    st_convex_hull()
> plot(st_geometry(df1_sf), border = "red")

I can then plot all my polygons, and it looks like this:

Now I want to get the IDs of each polygon's neighbours. To do that, I am running:

n = st_set_geometry(st_intersection(df1_sf,df1_sf), NULL)
head(n)
# A tibble: 6 x 2
  Cell_ID Cell_ID.1
    <int>     <int>
1       0         0
2       7         0
3      51         0
4       1         1
5       4         1
6       5         1

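But this only does part of the job, because it relies on intersection, whereas I am also interested in polygons that are merely close (near but not touching, as in the picture below: Cell_ID 1 would have cells 3-6 as neighbours, but cell 7 would also be detected because, for example, it lies within a given radius). Can anyone help me with this?

Thanks!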

Answer 1

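To illustrate the excellent suggestion of using a buffer around each polygon (a mathematical dilation of each polygon), here is a quick and dirty spatstat solution.

First load the package and make some example data: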

library(spatstat)
dat <- tiles(dirichlet(cells))
ii <- seq(2, 42, by=2)
dat[ii] <- lapply(dat[ii], erosion, r = .01)
dat <- lapply(seq_along(dat), function(i) cbind(Cell_ID = i, as.data.frame(dat[[i]])))
dat <- Reduce(rbind, dat)
df1 <- cbind(Spot_ID = 1:nrow(dat), dat)
head(df1)
#>   Spot_ID Cell_ID         x         y
#> 1       1       1 0.4067780 0.0819020
#> 2       2       1 0.3216680 0.1129640
#> 3       3       1 0.1967080 0.0000000
#> 4       4       1 0.4438430 0.0000000
#> 5       5       2 0.5630909 0.1146781
#> 6       6       2 0.4916145 0.1649979

Split by Cell_ID, find the convex hull of each group, and plot the data:

dat <- split(df1[,c("x", "y")], df1$Cell_ID)
dat <- lapply(dat, convexhull)
plot(owin(), main = "")
for(i in seq_along(dat)){
  plot(dat[[i]], add = TRUE, border = "red")
}

Dilate each polygon:

bigdat <- lapply(dat, dilation, r = 0.0125)

Run a simple for loop to determine which dilated polygons overlap (i.e. the full n^2 pairwise intersections):

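neigh <- list()
for(i in seq_along(bigdat)){
  ## Test polygon i against every polygon so that which() returns positions in bigdat
  overlap <- sapply(bigdat, function(x) !is.empty(intersect.owin(x, bigdat[[i]])))
  overlap[i] <- FALSE # A polygon is not its own neighbour
  neigh[[i]] <- which(overlap)
}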

Plot the dilated polygons labelled with their number of neighbours (the neighbours' IDs are in the list neigh):

plot(owin(), main = "")
for(i in seq_along(bigdat)){
  plot(bigdat[[i]], add = TRUE, border = "red")
}
text.ppp(cells, labels = sapply(neigh, length))
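The question works in sf, where the same "within a given radius" idea can be sketched without building buffers explicitly. This is only a minimal sketch and not part of the spatstat answer: it relies on sf's st_is_within_distance, and the three squares, the helper sq and the 0.1 threshold are made up for illustration.

```r
library(sf)

# Hypothetical helper: a unit square with lower-left corner (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Squares 1 and 2 are 0.05 apart (not touching); square 3 is far away
polys <- st_sf(Cell_ID = c(1, 2, 3),
               geometry = st_sfc(sq(0, 0), sq(1.05, 0), sq(5, 5)))

# For each polygon, the row indices of all polygons within distance 0.1
nb <- st_is_within_distance(polys, polys, dist = 0.1)

# Drop each polygon from its own list and map row indices back to Cell_ID
neigh <- lapply(seq_along(nb), function(i) polys$Cell_ID[setdiff(nb[[i]], i)])
names(neigh) <- polys$Cell_ID
```

Here neigh[["1"]] contains cell 2 even though the two squares do not intersect, which is the behaviour the question asks for. The distance is in the units of the coordinates, so a sensible threshold depends on the data.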

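Solution based on an alternative tessellation

Is it necessary to use convex hulls as the definition of the cell regions? I would be tempted to simply represent each cell by the centroid of its sample points and then use the Dirichlet/Voronoi tessellation as the regions. These have well-defined neighbours everywhere, and the only problem is how to define the region at the boundary of the set of cells.

Split by Cell_ID, find the centroid of each group, tessellate, and plot the data: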

dat <- split(df1[,c("x", "y")], df1$Cell_ID)
dat <- t(sapply(dat, colMeans))
X <- as.ppp(dat, W = ripras)
D <- dirichlet(X)
plot(D)

Extra code to find the neighbour IDs:

eps <- sqrt(.Machine$double.eps) # Epsilon for numerical comparison below
tilelist <- tiles(D)
v_list <- lapply(tilelist, vertices.owin)
v_list <- lapply(v_list, function(v){ppp(v$x, v$y, window = Window(X), check = FALSE)})
neigh <- list()
dd <- safedeldir(X)
for(i in seq_len(npoints(X))){
  ## All neighbours from deldir (infinite border tiles)
  all_neigh <- c(dd$delsgs$ind1[dd$delsgs$ind2==i],
                 dd$delsgs$ind2[dd$delsgs$ind1==i])
  ## The remainder keeps only neighbour tiles that share a vertex with tile i:
  true_neigh <- sapply(v_list[all_neigh], function(x){min(nncross.ppp(v_list[[i]], x))}) < eps
  neigh[[i]] <- sort(all_neigh[true_neigh])
}
plot(D, main = "Tessellation with Cell_ID")
text(X)

neigh[[1]] # Neighbours of tile 1
#> [1] 2 7 8
neigh[[10]] # Neighbours of tile 10
#> [1]  3  4  5  9 15 16 20
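The question's own output is a two-column table of ID pairs, so it may help to flatten a neighbour list into that shape. The following is a minimal base R sketch under assumptions: neigh_demo is a hypothetical named list holding only the two results printed above, and the column name Neighbour_ID is invented here. The real neigh list is unnamed and indexed by tile position, so seq_along(neigh) would take the place of the names.

```r
# Hypothetical stand-in for the neighbour list, holding tiles 1 and 10 only
neigh_demo <- list(`1` = c(2, 7, 8), `10` = c(3, 4, 5, 9, 15, 16, 20))

# Repeat each tile ID once per neighbour, then pair it with the neighbour IDs
edges <- data.frame(
  Cell_ID = rep(as.integer(names(neigh_demo)), lengths(neigh_demo)),
  Neighbour_ID = unlist(neigh_demo, use.names = FALSE)
)
head(edges)
```

Each row of edges is one (cell, neighbour) pair, similar in layout to the Cell_ID / Cell_ID.1 table in the question.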

Answer 2

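From your question, it looks like you are more interested in a generic nearest-neighbour type of approach. Please correct me if this is an oversimplification.

Instead of considering each polygon and its boundary, you could simply take the centre coordinates and use any knn-type algorithm to find the k nearest neighbours of a given coordinate.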
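As a minimal sketch of "taking the centre coordinates", the block below averages the spot coordinates per Cell_ID with base R's aggregate. It uses only the six rows printed by head(df1) in the question, and treating the mean of the spots as the centre is an assumption; the centroid of the convex hull would be another reasonable choice.

```r
# The six rows shown by head(df1) in the question
df1 <- data.frame(
  Cell_ID = c(0, 0, 0, 8, 7, 0),
  Spot_ID = 600000000 + 0:5,
  X = c(193.722, 192.895, 193.828, 178.173, 187.065, 190.754),
  Y = c(175.733, 176.727, 177.462, 178.220, 178.285, 178.186)
)

# One row per cell: the mean X and Y of its spots
centres <- aggregate(cbind(X, Y) ~ Cell_ID, data = df1, FUN = mean)
centres
```

The X and Y columns of centres can then be passed to a nearest-neighbour search in place of dummy coordinates, with the row order giving the mapping back to Cell_ID.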

Since I don't have access to your data, I created some dummy coordinates. The search uses the package RANN and its function nn2:

install.packages('RANN')
library(RANN)

# Make dummy coordinates
df <- 
  data.frame(   X = runif(100)
              , Y = runif(100)
               )

# Find closest 5 points between df and itself
closest <- nn2(data = df, query = df , k = 5)

closest$nn.idx # Index of closest neighbours
closest$nn.dists # Euclidean distance of closest neighbours

# Note the first column is a reference to the point itself, so getting the real 5 nearest neighbours (not including itself) would mean selecting k = 6.

> head(closest$nn.idx) # Index of closest neighbours
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   82   31   86   49
[2,]    2   22   41   34   91
[3,]    3   96   20   55   32
[4,]    4   65   53   77   14
[5,]    5   38   48   59   30
[6,]    6   36   43   97   61

> head(closest$nn.dists) # Euclidean distance of closest neighbours
     [,1]       [,2]       [,3]       [,4]       [,5]
[1,]    0 0.04971692 0.06305752 0.08597908 0.09485483
[2,]    0 0.03668956 0.05248395 0.09570358 0.10489092
[3,]    0 0.07257007 0.10263107 0.11204297 0.13275642
[4,]    0 0.07209561 0.07227328 0.07259919 0.07326718
[5,]    0 0.02842711 0.06003873 0.08930219 0.12286905
[6,]    0 0.08018734 0.09312385 0.10844622 0.11368332

You can also do this with searchtype = "radius" and the radius argument, following the radius approach mentioned in the question.
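A minimal sketch of the radius variant follows, using dummy coordinates again. The radius of 0.1 and the seed are arbitrary choices for illustration. With searchtype = "radius", nn2 still returns k columns and pads rows that have fewer than k neighbours inside the radius with index 0, so those entries (and each point's match with itself) are filtered out afterwards.

```r
library(RANN)

set.seed(1)
df <- data.frame(X = runif(100), Y = runif(100))
radius <- 0.1

# k is an upper bound on neighbours per point; nrow(df) guarantees nothing is cut off
res <- nn2(data = df, query = df, k = nrow(df),
           searchtype = "radius", radius = radius)

# Keep real neighbours only: drop the 0 padding and the point itself
neigh <- lapply(seq_len(nrow(df)), function(i) {
  idx <- res$nn.idx[i, ]
  idx[idx != 0 & idx != i]
})

neigh[[1]] # Indices of all points within the radius of point 1
```

Unlike a fixed k, the number of neighbours now varies per point, and some points may have none.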

2 answers

Answer 1: 1, accepted, 2020-10-23 06:40:31
Answer 2: 0, 2020-10-22 23:27:12
