Spatial join in geopandas when overlapping polygons?

I have two datasets, one with points (shops) and one with polygons (districts).

The districts dataset sometimes has overlapping polygons (because I have buffered them).

I want to know whether each polygon has any matching points.

joined = geopandas.sjoin(districts,shops, op='contains', how='inner')
joined

The above code probably gives me only one of the matching polygons. How do I check each polygon?

TL;DR

# Note: geopandas >= 0.10 spells this keyword predicate="contains"; op= was removed in 1.0.
gpd.sjoin(districts, shops, how="left", op="contains") \
.reset_index()\
.rename(columns={"index": "districts"})\
.groupby(["districts"])\
.agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
.astype(str)\
.replace("[nan]", "")

Explanations

Problem setup

Let's say we have 3 districts (d1, d2 and d3). Two of the districts overlap (d1 and d2). We have 4 shops: s1 is inside district d1, s2 is inside district d2, s12 is inside both d1 and d2, and s3 is not in any district.

We generate this geometry in Python using shapely:

from shapely.geometry import Point, Polygon
# Create Polygons for the districts
d1 = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
d2 = Polygon([(1, 1), (4, 1), (4, 4), (1, 4)])
d3 = Polygon([(5, 2), (6, 2), (6, 3), (5, 3)])
# Create Points for the shops
s1 = Point(0.5, 0.5)
s2 = Point(3.5, 3.5)
s3 = Point(4.5, 2)
# This shop is in both district 1 and district 2.
s12 = Point(2, 2)
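Before involving GeoPandas, the setup can be sanity-checked with shapely alone. The sketch below rebuilds the same geometry and lists, for every shop, the districts that contain it. The dictionary names `district_polys` and `shop_points` are hypothetical helpers introduced only for this illustration.

```python
from shapely.geometry import Point, Polygon

# Same toy geometry as above (dictionary names are illustrative only)
district_polys = {
    "d1": Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
    "d2": Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
    "d3": Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
}
shop_points = {
    "s1": Point(0.5, 0.5),
    "s12": Point(2, 2),
    "s2": Point(3.5, 3.5),
    "s3": Point(4.5, 2),
}

# For every shop, collect the names of all districts whose polygon contains it
containing = {
    shop: [name for name, poly in district_polys.items() if poly.contains(pt)]
    for shop, pt in shop_points.items()
}
print(containing)
# {'s1': ['d1'], 's12': ['d1', 'd2'], 's2': ['d2'], 's3': []}
```

This brute-force loop is only meant to confirm the expected matches; a spatial join does the same work with a spatial index.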

Storing the geometries in GeoPandas GeoDataFrames and plotting them with matplotlib lets us look at the configuration:

import geopandas as gpd
import matplotlib.pyplot as plt
districts = gpd.GeoDataFrame(index=['d1', 'd2', 'd3'], geometry=[d1, d2, d3])
shops = gpd.GeoDataFrame(index=['s1', 's12', 's2', 's3'], geometry=[s1, s12, s2, s3])
ax = districts.boundary.plot()
shops.plot(ax=ax, color='red')
plt.show()

[Image: plot of the district boundaries with the shops shown as red points]

Now let's look at how spatial joins work in GeoPandas. We have to be careful about the order of the dataframes because the operation is not commutative: gpd.sjoin(shops, districts, how="inner", op="contains") is not equal to gpd.sjoin(districts, shops, how="inner", op="contains").

Now let's look at the six arrangements:
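The order dependence can be checked directly. This is a minimal sketch on the same toy data; it uses the `predicate=` keyword, which assumes geopandas 0.10 or newer (older releases spell it `op=`, as in the text). The variable names `shops_first` and `districts_first` are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    {"geometry": [
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ]},
    index=["d1", "d2", "d3"],
)
shops = gpd.GeoDataFrame(
    {"geometry": [Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)]},
    index=["s1", "s12", "s2", "s3"],
)

# "contains" is evaluated as: left geometry contains right geometry
shops_first = gpd.sjoin(shops, districts, how="inner", predicate="contains")
districts_first = gpd.sjoin(districts, shops, how="inner", predicate="contains")

print(len(shops_first))      # 0: a point never contains a polygon
print(len(districts_first))  # 4: one row per (district, shop) match
```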

1 gpd.sjoin(shops, districts, how="left", op="contains")

      geometry          index_right
s1    POINT (0.5 0.5)   nan
s12   POINT (2 2)       nan
s2    POINT (3.5 3.5)   nan
s3    POINT (4.5 2)     nan

Keeps the shops as the index and fills the district column with NaN, because no point contains a polygon.

2 gpd.sjoin(shops, districts, how="right", op="contains")

      index_left   geometry
d1    nan          POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))
d2    nan          POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))
d3    nan          POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2))

Keeps the districts as the index and fills the shop column with NaN.

3 gpd.sjoin(shops, districts, how="inner", op="contains")

      geometry   index_right

This returns an empty dataframe because points can't contain polygons.
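If the shops should stay on the left, the mirror-image predicate `within` gives non-empty matches, since a point can lie within a polygon. The sketch below is not part of the original answer; it assumes geopandas 0.10 or newer for the `predicate=` keyword and reuses the toy data.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    {"geometry": [
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ]},
    index=["d1", "d2", "d3"],
)
shops = gpd.GeoDataFrame(
    {"geometry": [Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)]},
    index=["s1", "s12", "s2", "s3"],
)

# Shops on the left, "within" instead of "contains"
shops_within = gpd.sjoin(shops, districts, how="left", predicate="within")

# Pairs (shop, district) for the shops that fall inside at least one district
matched = shops_within.dropna(subset=["index_right"])
pairs = set(zip(matched.index, matched["index_right"]))
print(pairs)
```

Here s12 is paired with both d1 and d2, and s3 keeps a NaN in `index_right` because no district holds it.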

4 gpd.sjoin(districts, shops, how="left", op="contains")

      geometry                              index_right
d1    POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))   s1
d1    POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))   s12
d2    POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))   s12
d2    POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))   s2
d3    POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2))   nan

Keeps the districts as the index: each district is repeated once per shop it contains, and a district without shops (d3) gets NaN.

5 gpd.sjoin(districts, shops, how="right", op="contains")

      index_left   geometry
s1    d1           POINT (0.5 0.5)
s12   d1           POINT (2 2)
s12   d2           POINT (2 2)
s2    d2           POINT (3.5 3.5)
s3    nan          POINT (4.5 2)

Keeps the shops as the index: s12 appears twice because it lies in both d1 and d2, and s3 gets NaN because no district contains it.

6 gpd.sjoin(districts, shops, how="inner", op="contains")

      geometry                              index_right
d1    POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))   s1
d1    POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))   s12
d2    POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))   s12
d2    POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))   s2

Close to the result with how="left": it keeps the districts as the index but drops the rows with NaN (here d3).
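The relation between arrangements 4 and 6 can be verified in a few lines: dropping the NaN rows from the left join should leave the same (district, shop) pairs as the inner join. This is a sketch on the toy data, assuming geopandas 0.10 or newer for `predicate=`; the helper variables are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    {"geometry": [
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ]},
    index=["d1", "d2", "d3"],
)
shops = gpd.GeoDataFrame(
    {"geometry": [Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)]},
    index=["s1", "s12", "s2", "s3"],
)

left = gpd.sjoin(districts, shops, how="left", predicate="contains")
inner = gpd.sjoin(districts, shops, how="inner", predicate="contains")

# Compare the (district, shop) pairs, ignoring row order
left_matched = left.dropna(subset=["index_right"])
left_pairs = set(zip(left_matched.index, left_matched["index_right"]))
inner_pairs = set(zip(inner.index, inner["index_right"]))
print(left_pairs == inner_pairs)  # True
```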

Answer to the question

    # Note: geopandas >= 0.10 spells this keyword predicate="contains"; op= was removed in 1.0.
    gpd.sjoin(districts, shops, how="left", op="contains") \
    .reset_index()\
    .rename(columns={"index": "districts"})\
    .groupby(["districts"])\
    .agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
    .astype(str)\
    .replace("[nan]", "")
districts   nshops   lshops
d1          2        ['s1' 's12']
d2          2        ['s12' 's2']
d3          0

This way we can tell whether each polygon has any matching points, and which ones.
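If only a yes/no flag or a count per district is needed, a shorter variant is possible. The sketch below is an alternative to the aggregation above, not the original answer's method; it assumes geopandas 0.10 or newer for `predicate=`, and the names `has_shop` and `shop_counts` are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    {"geometry": [
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ]},
    index=["d1", "d2", "d3"],
)
shops = gpd.GeoDataFrame(
    {"geometry": [Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)]},
    index=["s1", "s12", "s2", "s3"],
)

# The inner join keeps only districts that contain at least one shop
joined = gpd.sjoin(districts, shops, how="inner", predicate="contains")

# Boolean flag per district: does it appear in the join result?
has_shop = districts.index.isin(joined.index)

# Number of distinct shops per district, with 0 for districts without any
shop_counts = (
    joined["index_right"]
    .groupby(level=0)
    .nunique()
    .reindex(districts.index, fill_value=0)
)
print(dict(zip(districts.index, has_shop)))
print(shop_counts.to_dict())
```

Overlapping districts are handled naturally: s12 is counted once for d1 and once for d2.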
