Spatial join in geopandas when polygons overlap?

Question

I have two datasets, one with points (shops) and one with polygons (districts).

The districts dataset sometimes has overlapping polygons (because I have buffered them).

I want to know whether each polygon has any matching points.

    import geopandas

    joined = geopandas.sjoin(districts, shops, op='contains', how='inner')
    joined

The code above probably gives me only one of the matching polygons. How do I check each polygon?

Answer 1

TL;DR

    # Note: since geopandas 0.10 this keyword is spelled `predicate`; `op` is deprecated.
    gpd.sjoin(districts, shops, how="left", op="contains") \
    .reset_index()\
    .rename(columns={"index": "districts"})\
    .groupby(["districts"])\
    .agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
    .astype(str)\
    .replace("[nan]", "")

Explanations

Problem setup

Let's say we have 3 districts (d1, d2 and d3). Two of them overlap (d1 and d2). We have 4 shops: s1 is inside district d1, s2 is inside district d2, s12 is inside both d1 and d2, and s3 is not in any district.

We generate this geometry in Python using shapely:

    from shapely.geometry import Point, Polygon

    # Create Polygons for the districts (d1 and d2 overlap, d3 is isolated)
    d1 = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
    d2 = Polygon([(1, 1), (4, 1), (4, 4), (1, 4)])
    d3 = Polygon([(5, 2), (6, 2), (6, 3), (5, 3)])

    # Create Points for the shops
    s1 = Point(0.5, 0.5)
    s2 = Point(3.5, 3.5)
    s3 = Point(4.5, 2)
    # This shop is in district 1 and district 2.
    s12 = Point(2, 2)

After saving the geometry into GeoPandas GeoDataFrames, we can plot the configuration with matplotlib:

    import geopandas as gpd
    import matplotlib.pyplot as plt

    districts = gpd.GeoDataFrame(index=['d1', 'd2', 'd3'], geometry=[d1, d2, d3])
    shops = gpd.GeoDataFrame(index=['s1', 's12', 's2', 's3'], geometry=[s1, s12, s2, s3])

    # Draw the district outlines, then the shops on the same axes
    ax = districts.boundary.plot()
    shops.plot(ax=ax, color='red')
    plt.show()
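
The same configuration can also be checked without a plot. The following is a minimal sketch that re-creates the geometries from the setup and uses shapely's `contains` and `intersects` predicates; the dictionary names `districts_by_name` and `shops_by_name` are introduced here only for illustration.

```python
from shapely.geometry import Point, Polygon

# Same coordinates as in the problem setup
districts_by_name = {
    "d1": Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
    "d2": Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
    "d3": Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
}
shops_by_name = {
    "s1": Point(0.5, 0.5),
    "s12": Point(2, 2),
    "s2": Point(3.5, 3.5),
    "s3": Point(4.5, 2),
}

# For every shop, list the districts whose polygon contains it
containing = {
    shop: [name for name, poly in districts_by_name.items() if poly.contains(pt)]
    for shop, pt in shops_by_name.items()
}
for shop, names in containing.items():
    print(shop, names)

# d1 and d2 share some area; d3 touches neither of them
d1_d2_overlap = districts_by_name["d1"].intersects(districts_by_name["d2"])
d1_d3_overlap = districts_by_name["d1"].intersects(districts_by_name["d3"])
print(d1_d2_overlap, d1_d3_overlap)
```

A brute-force loop like this is only meant as a sanity check on four points; for real datasets the spatial join is the appropriate tool.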

Now let's look at how spatial joins work in GeoPandas. We have to be careful about the order of the dataframes because the operation is not commutative: `gpd.sjoin(shops, districts, how="inner", op="contains")` does not give the same result as `gpd.sjoin(districts, shops, how="inner", op="contains")`.

Now let's look at the six arrangements:

1 `gpd.sjoin(shops, districts, how="left", op="contains")`

|     | geometry        | index_right |
|-----|-----------------|-------------|
| s1  | POINT (0.5 0.5) | nan         |
| s12 | POINT (2 2)     | nan         |
| s2  | POINT (3.5 3.5) | nan         |
| s3  | POINT (4.5 2)   | nan         |

Keeps the shops as the index and fills the district column with NaN.

2 `gpd.sjoin(shops, districts, how="right", op="contains")`

|    | index_left | geometry                              |
|----|------------|---------------------------------------|
| d1 | nan        | `POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))` |
| d2 | nan        | `POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))` |
| d3 | nan        | `POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2))` |

Keeps the districts as the index and fills the shop column with NaN.

3 `gpd.sjoin(shops, districts, how="inner", op="contains")`

|     | geometry | index_right |
|-----|----------|-------------|

This returns an empty dataframe because points can't contain polygons.
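
If the shops have to stay on the left side, the inverse predicate `within` finds the matches that `contains` cannot. The block below is a minimal sketch of that idea, not part of the original answer. It assumes geopandas 0.10 or later, where the keyword is spelled `predicate` rather than `op`, and it re-creates the geometries so that it runs on its own; the names `shops_in_districts` and `pairs` are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# Shops on the left: ask which district each shop lies within
shops_in_districts = gpd.sjoin(shops, districts, how="inner", predicate="within")

# One (shop, district) pair per match; s12 appears twice because d1 and d2 overlap
pairs = sorted(zip(shops_in_districts.index, shops_in_districts["index_right"]))
print(pairs)
```

The result is indexed by shop, so a shop inside two overlapping districts produces two rows, and s3 is dropped by the inner join.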

4 `gpd.sjoin(districts, shops, how="left", op="contains")`

|    | geometry                              | index_right |
|----|---------------------------------------|-------------|
| d1 | `POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))` | s1          |
| d1 | `POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))` | s12         |
| d2 | `POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))` | s12         |
| d2 | `POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))` | s2          |
| d3 | `POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2))` | nan         |

Keeps the districts as the index.

5 `gpd.sjoin(districts, shops, how="right", op="contains")`

|     | index_left | geometry        |
|-----|------------|-----------------|
| s1  | d1         | POINT (0.5 0.5) |
| s12 | d1         | POINT (2 2)     |
| s12 | d2         | POINT (2 2)     |
| s2  | d2         | POINT (3.5 3.5) |
| s3  | nan        | POINT (4.5 2)   |

Keeps the shops as the index.

6 `gpd.sjoin(districts, shops, how="inner", op="contains")`

|    | geometry                              | index_right |
|----|---------------------------------------|-------------|
| d1 | `POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))` | s1          |
| d1 | `POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0))` | s12         |
| d2 | `POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))` | s12         |
| d2 | `POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1))` | s2          |

Close to what we obtain with `how="left"`: it keeps the districts as the index, but drops the rows with NaN (here d3, which contains no shop).

Answer to the question

    gpd.sjoin(districts, shops, how="left", op="contains") \
    .reset_index()\
    .rename(columns={"index": "districts"})\
    .groupby(["districts"])\
    .agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
    .astype(str)\
    .replace("[nan]", "")

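| districts | nshops | lshops         |
|-----------|--------|----------------|
| d1        | 2      | ['s1' 's12']   |
| d2        | 2      | ['s12' 's2']   |
| d3        | 0      |                |

This way we can tell whether each polygon has any matching points: `nshops` is 0 for a district without shops, and a shop lying in several overlapping districts is counted once in each of them.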
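
If only a count or a yes/no flag per district is needed, the same left join can be reduced more directly by grouping on the index. The sketch below is one possible alternative, not the original answer's method. It assumes geopandas 0.10 or later (`predicate` instead of `op`); the variable `shops_per_district` and the column `has_shop` are illustrative names, and the geometries are re-created so that the block runs on its own.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# Left join: every district is kept, one row per (district, shop) match
joined = gpd.sjoin(districts, shops, how="left", predicate="contains")

# Group on the district index; nunique ignores NaN, so empty districts count 0
shops_per_district = joined.groupby(level=0)["index_right"].nunique()

# Boolean flag aligned with the original districts frame
districts["has_shop"] = shops_per_district.reindex(districts.index) > 0

print(shops_per_district)
print(districts["has_shop"])
```

Grouping on `level=0` avoids the `reset_index` and `rename` steps, at the cost of not producing the list of shop names per district.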

Answer 1, dated 2021-10-14 13:08:04