Spatial join in geopandas when overlapping polygons?
I have two datasets, one with points (shops) and one with polygons (districts).
The districts dataset sometimes has overlapping polygons (because I have buffered them).
I want to know whether each polygon has any matching points.
joined = geopandas.sjoin(districts,shops, op='contains', how='inner')
joined
The above code probably gives me only one of the matching polygons. How do I check each polygon?
gpd.sjoin(districts, shops, how="left", op="contains") \
.reset_index()\
.rename(columns={"index": "districts"})\
.groupby(["districts"])\
.agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
.astype(str)\
.replace("[nan]", "")
Let's say that we have 3 districts (d1, d2 and d3). Two districts overlap (d1 and d2). We have 4 shops: s1 is inside district d1, s2 is inside district d2, s12 is inside both d1 and d2, and s3 is not in any district.
We generate this geometry in Python using shapely:
from shapely.geometry import Point
from shapely.geometry import Polygon
import matplotlib.pyplot as plt
# Create Polygons for the districts
d1 = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
d2 = Polygon([(1, 1), (4, 1), (4, 4), (1, 4)])
d3 = Polygon([(5, 2), (6, 2), (6, 3), (5, 3)])
# Create Points for the shops
s1 = Point(0.5, 0.5)
s2 = Point(3.5, 3.5)
s3 = Point(4.5, 2)
# This shop is in district 1 and district 2.
s12 = Point(2, 2)
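Before bringing in GeoPandas, the intended layout can be checked with shapely alone. The following is a minimal sketch; the dictionary names `district_polys`, `shop_points` and `containing` are only illustrative and do not appear in the rest of the answer.

```python
from shapely.geometry import Point, Polygon

# Same geometry as above, keyed by name (illustrative dictionaries)
district_polys = {
    "d1": Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
    "d2": Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
    "d3": Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
}
shop_points = {
    "s1": Point(0.5, 0.5),
    "s12": Point(2, 2),
    "s2": Point(3.5, 3.5),
    "s3": Point(4.5, 2),
}

# For every shop, list all districts whose polygon contains it
containing = {
    shop: [name for name, poly in district_polys.items() if poly.contains(pt)]
    for shop, pt in shop_points.items()
}
print(containing)
# {'s1': ['d1'], 's12': ['d1', 'd2'], 's2': ['d2'], 's3': []}
```

This brute-force loop checks every shop against every district, so it is only meant as a sanity check on small data.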
Saving the geometry into GeoPandas GeoDataFrames and plotting with matplotlib, we can have a look at the configuration:
import geopandas as gpd
import matplotlib.pyplot as plt
districts = gpd.GeoDataFrame(index=['d1', 'd2', 'd3'], geometry=[d1, d2, d3])
shops = gpd.GeoDataFrame(index=['s1', 's12', 's2', 's3'], geometry=[s1, s12, s2, s3])
ax = districts.boundary.plot()
shops.plot(ax=ax, color='red')
plt.show()
Now let's have a look at how spatial joins work in GeoPandas. We have to be careful with the order of the dataframes because the operation is not commutative. Meaning
gpd.sjoin(shops, districts, how="inner", op="contains")
is not equal to
gpd.sjoin(districts, shops, how="inner", op="contains").
gpd.sjoin(shops, districts, how="left", op="contains")
| | geometry | index_right |
|---|---|---|
| s1 | POINT (0.5 0.5) | nan |
| s12 | POINT (2 2) | nan |
| s2 | POINT (3.5 3.5) | nan |
| s3 | POINT (4.5 2) | nan |
The shops are kept as the index and the district column is filled with NaN, since no point contains a polygon.
gpd.sjoin(shops, districts, how="right", op="contains")
| | index_left | geometry |
|---|---|---|
| d1 | nan | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) |
| d2 | nan | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) |
| d3 | nan | POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2)) |
The districts are kept as the index and the shop column is filled with NaN.
gpd.sjoin(shops, districts, how="inner", op="contains")
| | geometry | index_right |
|---|---|---|
This returns an empty dataframe because points can't contain polygons.
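For the shops-first ordering to return matches, the predicate has to be one that a point can satisfy. A minimal sketch, assuming geopandas 0.10 or later (where the `op` keyword used in this answer is named `predicate`) and swapping in `within` instead of `contains`; the names `shops_in_districts` and `pairs` are only illustrative:

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Rebuild the example data so the snippet runs on its own
districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# "within" asks whether each shop lies inside a district,
# so shops can be the left dataframe
shops_in_districts = gpd.sjoin(shops, districts, how="inner", predicate="within")

# One (shop, district) pair per match; s12 appears twice
pairs = sorted(zip(shops_in_districts.index, shops_in_districts["index_right"]))
print(pairs)
# [('s1', 'd1'), ('s12', 'd1'), ('s12', 'd2'), ('s2', 'd2')]
```

The overlapping districts are not collapsed: the shop s12 is reported once for d1 and once for d2.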
gpd.sjoin(districts, shops, how="left", op="contains")
| | geometry | index_right |
|---|---|---|
| d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s1 |
| d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s12 |
| d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s12 |
| d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s2 |
| d3 | POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2)) | nan |
The districts are kept as the index, with one row per matching shop; d3 has no shop and gets NaN.
gpd.sjoin(districts, shops, how="right", op="contains")
| | index_left | geometry |
|---|---|---|
| s1 | d1 | POINT (0.5 0.5) |
| s12 | d1 | POINT (2 2) |
| s12 | d2 | POINT (2 2) |
| s2 | d2 | POINT (3.5 3.5) |
| s3 | nan | POINT (4.5 2) |
The shops are kept as the index, with one row per containing district; s3 is in no district and gets NaN.
gpd.sjoin(districts, shops, how="inner", op="contains")
| | geometry | index_right |
|---|---|---|
| d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s1 |
| d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s12 |
| d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s12 |
| d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s2 |
This is close to what we obtain with how="left": the districts are kept as the index, but the rows with NaN are dropped.
gpd.sjoin(districts, shops, how="left", op="contains") \
.reset_index()\
.rename(columns={"index": "districts"})\
.groupby(["districts"])\
.agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
.astype(str)\
.replace("[nan]", "")
| districts | nshops | lshops |
|---|---|---|
| d1 | 2 | ['s1' 's12'] |
| d2 | 2 | ['s12' 's2'] |
| d3 | 0 | |
This way we can tell whether each polygon has any matching points.
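If only a yes/no answer per district is needed, a boolean column may be simpler than the aggregated strings. The following is a sketch rather than a definitive recipe, assuming geopandas 0.10 or later (where `op` is spelled `predicate`); the column names `has_shop` and `n_shops` are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Rebuild the example data so the snippet runs on its own
districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# Inner join: one row per (district, shop) match, districts as index
joined = gpd.sjoin(districts, shops, how="inner", predicate="contains")

# A district has a shop if its label appears in the join result
districts["has_shop"] = districts.index.isin(joined.index)

# Number of matched shops per district, with 0 for unmatched districts
districts["n_shops"] = (
    joined.groupby(level=0).size().reindex(districts.index, fill_value=0)
)

print(districts[["has_shop", "n_shops"]])
```

Here d1 and d2 each come out with two shops and d3 with none, and the count stays numeric instead of being converted to a string.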