I have two datasets, one with points (shops) and one with polygons (districts).
The districts dataset sometimes has overlapping polygons (as I have buffered them).
I want to know whether each polygon has any matching points.
joined = geopandas.sjoin(districts,shops, op='contains', how='inner')
joined
The code above probably gives me only one of the matching polygons. How do I check each polygon?
gpd.sjoin(districts, shops, how="left", op="contains") \
.reset_index()\
.rename(columns={"index": "districts"})\
.groupby(["districts"])\
.agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
.astype(str)\
.replace("[nan]", "")
Let's say that we have 3 districts (d1, d2 and d3). Two districts overlap (d1 and d2). We have 4 shops: s1 is inside district d1 only, s2 is inside district d2 only, s12 is inside both d1 and d2, and s3 is not in any district.
We generate this geometry in Python using shapely:
from shapely.geometry import Point
from shapely.geometry import Polygon
import matplotlib.pyplot as plt
# Create Polygons for the districts
d1 = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
d2 = Polygon([(1, 1), (4, 1), (4, 4), (1, 4)])
d3 = Polygon([(5, 2), (6, 2), (6, 3), (5, 3)])
# Create Points for the shops
s1 = Point(0.5, 0.5)
s2 = Point(3.5, 3.5)
s3 = Point(4.5, 2)
# This shop is in district 1 and district 2.
s12 = Point(2, 2)
After storing the geometries in GeoPandas GeoDataFrames, we can plot the configuration with matplotlib:
import geopandas as gpd
import matplotlib.pyplot as plt
districts = gpd.GeoDataFrame(index=['d1', 'd2', 'd3'], geometry=[d1, d2, d3])
shops = gpd.GeoDataFrame(index=['s1', 's12', 's2', 's3'], geometry=[s1, s12, s2, s3])
ax = districts.boundary.plot()
shops.plot(ax=ax, color='red')
plt.show()
Now let's look at how spatial joins work in GeoPandas. The order of the dataframes matters because the operation is not commutative: gpd.sjoin(shops, districts, how="inner", op="contains") is not the same as gpd.sjoin(districts, shops, how="inner", op="contains"). The predicate is always evaluated as "left geometry contains right geometry".
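The asymmetry comes from the predicate itself and can be checked on the raw shapely geometries, without any join. A minimal sketch, reusing d1 and s12 from above (redefined here so that it runs on its own); the three variable names are chosen only for illustration:

```python
from shapely.geometry import Point, Polygon

d1 = Polygon([(0, 0), (3, 0), (3, 3), (0, 3)])
s12 = Point(2, 2)

# "contains" is evaluated from the left geometry towards the right one.
polygon_contains_point = d1.contains(s12)   # polygon on the left
point_contains_polygon = s12.contains(d1)   # point on the left
# "within" is the inverse relation of "contains".
point_within_polygon = s12.within(d1)

print(polygon_contains_point, point_contains_polygon, point_within_polygon)
# True False True
```

So with "contains" the polygons have to be the left dataframe; with the points on the left, the matching predicate would be "within".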
gpd.sjoin(shops, districts, how="left", op="contains")
shop | geometry | index_right |
---|---|---|
s1 | POINT (0.5 0.5) | nan |
s12 | POINT (2 2) | nan |
s2 | POINT (3.5 3.5) | nan |
s3 | POINT (4.5 2) | nan |
This keeps the shops as the index and fills the district column with NaN, because no point contains a polygon.
gpd.sjoin(shops, districts, how="right", op="contains")
district | index_left | geometry |
---|---|---|
d1 | nan | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) |
d2 | nan | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) |
d3 | nan | POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2)) |
This keeps the districts as the index and fills the shop column with NaN.
gpd.sjoin(shops, districts, how="inner", op="contains")
shop | geometry | index_right |
---|---|---|
This returns an empty dataframe because points can't contain polygons.
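If the shops have to stay on the left, the inverse predicate "within" gives the matching pairs instead of an empty result. A sketch, assuming a recent GeoPandas release in which the keyword is spelled `predicate` (older releases use `op`, as in the calls in this answer); the data are rebuilt here so that the block runs on its own, and `pairs` is a name chosen for illustration:

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# Shops on the left, so ask "is the shop within the district?".
pairs = gpd.sjoin(shops, districts, how="inner", predicate="within")
print(pairs[["index_right"]])
```

The shop s12 appears twice, once per overlapping district, and s3 is dropped because it lies in no district.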
gpd.sjoin(districts, shops, how="left", op="contains")
district | geometry | index_right |
---|---|---|
d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s1 |
d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s12 |
d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s12 |
d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s2 |
d3 | POLYGON ((5 2, 6 2, 6 3, 5 3, 5 2)) | nan |
This keeps the districts as the index, with one row per matching shop; d3, which contains no shop, gets NaN.
gpd.sjoin(districts, shops, how="right", op="contains")
shop | index_left | geometry |
---|---|---|
s1 | d1 | POINT (0.5 0.5) |
s12 | d1 | POINT (2 2) |
s12 | d2 | POINT (2 2) |
s2 | d2 | POINT (3.5 3.5) |
s3 | nan | POINT (4.5 2) |
This keeps the shops as the index.
gpd.sjoin(districts, shops, how="inner", op="contains")
district | geometry | index_right |
---|---|---|
d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s1 |
d1 | POLYGON ((0 0, 3 0, 3 3, 0 3, 0 0)) | s12 |
d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s12 |
d2 | POLYGON ((1 1, 4 1, 4 4, 1 4, 1 1)) | s2 |
This is close to the result with how="left": the districts stay as the index, but rows with NaN (districts without any shop) are dropped.
gpd.sjoin(districts, shops, how="left", op="contains") \
.reset_index()\
.rename(columns={"index": "districts"})\
.groupby(["districts"])\
.agg(nshops=("index_right", "nunique"), lshops=("index_right", "unique"))\
.astype(str)\
.replace("[nan]", "")
districts | nshops | lshops |
---|---|---|
d1 | 2 | ['s1' 's12'] |
d2 | 2 | ['s12' 's2'] |
d3 | 0 | |
This way we can tell whether each polygon has any matching points, and which ones.
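If only a yes/no flag or a count per district is needed, the groupby can be skipped. A sketch rather than the one right way to do it, assuming a recent GeoPandas release in which the keyword is spelled `predicate` (older releases use `op`); the column names `has_shop` and `n_shops` are made up for this example:

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

districts = gpd.GeoDataFrame(
    index=["d1", "d2", "d3"],
    geometry=[
        Polygon([(0, 0), (3, 0), (3, 3), (0, 3)]),
        Polygon([(1, 1), (4, 1), (4, 4), (1, 4)]),
        Polygon([(5, 2), (6, 2), (6, 3), (5, 3)]),
    ],
)
shops = gpd.GeoDataFrame(
    index=["s1", "s12", "s2", "s3"],
    geometry=[Point(0.5, 0.5), Point(2, 2), Point(3.5, 3.5), Point(4.5, 2)],
)

# The inner join yields one row per (district, shop) match.
joined = gpd.sjoin(districts, shops, how="inner", predicate="contains")

# A district has a shop if its label appears in the join result.
districts["has_shop"] = districts.index.isin(joined.index)
# Count the matches per district; districts with no match get 0.
districts["n_shops"] = (
    joined.index.value_counts().reindex(districts.index, fill_value=0)
)

print(districts[["has_shop", "n_shops"]])
```

Each overlapping district is evaluated independently, so s12 counts for both d1 and d2.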