How to do a point in polygon query efficiently using geopandas?
I have a shapefile that contains all the counties in the US, and I am running a batch of queries: for each lat/lon point, I need to find the county the point lies in. Right now I am just looping through all the counties and calling pnt.within(county). This isn't very efficient. Is there a better way to do this?
Your situation looks like a typical case where spatial joins are useful. The idea of a spatial join is to merge data based on geographic coordinates rather than on attributes.
Three possibilities in geopandas:
intersects
within
contains
It seems like you want within, which is possible using the following syntax:
# geopandas >= 0.10 names this keyword `predicate`; older releases used `op`
geopandas.sjoin(points, polygons, how="inner", predicate='within')
Note: On older geopandas releases, you need rtree installed to perform such operations (recent releases rely on the spatial index built into Shapely 2.0 instead). If you need to install this dependency, use pip or conda.
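To make the syntax concrete, here is a minimal sketch on toy data. The two square "counties" and the three query points are made up for illustration, as are the column names county and point_id; the example also assumes geopandas 0.10 or later, where the keyword is predicate.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical toy data: two square "counties" side by side.
polygons = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Three query points: one in each square, and one outside both.
points = geopandas.GeoDataFrame(
    {"point_id": [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# One spatial join replaces the Python loop over every polygon.
# An inner join keeps only the points that fall within some polygon.
joined = geopandas.sjoin(points, polygons, how="inner", predicate="within")
print(joined[["point_id", "county"]])
```

The join uses a spatial index internally, so each point is only tested against polygons whose bounding boxes could contain it, rather than against every county.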
As an example, let's plot European cities. The two example datasets are:
import geopandas
import matplotlib.pyplot as plt
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})
countries.head(2)
      pop_est  continent  country  iso_a3  gdp_md_est  geometry
18  142257519     Europe   Russia     RUS   3745000.0  MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21    5320045     Europe   Norway     -99    364700.0  MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...
cities.head(2)
           name  geometry
0  Vatican City  POINT (12.45339 41.90328)
1    San Marino  POINT (12.44177 43.93610)
cities is a worldwide dataset and countries is a Europe-wide dataset.
Both datasets need to be in the same projection system. If they are not, use .to_crs before merging.
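The projection check can be sketched as follows. The data here is hypothetical (a single point and a single rectangular region, with made-up column names), and the polygons are deliberately stored in Web Mercator so that the mismatch has to be resolved before the join.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical point in geographic coordinates (lon/lat, EPSG:4326).
points = geopandas.GeoDataFrame(
    {"point_id": [1]},
    geometry=[Point(12.48, 41.90)],
    crs="EPSG:4326",
)

# Hypothetical region, deliberately stored in Web Mercator (EPSG:3857).
polygons = geopandas.GeoDataFrame(
    {"region": ["R"]},
    geometry=[box(10, 40, 15, 45)],
    crs="EPSG:4326",
).to_crs("EPSG:3857")

# Bring both layers into the same CRS before joining.
if points.crs != polygons.crs:
    polygons = polygons.to_crs(points.crs)

joined = geopandas.sjoin(points, polygons, how="inner", predicate="within")
print(joined[["point_id", "region"]])
```

Reprojecting the polygons rather than the points is an arbitrary choice in this sketch; either direction works as long as both layers end up in the same CRS.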
# `predicate` replaces the older `op` keyword (geopandas >= 0.10)
data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')
Finally, to see the result, let's draw a map:
f, ax = plt.subplots(1, figsize=(20,10))
# GeoDataFrame.plot takes the target matplotlib axes through the `ax` keyword
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()
The underlying dataset merges together the information we need:
data_merged.head(5)
             name                   geometry  index_right   pop_est  continent  country  iso_a3  gdp_md_est
0    Vatican City  POINT (12.45339 41.90328)          141  62137802     Europe    Italy     ITA   2221000.0
1      San Marino  POINT (12.44177 43.93610)          141  62137802     Europe    Italy     ITA   2221000.0
192          Rome  POINT (12.48131 41.89790)          141  62137802     Europe    Italy     ITA   2221000.0
2           Vaduz   POINT (9.51667 47.13372)          114   8754413     Europe  Austria     AUT    416600.0
184        Vienna  POINT (16.36469 48.20196)          114   8754413     Europe  Austria     AUT    416600.0
Here, I used the inner join method, but that's a parameter you can change if, for instance, you want to keep all points, including those not within a polygon.
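As a rough illustration of changing that parameter, the sketch below uses how="left" on hypothetical toy data (the squares, points, and column names are made up). With a left join every point is kept, and the columns coming from the polygons are left empty (NaN) for points that fall in no polygon.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical toy data: two squares and three points, one of them outside.
polygons = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)
points = geopandas.GeoDataFrame(
    {"point_id": [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# how="left" keeps every row of the left frame (the points).
joined = geopandas.sjoin(points, polygons, how="left", predicate="within")

# Points that matched no polygon have a missing value in the polygon columns.
unmatched = joined[joined["county"].isna()]
print(unmatched[["point_id"]])
```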