
How to do a point in polygon query efficiently using geopandas?

I have a shapefile that contains all the counties in the US, and I run many queries in which I take a lat/lon point and find the county it lies in. Right now I just loop through all the counties and call pnt.within(county). This isn't very efficient. Is there a better way to do this?

Your situation looks like a typical case where spatial joins are useful. The idea of a spatial join is to merge data based on geographic location instead of on attribute values.

Three common possibilities in geopandas:

  • intersects
  • within
  • contains
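To make the difference between these predicates concrete, here is a minimal sketch using shapely geometries directly. The square and the points are made-up illustrative values, not part of the original answer.

```python
from shapely.geometry import Point, Polygon

# Hypothetical unit square and test points (illustrative values only).
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
inside = Point(0.5, 0.5)
corner = Point(0, 0)      # lies exactly on the boundary
outside = Point(2.0, 2.0)

print(inside.within(square))      # the point is in the polygon's interior
print(square.contains(inside))    # the same relation, seen from the polygon
print(inside.intersects(square))  # the geometries share at least one point
print(corner.within(square))      # boundary points do not count as "within"
print(corner.intersects(square))  # but they do intersect the polygon
print(outside.intersects(square)) # no shared point at all
```

The boundary case is the practical difference: intersects accepts points that sit exactly on a polygon edge, while within does not.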

It seems like you want within, which is possible using the following syntax (geopandas versions before 0.10 name the last argument op instead of predicate):

geopandas.sjoin(points, polygons, how="inner", predicate="within")
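As a rough sketch of how this maps onto the county lookup in the question, the example below replaces the Python loop with a single sjoin call. The two rectangular "counties", the column names county and point_id, and the coordinates are all hypothetical stand-ins for the real shapefile and query points.

```python
import geopandas
from shapely.geometry import Point, box

# Two hypothetical rectangular "counties" (made-up shapes, not real data).
counties = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Query points as (lon, lat); the last one falls outside both rectangles.
points = geopandas.GeoDataFrame(
    {"point_id": [1, 2, 3]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5.0, 5.0)],
    crs="EPSG:4326",
)

# One vectorized call replaces the loop over every county.
# geopandas versions before 0.10 spell this argument `op`.
matched = geopandas.sjoin(points, counties, how="inner", predicate="within")
print(matched[["point_id", "county"]])
```

With how="inner", the point that lies in no county is simply dropped from the result.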

Note: Depending on your geopandas version, you may need to have rtree installed to perform such operations. If you need to install this dependency, use pip or conda.

Example

As an example, let's plot European cities. The two example datasets are loaded as follows:

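import geopandas
import matplotlib.pyplot as plt

# Note: the geopandas.datasets module is only available in geopandas < 1.0
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
# Keep European countries only; rename 'name' so it does not clash with cities['name']
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})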

countries.head(2)
      pop_est continent country iso_a3  gdp_md_est                                           geometry
18  142257519    Europe  Russia    RUS   3745000.0  MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21    5320045    Europe  Norway    -99    364700.0  MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...

cities.head(2)
           name                   geometry
0  Vatican City  POINT (12.45339 41.90328)
1    San Marino  POINT (12.44177 43.93610)

cities is a worldwide dataset and countries is a Europe-wide dataset.

Both datasets need to be in the same coordinate reference system (CRS). If they are not, use .to_crs before merging.
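A minimal sketch of that check, assuming two small made-up GeoDataFrames (the names polygons, points, zone, and point_id are hypothetical): the points are deliberately stored in Web Mercator so that the CRS mismatch has to be resolved before the join.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical polygon layer in WGS84 (lon/lat).
polygons = geopandas.GeoDataFrame(
    {"zone": ["Z"]}, geometry=[box(0, 0, 10, 10)], crs="EPSG:4326"
)

# Hypothetical point layer, deliberately stored in Web Mercator (EPSG:3857).
points = geopandas.GeoDataFrame(
    {"point_id": [1]}, geometry=[Point(5, 5)], crs="EPSG:4326"
).to_crs("EPSG:3857")

# Reproject the points only when the two layers disagree.
if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

joined = geopandas.sjoin(points, polygons, how="inner", predicate="within")
print(joined[["point_id", "zone"]])
```

Reprojecting the smaller layer (usually the points) to match the polygons is a design choice that keeps the polygon geometries untouched.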

data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')

Finally, to see the result, let's draw a map:

f, ax = plt.subplots(1, figsize=(20,10))
# The keyword for passing a matplotlib Axes to GeoDataFrame.plot is `ax`
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()


The underlying dataset merges together the information we need:

data_merged.head(5)

             name                   geometry  index_right   pop_est continent  country iso_a3  gdp_md_est
0    Vatican City  POINT (12.45339 41.90328)          141  62137802    Europe    Italy    ITA   2221000.0
1      San Marino  POINT (12.44177 43.93610)          141  62137802    Europe    Italy    ITA   2221000.0
192          Rome  POINT (12.48131 41.89790)          141  62137802    Europe    Italy    ITA   2221000.0
2           Vaduz   POINT (9.51667 47.13372)          114   8754413    Europe  Austria    AUT    416600.0
184        Vienna  POINT (16.36469 48.20196)          114   8754413    Europe  Austria    AUT    416600.0

Here, I used the inner join method, but that's a parameter you can change: for instance, how="left" keeps all points, including those not within a polygon.
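To illustrate the effect of the how parameter, here is a small sketch with made-up data (the names polygons, points, country, and name are hypothetical). A left join keeps every point and leaves the polygon columns empty for points that match nothing.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical single polygon and two points, one inside and one outside.
polygons = geopandas.GeoDataFrame({"country": ["X"]}, geometry=[box(0, 0, 1, 1)])
points = geopandas.GeoDataFrame(
    {"name": ["in", "out"]},
    geometry=[Point(0.5, 0.5), Point(3, 3)],
)

# how="left" keeps all rows of the left frame (the points).
left = geopandas.sjoin(points, polygons, how="left", predicate="within")

# Points that fall in no polygon have missing values in the joined columns.
unmatched = left[left["country"].isna()]
print(unmatched["name"].tolist())
```

Filtering on the missing values afterwards is one way to find the points that fall outside every polygon.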
