
How to do a point in polygon query efficiently using geopandas?

I have a shapefile that has all the counties for the US, and I am doing a bunch of queries at a lat/lon point and then finding what county the point lies in. Right now I am just looping through all the counties and doing pnt.within(county). This isn't very efficient. Is there a better way to do this?

Your situation looks like a typical case where spatial joins are useful. The idea of spatial joins is to merge data using geographic coordinates instead of using attributes.

geopandas supports three spatial predicates for the join:

  • intersects
  • within
  • contains
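To make the difference between the three predicates concrete, here is a minimal sketch using shapely geometries directly (shapely is the library geopandas builds on). The square and the two points are made-up values for illustration.

```python
from shapely.geometry import Point, Polygon

# A unit square and two test points (made-up coordinates).
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
inside = Point(0.5, 0.5)
on_edge = Point(1.0, 0.5)

# within and contains are mirror images of each other.
inside_within = inside.within(square)
square_contains = square.contains(inside)

# A point on the boundary intersects the square but is not within it.
edge_intersects = on_edge.intersects(square)
edge_within = on_edge.within(square)

print(inside_within, square_contains, edge_intersects, edge_within)
```

So for points that may sit exactly on a polygon border, intersects and within can give different answers.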

It seems like you want within, which is possible using the following syntax (the keyword is predicate in geopandas 0.10 and later; older releases call it op):

geopandas.sjoin(points, polygons, how="inner", predicate='within')

Note: Older geopandas releases need the rtree package to perform such operations; if it is missing, install it with pip or conda. Recent releases rely on the spatial index built into shapely 2 instead.
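Applied to the situation in the question, the join might look like the sketch below. The two square "counties", the column names (county, point_id) and the coordinates are hypothetical stand-ins for the real shapefile and query points, and predicate= assumes geopandas 0.10 or later.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical stand-in for the county shapefile: two square "counties".
counties = geopandas.GeoDataFrame(
    {"county": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Query points as (lon, lat); the last one falls outside both squares.
lonlat = [(0.5, 0.5), (1.5, 0.2), (5.0, 5.0)]
points = geopandas.GeoDataFrame(
    {"point_id": [0, 1, 2]},
    geometry=[Point(lon, lat) for lon, lat in lonlat],
    crs="EPSG:4326",
)

# One vectorised join instead of a Python loop over every county.
# predicate= assumes geopandas 0.10+; older releases spell it op=.
joined = geopandas.sjoin(points, counties, how="inner", predicate="within")
print(joined[["point_id", "county"]])
```

With a real shapefile, counties would come from geopandas.read_file(...) instead of being built by hand.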

Example

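As an example, let's plot European cities. The two example datasets, naturalearth_lowres and naturalearth_cities, ship with older geopandas releases (the geopandas.datasets module was removed in geopandas 1.0):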

import geopandas
import matplotlib.pyplot as plt

world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
countries = world[world['continent'] == "Europe"].rename(columns={'name':'country'})

countries.head(2)
      pop_est  continent  country  iso_a3  gdp_md_est  geometry
18  142257519     Europe   Russia     RUS   3745000.0  MULTIPOLYGON (((178.725 71.099, 180.000 71.516...
21    5320045     Europe   Norway     -99    364700.0  MULTIPOLYGON (((15.143 79.674, 15.523 80.016, ...

cities.head(2)
           name                   geometry
0  Vatican City  POINT (12.45339 41.90328)
1    San Marino  POINT (12.44177 43.93610)

cities is a worldwide dataset and countries is a Europe-wide dataset.

Both datasets need to be in the same coordinate reference system (CRS). If they are not, reproject one of them with .to_crs before joining.
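A minimal sketch of that CRS check, with made-up layers: the polygon layer is deliberately converted to Web Mercator (EPSG:3857) to create a mismatch, and the points are then reprojected to match. The column names name and region are illustrative only.

```python
import geopandas
from shapely.geometry import Point, box

# Made-up layers: points in lon/lat (EPSG:4326), and one polygon that is
# deliberately converted to Web Mercator (EPSG:3857) to create a mismatch.
points = geopandas.GeoDataFrame(
    {"name": ["p0"]}, geometry=[Point(10.0, 50.0)], crs="EPSG:4326"
)
polygons = geopandas.GeoDataFrame(
    {"region": ["r0"]}, geometry=[box(9.0, 49.0, 11.0, 51.0)], crs="EPSG:4326"
).to_crs("EPSG:3857")

# Bring the points into the polygons' CRS only when the two differ.
if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

# predicate= assumes geopandas 0.10+; older releases spell it op=.
joined = geopandas.sjoin(points, polygons, how="inner", predicate="within")
print(joined[["name", "region"]])
```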

# On geopandas older than 0.10, pass op='within' instead of predicate='within'.
data_merged = geopandas.sjoin(cities, countries, how="inner", predicate='within')

Finally, let's plot the result on a map:

f, ax = plt.subplots(1, figsize=(20,10))
# The keyword for the target axes is ax, not axes.
data_merged.plot(ax=ax)
countries.plot(ax=ax, alpha=0.25, linewidth=0.1)
plt.show()


The joined dataset combines the information we need from both layers:

data_merged.head(5)

             name                   geometry  index_right   pop_est  continent  country  iso_a3  gdp_md_est
0    Vatican City  POINT (12.45339 41.90328)          141  62137802     Europe    Italy     ITA   2221000.0
1      San Marino  POINT (12.44177 43.93610)          141  62137802     Europe    Italy     ITA   2221000.0
192          Rome  POINT (12.48131 41.89790)          141  62137802     Europe    Italy     ITA   2221000.0
2           Vaduz   POINT (9.51667 47.13372)          114   8754413     Europe  Austria     AUT    416600.0
184        Vienna  POINT (16.36469 48.20196)          114   8754413     Europe  Austria     AUT    416600.0

Here I used an inner join, but you can change the how parameter: how="left", for instance, keeps all points, including those not within any polygon.
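As a small sketch of the left-join variant, with made-up data (one square polygon, one point inside it and one outside): unmatched points stay in the result with NaN in the polygon columns, which makes them easy to filter. As before, predicate= assumes geopandas 0.10 or later.

```python
import geopandas
from shapely.geometry import Point, box

# Made-up data: one square polygon, one point inside it and one outside.
polygons = geopandas.GeoDataFrame(
    {"region": ["r0"]}, geometry=[box(0, 0, 1, 1)], crs="EPSG:4326"
)
points = geopandas.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(0.5, 0.5), Point(3.0, 3.0)],
    crs="EPSG:4326",
)

# how="left" keeps every point; unmatched rows get NaN in the polygon columns.
joined = geopandas.sjoin(points, polygons, how="left", predicate="within")
unmatched = joined[joined["region"].isna()]
print(unmatched["name"].tolist())
```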

