
How do I select objects within a geographic region in a pandas DataFrame?

I'm trying to select objects within a region from a pandas DataFrame that contains a list of item IDs and lat/lon pairs. Is there a selection method for this? I think this would be similar to this SO question, but using pandas instead of SQL:

Selecting geographical points within area

Here is my table saved in locations.csv

ID,LAT,LON
001,35.00,-75.00
002,35.01,-80.00
...
999,25.76,-64.00

I can load the dataframe, and select a rectangular region:

import pandas as pd
df = pd.read_csv('locations.csv', delimiter=',')
lat_max = 32.323496
lat_min = 25.712767
lon_max = -72.863358
lon_min = -74.729456
# Combine all four conditions into one boolean mask instead of chaining selections
small_df = df[(df['LAT'] > lat_min) & (df['LAT'] < lat_max) &
              (df['LON'] > lon_min) & (df['LON'] < lon_max)]

How would I select objects within an irregular region?

How would I structure the dataframe selection command?

I can build a lambda function that returns True when a LAT and LON pair lies within the region, but I'm not sure how to use it with a pandas DataFrame.
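For reference, a row-wise predicate like that can be turned into a boolean mask with DataFrame.apply. This is only a minimal sketch: the in_region function, its disc-shaped region, and the sample rows are made up for illustration and are not from the original data.

```python
import pandas as pd

# Hypothetical sample rows standing in for locations.csv
df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'LAT': [26.0, 30.0, 31.5, 35.0],
    'LON': [-73.0, -74.0, -73.5, -75.0],
})

def in_region(lat, lon):
    """Hypothetical non-rectangular region: a disc of radius 3 degrees
    centred on LAT 29.0, LON -73.5."""
    return (lat - 29.0) ** 2 + (lon + 73.5) ** 2 < 3.0 ** 2

# Evaluate the predicate once per row; the result is a boolean Series
mask = df.apply(lambda row: in_region(row['LAT'], row['LON']), axis=1)

# Use the boolean Series to select the matching rows
small_df = df[mask]
print(small_df)
```

Row-wise apply is simple but slow on large tables; a predicate written with vectorized column arithmetic would scale better.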

One way to select points within a region, as the working code below does, is to start by creating two GeoDataFrames. The first contains the polygon, and the second contains all the points to be spatially joined with the first. The spatial join predicate `within` selects the points that fall inside the polygon. The result of the operation is also a GeoDataFrame, containing only the points that fall within the area of the polygon.

The content of locations.csv: 6 lines, including the column headers. Note: there are no spaces in the header row.

ID,LAT,LON
1, 15.1, 10.0
2, 15.2, 15.1
3, 15.3, 20.2
4, 15.4, 25.3
5, 15.5, 30.4

The code:

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
from shapely.wkt import loads

# Create a GeoDataFrame `polygon_df` with a single polygon row.
# This polygon will be used to select points in another GeoDataFrame.
d = {'poly_id': [1], 'wkt': ['POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))']}
df = pd.DataFrame(data=d)
geometry = [loads(pgon) for pgon in df.wkt]
polygon_df = gpd.GeoDataFrame(df, crs='EPSG:4326', geometry=geometry)

# One can plot this polygon with the command:
# polygon_df.plot()

# Read the file with `pandas`
locs = pd.read_csv('locations.csv', sep=',')

# Turn it into a GeoDataFrame named `geo_locs`, with one Point(LON, LAT) per row
locs_geom = [Point(xy) for xy in zip(locs.LON, locs.LAT)]
geo_locs = gpd.GeoDataFrame(locs, crs='EPSG:4326', geometry=locs_geom)

# Do a spatial join of the points within the polygon; the result is the `pts_in_poly` GeoDataFrame.
# (geopandas releases before 0.10 spell this argument `op='within'`.)
pts_in_poly = gpd.sjoin(geo_locs, polygon_df, predicate='within', how='inner')

# Print the ID of the points that fall within the polygon.
print(pts_in_poly.ID)

# The output will be:
#2    3
#3    4
#4    5
#Name: ID, dtype: int64

# Plot the polygon and all the points.
ax1 = polygon_df.plot(color='lightgray', zorder=1)
geo_locs.plot(ax=ax1, zorder=5, color="red")

The output plot: [image: the polygon in light gray with all the points in red]

In the plot, the points with IDs 3, 4, and 5 fall within the polygon.
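If geopandas is not installed, the same selection can be sketched with pandas and shapely alone, which also shows how a lambda fits into a DataFrame selection. This is a minimal sketch, not part of the code above: it rebuilds the five sample rows inline instead of reading locations.csv, and it tests each row with shapely's `Polygon.contains`.

```python
import pandas as pd
from shapely.geometry import Point
from shapely.wkt import loads

# Same polygon as in the geopandas example
polygon = loads('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')

# Same five rows as locations.csv, built inline so the sketch is self-contained
locs = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'LAT': [15.1, 15.2, 15.3, 15.4, 15.5],
    'LON': [10.0, 15.1, 20.2, 25.3, 30.4],
})

# Build a boolean mask: True where the point (x=LON, y=LAT) lies inside the polygon
mask = locs.apply(lambda row: polygon.contains(Point(row['LON'], row['LAT'])), axis=1)

# Select the rows inside the polygon
inside = locs[mask]
print(inside.ID)
```

This checks one point at a time, so for large tables the spatial join is likely to be the faster choice.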
