简体   繁体   中英

Checking if points fall into shapefile (polygons and points are in different unit)

I'm trying to create a mask following this question's answer , so that I can color areas inside my shape file only. I will post the code example here (which is pretty much the same with the code in my previous question, but with the solution code included).

A quick summary of my code: It's a data of (lon, lat, tem). I transform the lon and lat into ccrs.PlateCaree() coordinate system (which is in meter and not degree anymore), and put the transformed coordinates in xp, yp. The 3 variables xp, yp, and tem are then interpolated and then I use pmeshcolor() to draw a color map (the same color map in my previous question that I left the link above). Now I want to limit the color map to be inside the shape file only (which is the grey lines). Code is as below:

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import matplotlib.colors as clr
import pandas as pd
import numpy as np

from metpy.interpolate import interpolate_to_grid, remove_nan_observations
from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature

canada_east = -95
canada_west = -101.8
canada_north = 52.8
canada_south = 48.85

central_lon = (canada_east + canada_west)/2
central_lat = (canada_north + canada_south)/2

crs = ccrs.LambertConformal(central_longitude = central_lon, central_latitude = central_lat)
lat = np.array([49.8134 50.904  50.698  49.095  49.436  49.9607 49.9601 49.356  50.116

49.402  52.3472 50.411  49.24   49.876  49.591  49.905  49.498  49.088
49.118  50.5947 49.3776 49.148  49.1631 51.358  49.826  50.4324 49.96
49.68   49.875  50.829  51.572])
lon = np.array([-100.3721  -97.273   -99.068   -97.528  -100.308   -98.9054  -98.6367
-99.248   -96.434  -100.93   -101.1099 -100.893  -100.055   -99.909
-97.518   -99.354   -98.03    -99.325   -99.054   -98.0035 -100.5387
-100.491   -97.1454 -100.361   -96.776   -99.4392  -97.7463  -97.984
-95.92    -98.111  -100.488])
tem = np.array([-8.45   -4.026  -5.993  -3.68   -7.35   -7.421  -6.477  -8.03   -3.834
-13.04   -4.057  -8.79   -6.619 -10.89   -4.465  -8.41   -4.861  -9.93
-7.125  -4.424 -11.95   -9.56   -3.86   -7.17   -4.193  -7.653  -4.883
-5.631  -3.004  -4.738  -8.81])

xp, yp, _ = crs.transform_points(ccrs.PlateCarree(), lon, lat ).T
xp, yp, tem = remove_nan_observations(xp, yp, tem)

alt_x, alt_y, data = interpolate_to_grid( xp, yp, tem, minimum_neighbors=2, search_radius=240000, interp_type = 'barnes', hres = 1000)

# Create the figure and grid for subplots
fig = plt.figure(figsize=(17, 12))

# Main ax
ax = plt.subplot(111, projection=crs)
ax.set_extent([canada_west, canada_east, canada_south, canada_north], ccrs.PlateCarree())


# Read shape file again using geopandas and create a mask

cat_gdf = geopandas.read_file('MB_AGregion_Perim_South.shp')
cat_gdf = cat_gdf.to_crs(epsg = 4326)
mask = shapely.vectorized.contains(cat_gdf.dissolve().geometry.item(), alt_x, alt_y)

 print(mask)

which gives

[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 ...
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]

Process finished with exit code 0

My alt_x and alt_y is in meter and my shape file geometries are in degree. I believe this is the reason why the mask contains all false value (which means masking everything). However, I'm unable to find a solution to this problem. I would appreciate a lot if someone can help me with this problem.

Here's another MRE. Keeping things very simple - I'm just using the bbox and vector of (x, y) values rather than the mesh you have. But it should be enough to illustrate the issue you're dealing with.

import cartopy.crs as ccrs
import shapely.geometry
import shapely.vectorized
import numpy as np, pandas as pd, geopandas as gpd, matplotlib.pyplot as plt

canada_east = -95
canada_west = -101.8
canada_north = 52.8
canada_south = 48.85

central_lon = (canada_east + canada_west)/2
central_lat = (canada_north + canada_south)/2

crs = ccrs.LambertConformal(
    central_longitude=central_lon,
    central_latitude=central_lat,
)

x = np.arange(-120, -90)
y = np.ones_like(x) * 50

bbox = shapely.geometry.Polygon([
    (canada_west, canada_south),
    (canada_east, canada_south),
    (canada_east, canada_north),
    (canada_west, canada_north),
 ])

gdf = gpd.GeoDataFrame(geometry=[bbox], crs='epsg:4326')

Here's what x, y, and the geodataframe look like. Note that they're all in degrees, techincally WGS84, aka 'epsg:4326' :

In [2]: print(f"x values: {x}\n\ny values: {y}\n\ngeodataframe:\n{gdf}")
x values: [-120 -119 -118 -117 -116 -115 -114 -113 -112 -111 -110 -109 -108 -107
 -106 -105 -104 -103 -102 -101 -100  -99  -98  -97  -96  -95  -94  -93
  -92  -91]

y values: [50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50
 50 50 50 50 50 50]

geodataframe:
                                            geometry
0  POLYGON ((-101.80000 48.85000, -95.00000 48.85...

If we use shapely.vectorized.contains when both the points and the geodataframe are in degrees, we get the expected answer:

In [3]: shapely.vectorized.contains(gdf.dissolve().geometry.item(), x, y)
Out[3]:
array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True, False, False,
       False, False, False])

The CRS PlateCarree() is a plotting CRS which preserves lat/lon values, so you can think of it as equivalent to WGS84 (only difference is PlateCarree has a defined mapping to pixel values). So the following translates our lat/lon values into the CRS of our object CRS , which is LambertConformal :

In [4]: xp, yp, _ = crs.transform_points(ccrs.PlateCarree(), x, y ).T

Now, our xp and yp values are in meters and look very different from the values in the geodataframe:

In [5]: print(f"xp values: {xp}\n\nyp values: {yp}\n\ngeodataframe:\n{gdf}")
xp values: [-1555398.30438114 -1484657.61843652 -1413737.15236547 -1342645.49406745
 -1271391.25217193 -1199983.05499594 -1128429.54949926 -1056739.40023735
  -984921.28831211  -912983.91032072  -840935.97730252  -768786.21368415
  -696543.35622316  -624216.15294999  -551813.36210868  -479343.75109632
  -406816.09540141  -334239.17754116  -261621.78599806  -188972.71415562
  -116300.7592336    -43614.72122269    29076.59818104   101764.396642
   174439.87225098   247094.22459092   319718.65580267   392304.37165025
   464842.58258582   537324.50481399]

yp values: [ 92537.27851361  75810.36405389  59862.8937992   44696.79886026
  30313.91572949  16715.98605863   3904.65644788  -8118.52175353
 -19352.09263517 -29794.69590175 -39445.06703778 -48302.03746074
 -56364.53466258 -63631.58233956 -70102.30051051 -75775.90562338
 -80651.71065011 -84729.12516982 -88007.65544033 -90486.90445793
 -92166.57200545 -93046.45468863 -93126.44596073 -92406.53613545
 -90886.8123881  -88567.45874503 -85448.75606135 -81531.08198695
 -76814.91092072 -71300.81395314]

geodataframe:
                                            geometry
0  POLYGON ((-101.80000 48.85000, -95.00000 48.85...

At this point, if we use contains to find points within the geodataframe, it's clear why we get only False values:

In [6]: shapely.vectorized.contains(gdf.dissolve().geometry.item(), xp, yp)
Out[6]:
array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False])

To convert the geodataframe to our target crs , we can use to_crs :

In [7]: gdf_lambert = gdf.to_crs(crs)

Now, our values line up, and contains works as expected once again:

In [8]: print(f"xp values: {xp}\n\nyp values: {yp}\n\ngdf_lambert:\n{gdf_lambert}")
xp values: [-1555398.30438114 -1484657.61843652 -1413737.15236547 -1342645.49406745
 -1271391.25217193 -1199983.05499594 -1128429.54949926 -1056739.40023735
  -984921.28831211  -912983.91032072  -840935.97730252  -768786.21368415
  -696543.35622316  -624216.15294999  -551813.36210868  -479343.75109632
  -406816.09540141  -334239.17754116  -261621.78599806  -188972.71415562
  -116300.7592336    -43614.72122269    29076.59818104   101764.396642
   174439.87225098   247094.22459092   319718.65580267   392304.37165025
   464842.58258582   537324.50481399]

yp values: [ 92537.27851361  75810.36405389  59862.8937992   44696.79886026
  30313.91572949  16715.98605863   3904.65644788  -8118.52175353
 -19352.09263517 -29794.69590175 -39445.06703778 -48302.03746074
 -56364.53466258 -63631.58233956 -70102.30051051 -75775.90562338
 -80651.71065011 -84729.12516982 -88007.65544033 -90486.90445793
 -92166.57200545 -93046.45468863 -93126.44596073 -92406.53613545
 -90886.8123881  -88567.45874503 -85448.75606135 -81531.08198695
 -76814.91092072 -71300.81395314]

gdf_lambert:
                                            geometry
0  POLYGON ((-251935.076 -217891.780, 251935.076 ...

In [9]: shapely.vectorized.contains(gdf_lambert.dissolve().geometry.item(), xp, yp)
Out[9]:
array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True,  True, False,
       False, False, False])

Graphical representation

We can also see this very clearly when plotting. Here's the points and shapefile in lat/lon space, using the contains mask to color the points blue if outside and green if inside:


In [12]: fig, ax = plt.subplots()
    ...: gdf.plot(ax=ax, alpha=0.5)
    ...: mask = shapely.vectorized.contains(gdf.dissolve().geometry.item(), x, y)
    ...: ax.scatter(x[mask], y[mask], color='green')
    ...: ax.scatter(x[~mask], y[~mask], color='blue')
Out[12]: <matplotlib.collections.PathCollection at 0x18f5a3ac0>

点的直线,其中一些正确指示为与 shapefile 相交

If we mix and match projections, none of the points are in the shape (which is a tiny invisible spec near (0, 0) when the axes are scaled by meters):

In [14]: fig, ax = plt.subplots()
    ...: gdf.plot(ax=ax, alpha=0.5)
    ...: mask = shapely.vectorized.contains(gdf.dissolve().geometry.item(), xp, yp)
    ...: ax.scatter(xp[mask], yp[mask], color='green')
    ...: ax.scatter(xp[~mask], yp[~mask], color='blue')
Out[14]: <matplotlib.collections.PathCollection at 0x1919e58d0>

蓝点的弯曲弧线,不与 shapefile 相交

When the shapefile and the points are both in LambertConformal, we're back in business:

In [16]: fig, ax = plt.subplots()
    ...: gdf_lambert.plot(ax=ax, alpha=0.5)
    ...: mask = shapely.vectorized.contains(gdf_lambert.dissolve().geometry.item(), xp, yp)
    ...: ax.scatter(xp[mask], yp[mask], color='green')
    ...: ax.scatter(xp[~mask], yp[~mask], color='blue')
Out[16]: <matplotlib.collections.PathCollection at 0x191a88ee0>

一些正确显示为与 shapefile 相交的曲线点

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM