Spatial binning from a spatial dataframe using geopandas (Python)
I want to do spatial binning (using the median as the aggregation function) starting from a CSV file that contains pollutant values measured at positions given by long and lat.
The resulting map should look something like this:
but with the data applied to the extent of a city. In this regard, I found this tutorial, which is close to what I want to do, but I was not able to get the desired result.
I think I'm missing something about how to correctly use dissolve and plot the resulting data (preferably with Folium). Any useful example code?
Creating the grid is simple with shapely.geometry.box().
Two different aggfunc values are used to demonstrate that multiple metrics can be calculated.
import geopandas as gpd
import shapely.geometry
import numpy as np
# equivalent of CSV, all earthquake points globally
gdf_e = gpd.read_file(
    "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"
)
# get geometry of bounding area. Have selected a state rather than a city
gdf_CA = gpd.read_file(
    "https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson"
).loc[:, ["geometry"]]
# number of grid edges per axis; this yields BOXES - 1 cells per axis
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for California, could be a city
gdf_grid = gpd.GeoDataFrame(
    geometry=[
        shapely.geometry.box(minx, miny, maxx, maxy)
        for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
        for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
    ],
    crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get earthquakes that have occurred within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "mag"]].sjoin(gdf_grid)
# get median magnitude of earthquakes in grid
gdf_grid = gdf_grid.join(
    gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many earthquakes in the grid
gdf_grid = gdf_grid.join(
    gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
    .drop(columns="geometry")
    .rename(columns={"mag": "number"})
)
# drop grid geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="mag")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
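To make the two steps above less of a black box, here is a minimal library-free sketch of what the spatial join followed by dissolve computes: each point is tagged with the index of the cell that contains it, then the values are grouped by that index and reduced to a median and a count. The cells and points are made-up values, and the half-open containment test is an assumption of this sketch; sjoin uses "intersects" by default, so a point lying exactly on a shared edge can match more than one cell there.

```python
from collections import defaultdict
from statistics import median

# Hypothetical grid cells as (minx, miny, maxx, maxy); the list position plays
# the role of "index_right" after the spatial join.
cells = [
    (0.0, 0.0, 1.0, 1.0),
    (1.0, 0.0, 2.0, 1.0),
    (0.0, 1.0, 1.0, 2.0),
]

# Hypothetical measurements as (x, y, value).
points = [
    (0.2, 0.3, 4.0),
    (0.8, 0.9, 2.0),
    (0.5, 0.5, 3.0),
    (1.5, 0.5, 5.0),
    (5.0, 5.0, 9.0),  # outside every cell, so it is dropped
]


def cell_index(x, y):
    """Return the index of the first cell containing (x, y), or None."""
    for i, (minx, miny, maxx, maxy) in enumerate(cells):
        if minx <= x < maxx and miny <= y < maxy:
            return i
    return None


# "Join": collect the values that fall in each cell.
values_by_cell = defaultdict(list)
for x, y, value in points:
    i = cell_index(x, y)
    if i is not None:
        values_by_cell[i].append(value)

# "Dissolve": one set of metrics per cell that received at least one point.
stats = {
    i: {"median": median(vals), "number": len(vals)}
    for i, vals in values_by_cell.items()
}
print(stats)
```

Cells that receive no points simply do not appear in the result, which is why the geopandas version needs dropna() before plotting.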
I want to convert a pandas DataFrame to a spatially enabled geopandas one, as follows:
import pandas as pd
import geopandas
df = pd.read_csv('../Desktop/test_esri.csv')
df.head()
Then I converted it using:
gdf = geopandas.GeoDataFrame(
    df, geometry=geopandas.points_from_xy(df.long, df.lat))
from pyproj import crs
crs_epsg = crs.CRS.from_epsg(4326)
gdf = gdf.set_crs('epsg:4326')
Then I want to overlay a spatial grid, as follows:
import numpy as np
import shapely
from pyproj import crs
# total area for the grid
xmin, ymin, xmax, ymax = gdf.total_bounds
# how many cells across and down
n_cells = 30
cell_size = (xmax - xmin) / n_cells
# projection of the grid
# crs = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
# create the cells in a loop
grid_cells = []
for x0 in np.arange(xmin, xmax + cell_size, cell_size):
    for y0 in np.arange(ymin, ymax + cell_size, cell_size):
        # bounds
        x1 = x0 - cell_size
        y1 = y0 + cell_size
        grid_cells.append(shapely.geometry.box(x0, y0, x1, y1))
cell = geopandas.GeoDataFrame(grid_cells, columns=['geometry'],
                              crs=crs.CRS('epsg:4326'))
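For comparison, here is a small sketch of one way to lay out the cell bounds, assuming square cells anchored at their lower-left corner. In the loop above each box spans x0 - cell_size to x0, so the grid starts one cell to the left of xmin; this variant starts exactly at xmin and uses integer indices instead of accumulating floats with np.arange. The function name make_grid and the sample extent are hypothetical and not taken from the tutorial.

```python
import math


def make_grid(xmin, ymin, xmax, ymax, n_cells):
    """Return square cells as (minx, miny, maxx, maxy) tuples covering the extent.

    n_cells is the number of cells across; the number of rows follows from the
    cell size so that the whole height is covered.
    """
    cell_size = (xmax - xmin) / n_cells
    n_rows = max(1, math.ceil((ymax - ymin) / cell_size))
    cells = []
    for col in range(n_cells):
        for row in range(n_rows):
            x0 = xmin + col * cell_size
            y0 = ymin + row * cell_size
            cells.append((x0, y0, x0 + cell_size, y0 + cell_size))
    return cells


# Made-up extent: 3 units wide, 2 units high, 3 cells across.
grid = make_grid(0.0, 0.0, 3.0, 2.0, n_cells=3)
print(len(grid), grid[0], grid[-1])
```

Each tuple is in the argument order of shapely.geometry.box(minx, miny, maxx, maxy), so it can be unpacked into that call when building the GeoDataFrame.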
Then I merge the grid with the geopandas dataframe:
merged = geopandas.sjoin(gdf, cell, how='left', predicate='within')
And finally I compute the desired metric with dissolve:
# Compute stats per grid cell -- aggregate fires to grid cells with dissolve
dissolve = merged.dissolve(by="index_right", aggfunc="median")
But I think I did something wrong with the "cell" grid and I can't figure it out! An extract of the CSV file used can be found here.
I finally solved it with the following code:
import pandas as pd
import geopandas as gpd
import pyproj
import matplotlib.pyplot as plt
import numpy as np
import shapely
from folium import plugins
df = pd.read_csv('../Desktop/test_esri.csv')
gdf_monica = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.long, df.lat))
gdf_monica = gdf_monica.set_crs('epsg:4326')
# points read from the CSV (this variable held the earthquake points in the answer)
gdf_e = gdf_monica
# get geometry of the bounding area: the municipality of Portici
gdf_CA = gpd.read_file('https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')#.loc[:, ["geometry"]]
gdf_CA = gdf_CA[gdf_CA['name']=='Portici'].loc[:, ['geometry']]
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for the city
gdf_grid = gpd.GeoDataFrame(
    geometry=[
        shapely.geometry.box(minx, miny, maxx, maxy)
        for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
        for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
    ],
    crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get measurements that fall within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "CO"]].sjoin(gdf_grid)
# get median CO value in each grid cell
gdf_grid = gdf_grid.join(
    gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many measurements in each grid cell
gdf_grid = gdf_grid.join(
    gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
    .drop(columns="geometry")
    .rename(columns={"CO": "number"})
)
# drop grid geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="CO")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
As you can tell, I have little or no knowledge of spatial analysis. I was not able to get correct results without using GeoJSON data that describes a geometry within which the points of interest fall.
If anyone could add more insights... thanks!
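On the last point, a boundary geometry is not strictly required for the binning itself: the extent can be taken from the points (as gdf.total_bounds does in the question) and the cell of each point computed arithmetically. The following is only a sketch under that assumption, with made-up sample values and a hypothetical helper name bin_median; it uses square cells sized from the x extent, and with long/lat input the cell size is in degrees.

```python
import math
from collections import defaultdict
from statistics import median


def bin_median(points, n_cells):
    """Median value per square cell; points are (x, y, value) tuples."""
    xs = [x for x, _, _ in points]
    ys = [y for _, y, _ in points]
    xmin, ymin = min(xs), min(ys)
    # Cell size from the x extent; fall back to 1.0 if all x values coincide.
    cell_size = (max(xs) - xmin) / n_cells or 1.0
    n_rows = max(1, math.ceil((max(ys) - ymin) / cell_size))
    groups = defaultdict(list)
    for x, y, value in points:
        # Clamp so points on the maximum edge land in the last column or row.
        col = min(int((x - xmin) / cell_size), n_cells - 1)
        row = min(int((y - ymin) / cell_size), n_rows - 1)
        groups[(col, row)].append(value)
    return {cell: median(values) for cell, values in groups.items()}


sample = [
    (0.0, 0.0, 1.0),
    (0.1, 0.1, 3.0),
    (1.0, 0.0, 10.0),
    (1.9, 1.9, 5.0),
    (2.0, 2.0, 7.0),
]
result = bin_median(sample, n_cells=2)
print(result)
```

The boundary GeoJSON is still useful for what the accepted code uses it for: fixing the grid extent independently of where the samples happen to fall, and drawing the outline on the map.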