GeoPandas plotting - any way to speed things up?

I'm running a gradient descent algorithm on some geo data. The goal is to assign different areas to different clusters to minimize some objective function. I am trying to make a short movie showing how the algorithm progresses. Right now my approach is to plot the map at each step, then use some other tools to make a little movie from all the static images (pretty simple). But I have about 3000 areas to plot, and the plot command takes a good 90 seconds or more to run, which kills my algorithm.

There are some obvious shortcuts: save images every Nth iteration, or save all the steps in a list and make all the images at the end (perhaps in parallel). That's all fine for now, but ultimately I'm aiming for some interactive functionality where a user can put in some parameters and see their map converge in real time. It seems like updating the map on the fly would be best in that case.
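For reference, the first two shortcuts could be sketched roughly as below. This is only an illustration: `get_clusters_stub`, `save_every`, and the random labels are hypothetical stand-ins, not part of the actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def get_clusters_stub(n_areas):
    """Hypothetical stand-in for one step of the clustering algorithm."""
    return rng.integers(0, 2, size=n_areas)

save_every = 10   # assumed value for N
history = []      # (iteration, labels) pairs to render after the run

for step in range(100):
    labels = get_clusters_stub(3000)
    if step % save_every == 0:
        # store a copy so later steps cannot modify the saved state
        history.append((step, labels.copy()))

# After the loop, each stored entry can be turned into an image,
# one at a time or in a pool of worker processes.
```

This keeps plotting out of the optimization loop entirely, at the cost of holding the stored label arrays in memory.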

Any ideas? Here's the basic command (using the latest dev version of geopandas):

fig, ax = plt.subplots(1,1, figsize=(7,5))
geo_data.plot(column='cluster',ax=ax, cmap='gist_rainbow',linewidth=0)
fig.savefig(filename, bbox_inches='tight', dpi=400)

I also tried something akin to the following (an abbreviated version is below). I open a single plot, then change it and save it with each iteration. It doesn't seem to speed things up at all.

fig, ax = plt.subplots(1,1, figsize=(7,5))
plot = geo_data.plot(ax=ax)
for iteration in range(100):  # just doing 100 iterations now
    clusters = get_clusters(...)
    for i_d, district in enumerate(plot.patches):
        if clusters[i_d] == 1:
            district.set_color("#FF0000")
        else:
            district.set_color("#d3d3d3")
    fig.savefig('test' + str(iteration) + '.pdf')

Update: I took a look at drawnow and other pointers from "real-time plotting in while loop with matplotlib", but shapefiles seem to be too big/clunky to work in real time.

I think two aspects can possibly improve the performance: 1) using a matplotlib Collection (the current geopandas implementation plots each polygon separately) and 2) only updating the color of the polygons instead of plotting them again on each iteration (you already do this, but using a collection makes it much simpler).

1) Using a matplotlib Collection to plot the Polygons

This is a possible implementation of a more efficient plotting function for a geopandas GeoSeries of Polygons:

import numpy as np
import shapely.geometry
from matplotlib.collections import PatchCollection
from matplotlib.patches import Polygon

def plot_polygon_collection(ax, geoms, values=None, colormap='Set1', facecolor=None, edgecolor=None,
                            alpha=0.5, linewidth=1.0, **kwargs):
    """ Plot a collection of Polygon geometries """
    patches = []

    for poly in geoms:
        # drop the z coordinate first, so matplotlib gets an (N, 2) array
        if poly.has_z:
            poly = shapely.geometry.Polygon(zip(*poly.exterior.xy))

        # exterior ring only; interior rings (holes) are ignored
        a = np.asarray(poly.exterior.coords)
        patches.append(Polygon(a))

    # a single collection holds all polygons, instead of one artist per polygon
    patches = PatchCollection(patches, facecolor=facecolor, linewidth=linewidth, edgecolor=edgecolor, alpha=alpha, **kwargs)

    if values is not None:
        # values are mapped to colors through the colormap
        patches.set_array(np.asarray(values))
        patches.set_cmap(colormap)

    ax.add_collection(patches, autolim=True)
    ax.autoscale_view()
    return patches

This is roughly 10x faster than the current geopandas plotting method.

2) Updating the colors of the Polygons

Once you have the figure, updating the colors of a Collection of Polygons can be done in one step using the set_array method, where the values in the array indicate the color (each value is converted to a color according to the colormap).

E.g. (assuming s_poly is a GeoSeries with polygons):

fig, ax = plt.subplots(subplot_kw=dict(aspect='equal'))
col = plot_polygon_collection(ax, s_poly.geometry)
# update the color
col.set_array( ... )
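To make the value-to-color mapping concrete, here is a minimal sketch that reproduces the red/grey scheme from the question with a two-color ListedColormap. The unit squares and the labels are made-up stand-ins, and fixing the color limits with set_clim is my assumption about how to keep the mapping stable between updates.

```python
import numpy as np
from matplotlib.collections import PatchCollection
from matplotlib.colors import ListedColormap, to_rgba
from matplotlib.patches import Polygon as MplPolygon

# Hypothetical geometry: three unit squares side by side
squares = [
    MplPolygon(np.array([(i, 0.0), (i + 1.0, 0.0), (i + 1.0, 1.0), (i, 1.0)]))
    for i in range(3)
]
col = PatchCollection(squares, linewidth=0)

# label 0 -> light grey, label 1 -> red (the colors used in the question)
col.set_cmap(ListedColormap(["#d3d3d3", "#FF0000"]))
col.set_clim(0, 1)  # fixed limits, so a label always maps to the same color

labels = np.array([0, 1, 0])
col.set_array(labels)

# the RGBA colors the collection will use when it is drawn
rgba = col.to_rgba(labels)
```

With more than two clusters, the same idea should work with a longer color list and `set_clim(0, n_clusters - 1)`.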

Full example with some dummy data:

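import geopandas
# aliased so it does not shadow matplotlib's Polygon used by plot_polygon_collection
from shapely.geometry import Polygon as ShapelyPolygon

p1 = ShapelyPolygon([(0, 0), (1, 0), (1, 1)])
p2 = ShapelyPolygon([(2, 0), (3, 0), (3, 1), (2, 1)])
p3 = ShapelyPolygon([(1, 1), (2, 1), (2, 2), (1, 2)])
s = geopandas.GeoSeries([p1, p2, p3])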

Plotting this:

fig, ax = plt.subplots(subplot_kw=dict(aspect='equal'))
col = plot_polygon_collection(ax, s.geometry)

gives:

[image]

Then updating the color with an array indicating the clusters:

col.set_array(np.array([0,1,0]))

gives:

[image]
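Putting both points together for the movie use case, a rough sketch of the frame loop might look like this. It builds the collection once and only calls set_array per frame. The `history` list is a hypothetical stand-in for the cluster assignments of each iteration, the polygons are the dummy ones from above written as raw coordinates, and the frames are kept in memory here only to keep the example self-contained (passing a filename to savefig works the same way).

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
from matplotlib.patches import Polygon as MplPolygon

# The three dummy polygons from above, as raw coordinates
coords = [
    [(0, 0), (1, 0), (1, 1)],
    [(2, 0), (3, 0), (3, 1), (2, 1)],
    [(1, 1), (2, 1), (2, 2), (1, 2)],
]

fig, ax = plt.subplots(subplot_kw=dict(aspect="equal"))
col = PatchCollection(
    [MplPolygon(np.asarray(c, dtype=float)) for c in coords],
    cmap="Set1", linewidth=0,
)
ax.add_collection(col, autolim=True)
ax.autoscale_view()
col.set_clim(0, 1)  # fix the color scale so a cluster keeps its color across frames

# Hypothetical cluster assignments, one array per iteration
history = [np.array([0, 0, 0]), np.array([0, 1, 0]), np.array([1, 1, 0])]

frames = []
for labels in history:
    col.set_array(labels)  # only the colors change; the geometry is reused
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=100)
    frames.append(buf.getvalue())
```

A lower dpi and a raster format such as PNG will likely also save faster than the 400 dpi and PDF output used in the question, though I have not timed that.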
