How to return a shapefile from a GeoDataFrame (GeoPandas) without writing it to disk?

I have a small Flask service that receives GeoJSON and returns a zip containing the .shp, .dbf, and .shx files. I can produce the shapefile with gdf.to_file('file.shp'), but I don't want to write to disk, because it costs time and shortens the lifespan of the device (my service will be called more than a thousand times in a single day). How can I do this? Here is the code I'm using:

from flask import Flask, request, send_file
import json
import geopandas as gpd
import io
from zipfile import ZipFile
import os

app = Flask(__name__)

@app.route('/geojson2shp', methods=['POST'])
def result():
    geojson = json.loads(request.values['geojson'])
    gdf = gpd.GeoDataFrame.from_features(geojson['features'])

    currentPath = os.path.dirname(os.path.abspath(__file__))

    # writing the .shp also creates the .dbf and .shx sidecar files
    gdf.to_file(currentPath + '/tmp/file.shp', driver='ESRI Shapefile')

    # arcname keeps the directory structure out of the zip entries
    with ZipFile(currentPath + '/tmp/sample.zip', 'w') as zipObj:
        zipObj.write(currentPath + '/tmp/file.shp', arcname='file.shp')
        zipObj.write(currentPath + '/tmp/file.dbf', arcname='file.dbf')
        zipObj.write(currentPath + '/tmp/file.shx', arcname='file.shx')

    return send_file(currentPath + '/tmp/sample.zip', attachment_filename='sample.zip')


Edit:

I've been searching for a way to use BytesIO with Fiona, and thanks to @matthew-borish's link and @marat's hint I found one. With the code that follows I've managed to create a .shp file using only memory.

def resultWithMemory():
    geojson = json.loads(request.values['geojson'])
    gdf = gpd.GeoDataFrame.from_features(geojson['features'])

    # write into an in-memory buffer instead of a file on disk
    shp = io.BytesIO()
    gdf.to_file(shp, driver='ESRI Shapefile')

    zip_buffer = io.BytesIO()
    with ZipFile(zip_buffer, 'w') as zf:
        zf.writestr("file.shp", shp.getbuffer())
    zip_buffer.seek(0)

    return send_file(zip_buffer, attachment_filename='sample.zip', as_attachment=True)

But it has a small problem: the .shx and .dbf files don't end up in the "shp" BytesIO object. Why? I'll explore the Fiona source code later and try to find out.
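A plausible explanation: GeoPandas hands the single file-like object to Fiona, which backs it with one in-memory file, but an ESRI Shapefile is really a bundle of several files, so the .shx and .dbf sidecars have nowhere to go. One workaround that stays entirely in memory is the pyshp library, whose shapefile.Writer accepts a separate file-like object per component. Here is a minimal, untested sketch; gdf_to_shp_parts is a made-up helper, and every attribute is naively written as a text field:

import io
import shapefile  # pyshp

def gdf_to_shp_parts(gdf):
    # one in-memory buffer per shapefile component
    shp, shx, dbf = io.BytesIO(), io.BytesIO(), io.BytesIO()
    w = shapefile.Writer(shp=shp, shx=shx, dbf=dbf)

    # naive mapping: every attribute column becomes a character field;
    # a real service would map dtypes to numeric/date/logical field types
    columns = [c for c in gdf.columns if c != gdf.geometry.name]
    for col in columns:
        w.field(col, 'C')
    if not columns:
        w.field('id', 'N')  # a .dbf needs at least one field

    for idx, row in gdf.iterrows():
        w.shape(row[gdf.geometry.name].__geo_interface__)  # geo interface dict
        w.record(*([str(row[c]) for c in columns] or [idx]))
    w.close()  # finalizes the headers; the buffers stay readable
    return shp, shx, dbf

The three buffers can then be zipped with zf.writestr('file.shp', shp.getvalue()) and so on, just like the single .shp buffer above.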

You could store your gdfs in a dict with timestamps as the keys.

import datetime
import geopandas as gpd
from shapely.geometry import LineString

gdf_dict = {}

def result():
    # geojson = json.loads(request.values['geojson'])
    # gdf = gpd.GeoDataFrame.from_features(geojson['features'])

    # dummy gdf for demo purposes
    gdf = gpd.GeoDataFrame(geometry=[LineString([(1, 1), (4, 4)]),
                                     LineString([(1, 4), (4, 1)]),
                                     LineString([(6, 1), (6, 6)])])

    return gdf

Each time your service is called you can store the result in the gdf_dict.

gdf_dict[datetime.datetime.now()] = result()

If you want to, you can occasionally save to disk with pickle.

import pickle

with open('gdf.pickle', 'wb') as handle:
    pickle.dump(gdf_dict, handle)

And vice versa.

with open('gdf.pickle', 'rb') as handle:
    from_pickle = pickle.load(handle)

You can search by timestamp if you'd like with this.

def get_gdfs(gdf_dict, from_date, to_date):
    return sorted((k, v) for k, v in gdf_dict.items() if from_date <= k <= to_date)


gdfs_by_date = get_gdfs(gdf_dict,
                        datetime.datetime(2022, 1, 21, 17, 46, 30, 408735),
                        datetime.datetime(2023, 1, 22, 17, 48, 30, 408735))

Alternatively, or in conjunction with some of the above, you could use pd.concat() to combine the gdfs under a DatetimeIndex for any queries, as sketched below. Leave a comment if you're curious and I can update the answer.
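For illustration, a rough sketch of that idea (frames, combined, and january_gdfs are made-up names for the demo): give each stored frame an index built from its timestamp, concatenate, and use pandas' partial string indexing for date queries.

import pandas as pd

# stack every stored GeoDataFrame into one frame indexed by timestamp
frames = []
for ts, gdf in gdf_dict.items():
    g = gdf.copy()
    g.index = pd.DatetimeIndex([ts] * len(g))
    frames.append(g)
combined = pd.concat(frames).sort_index()

# partial string indexing: all rows stored during January 2022
january_gdfs = combined.loc['2022-01']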

This is what I'm looking for too. Did you ever find a way to export using BytesIO while also including the .shx and .dbf files in the zip?

Thanks
