Can I save a GeoDataFrame that contains an array to a GeoPackage file?

I have a geopandas GeoDataFrame with some attribute columns and a geometry column (just a regular GDF). Usually I save GDFs as GeoPackage files (.gpkg) using:

gdf.to_file('path_to_file.gpkg', driver='GPKG')

This works fine, unless my GDF has a column where the entries are arrays. So say I have two columns next to the geometry column and one of them contains a numpy array for each entry. If I then try to save as a gpkg it gives me the error:

ValueError: Invalid field type <class 'numpy.ndarray'>

So it appears that a gpkg cannot handle arrays in the table. The arrays I want to include are simple flags (so values of 0 and 1). I found two workarounds which work alright but are a bit messy:

  1. Make a string of the array values. This works but I would very much prefer leaving it as an array...
  2. Create a separate column for every array value. This would also work but then I get a GDF with a lot of columns and I feel there should be a better way to do this.
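For reference, workaround 2 could look roughly like the sketch below. It uses a plain pandas DataFrame with made-up `id` and `flags` columns (both names are assumptions for illustration) and assumes every array has the same length; the same idea should carry over to a GeoDataFrame before calling `to_file`.

```python
import numpy as np
import pandas as pd

# Hypothetical attribute table standing in for the non-geometry columns of a GDF.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "flags": [np.array([0, 1, 1]), np.array([1, 0, 0]), np.array([1, 1, 0])],
})

# One scalar column per array position (assumes all arrays share one length).
expanded = pd.DataFrame(df["flags"].tolist(), index=df.index).add_prefix("flag_")
wide = df.drop(columns="flags").join(expanded)

# Later, rebuild one array per row from the flag_* columns.
flag_cols = [c for c in wide.columns if c.startswith("flag_")]
restored = [row for row in wide[flag_cols].to_numpy()]
```

Every column in `wide` now holds plain integers, which is why this form can be written out, at the cost of one column per flag.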

Does anybody know of a better workaround to this issue?

I believe this is just a limitation of the .gpkg format. However, I think the best workaround is to store the arrays as strings, as you suggested. If you need to, you can easily convert them back into arrays in the new GDF with ast.literal_eval().

import numpy as np
import geopandas as gpd
from shapely.geometry import LineString
from ast import literal_eval

gdf = gpd.GeoDataFrame({'id': [1, 2, 3], 'array_col': [np.array([0, 1, 2]), np.array([0, 1, 2]), np.array([0, 1, 2])]},
                       geometry=[LineString([(1, 1), (4, 4)]),
                                 LineString([(1, 4), (4, 1)]),
                                 LineString([(6, 1), (6, 6)])])

# Serialize via tolist() so the text is a valid Python list literal ('[0, 1, 2]').
# str() on the array itself gives '[0 1 2]' and abbreviates long arrays with '...'.
gdf['array_col'] = gdf['array_col'].apply(lambda x: str(x.tolist()))

gdf.to_file('path_to_file.gpkg', driver='GPKG')

gpkg = gpd.read_file('path_to_file.gpkg')

# Parse each string back into a list, then into a NumPy array.
gpkg['array_col'] = gpkg['array_col'].apply(lambda x: np.array(literal_eval(x)))

After this, we can access our np arrays again.

print(gpkg['array_col'][0])

[0 1 2]
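Since the arrays in the question only hold 0/1 flags, a more compact text encoding is also possible: one character per flag. The sketch below is only an illustration; `flags_to_text` and `text_to_flags` are hypothetical helper names, not part of geopandas or NumPy.

```python
import numpy as np

def flags_to_text(flags):
    """Encode a 1-D array of 0/1 flags as a string such as '0110'."""
    values = [int(v) for v in flags]
    if any(v not in (0, 1) for v in values):
        raise ValueError("flags must contain only 0 and 1")
    return "".join(str(v) for v in values)

def text_to_flags(text):
    """Decode a string such as '0110' back into a NumPy array of 0/1 flags."""
    return np.array([int(ch) for ch in text], dtype=np.uint8)

encoded = flags_to_text(np.array([0, 1, 1, 0]))
decoded = text_to_flags(encoded)
```

These helpers could be applied to the column with `.apply(flags_to_text)` before writing and `.apply(text_to_flags)` after reading, in the same way as the string conversion above.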
