
Is there a file size limit to xarray in python?

I want to open a netCDF4 data set using xarray.

I have two examples: a big file with over 3 million points in the time series (3.2 GB), and a small file with 9,999 points in the time series (9.8 MB). This code opens the small file:

import xarray as xr

# chunks triggers lazy loading via dask; 'rec' is the record dimension
ds = xr.open_dataset(smallfile, chunks={'rec': 3600}, decode_times=False)

If I use the big file instead, I get an unknown error. The behavior is consistent on two different Windows machines with Miniconda installed.

What is going on here? What else should I check for?

Thanks in advance.

Neither xarray nor netCDF4-Python has a file size limit. Both have been used successfully with files in the 10-100 GB range.

Your problem looks similar to those reported in this netCDF4-Python issue for reading large files on Windows with Python 3: https://github.com/Unidata/netcdf4-python/issues/535
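One quick way to narrow this down is to bypass xarray and open the file with netCDF4-Python directly; if the same error appears, the problem lies in the underlying library rather than in xarray. A minimal sketch, assuming bigfile holds the path to your 3.2 GB file:

import netCDF4

# bigfile is the path to the 3.2 GB file (assumed name).
# If this raises the same error, the issue is in netCDF4-Python, not xarray.
nc = netCDF4.Dataset(bigfile)
print(nc.dimensions)
nc.close()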

More broadly, you might run into limitations of the netCDF file format itself. Version 4, which xarray supports via netCDF4-Python and h5netcdf, is based on HDF5 and has no file size limits. Version 3, which xarray supports via netCDF4-Python and scipy, has a 2 GB file size limit unless you use the "64-bit offset" variant, and even that still caps each fixed-size variable below 4 GB.
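If you control how the files are written, you can sidestep the netCDF-3 limits by requesting the HDF5-based format explicitly when saving. A minimal sketch, assuming ds is a dataset you have already opened (the output paths are illustrative):

# netCDF-4 (HDF5-based): no practical file size limit
ds.to_netcdf('output_v4.nc', format='NETCDF4')

# netCDF-3 with 64-bit offsets: the file may exceed 2 GB,
# but each fixed-size variable is still capped below 4 GB
ds.to_netcdf('output_v3.nc', format='NETCDF3_64BIT')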

