
Is there a file size limit to xarray in Python?

I want to open a netCDF4 dataset using xarray.

I have two examples: a big file with over 3 million points in the time series (3.2 GB), and a small file with 9,999 points in the time series (9.8 MB). This code opens the small file without trouble:

import xarray as xr

ds = xr.open_dataset(smallfile, chunks={'rec': 3600}, decode_times=False)

If I use the big file instead, I get an unknown error. The behavior is consistent on two different Windows machines with miniconda installed.

What is going on here? What else should I check for?

Thanks in advance.

Neither xarray nor netCDF4-Python has a file size limit; both have been used successfully on files in the 10-100 GB range.

Your problem looks similar to the one reported in this netCDF4-Python issue about reading large files on Windows with Python 3: https://github.com/Unidata/netcdf4-python/issues/535

More broadly, you might run into limitations of the netCDF file format itself. Version 4, which xarray supports via netCDF4-Python and h5netcdf, is based on HDF5 and has no file size limits. Version 3, which xarray supports via netCDF4-Python and scipy, has a 2 GB file size limit unless the "64-bit offset" variant is used (and even then each variable is still limited to less than 4 GB).
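Since those limits depend on which on-disk format your file actually uses, one quick sanity check is to read the file's magic bytes (the signatures below are the published netCDF-3 and HDF5 format signatures). This is a minimal stdlib-only sketch; the helper name `detect_netcdf_format` is my own, not part of xarray or netCDF4-Python:

```python
import os
import tempfile

def detect_netcdf_format(path):
    """Best-effort guess of a netCDF file's on-disk format from its magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(8)
    if magic.startswith(b"\x89HDF\r\n\x1a\n"):
        return "netCDF-4 (HDF5-based; no 2 GB limit)"
    if magic.startswith(b"CDF\x01"):
        return "netCDF-3 classic (2 GB file size limit)"
    if magic.startswith(b"CDF\x02"):
        return "netCDF-3 64-bit offset (variables still < 4 GB each)"
    if magic.startswith(b"CDF\x05"):
        return "netCDF-3 64-bit data (CDF-5)"
    return "unknown (not a recognized netCDF signature)"

# Demo: write a file that starts with the HDF5 signature and sniff it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x89HDF\r\n\x1a\n" + b"\x00" * 8)
print(detect_netcdf_format(tmp.name))  # netCDF-4 (HDF5-based; no 2 GB limit)
os.remove(tmp.name)
```

If this reports a netCDF-3 classic file at 3.2 GB, the file itself is the problem rather than xarray; if it reports netCDF-4/HDF5, the Windows read issue linked above is the more likely culprit.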

