I am having problems creating a very big netCDF file in Python on a machine with 8 GB of RAM.
I created a very big array with numpy.memmap so that the array lives on disk rather than in RAM, because its size exceeds the available RAM and swap space (8 GB each).
I created a variable in the nc file with
var = ncout.createVariable('data', ARRAY.dtype,
                           ('time', 'latitude', 'longitude'),
                           chunksizes=(5000, 61, 720))
var[:] = ARRAY[:]
When the code reaches this point, it loads the on-disk ARRAY into RAM, and I get a memory error.
How can I save such a big file?
Thanks.
The best way to read and write large netCDF4 files is with Xarray, which reads and writes the data in chunks automatically, using Dask under the hood.
import xarray as xr

ds = xr.open_dataset('my_big_input_file.nc',
                     chunks={'time': 5000, 'latitude': 61, 'longitude': 720})
ds.to_netcdf('my_big_output_file.nc', mode='w')
You can speed things up by using parallel computing with Dask.
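For example, `to_netcdf` accepts `compute=False`, which returns a Dask delayed object; computing it afterwards streams the write through Dask's scheduler chunk by chunk. A small self-contained sketch, where the file names and the tiny synthetic array are placeholders for the real (too-big-for-RAM) data:

```python
import os
import tempfile

import numpy as np
import xarray as xr

tmp = tempfile.mkdtemp()
inp = os.path.join(tmp, 'in.nc')
out = os.path.join(tmp, 'out.nc')

# A tiny synthetic stand-in for the real file.
data = np.arange(60, dtype='float32').reshape(10, 2, 3)
xr.Dataset({'data': (('time', 'latitude', 'longitude'), data)}).to_netcdf(inp)

# Open lazily in chunks, then build the write as a lazy Dask task graph...
ds = xr.open_dataset(inp, chunks={'time': 5})
delayed = ds.to_netcdf(out, mode='w', compute=False)
# ...and execute it; Dask streams the chunks through its scheduler.
delayed.compute()

result = xr.open_dataset(out)['data'].values
```

With the real file you would keep the chunk sizes from the question and never hold more than one chunk per worker in memory.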
Iterating directly over an array yields the slices along the first dimension, and enumerate gives you both the slice and its index:

for ind, data_slice in enumerate(ARRAY):
    var[ind] = data_slice
I'm not positive whether netCDF4-python will keep the slices around in memory, though.
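If stepping one record at a time is too slow, a middle ground is to copy fixed-size blocks along the first axis, so at most one block is resident in RAM at a time. A minimal sketch with plain NumPy indexing (`copy_in_blocks` and its `block` parameter are my own names, not part of netCDF4-python):

```python
import numpy as np

def copy_in_blocks(src, dst, block=5000):
    """Copy src into dst along axis 0 in fixed-size blocks,
    keeping at most one block in RAM at a time."""
    n = src.shape[0]
    for start in range(0, n, block):
        stop = min(start + block, n)
        # With a memmapped src, this reads only `block` rows from disk.
        dst[start:stop] = src[start:stop]
    return dst
```

With ARRAY as the memmap and var as the netCDF variable, copy_in_blocks(ARRAY, var) would write 5000 time steps at a time, matching the chunksizes used in the question.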