
Notebook kernel dies when saving multiple netCDF files with xarray

When I try to save multiple xarray datasets to netCDF from within a Jupyter notebook, the kernel keeps dying, and the exception shown below is printed, hinting that files aren't being closed properly.

Here I'm trying to save subsets of NASA-NEX data that are opened via OPeNDAP links with xarray, and then concatenate a number of the NEX models into an ensemble dataset. The following function does this:

import cftime
import xarray as xr

# (the enclosing function definition, and the variables it receives --
#  variables, scenarios, models, longitudes, latitudes, time_start, time_end --
#  are omitted here)
for var in variables:
    for path in scenarios:
        for model in models:
            try:
                time_range = slice(time_start, time_end)
                data = xr.open_dataset(
                    f'https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/{path}/'
                    f'r1i1p1/{var}/{model}.ncml'
                ).sel(lon=longitudes, lat=latitudes, time=time_range)
                print(var, path, model, 'success')
                data.to_netcdf(path=f'{var}_{path}_{model}.nc')
                print(f'Saved {model}')
                data.close()
            except TypeError:
                # Models on a no-leap calendar reject plain datetime slicing,
                # so re-slice with cftime and convert the index back afterwards
                time_range = slice(cftime.DatetimeNoLeap(time_start.year, time_start.month, time_start.day),
                                   cftime.DatetimeNoLeap(time_end.year, time_end.month, time_end.day))
                data = xr.open_dataset(
                    f'https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/{path}/'
                    f'r1i1p1/{var}/{model}.ncml'
                ).sel(lon=longitudes, lat=latitudes, time=time_range)
                new_timeindex = data.indexes['time'].to_datetimeindex()
                data['time'] = new_timeindex
                del new_timeindex
                print(var, path, model, 'success w/ no leap')

                # The save for the no-leap models is currently commented out:
                #if save_netcdf:
                    # Now save data to netCDF
                    #data.to_netcdf(path=f'{var}_{path}_{model}.nc')
                    #print(f'Saved {model}')
                    #data.close()

        print('Concatenating models into ensemble')
        data_final = xr.open_mfdataset(f'{var}_{path}_*.nc', combine='nested', concat_dim='model')
        data_final.to_netcdf(path=f'{var}_{path}_ensemble.nc')

return data_final

While the code runs, the following exception is thrown for each file; it does not stop execution:

Exception ignored in: <function CachingFileManager.__del__ at 0x31f530b00>
Traceback (most recent call last):
  File "/Users/Zach_Bruick/opt/miniconda3/envs/climate2/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 243, in __del__
    self.close(needs_lock=False)
  File "/Users/Zach_Bruick/opt/miniconda3/envs/climate2/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 221, in close
    file.close()
  File "netCDF4/_netCDF4.pyx", line 2276, in netCDF4._netCDF4.Dataset.close
  File "netCDF4/_netCDF4.pyx", line 2260, in netCDF4._netCDF4.Dataset._close
  File "netCDF4/_netCDF4.pyx", line 1754, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: cannot delete file

I can see my memory creeping up with each file that's saved. Is this a bug, or am I leaking memory somewhere in this code?

I think I ran into the exact same issue. I bet the version of libnetcdf you have installed is >=4.7.1. Try pinning the version to 4.6.2.
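To see which libnetcdf your stack is actually linked against, you can print the version strings from Python. This is just a quick sanity check; the attribute names below are from the netCDF4-python package, and xarray's show_versions() reports the same information alongside the rest of the stack:

import netCDF4
import xarray as xr

print(netCDF4.__version__)            # netCDF4-python package version
print(netCDF4.__netcdf4libversion__)  # underlying libnetcdf C library version
xr.show_versions()                    # also lists libnetcdf with the other dependencies

If it reports 4.7.1 or newer, downgrading (for example with conda, libnetcdf=4.6.2) is the pin suggested above.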

I'm not entirely sure, but there seems to be a very subtle regression.

It didn't happen when I had only one file, or when I didn't save the result to a netCDF file on disk, and it happened only with OPeNDAP links.

Edit: I found the related issue on GitHub https://github.com/Unidata/netcdf4-python/issues/982
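Independent of the library version, it can also help to close the remote dataset deterministically instead of relying on garbage collection, since the traceback above is raised from CachingFileManager.__del__. Below is a minimal sketch of the inner save step only, assuming the same variable names as in the question's loop (var, path, model, longitudes, latitudes, time_range); it is not tested against the NEX server:

import xarray as xr

url = (f'https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/{path}/'
       f'r1i1p1/{var}/{model}.ncml')
with xr.open_dataset(url) as remote:
    subset = remote.sel(lon=longitudes, lat=latitudes, time=time_range)
    subset.load()                                 # pull the subset into memory over OPeNDAP
    subset.to_netcdf(f'{var}_{path}_{model}.nc')  # write from memory; the remote handle closes on exit

With the write happening from an in-memory copy and the remote handle closed by the with block, nothing is left for the interpreter to clean up later, even when an exception is raised mid-loop.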
