Notebook kernel dies when saving multiple netCDF files with xarray

When I try to save multiple xarray datasets to netCDF from a Jupyter notebook, the kernel keeps dying. The exception shown further down is printed along the way, which suggests that files aren't being closed properly.

I'm saving subsets of NASA-NEX data that are opened with xarray via OpenDAP links, and then I want to concatenate several of the NEX models into an ensemble dataset. The following function does this:

    for var in variables:
        for path in scenarios:
            for model in models:
                try:
                    time_range = slice(time_start, time_end)
                    data = xr.open_dataset(f'https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/{path}/'
                                          f'r1i1p1/{var}/{model}.ncml').sel(lon=longitudes, lat=latitudes, time=time_range)
                    print(var, path, model, 'success')
                    data.to_netcdf(path=f'{var}_{path}_{model}.nc')
                    print(f'Saved {model}')
                    data.close()
                except TypeError:
                    time_range = slice(cftime.DatetimeNoLeap(time_start.year,time_start.month,time_start.day), 
                                       cftime.DatetimeNoLeap(time_end.year,time_end.month,time_end.day))
                    data = xr.open_dataset(f'https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/{path}/'
                                          f'r1i1p1/{var}/{model}.ncml').sel(lon=longitudes, lat=latitudes, time=time_range)
                    new_timeindex = data.indexes['time'].to_datetimeindex()
                    data['time'] = new_timeindex
                    del new_timeindex
                    print(var, path, model, 'success w/ no leap')

                #if save_netcdf:
                    # Now save data to netCDF
                    #data.to_netcdf(path=f'{var}_{path}_{model}.nc')
                    #print(f'Saved {model}')
                    #data.close()

            print('Concatenating models into ensemble')
            data_final = xr.open_mfdataset(f'{var}_{path}_*.nc', combine='nested', concat_dim='model')
            data_final.to_netcdf(path=f'{var}_{path}_ensemble.nc')
    return data_final

While it runs, the following exception is printed for each file, although it does not stop the code:

Exception ignored in: <function CachingFileManager.__del__ at 0x31f530b00>
Traceback (most recent call last):
  File "/Users/Zach_Bruick/opt/miniconda3/envs/climate2/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 243, in __del__
    self.close(needs_lock=False)
  File "/Users/Zach_Bruick/opt/miniconda3/envs/climate2/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 221, in close
    file.close()
  File "netCDF4/_netCDF4.pyx", line 2276, in netCDF4._netCDF4.Dataset.close
  File "netCDF4/_netCDF4.pyx", line 2260, in netCDF4._netCDF4.Dataset._close
  File "netCDF4/_netCDF4.pyx", line 1754, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: cannot delete file

I can see memory usage creeping up with each file that's saved. Is this a bug, or am I leaking memory somewhere?

I think I ran into exactly the same issue. I bet the version of libnetcdf you have installed is >=4.7.1. Try pinning it to 4.6.2.
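To find out which libnetcdf the notebook kernel is actually linked against, something like the sketch below may help. It reads `netCDF4.__netcdf4libversion__` and compares it with 4.7.1, the threshold guessed at above. The helper names `parse_version` and `is_suspect` are made up for this illustration, and the threshold is only this answer's assumption, not a confirmed boundary.

```python
import re

# Assumption taken from this answer: trouble starts at libnetcdf 4.7.1.
FIRST_SUSPECT = (4, 7, 1)


def parse_version(text):
    """Turn a string such as '4.7.3' or '4.7.3-development' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_suspect(version_text):
    """Return True if the version is at or above the assumed threshold."""
    return parse_version(version_text) >= FIRST_SUSPECT


if __name__ == "__main__":
    try:
        import netCDF4
    except ImportError:
        print("netCDF4 is not installed in this environment")
    else:
        lib = netCDF4.__netcdf4libversion__
        note = "at or above 4.7.1" if is_suspect(lib) else "older than 4.7.1"
        print("netCDF4-python:", netCDF4.__version__)
        print("libnetcdf:", lib, f"({note})")
```

In a conda environment like the one in the traceback, pinning would presumably be something along the lines of `conda install libnetcdf=4.6.2`. Whether that build is available and compatible with the rest of the environment is not something I have checked.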

I'm not entirely sure, but there seems to be a very subtle regression.

It didn't happen when I had only one file, or when I didn't save the result to a netCDF file on disk, and it happened only with OpenDAP links.
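Since the problem showed up only when OpenDAP data was written to disk for more than one file, it may be worth making sure each remote dataset is closed before its local file is written. The following is a minimal sketch of that idea, not a verified fix for the libnetcdf issue. The helper names (`build_url`, `output_name`, `save_subset`) are hypothetical, the URL pattern is copied from the question, and the `cftime` fallback for no-leap calendars is left out for brevity.

```python
BASE_URL = "https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd"


def build_url(path, var, model):
    """OpenDAP URL for one scenario/variable/model, as used in the question."""
    return f"{BASE_URL}/{path}/r1i1p1/{var}/{model}.ncml"


def output_name(var, path, model):
    """Local file name, matching the pattern later globbed by open_mfdataset."""
    return f"{var}_{path}_{model}.nc"


def save_subset(var, path, model, **selection):
    """Download one subset, close the remote handle, then write it locally."""
    import xarray as xr  # imported here so the helpers above work without xarray

    # The context manager closes the OpenDAP handle even if .sel() raises.
    with xr.open_dataset(build_url(path, var, model)) as remote:
        # .load() pulls the selected values into memory, so the subset
        # no longer depends on the remote file once the block exits.
        subset = remote.sel(**selection).load()

    target = output_name(var, path, model)
    subset.to_netcdf(target)
    return target
```

Inside the question's loops this would be called as `save_subset(var, path, model, lon=longitudes, lat=latitudes, time=time_range)`. Loading each subset into memory assumes it is small enough to fit, which seems plausible for a regional subset but is an assumption here.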

Edit: I found the related issue on GitHub: https://github.com/Unidata/netcdf4-python/issues/982
