Join/merge multiple NetCDF files using xarray

I have a folder of NetCDF files covering 2006-2100, split into ten-year blocks (2011-2020, 2021-2030, etc.).

I want to create a new NetCDF file which contains all of these files joined together. So far I have read in the files:

ds = xarray.open_dataset('Path/to/file/20062010.nc')
ds1 = xarray.open_dataset('Path/to/file/20112020.nc')
etc.

Then merged these like this:

dsmerged = xarray.merge([ds,ds1])

This works, but is clunky and there must be a simpler way to automate this process, as I will be doing this for many different folders full of files. Is there a more efficient way to do this?

EDIT:

Trying to join these files using glob:

for filename in glob.glob('path/to/file/.*nc'):
    dsmerged = xarray.merge([filename])

Gives the error:

AttributeError: 'str' object has no attribute 'items'

This is reading only the text of the filename, and not the actual file itself, so it can't merge it. How do I open, store as a variable, then merge without doing it bit by bit?

If you want a clean way to merge all of your datasets, you can combine a list comprehension with the xarray.merge function. For example:

ds = xarray.merge([xarray.open_dataset(f) for f in glob.glob('path/to/file/*.nc')])
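A note on the glob pattern, as a minimal sketch: in glob syntax '*.nc' matches every name ending in .nc, while '.*nc' only matches names that begin with a dot, so it would skip files such as 20062010.nc. glob.glob also returns names in arbitrary order, so sorting them makes the order predictable. The helper name list_netcdf_files and the empty placeholder files below are made up for illustration.

```python
import glob
import os
import tempfile


def list_netcdf_files(folder):
    """Return the paths of all .nc files in `folder`, sorted by name."""
    return sorted(glob.glob(os.path.join(folder, "*.nc")))


# Demo on empty placeholder files (hypothetical names, not real NetCDF data).
with tempfile.TemporaryDirectory() as tmp:
    for name in ["20112020.nc", "20062010.nc", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()

    # '*.nc' picks up both NetCDF files and ignores notes.txt.
    found = [os.path.basename(p) for p in list_netcdf_files(tmp)]

    # '.*nc' only matches names starting with a dot, so it finds nothing here.
    hidden = glob.glob(os.path.join(tmp, ".*nc"))
```

The sorted list can then be passed to the list comprehension in place of the bare glob.glob call.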

As for the out-of-memory errors you ran into, the files together are probably larger than the Python process can hold in memory at once. The usual fix is the xarray.open_mfdataset function, which uses the dask library under the hood to split the data into smaller chunks and load them lazily. This is usually more memory efficient and will often let you bring data into Python that would not otherwise fit. With this function you do not need a for-loop; you can just pass it a glob string of the form "path/to/my/files/*.nc". The following does the same job as the previous solution, but more memory efficiently:

ds = xarray.open_mfdataset('path/to/file/*.nc')
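Since the goal is a new NetCDF file for each folder, the whole step might be sketched as follows. This assumes xarray and dask are installed and that the files in a folder share the same variables and differ only along a coordinate such as time. The helper names merge_folder and merged_name, and the output naming scheme, are assumptions made for illustration.

```python
import glob
import os


def merged_name(folder):
    """Hypothetical naming scheme: '<folder>_merged.nc', placed next to the folder."""
    return os.path.normpath(folder) + "_merged.nc"


def merge_folder(folder, out_file=None):
    """Open every .nc file in `folder` as one dataset and write it to `out_file`."""
    # Imported here so merged_name can be used without xarray installed.
    # open_mfdataset additionally requires dask.
    import xarray

    out_file = out_file or merged_name(folder)
    files = sorted(glob.glob(os.path.join(folder, "*.nc")))

    # combine="by_coords" orders the pieces using their coordinate values.
    with xarray.open_mfdataset(files, combine="by_coords") as ds:
        ds.to_netcdf(out_file)
    return out_file
```

Calling merge_folder('path/to/file') for each folder in a loop would then produce one combined file per folder. Writing the result next to the folder rather than inside it keeps a later run from picking up the merged file as input.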

I hope this proves useful.
