
Add 'constant' dimension to xarray Dataset

I have a series of monthly gridded datasets in CSV form. I want to read them, add a few dimensions, and then write to netCDF. I've had great experience using xarray (formerly xray) in the past, so I thought I'd use it for this task.

I can easily get them into a 2D DataArray with something like:

import numpy as np
import xarray as xr

data = np.ones((360,720))
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords =  {'lat': lats, 'lng':lngs}
da = xr.DataArray(data, coords=coords)

But when I try to add another dimension, which would convey information about time (all data is from the same year/month), things start to go sour.

I've tried two ways to crack this:

1) Expand my input data to m x n x 1, something like:

data = np.ones((360,720))
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords =  {'lat': lats, 'lng':lngs}
data = data[:,:,np.newaxis]

Then I follow the same steps as above, with coords updated to contain a third dimension.

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords =  {'lat': lats, 'lng':lngs}
coords['time'] = pd.datetime(year, month, day)
da = xr.DataArray(data, coords=coords)
da.to_dataset(name='variable_name')

This is fine for creating a DataArray -- but when I try to convert to a dataset (so I can write to netCDF), I get an error about 'ValueError: Coordinate objects must be 1-dimensional'

2) The second approach I've tried is taking my DataArray, casting it to a DataFrame, setting the index to ['lat', 'lng', 'time'], and then going back to a Dataset with xr.Dataset.from_dataframe(). It runs for 20+ minutes before I give up and kill the process.

Does anyone know how I can get a Dataset with a monthly 'time' dimension?

Your first example is pretty close:

import datetime

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords =  {'lat': lats, 'lng': lngs}
# A one-element list gives a 1D 'time' coordinate of length 1.
coords['time'] = [datetime.datetime(year, month, day)]
# data is the (360, 720, 1) array from the question.
da = xr.DataArray(data, coords=coords, dims=['lat', 'lng', 'time'])
da.to_dataset(name='variable_name')

You'll notice a few changes in my version:

  1. I'm passing in a list for the 'time' coordinate instead of a scalar. You need to pass in a list or 1D array to get a 1D coordinate variable, which is what you need if you also use 'time' as a dimension. That's what the error ValueError: Coordinate objects must be 1-dimensional is trying to tell you (by the way, if you have ideas for how to make that error message more helpful, I'm all ears!).
  2. I'm providing a dims argument to the DataArray constructor. Passing in a (non-ordered) dictionary without dims is a little dangerous because the dimension order would then depend on the dictionary's iteration order, which is not guaranteed (plain dicts only guarantee insertion order from Python 3.7 on).
  3. I also switched to datetime.datetime instead of pd.datetime. The latter is simply an alias for the former.
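Putting those three changes together, a self-contained version might look like the sketch below. The date 2000-01-01 and the name data3d are placeholders of mine (the question does not give values for year, month and day), and the array of ones stands in for one month of real CSV data.

```python
import datetime

import numpy as np
import xarray as xr

# Placeholder date: the question leaves year, month and day unspecified.
year, month, day = 2000, 1, 1

data = np.ones((360, 720))  # stand-in for one month of gridded data
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)

# Add a trailing length-1 axis so the array is (lat, lng, time).
data3d = data[:, :, np.newaxis]

coords = {
    "lat": lats,
    "lng": lngs,
    # A one-element list, so 'time' becomes a 1D coordinate of length 1.
    "time": [datetime.datetime(year, month, day)],
}

# Explicit dims fix the axis order regardless of the dict's ordering.
da = xr.DataArray(data3d, coords=coords, dims=["lat", "lng", "time"])
ds = da.to_dataset(name="variable_name")

print(ds["variable_name"].dims)
```

From here, ds.to_netcdf(path) should write the file, provided a netCDF backend such as netCDF4 or scipy is installed.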

Another sensible approach is to use concat with a list of one item once you've added 'time' as a scalar coordinate, e.g.:

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords =  {'lat': lats, 'lng': lngs, 'time': datetime.datetime(year, month, day)}
da = xr.DataArray(data, coords=coords, dims=['lat', 'lng'])
expanded_da = xr.concat([da], 'time')

This version generalizes nicely to joining together data from a bunch of days: you simply make the list of DataArrays longer. In my experience, most of the time the reason you want the extra dimension in the first place is to be able to concat along it. Length-1 dimensions are not very useful otherwise.
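As a rough illustration of that generalization, the sketch below builds three monthly grids and concatenates them along 'time'. The helper month_to_dataarray, the year 2000, and the constant-valued grids are all hypothetical stand-ins for data read from the CSV files.

```python
import datetime

import numpy as np
import xarray as xr

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)


def month_to_dataarray(data, year, month):
    """Wrap one 2D grid and tag it with a scalar 'time' coordinate.

    Hypothetical helper, not part of xarray.
    """
    coords = {
        "lat": lats,
        "lng": lngs,
        "time": datetime.datetime(year, month, 1),  # scalar, not a dimension yet
    }
    return xr.DataArray(data, coords=coords, dims=["lat", "lng"])


# Stand-in grids: month m is filled with the value m.
monthly = [
    month_to_dataarray(np.full((360, 720), float(m)), 2000, m)
    for m in (1, 2, 3)
]

# concat promotes the scalar 'time' coordinate to a real dimension.
combined = xr.concat(monthly, "time")
ds = combined.to_dataset(name="variable_name")

print(ds["time"].values)
```

Each CSV file would contribute one entry to the list, so the per-file work stays 2D and the time axis only appears at the concat step.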

You can use .expand_dims() to add a new dimension and .assign_coords() to attach coordinate values to it. The code below adds a new_dim dimension to the ds dataset and sets the corresponding coordinate from the values you provide. Because .expand_dims("new_dim") creates a dimension of length 1, the list must contain exactly one value here.

expanded_ds = ds.expand_dims("new_dim").assign_coords(new_dim=("new_dim", [list_of_values]))
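Applied to the question's grid, that could look like the sketch below. The small Dataset and the 2000-01-01 timestamp are placeholders I made up for illustration. The last line shows what should be an equivalent shorthand: expand_dims also accepts a mapping from the new dimension name to its coordinate values.

```python
import datetime

import numpy as np
import xarray as xr

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)

# A 2D dataset standing in for one month of gridded data.
ds = xr.Dataset(
    {"variable_name": (("lat", "lng"), np.ones((360, 720)))},
    coords={"lat": lats, "lng": lngs},
)

timestamp = datetime.datetime(2000, 1, 1)  # placeholder date

# Two steps: add a length-1 'time' dimension, then label it.
expanded_ds = ds.expand_dims("time").assign_coords(time=("time", [timestamp]))

# One step: pass the coordinate values directly to expand_dims.
expanded_alt = ds.expand_dims({"time": [timestamp]})

print(expanded_ds["variable_name"].dims)
```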
