
How to reshape xarray data with new dimensions

I'm fairly new to the xarray library, and I am stuck on what seems like a fairly straightforward task. I have global climate data in a GRIB file for different 30-km grids. The data looks like this:

<xarray.Dataset>
Dimensions:     (time: 736, values: 542080)
Coordinates:
    number      int64 0
  * time        (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
    step        timedelta64[ns] 00:00:00
    surface     int64 0
    latitude    (values) float64 89.78 89.78 89.78 ... -89.78 -89.78 -89.78
    longitude   (values) float64 0.0 20.0 40.0 60.0 ... 280.0 300.0 320.0 340.0
    valid_time  (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
Dimensions without coordinates: values
Data variables:
    t2m         (time, values) float32 247.30748 247.49889 ... 225.18036
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2020-01-21T09:40:59 GRIB to CDM+CF via cfgrib-0....

That is fine: I can access different time instances and plot them, and even access the data per cell using data.t2m.data. However, the data is indexed only by time and values. The latter is, I assume, a cell number identifier, so latitude and longitude are not being read as meaningful dimensions.

In the documentation, the authors use air temperature reanalysis data as an example. That data is indexed by lat, lon, and time, which is what I want to do with my dataset.

<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 2920)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

Is there a straightforward way of doing this re-indexing within xarray? I guess I could simply extract the numpy arrays and move to pandas or something else, but I find the xarray library really powerful and useful.

One way might be to manually construct a pandas.MultiIndex from the latitude and longitude variables, assign it as the coordinate for the values dimension, and then unstack the Dataset:

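import pandas as pd

# ds is the Dataset shown in the question (time, values).
# Pair each cell's longitude and latitude into a two-level index.
index = pd.MultiIndex.from_arrays(
    [ds.longitude.values, ds.latitude.values], names=['lon', 'lat']
)
# Use the MultiIndex as the coordinate of the values dimension ...
ds['values'] = index
# ... and split that dimension into separate lon and lat dimensions.
reshaped = ds.unstack('values')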

For more on this, see the "Reshaping and reorganizing data" section of the xarray documentation.
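As a variation on the approach above, the following is a minimal sketch that lets xarray build the MultiIndex itself with set_index, since latitude and longitude already exist as coordinates on the values dimension. The small dataset here is synthetic and purely illustrative (a 3 x 4 regular grid standing in for the GRIB data); only the variable and coordinate names are taken from the question.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the GRIB data (an assumption for illustration):
# a 3 x 4 grid flattened into a single "values" dimension, with latitude
# and longitude stored as non-index coordinates along that dimension.
lats = np.array([10.0, 0.0, -10.0])
lons = np.array([0.0, 90.0, 180.0, 270.0])
lat_flat = np.repeat(lats, lons.size)  # each latitude repeated per longitude
lon_flat = np.tile(lons, lats.size)    # the longitude row repeated per latitude
times = np.array(["2007-12-01T00", "2007-12-01T03"], dtype="datetime64[ns]")

ds = xr.Dataset(
    {"t2m": (("time", "values"),
             np.arange(2 * 12, dtype="float32").reshape(2, 12))},
    coords={
        "time": times,
        "latitude": ("values", lat_flat),
        "longitude": ("values", lon_flat),
    },
)

# Combine the two existing coordinates into a MultiIndex on "values",
# then unstack it into separate latitude and longitude dimensions.
reshaped = ds.set_index(values=["latitude", "longitude"]).unstack("values")

print(reshaped)
```

After this, cells can be selected by coordinate, for example reshaped.t2m.sel(latitude=0.0, longitude=90.0). Note that unstack fills any latitude/longitude combination that does not occur in the data with NaN, so if the grid is not regular (the longitudes printed in the question hint that rows may have different spacing), the result can be large and mostly empty.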
