I'm fairly new to the xarray library, and I am stuck on what seems like a fairly straightforward task. I have global climate data in a GRIB file for different 30-km grids. The data looks like this:
<xarray.Dataset>
Dimensions:     (time: 736, values: 542080)
Coordinates:
    number      int64 0
  * time        (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
    step        timedelta64[ns] 00:00:00
    surface     int64 0
    latitude    (values) float64 89.78 89.78 89.78 ... -89.78 -89.78 -89.78
    longitude   (values) float64 0.0 20.0 40.0 60.0 ... 280.0 300.0 320.0 340.0
    valid_time  (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
Dimensions without coordinates: values
Data variables:
    t2m         (time, values) float32 247.30748 247.49889 ... 225.18036
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2020-01-21T09:40:59 GRIB to CDM+CF via cfgrib-0....
And that is fine. I can access different time instances and plot things, and even access the data per cell using data.t2m.data. But the data is indexed only by time and values. The latter is, I assume, a cell number identifier, and xarray is not reading latitude and longitude as meaningful dimensions.
In the documentation, the authors use airtemp reanalysis data as an example. That data is indexed by lat, lon, and time, and that is what I want to do with my dataset.
<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 2920)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Is there a straightforward way of doing this re-indexing within xarray? I guess I could simply extract the numpy arrays and jump to pandas or something else, but I find the xarray library really powerful and useful.
One way might be to manually construct a pandas.MultiIndex from the latitude and longitude variables, assign it as the coordinate for the values dimension, and then unstack the Dataset:
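import pandas as pd

# ds is the Dataset shown in the question (one flat "values" dimension)
index = pd.MultiIndex.from_arrays(
    [ds.longitude.values, ds.latitude.values], names=['lon', 'lat']
)
ds['values'] = index  # attach the MultiIndex to the "values" dimension
reshaped = ds.unstack('values')  # split "values" into "lon" and "lat" dimensions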
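To try the idea without the original GRIB file, here is a minimal sketch on a small synthetic dataset. The toy 3 x 4 grid, the data values, and the variable names are all made up for illustration. Note that it uses a closely related variant rather than the exact code above: Dataset.set_index builds the MultiIndex from the existing latitude and longitude coordinates, so the resulting dimensions keep those names instead of lon and lat.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy grid: 3 latitudes x 4 longitudes, flattened into a single
# "values" dimension to mimic the layout of the Dataset in the question.
lats = np.array([10.0, 0.0, -10.0])
lons = np.array([0.0, 90.0, 180.0, 270.0])
lat_flat = np.repeat(lats, lons.size)  # 10, 10, 10, 10, 0, 0, ...
lon_flat = np.tile(lons, lats.size)    # 0, 90, 180, 270, 0, 90, ...

times = pd.date_range("2007-12-01", periods=2, freq="D")
t2m = np.arange(times.size * lat_flat.size, dtype="float32").reshape(times.size, -1)

ds = xr.Dataset(
    {"t2m": (("time", "values"), t2m)},
    coords={
        "time": times,
        "latitude": ("values", lat_flat),
        "longitude": ("values", lon_flat),
    },
)

# Turn the two per-cell coordinates into a MultiIndex on "values",
# then unstack that index into separate latitude/longitude dimensions.
reshaped = (
    ds.set_index(values=["latitude", "longitude"])
    .unstack("values")
    .transpose("time", "latitude", "longitude")
)

# Label-based selection now works on the new dimensions.
cell = reshaped.t2m.sel(time=times[0], latitude=10.0, longitude=90.0)
```

One caveat: if the real grid is not a full rectangle (for example, a different number of longitudes per latitude row), unstack fills the missing latitude/longitude combinations with NaN, so the result can be much larger and sparser than the flat array.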
For more on this, see the "Reshaping and reorganizing data" section of the xarray documentation.