How to reshape xarray data with new dimensions
I'm fairly new to the xarray library, and I am stuck on what seems to be a fairly straightforward task. I have global climate data in a GRIB file for different 30-km grids. The data looks like this:
<xarray.Dataset>
Dimensions: (time: 736, values: 542080)
Coordinates:
number int64 0
* time (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
step timedelta64[ns] 00:00:00
surface int64 0
latitude (values) float64 89.78 89.78 89.78 ... -89.78 -89.78 -89.78
longitude (values) float64 0.0 20.0 40.0 60.0 ... 280.0 300.0 320.0 340.0
valid_time (time) datetime64[ns] 2007-12-01 ... 2008-03-01T21:00:00
Dimensions without coordinates: values
Data variables:
t2m (time, values) float32 247.30748 247.49889 ... 225.18036
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: 2020-01-21T09:40:59 GRIB to CDM+CF via cfgrib-0....
And that is fine. I can access different time instances and plot things, and even access the data per cell using data.t2m.data. But the data is indexed only by time and values. The latter is, I assume, a cell number identifier, so latitude and longitude are not being read as meaningful dimensions.
In the documentation, the authors use the airtemp reanalysis data as an example. That data is indexed by lat, lon, and time, which is what I want for my dataset.
<xarray.Dataset>
Dimensions: (lat: 25, lon: 53, time: 2920)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 ...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Is there a straightforward way of doing this re-indexing in the xarray environment? I guess I could simply extract the numpy arrays and jump to pandas or something else, but I find the xarray library really powerful and useful.
One way might be to manually construct a pandas.MultiIndex from the latitude and longitude variables, assign it as the coordinate for the values dimension, and then unstack the Dataset:
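import pandas as pd

# Build a two-level index from the per-cell longitude/latitude coordinates.
index = pd.MultiIndex.from_arrays(
    [ds.longitude.values, ds.latitude.values], names=['lon', 'lat']
)
# Use it as the coordinate of the 'values' dimension, then unstack it
# into separate 'lon' and 'lat' dimensions.
ds['values'] = index
reshaped = ds.unstack('values')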
For more on this, see the "Reshaping and reorganizing data" section of the xarray documentation.