Remove a dimension from some variables in an xarray Dataset
I have an xarray Dataset where some variables have more dimensions than necessary (e.g., a 3D dataset where the "latitude" and "longitude" variables also vary along time). How do I remove the extra dimensions?
For example, in the dataset below, 'bar' is a 2D variable along the x and y axes, with constant values along the x axis. How do I remove the x dimension from 'bar' but not 'foo'?
>>> import numpy as np
>>> import xarray as xr
>>> ds = xr.Dataset({'foo': (('x', 'y'), np.random.randn(2, 3))},
...                 {'x': [1, 2], 'y': [1, 2, 3],
...                  'bar': (('x', 'y'), [[4, 5, 6], [4, 5, 6]])})
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (x, y) int64 4 5 6 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643
The most direct way to remove the extra dimension (using indexing) results in a slightly confusing error message:
>>> ds['bar'] = ds['bar'].sel(x=1)
ValueError: dimension 'x' already exists as a scalar variable
The problem is that when you index in xarray, it keeps the indexed coordinates around as scalar coordinates:
>>> ds['bar'].sel(x=1)
<xarray.DataArray 'bar' (y: 3)>
array([4, 5, 6])
Coordinates:
    x        int64 1
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6
This is often useful, but in this case the scalar coordinate 'x' on the indexed array conflicts with the non-scalar coordinate (and dimension) 'x' when you try to set it on the original dataset. Hence xarray raises an error instead of overriding the variable.
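To see the leftover scalar coordinate for yourself, a minimal sketch along these lines may help. It rebuilds the example dataset from above; the variable name `indexed` is only illustrative and not part of the original post.

```python
import numpy as np
import xarray as xr

# Rebuild the example dataset: 'bar' is a 2D coordinate, 'foo' a 2D data variable.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

# Label-based indexing removes the 'x' dimension from the result...
indexed = ds["bar"].sel(x=1)
print(indexed.dims)  # ('y',)

# ...but 'x' is still attached to the result as a zero-dimensional coordinate,
# while the dataset's own 'x' is a one-dimensional coordinate.
print("x" in indexed.coords)
print(indexed.coords["x"].ndim, ds.coords["x"].ndim)
```

The mismatch between the 0-d 'x' on the indexed array and the 1-d 'x' on the dataset is the conflict described above.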
To get around this, you need to drop the scalar 'x' after indexing. In the current version of xarray, you can do this with drop:
>>> ds['bar'] = ds['bar'].sel(x=1).drop('x')
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643
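As a script rather than an interactive session, the same two-step approach might look like the sketch below. It uses `drop_vars`, the name more recent xarray releases provide for dropping variables and coordinates (the older `drop` spelling is deprecated there); if you are on an old release, substitute `drop`. The name `bar_1d` is only illustrative.

```python
import numpy as np
import xarray as xr

# Rebuild the example dataset from the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

# Step 1: index away the 'x' dimension.
# Step 2: drop the scalar 'x' coordinate that indexing leaves behind.
bar_1d = ds["bar"].sel(x=1).drop_vars("x")

# Assign the 1D result back; 'foo' keeps both of its dimensions.
ds["bar"] = bar_1d

print(ds["bar"].dims)
print(ds["foo"].dims)
```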
In future versions of xarray (v0.9 and later), you will be able to drop coordinates when indexing by writing drop=True, e.g., ds['bar'].sel(x=1, drop=True).
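A short sketch of the one-step form, assuming an xarray release that supports the `drop` keyword on `sel`. The dataset is rebuilt so the snippet runs on its own, and `bar_1d` is an illustrative name.

```python
import numpy as np
import xarray as xr

# Rebuild the example dataset from the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

# drop=True discards the scalar 'x' coordinate at indexing time,
# so no separate drop step is needed afterwards.
bar_1d = ds["bar"].sel(x=1, drop=True)
print("x" in bar_1d.coords)

# The result can be assigned straight back to the dataset.
ds["bar"] = bar_1d
print(ds["bar"].dims)
```

This only makes sense when the variable really is constant along the dropped dimension, as 'bar' is here; otherwise selecting a single x value silently discards data.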