How to replace the current dimension of an xarray object with two new ones

I am a pandas user migrating to xarray because I work with 3D geospatial data. There are things I only know how to do in pandas, and it often makes no sense to convert to a pandas DataFrame and then back to an xarray Dataset.

What I want to do is replace the current dimension of an xarray object with two new ones, both of which are currently data variables in that object.

Suppose data is an xarray Dataset like this:

<xarray.Dataset>
Dimensions:  (index: 9)
Coordinates:
  * index    (index) int64 0 1 2 3 4 5 6 7 8
Data variables:
    Letter   (index) object 'A' 'A' 'A' 'B' 'B' 'B' 'C' 'C' 'C'
    Number   (index) int64 1 2 3 1 2 3 1 2 3
    Value1   (index) float64 0.5453 1.184 -1.177 0.8232 ... -1.253 0.3274 -1.583
    Value2   (index) float64 -0.4184 -0.3325 0.6826 ... -0.264 0.07381 0.4357

I want to reshape and reindex the variables Value1 and Value2 so that Letter and Number become their dimensions. The way I usually do it is:

reindexed = data.to_dataframe().set_index(['Letter','Number']).to_xarray()

That returns:

<xarray.Dataset>
Dimensions:  (Letter: 3, Number: 3)
Coordinates:
  * Letter   (Letter) object 'A' 'B' 'C'
  * Number   (Number) int64 1 2 3
Data variables:
    Value1   (Letter, Number) float64 0.5453 1.184 -1.177 ... 0.3274 -1.583
    Value2   (Letter, Number) float64 -0.4184 -0.3325 0.6826 ... 0.07381 0.4357

This works well when the data is not too big, but it seems wasteful because converting to a DataFrame loads everything into memory. I would like a faster, lighter way to do the same thing using xarray only.

To help reproduce the problem, the code below creates data similar to what I get after reading the NetCDF file.

import numpy as np
import pandas as pd


df = pd.DataFrame()
df['Letter'] = 'A A A B B B C C C'.split()
df['Number'] = [1,2,3,1,2,3,1,2,3]
df['Value1'] = np.random.randn(9)
df['Value2'] = np.random.randn(9)
data = df.to_xarray()

You should be able to do this using the code below. You cannot remove dimensions in xarray, so you first have to replace the values of "index" with the values of Letter or Number, and then rename the index dimension.

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['Letter'] = 'A A A B B B C C C'.split()
df['Number'] = [1,2,3,1,2,3,1,2,3]
df['Value1'] = np.random.randn(9)
df['Value2'] = np.random.randn(9)
data = df.to_xarray()

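(
    data
    # Use the Letter values as the labels of the existing "index" dimension.
    .assign_coords({"index": data.Letter.values})
    # Turn Number into a coordinate.
    .assign_coords({"Number": data.Number.values})
    # Letter now lives in the "index" coordinate, so drop the data variable
    # (drop_vars replaces the deprecated drop).
    .drop_vars("Letter")
    # Rename the dimension, then the coordinate that labels it.
    .rename_dims({"index": "Letter"})
    .rename({"index": "Letter"})
)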
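
Note that the chain above relabels the existing dimension; nothing in it reshapes Value1 and Value2 into a (Letter, Number) grid. One xarray-only way to get that grid, offered here as a sketch rather than as part of the original answer, is to build a MultiIndex from the two variables with set_index and then expand it with unstack. This assumes every (Letter, Number) pair occurs exactly once, as in the sample data. Whether it avoids loading a large or lazily loaded dataset into memory is not verified here.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame()
df['Letter'] = 'A A A B B B C C C'.split()
df['Number'] = [1, 2, 3, 1, 2, 3, 1, 2, 3]
df['Value1'] = np.random.randn(9)
df['Value2'] = np.random.randn(9)
data = df.to_xarray()

reindexed = (
    data
    # Replace the integer index with a MultiIndex built from Letter and Number.
    .set_index(index=['Letter', 'Number'])
    # Expand the MultiIndex levels into two separate dimensions.
    .unstack('index')
)

print(reindexed)
```

Any (Letter, Number) combination missing from the input would be filled with NaN after unstacking.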
