xarray groupby on coordinate and non-coordinate variables

Question

I am trying to compute the distribution of a variable in an xarray Dataset. I can get what I'm looking for by converting the Dataset to a pandas DataFrame, as follows:

import numpy as np
import pandas as pd
import xarray as xr

lon = np.linspace(0,10,11)
lat =  np.linspace(0,10,11)
time = np.linspace(0,10,1000)


temperature = 3*np.random.randn(len(lat),len(lon),len(time))

ds = xr.Dataset(
    data_vars=dict(
        temperature=(["lat", "lon", "time"], temperature),
    ),
    coords=dict(
        lon=lon,
        lat=lat,
        time=time,
    ),
)

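# 21 bin edges give 20 bins; each bin is labelled by its midpoint
bin_t = np.linspace(-10,10,21)
DS = ds.to_dataframe()
DS.loc[:,'temperature_bin'] = pd.cut(DS['temperature'],bin_t,labels=(bin_t[0:-1]+bin_t[1:])*0.5)
# observed=False keeps empty bins so every (lat, lon) pair has all 20 bins
DS_stats = DS.reset_index().groupby(['lat','lon','temperature_bin'], observed=False).count()
ds_stats = DS_stats.to_xarray()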

<xarray.Dataset>
Dimensions:          (lat: 11, lon: 11, temperature_bin: 20)
Coordinates:
  * lat              (lat) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
  * lon              (lon) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
  * temperature_bin  (temperature_bin) float64 -9.5 -8.5 -7.5 ... 7.5 8.5 9.5
Data variables:
    time             (lat, lon, temperature_bin) int64 0 1 8 13 18 ... 9 5 3 0
    temperature      (lat, lon, temperature_bin) int64 0 1 8 13 18 ... 9 5 3 0

Is there a way to produce ds_stats without converting to a DataFrame? I tried groupby_bins, but it does not preserve the coordinates.

print(ds.groupby_bins('temperature',bin_t).count())

distributed.utils_perf - WARNING - full garbage collections took 21% CPU time recently (threshold: 10%)

<xarray.Dataset>
Dimensions:           (temperature_bins: 20)
Coordinates:
  * temperature_bins  (temperature_bins) object (-10.0, -9.0] ... (9.0, 10.0]
Data variables:
    temperature       (temperature_bins) int64 121 315 715 1677 ... 709 300 116

Answer 1

xhistogram may help here.

Using the same definitions you set up above,

from xhistogram import xarray as xhist
ds_stats = xhist.histogram(ds.temperature, bins=bin_t,dim=['time'])

should do the trick.
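If xhistogram is not installed, roughly the same per-(lat, lon) counts can be sketched with numpy and xarray alone. This is a different technique from the xhistogram call above: it wraps np.histogram in xr.apply_ufunc. The helper name hist_along_time and the output dimension name temperature_bin are my own choices for illustration. Note that np.histogram bins are half-open [a, b) with the last bin closed, whereas pd.cut uses (a, b] by default, so values lying exactly on an edge may be counted in a neighbouring bin.

```python
import numpy as np
import xarray as xr

# Same setup as in the question, with a seeded generator for reproducibility
rng = np.random.default_rng(0)
lon = np.linspace(0, 10, 11)
lat = np.linspace(0, 10, 11)
time = np.linspace(0, 10, 1000)
temperature = 3 * rng.standard_normal((len(lat), len(lon), len(time)))

ds = xr.Dataset(
    data_vars=dict(temperature=(["lat", "lon", "time"], temperature)),
    coords=dict(lon=lon, lat=lat, time=time),
)
bin_t = np.linspace(-10, 10, 21)


def hist_along_time(da, bins):
    """Count values per bin along 'time', keeping all other dimensions.

    Hypothetical helper, not part of xarray or xhistogram.
    """
    counts = xr.apply_ufunc(
        lambda x: np.histogram(x, bins=bins)[0],  # 1-D counts for one (lat, lon)
        da,
        input_core_dims=[["time"]],               # dimension consumed
        output_core_dims=[["temperature_bin"]],   # new dimension produced
        vectorize=True,                           # loop over remaining dims
    )
    # Label each bin by its midpoint, as in the pandas version
    centers = 0.5 * (bins[:-1] + bins[1:])
    return counts.assign_coords(temperature_bin=centers)


counts = hist_along_time(ds.temperature, bin_t)
print(counts.dims, counts.shape)
```

Because vectorize=True loops in Python over every (lat, lon) pair, this is likely to be slower than xhistogram on large grids; it is mainly useful as a dependency-free cross-check.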

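One difference is that it returns a DataArray rather than a Dataset, so if you want to do this for several variables, I believe you have to do it for each one separately and then recombine them.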
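The per-variable loop and recombination could look something like the sketch below. To keep it runnable without xhistogram, hist_over_time is a hypothetical stand-in for the xhist.histogram call; with xhistogram installed you would call that instead. The second variable humidity and its bin edges are made up for illustration.

```python
import numpy as np
import xarray as xr


def hist_over_time(da, bins):
    """Hypothetical stand-in for xhist.histogram(da, bins=bins, dim=['time'])."""
    bin_dim = f"{da.name}_bin"  # one bin dimension per variable
    counts = xr.apply_ufunc(
        lambda x: np.histogram(x, bins=bins)[0],
        da,
        input_core_dims=[["time"]],
        output_core_dims=[[bin_dim]],
        vectorize=True,
    )
    return counts.assign_coords({bin_dim: 0.5 * (bins[:-1] + bins[1:])})


rng = np.random.default_rng(0)
shape = (3, 4, 200)
ds = xr.Dataset(
    data_vars=dict(
        temperature=(["lat", "lon", "time"], 3 * rng.standard_normal(shape)),
        humidity=(["lat", "lon", "time"], rng.uniform(0, 100, shape)),
    ),
    coords=dict(lat=np.arange(3.0), lon=np.arange(4.0), time=np.arange(200.0)),
)

# Each variable gets its own bin edges
bins = {
    "temperature": np.linspace(-10, 10, 21),
    "humidity": np.linspace(0, 100, 11),
}

# Histogram each variable separately, then recombine into one Dataset
ds_stats = xr.Dataset(
    {name: hist_over_time(ds[name], bins[name]) for name in ds.data_vars}
)
print(ds_stats)
```

Giving each variable its own bin dimension lets the histograms sit side by side in one Dataset even when the bin edges differ.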

Solution 1, 2021-10-26 19:36:34
