
Python Xarray integrate a 2D array along datetime dimension

I have very large 2D variables in Xarray. They have the form counts(time, altitude), where time is a numpy datetime sampled every 10 seconds, altitude is a float, and counts are floats with occasional NaNs.

I would like to reduce the resolution to every 15 minutes by summing or averaging over the corresponding columns.

Likewise, I would like to do the same along the rows of counts in the altitude dimension.

I would appreciate some advice on how this should be done in Python (I'm still on the learning curve for Python).

You could use the resample method from xarray (see the examples in the docs: https://xarray.pydata.org/en/stable/generated/xarray.Dataset.resample.html ).

The frequency strings it accepts are the named time offsets from pandas (see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects ).
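To make the idea of a named offset concrete, here is a minimal sketch using pandas alone. The start date and variable names are made up for illustration; the only point is that a string such as '15min' describes a fixed step between timestamps.

```python
import pandas as pd

# "15min" is a pandas offset alias; date_range accepts the same kind of string.
# The start date is an arbitrary placeholder.
stamps = pd.date_range("2024-01-01 00:00", periods=3, freq="15min")

# The spacing between consecutive timestamps is the step the alias describes.
step = stamps[1] - stamps[0]
print(step)
```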

For example, a yearly mean computed from a daily dataset should look something like this (pandas 2.2 and later prefer the alias 'YE' over 'Y'):

ds.resample(time='Y').mean('time')

In your case it should be:

ds.resample(time='15min').mean('time')

where time is the time coordinate in your dataset and '15min' is the named time offset from https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects .
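Here is a self-contained sketch of how that might look on data shaped like yours. The array is synthetic (one hour of 10-second samples at 8 made-up altitude levels) and all variable names are placeholders. The altitude step at the end is not part of the resample approach above: it assumes that averaging fixed-size blocks of neighbouring levels with coarsen is acceptable, which may or may not suit your altitude grid.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real data: 360 samples, 10 s apart, 8 levels.
time = pd.date_range("2024-01-01", periods=360, freq=pd.Timedelta(seconds=10))
altitude = np.arange(8) * 100.0  # hypothetical altitude levels
rng = np.random.default_rng(0)
values = rng.random((time.size, altitude.size))
values[5, 2] = np.nan  # an occasional missing value

counts = xr.DataArray(
    values,
    coords={"time": time, "altitude": altitude},
    dims=("time", "altitude"),
    name="counts",
)

# 15-minute bins along time. For float data, NaNs are skipped by default.
counts_mean = counts.resample(time="15min").mean()

# Summing instead of averaging; NaNs are likewise skipped by default.
counts_sum = counts.resample(time="15min").sum()

# Altitude is not a datetime axis, so resample does not apply to it.
# Assumption: block-averaging every 4 neighbouring levels is what is wanted.
counts_coarse = counts_mean.coarsen(altitude=4, boundary="trim").mean()

print(counts_mean.sizes, counts_coarse.sizes)
```

Note that sum treats a bin containing only NaNs as 0 by default, so if whole 15-minute windows can be missing, the mean (which stays NaN for such a bin) may be the safer of the two.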
