xarray - resample time series data from daily to hourly
I have a year-long netCDF dataset with time, latitude and longitude as coordinates, and KBDI-AWAP as a variable that is sampled every day.
The data is loaded into an xarray Dataset with Python and prints as below:
print(mds_kbdi)
Output:
<xarray.Dataset>
Dimensions: (latitude: 106, longitude: 193, time: 365)
Coordinates:
* latitude (latitude) float32 -39.2 -39.149525 ... -33.950478 -33.9
* longitude (longitude) float32 140.8 140.84792 140.89584 ... 149.95209 150.0
* time (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31
Data variables:
KBDI-AWAP (time, latitude, longitude) float32 dask.array<shape=(365, 106, 193), chunksize=(31, 106, 193)>
Attributes:
creationTime: 1525760660
creationTimeString: Mon May 7 23:24:20 PDT 2018
Conventions: COARDS
In detail (for each latitude and longitude):
Date KBDI-AWAP
2017-01-01 10.5
2017-01-02 9.2
2017-01-03 9.8
... ...
2017-12-31 8.2
I would like to resample the KBDI-AWAP values to an hourly interval, so the dimensions of the resampled dataset will be (latitude: 106, longitude: 193, time: 8760). Every hourly KBDI-AWAP value within a given date should equal that date's value in the original dataset.
The resampled data will be (for each latitude and longitude):
Date KBDI-AWAP
2017-01-01T00:00:00 10.5
2017-01-01T01:00:00 10.5
2017-01-01T02:00:00 10.5
...
2017-01-02T00:00:00 9.2
2017-01-02T01:00:00 9.2
2017-01-02T02:00:00 9.2
...
2017-01-03T00:00:00 9.8
2017-01-03T01:00:00 9.8
2017-01-03T02:00:00 9.8
... ...
... ...
2017-12-31T21:00:00 8.2
2017-12-31T22:00:00 8.2
2017-12-31T23:00:00 8.2
Thinking that I should use the resample function on the Dataset, I tried mds_kbdi_hourly = mds_kbdi.resample(time='H'), but this only returned a DatasetResample object instead of a new Dataset.
I tried both pad() and ffill() on the DatasetResample object, but with either of them the resampled data seems to be missing some data. The generated ['time'] coordinates are
['2017-01-01T00:00:00.000000000'
'2017-01-01T01:00:00.000000000'
'2017-01-01T02:00:00.000000000' ...
'2017-12-30T22:00:00.000000000'
'2017-12-30T23:00:00.000000000'
'2017-12-31T00:00:00.000000000'].
It is missing the timestamps from 2017-12-31T01:00:00.000000000 to 2017-12-31T23:00:00.000000000. How can I fix this problem?
You are looking for the pad or ffill method. For example:
mds_kbdi.resample(time='1H').pad()
The resample method always returns a Resample object. The Resample object is only useful once you apply one of its methods (e.g. pad).
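To make the distinction concrete, here is a minimal sketch on a small synthetic dataset. The name toy, the 2 x 2 grid and the three daily values are made up for illustration and merely stand in for mds_kbdi; the lowercase frequency alias '1h' is used on the assumption that a reasonably recent pandas is installed.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for mds_kbdi: 3 daily steps on a 2 x 2 grid.
time = pd.date_range("2017-01-01", periods=3, freq="D")
values = np.array([10.5, 9.2, 9.8], dtype="float32")
data = np.broadcast_to(values[:, None, None], (3, 2, 2)).copy()
toy = xr.Dataset(
    {"KBDI-AWAP": (("time", "latitude", "longitude"), data)},
    coords={"time": time, "latitude": [-39.2, -33.9], "longitude": [140.8, 150.0]},
)

resampler = toy.resample(time="1h")  # a Resample object, not yet a Dataset
hourly = resampler.ffill()           # applying a method yields a new Dataset

print(type(resampler).__name__)
print(type(hourly).__name__)
print(hourly["time"].values[[0, -1]])  # first and last upsampled timestamps
```

Note that the upsampled time axis stops at the last timestamp of the input (midnight of the final day), which matches the coordinates reported in the question.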
Xarray's documentation lists the available resample methods here: http://xarray.pydata.org/en/stable/api.html#resample-objects
and provides some examples of how they are used here: http://xarray.pydata.org/en/stable/time-series.html#resampling-and-grouped-operations
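Because upsampling with resample ends at the last input timestamp, the remaining 23 hours of the final day are not generated. One possible workaround, offered here as a sketch and not as part of the original answer, is to build the hourly time axis explicitly and forward-fill onto it with reindex. The toy dataset and the names toy and hourly_time are hypothetical stand-ins for mds_kbdi.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for mds_kbdi: 3 daily steps on a 2 x 2 grid.
time = pd.date_range("2017-01-01", periods=3, freq="D")
values = np.array([10.5, 9.2, 9.8], dtype="float32")
data = np.broadcast_to(values[:, None, None], (3, 2, 2)).copy()
toy = xr.Dataset(
    {"KBDI-AWAP": (("time", "latitude", "longitude"), data)},
    coords={"time": time, "latitude": [-39.2, -33.9], "longitude": [140.8, 150.0]},
)

# Hourly target axis that runs through 23:00 of the final day.
daily_index = toy.indexes["time"]
hourly_time = pd.date_range(
    start=daily_index[0],
    end=daily_index[-1] + pd.Timedelta(hours=23),
    freq=pd.Timedelta(hours=1),
)

# Forward-fill each day's value onto every hour of that day.
hourly = toy.reindex(time=hourly_time, method="ffill")

print(hourly.sizes["time"])
print(hourly["time"].values[-1])
```

Applied to a full, non-leap year of daily data, the same construction should give 365 * 24 = 8760 time steps, which is the dimension the question asks for.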