Resample time series data at the end of the month and at the end of the day

Question

I have time series data in the following format.

DateShort (%d/%m/%Y)	TimeFrom	TimeTo	Value
1/1/2018	0:00	1:00	6414
1/1/2018	1:00	2:00	6153
...	...	...	...
1/1/2018	23:00	0:00	6317
2/1/2018	0:00	1:00	6046
...	...	...	...

I would like to resample the data at the end of each month, taking the last record of the day rather than the first.

The dataset can be retrieved from https://pastebin.com/raw/NWdigN97

pandas.DataFrame.resample() provides the 'M' rule, which retrieves data from the end of the month but at the beginning of the day.
See https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html

Is there a better way to accomplish this?

I have the following sample code:

import numpy as np
import pandas as pd

ds_url = 'https://pastebin.com/raw/NWdigN97'

df = pd.read_csv(ds_url, header=0)

df['DateTime'] = pd.to_datetime(
    df['DateShort'] + ' ' + df['TimeFrom'],
    format='%d/%m/%Y %H:%M'
)

df.drop('DateShort', axis=1, inplace=True)
df.set_index('DateTime', inplace=True)

df.resample('M').asfreq()

The output is:

           TimeFrom TimeTo  Value
DateTime                         
2018-01-31     0:00   1:00   7215
2018-02-28     0:00   1:00   8580
2018-03-31     0:00   1:00   6202
2018-04-30     0:00   1:00   5369
2018-05-31     0:00   1:00   5840
2018-06-30     0:00   1:00   5730
2018-07-31     0:00   1:00   5979
2018-08-31     0:00   1:00   6009
2018-09-30     0:00   1:00   5430
2018-10-31     0:00   1:00   6587
2018-11-30     0:00   1:00   7948
2018-12-31     0:00   1:00   6193

However, the correct output should be:

           TimeFrom TimeTo  Value
DateTime                            
2018-01-31  23:00   0:00    7605
2018-02-28  23:00   0:00    8790
2018-03-31  23:00   0:00    5967
2018-04-30  23:00   0:00    5595
2018-05-31  23:00   0:00    5558
2018-06-30  23:00   0:00    5153
2018-07-31  23:00   0:00    5996
2018-08-31  23:00   0:00    5757
2018-09-30  23:00   0:00    5785
2018-10-31  23:00   0:00    6437
2018-11-30  23:00   0:00    7830
2018-12-31  23:00   0:00    6767

Answer 1

Try this:

df.groupby(pd.Grouper(freq='M')).last()

Output:

           TimeFrom TimeTo  Value
DateTime                         
2018-01-31    23:00   0:00   7605
2018-02-28    23:00   0:00   8790
2018-03-31    23:00   0:00   5967
2018-04-30    23:00   0:00   5595
2018-05-31    23:00   0:00   5558
2018-06-30    23:00   0:00   5153
2018-07-31    23:00   0:00   5996
2018-08-31    23:00   0:00   5757
2018-09-30    23:00   0:00   5785
2018-10-31    23:00   0:00   6437
2018-11-30    23:00   0:00   7830
2018-12-31    23:00   0:00   6707
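
To see why this works, here is a minimal sketch on synthetic data. It does not use the dataset from the question: the `demo` frame and its `Value` column (simply the row number) are assumptions made for illustration. `resample(...).asfreq()` reindexes onto the month-end labels, which sit at midnight of the last day, whereas `groupby(...).last()` takes the final row that falls inside each monthly bin. The sketch passes `pd.offsets.MonthEnd()` instead of the string alias because, as far as I know, pandas 2.2 and later prefer 'ME' over 'M'. The offset object should behave the same on both sides of that change.

```python
import pandas as pd

# Synthetic hourly data for January and February 2018 (hypothetical values:
# Value is just the row number, so the selected row is easy to identify).
index = pd.date_range(
    "2018-01-01 00:00", "2018-02-28 23:00", freq=pd.Timedelta(hours=1)
)
demo = pd.DataFrame({"Value": range(len(index))}, index=index)

# Offset object standing in for the month-end string alias.
month_end = pd.offsets.MonthEnd()

# asfreq() reindexes onto the month-end labels, i.e. midnight of the last day.
at_midnight = demo.resample(month_end).asfreq()

# last() takes the final row inside each monthly bin (the 23:00 row),
# but labels it with the month-end date.
last_rows = demo.groupby(pd.Grouper(freq=month_end)).last()

# tail(1) selects the same rows while keeping their original timestamps.
last_rows_with_time = demo.groupby(pd.Grouper(freq=month_end)).tail(1)

print(at_midnight)
print(last_rows)
print(last_rows_with_time)
```

If the actual timestamp of the last record matters (23:00 rather than the month-end label), the `tail(1)` variant may be the more convenient one, since it returns the original rows unchanged.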
