Get mean of previous month in Pandas
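I'm having some trouble with my pandas DataFrame. I want to compare each measurement with the measurements from the previous month. To do that, I need additional columns containing the mean and standard deviation of the previous month.
I have the following table: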
                         Time   Value 1   Value 2   Value 3   Value 4
0     2020-04-01 03:42:51.531  9.189975  6.475000  3.962500  6.100006
1     2020-04-06 05:42:39.778  8.799253  7.300000  3.775000  6.119995
2     2020-04-06 06:45:55.211  8.824507  7.250000  3.600000  6.100006
3     2020-04-06 18:53:15.861  8.132523  6.312500  3.275000  6.100006
4     2020-04-07 05:39:54.373  8.772517  6.887500  3.962500  6.100006
...                       ...       ...       ...       ...       ...
17271 2021-03-31 22:12:32.374  9.012240  7.375000  3.750000  6.179993
17272 2021-03-31 22:43:51.906  9.038265  7.225000  3.800000  6.200012
17273 2021-03-31 23:12:27.061  9.091208  7.137500  3.887500  6.179993
17274 2021-03-31 23:44:14.439  9.109208  7.287500  3.962500  6.199997
17275 2021-04-01 00:00:00.000  9.111931  7.274812  3.973665  6.198373
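For each of the four measured values at every time step, I want two additional columns (8 extra columns in total) containing the mean and standard deviation of the previous month. For example, for every Value 1 measurement in January 2021, I want the mean and standard deviation of Value 1 in December 2020.
I have been working on this for a few days, but I can't come up with working Python code. I hope someone can help. Thanks in advance!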
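You should be able to achieve this with the code below. It computes the mean and standard deviation for each month, then uses the Time column to look up and merge the previous month's values: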
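import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Assumes df["Time"] already has a datetime64 dtype.
# Month-end date of the month before each row's timestamp; used as the lookup key.
previous_month = df["Time"].dt.normalize() - MonthEnd(1)

# Per-month mean and standard deviation, indexed by month-end date.
# Note: recent pandas releases deprecate the "M" alias in favour of "ME".
mu = df.groupby(pd.Grouper(key="Time", freq="M")).mean()
sigma = df.groupby(pd.Grouper(key="Time", freq="M")).std()

# Attach the previous month's mean to every row.
df = df.merge(
    mu,
    how="left",
    left_on=previous_month,
    right_index=True,
    suffixes=("", "_prev_mean"),
)
# Attach the previous month's standard deviation to every row.
df = df.merge(
    sigma,
    how="left",
    left_on=previous_month,
    right_index=True,
    suffixes=("", "_prev_std"),
)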
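As a possible alternative, the same lookup can be sketched with monthly periods instead of month-end timestamps, which avoids depending on the exact frequency alias. This is only a sketch: the helper name `add_prev_month_stats` is hypothetical, and it assumes every column other than the time column is numeric.

```python
import pandas as pd


def add_prev_month_stats(df, time_col="Time"):
    """Hypothetical helper: append <col>_prev_mean and <col>_prev_std columns."""
    # Assumption: every column except the time column is a numeric measurement.
    value_cols = [c for c in df.columns if c != time_col]
    # Calendar month of each row, e.g. Period("2021-01", "M").
    month = df[time_col].dt.to_period("M")
    # One row per month; columns form a (value column, statistic) MultiIndex.
    stats = df.groupby(month)[value_cols].agg(["mean", "std"])
    # Flatten ("Value 1", "mean") into "Value 1_prev_mean".
    stats.columns = [f"{col}_prev_{stat}" for col, stat in stats.columns]
    # Look up the statistics of the month before each row's month.
    # Months that have no data produce NaN.
    prev = stats.reindex(month - 1)
    prev.index = df.index
    return pd.concat([df, prev], axis=1)
```

Calling `add_prev_month_stats(df)` returns a new frame and leaves `df` untouched. Unlike the merge-based code, the new columns come out interleaved (mean and std for Value 1, then for Value 2, and so on).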
Suppose your data looks like this:
                          Time    Value 1    Value 2    Value 3    Value 4
0   2021-01-01 01:37:49.148748   0.014568   0.041711   0.009694   0.047044
1   2021-01-01 03:29:24.939551   0.032042   0.073345   0.014901   0.051690
2   2021-01-01 06:00:53.871182   0.040758   0.105496   0.046904   0.073747
3   2021-01-01 16:59:30.672400   0.061262   0.113711   0.083658   0.073939
4   2021-01-02 01:36:59.195226   0.090762   0.115689   0.087191   0.081972
..                         ...        ...        ...        ...        ...
495 2021-04-18 05:26:41.805694  10.883107  11.917340  12.850949  13.834590
496 2021-04-18 11:52:30.124759  10.889271  11.946243  12.860569  13.870959
497 2021-04-18 13:27:59.735432  10.932131  11.977409  12.949012  13.929994
498 2021-04-18 18:58:02.280739  10.979734  11.988028  12.952918  13.991210
499 2021-04-18 19:17:01.745781  10.997603  11.995105  12.991302  13.995131
[500 rows x 5 columns]
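The answer does not show how this sample frame was built. A frame of similar shape (500 sorted timestamps from January to mid-April 2021, four slowly increasing value columns) could be generated roughly as follows. The seed, the value ranges, and the 108-day span are all assumptions chosen for illustration, so the numbers will not match the table above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary fixed seed so the sketch is repeatable
n = 500
start = pd.Timestamp("2021-01-01")

# Random offsets in seconds, sorted so the timestamps increase, spread over ~108 days.
offsets = np.sort(rng.uniform(0, 108 * 24 * 3600, size=n))
times = start + pd.to_timedelta(offsets, unit="s")

df = pd.DataFrame({"Time": times})
for i in range(1, 5):
    # Sorted uniform draws give monotonically increasing synthetic measurements.
    df[f"Value {i}"] = np.sort(rng.uniform(0, 10 + i, size=n))
```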
The result then looks like this:
Time Value 1 Value 2 Value 3 Value 4 Value 1_prev_mean Value 2_prev_mean Value 3_prev_mean Value 4_prev_mean Value 1_prev_std Value 2_prev_std Value 3_prev_std Value 4_prev_std
0 2021-01-01 01:37:49.148748 0.014568 0.041711 0.009694 0.047044 NaN NaN NaN NaN NaN NaN NaN NaN
1 2021-01-01 03:29:24.939551 0.032042 0.073345 0.014901 0.051690 NaN NaN NaN NaN NaN NaN NaN NaN
2 2021-01-01 06:00:53.871182 0.040758 0.105496 0.046904 0.073747 NaN NaN NaN NaN NaN NaN NaN NaN
3 2021-01-01 16:59:30.672400 0.061262 0.113711 0.083658 0.073939 NaN NaN NaN NaN NaN NaN NaN NaN
4 2021-01-02 01:36:59.195226 0.090762 0.115689 0.087191 0.081972 NaN NaN NaN NaN NaN NaN NaN NaN
.. ... ... ... ... ... ... ... ... ... ... ... ... ...
495 2021-04-18 05:26:41.805694 10.883107 11.917340 12.850949 13.834590 7.446702 8.458356 9.071314 9.948837 0.830943 0.961324 1.091647 1.102397
496 2021-04-18 11:52:30.124759 10.889271 11.946243 12.860569 13.870959 7.446702 8.458356 9.071314 9.948837 0.830943 0.961324 1.091647 1.102397
497 2021-04-18 13:27:59.735432 10.932131 11.977409 12.949012 13.929994 7.446702 8.458356 9.071314 9.948837 0.830943 0.961324 1.091647 1.102397
498 2021-04-18 18:58:02.280739 10.979734 11.988028 12.952918 13.991210 7.446702 8.458356 9.071314 9.948837 0.830943 0.961324 1.091647 1.102397
499 2021-04-18 19:17:01.745781 10.997603 11.995105 12.991302 13.995131 7.446702 8.458356 9.071314 9.948837 0.830943 0.961324 1.091647 1.102397
[500 rows x 13 columns]
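The January rows are NaN because the sample has no December data to summarise. Since the stated goal is to compare each measurement with the previous month, one way to use the new columns is a z-score, sketched below on a tiny hand-made frame. The numbers and the `Value 1_z` column name are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the merged result; the values are made up for illustration.
result = pd.DataFrame({
    "Value 1": [0.5, 9.0, 7.0],
    "Value 1_prev_mean": [np.nan, 7.0, 7.0],
    "Value 1_prev_std": [np.nan, 1.0, 1.0],
})

# How many previous-month standard deviations each measurement lies from
# the previous-month mean. Rows without a previous month stay NaN.
result["Value 1_z"] = (
    result["Value 1"] - result["Value 1_prev_mean"]
) / result["Value 1_prev_std"]

# Optionally keep only the rows that have previous-month statistics.
comparable = result.dropna(subset=["Value 1_prev_mean"])
```

Whether to drop the NaN rows or keep them depends on how the first month should be treated downstream.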