Get mean of previous month in Pandas

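I'm having some trouble with my pandas DataFrame. I want to compare each measurement with the measurements of the previous month. For that I need additional columns containing the mean and the standard deviation of the previous month.

I have the following table: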

    Time    Value 1 Value 2 Value 3 Value 4
0   2020-04-01 03:42:51.531 9.189975    6.475000    3.962500    6.100006
1   2020-04-06 05:42:39.778 8.799253    7.300000    3.775000    6.119995
2   2020-04-06 06:45:55.211 8.824507    7.250000    3.600000    6.100006
3   2020-04-06 18:53:15.861 8.132523    6.312500    3.275000    6.100006
4   2020-04-07 05:39:54.373 8.772517    6.887500    3.962500    6.100006
... ... ... ... ... ...
17271   2021-03-31 22:12:32.374 9.012240    7.375000    3.750000    6.179993
17272   2021-03-31 22:43:51.906 9.038265    7.225000    3.800000    6.200012
17273   2021-03-31 23:12:27.061 9.091208    7.137500    3.887500    6.179993
17274   2021-03-31 23:44:14.439 9.109208    7.287500    3.962500    6.199997
17275   2021-04-01 00:00:00.000 9.111931    7.274812    3.973665    6.198373

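For each of the four measurements at every time step, I want additional columns (8 extra columns in total) holding the mean and the standard deviation of the previous month. For example, for every Value 1 measurement in January 2021, I want the mean and standard deviation of Value 1 in December 2020.

I have been working on this for a few days, but I can't write working Python code for it. I hope someone can help me. Thanks in advance!

You should be able to reach your goal with the code below. It computes the mean and standard deviation for every month, then uses the Time column to look those values up and merge them back in.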

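import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Assumes df["Time"] already has a datetime64 dtype.
# Lookup key: the month-end date (at midnight) of the month before each row.
previous_month = df["Time"].dt.normalize() - MonthEnd(1)

# Per-month statistics, indexed by month-end date.
# Note: pandas 2.2+ deprecates the "M" alias in favor of "ME".
mu = df.groupby(pd.Grouper(key="Time", freq="M")).mean()
sigma = df.groupby(pd.Grouper(key="Time", freq="M")).std()

# Attach the previous month's mean to every row.
df = df.merge(
    mu,
    how="left",
    left_on=previous_month,
    right_index=True,
    suffixes=("", "_prev_mean"),
)
# Attach the previous month's standard deviation to every row.
df = df.merge(
    sigma,
    how="left",
    left_on=previous_month,
    right_index=True,
    suffixes=("", "_prev_std"),
)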
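
To see why the lookup works, it can help to check the merge key on its own. The sketch below uses three made-up timestamps (not taken from the question's data) and shows that normalizing and subtracting MonthEnd(1) maps each one to the last day of the preceding month, which is the same label that the monthly Grouper assigns to that month's statistics.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Made-up sample timestamps, chosen to include the edge cases of a
# month-end date and the first instant of a month.
times = pd.Series(pd.to_datetime([
    "2021-01-15 08:30:00",
    "2021-03-31 23:44:14",
    "2021-04-01 00:00:00",
]))

# normalize() drops the time of day; subtracting MonthEnd(1) then rolls
# each date back to the last day of the preceding month.
previous_month = times.dt.normalize() - MonthEnd(1)

print(previous_month.dt.strftime("%Y-%m-%d").tolist())
# ['2020-12-31', '2021-02-28', '2021-03-31']
```

Note that a timestamp that already falls on a month end (2021-03-31) still moves back to the end of February, so it is matched with February's statistics rather than its own month's.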

If your input DataFrame df looks like this:

                          Time    Value 1    Value 2    Value 3    Value 4
0   2021-01-01 01:37:49.148748   0.014568   0.041711   0.009694   0.047044
1   2021-01-01 03:29:24.939551   0.032042   0.073345   0.014901   0.051690
2   2021-01-01 06:00:53.871182   0.040758   0.105496   0.046904   0.073747
3   2021-01-01 16:59:30.672400   0.061262   0.113711   0.083658   0.073939
4   2021-01-02 01:36:59.195226   0.090762   0.115689   0.087191   0.081972
..                         ...        ...        ...        ...        ...
495 2021-04-18 05:26:41.805694  10.883107  11.917340  12.850949  13.834590
496 2021-04-18 11:52:30.124759  10.889271  11.946243  12.860569  13.870959
497 2021-04-18 13:27:59.735432  10.932131  11.977409  12.949012  13.929994
498 2021-04-18 18:58:02.280739  10.979734  11.988028  12.952918  13.991210
499 2021-04-18 19:17:01.745781  10.997603  11.995105  12.991302  13.995131

[500 rows x 5 columns]

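then the two merges produce the following (rows in the first month have no previous month in the data, so their new columns are NaN):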

                          Time    Value 1    Value 2    Value 3    Value 4  Value 1_prev_mean  Value 2_prev_mean  Value 3_prev_mean  Value 4_prev_mean  Value 1_prev_std  Value 2_prev_std  Value 3_prev_std  Value 4_prev_std
0   2021-01-01 01:37:49.148748   0.014568   0.041711   0.009694   0.047044                NaN                NaN                NaN                NaN               NaN               NaN               NaN               NaN
1   2021-01-01 03:29:24.939551   0.032042   0.073345   0.014901   0.051690                NaN                NaN                NaN                NaN               NaN               NaN               NaN               NaN
2   2021-01-01 06:00:53.871182   0.040758   0.105496   0.046904   0.073747                NaN                NaN                NaN                NaN               NaN               NaN               NaN               NaN
3   2021-01-01 16:59:30.672400   0.061262   0.113711   0.083658   0.073939                NaN                NaN                NaN                NaN               NaN               NaN               NaN               NaN
4   2021-01-02 01:36:59.195226   0.090762   0.115689   0.087191   0.081972                NaN                NaN                NaN                NaN               NaN               NaN               NaN               NaN
..                         ...        ...        ...        ...        ...                ...                ...                ...                ...               ...               ...               ...               ...
495 2021-04-18 05:26:41.805694  10.883107  11.917340  12.850949  13.834590           7.446702           8.458356           9.071314           9.948837          0.830943          0.961324          1.091647          1.102397
496 2021-04-18 11:52:30.124759  10.889271  11.946243  12.860569  13.870959           7.446702           8.458356           9.071314           9.948837          0.830943          0.961324          1.091647          1.102397
497 2021-04-18 13:27:59.735432  10.932131  11.977409  12.949012  13.929994           7.446702           8.458356           9.071314           9.948837          0.830943          0.961324          1.091647          1.102397
498 2021-04-18 18:58:02.280739  10.979734  11.988028  12.952918  13.991210           7.446702           8.458356           9.071314           9.948837          0.830943          0.961324          1.091647          1.102397
499 2021-04-18 19:17:01.745781  10.997603  11.995105  12.991302  13.995131           7.446702           8.458356           9.071314           9.948837          0.830943          0.961324          1.091647          1.102397

[500 rows x 13 columns]
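
As an alternative sketch, not part of the answer above, the same result can be obtained by keying on monthly periods instead of month-end timestamps. The sample data, the value_cols list, and the helper column name _month are all assumptions made for illustration; only two value columns are used to keep the example short.

```python
import pandas as pd

# Small made-up sample covering three consecutive months.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2020-12-05 10:00", "2020-12-20 12:00",
        "2021-01-03 09:00", "2021-01-25 18:00",
        "2021-02-10 07:00",
    ]),
    "Value 1": [1.0, 3.0, 10.0, 14.0, 5.0],
    "Value 2": [2.0, 6.0, 20.0, 28.0, 7.0],
})
value_cols = ["Value 1", "Value 2"]

# Hypothetical helper column: the calendar month of each row as a Period.
keyed = df.assign(_month=df["Time"].dt.to_period("M"))

# One row per month; columns are a MultiIndex of (value column, statistic).
stats = keyed.groupby("_month")[value_cols].agg(["mean", "std"])
# Flatten to names such as "Value 1_prev_mean".
stats.columns = [f"{col}_prev_{stat}" for col, stat in stats.columns]
# Relabel every month as the following one, so that looking up month m
# returns the statistics of month m - 1.
stats.index = stats.index + 1

# Left join on the helper column, then drop it again.
result = keyed.join(stats, on="_month").drop(columns="_month")
print(result)
```

Here the December 2020 rows get NaN in the new columns because the sample contains no November data, while the January 2021 rows receive December's mean and standard deviation. Shifting a PeriodIndex by an integer avoids any month-end date arithmetic, which some readers may find easier to reason about.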

