How to average monthly data to get yearly values in Python?

Question

I have a dataset that looks like this:

Date	Value
1871-01	4.5
1871-02	10.7
1871-03	8.9
1871-04	1.3

all the way to 2021-12.

How do I get the average value for each year in Python? For example, the 1871 average would be the average of all the values from 1871-01 to 1871-12, and I would like this for every year from 1871 to 2021.

Answer 1

Given that your data is in a pandas DataFrame called df:

>>> df
    Date        Value
0   1871-01     4.5
1   1871-02     10.7
2   1871-03     8.9
3   1871-04     1.3
4   1872-02     1.5
5   1872-03     15.9
6   1872-04     7.3
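>>> import pandas as pd
>>> # select 'Value' so the string 'Date' column is not averaged; pandas 2.2+ prefers freq='YE' over the older alias 'Y'
>>> year_df = df.set_index(pd.to_datetime(df['Date']))[['Value']].groupby(pd.Grouper(freq='Y')).mean()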
>>> year_df.index = year_df.index.year
>>> year_df
Date    Value
1871    6.35
1872    8.233333333333333
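
As a self-contained cross-check of the numbers above, here is a minimal sketch that rebuilds the same seven-row frame and groups on the year of the parsed dates. Grouping on `.dt.year` is a swapped-in alternative to `pd.Grouper`, chosen only because it does not depend on a frequency alias; the names `years` and `year_means` are illustrative.

```python
import pandas as pd

# The same seven rows shown in the answer above.
df = pd.DataFrame({
    "Date": ["1871-01", "1871-02", "1871-03", "1871-04",
             "1872-02", "1872-03", "1872-04"],
    "Value": [4.5, 10.7, 8.9, 1.3, 1.5, 15.9, 7.3],
})

# Parse the "YYYY-MM" strings, then group the values by calendar year.
years = pd.to_datetime(df["Date"], format="%Y-%m").dt.year
year_means = df.groupby(years)["Value"].mean()
print(year_means)
```

Years with missing months (1872 here) are averaged over the months that are present, which matches the behaviour of the grouped mean in the answer.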

Answer 2

It depends on what format the data is given to you in. Is it JSON? CSV? If you already know how to import and read the data with Python, you just need to collect the values for each year and average them: (x1 + x2 + x3) / (number of values averaged)
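
One way to read this suggestion in plain Python, assuming the data arrives as CSV with `Date` and `Value` columns (the inline `raw` string is a hypothetical stand-in for a real file), is to bucket the values by year and divide each sum by its count:

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content; in practice this would come from an opened file.
raw = """Date,Value
1871-01,4.5
1871-02,10.7
1871-03,8.9
1871-04,1.3
1872-02,1.5
"""

# Collect every monthly value under its year.
values_by_year = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    year = int(row["Date"].split("-")[0])  # "1871-01" -> 1871
    values_by_year[year].append(float(row["Value"]))

# Average = sum of the values / number of values, per year.
yearly_avg = {year: sum(vals) / len(vals) for year, vals in values_by_year.items()}
print(yearly_avg)
```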

Answer 3

Make a NumPy array with the values, reshape it, and use np.mean.

Example with only 3 years' worth of "data":

import numpy as np

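# 36 random monthly values, i.e. 3 years (the length must be a multiple of 12)
values = np.random.normal(0, 1, 36)
# one row per year, then average across the 12 months in each row
yearly_avgs = np.mean(values.reshape((len(values) // 12, 12)), axis=1)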
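
To make the reshape step easier to verify by hand, here is a small sketch with deterministic made-up values instead of random ones. It assumes every year has all 12 months in order; `first_year` is an assumed label, not something read from the data.

```python
import numpy as np

# Two complete years of made-up monthly values: 1..12, then 13..24.
values = np.arange(1, 25, dtype=float)

# reshape only works when the length is an exact multiple of 12.
assert len(values) % 12 == 0, "reshape needs complete 12-month years"

# Each row holds one year; averaging along axis 1 gives one mean per year.
yearly_avgs = values.reshape(-1, 12).mean(axis=1)

first_year = 1871  # assumed starting year, used only for labelling
for offset, avg in enumerate(yearly_avgs):
    print(first_year + offset, avg)
```

If any month is missing, the rows no longer line up with calendar years, so a label-based grouping is the safer choice for incomplete data.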

Answer 4

This gives the average for each year from the monthly values. With this method there is no need to set the date as the index, and it returns a single-level DataFrame, as shown in the output.

import pandas as pd
import numpy as np
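# synthetic data: one random value per month from 1871-01 through 2021-12
dates = pd.date_range("1871-01", "2021-12", freq="MS")  # month starts, 1812 months
df = pd.DataFrame({"date": dates, "val": np.random.randint(10, 100, len(dates))})
df.loc[df["date"].dt.year == 1871, "val"].mean()  # mean for a single year, e.g. 57.666667
df.groupby(pd.PeriodIndex(df["date"], freq="Y"))["val"].mean().reset_index()  # mean for every year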

The method above returns the same output even if the date column has str data type.
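
A minimal sketch of the string case, using a tiny hypothetical frame (`df_str`) with "YYYY-MM" strings. The second option, slicing the first four characters, is an added alternative that assumes the year always occupies those characters.

```python
import pandas as pd

df_str = pd.DataFrame({
    "date": ["1871-01", "1871-02", "1872-01", "1872-02"],  # plain strings
    "val": [10, 20, 30, 50],
})

# Option 1: let pandas parse the strings into yearly periods.
by_period = df_str.groupby(pd.PeriodIndex(df_str["date"], freq="Y"))["val"].mean()

# Option 2: slice the year out of the string; no date parsing involved.
by_slice = df_str.groupby(df_str["date"].str[:4])["val"].mean()

print(by_period.tolist())
print(by_slice.tolist())
```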

The following returns the same output, provided the date column has a datetime type.

df.groupby(df["date"].dt.year)["val"].mean().reset_index()

Output of .head():

	date	val
0	1871	57.666667
1	1872	58.916667
2	1873	52.416667
3	1874	41.666667
4	1875	57.583333

4 answers

Answer 1	1	2022-07-13 18:43:17
Answer 2	0	2022-07-13 18:32:00
Answer 3	0	2022-07-13 18:39:32
Answer 4	0	2022-07-13 18:47:54
