Python dataframe - conditional average per year
I have the following code:
import csv
import pandas as pd
import numpy as np
stocks_dataframe = pd.read_csv('^GSPC.csv', delimiter = ',')
stocks_dataframe['Percent_change'] = stocks_dataframe['Close'].pct_change()
stocks_dataframe['positive_return_day'] = np.where(stocks_dataframe['Percent_change']>=0, 1, 0)
stocks_dataframe['negative_return_day'] = np.where(stocks_dataframe['Percent_change']<0, 1, 0)
stocks_dataframe['positive_return_day'].value_counts()
stocks_dataframe['date'] = pd.to_datetime(stocks_dataframe['Date'])
stocks_dataframe['year'], stocks_dataframe['month'] = stocks_dataframe['date'].dt.year, stocks_dataframe['date'].dt.month
yearly_data = pd.DataFrame()
yearly_data['positive_return_day'] = stocks_dataframe['positive_return_day'].groupby([stocks_dataframe.year]).agg('sum')
yearly_data['negative_return_day'] = stocks_dataframe['negative_return_day'].groupby([stocks_dataframe.year]).agg('sum')
stocks_dataframe.groupby(stocks_dataframe.year)['Percent_change'].transform('mean')
How can I calculate the average return separately for positive return days and negative return days? I would like to get these values per year and store them in the yearly_data dataframe.
Here is the head of the stocks dataframe:
stocks_dataframe.head()
Out[35]:
Date Open High ... year month negative_return_day
0 1999-12-31 1464.469971 1472.420044 ... 1999 12 0
1 2000-01-03 1469.250000 1478.000000 ... 2000 1 1
2 2000-01-04 1455.219971 1455.219971 ... 2000 1 1
3 2000-01-05 1399.420044 1413.270020 ... 2000 1 0
4 2000-01-06 1402.109985 1411.900024 ... 2000 1 0
[5 rows x 13 columns]
Can't you just groupby again?
for year, df in stocks_dataframe.groupby(stocks_dataframe.year):
    print(year)
    # negative_return_day is 1 on negative-return days and 0 otherwise
    print(df.groupby(df.negative_return_day).Percent_change.mean())
Edit: now you can get the year too.
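The loop above only prints the averages. To store them per year, as the question asks, one option is to filter the rows by the sign of the return before grouping. The following is a minimal sketch, not a tested solution for the real file: the toy data stands in for ^GSPC.csv and its values are made up, and the column names avg_positive_return and avg_negative_return are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for ^GSPC.csv (values are made up).
stocks_dataframe = pd.DataFrame({
    "Date": ["1999-12-30", "1999-12-31", "2000-01-03", "2000-01-04", "2000-01-05"],
    "Close": [100.0, 110.0, 99.0, 108.9, 98.01],
})
stocks_dataframe["Percent_change"] = stocks_dataframe["Close"].pct_change()
stocks_dataframe["year"] = pd.to_datetime(stocks_dataframe["Date"]).dt.year

# Split the rows by the sign of the daily return. The first row has a NaN
# return, so it fails both comparisons and lands in neither subset.
positive = stocks_dataframe[stocks_dataframe["Percent_change"] >= 0]
negative = stocks_dataframe[stocks_dataframe["Percent_change"] < 0]

# Building the frame from a dict of Series takes the union of the year
# indexes, so a year with no negative (or no positive) days shows NaN
# in that column instead of being dropped.
yearly_data = pd.DataFrame({
    "avg_positive_return": positive.groupby("year")["Percent_change"].mean(),
    "avg_negative_return": negative.groupby("year")["Percent_change"].mean(),
})
print(yearly_data)
```

With the yearly_data frame from the question, the same two grouped means could instead be assigned as new columns, since they are indexed by year and should align with the existing rows.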