Pandas DataFrame: applying a function within groups

Question

I have a pandas DataFrame with stock price data, shown below:

      ticker       date    open    high     low   close      volume
0        A2M 2015-03-31   0.555   0.595   0.530   0.565   4816294.0
1        A2M 2015-04-30   0.475   0.500   0.475   0.500    531816.0
2        A2M 2015-05-29   0.475   0.475   0.455   0.465   5665854.0
3        A2M 2015-06-30   0.640   0.650   0.630   0.640   1691918.0
4        A2M 2015-07-31   0.750   0.760   0.730   0.735    714927.0
...      ...        ...     ...     ...     ...     ...         ...
45479    ZFX 2008-01-31  10.090  10.490   9.860  10.280   4484500.0
45480    ZFX 2008-02-29  10.650  11.130  10.650  11.130  15525073.0
45481    ZFX 2008-03-31  10.010  10.080   9.920   9.980   4256951.0
45482    ZFX 2008-04-30   9.900  10.190   9.850  10.100   3522569.0
45483    ZFX 2008-05-30   9.750   9.750   9.450   9.500   8270995.0

My goal is to add columns to the DataFrame for the 3, 6, 9 and 12 month rate of change. I have developed the functions below:

import pandas as pd

# defines the ROC function: close divided by the close roc_periods rows earlier, minus 1
def roc(df, roc_periods):
    roc = df['close'] / df['close'].shift(roc_periods) - 1
    return pd.DataFrame(roc)

# defines the periods for the ROC calculations
def roc_periods(df, months):
    for month in months:
        df['{}mo_roc'.format(month)] = roc(df, month)
    return df

# specify the roc periods to calculate
periods = roc_periods(monthly_raw_data, [3, 6, 9, 12])

The output DataFrame is as follows:

      ticker       date    open    high     low   close      volume   3mo_roc  \
0        A2M 2015-03-31   0.555   0.595   0.530   0.565   4816294.0       NaN   
1        A2M 2015-04-30   0.475   0.500   0.475   0.500    531816.0       NaN   
2        A2M 2015-05-29   0.475   0.475   0.455   0.465   5665854.0       NaN   
3        A2M 2015-06-30   0.640   0.650   0.630   0.640   1691918.0  0.132743   
4        A2M 2015-07-31   0.750   0.760   0.730   0.735    714927.0  0.470000   
...      ...        ...     ...     ...     ...     ...         ...       ...   
45479    ZFX 2008-01-31  10.090  10.490   9.860  10.280   4484500.0 -0.382583   
45480    ZFX 2008-02-29  10.650  11.130  10.650  11.130  15525073.0 -0.229224   
45481    ZFX 2008-03-31  10.010  10.080   9.920   9.980   4256951.0 -0.195161   
45482    ZFX 2008-04-30   9.900  10.190   9.850  10.100   3522569.0 -0.017510   
45483    ZFX 2008-05-30   9.750   9.750   9.450   9.500   8270995.0 -0.146451   

        6mo_roc   9mo_roc  12mo_roc  
0           NaN       NaN       NaN  
1           NaN       NaN       NaN  
2           NaN       NaN       NaN  
3           NaN       NaN       NaN  
4           NaN       NaN       NaN  
...         ...       ...       ...  
45479 -0.483677 -0.378852 -0.373171  
45480 -0.340640 -0.367614 -0.334330  
45481 -0.436795 -0.469713 -0.367554  
45482 -0.393393 -0.492717 -0.389728  
45483 -0.342105 -0.437204 -0.460227

The problem is that I cannot seem to get the .groupby() method to work. As a result, the rate of change columns roll through all tickers as if they were one continuous series, rather than being calculated separately for each ticker. I've tried placing the .groupby() method at various points in the code, but I receive KeyError: 'ticker' messages. For the purposes of asking here, I've removed my groupby attempts altogether.

Answer 1

You can pass parameters to a function that you apply after a groupby. Just change roc_periods to use it:

# defines the periods for the ROC calculations
def roc_periods(df, months):
    for month in months:
        # group_keys=False keeps the original row index so the result aligns with df
        df['{}mo_roc'.format(month)] = df.groupby('ticker', group_keys=False).apply(roc, month)
    return df
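
As a variation on the same idea, here is a minimal sketch that skips apply and shifts the grouped close column directly. It assumes the rows are already sorted by date within each ticker. The name add_roc_columns and the two-ticker sample are made up for illustration and are not from the question.

```python
import pandas as pd

def add_roc_columns(df, months):
    """Return a copy of df with one '<n>mo_roc' column per period, computed per ticker.

    Hypothetical helper; assumes 'ticker' and 'close' columns and date-sorted rows.
    """
    out = df.copy()
    # Group the untouched input so each shift only looks back within one ticker
    closes = df.groupby('ticker')['close']
    for month in months:
        out['{}mo_roc'.format(month)] = df['close'] / closes.shift(month) - 1
    return out

# Made-up sample data: two tickers with four monthly closes each
sample = pd.DataFrame({
    'ticker': ['AAA'] * 4 + ['BBB'] * 4,
    'close': [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})
result = add_roc_columns(sample, [1, 3])
print(result)
```

With this approach the first rows of each ticker come out as NaN instead of borrowing closes from the previous ticker, which is the per-ticker behaviour the question asks for.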

Answer 1
0 Accepted 2020-04-23 08:18:25
