
Selecting specific rows from a pandas DataFrame for an OLS regression


I have a pandas dataframe with 1,000 rows. I want to regress column A on columns B + C, for the first 10 rows. When I type:

mod = pd.ols(y=df['A'], x=df[['B','C']], window=10)

I get the regression results for rows 991-1000. How do I specify that I want the FIRST (or second, etc.) 10 rows?

Thanks in advance.

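I think you can use iloc to slice the rows by position before passing them to ols (rows 2-11 here; use iloc[0:10] for the first ten):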

mod = pd.ols(y=df['A'].iloc[2:12], x=df[['B','C']].iloc[2:12], window=10)
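
pd.ols was removed from pandas in version 0.20, so the call above only runs on older releases. As a rough sketch of the same idea on a current install, the row selection with iloc can be paired with numpy.linalg.lstsq as a stand-in for pd.ols. The synthetic data and the fit_rows helper are assumptions made for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 1,000-row frame in the question (assumed data).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 2)), columns=["B", "C"])
df["A"] = 1.0 + 2.0 * df["B"] - 3.0 * df["C"] + rng.normal(scale=0.1, size=1000)

def fit_rows(frame, start, stop):
    """OLS of A on B and C (with intercept) using rows start..stop-1 by position."""
    sub = frame.iloc[start:stop]
    # Design matrix: a column of ones for the intercept, then B and C.
    X = np.column_stack([np.ones(len(sub)), sub[["B", "C"]].to_numpy()])
    beta, *_ = np.linalg.lstsq(X, sub["A"].to_numpy(), rcond=None)
    return pd.Series(beta, index=["intercept", "B", "C"])

first_ten = fit_rows(df, 0, 10)    # rows 0-9
second_ten = fit_rows(df, 10, 20)  # rows 10-19
print(first_ten)
```

This returns only the coefficients; standard errors and t-stats would need a dedicated statistics package.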

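Or ix (label-based, so the end label is included; with a default integer index, 2:11 selects the same ten rows as iloc[2:12]):

mod = pd.ols(y=df.ix[2:11, 'A'], x=df.ix[2:11, ['B', 'C']], window=10)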
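
Position-based slicing excludes the end of the slice, while label-based slicing includes the end label, so the two forms can select different numbers of rows for the same bounds. A small check of that difference, using loc as the label-based indexer because ix is not available in recent pandas releases (the toy frame is an assumption for illustration):

```python
import pandas as pd

# Toy frame with the default integer index 0..19.
df = pd.DataFrame({"A": range(20), "B": range(20), "C": range(20)})

by_position = df.iloc[2:12]             # positions 2..11, end excluded
by_label = df.loc[2:12, ["B", "C"]]     # labels 2..12, end included

print(len(by_position), len(by_label))  # 10 11
```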

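If you need several windows, loop over the start positions with range. This example steps one row at a time (rows 0-9, 1-10, ..., 9-18); use range(0, len(df), 10) for non-overlapping blocks of ten:

for i in range(10):
    # fit on rows i .. i+9
    mod = pd.ols(y=df['A'].iloc[i:i + 10], x=df[['B','C']].iloc[i:i + 10], window=10)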
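
The loop above keeps only the last fit in mod. Below is a sketch that collects one set of coefficients per non-overlapping block of ten rows. It assumes a current pandas without pd.ols, so numpy.linalg.lstsq is swapped in for the regression, and the random data is made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up data: 50 rows, columns A, B, C.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["A", "B", "C"])

block = 10
rows = []
for start in range(0, len(df), block):
    sub = df.iloc[start:start + block]  # rows start .. start+9
    # Design matrix: intercept column, then B and C.
    X = np.column_stack([np.ones(len(sub)), sub[["B", "C"]].to_numpy()])
    beta, *_ = np.linalg.lstsq(X, sub["A"].to_numpy(), rcond=None)
    rows.append(beta)

# One row of coefficients per block, indexed by the block's first row.
betas = pd.DataFrame(rows, columns=["intercept", "B", "C"],
                     index=range(0, len(df), block))
print(betas)
```

Indexing the result by each block's starting row makes it easy to look up the "second", "third", etc. group afterwards, for example betas.loc[10].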

If you need help with ols, try help(pd.ols) in IPython, because this function is missing from the pandas docs:

In [79]: help(pd.ols)
Help on function ols in module pandas.stats.interface:

ols(**kwargs)
    Returns the appropriate OLS object depending on whether you need
    simple or panel OLS, and a full-sample or rolling/expanding OLS.

    Will be a normal linear regression or a (pooled) panel regression depending
    on the type of the inputs:

    y : Series, x : DataFrame -> OLS
    y : Series, x : dict of DataFrame -> OLS
    y : DataFrame, x : DataFrame -> PanelOLS
    y : DataFrame, x : dict of DataFrame/Panel -> PanelOLS
    y : Series with MultiIndex, x : Panel/DataFrame + MultiIndex -> PanelOLS

    Parameters
    ----------
    y: Series or DataFrame
        See above for types
    x: Series, DataFrame, dict of Series, dict of DataFrame, Panel
    weights : Series or ndarray
        The weights are presumed to be (proportional to) the inverse of the
        variance of the observations.  That is, if the variables are to be
        transformed by 1/sqrt(W) you must supply weights = 1/W
    intercept: bool
        True if you want an intercept.  Defaults to True.
    nw_lags: None or int
        Number of Newey-West lags.  Defaults to None.
    nw_overlap: bool
        Whether there are overlaps in the NW lags.  Defaults to False.
    window_type: {'full sample', 'rolling', 'expanding'}
        'full sample' by default
    window: int
        size of window (for rolling/expanding OLS). If window passed and no
        explicit window_type, 'rolling" will be used as the window_type

    Panel OLS options:
        pool: bool
            Whether to run pooled panel regression.  Defaults to true.
        entity_effects: bool
            Whether to account for entity fixed effects.  Defaults to false.
        time_effects: bool
            Whether to account for time fixed effects.  Defaults to false.
        x_effects: list
            List of x's to account for fixed effects.  Defaults to none.
        dropped_dummies: dict
            Key is the name of the variable for the fixed effect.
            Value is the value of that variable for which we drop the dummy.

            For entity fixed effects, key equals 'entity'.

            By default, the first dummy is dropped if no dummy is specified.
        cluster: {'time', 'entity'}
            cluster variances

    Examples
    --------
    # Run simple OLS.
    result = ols(y=y, x=x)

    # Run rolling simple OLS with window of size 10.
    result = ols(y=y, x=x, window_type='rolling', window=10)
    print(result.beta)

    result = ols(y=y, x=x, nw_lags=1)

    # Set up LHS and RHS for data across all items
    y = A
    x = {'B' : B, 'C' : C}

    # Run panel OLS.
    result = ols(y=y, x=x)

    # Run expanding panel OLS with window 10 and entity clustering.
    result = ols(y=y, x=x, cluster='entity', window_type='expanding', window=10)

    Returns
    -------
    The appropriate OLS object, which allows you to obtain betas and various
    statistics, such as std err, t-stat, etc.
