
Rename Pandas DataFrame inside function does not work

I want to implement this function to create lagged columns with new names. When I run the code line by line, it works perfectly. When I run it as a function, the line lag.columns = [rename] has no effect.

What is happening?

import pandas as pd

T  = [50, 48, 47, 49, 51, 53, 54, 52]
v1 = [1, 3, 2, 4, 5, 5, 6, 2] 
v2 = [2, 5, 4, 2, 3, 1, 6, 9]

dataframe = pd.DataFrame({'T': T, 'v1': v1, 'v2': v2})


def timeseries_to_supervised(data, ts=1, dropnan=True):
    '''
    Helper function to convert a time series dataframe to a supervised one.
    The response must be placed as the first column.
    Arguments:
        :data --> dataframe to transform into supervised
        :ts --> number of timesteps we want to shift
    Returns:
        :data --> dataframe with the lagged columns appended
    '''
    # n_vars = 1 if type(data) is list else data.shape[1]
    # y = data.loc[1]

    # Create lags
    for i, col in enumerate(list(data)):

        name = col
        rename = name + '(t-1)'
        lag  = pd.DataFrame(data.iloc[:, i]).shift(1)
        lag.colums = [rename]
        data = pd.concat([data, lag], axis=1)

    return data

reframed = timeseries_to_supervised(dataframe, 1)

So the function returns the data frame with the new columns, but their names do not include the '(t-1)' suffix.

Thanks in advance!

This works for me:

import pandas as pd
T  = [50, 48, 47, 49, 51, 53, 54, 52]
v1 = [1, 3, 2, 4, 5, 5, 6, 2] 
v2 = [2, 5, 4, 2, 3, 1, 6, 9]

dataframe = pd.DataFrame({'T': T, 'v1': v1, 'v2': v2})


def timeseries_to_supervised(data, ts=1, dropnan=True):

    # n_vars = 1 if type(data) is list else data.shape[1]
    # y = data.loc[1]

    # Create lags
    for i, col in enumerate(list(data)):

        name = col
        rename = name + '(t-1)'
        lag = pd.DataFrame(data.iloc[:, i].shift(1).values, columns=[rename], index=data.index)
        data = pd.concat([data, lag], axis=1)

    return data

reframed = timeseries_to_supervised(dataframe, 1)
print(reframed)

I only changed the way the lag DataFrame is created. This gives me:

    T  v1  v2  T(t-1)  v1(t-1)  v2(t-1)
0  50   1   2     NaN      NaN      NaN
1  48   3   5    50.0      1.0      2.0
2  47   2   4    48.0      3.0      5.0
3  49   4   2    47.0      2.0      4.0
4  51   5   3    49.0      4.0      2.0
5  53   5   1    51.0      5.0      3.0
6  54   6   6    53.0      5.0      1.0
7  52   2   9    54.0      6.0      6.0
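
The per-column loop can also be avoided by shifting the whole frame at once and renaming with add_suffix. The following is only a sketch under assumptions: the original function never uses ts or dropnan, so treating ts as the number of lags to generate and dropnan as "drop the rows left incomplete by shifting" is my interpretation, not something the question states.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'T':  [50, 48, 47, 49, 51, 53, 54, 52],
    'v1': [1, 3, 2, 4, 5, 5, 6, 2],
    'v2': [2, 5, 4, 2, 3, 1, 6, 9],
})


def timeseries_to_supervised(data, ts=1, dropnan=True):
    # Assumed meaning of `ts`: create lags 1..ts of every column.
    # Each shifted copy gets its columns renamed to '<col>(t-k)'.
    lags = [data.shift(k).add_suffix('(t-%d)' % k) for k in range(1, ts + 1)]
    result = pd.concat([data] + lags, axis=1)
    # Assumed meaning of `dropnan`: remove the first `ts` rows,
    # which contain NaN after shifting.
    return result.dropna() if dropnan else result


reframed = timeseries_to_supervised(dataframe, ts=1, dropnan=False)
print(reframed)
```

With ts=1 and dropnan=False this should produce the same six columns as the output above. Because the new names are set in the same expression that creates the shifted frame, there is no separate attribute assignment to misspell.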

You have a typo:

lag.colums = [rename]

This should be:

lag.columns = [rename]

With that fix it works for me. This is my output:

    T  v1  v2  T(t-1)  v1(t-1)  v2(t-1)
0  50   1   2     NaN      NaN      NaN
1  48   3   5    50.0      1.0      2.0
2  47   2   4    48.0      3.0      5.0
3  49   4   2    47.0      2.0      4.0
4  51   5   3    49.0      4.0      2.0
5  53   5   1    51.0      5.0      3.0
6  54   6   6    53.0      5.0      1.0
7  52   2   9    54.0      6.0      6.0
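
A short sketch of why the misspelled line raises no error: assigning to a name that is not an existing attribute of the DataFrame simply stores a new Python attribute on the object and leaves the column labels untouched. The three-row frame below is made up for illustration, and the warning filter is there because, as far as I know, some pandas versions emit a UserWarning for this kind of assignment.

```python
import warnings

import pandas as pd

# Small illustrative frame (not the data from the question).
lag = pd.DataFrame({'T': [50, 48, 47]}).shift(1)

with warnings.catch_warnings():
    # Some pandas versions warn when a list is assigned to a new attribute.
    warnings.simplefilter('ignore')
    lag.colums = ['T(t-1)']       # typo: stored as a plain attribute

before = list(lag.columns)        # column labels are unchanged
print(before)

lag.columns = ['T(t-1)']          # correct spelling: renames the column
after = list(lag.columns)
print(after)
```

The first print shows the old label 'T', the second shows 'T(t-1)'. If the warning filter is removed and a warning appears on the misspelled line, that is a useful hint that the attribute name is wrong.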

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact yoyou2525@163.com.

 