How to apply a function to a list of dataframes in Pandas

I have a list of dataframes with the same structure:

list_df = [df1,df2]

Each dataframe has this structure:

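import pandas as pd

# Build each frame from a dict of columns; the dates are strings, since a bare 2020-02 is not a valid literal
df1 = pd.DataFrame({'Date': ['2020-02', '2020-01'], 'platform': ['PC', 'PC'], 'Day 1': [0.6, 1.4], 'Day 7': [0.5, 1.3]})

df2 = pd.DataFrame({'Date': ['2020-02', '2020-01'], 'platform': ['Mobile', 'Mobile'], 'Day 1': [0.6, 1.2], 'Day 7': [0.5, 1.8]})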

I want to apply a custom function that contains several conditionals. When I run the conditionals on their own, outside a function, they are applied to every dataframe in the list and work perfectly. Once I wrap them in a function, however, only the first dataframe in the list is processed.

These are the two loops I have tried:

for i in range(len(list_df)):
    r2_tester(df_r2, eq_name, list_df[i], list_days)

new_df_list = []
for df in list_df:
    new_df_list.append((r2_tester(df_r2,eq_name,list_df,list_days)))

r2_tester is the function I was referring to. It runs several conditionals and performs one curve fit or another depending on a value in another table.

However, both loops have the same issue: the function is only applied to the first dataframe. What am I doing wrong in the loop? Since the conditionals work perfectly outside the function, I think the issue must be there.

The code with the r2_tester function:

def r2_tester(df_r2, eq_name, list_df, list_days):
    if eq_name == df_r2.columns[1]:
        for z in range(len(list_df)):
            x = list_df[z]['days']
            y = list_df[z]['value_a']
            pars, cov = curve_fit(f=base_f, xdata=x, ydata=y, p0=[0.5, 0.5, 0.5,0.5], bounds=(-np.inf, np.inf))
            b = pars[0]
            k = pars[1]
            s = pars[2]
            l = pars[3]
            for x in range(len(list_days)):
                df_row1 = {'days':list_days[x], 'platform':list_df[z].iloc[0].values[0], 'date_month':list_df[z].iloc[0].values[1],}
                list_df[z] = list_df[z].append(df_row1, ignore_index = True)
                day = base_f(list_days[x], b, k, s, l)
                list_df[z].iloc[-1,3] = day
            return list_df

    elif eq_name == df_r2.columns[2]:
        for z in range(len(list_df)):
            x = list_df[z]['days']
            y = list_df[z]['value_a']
            pars, cov = curve_fit(f=linear_f, xdata=x, ydata=y, p0=[0.5, 0.5, 0.5,0.5], bounds=(-np.inf, np.inf))
            b = pars[0]
            k = pars[1]
            s = pars[2]
            l = pars[3]
            for x in range(len(list_days)):
                df_row1 = {'days':list_days[x], 'platform':list_df[z].iloc[0].values[0], 'date_month':list_df[z].iloc[0].values[1],}
                list_df[z] = list_df[z].append(df_row1, ignore_index = True)
                day = linear_f(list_days[x], b, k, s, l)
                list_df[z].iloc[-1,3] = day
            return list_df

    elif eq_name == df_r2.columns[3]:
        for z in range(len(list_df)):
            x = list_df[z]['days']
            y = list_df[z]['value_a']
            pars, cov = curve_fit(f=exp_f, xdata=x, ydata=y, p0=[0.5, 0.5, 0.5,0.5], bounds=(-np.inf, np.inf))
            b = pars[0]
            k = pars[1]
            s = pars[2]
            l = pars[3]
            for x in range(len(list_days)):
                df_row1 = {'days':list_days[x], 'platform':list_df[z].iloc[0].values[0], 'date_month':list_df[z].iloc[0].values[1],}
                list_df[z] = list_df[z].append(df_row1, ignore_index = True)
                day = exp_f(list_days[x], b, k, s, l)
                list_df[z].iloc[-1,3] = day
            return list_df

    elif eq_name == df_r2.columns[4]:
        for z in range(len(list_df)):
            x = list_df[z]['days']
            y = list_df[z]['value_a']
            pars, cov = curve_fit(f=exp_l_f, xdata=x, ydata=y, p0=[0.5, 0.5, 0.5,0.5], bounds=(-np.inf, np.inf))
            b = pars[0]
            k = pars[1]
            s = pars[2]
            l = pars[3]
            for x in range(len(list_days)):
                df_row1 = {'days':list_days[x], 'platform':list_df[z].iloc[0].values[0], 'date_month':list_df[z].iloc[0].values[1],}
                list_df[z] = list_df[z].append(df_row1, ignore_index = True)
                day = exp_l_f(list_days[x], b, k, s, l)
                list_df[z].iloc[-1,3] = day
            return list_df

I cannot concatenate the dataframes, because the function is written to work on a list of dataframes.

Thanks!

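If I understand correctly, pass each DataFrame (df), not the whole list (list_df), to the function inside the loop: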

new_df_list = []
for df in list_df:
    new_df_list.append((r2_tester(df_r2,eq_name,df,list_days)))

Or:

new_df_list = [r2_tester(df_r2,eq_name,df,list_days) for df in list_df]
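As a self-contained illustration of the per-DataFrame pattern, here is a minimal sketch. `add_total` is a hypothetical stand-in for `r2_tester` (it is not the asker's function) and assumes a function that takes one DataFrame and returns a new one; the sample values are taken from the question.

```python
import pandas as pd

def add_total(df, label):
    """Hypothetical stand-in for r2_tester: takes ONE DataFrame, returns a new one."""
    out = df.copy()                       # leave the caller's DataFrame untouched
    out["total"] = out["Day 1"] + out["Day 7"]
    out["label"] = label
    return out

df1 = pd.DataFrame({"Date": ["2020-02", "2020-01"], "platform": ["PC", "PC"],
                    "Day 1": [0.6, 1.4], "Day 7": [0.5, 1.3]})
df2 = pd.DataFrame({"Date": ["2020-02", "2020-01"], "platform": ["Mobile", "Mobile"],
                    "Day 1": [0.6, 1.2], "Day 7": [0.5, 1.8]})
list_df = [df1, df2]

# The loop variable df, not list_df, is what gets passed to the function
new_df_list = [add_total(df, "demo") for df in list_df]
```

Each element of `new_df_list` is the processed copy of the corresponding input, and the originals in `list_df` are left unchanged.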

EDIT: If the function already works on a list of DataFrames, call it once with the whole list:

new_df_list = r2_tester(df_r2, eq_name, list_df, list_days)
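One more thing worth checking in the posted function: every `return list_df` is indented inside the `for z in range(len(list_df))` loop, so the function exits at the end of the first iteration, which would explain why only the first DataFrame changes. Below is a minimal sketch of the list-based version with the `return` moved after the loop. `extend_all` and `predict` are hypothetical names; `predict` stands in for the fitted curve (`base_f`, `linear_f`, ... with their fitted parameters), the platform and date columns are omitted for brevity, and `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

def extend_all(list_df, list_days, predict):
    """Append one predicted row per day to EVERY DataFrame in list_df.

    `predict` is a hypothetical stand-in for the fitted curve function.
    """
    result = []
    for df in list_df:
        new_rows = pd.DataFrame({
            "days": list_days,
            "value_a": [predict(d) for d in list_days],
        })
        # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
        result.append(pd.concat([df, new_rows], ignore_index=True))
    return result  # after the loop, so every DataFrame has been processed

df_a = pd.DataFrame({"days": [1, 7], "value_a": [0.6, 0.5]})
df_b = pd.DataFrame({"days": [1, 7], "value_a": [0.6, 1.8]})

extended = extend_all([df_a, df_b], [14, 30], predict=lambda d: 1.0 / d)
```

With the `return` at loop level, both `df_a` and `df_b` gain the two extra rows; with the `return` indented one level deeper, only the first would.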
