
Creating a function in Python and applying it to a DataFrame

I need some help creating and calling a function.

Current code:

Table1 = Table1.assign(Field3 = np.where(Table1.Field1.astype(str).str[0:5].isin(['CEQTY','LPCEQ']),
                                         Table1.Field2, 0)) 

Potential function?

def func(a,b):
   Table1 = Table1.assign(a = np.where(Table1.Field1.astype(str).str[0:5].isin([b]),
                                       Table1.Field2, 0)) 

Calling the function?

Table1.apply(func,a,b)?

I am repeating this procedure 100 times, and the only things that change are the new column name ('Field3') and the arguments to isin. I keep getting an error when creating the function, likely due to syntax.

Consider dropping DataFrame.apply, since you are not running an independent operation on each column or row but building new columns with conditional logic based on other columns, Field1 and Field2. Also, DataFrame.assign takes the new column name as an unquoted keyword, so it will not work with a name passed in as a string variable.
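To illustrate the point about assign and column names, here is a minimal sketch. The demo frame and the new_name variable are made up for illustration and are not part of the question's data. A keyword passed to assign is used literally as the column name, while unpacking a dict lets a string variable supply the name:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
demo = pd.DataFrame({'Field1': ['CEQTY1', 'ABCDE2', 'LPCEQ3'],
                     'Field2': [10.0, 20.0, 30.0]})

new_name = 'Field3'
mask = demo.Field1.astype(str).str[0:5].isin(['CEQTY', 'LPCEQ'])

# The keyword is taken literally: this creates a column called "new_name"
literal = demo.assign(new_name=np.where(mask, demo.Field2, 0))

# Unpacking a dict lets the string held in new_name become the column name
dynamic = demo.assign(**{new_name: np.where(mask, demo.Field2, 0)})

print(list(literal.columns))   # ['Field1', 'Field2', 'new_name']
print(list(dynamic.columns))   # ['Field1', 'Field2', 'Field3']
```

Bracket assignment, df[new_name] = ..., accepts a string name in the same way but modifies the DataFrame in place.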

Instead, write a plain Python function that assigns each column by its string name and then returns the df (note that it also modifies the df passed in). The example below uses random data, seeded for reproducibility, and conditionally produces Field3 through Field5:

import pandas as pd
import numpy as np

np.random.seed(55)    
Table1 = pd.DataFrame({'ID': [np.random.randint(15) for _ in range(50)],
                       'Field1': [np.random.choice(['CEQTY','LPCEQ','ABCDE','WVXYZ','12345'],1).item(0) 
                                  for _ in range(50)],
                       'Field2':  np.random.randn(50)*100
                       }, columns=['ID', 'Field1', 'Field2'])

def func(df):
    # ITERATE THROUGH LIST OF TUPLES (NEW COL AND LIST OF SEARCH ITEMS)
    for i in [('Field3',['CEQTY','LPCEQ']),
              ('Field4',['ABCDE','WVXYZ']),
              ('Field5',['12345'])]:

        # ASSIGN NEW COL, i[0], BY STRING BASED ON SEARCH LIST, i[1]
        df[i[0]] = np.where(df.Field1.astype(str).str[0:5].isin(i[1]), df.Field2, 0) 

    return df

output = func(Table1)    
print(output.head(10))
#    ID Field1      Field2      Field3      Field4     Field5
# 0  13  LPCEQ  105.640854  105.640854    0.000000   0.000000
# 1  10  12345  -13.049038    0.000000    0.000000 -13.049038
# 2   7  CEQTY  -85.079280  -85.079280    0.000000   0.000000
# 3   8  12345   12.047304    0.000000    0.000000  12.047304
# 4  13  12345  -29.095108    0.000000    0.000000 -29.095108
# 5  13  12345  -24.229704    0.000000    0.000000 -24.229704
# 6  13  LPCEQ  -97.472869  -97.472869    0.000000   0.000000
# 7   5  ABCDE -221.743951    0.000000 -221.743951   0.000000
# 8   7  LPCEQ   -0.155842   -0.155842    0.000000   0.000000
# 9   5  CEQTY    2.297829    2.297829    0.000000   0.000000
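If you would rather keep the two-argument shape from the question, a parameterized variant might look like the sketch below. The function name add_conditional_field, the rules mapping and the sample frame are hypothetical names chosen for illustration. Unlike the function above, this version works on a copy, so the input DataFrame is left untouched:

```python
import numpy as np
import pandas as pd

def add_conditional_field(df, new_col, prefixes):
    """Return a copy of df with new_col equal to Field2 where the first
    five characters of Field1 are in prefixes, and 0 elsewhere."""
    out = df.copy()
    matches = out['Field1'].astype(str).str[0:5].isin(prefixes)
    out[new_col] = np.where(matches, out['Field2'], 0)
    return out

# Hypothetical mapping of new column name -> prefixes to match
rules = {'Field3': ['CEQTY', 'LPCEQ'],
         'Field4': ['ABCDE', 'WVXYZ'],
         'Field5': ['12345']}

# Small made-up frame, for illustration only
sample = pd.DataFrame({'Field1': ['CEQTY', 'ABCDE', '12345', 'LPCEQ'],
                       'Field2': [1.5, -2.0, 3.25, 4.0]})

result = sample
for new_col, prefixes in rules.items():
    result = add_conditional_field(result, new_col, prefixes)

print(result)
```

With 100 such columns, only the rules mapping grows; the function and the loop stay the same.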

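The error most likely comes from how func is passed to apply.

DataFrame.apply expects the function object itself, and any extra arguments go in its args parameter:

Table1.apply(func, args=(a, b))

In Table1.apply(func, a, b), a and b are taken as apply's own axis and raw parameters rather than being forwarded to func. Writing Table1.apply(func(a, b)) does not help either, because it calls func immediately and hands its return value to apply. Note also that apply calls func once per column (or once per row with axis=1) and passes that column as the first argument, so func would need a signature like func(column, a, b).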
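As a minimal sketch of how DataFrame.apply forwards extra arguments, consider the example below. The function scale_and_shift and the numbers frame are made up for illustration. apply receives the function object and passes each column to it as the first argument, followed by the values given in args:

```python
import pandas as pd

def scale_and_shift(column, factor, offset):
    # column is the Series that apply passes in; factor and offset come from args
    return column * factor + offset

# Small made-up frame, for illustration only
numbers = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})

# Pass the function object itself; extra positional arguments go in args
shifted = numbers.apply(scale_and_shift, args=(2, 1))
print(shifted)
```

This only suits functions that operate on one column (or row) at a time, which is why it is a poor fit for logic that combines Field1 and Field2.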


 