
User defined function on pandas dataframe

Here is my code:

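import pandas as pd

dfnew = pd.DataFrame({'year': [2015, 2016],
                      'month': [10, 12],
                      'day': [25, 31]})
print(dfnew)

def calc(yy, n):
    if yy == 2016:
        return yy * 2 * n
    else:
        return yy

# This line fails: map() expects every argument after the function
# to be iterable, and the integer 2 is not
dfnew['nv'] = map(calc, dfnew['year'], 2)
print(dfnew['nv'])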

How can I get this code to run without errors? I want the function to be applied only to the 'year' column of the dataframe, for all rows, and the output stored in a new column named 'nv' of the same dataframe.

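You need apply for a custom function. It calls the function once per value of the column, and a lambda supplies the fixed second argument: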

dfnew['nv']= dfnew['year'].apply(lambda x: calc(x, 2))
print (dfnew)
   day  month  year    nv
0   25     10  2015  2015
1   31     12  2016  8064
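
If you would rather not wrap calc in a lambda, Series.apply also accepts extra positional arguments through its args parameter. The sketch below assumes the same dfnew and calc as in the question. It also shows one possible way to make the original map call work: map needs every argument after the function to be an iterable, so the fixed n is bound with functools.partial instead of being passed as a bare 2.

```python
from functools import partial

import pandas as pd

dfnew = pd.DataFrame({'year': [2015, 2016],
                      'month': [10, 12],
                      'day': [25, 31]})

def calc(yy, n):
    # Same function as in the question
    if yy == 2016:
        return yy * 2 * n
    return yy

# Extra positional arguments for calc go in the args tuple
nv_apply = dfnew['year'].apply(calc, args=(2,))

# map() works once n is bound, because only the 'year' column is iterated
nv_map = list(map(partial(calc, n=2), dfnew['year']))

dfnew['nv'] = nv_apply
print(dfnew)
```

Both nv_apply and nv_map hold the values 2015 and 8064; the apply version keeps the index of the original column, which is usually what you want when assigning back to the dataframe.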

A better option is to use mask to change values by condition. It works on the whole column at once instead of calling a Python function for each row:

dfnew['nv']= dfnew['year'].mask(dfnew['year'] == 2016, dfnew['year'] * 2 * 2)
print (dfnew)
   day  month  year    nv
0   25     10  2015  2015
1   31     12  2016  8064

Detail: the condition passed to mask is a boolean Series, and values are replaced where it is True:

print (dfnew['year'] == 2016)
0    False
1     True
Name: year, dtype: bool
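
The same boolean Series can also drive numpy.where, which some people find easier to read because both outcomes are spelled out. This is only a sketch of an alternative, assuming the dfnew from the question; the names is_2016 and n are introduced here for illustration.

```python
import numpy as np
import pandas as pd

dfnew = pd.DataFrame({'year': [2015, 2016],
                      'month': [10, 12],
                      'day': [25, 31]})

n = 2  # the multiplier passed to calc in the question
is_2016 = dfnew['year'] == 2016  # boolean Series: False, True

# np.where(condition, value_if_true, value_if_false), evaluated element-wise
dfnew['nv'] = np.where(is_2016, dfnew['year'] * 2 * n, dfnew['year'])
print(dfnew)
```

Unlike mask, np.where returns a plain NumPy array, so the result is aligned by position rather than by index. That makes no difference here because all inputs come from the same dataframe.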

Many thanks for your prompt reply. Your answer to my question has been very helpful.

In addition, I needed to pass values from multiple columns to the function, and this is how I did it:

def yearCalc(year,month,n):
    if year == 2016:
        print("year:{} month:{}".format(year, month))
        return year * month * n
    else: 
        return year

dfnew['nv'] = dfnew[['year', 'month']].apply(lambda x: yearCalc(x['year'], x['month'], 2), axis=1)
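
Row-wise apply with axis=1 works, but it calls the function once per row. For larger dataframes the same result can usually be computed on whole columns with mask. The following is a minimal sketch, assuming the dfnew from the question and n = 2; it leaves out the debugging print from yearCalc, and the name product is introduced here for illustration.

```python
import pandas as pd

dfnew = pd.DataFrame({'year': [2015, 2016],
                      'month': [10, 12],
                      'day': [25, 31]})

n = 2

# Compute year * month * n for every row, then keep it only where year is 2016
product = dfnew['year'] * dfnew['month'] * n
dfnew['nv'] = dfnew['year'].mask(dfnew['year'] == 2016, product)
print(dfnew)
```

With this data, nv is 2015 for the first row and 2016 * 12 * 2 = 48384 for the second.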

Many thanks.

