Pandas dataframe: creating a new column that is a custom function using 2 other columns

Question

Consider the following data set stored in a pandas DataFrame dfX:

I have a function that is:

def someThingSpecial(x, y):
    # z = do something special with x, y
    return z

I now want to create a new column in dfX that holds the computed z value.

Looking at other SO examples, I've tried several variants including:

dfX['C'] = dfX.apply(lambda x: someThingSpecial(x=x['A'], y=x['B']), axis=1)

This raises errors. What is the right way to do this?

Answer 1

This works for me on pandas v0.21. Take a look:

df

   A  B
0  1  2
1  4  6
2  7  9

def someThingSpecial(x, y):
    return x + y

df.apply(lambda x: someThingSpecial(x.A, x.B), axis=1)

0     3
1    10
2    16
dtype: int64

You might want to try upgrading your pandas version to the latest stable release (0.21 as of now).
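
Before upgrading, it can help to confirm which version is actually installed. A minimal check, assuming pandas is importable in the environment where the error occurs (the variable name installed_version is only for illustration):

```python
import pandas as pd

# pd.__version__ holds the installed pandas version as a string.
installed_version = pd.__version__
print(installed_version)
```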

Here's another option. You can vectorise your function.

import numpy as np

v = np.vectorize(someThingSpecial)

v now accepts arrays but still calls the function on each pair of elements individually. Note that this only hides the loop, as apply does; it is a convenience rather than true vectorization, but it is much cleaner to write. You can now compute C like so:

df['C'] = v(df.A, df.B)
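
Putting both approaches together, here is a self-contained sketch. It assumes the same three-row frame as above and uses x + y as a stand-in body, since the question does not say what the real function does; the column names C_apply and C_vectorize are made up for the comparison.

```python
import numpy as np
import pandas as pd


def someThingSpecial(x, y):
    # Stand-in body (assumption): the real function is not given in the question.
    return x + y


df = pd.DataFrame({"A": [1, 4, 7], "B": [2, 6, 9]})

# Row-wise apply: with axis=1 the lambda receives one row (a Series) at a time.
df["C_apply"] = df.apply(lambda row: someThingSpecial(row["A"], row["B"]), axis=1)

# np.vectorize: wraps the scalar function so it can be called on whole columns.
v = np.vectorize(someThingSpecial)
df["C_vectorize"] = v(df["A"], df["B"])

print(df)
```

Both new columns should hold the same values. If the real function is built only from arithmetic that pandas already supports on whole columns, calling it directly as someThingSpecial(df.A, df.B) may also work and would avoid the per-element loop.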

Answer 2

If your function needs only one column's value, use this instead of coldspeed's answer:

dfX['A'].apply(your_func)

To store the result:

dfX['C'] = dfX['A'].apply(your_func)
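
A runnable sketch of the single-column case. The body of your_func and the sample data are assumptions for illustration only:

```python
import pandas as pd


def your_func(a):
    # Hypothetical one-argument function; replace with the real logic.
    return a * 2


dfX = pd.DataFrame({"A": [1, 4, 7], "B": [2, 6, 9]})

# Series.apply calls your_func once per value in column A.
dfX["C"] = dfX["A"].apply(your_func)

print(dfX)
```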

2 answers

solution1
2 ACCEPTED 2017-12-13 19:26:37

solution2
1 2019-01-13 05:38:28
