Populate a new column in a pandas DataFrame that takes input from other columns

I have a function that should take x, y and z as input and return r as output. For example, my_func(x, y, z) takes x = 10, y = 'apple' and z = 2 and returns the value for column r. Similarly, the function takes x = 20, y = 'orange' and z = 4 and populates the value in column r. Any suggestions on what an efficient way to write this would be?

Before :

   a  x       y       z      
   5  10   'apple'    2
   2  20   'orange'   4
   0  4    'apple'    2
   5  5    'pear'     6

After:

   a  x       y       z      r
   5  10   'apple'    2      x
   2  20   'orange'   4      x
   0  4    'apple'    2      x
   5  5    'pear'     6      x

It depends on how complex your function is. In general, you can use pandas.DataFrame.apply:

>>> def my_func(row):
...     return '{0} - {1} - {2}'.format(row['y'], row['a'], row['x'])
... 
>>> df['r'] = df.apply(my_func, axis=1)
>>> df
   a   x         y  z                  r
0  5  10   'apple'  2   'apple' - 5 - 10
1  2  20  'orange'  4  'orange' - 2 - 20
2  0   4   'apple'  2    'apple' - 0 - 4
3  5   5    'pear'  6     'pear' - 5 - 5
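
The question's my_func(x, y, z) signature can be kept by unpacking the columns inside a small wrapper. This is only a sketch: the body of my_func is hypothetical (the question does not say what it computes), and the strings are stored without the literal quote characters shown in the tables.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [5, 2, 0, 5],
    "x": [10, 20, 4, 5],
    "y": ["apple", "orange", "apple", "pear"],
    "z": [2, 4, 2, 6],
})

def my_func(x, y, z):
    # Hypothetical body, used only for illustration.
    return "{0}: {1}".format(y, x * z)

# The lambda receives one row at a time and passes the three
# column values to my_func as ordinary scalar arguments.
df["r"] = df.apply(lambda row: my_func(row["x"], row["y"], row["z"]), axis=1)
```

This keeps my_func independent of pandas, so it can be tested with plain values.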

axis=1 makes your function work 'for each row' instead of 'for each column'. As the documentation puts it:

Objects passed to functions are Series objects having index either the DataFrame's index (axis=0) or the columns (axis=1)
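
A quick way to see the difference is to return the name of whatever Series the function receives. The sketch below assumes the same sample data as the question, with the default integer index.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [5, 2, 0, 5],
    "x": [10, 20, 4, 5],
    "y": ["apple", "orange", "apple", "pear"],
    "z": [2, 4, 2, 6],
})

# axis=0 (the default): called once per column, s.name is the column label.
per_column = df.apply(lambda s: s.name, axis=0)

# axis=1: called once per row, s.name is the row label
# and s.index holds the column names.
per_row = df.apply(lambda s: s.name, axis=1)
row_fields = df.apply(lambda s: ",".join(s.index), axis=1)
```

Here per_column holds the column labels, per_row holds the row labels, and every entry of row_fields lists the column names available inside a row-wise function.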

But if it's a really simple function, like the one above, you can probably do it without a function at all, using vectorized operations.
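
As a rough sketch of the vectorized route, the same 'y - a - x' string can be built by operating on whole columns. It assumes the sample data from the question, with the strings stored without literal quote characters.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [5, 2, 0, 5],
    "x": [10, 20, 4, 5],
    "y": ["apple", "orange", "apple", "pear"],
    "z": [2, 4, 2, 6],
})

# Convert the numeric columns to strings, then concatenate column-wise.
# No Python-level function is called per row.
df["r"] = df["y"] + " - " + df["a"].astype(str) + " - " + df["x"].astype(str)

# Row-wise version, kept only to compare the two results.
via_apply = df.apply(
    lambda row: "{0} - {1} - {2}".format(row["y"], row["a"], row["x"]),
    axis=1,
)
```

Both approaches produce the same values here; the column-wise form is usually faster on large frames because it avoids one function call per row.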
