
pandas: add a column based on another one

I want to add a column based on the column 'mths_since_recent_revol_delinq': if mths_since_recent_revol_delinq is null, the new column should equal 1, otherwise 0. The resulting dataframe should look like:

+----+--------------------------------+------------------------------------+
|    | mths_since_recent_revol_delinq | mths_since_recent_revol_delinq_add |
+----+--------------------------------+------------------------------------+
|  0 | NaN                            |                                  1 |
|  1 | 33                             |                                  0 |
|  2 | NaN                            |                                  1 |
|  3 | NaN                            |                                  1 |
|  4 | 57                             |                                  0 |
|  5 | 21                             |                                  0 |
|  6 | 60                             |                                  0 |
|  7 | NaN                            |                                  1 |
|  8 | 2                              |                                  0 |
|  9 | 24                             |                                  0 |
| 10 | NaN                            |                                  1 |
+----+--------------------------------+------------------------------------+

I tried:

def label_race (df):
   if df['mths_since_recent_revol_delinq'].isnull():
      return 1
   else:
      return 0

Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

and got this traceback:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4150                 if reduce is None:
   4151                     reduce = True
-> 4152                 return self._apply_standard(f, axis, reduce=reduce)
   4153             else:
   4154                 return self._apply_broadcast(f, axis)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4246             try:
   4247                 for i, v in enumerate(series_gen):
-> 4248                     results[i] = func(v)
   4249                     keys.append(v.name)
   4250             except Exception as e:

in (df)
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

in label_race(df)
      1 def label_race (df):
----> 2    if df['mths_since_recent_revol_delinq'].isnull():
      3       return 1
      4    else:
      5       return 0

AttributeError: ("'float' object has no attribute 'isnull'", 'occurred at index 0')

Any ideas on how to fix it? Thanks.

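With axis=1, apply passes one row at a time to the function, so df['mths_since_recent_revol_delinq'] is a single float there, and a float has no isnull method. Instead, call isnull on the whole column and then cast the result to int with astype: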

import numpy as np
import pandas as pd

Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan, np.nan, 57.0, 21.0, 60.0, np.nan, 2.0, 24.0, np.nan]})

results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"
# isnull gives a boolean column; astype(int) maps True/False to 1/0
Loan_a1[results_key] = Loan_a1[input_key].isnull().astype(int)
print(Loan_a1)
    mths_since_recent_revol_delinq  mths_since_recent_revol_delinq_add
0                              NaN                                   1
1                             33.0                                   0
2                              NaN                                   1
3                              NaN                                   1
4                             57.0                                   0
5                             21.0                                   0
6                             60.0                                   0
7                              NaN                                   1
8                              2.0                                   0
9                             24.0                                   0
10                             NaN                                   1
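
If you prefer to keep the row-wise apply from the question, a minimal sketch of a fix is to test the scalar with pd.isnull instead of calling .isnull() on it. The function name label_missing and the shortened sample data are only illustrative; the vectorized version above is usually the better choice for large frames.

```python
import numpy as np
import pandas as pd

# Shortened sample data (illustrative only)
Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan, 57.0]})

def label_missing(row):
    # With axis=1, row is a Series holding one row, so row[...] is a scalar.
    # pd.isnull works on scalars, unlike the .isnull() method.
    return 1 if pd.isnull(row['mths_since_recent_revol_delinq']) else 0

row_wise = Loan_a1.apply(label_missing, axis=1)

# Same result computed on the whole column at once
vectorized = Loan_a1['mths_since_recent_revol_delinq'].isnull().astype(int)
```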

Following @Jezrael's construction and definitions, and using the neat trick that True * 1 = 1 and False * 1 = 0, you can obtain the same result with assign:

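# Unpack a dict so the column is named by the value of results_key,
# not by the literal string "results_key"
Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})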

Note that assign returns a new data frame and leaves Loan_a1 unchanged, so bind the result to a name if you want to keep it.
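
One detail of assign is worth checking for yourself: a keyword argument is taken as the literal column name, while unpacking a dict lets a variable supply the name. The sketch below uses shortened sample data and the hypothetical names literal and named to show the difference, and that the original frame is not modified.

```python
import numpy as np
import pandas as pd

# Shortened sample data (illustrative only)
Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan]})
results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"

# Keyword form: the new column is literally called "results_key"
literal = Loan_a1.assign(results_key=lambda x: x[input_key].isnull() * 1)

# Dict unpacking: the new column is named by the value of results_key
named = Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})

print(list(literal.columns))
print(list(named.columns))
print(list(Loan_a1.columns))  # the original frame still has only the input column
```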
