I want to add a column based on the column 'mths_since_recent_revol_delinq': if mths_since_recent_revol_delinq is null, the new column should equal 1, otherwise 0. The resulting dataframe should look like this:
+----+--------------------------------+------------------------------------+
| | mths_since_recent_revol_delinq | mths_since_recent_revol_delinq_add |
+----+--------------------------------+------------------------------------+
| 0 | NaN | 1 |
| 1 | 33 | 0 |
| 2 | NaN | 1 |
| 3 | NaN | 1 |
| 4 | 57 | 0 |
| 5 | 21 | 0 |
| 6 | 60 | 0 |
| 7 | NaN | 1 |
| 8 | 2 | 0 |
| 9 | 24 | 0 |
| 10 | NaN | 1 |
+----+--------------------------------+------------------------------------+
def label_race (df):
    if df['mths_since_recent_revol_delinq'].isnull():
        return 1
    else:
        return 0
Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)
This raises the following traceback:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4150 if reduce is None:
   4151 reduce = True
-> 4152 return self._apply_standard(f, axis, reduce=reduce)
   4153 else:
   4154 return self._apply_broadcast(f, axis)

D:\Program Files (x86)\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4246 try:
   4247 for i, v in enumerate(series_gen):
-> 4248 results[i] = func(v)
   4249 keys.append(v.name)
   4250 except Exception as e:

in (df)
----> 1 Loan_a1['mths_since_recent_revol_delinq_add'] = Loan_a1.apply (lambda df: label_race(df),axis=1)

in label_race(df)
      1 def label_race (df):
----> 2 if df['mths_since_recent_revol_delinq'].isnull():
      3 return 1
      4 else:
      5 return 0

AttributeError: ("'float' object has no attribute 'isnull'", 'occurred at index 0')
Any ideas on how to fix it? Thanks.
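The error occurs because with axis=1 each row is passed to the function as a Series, so df['mths_since_recent_revol_delinq'] is a single float, which has no isnull method. Instead, use isnull on the whole column and then cast the result to int with astype: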
import numpy as np
import pandas as pd

Loan_a1 = pd.DataFrame({'mths_since_recent_revol_delinq': [np.nan, 33.0, np.nan, np.nan, 57.0, 21.0, 60.0, np.nan, 2.0, 24.0, np.nan]})
results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"
# isnull() gives a boolean column; astype(int) maps True/False to 1/0
Loan_a1[results_key] = Loan_a1[input_key].isnull().astype(int)
print(Loan_a1)
mths_since_recent_revol_delinq mths_since_recent_revol_delinq_add
0 NaN 1
1 33.0 0
2 NaN 1
3 NaN 1
4 57.0 0
5 21.0 0
6 60.0 0
7 NaN 1
8 2.0 0
9 24.0 0
10 NaN 1
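If you would rather keep the row-wise apply from the question, a minimal sketch of a fix is to test the scalar with pd.isnull instead of calling a method on it. The function name label_missing and the shortened sample data are only illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's data (illustrative values only).
Loan_a1 = pd.DataFrame(
    {"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan, 57.0]}
)

def label_missing(row):
    # With axis=1 each `row` is a Series, so row[...] is a plain float.
    # A float has no .isnull() method, but pd.isnull() accepts scalars.
    return 1 if pd.isnull(row["mths_since_recent_revol_delinq"]) else 0

row_wise = Loan_a1.apply(label_missing, axis=1)

# The vectorized form from the answer above, for comparison.
vectorized = Loan_a1["mths_since_recent_revol_delinq"].isnull().astype(int)

print(row_wise.tolist())
print(vectorized.tolist())
```

Both give the same labels; the vectorized form is usually the better choice on large frames because it avoids calling a Python function once per row.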
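Following @Jezrael's construction and definitions, and using the neat trick that True * 1 = 1 and False * 1 = 0, you can obtain the same result with assign. Because the column name is stored in the variable results_key, it has to be passed through dict unpacking; writing results_key=... as a keyword would create a column literally named "results_key":
Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})
Note that assign returns a new data frame and leaves Loan_a1 unchanged, so bind the result to a name if you want to keep it.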
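To make the behaviour of assign concrete, here is a small sketch under the same definitions as above (the three-row sample data and the names literal and named are only illustrative). It compares passing the column name as a plain keyword with passing it through dict unpacking, and checks that the original frame is not modified.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the data used above (illustrative values only).
Loan_a1 = pd.DataFrame({"mths_since_recent_revol_delinq": [np.nan, 33.0, np.nan]})
results_key = "mths_since_recent_revol_delinq_add"
input_key = "mths_since_recent_revol_delinq"

# Keyword form: the new column is literally called "results_key".
literal = Loan_a1.assign(results_key=lambda x: x[input_key].isnull() * 1)

# Dict unpacking: the new column takes its name from the variable.
named = Loan_a1.assign(**{results_key: lambda x: x[input_key].isnull() * 1})

print(list(literal.columns))
print(list(named.columns))
# assign does not modify the frame it is called on.
print(list(Loan_a1.columns))
```

If the new column should live on Loan_a1 itself, rebind the name, for example Loan_a1 = Loan_a1.assign(...).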