How to replace non-integer values in a pandas DataFrame?

Question

I have a DataFrame consisting of two columns, Age and Salary:

Age   Salary
21    25000
22    30000
22    Fresher
23    2,50,000
24    25 LPA
35    400000
45    10,00,000

How can I handle the outliers in the Salary column and replace them with an integer?

Answer 1

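If you need to replace non-numeric values, use to_numeric with the parameter errors='coerce', which turns anything that cannot be parsed into NaN:

import pandas as pd

# Drop the thousands separators, coerce non-numeric entries to NaN, then fill with 0
df['new'] = (pd.to_numeric(df['Salary'].astype(str).str.replace(',', ''), errors='coerce')
               .fillna(0)
               .astype(int))
print(df)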
   Age     Salary      new
0   21      25000    25000
1   22      30000    30000
2   22    Fresher        0
3   23   2,50,000   250000
4   24     25 LPA        0
5   35     400000   400000
6   45  10,00,000  1000000
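
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's sample data, assuming Salary is stored as strings (as it would be after reading a mixed column from a file). The helper name salary_to_int and its fill_value parameter are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Sample data from the question; Salary is assumed to be stored as strings.
df = pd.DataFrame({
    "Age": [21, 22, 22, 23, 24, 35, 45],
    "Salary": ["25000", "30000", "Fresher", "2,50,000",
               "25 LPA", "400000", "10,00,000"],
})


def salary_to_int(salary: pd.Series, fill_value: int = 0) -> pd.Series:
    """Hypothetical helper: parse salaries, replacing unparseable ones."""
    # Remove thousands separators so '2,50,000' becomes '250000'.
    cleaned = salary.astype(str).str.replace(",", "", regex=False)
    # Non-numeric strings ('Fresher', '25 LPA') become NaN, then fill_value.
    return pd.to_numeric(cleaned, errors="coerce").fillna(fill_value).astype(int)


df["new"] = salary_to_int(df["Salary"])
print(df)
```

Note that "25 LPA" is treated as unparseable here; converting it to an actual amount would need a separate rule that the question does not specify.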

Answer 2

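Use numpy.where to find the non-numeric values and replace them with '0'. This assumes every Salary entry is a string; note that entries containing commas, such as 2,50,000, also fail isdigit() and become '0':

import numpy as np
df['New'] = np.where(df['Salary'].str.isdigit(), df['Salary'], '0')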
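
As a rough, self-contained sketch of the isdigit check on the question's data (again assuming Salary holds strings), the snippet below compares the check as-is with a variant that strips commas first. Treating commas as thousands separators is an assumption, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

salary = pd.Series(["25000", "30000", "Fresher", "2,50,000",
                    "25 LPA", "400000", "10,00,000"])

# Keep entries made up only of digits; everything else becomes '0'.
# Comma-formatted numbers fail isdigit(), so they are replaced too.
plain = np.where(salary.str.isdigit(), salary, "0")

# Assumption: commas are thousands separators, so strip them before checking.
stripped = salary.str.replace(",", "", regex=False)
with_commas = np.where(stripped.str.isdigit(), stripped, "0")

print(list(plain))
print(list(with_commas))
```

Both results are still strings, so a final astype(int) would be needed to get real integers.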

Answer 3

I would not replace missing or inconsistent values with 0; it is better to replace them with None. But let's say you do want to replace string values (outliers or inconsistent values) with 0. Checking isinstance(x, str) does this without depending on how a particular Python version prints type(x). Note that it only works if the valid salaries are stored as numbers, because every string, including '25000', becomes 0:

df['Salary'] = df['Salary'].apply(lambda x: 0 if isinstance(x, str) else x)
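
If you would rather keep the bad entries as missing values than turn them into 0, one possible sketch is shown below. It assumes a pandas version that supports the nullable 'Int64' dtype and assumes Salary is stored as strings; the variable names are illustrative.

```python
import pandas as pd

salary = pd.Series(["25000", "30000", "Fresher", "2,50,000",
                    "25 LPA", "400000", "10,00,000"])

# Strip thousands separators, then coerce unparseable entries to NaN.
cleaned = salary.str.replace(",", "", regex=False)
as_float = pd.to_numeric(cleaned, errors="coerce")

# The nullable 'Int64' dtype keeps whole numbers while marking gaps as <NA>.
salary_int = as_float.astype("Int64")
print(salary_int)
```

Aggregations such as sum() or mean() skip the missing entries by default, so the placeholder values no longer distort the statistics the way a 0 would.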

3 answers

solution1
8 ACCEPTED 2017-03-21 14:36:51

solution2
1 2017-03-21 14:53:35

solution3
0 2019-03-05 21:58:24
