Replacing column values in a pandas DataFrame if they contain a specific substring

Question

I am new to data science in Python and have started working through practice problems. I am stuck on one where I cannot replace some column values.

I am working on a problem to predict old car prices based on a number of factors such as Power, seats, model, make, manufacturer and others. In the Power column, the fields have values like those shown in the snapshot.

Some fields have the value "null bhp". I am trying to replace these null values with NaN so that I can fill them with the mean in the next step, but I am unable to convert null to NaN.

Below is the code I am using:

data["Power"]= data["Power"].str.split("bhp",expand = True)
#This is to change bhp

and then I loop over the column like this:

for i in data.Power:
    if i=="null":
        data.Power = np.nan

It is not doing anything.

Answer 1

Instead of splitting and iterating, search for "null" with str.contains and assign NaN to the matching rows with loc in one step:

data.loc[data['Power'].str.contains('null', na=False), 'Power'] = np.nan
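As a rough end-to-end sketch of the step above, the snippet below applies the same loc assignment to a small made-up DataFrame and then carries on to the mean fill the question mentions. The sample values are hypothetical (the real ones are only visible in the asker's snapshot), and using pd.to_numeric to strip the unit and convert is one possible follow-up, not something the original answer prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's Power column.
data = pd.DataFrame({"Power": ["58.16 bhp", "null bhp", "126.2 bhp", np.nan]})

# Boolean mask of rows whose text contains "null".
# na=False makes already-missing values count as "no match" instead of NaN.
is_null = data["Power"].str.contains("null", na=False)

# Assign NaN only to the matching rows of the Power column.
data.loc[is_null, "Power"] = np.nan

# Strip the " bhp" suffix and convert the remaining text to floats.
# Missing values stay NaN through both steps.
cleaned = data["Power"].str.replace("bhp", "", regex=False).str.strip()
data["Power"] = pd.to_numeric(cleaned)

# Fill the gaps with the column mean (mean() skips NaN by default).
data["Power"] = data["Power"].fillna(data["Power"].mean())
```

For comparison, the loop in the question cannot work as written. If the raw value is "null bhp" with a space, splitting on "bhp" leaves "null " with a trailing space, so i == "null" is never true. Even if it were, data.Power = np.nan assigns to the whole column rather than to the current row.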

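You can use numpy.where to do the same thing, possibly faster. Pass na=False here as well so that rows that are already missing give False in the mask rather than NaN:

data['Power'] = np.where(data['Power'].str.contains('null', na=False), np.nan, data['Power'])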
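To check that the two approaches agree, here is a minimal sketch that runs both on the same made-up column and compares which rows end up missing. The sample values and the names mask, via_where and via_loc are purely illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, for illustration only.
data = pd.DataFrame({"Power": ["74 bhp", "null bhp", np.nan, "88.5 bhp"]})

# Rows containing "null"; rows that are already NaN count as False.
mask = data["Power"].str.contains("null", na=False)

# np.where returns a plain NumPy array: NaN where mask is True,
# the original value elsewhere. Wrap it in a Series to keep the index.
via_where = pd.Series(np.where(mask, np.nan, data["Power"]), index=data.index)

# The loc-based version, applied to a copy so the two can be compared.
via_loc = data["Power"].copy()
via_loc.loc[mask] = np.nan

# Both versions should mark exactly the same rows as missing.
same_missing = via_where.isna().equals(via_loc.isna())
```

Whether np.where is actually faster depends on the size of the data, so it is worth timing both on the real DataFrame before relying on that.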

solution1
1 ACCEPTED 2019-06-13 18:29:29
