replace a string in entire dataframe from excel with value

Question

I have data like this, read from Excel:

import numpy as np
import pandas as pd
dminerals = pd.read_excel(datafile)
print(dminerals.head(5))

Then I replace the 'Tr' and NaN values using a for loop with this script:

for key, value in dminerals.items():
    dminerals[key] = dminerals[key].replace(to_replace='Tr', value=int(1))
    dminerals[key] = dminerals[key].replace(to_replace=np.nan, value=int(0))

Then I print it again, and the replacement seems to work. But when I print the column's data type, it shows the object dtype.

print(dminerals.head(5))
print(dminerals['C'].dtypes)

I tried using .astype to convert one of the columns, ['C'], to integer, but the result is a ValueError:

dminerals['C'].astype(int)
ValueError: invalid literal for int() with base 10: 'tr'

I thought I had already changed 'Tr' in the dataframe into an integer value. Is there anything I missed in the process above? Please help, thank you in advance!

Answer 1

You are replacing 'Tr' with 1, but the data also contains a lowercase 'tr' that is not being replaced (this is what your ValueError is saying). Remember that Python string comparison is case-sensitive. Also, looping over the columns is unnecessary and slower than calling replace on the whole DataFrame; you might want to try the following line of code instead:

dminerals = dminerals.replace({'Tr': 1, 'tr': 1}).fillna(0)

I'm using fillna(), which is a better way to fill the null values with a specified value (0 in this case) than using replace.
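If the spreadsheet might contain other spellings such as 'TR' or ' tr ', a case-insensitive cleanup is one way to avoid listing every variant. The following is only a minimal sketch: the sample DataFrame stands in for the Excel data (its values are made up for illustration), and clean_trace is a hypothetical helper name, not part of pandas. It also finishes with astype(int), since replace and fillna alone do not guarantee an integer dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the data read from Excel (values are illustrative).
dminerals = pd.DataFrame({
    "C": [5, "Tr", np.nan, "tr", 12],
    "Q": ["TR", 3, 7, np.nan, " tr "],
})

def clean_trace(df, trace_value=1, missing_value=0):
    """Replace any spelling of 'tr' with trace_value, fill NaN, cast to int."""
    def to_number(cell):
        # Only strings are normalised; numbers and NaN pass through unchanged.
        if isinstance(cell, str) and cell.strip().lower() == "tr":
            return trace_value
        return cell

    # Apply the element-wise conversion column by column.
    cleaned = df.apply(lambda col: col.map(to_number))
    # astype(int) raises ValueError if some other unexpected string remains.
    return cleaned.fillna(missing_value).astype(int)

cleaned = clean_trace(dminerals)
print(cleaned)
print(cleaned.dtypes)
```

Note that astype(int) truncates any fractional values, so if the real columns hold decimals, astype(float) is probably the safer final cast.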

1 answer

solution1
1 ACCEPTED 2020-11-10 14:27:25
