
Replace a string in an entire DataFrame read from Excel with a value

I have this kind of data from Excel:

import numpy as np
import pandas as pd

dminerals = pd.read_excel(datafile)
print(dminerals.head(5))


Then I replace the 'Tr' and NaN values in a for loop with this script:

for key, value in dminerals.items():
    dminerals[key] = dminerals[key].replace(to_replace='Tr', value=1)
    dminerals[key] = dminerals[key].replace(to_replace=np.nan, value=0)

Then I print the dataframe again, and the replacement seems to have worked. But when I print the column's dtype, it shows the object data type.

print(dminerals.head(5))
print(dminerals['C'].dtypes)


I tried using .astype to convert one of the columns, ['C'], to integer, but the result is a ValueError:

dminerals['C'].astype(int)
ValueError: invalid literal for int() with base 10: 'tr'

I thought I had already changed 'Tr' in the dataframe into an integer value. Is there anything I missed in the process above? Please help, thank you in advance!

You are replacing Tr with 1, but the column also contains a lowercase tr that is not being replaced (this is what your ValueError is telling you). Remember that Python string comparison is case-sensitive. Also, looping over the columns is unnecessary and slower than a single DataFrame-wide call; you might want to try the following line of code instead:

dminerals = dminerals.replace({'Tr': 1, 'tr': 1}).fillna(0)

I'm using fillna(), which is a more direct way than replace() to fill the null values with a specified value (0 in this case).
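To make the point concrete, here is a minimal sketch of the approach above. The small DataFrame is a hypothetical stand-in for the Excel sheet (the real column contents are not shown in the question), and the explicit .astype(int) at the end is an assumption on my part: depending on the pandas version, replace() and fillna() may leave the columns as object, so converting explicitly is the safer route. The pd.to_numeric(..., errors="coerce") step is just one way to see which strings are still blocking the conversion.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data: 'Tr' / 'tr' mark trace amounts.
dminerals = pd.DataFrame({
    "C": [3, "Tr", "tr", np.nan, 5],
    "D": ["Tr", 2, np.nan, 4, "tr"],
})

# Diagnostic: find entries in column C that are present but not numeric.
as_numbers = pd.to_numeric(dminerals["C"], errors="coerce")
not_numeric = as_numbers.isna() & dminerals["C"].notna()
leftovers = dminerals["C"][not_numeric].unique()
print(leftovers)  # lists every distinct string that would break astype(int)

# Replace both spellings, fill missing values, then convert explicitly.
cleaned = dminerals.replace({"Tr": 1, "tr": 1}).fillna(0).astype(int)
print(cleaned.dtypes)
```

If the sheet also contains other spellings (for example 'TR'), they would show up in leftovers and can be added to the replacement dictionary in the same way.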
