
How to replace non-ASCII characters

I have a string in the form email [ à ] example.com

I want to turn it into email@example.com.

I tried:

print email.replace(u"\xa0", "@")
print email.replace(" [ à ] ", "@")
print email.replace(" à ", "@")
email = email.replace(u" à ", "@")

but I always get this error:

'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

It works if you use the unicode type for both the string and the replacement:

>>> email = u"email [ à ] domain.fr"
>>> email.replace(u" [ à ] ", u"@")
u'email@domain.fr'

To get a unicode object out of a str, use .decode():

email.decode("utf-8")  # or provide another encoding
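
Putting the two steps together, here is a minimal sketch of the decode-then-replace approach. It assumes the input is UTF-8 encoded; the variable names (raw, text, fixed, fixed_bytes) are illustrative and not from the original post. It uses only escape sequences, so it should behave the same on Python 2.7 and Python 3.3+.

```python
# Assumed input: the UTF-8 bytes of "email [ a-grave ] domain.fr".
raw = b"email [ \xc3\xa0 ] domain.fr"

# Decode to text first, so the replacement compares characters, not bytes.
text = raw.decode("utf-8")

# u"\u00e0" is a-grave (U+00E0); both arguments are unicode text.
fixed = text.replace(u" [ \u00e0 ] ", u"@")

# Encode again only if the rest of the program needs bytes.
fixed_bytes = fixed.encode("utf-8")

print(fixed)
```

If the data is in a different encoding (Latin-1, for example), pass that name to .decode() instead.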

Alternatively, if you don't want to use unicode strings, replace the UTF-8 byte sequence directly:

In [8]: email = 'email [ à ] domain.fr'

In [9]: email.replace(' [ \xc3\xa0 ] ', '@')
Out[9]: 'email@domain.fr'
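
The bytes \xc3\xa0 are the UTF-8 encoding of à, which is also why the error message mentions byte 0xc3. The sketch below checks that, again assuming UTF-8 input; the names encoded, raw and result are illustrative only. It also shows why replacing u"\xa0" cannot match: U+00A0 is a no-break space, while à is U+00E0.

```python
# a-grave is the code point U+00E0, not U+00A0 (a no-break space).
assert ord(u"\u00e0") == 0xE0

# In UTF-8 it is encoded as two bytes: 0xC3 0xA0.
encoded = u"\u00e0".encode("utf-8")
assert encoded == b"\xc3\xa0"

# Byte-level replacement: the string, the pattern and the replacement
# are all byte strings, so no implicit ASCII decoding takes place.
raw = b"email [ \xc3\xa0 ] domain.fr"
result = raw.replace(b" [ " + encoded + b" ] ", b"@")

print(result.decode("ascii"))
```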
