Say I have a string like this:
example = u"这是一段很蛋疼的中文"
I want to replace 蛋 with egg. How can I do this?
It seems example.replace() is useless. I also tried a regex, but re.match(u"蛋", "") returns None.
I searched a lot, and it seems I should use a method like .decode, but that still doesn't work. Even example.replace(u"\蛋", "egg") is useless.
So is there a way to process Chinese characters?
In Python 3 you should get the following output:
>>> import re
>>> example = u"这是一段很蛋疼的中文"
>>> re.search(u'蛋',example)
<_sre.SRE_Match object; span=(5, 6), match='蛋'>
>>> example.replace('蛋','egg')
'这是一段很egg疼的中文'
>>> re.sub('蛋','egg',example)
'这是一段很egg疼的中文'
>>> example.replace(u"\u86CB", "egg")
'这是一段很egg疼的中文'
>>> re.match('.*蛋',example)
<_sre.SRE_Match object; span=(0, 6), match='这是一段很蛋'>
re.match only tries to match at the beginning of the string, so it returns None in your case, because 蛋 is not the first character. Use re.search to find a match anywhere in the string.
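To make the difference concrete, here is a minimal sketch, assuming Python 3. The variable names anchored and found are only illustrative.

```python
import re

example = "这是一段很蛋疼的中文"

# re.match only tries the pattern at position 0, so a character that
# appears later in the string does not match.
anchored = re.match("蛋", example)

# re.search scans the whole string and returns the first match anywhere.
found = re.search("蛋", example)

print(anchored)      # None
print(found.span())  # (5, 6)
```

If you only need to substitute text, re.sub or str.replace is enough and you do not need the match object at all.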
In Python 2 you can do something like the following.
Edit: Saving the source file in the correct encoding, adding a coding declaration, and using unicode literals will solve the issue.
#!/usr/local/bin/python
# -*- coding: utf-8 -*-
example = u"这是一段很蛋疼的中文"
print example.replace(u"这", u"egg")
# Within Python3
# print(example.replace("这", 'egg'))
Output:
egg是一段很蛋疼的中文
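Since the question mentions .decode, here is a rough sketch of where it fits, assuming Python 3 and assuming the text arrives as UTF-8 encoded bytes. The name raw is hypothetical and just stands in for data read from a file or socket.

```python
# Hypothetical input: the same sentence as UTF-8 encoded bytes,
# for example as read from a file opened in binary mode.
raw = "这是一段很蛋疼的中文".encode("utf-8")

# Decode the bytes to text first, then replace on the resulting str.
text = raw.decode("utf-8")

# str.replace returns a new string; text itself is left unchanged,
# so the result has to be assigned or printed.
result = text.replace("蛋", "egg")

print(result)  # 这是一段很egg疼的中文
```

If the string is already a unicode string, as in the examples above, no decoding step is needed.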