
Python Unicode string matching

I have a list of words converted into a list of Unicode strings, but I am not able to match the last character of a word against a list of characters. For example:

The first list contains the words from which a trailing character needs to be removed. For example, the word उपलब्धियां, when converted to Unicode, is u'\उ\प\ल\ब\्\ध\ि\य\ा\ं'

The second list contains the characters that need to be removed if they are found at the end of a word: r3_bad= [u"0900", u"0901", u"0902",u"0903"]; in this case u0902 is at the end of the word and is in the bad list, so it should be removed.

I tried:

if re.search(r'u$[0-3]',word[-1]) :

It does not return true, and I don't know why.

Please help. Thanks in advance.

Why use a regex? I think a plain string comparison is all you need:

s = u'\u0909\u092a\u0932\u092c\u094d\u0927\u093f\u092f\u093e\u0902'
r3_bad = [u'\u0900', u'\u0901', u'\u0902', u'\u0903']

print(s)  # output: उपलब्धियां
# Drop the last character if it is one of the unwanted code points.
if s[-1] in r3_bad:
    print(s[:-1])  # output: उपलब्धिया
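As for why the original attempt never matches: the pattern u$[0-3] asks for a literal "u", then the end of the string, then a digit, and nothing can follow the end of the string. Besides that, word[-1] is the single character U+0902, not the text "u0902". Here is a minimal sketch of the same idea applied to a whole list of words, assuming Python 3 (where every str is Unicode). The names R3_BAD, strip_bad_suffix and strip_bad_suffix_re are made up for illustration, and the second word in the sample list is my own addition. A regex version is included only for comparison.

```python
import re

# Code points to drop when they end a word (same values as r3_bad above).
R3_BAD = frozenset("\u0900\u0901\u0902\u0903")


def strip_bad_suffix(word, bad=R3_BAD):
    """Return word without its last character if that character is in bad."""
    if word and word[-1] in bad:
        return word[:-1]
    return word


# Regex alternative: a character class anchored at the end of the string.
BAD_SUFFIX_RE = re.compile("[\u0900-\u0903]$")


def strip_bad_suffix_re(word):
    """Same result as strip_bad_suffix, using re.sub instead of slicing."""
    return BAD_SUFFIX_RE.sub("", word)


# Sample data: the word from the question plus one word with no bad ending.
words = [
    "\u0909\u092a\u0932\u092c\u094d\u0927\u093f\u092f\u093e\u0902",
    "\u0918\u0930",
]
cleaned = [strip_bad_suffix(w) for w in words]
print(cleaned)
```

Both helpers remove at most one trailing character. If a word could end in several of these signs in a row, word.rstrip("\u0900\u0901\u0902\u0903") would remove all of them.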
