Python match unicode string as unicode

I am trying to match a Unicode string in such a way that Unicode escape sequences do not match the equivalent string literal.

def validate(username):
    if "admin" in username:
        return False
    else:
        return True

validate(username)

If I pass username="\u0061\u0064\u006d\u0069\u006e", it returns False, because the Unicode escapes are converted first and then matched, and "\u0061\u0064\u006d\u0069\u006e" is Unicode for admin. Is there a way to match before the conversion? The input is not converted; it starts off as Unicode. I have tried using a regex, but have not succeeded.

In Python 3, there is no longer a distinction between "unicode" and "string": every str is a Unicode string. So "\u0061\u0064\u006d\u0069\u006e" is just a string of the characters a, d, m, i, n, written with Unicode code point escape sequences. There is no "conversion" going on here; it is exactly equivalent to entering "admin".
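
The equivalence can be checked directly. This is a minimal sketch, assuming the escapes in question are \uXXXX code point escapes for the letters of admin; the variable names are only for illustration.

```python
# The escapes are resolved when Python parses the source code,
# so both literals produce the same five-character string.
escaped = "\u0061\u0064\u006d\u0069\u006e"
plain = "admin"

print(escaped == plain)      # True
print(len(escaped))          # 5
print("admin" in escaped)    # True
```

By the time validate() receives the value, nothing distinguishes the two spellings any more.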

What are you trying to achieve?

Remember that string escape sequences, like \u0061, are translated by Python when it parses the source code; they never actually end up as part of the string. If, instead, a user enters the literal characters \u0061\u0064\u006d\u0069\u006e into a text form, for example, what you get, in Python notation, is a string equivalent to "\\u0061\\u0064\\u006d\\u0069\\u006e" (notice the escaped backslashes, which indicate literal backslashes rather than escape sequences).
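
To see the difference with input that really contains backslashes, here is a rough sketch. It assumes the typed-in text uses four-digit \uXXXX escapes; decode_u_escapes is a hypothetical helper written for this example, not something from the standard library, and whether you want to decode such input at all depends on what you are trying to achieve.

```python
import re

def validate(username):
    # Same check as in the question.
    return "admin" not in username

# Hypothetical helper: replace literal \uXXXX text with the character it names.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def decode_u_escapes(text):
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# A raw string keeps the backslashes, like text typed into a form.
typed = r"\u0061\u0064\u006d\u0069\u006e"

print(len(typed))                          # 30
print(validate(typed))                     # True
print(validate(decode_u_escapes(typed)))   # False
```

Here the undecoded input passes the check because it contains backslashes, the letter u and digits, not the letters of admin; only after decoding does the substring test find it.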
