
How to print unicode character from a string variable?

I am new to programming, and I am a bit confused.

I expected both print calls to produce the same graphical Unicode exclamation mark symbol:

My experiment:

number   = 10071
byteStr  = number.to_bytes(4, byteorder='big')
hexStr   = hex(number)
uniChar  = byteStr.decode('utf-32be')
uniStr   = '\\u' + hexStr[2:6]
print(f'{number} - {hexStr[2:6]} - {byteStr} - {uniChar}')

print(f'{uniStr}')   # Not working
print(f'\u2757')     # Working

Output:

10071 - 2757 - b"\x00\x00'W" - ❗
\u2757
❗

What is the difference between the last two lines? Please help me understand it!

My environment is JupyterHub with Python 3.9.

An escape code is evaluated by the Python parser when it constructs a string literal. For example, the literal strings '马' and '\u9a6c' are evaluated by the parser as the same length-1 string.

You can (and did) build a string containing the 6 characters \u2757 by using an escape code for the backslash (\\) to prevent the parser from evaluating those 6 characters as an escape code, which is why it prints as the 6-character \u2757.
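To make the difference concrete, here is a minimal sketch comparing the two strings. The variable names literal and built are only illustrative and do not come from the question.

```python
# The parser resolves the escape while reading the literal.
literal = '\u2757'        # one character: U+2757

# The doubled backslash is itself an escape for a single backslash,
# so nothing here is ever parsed as a \uXXXX escape.
built = '\\u' + '2757'    # six characters: \ u 2 7 5 7

print(len(literal))       # 1
print(len(built))         # 6
print(literal == built)   # False
print('马' == '\u9a6c')    # True
```

The escape is a feature of source-code literals only; a string assembled at run time is never re-parsed.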

If you build a byte string with those 6 characters, you can decode it with .decode('unicode_escape') to get the character:

>>> b'\\u2757'.decode('unicode_escape')
'❗'

But it is easier to use the chr() function on the number itself:

>>> chr(0x2757)
'❗'
>>> chr(10071)
'❗'
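Applied to the variables in the question, both routes might look like the sketch below. The names uni_char, uni_str and decoded are illustrative, and the \u form assumes the code point fits in 4 hex digits (code points above 0xFFFF would need the 8-digit \U form instead).

```python
number = 10071

# Direct route: integer code point -> character.
uni_char = chr(number)

# Route via escaped text, as attempted in the question:
# build the 6 characters \u2757, then ask the codec to interpret them.
uni_str = '\\u' + format(number, '04x')
decoded = uni_str.encode('ascii').decode('unicode_escape')

print(uni_char)             # the exclamation mark symbol
print(decoded)              # the same symbol
print(uni_char == decoded)  # True
```

The chr() route avoids the detour through text and bytes, so it is usually the simpler choice when you already have the number.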
