How to print a Unicode character from a string variable?

Question

I am new to the programming world, and I am a bit confused.

I expected both print calls to output the same graphical Unicode exclamation mark symbol:

My experiment:

number   = 10071
byteStr  = number.to_bytes(4, byteorder='big')
hexStr   = hex(number)
uniChar  = byteStr.decode('utf-32be')
uniStr   = '\\u' + hexStr[2:6]
print(f'{number} - {hexStr[2:6]} - {byteStr} - {uniChar}')

print(f'{uniStr}')   # Not working
print(f'\u2757')     # Working

Output:

10071 - 2757 - b"\x00\x00'W" - ❗
\u2757
❗

What is the difference between the last two lines? Please help me understand it!

My environment is JupyterHub with Python 3.9.

Answer 1

An escape code is evaluated by the Python parser when it constructs a string literal. For example, the literals '马' and '\u9a6c' are evaluated by the parser as the same string of length 1.

You can (and did) build a string with the 6 characters \u2757 by using an escape code for the backslash ( \\ ), which prevents the parser from evaluating those 6 characters as an escape code. That is why it prints as the 6-character \u2757 .
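
The difference can be checked directly by comparing lengths. This is a minimal sketch using only built-ins; the variable names literal and built are illustrative and do not come from the question:

```python
# The parser evaluates the escape in a literal into a single character.
literal = '\u2757'

# Escaping the backslash keeps six separate characters: \ u 2 7 5 7
built = '\\u' + '2757'

print(len(literal))      # 1
print(len(built))        # 6
print(literal == built)  # False
print(repr(built))       # repr shows the backslash is a real character
```

Concatenation happens at run time, long after the parser has finished with the literals, so the pieces are never re-read as an escape code.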

If you build a byte string with those 6 characters, you can decode it with .decode('unicode_escape') to get the character:

>>> b'\\u2757'.decode('unicode_escape')
'❗'
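
The variable in the question is a str rather than a byte string, so it has to be encoded first. A small sketch, assuming the string holds only ASCII characters (a backslash, the letter u and hex digits); uni_str and uni_char are illustrative names:

```python
# Rebuild the 6-character string from the question.
uni_str = '\\u' + hex(10071)[2:6]

# str has no .decode() method, so convert to bytes first.
# ASCII is sufficient because the string contains no non-ASCII characters.
uni_char = uni_str.encode('ascii').decode('unicode_escape')

print(uni_char)
print(len(uni_char))  # 1
```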

But it is easier to use the chr() function on the number itself:

>>> chr(0x2757)
'❗'
>>> chr(10071)
'❗'
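
Applied to the number from the question, this might look like the sketch below. It also shows one reason to prefer chr(): it works for code points above U+FFFF, where a 4-digit \u escape (and the hexStr[2:6] slice) would be too short. The code point 0x1F600 is just an example chosen for illustration:

```python
number = 10071

char = chr(number)
print(char)                 # the character at code point 10071
print(ord(char) == number)  # True: ord() is the inverse of chr()
print(f'U+{number:04X}')    # U+2757

# A code point that needs more than four hex digits still yields one character.
print(len(chr(0x1F600)))    # 1
```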

solution1
0 ACCEPTED 2022-11-28 19:29:04