
32-bit Unicode in Python

Python has an escape sequence \u to display Unicode values. However, it is restricted to 16-bit values, that is, exactly four hex digits:

>>> '\u1020'
'ဠ'

32-bit Unicode values, however, do not work with \u:

>>> '\u00001000'
'\x001000'

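This is clearly not the intended character: \u consumed only the first four hex digits (0000) and left 1000 as literal text. The Python documentation mentions:

The escape sequence \u0020 indicates to insert the Unicode character with the ordinal value 0x0020 (the space character) at the given position.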
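
The result of the failed attempt can be checked directly. The following is a minimal sketch, assuming Python 3 string literals; the variable names are illustrative only. It shows that \u reads exactly four hex digits and treats whatever follows as ordinary characters.

```python
# \u consumes exactly four hex digits; the remaining "1000" is literal text.
s = '\u00001000'

# The string is a NUL character followed by the four characters '1', '0', '0', '0'.
first_char_code = ord(s[0])
rest = s[1:]

print(len(s))            # number of characters in the string
print(first_char_code)   # code point of the first character
print(rest)              # the leftover literal digits
```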

The Python Unicode HOWTO clearly mentions the use of '\U' (capital U, followed by exactly eight hex digits) to represent 32-bit Unicode sequences.

>>> "\u0394"                          # Using a 16-bit hex value
'Δ'
>>> "\U00000394"                      # Using a 32-bit hex value
'Δ'

In this case

>>> '\U00001000'
'က'
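
The same escape also reaches code points above U+FFFF, which \u cannot express at all. The following is a short sketch, assuming Python 3.3 or later (where every code point is a single string element); U+1F600 is just an example code point chosen for illustration.

```python
# \U takes exactly eight hex digits, so it can name any code point up to U+10FFFF.
bmp_char = '\U00001000'      # same character as '\u1000'
astral_char = '\U0001F600'   # a code point outside the 16-bit range

# chr() builds the same characters from their integer code points.
same_bmp = chr(0x1000)
same_astral = chr(0x1F600)

# Outside the 16-bit range, UTF-8 needs four bytes for one character.
utf8_bytes = astral_char.encode('utf-8')

print(bmp_char == same_bmp, astral_char == same_astral)
print(len(astral_char), len(utf8_bytes))
```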
