
Highest usable UNICODE character

I'm writing a routine that saves large numbers to a file, but instead of writing the actual number as a string (e.g. 999999), I'd like to use its equivalent UNICODE character (e.g. 𘚟), regardless of whether it actually corresponds to a visible or recognizable character. Excluding surrogate pairs, does anyone know which numerical values correspond to a SINGLE Unicode character? I'm asking this since I noticed that certain numerical values correspond to a two-character Unicode code point. For example, 999999 corresponds to 𘚟, whereas 999998 corresponds to 𘚟.

Unicode is currently defined to end at 10_ffff₁₆ = 1_114_111₁₀. Some languages are able to relax that restriction, e.g. Perl:

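#!/usr/bin/env perl
use Encode qw(encode);   # encode() comes from the core Encode module

# A code point far beyond the Unicode limit of 0x10_ffff
"\x{7fff_ffff_ffff_ffff}";
# ÿ¿¿¿¿¿¿¿¿¿¿
# Perl's lax "UTF8" encoding (unlike strict "UTF-8") accepts such values
encode "UTF8", "\x{7fff_ffff_ffff_ffff}";
# 0xff 0x80 0x87 0xbf 0xbf 0xbf 0xbf 0xbf 0xbf 0xbf 0xbf 0xbf 0xbf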
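
To make the standard range concrete, here is a minimal sketch in Python (chosen only for illustration, since the question does not name a language). It assumes the goal is to test whether a number fits in one non-surrogate code point, which means 0 to 0xD7FF or 0xE000 to 0x10FFFF. The helper names `is_scalar_value` and `utf16_units` are made up for this example. The second helper hints at why some values look like "two characters": in UTF-16, every code point above 0xFFFF is stored as a surrogate pair.

```python
MAX_CODE_POINT = 0x10FFFF            # 1_114_111, the limit stated above
SURROGATES = range(0xD800, 0xE000)   # reserved for UTF-16 surrogate pairs


def is_scalar_value(n: int) -> bool:
    """True if n can be stored as one non-surrogate Unicode code point."""
    return 0 <= n <= MAX_CODE_POINT and n not in SURROGATES


def utf16_units(n: int) -> int:
    """Number of 16-bit code units UTF-16 needs for code point n.

    Assumes is_scalar_value(n) is True; surrogates cannot be encoded.
    """
    return len(chr(n).encode("utf-16-le")) // 2


print(is_scalar_value(999_999))    # True
print(is_scalar_value(0x110000))   # False: one past the last code point
print(utf16_units(0xFFFF))         # 1
print(utf16_units(999_999))        # 2: stored as a surrogate pair in UTF-16
```

Under these assumptions there are 0x110000 - 0x800 = 1_112_064 usable values. Many of them are unassigned or noncharacters, so other software may reject or replace them when reading the file.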
