Converting Unicode code points to string characters in Ruby

Question

I have these values from a Unicode database, but I'm not sure how to translate them into human-readable form. What are these even called?

Here they are:

U+2B71F
U+2A52D
U+2A68F
U+2A690
U+2B72F
U+2B4F7
U+2B72B

How can I convert these to their readable symbols?

Answer 1

How about:

# Using pack
puts ["2B71F".hex].pack("U")

# Using chr
puts (0x2B71F).chr(Encoding::UTF_8)

In Ruby 1.9+ you can also do:

puts "\u{2B71F}"

I.e. the \u{} escape sequence can be used to decode Unicode codepoints.
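To handle the whole list from the question in one go, the pack approach can be mapped over every value. This is only a sketch: the `codes` and `chars` variable names are made up for illustration, and it assumes every entry has the `U+` prefix shown in the question.

```ruby
# Code points copied from the question, in "U+XXXXX" notation.
codes = %w[U+2B71F U+2A52D U+2A68F U+2A690 U+2B72F U+2B4F7 U+2B72B]

# Strip the "U+" prefix, parse the remainder as hex,
# and pack each number into a one-character UTF-8 string.
chars = codes.map { |code| [code.sub(/\AU\+/, "").hex].pack("U") }

puts chars.join(" ")
```

Whether the characters render properly depends on the terminal and on an installed font that covers these CJK extension blocks.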

Answer 2

Unicode values like U+2B71F are referred to as codepoints.

The Unicode system defines a unique codepoint for each character across a multitude of world languages, scientific symbols, currencies, etc. This character set is steadily growing.

For example, U+221E is infinity.

Codepoints are written as hexadecimal numbers. There is always exactly one number defined per character.
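As a small illustration of the hexadecimal notation, Ruby can go from the hex digits to the integer and back. The variable name `code_point` is just for this sketch.

```ruby
# "221E" is the hex part of U+221E (infinity).
code_point = "221E".hex

puts code_point                   # the same number in decimal: 8734
puts code_point.to_s(16).upcase   # back to hex digits: 221E
puts "\u221E".ord == code_point   # String#ord returns the character's codepoint
```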

There are many ways to arrange these numbers in memory. Such an arrangement is known as an encoding, of which the common ones are UTF-8 and UTF-16. Conversion between them is well defined.
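A rough way to see the difference between encodings is to look at the bytes Ruby stores for the same character. The sketch below uses U+221E from the example above and picks UTF-16BE (big-endian, no byte order mark) as one possible UTF-16 variant; the variable names are illustrative.

```ruby
char = "\u{221E}" # infinity

utf8  = char.encode("UTF-8")
utf16 = char.encode("UTF-16BE")

# Same codepoint, different byte layouts.
puts utf8.bytes.map { |b| format("%02X", b) }.join(" ")   # E2 88 9E
puts utf16.bytes.map { |b| format("%02X", b) }.join(" ")  # 22 1E

# Converting back gives the original string again.
puts utf16.encode("UTF-8") == char
```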

Here you are most probably looking to convert the Unicode codepoint to a UTF-8 character.

codepoint = "U+2B71F"

You need to extract the hex part that comes after U+ to get just 2B71F. This will be the first capture group.

codepoint.to_s =~ /U\+([0-9a-fA-F]{4,5}|10[0-9a-fA-F]{4})$/

And your UTF-8 character will be:

utf_8_character = [$1.hex].pack("U")
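The steps above could be wrapped into a single method. This is a sketch, not part of the original answer: the method name `codepoint_to_char` is hypothetical, and it anchors the pattern with `\A` and `\z` (an added assumption) so that only a bare `U+XXXX` string is accepted.

```ruby
# Hypothetical helper: turns "U+2B71F" notation into a one-character UTF-8 string.
# Returns nil when the input does not look like a codepoint.
def codepoint_to_char(notation)
  match = notation.to_s.match(/\AU\+([0-9a-fA-F]{4,5}|10[0-9a-fA-F]{4})\z/)
  return nil unless match

  # match[1] is the first capture group, i.e. the hex digits after "U+".
  [match[1].hex].pack("U")
end

p codepoint_to_char("U+221E")
p codepoint_to_char("not a codepoint")
```

Using a `MatchData` object instead of the global `$1` keeps the method free of hidden state, at the cost of being slightly longer.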

Answer 1: 36 (accepted), 2011-08-07 23:52:57
Answer 2: 19, 2011-08-07 23:54:21
