Escaping Unicode surrogate characters?

Question

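I have the following line of text (see also the code):

What I want to do is escape the emoji (the phone icon) into two \u escape sequences and then convert it back to the original phone icon. The first approach below works, but I essentially want to escape by range, so that any such character gets escaped. I don't know how to do that with the first approach.

How can I use UnicodeEscaper to get range-based escaping with the same output as StringEscapeUtils (i.e. escape to two sequences of the form \uXXXX\uXXXX, then unescape back to the phone icon)?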

import org.apache.commons.lang3.text.translate.UnicodeEscaper;
import org.apache.commons.lang3.text.translate.UnicodeUnescaper;

    String text = "Unicode surrogate here-> 📱<--here";
    // Escapes the entire string, which is not what I want: there could be
    // \n, \r, or other escape chars that I want left intact (I only want a range).
    String text2 = org.apache.commons.lang.StringEscapeUtils.escapeJava(text);
    System.out.println(text2);   // "Unicode surrogate here-> \uD83D\uDCF1<--here"
    // Unescape it back to the phone emoticon.
    text2 = org.apache.commons.lang.StringEscapeUtils.unescapeJava(text2);
    System.out.println(text2); // "Unicode surrogate here-> 📱<--here"

    // How do I do the same as above, but escaping only a range of chars
    // (i.e. any Unicode surrogate) rather than the entire string?
    text2 = UnicodeEscaper.between(0x10000, 0x10FFFF).translate(text);
    System.out.println(text2); // "Unicode surrogate here-> \u1F4F1<--here"
    // unescape .... (need the phone emoticon here)
    text2 = (new UnicodeUnescaper().translate(text2));
    System.out.println(text2);// "Unicode surrogate here-> ὏1<--here"

Answer 1

This answer comes late, but I found that you need the

org.apache.commons.lang3.text.translate.JavaUnicodeEscaper

class rather than UnicodeEscaper.

With it, the output is:

Unicode surrogate here-> \uD83D\uDCF1<--here

and unescaping works fine.

Answer 2

Your string:

"Unicode surrogate here-> \u1F4F1<--here"

does not do what you think it does.

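A char is basically a UTF-16 code unit, and therefore 16 bits. So what happens here is that you have \u1F4F followed by the literal character 1; this explains your output.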
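To make this concrete, here is a minimal sketch (the class and method names are made up for illustration). It only checks how the Java compiler reads that literal: the escape consumes exactly four hex digits, and the fifth digit is left over as an ordinary character.

```java
public class EscapeParsingDemo {
    // Returns the literal as the compiler sees it: two chars, not one code point.
    public static String misparsed() {
        return "\u1F4F1";
    }

    public static void main(String[] args) {
        String s = misparsed();
        System.out.println(s.length());                       // 2
        System.out.println(Integer.toHexString(s.charAt(0))); // 1f4f
        System.out.println(s.charAt(1));                      // 1
    }
}
```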

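I don't know what you mean by "escaping" here, but if it means replacing surrogate pairs with \uXXXX\uXXXX, then have a look at Character.toChars(). It returns the sequence of chars needed to represent one Unicode code point, whether that code point is in the BMP (one char) or outside it (two chars).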

For the code point U+1F4F1, it returns a two-element char array containing the chars 0xd83d and 0xdcf1. That is what you want.
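Building on Character.toChars(), range-based escaping could look roughly like the sketch below. It uses only the JDK, and the names SurrogateEscapeSketch, escapeRange and unescape are hypothetical helpers, not part of Commons Lang. It assumes the input does not already contain a literal backslash-u sequence that should be left alone, since unescape would convert that as well.

```java
public class SurrogateEscapeSketch {
    // Escapes each code point in [low, high], one escape sequence per UTF-16 code unit.
    public static String escapeRange(String input, int low, int high) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (cp >= low && cp <= high) {
                // One char for BMP code points, a surrogate pair otherwise.
                for (char unit : Character.toChars(cp)) {
                    out.append(String.format("\\u%04X", (int) unit));
                }
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Converts each backslash, 'u', four hex digits sequence back into one char.
    // Adjacent high and low surrogates form the original code point again.
    public static String unescape(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("\\u", i) && i + 6 <= input.length()
                    && isHex(input, i + 2, i + 6)) {
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True if every char in [from, to) is a hex digit.
    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String phone = new String(Character.toChars(0x1F4F1));
        String text = "Unicode surrogate here-> " + phone + "<--here";
        String escaped = escapeRange(text, 0x10000, 0x10FFFF);
        System.out.println(escaped);
        System.out.println(unescape(escaped).equals(text)); // true
    }
}
```

With the range 0x10000 to 0x10FFFF, only supplementary code points are touched, so characters such as newlines pass through unchanged, and the phone icon comes out as the pair D83D and DCF1, each behind its own backslash-u prefix.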

Answer 1 (score 3, 2015-04-05 23:40:02)

Answer 2 (score 2, accepted, 2014-04-07 20:09:16)
