Converting a Unicode value to a string in Java

Question

I am trying to extract currencies from my texts. The currencies come from a db that also contains special currency symbols. For example, for the pound the db holds the unicode escape "\u00A3" along with other identifiers such as "gbp".

I am trying to get the corresponding symbol from the unicode escape and compare it with my text in a loop, as suggested here.

But when I evaluate my code, it does not work as expected:

private Optional<Currency> extractTokenWise(Iterable<String> tokens){
    try {
        for (String aToken : tokens) {
            for (String currency : currencies.keySet()) {
                for (String arep : currencies.get(currency)) {
                    if(arep.startsWith("\\")){ //special character for currency written in unicode representation                  
                        byte[] charset = arep.getBytes("UTF-8");
                        arep = new String(charset, "UTF-8");
                    }
                    if (aToken.equals(arep)) {
                        return Optional.of(Currency.findProperEnum(currency));
                    }
                }
            }
        }
    }catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    return Optional.empty();
}

Interestingly, when arep holds the value "\u00A3" read from the db, it does not work, but when I assign the String literal "\u00A3" directly in the code, it produces the result I want. What am I missing here?

Answer 1

As mentioned in the comments, something like this should work:

if (arep.startsWith("\\u")) {
    arep = Character.toString((char) Integer.parseInt(arep.substring(2), 16));
}
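The snippet above assumes that everything after the first two characters is a valid hexadecimal number; otherwise Integer.parseInt throws a NumberFormatException. A slightly more defensive variant could look like the sketch below. The class name UnicodeEscapeDemo and the helper decodeUnicodeEscape are made up for illustration, and the helper only handles a value that consists of exactly one four-digit escape.

```java
public class UnicodeEscapeDemo {

    // Hypothetical helper: if the value is exactly a backslash, the letter u and
    // four hex digits (six characters in total), return the single character it names.
    // Any other value (for example "gbp") is returned unchanged.
    public static String decodeUnicodeEscape(String rep) {
        if (rep != null && rep.matches("\\\\u[0-9a-fA-F]{4}")) {
            int codeUnit = Integer.parseInt(rep.substring(2), 16);
            return String.valueOf((char) codeUnit);
        }
        return rep;
    }

    public static void main(String[] args) {
        String fromDb = "\\u00A3"; // six characters, as assumed to be stored in the db
        String decoded = decodeUnicodeEscape(fromDb);

        System.out.println(decoded.length());           // 1
        System.out.println(decoded.equals("\u00A3"));   // true
        System.out.println(decodeUnicodeEscape("gbp")); // gbp
    }
}
```

Because of the cast to char, this only covers characters in the Basic Multilingual Plane, which is enough for common currency symbols such as the pound or euro sign.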

Answer 2

I think you are mixing up unicode escape sequences in Java code with strings containing such escape sequences.

String poundSign = "\u00A3"; assigns poundSign a string containing the single character £. This string has a length of 1 character. In memory and in the class file it will occupy 2 bytes.

It looks like arep contains the string \u00A3 as it would be assigned by String unicodeEscapeForPoundSign = "\\u00A3"; -- that's what your first if statement tests for. It contains the unicode escape sequence as used in Java code, but not the character this escape sequence represents. It contains the 6 characters '\\', 'u', '0', '0', 'A', and '3' (as your IDE shows). arep.getBytes("UTF-8"); returns an array holding the bytes of just these characters, and new String(charset, "UTF-8"); converts the array back to the string \u00A3 and not to the string £.
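The difference between the two strings, and the fact that the UTF-8 round trip changes nothing, can be checked with a small program. This is only an illustrative sketch; the class name RoundTripDemo and the method roundTrip are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {

    // Encodes a string to UTF-8 bytes and decodes it again.
    // The text comes back unchanged; no escape sequence is interpreted.
    public static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String escapeText = "\\u00A3"; // backslash, 'u', '0', '0', 'A', '3'
        String poundSign = "\u00A3";   // the compiler turns this into the pound sign

        System.out.println(escapeText.length());                               // 6
        System.out.println(poundSign.length());                                // 1
        System.out.println(roundTrip(escapeText).equals(escapeText));          // true
        System.out.println(roundTrip(escapeText).equals(poundSign));           // false
        System.out.println(poundSign.getBytes(StandardCharsets.UTF_8).length); // 2
    }
}
```

Using StandardCharsets.UTF_8 instead of the charset name "UTF-8" also removes the need to catch UnsupportedEncodingException.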

The solution depends on what you get from your database. Assuming you have a mapping from the db value to a Currency object or the ISO currency code, you won't need your first if statement; just make sure arep contains the correct string:

String arep = "\u00A3" (single pound character)
String arep = "\\u00A3" (Java unicode escape string for the pound character)
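If the database really does hold the six-character escape text, one way to make sure arep contains the correct string is to decode all representations once, right after loading them, so the matching loop only compares plain strings. The following is a rough sketch under the assumption that currencies is a Map from a currency code to a list of representations; the names CurrencyRepNormalizer, decodeEscapes and normalize are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CurrencyRepNormalizer {

    // Matches a literal backslash, the letter u, and four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces every escape sequence found in the text with the character it names.
    public static String decodeEscapes(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps characters such as '$' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns a copy of the map in which every representation has been decoded once, up front.
    public static Map<String, List<String>> normalize(Map<String, List<String>> raw) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : raw.entrySet()) {
            List<String> decoded = new ArrayList<>();
            for (String rep : entry.getValue()) {
                decoded.add(decodeEscapes(rep));
            }
            result.put(entry.getKey(), decoded);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> raw = new LinkedHashMap<>();
        raw.put("GBP", Arrays.asList("gbp", "\\u00A3")); // sample data, not from the original db

        Map<String, List<String>> clean = normalize(raw);
        System.out.println(clean.get("GBP").contains("\u00A3")); // true
    }
}
```

With the map normalized this way, the token loop can compare aToken against each representation directly, without a special case for values that start with a backslash.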

Answer 1: score 2, accepted, 2019-07-26 16:11:32

Answer 2: score 1, 2019-07-26 16:27:11
