Get Unicode-encoded characters from a given string (Kannada language)

Question

String s1="\u0048\u0065\u006C\u006C\u006F";   // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";  // ಮುಖಪುಟ (Kannada Language)

System.out.println("s1: " + StringEscapeUtils.unescapeJava(s1));  // s1: Hello
System.out.println("s2: " + StringEscapeUtils.unescapeJava(s2));  // s2: ??????

When I print s1, I get the result Hello. When I print s2, I get the result ??????.

I want the output for s2 to be ಮುಖಪುಟ. How can I achieve this?

Answer 1

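// Encode and decode with the same explicit charset instead of the platform default.
// Both the PrintStream constructor and toString(String) can throw UnsupportedEncodingException.
ByteArrayOutputStream os = new ByteArrayOutputStream();
PrintStream ps = new PrintStream(os, true, "UTF-8");
ps.println("\u0048\u0065\u006C\u006C\u006F \u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F");
String output = os.toString("UTF-8");
System.out.println("result: " + output);   // Hello ಮುಖಪುಟ, provided the console can display it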

Answer 2

You need to specify an encoding such as "UTF-8". Try this:

String s1="\u0048\u0065\u006C\u006C\u006F";   // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";  // ಮುಖಪುಟ (Kannada Language)

System.out.println("s1: " + new String(s1.getBytes("UTF-8"), "UTF-8"));
System.out.println("s2: " + new String(s2.getBytes("UTF-8"), "UTF-8"));

Answer 3

If you are using Eclipse, have a look at: https://decoding.wordpress.com/2010/03/18/eclipse-how-to-change-the-console-output-encoding/

Then simply print the strings to the console as follows:

String s1="\u0048\u0065\u006C\u006C\u006F";   
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";
System.out.println("s1: " + s1);  // s1
System.out.println("s2: " + s2);  // s2

Hope this is helpful to you.
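The direct println in this answer needs no StringEscapeUtils call because the Java compiler translates Unicode escapes in source code before the program runs, so s2 already holds six Kannada characters. A small sketch to check this follows; the class name EscapeCheck and the helper codeUnits are illustrative and not part of the original post.

```java
public class EscapeCheck {
    // Formats each UTF-16 code unit of the string as U+XXXX, separated by spaces.
    public static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s2 = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";
        // The escapes were already decoded at compile time.
        System.out.println(s2.length());   // 6
        System.out.println(codeUnits(s2)); // U+0CAE U+0CC1 U+0C96 U+0CAA U+0CC1 U+0C9F
    }
}
```

If the code units print correctly but the text itself still shows as question marks, the string is fine and the remaining problem lies in how the console encodes or displays the output.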

Answer 4

The problem is most probably that System.out is not prepared to deal with Unicode. It is an output stream that encodes characters using the so-called default encoding.

The default encoding is most often (e.g. on Windows) some proprietary 8-bit character set that simply cannot handle Unicode.

My tip: for the sake of testing, create your own PrintStream or PrintWriter with UTF-8 encoding.
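The ?????? output in the question is consistent with this explanation. Here is a minimal sketch, assuming ISO-8859-1 as a stand-in for a legacy 8-bit default encoding (the actual default on the asker's machine is not stated). The class name DefaultEncodingDemo and the helper roundTrip are illustrative. When a character has no mapping in the target charset, String.getBytes substitutes the charset's replacement byte, which is '?' for ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    // Encodes the text with the given charset, then decodes the bytes with the same charset.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String s2 = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";

        // Shows which encoding the JVM picked up from the platform.
        System.out.println("default charset: " + Charset.defaultCharset());

        // ISO-8859-1 has no Kannada characters, so each one is replaced by '?'.
        System.out.println(roundTrip(s2, StandardCharsets.ISO_8859_1)); // ??????

        // UTF-8 can represent every Unicode character, so nothing is lost.
        System.out.println(roundTrip(s2, StandardCharsets.UTF_8).equals(s2)); // true
    }
}
```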
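One way to follow this tip is sketched below. The class name Utf8ConsoleDemo and the helper utf8Stream are illustrative names, not from the original answer. The sketch wraps an existing stream in a PrintStream that encodes as UTF-8. It only controls the bytes that are written, so the terminal or IDE console must also be set to interpret UTF-8 and use a font with Kannada glyphs.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8ConsoleDemo {
    // Wraps the target stream in an auto-flushing PrintStream that encodes text as UTF-8.
    public static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String s1 = "\u0048\u0065\u006C\u006C\u006F";
        String s2 = "\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";

        PrintStream out = utf8Stream(System.out);
        out.println("s1: " + s1);
        out.println("s2: " + s2);
    }
}
```

Calling System.setOut(utf8Stream(System.out)) once at startup would apply the same encoding to every later System.out.println call.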

4 answers

Answer 1: score 1, accepted, 2016-06-04 03:19:37

Answer 2: score 0, 2016-06-01 09:17:24

Answer 3: score 0, 2016-06-02 05:49:18

Answer 4: score 0, 2016-06-03 09:53:08
