String s1="\u0048\u0065\u006C\u006C\u006F"; // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F"; // ಮುಖಪುಟ (Kannada Language)
System.out.println("s1: " + StringEscapeUtils.unescapeJava(s1)); // s1: Hello
System.out.println("s2: " + StringEscapeUtils.unescapeJava(s2)); // s2: ??????
When I print s1
, I get the result as Hello
. When I print s2
, I get the result as ???????
.
I want the output as ಮುಖಪುಟ
for s2
. How can I achieve this?
ByteArrayOutputStream os = new ByteArrayOutputStream();
PrintStream ps = new PrintStream(os);
ps.println("\u0048\u0065\u006C\u006C\u006F \u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F");
String output = os.toString("UTF8");
System.out.println("result: "+output); // Hello ಮುಖಪುಟ
You need to add the encoding like "UTF-8" try this
String s1="\u0048\u0065\u006C\u006C\u006F"; // Hello
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F"; // ಮುಖಪುಟ (Kannada Language)
System.out.println("s1: " + new String(s1.getBytes("UTF-8"), "UTF-8"));
System.out.println("s2: " + new String(s2.getBytes("UTF-8"), "UTF-8"));
If you are using Eclipse
then please have a look at: https://decoding.wordpress.com/2010/03/18/eclipse-how-to-change-the-console-output-encoding/
Please simply output on the console as follows:-
String s1="\u0048\u0065\u006C\u006C\u006F";
String s2="\u0CAE\u0CC1\u0C96\u0CAA\u0CC1\u0C9F";
System.out.println("s1: " + s1); // s1
System.out.println("s2: " + s2); // s2
Hope, this is helpful to you.
The problem is most probably that System.out
is not prepared to deal with Unicode. It is an output stream that gets encoded in the so called default encoding .
The default encoding is most often (ie on Windows) some proprietary 8-bit character set, that simply can't handle unicode.
My tip: For the sake of testing, create your own PrintStream or PrintWriter with UTF-8 encoding.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.