Unable to read Japanese characters through System.in

Question

Code:

Scanner sc = new Scanner(System.in);
System.out.println("Enter Name : ");
String name = sc.nextLine();
System.out.println(name);

String encoding = "UTF-8";
System.out.println(new String(name.getBytes(encoding), "euc-jp"));
System.out.println(new String(name.getBytes(encoding), "Shift_JIS"));
System.out.println(new String(name.getBytes(encoding), "ISO-2022-JP"));
System.out.println(new String(name.getBytes(encoding), "ISO8859-1"));

Input:

Enter Name : たなかです

Output:

F Q N @

鐃鐃鐃緒申鐃鐃

ｿｽF ｿｽQ ｿｽｿｽｿｽN ｿｽ@ ...

F Q N @

ï¿½Fï¿½Qï¿½ï¿½ï¿½Nï¿½@

None of these lines is readable Japanese. I've also tried InputStreamReader and DataInputStream with byte[].

Answer 1

How to print the string properly to the console with your code

name.getBytes(encoding) in your code returns the raw bytes of the String name encoded as UTF-8. So when you type "たなかです" in the console, you get the byte array {0xE3, 0x81, 0x9F, 0xE3, 0x81, 0xAA, 0xE3, 0x81, 0x8B, 0xE3, 0x81, 0xA7, 0xE3, 0x81, 0x99}.

These bytes are a UTF-8 representation, so the only encoding you can pass as the second argument of the constructor String(byte[] bytes, String charsetName) is UTF-8. Decoding them with any other charset misreads the bytes and produces garbled text.

System.out.println(new String(name.getBytes(encoding), "UTF-8"));

This converts the byte array {0xE3, 0x81, 0x9F, ... } back to a String object, which then prints to the console properly.
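As a rough illustration of the point above, the following sketch encodes the name as UTF-8, prints the bytes in hex, and compares a matching decode against a mismatched one. The class name Utf8RoundTrip and the helpers recode and toHex are made up for this example and are not part of the original code. The string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Hypothetical helper: encode text with one charset, decode the bytes with another.
    public static String recode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    // Hypothetical helper: render bytes as space-separated uppercase hex.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u305F\u306A\u304B\u3067\u3059"; // たなかです

        // Prints: E3 81 9F E3 81 AA E3 81 8B E3 81 A7 E3 81 99
        System.out.println(toHex(name.getBytes(StandardCharsets.UTF_8)));

        // Matching charsets on both sides give back the original string: prints true
        System.out.println(name.equals(
                recode(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));

        // Decoding UTF-8 bytes as ISO-8859-1 yields 15 unrelated characters: prints false
        System.out.println(name.equals(
                recode(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));
    }
}
```

The comparison results are printed as booleans rather than as Japanese text so that the output stays readable even when the console encoding is itself misconfigured.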

How to get the internal representation of a String as a byte array

A String object uses UTF-16 for its internal text representation (see https://docs.oracle.com/javase/8/docs/technotes/guides/intl/overview.html for details).

So use name.getBytes("UTF-16") when you want a byte array that matches the internal text representation. You can convert it back to a String object with System.out.println(new String(name.getBytes("UTF-16"), "UTF-16"));.
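A small sketch of what the UTF-16 bytes look like may help here. One detail worth knowing: when encoding, Java's "UTF-16" charset writes big-endian bytes and prepends a byte-order mark (FE FF), while "UTF-16BE" writes the same code units without the mark. The class name Utf16Bytes and the helper toHex are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

public class Utf16Bytes {
    // Hypothetical helper: render bytes as space-separated uppercase hex.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u305F\u306A\u304B\u3067\u3059"; // たなかです

        byte[] withBom = name.getBytes(StandardCharsets.UTF_16);
        byte[] withoutBom = name.getBytes(StandardCharsets.UTF_16BE);

        // Prints: FE FF 30 5F 30 6A 30 4B 30 67 30 59
        System.out.println(toHex(withBom));

        // Prints: 30 5F 30 6A 30 4B 30 67 30 59
        System.out.println(toHex(withoutBom));

        // Decoding with the same charset restores the original string: prints true
        System.out.println(name.equals(new String(withBom, StandardCharsets.UTF_16)));
    }
}
```

Each hiragana character here is one UTF-16 code unit, so the ten bytes after the byte-order mark are simply the code points U+305F, U+306A, U+304B, U+3067 and U+3059 in order.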

Answer 2

There is a slight problem in the following code snippet: you are using the same encoding for different charsets.

String encoding = System.getProperty("file.encoding"); 
System.out.println(new String(name.getBytes(encoding), "UTF-8"));

Assuming you want to print the Japanese characters using different charsets, use this:

 System.out.println(new String(name.getBytes("euc-jp"), "euc-jp"));
 System.out.println(new String(name.getBytes("Shift_JIS"), "Shift_JIS"));
 System.out.println(new String(name.getBytes("ISO-2022-JP"), "ISO-2022-JP"));
 System.out.println(new String(name.getBytes("ISO8859-1"), "ISO8859-1"));
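The sketch below checks what such same-charset round trips do, under the assumption that the JVM ships the Japanese charsets (they are optional, so the code tests with Charset.isSupported first). A round trip returns the original string only when the charset can represent every character; ISO-8859-1 has no Japanese characters, so each one is replaced with '?' during encoding. The class name SameCharsetRoundTrip and the helpers roundTrip and canRepresent are hypothetical.

```java
import java.nio.charset.Charset;

public class SameCharsetRoundTrip {
    // Hypothetical helper: encode and decode with the same charset.
    public static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    // Hypothetical helper: true if the charset can encode every character of text.
    public static boolean canRepresent(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String name = "\u305F\u306A\u304B\u3067\u3059"; // たなかです
        String[] charsetNames = {"euc-jp", "Shift_JIS", "ISO-2022-JP", "ISO8859-1"};

        for (String charsetName : charsetNames) {
            // Extended charsets are optional in a JVM, so skip any that are missing.
            if (!Charset.isSupported(charsetName)) {
                System.out.println(charsetName + ": not supported by this JVM");
                continue;
            }
            System.out.println(charsetName
                    + ": canRepresent=" + canRepresent(name, charsetName)
                    + ", roundTripEqual=" + name.equals(roundTrip(name, charsetName)));
        }
    }
}
```

Because a successful round trip just reproduces the same String, it does not by itself change how the console renders the text; it is mainly useful as a check that a given charset can carry the input.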

2 answers

Answer 1: 0, 2020-05-12 15:57:43

Answer 2: -1, 2016-03-28 11:48:19
