BufferedReader returns ISO-8859-15 String - how to convert to UTF-16 String?
I have an FTP client class that returns an InputStream pointing to the file. I would like to read the file row by row with BufferedReader. The issue is that the client returns the file in binary mode, and the file is encoded in ISO-8859-15.
If the file/stream/whatever really contains ISO-8859-15 encoded text, you just need to specify that when you create the InputStreamReader:
BufferedReader br = new BufferedReader(
new InputStreamReader(ftp.getInputStream(), "ISO-8859-15"));
Then readLine() will create valid Strings in Java's native encoding (which is UTF-16, not UTF-8).
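To see the decoding in action without an FTP server, here is a minimal sketch that feeds raw ISO-8859-15 bytes through the same reader chain. The class name Latin9ReadDemo and the helper firstLine are made up for illustration, and a ByteArrayInputStream stands in for ftp.getInputStream(). It relies on byte 0xA4 being the euro sign in ISO-8859-15.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class Latin9ReadDemo {
    // Hypothetical helper: decodes the stream as ISO-8859-15 and returns its first line.
    public static String firstLine(InputStream in) {
        Charset latin9 = Charset.forName("ISO-8859-15");
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, latin9))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "10 " followed by byte 0xA4 (the euro sign in ISO-8859-15) and a newline.
        // This array stands in for the bytes the FTP client would deliver.
        byte[] raw = {'1', '0', ' ', (byte) 0xA4, '\n'};
        String row = firstLine(new ByteArrayInputStream(raw));

        // The single byte 0xA4 has become the single UTF-16 code unit U+20AC.
        System.out.println(row.equals("10 \u20AC")); // prints true
        System.out.println(row.length());            // prints 4
    }
}
```

Once the line is a String, there is nothing left to convert: it is already held as UTF-16 internally.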
Try this:
BufferedReader br = new BufferedReader(
new InputStreamReader(
ftp.getInputStream(),
Charset.forName("ISO-8859-15")
)
);
String row = br.readLine();
The original text is in ISO-8859-15, so the byte stream read by your InputStreamReader will be in this encoding. Read it in using that encoding by specifying it in the InputStreamReader constructor. That tells the InputStreamReader that the incoming byte stream is in ISO-8859-15 and that it should perform the appropriate byte-to-character conversions.
Now it will be in the standard Java UTF-16 format, and you can then do what you wish.
I think the current problem is that you're reading it using your default encoding (by not specifying an encoding in InputStreamReader) and then trying to convert it, by which time it's too late.
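A small sketch of why converting afterwards is too late, assuming the default encoding happens to be UTF-8 (with a single-byte default such as ISO-8859-1 the damage may look different or not occur at all). The class and method names are made up for illustration. The byte 0xA4 is not valid UTF-8 on its own, so decoding replaces it with U+FFFD and the original byte cannot be recovered.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongCharsetDemo {
    // Decodes with the wrong charset first, then tries to "repair" the result
    // by re-encoding with the wrong charset and decoding with the right one.
    public static String decodeWronglyThenRepair(byte[] raw, Charset wrong, Charset right) {
        String garbled = new String(raw, wrong);
        return new String(garbled.getBytes(wrong), right);
    }

    // Decodes with the right charset from the start.
    public static String decodeCorrectly(byte[] raw, Charset right) {
        return new String(raw, right);
    }

    public static void main(String[] args) {
        Charset latin9 = Charset.forName("ISO-8859-15");
        byte[] raw = {(byte) 0xA4}; // the euro sign in ISO-8859-15

        String good = decodeCorrectly(raw, latin9);
        String repaired = decodeWronglyThenRepair(raw, StandardCharsets.UTF_8, latin9);

        System.out.println(good.equals("\u20AC"));     // prints true
        System.out.println(repaired.equals("\u20AC")); // prints false
    }
}
```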
Using the default behaviour for these sorts of classes often ends in grief. It's a good idea to specify encodings wherever you can, and/or set the VM's default encoding via -Dfile.encoding.
Have you tried:
BufferedReader r = new BufferedReader(new InputStreamReader(ftp.getInputStream(), "ISO-8859-15"));
...