BufferedReader returns ISO-8859-15 String - how to convert to UTF16 String?

I have an FTP client class which returns an InputStream pointing to the file. I would like to read the file row by row with BufferedReader. The issue is that the client returns the file in binary mode, and the file has ISO-8859-15 encoding.

If the file/stream/whatever really contains ISO-8859-15 encoded text, you just need to specify that when you create the InputStreamReader:

BufferedReader br = new BufferedReader(
    new InputStreamReader(ftp.getInputStream(), "ISO-8859-15"));

Then readLine() will create valid Strings in Java's native encoding (which is UTF-16, not UTF-8).
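To make this concrete, here is a minimal self-contained sketch. The class name Latin9ReadDemo and the helper readFirstLine are made up for illustration, and a ByteArrayInputStream stands in for ftp.getInputStream(), which I can't reproduce here. It relies on the fact that in ISO-8859-15 the euro sign is the single byte 0xA4, and it assumes your JVM ships the ISO-8859-15 charset (standard JDKs do).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class Latin9ReadDemo {
    // Hypothetical helper: decodes the stream as ISO-8859-15 and returns its first line.
    static String readFirstLine(InputStream in) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, Charset.forName("ISO-8859-15")))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the FTP stream: "10 " followed by the euro sign (0xA4) and a newline.
        byte[] raw = {'1', '0', ' ', (byte) 0xA4, '\n'};
        String row = readFirstLine(new ByteArrayInputStream(raw));

        // The single byte 0xA4 has been decoded to the char U+20AC.
        System.out.println(row.charAt(3) == '\u20AC');
    }
}
```

The String you get back needs no further conversion; the decoding already happened inside the InputStreamReader.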

Try this:

BufferedReader br = new BufferedReader(
                        new InputStreamReader(
                            ftp.getInputStream(),
                            Charset.forName("ISO-8859-15")
                        )
                    );
String row = br.readLine();

The original string is in ISO-8859-15, so the byte stream read by your InputStreamReader will be in this encoding. So read it in using that encoding (specify this in the InputStreamReader constructor). That tells the InputStreamReader that the incoming byte stream is in ISO-8859-15 and to perform the appropriate byte-to-character conversions.

Now it will be in the standard Java UTF-16 format, and you can then do what you wish.
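If "what you wish" turns out to be actual UTF-16 bytes (say, to write to another file), something along these lines should work. This is only a sketch: the class and method names are invented, and I picked UTF-16BE (big-endian, no byte order mark) arbitrarily; use whichever UTF-16 flavour your consumer expects.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16BytesDemo {
    // Hypothetical helper: decode ISO-8859-15 bytes into a String,
    // then encode that String as UTF-16BE bytes.
    static byte[] latin9ToUtf16Be(byte[] latin9) {
        String text = new String(latin9, Charset.forName("ISO-8859-15"));
        return text.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // The euro sign is the single byte 0xA4 in ISO-8859-15.
        byte[] utf16 = latin9ToUtf16Be(new byte[] {(byte) 0xA4});

        // U+20AC encodes to the two bytes 0x20 0xAC in UTF-16BE.
        System.out.printf("%02X %02X%n", utf16[0], utf16[1]);
    }
}
```

Note that the String in the middle is never "in" ISO-8859-15 or "in" UTF-16 from your point of view; encodings only matter at the boundaries where bytes go in or out.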

I think the current problem is that you're reading it using your default encoding (by not specifying an encoding in InputStreamReader), and then trying to convert it, by which time it's too late.
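Here is a small sketch of why it's too late. I'm assuming for illustration that the default encoding is UTF-8 (yours may differ), and the class and method names are made up. The byte 0xA4 on its own is not valid UTF-8, so the decoder replaces it with U+FFFD, and no later conversion can get the original byte back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TooLateDemo {
    // Hypothetical helper: decode with the WRONG charset (UTF-8 here, standing in
    // for a platform default), then try to "repair" by re-encoding as ISO-8859-15.
    static byte[] decodeWrongThenReencode(byte[] latin9) {
        String damaged = new String(latin9, StandardCharsets.UTF_8);
        return damaged.getBytes(Charset.forName("ISO-8859-15"));
    }

    public static void main(String[] args) {
        byte[] original = {(byte) 0xA4}; // euro sign in ISO-8859-15

        // Decoding as UTF-8 yields the replacement character, not the euro sign.
        String damaged = new String(original, StandardCharsets.UTF_8);
        System.out.println(damaged.equals("\uFFFD"));

        // Re-encoding does not bring back 0xA4; the information is gone.
        byte[] roundTripped = decodeWrongThenReencode(original);
        System.out.println(roundTripped[0] == (byte) 0xA4);
    }
}
```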

Using default behaviour for these sorts of classes often ends in grief. It's a good idea to specify encodings wherever you can, and/or set the VM's default encoding via -Dfile.encoding.

Have you tried:

BufferedReader r = new BufferedReader(new InputStreamReader(ftp.getInputStream(), "ISO-8859-15"));
...
