Reading special characters from a file using Java?

Question

I am using Java to read a text file that contains some special characters, such as the yen sign (¥). I have not specified any encoding/charset when reading the file, and it works fine on Windows. But when I deploy the same code on a Unix machine, ¥ is replaced by '?'. I am now going to specify the charset windows-1252 to avoid the issue. Will windows-1252 work on Unix/Linux boxes? My Unix box's charset is set to 'utf-8'. I am using the code below:

LineIterator iterator = FileUtils.lineIterator(filename, "Windows-1252");

Answer 1

The class StandardCharsets gives you a list of encodings/charsets that are "guaranteed to be available on every implementation of the Java platform."

This list doesn't contain the Windows encodings, but on most common Java versions for Windows, Mac and Linux, Cp1252 (the Java alias for windows-1252) is available.
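If you would rather check than assume, a minimal sketch like the following asks the running JVM whether a charset is available. The class name `CharsetAvailability` and the helper `isAvailable` are made up for illustration; `Charset.isSupported` and `Charset.defaultCharset` are standard JDK calls.

```java
import java.nio.charset.Charset;

class CharsetAvailability {
    /** Returns true if this JVM provides the named charset (hypothetical helper). */
    static boolean isAvailable(String charsetName) {
        return Charset.isSupported(charsetName);
    }

    public static void main(String[] args) {
        // UTF-8 is guaranteed by the platform; the Windows names depend on the JVM.
        for (String name : new String[] {"UTF-8", "windows-1252", "Cp1252"}) {
            System.out.println(name + " available: " + isAvailable(name));
        }
        // This is the charset used when no encoding is specified for a read.
        System.out.println("Default charset: " + Charset.defaultCharset());
    }
}
```

Running this on both the Windows and the Unix machine should show whether the two JVMs differ in their default charset, which is the likely reason the same code behaves differently.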

Note that you'll get an UnsupportedCharsetException or UnsupportedEncodingException when it's not available, so the code above is safe (in the sense that it won't produce garbage).
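To make the difference between "fails loudly" and "produces garbage" concrete, here is a small sketch. The class and method names are hypothetical, and `x-no-such-charset` is an invented name chosen only because no JVM should provide it. It assumes the file stores ¥ as the single byte 0xA5 (its windows-1252 encoding), which would explain the '?' seen when that byte is decoded as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

class CharsetFailureModes {
    /** True if looking up the charset by name throws UnsupportedCharsetException. */
    static boolean lookupFails(String charsetName) {
        try {
            Charset.forName(charsetName);
            return false;
        } catch (UnsupportedCharsetException e) {
            return true; // unchecked exception from the java.nio API
        }
    }

    /** True if the older String-based API throws UnsupportedEncodingException. */
    static boolean legacyApiFails(String charsetName) {
        try {
            "\u00A5".getBytes(charsetName);
            return false;
        } catch (UnsupportedEncodingException e) {
            return true; // checked exception from the String-named overloads
        }
    }

    /** Decodes the windows-1252 byte for the yen sign with the wrong charset. */
    static String decodeYenByteAsUtf8() {
        byte[] yenInWindows1252 = {(byte) 0xA5};
        // 0xA5 alone is malformed UTF-8, so it silently becomes U+FFFD.
        return new String(yenInWindows1252, StandardCharsets.UTF_8);
    }
}
```

A missing charset is reported immediately, whereas an available but wrong charset decodes without any error and yields the replacement character, which many consoles display as '?'.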

If you want to be really safe, the common approach is to use only UTF-8 encoded data in your projects.

Answer 2

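If I understand your problem correctly, you can usually solve it by saving the text file as UTF-8 in a text editor and then specifying UTF-8 again when you open the file from your Java program.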
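As a rough sketch of that approach using only the JDK (the class name `Utf8RoundTrip` and the temporary file are assumptions for the demo, not part of the original code), the following writes a line containing ¥ as UTF-8 and reads it back with the same charset named explicitly:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class Utf8RoundTrip {
    /** Writes the lines as UTF-8 to a temp file and reads them back as UTF-8. */
    static List<String> roundTrip(List<String> lines) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                // Name the charset on both sides instead of relying on the
                // platform default, which differs between machines.
                Files.write(file, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(List.of("Price: \u00A5100")));
    }
}
```

With Commons IO, as in the question, the equivalent change would presumably be to pass "UTF-8" as the second argument to FileUtils.lineIterator once the file itself has been re-saved as UTF-8.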

Answer 1
2 2013-09-25 10:21:20

Answer 2
1 2013-09-25 18:01:43