
How to print UTF8 when running code with java -jar

I'm writing a project which parses a UTF-8 encoded file.

I'm doing it this way:

ArrayList<String> al = new ArrayList<>();
BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(new FileInputStream(filename), "UTF8"));

String line = null;

while ((line = bufferedReader.readLine()) != null)
{
    al.add(line);
}

return al;

The strange thing is that it reads the file properly when I run it in IntelliJ, but not when I run it through java -jar (it gives me garbage values instead of UTF-8).

What can I do to either

  1. Run my Java through java -jar in the same environment as IntelliJ, or
  2. Fix my code so that it reads UTF-8 into the string?

I think what is going on here is that your terminal is not set up correctly for your default encoding. Basically, if your program runs correctly, it reads the UTF-8 bytes, stores them as Java strings, and then writes them to the terminal in whatever the default encoding scheme is. To find out what your default encoding scheme is, see this question. Then you need to make sure that the terminal you run your java -jar command from is compatible with it. For example, see my terminal settings/preferences on my Mac.

[Image: Mac terminal settings for UTF-8]
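To make the point about the default encoding concrete, here is a minimal sketch (not part of the original answer) that prints the JVM's default charset and then writes a test string through a PrintStream that is forced to UTF-8. The class name EncodingCheck and the helper utf8Stream are hypothetical names chosen for illustration. If the second line still shows garbage, the terminal itself is probably not decoding UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class EncodingCheck {

    // Wraps a stream so that everything printed to it is encoded as UTF-8,
    // regardless of the JVM's default charset.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shows what the JVM uses when no charset is given explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Non-ASCII sample text written as explicit UTF-8 bytes.
        PrintStream out = utf8Stream(System.out);
        out.println("h\u00e9llo \u4f60\u597d");
    }
}
```

One option that is often suggested for matching the IDE environment is to pass the file.encoding system property on the command line, for example java -Dfile.encoding=UTF-8 -jar app.jar (app.jar is a placeholder name). Whether this is what IntelliJ does in your setup is an assumption you would need to verify in the run configuration.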

Oracle docs give a pretty straightforward answer about Charset:

Standard charsets

Every implementation of the Java platform is required to support the following standard charsets. Consult the release documentation for your implementation to see if any other charsets are supported. The behavior of such optional charsets may differ between implementations.

...

UTF-8

Eight-bit UCS Transformation Format

So you should use new InputStreamReader(new FileInputStream(filename), "UTF-8")
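As a sketch of the same idea without a charset-name string at all, the reading loop from the question could be written with the StandardCharsets.UTF_8 constant and try-with-resources. The class name Utf8Lines and the method readLines are hypothetical, and this is one possible variant rather than the asker's actual code. Note that Files.newBufferedReader reports malformed input with an exception instead of silently substituting replacement characters.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class Utf8Lines {

    // Reads every line of the file, decoding its bytes as UTF-8.
    static List<String> readLines(String filename) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(filename), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Expects the path of a UTF-8 encoded file as the first argument.
        for (String line : readLines(args[0])) {
            System.out.println(line);
        }
    }
}
```

Decoding the file this way only fixes the reading side. Printing the resulting strings still goes through the output encoding of System.out, so the terminal settings discussed above continue to matter.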
