Scanner stops reading at a random point when run from an exported .jar, but not when run from Eclipse

Question

Today I ran into some trouble while using the Java Scanner. I use the Scanner class in many places in my project and had never run into a problem before.

Basically, I always do something like this:

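try (Scanner scanner = new Scanner(file)) {
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        // ... process the line ...
    }
} catch (FileNotFoundException e) {
    // ... handle the missing file ...
} finally {
    // ... clean up ...
}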

and the Scanner works just fine, because it is just some simple code. Today, however, I used the code above to read text files with about 17000 lines.

At first the code worked as I expected (when running it through Eclipse), but after exporting the project, the Scanner would stop reading after about 400 lines.

I googled a bit, and in the end I solved the problem thanks to these answers:

Answer #1
Answer #2

All I had to do was change the constructor from

Scanner scanner = new Scanner(file)

to

Scanner scanner = new Scanner(new FileInputStream(file))

It is some weird encoding problem, I get it. But why did the code work flawlessly when I ran it from Eclipse, while the Scanner stopped after reading about 400 lines when I ran it from my exported jar?

The code does exactly the same thing in both cases, because I set up Eclipse to use the same working directory as the exported .jar archive (it has some data subdirectories):

Take the same .gz archive
Extract a file from the .gz
Read the file as shown above

Not sure if it helps, but Eclipse is set up to save source files in UTF-8 format.

Thanks in advance

Answer 1

But why did the code work flawlessly when I ran it from Eclipse, while the Scanner stopped after reading about 400 lines when I ran it from my exported jar?

Scanner has two different constructors that accept a File as an argument. From the docs:

Scanner(File source)

Bytes from the file are converted into characters using the underlying platform's default charset.

and

Scanner(File source, String charsetName)

Bytes from the file are converted into characters using the specified charset.

So if you do not specify the charsetName, the Scanner uses the environment's default charset.

The environment encoding when you run your project outside Eclipse is probably something other than UTF-8. To check whether this is the case, you can write a simple program like this:

import java.nio.charset.Charset;

class CheckDefaultCharset {
    public static void main(String... args) {
        // Prints the charset the JVM uses when none is specified
        System.out.println(Charset.defaultCharset());
    }
}

And run it in both environments.

For example, when running the above code from Eclipse I get:

UTF-8

And when running it in PowerShell (Windows 7), I get:

windows-1252
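To illustrate why the default matters, here is a minimal sketch that decodes the same bytes with both charsets. The class name DecodeComparison and the sample character are made up for illustration and are not taken from your project.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeComparison {
    // Decodes the given bytes with the given charset and returns the text.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "é" (U+00E9) encoded as UTF-8 occupies two bytes: 0xC3 0xA9.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        String asCp1252 = decode(utf8Bytes, Charset.forName("windows-1252"));

        // Lengths are printed instead of the text so the console encoding
        // cannot distort the result.
        System.out.println("UTF-8 decoding yields " + asUtf8.length() + " character(s)");
        System.out.println("windows-1252 decoding yields " + asCp1252.length() + " character(s)");
    }
}
```

The UTF-8 decoding gives back the single original character, while windows-1252 turns each byte into its own character, so the text the Scanner sees depends on which charset the JVM picked up.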

To avoid this type of problem, it is better to always specify the encoding of the files you read when using Scanner.
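As a rough sketch of that advice, the example below passes the charset name explicitly and also checks Scanner.ioException() after the loop. According to the Scanner documentation, if the underlying read throws an IOException the scanner assumes the end of the input has been reached, which would look like reading simply stopping partway. The class name, the helper countLines, and the temporary file are assumptions for illustration only, and UTF-8 is assumed to be the file's real encoding.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class ExplicitCharsetScanner {
    // Counts the lines of a file, decoding with an explicitly named charset
    // instead of the platform default.
    static int countLines(File file, String charsetName) throws IOException {
        int count = 0;
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            // Scanner treats a failed read as end of input; ioException()
            // returns the last such error (or null) so it can be rethrown.
            IOException readError = scanner.ioException();
            if (readError != null) {
                throw readError;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file, written as UTF-8 for the demonstration.
        File file = File.createTempFile("scanner-demo", ".txt");
        file.deleteOnExit();
        List<String> lines = Arrays.asList("first", "caf\u00e9", "third");
        Files.write(file.toPath(), lines, StandardCharsets.UTF_8);

        System.out.println(countLines(file, "UTF-8"));
    }
}
```

With the charset fixed in code, the result no longer depends on whether the program is started from Eclipse or from the exported jar, and a decoding failure surfaces as an exception instead of a silently shortened read.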
