Today I ran into some trouble with the Java Scanner. I use the Scanner class in many places in my project and had never run into a problem before.
Basically, I always do something like this:
try (Scanner scanner = new Scanner(file)) {
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        ...
    }
} catch ...
} finally ...
and the Scanner works just fine, since the code is simple. Today, however, I used the code above to read text files of about 17,000 lines.
At first the code worked as expected (when run from Eclipse), but after I exported the project, the Scanner would stop reading after about 400 lines.
I googled a bit and in the end solved the problem thanks to some existing answers.
All I had to do was change the constructor from
Scanner scanner = new Scanner(file)
to
Scanner scanner = new Scanner(new FileInputStream(file))
It is some weird encoding problem, I get it. But why did the code work flawlessly when I ran it from Eclipse, while the Scanner stopped after reading about 400 lines when I ran it from my exported jar?
The code does exactly the same thing in both cases, because I set up Eclipse to use the same working directory as the exported .jar archive (it needs some data subdirectories).
Not sure if it helps but Eclipse is set up to save source files in UTF-8 format.
Thanks in advance
But why did the code work flawlessly when I ran it from Eclipse, while the Scanner stopped after reading about 400 lines when I ran it from my exported jar?
Scanner has two different constructors that accept a File as an argument. From the docs:
Scanner(File source)
Bytes from the file are converted into characters using the underlying platform's default charset.
and
Scanner(File source, String charsetName)
Bytes from the file are converted into characters using the specified charset.
So if you do not specify the charsetName, it will use the environment's default charset.
The default charset when you run your project outside Eclipse is probably something other than UTF-8. To check whether this is the case, you can write a simple program like this:
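import java.nio.charset.Charset;

class CheckDefaultCharset {
    public static void main(String... args) {
        // Prints the charset the JVM uses when none is specified explicitly
        System.out.println(Charset.defaultCharset());
    }
}
Then run it in both environments.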
For example, when running the above code from Eclipse I get:
UTF-8
And when running in the PowerShell (Windows 7), I get:
windows-1252
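A Scanner does not throw when its underlying source fails; it simply reports that there is no more input, and the swallowed error is available through ioException(). The sketch below is one way to check whether a decoding error is what cut the read short. It assumes, without confirmation from the question, that the file contains bytes that are invalid in the charset being used; the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Scanner;

// Hypothetical diagnostic helper; the names are illustrative only.
class ScannerErrorCheck {

    // Reads the file to the end with the given charset and returns the
    // IOException the Scanner swallowed, or null if reading ended normally.
    static IOException readAndGetError(File file, String charsetName)
            throws FileNotFoundException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
            }
            // Non-null means the loop ended because of an I/O error,
            // not because the end of the file was reached.
            return scanner.ioException();
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Assumption: args[0] is the file path, args[1] the charset name to test.
        IOException error = readAndGetError(new File(args[0]), args[1]);
        System.out.println(error == null ? "Read to end of file" : "Stopped early: " + error);
    }
}
```

If the reported error is a character-coding exception, the file's real encoding does not match the charset the Scanner was using in that environment.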
To avoid this type of problem, it is better to always specify the encoding of the files you intend to read when using Scanner.
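As a minimal sketch of that advice, the loop from the question can pass the charset name to the two-argument constructor. This assumes the files are actually encoded in UTF-8, which the question does not state; the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; the names are illustrative only.
class ExplicitCharsetScanner {

    // Reads all lines of the file, decoding its bytes as UTF-8
    // regardless of the platform's default charset.
    static List<String> readLines(File file) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(file, StandardCharsets.UTF_8.name())) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Assumption: the path is passed as the first argument and the file is UTF-8.
        for (String line : readLines(new File(args[0]))) {
            System.out.println(line);
        }
    }
}
```

With the charset fixed in code, the result no longer depends on whether the program is launched from Eclipse or from the exported jar.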