
Reading a text file with a Scanner in Java - Token's return character

I'm trying to read the text file below with a java.util.Scanner in a simple Java program.

0001;GUAJARA-MIRIM;RO
0002;ALTO ALEGRE DOS PARECIS;RO
0003;PORTO VELHO;RO

I read the text file using the code below:

scanner = new Scanner(filerader).useDelimiter("\\;|\\n");
while (scanner.hasNext()) {
    int id= scanner.nextInt();
    String name = scanner.next();
    String code = scanner.next();

    System.out.printf(".%s.%s.%d.\n", name, code, id);
}

The results are:

.GUAJARA-MIRIM.RO.1
.
.ALTO ALEGRE DOS PARECIS.RO.2
.
.PORTO VELHO.RO.3
.

But the third token of each line has an inconvenient '\r' character at the end (ASCII code 13), and I have no idea why. (I used the '.' character in the format string to make it clear where the '\r' is.)

So,

  1. Why is there a '\r' at the end of the third token?
  2. How can I get rid of it?

It is very simple to use a workaround like code.substring(0, 2), but I want to understand why there's a '\r' character there.

Your file uses Windows line endings: \r\n (carriage return + line feed). Unix uses only \n (line feed). Since your delimiter only matches \n, the \r before it is left attached to the end of the third token.

To fix this, add \r to your scanner delimiter.
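As a rough sketch of that fix, the delimiter below treats an optional \r followed by \n as a single separator, so it should work for both Windows and Unix files. The class name CrLfDelimiterDemo and the parse helper are made up for illustration, and the input is an in-memory String rather than the file from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the names are not from the question.
class CrLfDelimiterDemo {

    // Splits "id;name;code" records on ';' and on "\r\n" or "\n".
    static List<String> parse(String input) {
        List<String> rows = new ArrayList<>();
        // "\\r?\\n" matches an optional carriage return followed by a line feed.
        try (Scanner scanner = new Scanner(input).useDelimiter(";|\\r?\\n")) {
            while (scanner.hasNext()) {
                int id = scanner.nextInt();
                String name = scanner.next();
                String code = scanner.next();
                rows.add(String.format(".%s.%s.%d.", name, code, id));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same data as in the question, with Windows line endings.
        String input = "0001;GUAJARA-MIRIM;RO\r\n"
                + "0002;ALTO ALEGRE DOS PARECIS;RO\r\n"
                + "0003;PORTO VELHO;RO\r\n";
        // First line printed: .GUAJARA-MIRIM.RO.1.
        parse(input).forEach(System.out::println);
    }
}
```

Note that a blank line in the file would produce an empty token with this delimiter, so this assumes every line holds exactly one record.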

On some operating systems (notably Windows), \r\n is used as the newline sequence. Your delimiter only includes \n, so the \r is left over at the end of the token. Add \r to your delimiters as well.

To make your code a little more robust, use System.lineSeparator() to get the newline sequence and build the delimiter from it.
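A minimal sketch of the System.lineSeparator() idea, with invented names (LineSeparatorDelimiterDemo, delimiterRegex, codes). One caveat worth stating: System.lineSeparator() reports the separator of the platform the JVM runs on, not of the file, so this only helps when the file was written on the same kind of system that reads it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not from the question.
class LineSeparatorDelimiterDemo {

    // Delimiter regex: ';' or the current platform's line separator.
    // Pattern.quote keeps the separator from being read as regex syntax.
    static String delimiterRegex() {
        return ";|" + Pattern.quote(System.lineSeparator());
    }

    // Returns the third field (the code) of each "id;name;code" record.
    static List<String> codes(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(delimiterRegex())) {
            while (scanner.hasNext()) {
                scanner.nextInt();          // id, not needed here
                scanner.next();             // name, not needed here
                result.add(scanner.next()); // code
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The input is built with the platform separator, so it matches the delimiter.
        String eol = System.lineSeparator();
        String input = "0001;GUAJARA-MIRIM;RO" + eol + "0003;PORTO VELHO;RO" + eol;
        for (String code : codes(input)) {
            System.out.println(code + " has length " + code.length());
        }
    }
}
```

If the file may come from another platform, a pattern that accepts both endings (such as an optional \r before \n) is probably the safer choice.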

The reason why this happens has already been given. Another way to avoid it is to read each line with scanner.nextLine() and then split it on ';'.
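Roughly, the nextLine() approach could look like the sketch below. Scanner.nextLine() strips the line terminator itself, whether it is \r\n or \n, so no custom delimiter is needed. The class name NextLineSplitDemo and the parse helper are hypothetical, and the input is again an in-memory String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the names are not from the question.
class NextLineSplitDemo {

    // Reads the input line by line and splits each line on ';'.
    static List<String[]> parse(String input) {
        List<String[]> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine(); // terminator already removed
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                records.add(line.split(";"));
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "0001;GUAJARA-MIRIM;RO\r\n"
                + "0002;ALTO ALEGRE DOS PARECIS;RO\r\n"
                + "0003;PORTO VELHO;RO\r\n";
        for (String[] fields : parse(input)) {
            int id = Integer.parseInt(fields[0]);
            System.out.printf(".%s.%s.%d.%n", fields[1], fields[2], id);
        }
    }
}
```

This keeps the line handling and the field handling separate, which also makes it easier to report a malformed line by its content.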

