
Parsing each line in a text file in Java

I have a text file that has the following lines:

150004|2012|12|15|0|0|3|0|0|-3.2411|83.9962|156.3321|1.1785|205.3125|2.0599
150004|2012|12|15|0|10|3|0|0|-3.4206|85.9575|150.4877|1.4142|226.7578|2.4276
150004|2012|12|15|0|20|3|0|0|-2.2696|86.2675|149.3848|2.1553|225.7031|3.4387

Each '|' character separates two columns. I need to extract the values between the '|' characters on each line. When I try the following code:

File filer = new File("C:\\Users\\Ali Y. Akgul\\Desktop\\150004_15122012_G.txt");
        try (BufferedReader reader = new BufferedReader(new FileReader(filer))) {
            while (true) {
                String line = reader.readLine();
                if (line == null) {
                    break;
                }
                String[] fields = line.split("|");
                // process fields here
                for(int i=0;i<=fields.length;i++){
                    System.out.println(fields[i]);
                }
            }
        }
}

it gives me:

1
5
0
0
0
4
|
2
0
1
2
|
1
2
|
1
5
|
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 76
0
|
0
|
3
|
0
|
0
|
-
3
.
2
4
    at testenv.TestEnv.main(TestEnv.java:31)
1
1
|
8
3
.
9
9
6
2
|
1
5
6
.
3
3
2
1
|
1
.
1
7
8
5
|
2
0
5
.
3
1
2
5
|
2
.
0
5
9
9
Java Result: 1

How can I parse it correctly?

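This is because String.split takes a regular expression.

In a regex, the | character is a special character meaning "either the pattern on its left OR the pattern on its right". An unescaped "|" therefore matches the empty string, so the line is split between every character. To match a literal |, it has to be escaped with a backslash, which is written as \\ inside a Java string literal.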

The correct syntax is:

String[] fields = line.split("\\|");
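To make the difference concrete, here is a minimal sketch using the first sample line from the question. The class and method names (PipeSplitDemo, splitEscaped, splitQuoted) are made up for illustration; Pattern.quote is a standard alternative that does the escaping for you.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Splits on a literal pipe: the Java literal "\\|" is the regex \|.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Same result, but Pattern.quote builds the escaped regex for us.
    static String[] splitQuoted(String line) {
        return line.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String line = "150004|2012|12|15|0|0|3|0|0|-3.2411|83.9962|156.3321|1.1785|205.3125|2.0599";

        // Unescaped: the regex matches the empty string, so the line is
        // broken up character by character instead of field by field.
        System.out.println(line.split("|").length);

        // Escaped or quoted: one array element per column.
        System.out.println(Arrays.toString(splitEscaped(line)));
        System.out.println(splitQuoted(line).length);
    }
}
```

Note that split(regex) discards trailing empty strings, so a line ending in "||" would yield fewer columns than expected; if that matters for your data, line.split("\\|", -1) keeps them.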

Also, note that I initially missed the issue with the for loop, but it needs fixing too; it is why the ArrayIndexOutOfBoundsException rears its ugly head:

for(int i=0;i<=fields.length;i++) 

needs to be

for(int i=0;i<fields.length;i++)

(The '<=' must be '<')

There is also the issue with your regex, as pointed out in the other answers.

| is a special character in regex which acts as an OR operator, so you'll need to escape it in the expression:

String[] fields = line.split("\\|");

Instead of for(int i=0;i<=fields.length;i++){ use for(int i=0;i<fields.length;i++){, that is, use < rather than <= in the loop condition.

It seems that you have a boundary issue in the following lines:

for(int i=0;i<=fields.length;i++){
   System.out.println(fields[i]);
}

should be

for(int i=0;i<fields.length;i++){
   System.out.println(fields[i]);
}
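If the index itself is not needed, an enhanced for loop sidesteps this kind of off-by-one error entirely. A small sketch (the class and method names here are hypothetical, not from the question):

```java
class PrintFieldsDemo {
    // Builds the same one-field-per-line output as the original loop,
    // but iterates over the elements directly, so there is no index to get wrong.
    static String joinLines(String[] fields) {
        StringBuilder out = new StringBuilder();
        for (String field : fields) {
            out.append(field).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] fields = "150004|2012|12|15".split("\\|");
        System.out.print(joinLines(fields));
    }
}
```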

Try this:

import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

Path file = Paths.get("C:\\Users\\Ali Y. Akgul\\Desktop\\150004_15122012_G.txt");

// readAllLines returns a List<String> and may throw IOException.
List<String> lines = Files.readAllLines(file, Charset.defaultCharset());
List<String[]> columns = new ArrayList<>();
for (String line : lines) {
    // split takes a regex, so the pipe must be escaped.
    columns.add(line.split("\\|"));
}

// Now for each line you have columns.
for (String[] s : columns) {
    System.out.println(Arrays.toString(s));
}

// To get only the values for column 8 onwards (in response to your comment)
for (String[] s : columns) {
    String[] sublist = Arrays.copyOfRange(s, 8, s.length);
    System.out.println(Arrays.toString(sublist));
}

// To get only the columns from line 8 onwards (index 8, counting from 0)
for (int i = 8; i < columns.size(); i++) {
    System.out.println(Arrays.toString(columns.get(i)));
}
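One caveat with the "column 8 onwards" step: Arrays.copyOfRange throws an exception when a row has fewer than 8 fields, for example a blank or truncated line. The sketch below adds a guard for that case; the helper names (tail, tails) and the "short|row" test line are assumptions for illustration, and it assumes the start index is not negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ColumnTailDemo {
    // Returns the fields from index `from` onwards,
    // or an empty array when the row is too short to have any.
    static String[] tail(String[] row, int from) {
        if (from >= row.length) {
            return new String[0];
        }
        return Arrays.copyOfRange(row, from, row.length);
    }

    // Splits every line on a literal '|' and keeps only the tail of each row.
    static List<String[]> tails(List<String> lines, int from) {
        List<String[]> result = new ArrayList<>();
        for (String line : lines) {
            result.add(tail(line.split("\\|"), from));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "150004|2012|12|15|0|0|3|0|0|-3.2411|83.9962|156.3321|1.1785|205.3125|2.0599",
            "short|row"); // hypothetical malformed line
        for (String[] t : tails(lines, 8)) {
            System.out.println(Arrays.toString(t));
        }
    }
}
```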

The condition should use less-than: for(int i=0;i<fields.length;i++)


 