Parsing a CSV file in Java when the data itself contains " characters

My CSV file has the following text:

a, b, 0, "0, 1, 2", ""ab cd", 5", 10

My regex:

aColumnValue = dataRow.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");

where aColumnValue is a String array.

This regex fails: at the '"' before ab it treats the quoted field as closed and starts looking for the next token.

Please help me find the correct regex.

The correct token count is 6, and the tokens should be:

a
b
0
0, 1, 2
"ab cd", 5
10

Thanks in advance.

Do not parse CSV with a regex. Use a library that already does this well, for example OpenCSV or Apache Commons CSV.

A regex may hit further issues beyond this one. You should use a CSV parser such as opencsv: http://opencsv.sourceforge.net/
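To illustrate what a parser tracks that a single split regex does not, here is a minimal, standard-library-only sketch of a quote-aware line splitter. It is not the code of OpenCSV or Apache Commons CSV. The class and method names are hypothetical, and it assumes the line has been rewritten in the common convention where a literal quote inside a quoted field is doubled (so the fifth field is written as """ab cd"", 5"), which is not how the line in the question is written.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplit {
    // Hypothetical helper: splits one CSV line on commas that are outside quotes.
    // Assumption: inside a quoted field, a doubled quote ("") stands for one literal quote.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote, keep one
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // commas inside quotes are plain data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // The question's data, rewritten with doubled inner quotes and no padding spaces.
        String line = "a,b,0,\"0, 1, 2\",\"\"\"ab cd\"\", 5\",10";
        for (String field : splitLine(line)) {
            System.out.println(field);
        }
    }
}
```

A real library additionally handles multi-line fields, configurable separators, and malformed input, which is why the answers recommend one over hand-written code.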

String input = "a, b, 0, \"0, 1, 2\", \"\"ab cd\", 5\", 10";
String[] parts = input.split(",(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))");

The parts variable contains:

a
 b
 0
 "0, 1, 2"
 ""ab cd", 5"
 10

You will likely still need to trim the leading spaces and strip the enclosing " from each token.
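A minimal sketch of that cleanup step, assuming each token should be trimmed and then lose at most one pair of enclosing quotes. The class and method names are hypothetical and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;

public class RegexSplitCleanup {
    // Hypothetical helper: split with the regex from the answer, then tidy each token.
    static List<String> splitAndClean(String input) {
        String[] parts = input.split(",(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))");
        List<String> cleaned = new ArrayList<>();
        for (String part : parts) {
            String token = part.trim(); // drop the space left after each comma
            // Strip exactly one pair of enclosing quotes, if present.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            cleaned.add(token);
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String input = "a, b, 0, \"0, 1, 2\", \"\"ab cd\", 5\", 10";
        for (String token : splitAndClean(input)) {
            System.out.println(token);
        }
    }
}
```

For the sample line this yields the six tokens a, b, 0, then 0, 1, 2, then "ab cd", 5, and finally 10. Only the outermost quote pair is removed, so the inner quotes around ab cd are kept.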
