Parsing CSV with a regex in Java - escaping a double quote within a cell

I am looking for a Java regex that handles (escapes) a double quote inside an Excel cell.

I followed the example linked below, but the regular expression needs one more change to cope with a double quote inside one of the cells.

Parsing CSV input with a RegEx in java

private final Pattern pattern = Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?=,|$)");

Example Data:

"A,B" , "2" size" , "text1,text2, text3"

The regex above fails at 2" (the unescaped double quote inside the second cell).

I want the output to be as shown below. It doesn't matter whether the outer double quotes are kept or not.

"A,B"
"2" size"
"text1,text2, text3"

While I agree that using a regex to parse CSV is not really the best approach, a slightly better pattern is:

Pattern pattern = Pattern.compile("^\"([^\"]*)\",|,\"([^\"]*)\",|,\"([^\"]*)\"$|(?<=,|^)([^,]*)(?=,|$)");

This terminates a cell value only at a quote followed by a comma, and starts one only after a comma followed by a quote.
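As a rough illustration of that rule, here is a minimal sketch. It is not the pattern from this answer: because a `[^\"]*` group cannot span an embedded quote, the sketch swaps in a reluctant `.*?` followed by a closing quote that must be trailed by optional whitespace and a comma or the end of the input. The class name `QuotedCellMatcher` and the method `parseCells` are made up for the example, and it assumes every cell is quoted, as in the sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedCellMatcher {
    // Assumption: every cell is wrapped in double quotes.
    // A cell is an opening quote, as little text as possible, then a closing
    // quote followed by optional whitespace and either a comma or end of input.
    private static final Pattern CELL = Pattern.compile("\"(.*?)\"\\s*(?:,|$)");

    // Hypothetical helper: returns the cell contents without the outer quotes.
    static List<String> parseCells(String line) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(line);
        while (m.find()) {
            cells.add(m.group(1));
        }
        return cells;
    }

    public static void main(String[] args) {
        String line = "\"A,B\" , \"2\" size\" , \"text1,text2, text3\"";
        for (String cell : parseCells(line)) {
            System.out.println(cell);
        }
    }
}
```

For the example line this should print A,B, then 2" size, then text1,text2, text3, one per line. A cell whose text itself contains a quote followed by a comma would still be split in the wrong place, so the input stays ambiguous in general.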

Well, as FJ commented, the input data is ambiguous. But for your example input, you could try the following:

  • call string.split("\"\\s*,\\s*\"") to get a String[] (the \\s* parts absorb the spaces around the commas in the example). This gives an array with 3 elements, shown one per line here because the elements themselves contain commas:
      "A,B
      2" size
      text1,text2, text3"
  • remove the first character (a double quote) of the first element of the array
  • remove the last character (a double quote) of the last element of the array
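The steps above can be sketched roughly as follows. The class name `QuotedCsvSplit` and the method `splitCells` are hypothetical, and the separator regex assumes that cells are separated by quote, optional whitespace, comma, optional whitespace, quote, as in the example line.

```java
import java.util.Arrays;

class QuotedCsvSplit {
    // Hypothetical helper: splits on quote-comma-quote and trims the outer quotes.
    static String[] splitCells(String line) {
        // Split on the separator between two quoted cells.
        String[] cells = line.trim().split("\"\\s*,\\s*\"");
        // Drop the opening quote left on the first element.
        if (cells[0].startsWith("\"")) {
            cells[0] = cells[0].substring(1);
        }
        // Drop the closing quote left on the last element.
        int last = cells.length - 1;
        if (cells[last].endsWith("\"")) {
            cells[last] = cells[last].substring(0, cells[last].length() - 1);
        }
        return cells;
    }

    public static void main(String[] args) {
        String line = "\"A,B\" , \"2\" size\" , \"text1,text2, text3\"";
        System.out.println(Arrays.toString(splitCells(line)));
    }
}
```

This only works when every cell is quoted. A cell that itself contains the quote-comma-quote sequence would still be split in the wrong place, which is the ambiguity mentioned above.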
