
Java regex to remove unwanted double quotes in CSV

I have a CSV file that contains the following line. As you can see, numbers are NOT enclosed in double quotes.

String theLine = "Corp:Industrial","5Nearest",51.93000000,"10:21:29","","","","10:21:29","7/5/2016","PER PHONE CALL WITH SAP, CORRECTING "C","359/317 97 SMRD 96.961 MADV",""

I read the line above and try to split it using this regex:

String[] tokens = theLine.split(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

This doesn't split at every field-separating comma the way I want. The field "PER PHONE CALL WITH SAP, CORRECTING "C", breaks it because it contains an additional comma and an additional double quote. Can someone please help me write a regex that handles an additional double quote and a comma that appear within two double quotes?

I basically want:

"Corp:Industrial","5Nearest",51.93000000,"10:21:29","","","","10:21:29","7/5/2016","**PER PHONE CALL WITH SAP CORRECTING C**","359/317 97 SMRD 96.961 MADV",""

There are jobs that parsers are much better at than regular expressions, and this sort of thing is typically one of them. I'm not saying you can't make a regex work for you, but there are also open-source CSV parsers you could press into service.

Having said that, your CSV looks suspect to me.

"PER PHONE CALL WITH SAP, CORRECTING "C",

That value has three quotes in it -- is it meant to represent a string with only a single quote inside? Or should the C be surrounded by quotes as well as the String?
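A quick way to spot lines like this before splitting is to count the double quotes. The sketch below is only illustrative: the class and method names (QuoteCheck, countQuotes, hasBalancedQuotes) are made up for this example. An even count does not prove the line is well-formed, but an odd count means at least one quote is unpaired, so any split that relies on quote pairs cannot be trusted.

```java
class QuoteCheck {
    // Counts the double-quote characters in one CSV line or field.
    static long countQuotes(String text) {
        return text.chars().filter(c -> c == '"').count();
    }

    // An odd count means at least one quote has no partner.
    // An even count is necessary for well-formed CSV, but not sufficient.
    static boolean hasBalancedQuotes(String text) {
        return countQuotes(text) % 2 == 0;
    }

    public static void main(String[] args) {
        // The suspect field from the question, without its trailing comma.
        String field = "\"PER PHONE CALL WITH SAP, CORRECTING \"C\"";
        System.out.println(countQuotes(field));       // 3
        System.out.println(hasBalancedQuotes(field)); // false
    }
}
```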

Normally if you're going to include a double quote inside a double quote you need a special syntax for it. For CSV, the most common options would be doubling it, or escaping it with a character like a backslash:

"PER PHONE CALL WITH SAP, CORRECTING ""C""",

Or:

"PER PHONE CALL WITH SAP, CORRECTING \"C\"",
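If you control whatever writes the file, the doubling convention is easy to apply when a field is written. This is a minimal sketch, assuming the intended value really is PER PHONE CALL WITH SAP, CORRECTING "C" with the C in quotes; CsvQuote and quote are hypothetical names used only here.

```java
class CsvQuote {
    // Wraps a raw value in double quotes and doubles every embedded double quote,
    // so the quote count of the resulting field is always even.
    static String quote(String raw) {
        return "\"" + raw.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String raw = "PER PHONE CALL WITH SAP, CORRECTING \"C\"";
        // Prints: "PER PHONE CALL WITH SAP, CORRECTING ""C"""
        System.out.println(quote(raw));
    }
}
```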

Neither option directly solves the problem of doing this with regular expressions, but once you have well-formed CSV, your odds of parsing it successfully go up.
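To illustrate, here is a sketch that applies the regex from the question to a version of the line where the embedded quotes are doubled. It assumes the doubled-quote convention and that no field spans several lines; CsvSplit, parse, and unquote are hypothetical names for this example. The lookahead only allows a split at a comma followed by an even number of quotes, which is why the comma after SAP is left alone once the quotes are balanced.

```java
import java.util.ArrayList;
import java.util.List;

class CsvSplit {
    // The regex from the question: split on a comma only when an even
    // number of double quotes follows it on the line.
    static final String SPLIT_REGEX = ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";

    // Splits one well-formed line and returns the unquoted field values.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 keeps trailing empty tokens (for example a line ending in a comma).
        for (String token : line.split(SPLIT_REGEX, -1)) {
            fields.add(unquote(token));
        }
        return fields;
    }

    // Strips the surrounding quotes and collapses each doubled quote to a single one.
    // Unquoted tokens, such as numbers, are returned unchanged.
    static String unquote(String token) {
        if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
            return token.substring(1, token.length() - 1).replace("\"\"", "\"");
        }
        return token;
    }

    public static void main(String[] args) {
        String line = "\"Corp:Industrial\",\"5Nearest\",51.93000000,"
                + "\"PER PHONE CALL WITH SAP, CORRECTING \"\"C\"\"\",\"\"";
        for (String field : parse(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

If you then want the value without the comma and the quotes, as in the question, you can apply .replace(",", "").replace("\"", "") to the parsed field. For anything beyond simple lines like this one, a CSV parser is still the safer choice.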
