
Java: How to remove all line breaks between double quotes

I have a large CSV file that I am parsing in Java. The problem is that some of the text fields, which are enclosed in double quotes, contain line breaks. I am trying to remove all line breaks inside the quoted fields, but have not been successful so far.

For example, I have the following CSV:

"Test Line wo line break"; "Test Line 
with line break"
"Test Line2 wo line break"; "Test Line2 
with line break"

The result should be:

"Test Line wo line break"; "Test Line with line break"
"Test Line2 wo line break"; "Test Line2 with line break"

I have tried the following so far:

s.replaceAll("(\\w)*\r\n", "$1");

But this unfortunately replaces all line breaks, including the ones at the end of each line.

Then I added the double quotes to the regex:

s.replaceAll("\"(\\w)*\r\n\"", "$1");

But with this, unfortunately, nothing gets replaced at all.

Can you please help me find out what I am doing wrong here?

Thanks in advance

You can match every substring between double quotation marks with a simple "[^"]*" regex and remove the line breaks inside each match:

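import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "\"Test Line wo line break\"; \"Test Line \nwith line break\"\n\"Test Line2 wo line break\"; \"Test Line2 \nwith line break\"";
StringBuffer result = new StringBuffer();
// Match each double-quoted field: a quote, any run of non-quote characters, a quote.
Matcher m = Pattern.compile("\"[^\"]*\"").matcher(s);
while (m.find()) {
    // Strip line breaks from the match only; quoteReplacement keeps any $ or \ in the field literal.
    m.appendReplacement(result, Matcher.quoteReplacement(m.group().replaceAll("\\R+", "")));
}
m.appendTail(result); // copy whatever follows the last quoted field
System.out.println(result);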

Output:

"Test Line wo line break"; "Test Line with line break"
"Test Line2 wo line break"; "Test Line2 with line break"


Note that .replaceAll("\\R+", "") matches one or more consecutive line break sequences and removes them only from the text that "[^"]*" matched. The \R line break matcher requires Java 8 or later.
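On Java 9 or later, the same idea can be written without the explicit find loop, because Matcher.replaceAll accepts a function that computes the replacement for each match. The following is only a sketch of that variant; the class and method names (QuotedLineBreakRemover, removeQuotedLineBreaks) are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedLineBreakRemover {
    // One double-quoted field: a quote, any run of non-quote characters, a quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Removes every line break sequence that occurs inside a quoted field.
    // Line breaks outside of quotes (the record separators) are left alone.
    static String removeQuotedLineBreaks(String csv) {
        return QUOTED.matcher(csv).replaceAll(match ->
                // The lambda's result is interpreted as a replacement string,
                // so $ and \ in the field must be escaped with quoteReplacement.
                Matcher.quoteReplacement(match.group().replaceAll("\\R+", "")));
    }

    public static void main(String[] args) {
        String s = "\"Test Line wo line break\"; \"Test Line \nwith line break\"\n"
                 + "\"Test Line2 wo line break\"; \"Test Line2 \nwith line break\"";
        System.out.println(removeQuotedLineBreaks(s));
    }
}
```

Like the loop version, this assumes the whole input fits in a single String and that every quoted field is properly closed.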

I wouldn't recommend parsing CSV yourself if you can avoid it. In general, parsing raw text often becomes a hassle because you need to deal with all sorts of exceptions. For instance, you quite easily reach the point where regular expressions are not enough and you need to be able to parse context-free grammars.

There are some library options for parsing CSV here: CSV parsing in Java - working example..?
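If a library is not an option and the file is too big to load into one String, a small quote-aware filter may be enough for this specific task. The sketch below is not a CSV parser. It assumes that double quotes are used only to delimit fields and that a literal quote inside a field is written as "". All names in it (QuoteAwareLineBreakFilter, copyWithoutQuotedLineBreaks, filter) are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class QuoteAwareLineBreakFilter {
    // Copies 'in' to 'out' one character at a time, dropping \r and \n
    // whenever the current position is inside a double-quoted field.
    static void copyWithoutQuotedLineBreaks(Reader in, Writer out) throws IOException {
        boolean inQuotes = false;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '"') {
                // An escaped quote ("") toggles the flag twice,
                // so the state after it is unchanged.
                inQuotes = !inQuotes;
            }
            if (inQuotes && (c == '\r' || c == '\n')) {
                continue; // skip line breaks inside quotes
            }
            out.write(c);
        }
    }

    // Convenience wrapper for in-memory strings.
    static String filter(String csv) {
        StringWriter out = new StringWriter();
        try {
            copyWithoutQuotedLineBreaks(new StringReader(csv), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filter("\"Test Line wo line break\"; \"Test Line \r\nwith line break\""));
    }
}
```

For a real file, the Reader and Writer would typically be a BufferedReader and BufferedWriter over the input and output files, so that memory use stays constant regardless of file size.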
