
Regex to remove line break within double quote in CSV

Hi, I have a CSV file with an error in it, and I want to correct it with a regular expression. Some of the fields contain a line break, as in the example below:

"AHLR150","CDS","-1","MDCPBusinessRelationshipID",,,"Investigating","1600 Amphitheatre Pkwy

California",,"Mountain View",,"United States",,"California",,,"94043-1351","9958"

The two lines above should be a single line:

"AHLR150","CDS","-1","MDCPBusinessRelationshipID",,,"Investigating","1600 Amphitheatre PkwyCalifornia",,"Mountain View",,"United States",,"California",,,"94043-1351","9958"

I tried the regex below, but it didn't help:

%s/\\([^\"]\\)\\n/\\1/

Try this:

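import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RemoveLineBreaks {
    public static void main(String[] args) {
        String input = "\"AHLR150\",\"CDS\",\"-1\",\"MDCPBusinessRelationshipID\","
                + ",,\"Investigating\",\"1600 Amphitheatre Pkwy\n"
                + "California\",,\"Mountain View\",,\"United\n"
                + "States\",,\"California\",,,\"94043-1351\",\"9958\"\n";

        // Match every quoted field in order, so that a closing quote is never
        // paired with the opening quote of the next field or row.
        Matcher matcher = Pattern.compile("\"([^\"]*)\"").matcher(input);
        Pattern patternRemoveLineBreak = Pattern.compile("[\n\r]+");

        String result = input;
        while (matcher.find()) {
            String quoteWithLineBreak = matcher.group(1);
            String quoteNoLineBreaks = patternRemoveLineBreak.matcher(quoteWithLineBreak).replaceAll(" ");
            // Only touch fields that actually contained a line break.
            // String.replace is literal, so regex metacharacters in the field are safe.
            if (!quoteNoLineBreaks.equals(quoteWithLineBreak)) {
                result = result.replace(quoteWithLineBreak, quoteNoLineBreaks);
            }
        }

        // Output
        System.out.println(result);
    }
}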

Output:

"AHLR150","CDS","-1","MDCPBusinessRelationshipID",,,"Investigating","1600 Amphitheatre Pkwy California",,"Mountain View",,"United States",,"California",,,"94043-1351","9958"
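If you would rather avoid regular expressions, the same idea can be sketched as a single pass over the text that tracks whether the current character is inside a quoted field. This is only a minimal sketch, not a full CSV parser: the class name `QuotedLineBreakJoiner` and the method `joinQuotedLineBreaks` are made up for illustration, and it assumes that quotes inside a field are escaped by doubling them (`""`), in which case the toggle flips twice and the state stays correct.

```java
class QuotedLineBreakJoiner {

    /** Replaces each run of CR/LF characters inside a double-quoted field with one space. */
    static String joinQuotedLineBreaks(String csv) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // Every quote toggles the state; an escaped "" toggles twice.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes && (c == '\n' || c == '\r')) {
                // Swallow the rest of the line-break run (e.g. \r\n or blank lines).
                while (i + 1 < csv.length()
                        && (csv.charAt(i + 1) == '\n' || csv.charAt(i + 1) == '\r')) {
                    i++;
                }
                out.append(' ');
            } else {
                // Line breaks outside quotes are row separators and are kept.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "\"Investigating\",\"1600 Amphitheatre Pkwy\n"
                + "California\",,\"Mountain View\"\n"
                + "\"next\",\"row\"\n";
        System.out.print(joinQuotedLineBreaks(input));
    }
}
```

Change `out.append(' ')` to append nothing if the two halves should be joined without a space, as in the question's expected output.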

Based on this, you can try:

/\r?\n|\r/

I checked it here and it seems to work fine.
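Note that this pattern matches every line break, so applied to the whole file it would also join separate rows. One way to use it from Java is to apply it only to the text of each quoted field. The sketch below is an assumption about how that could look, not something from the linked page: the class `QuotedFieldRegex` and the method `stripBreaksInQuotes` are hypothetical names, and the line break is replaced with nothing, to match the expected output in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedFieldRegex {
    // A double-quoted field: opening quote, any non-quote characters, closing quote.
    private static final Pattern QUOTED_FIELD = Pattern.compile("\"[^\"]*\"");
    // The line-break pattern from this answer, written as a Java string.
    private static final Pattern LINE_BREAK = Pattern.compile("\\r?\\n|\\r");

    static String stripBreaksInQuotes(String csv) {
        Matcher m = QUOTED_FIELD.matcher(csv);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Remove line breaks inside this field only.
            String cleaned = LINE_BREAK.matcher(m.group()).replaceAll("");
            // quoteReplacement keeps '$' and '\' in the field from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(out); // text after the last quoted field, including row breaks
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "\"Investigating\",\"1600 Amphitheatre Pkwy\n"
                + "California\",,\"Mountain View\"\n";
        System.out.print(stripBreaksInQuotes(input));
    }
}
```

Because the matcher consumes quoted fields from left to right, a closing quote is never paired with the opening quote of the next row, and line breaks between rows are left alone.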
