
Need help with a regex that avoids splitting a string inside "

I need to split a String using a comma as the separator, but if part of the string is enclosed in ", splitting must stop for that portion, from the opening " to the closing one, even if it contains commas in between.

Can anyone please help me solve this using a regex with lookaround?

Resurrecting this question because it had a simple regex solution that wasn't mentioned. This situation sounds very similar to "regex-match a pattern unless..." (see the Reference at the end of this answer).

\"[^\"]*\"|(,)

The left side of the alternation matches complete double-quoted strings. We will ignore these matches. The right side matches and captures commas to Group 1, and we know they are the right ones because they were not matched by the expression on the left.

Here is working code:

import java.util.regex.*;

class Program {
    public static void main (String[] args) {

        String subject = "\"Messages,Hello\",World,Hobbies,Java\",Programming\"";
        // Left branch: a complete quoted string (ignored). Right branch: a real separator, captured in Group 1.
        Pattern regex = Pattern.compile("\"[^\"]*\"|(,)");
        Matcher m = regex.matcher(subject);
        StringBuffer b = new StringBuffer();
        while (m.find()) {
            // Group 1 is set only for commas outside quotes: mark them for splitting.
            if(m.group(1) != null) m.appendReplacement(b, "SplitHere");
            // Otherwise put the quoted string back unchanged; quoteReplacement keeps $ and \ literal.
            else m.appendReplacement(b, Matcher.quoteReplacement(m.group(0)));
        }
        m.appendTail(b);
        String replaced = b.toString();
        String[] splits = replaced.split("SplitHere");
        for (String split : splits)
            System.out.println(split);
    } // end main
} // end Program
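
The placeholder approach assumes the text never contains the marker string itself. As a variation on the same alternation trick, here is a minimal sketch that skips the placeholder and slices the fields directly at the positions of the Group 1 commas. The class name QuoteAwareSplit and the helper split are hypothetical names chosen for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareSplit {
    // Left branch consumes complete quoted strings; right branch captures a real separator.
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]*\"|(,)");

    // Hypothetical helper (not from the original answer): returns the fields of subject.
    static List<String> split(String subject) {
        List<String> fields = new ArrayList<>();
        Matcher m = TOKEN.matcher(subject);
        int start = 0; // index where the current field begins
        while (m.find()) {
            if (m.group(1) != null) { // a comma that is not inside a quoted string
                fields.add(subject.substring(start, m.start(1)));
                start = m.end(1);
            }
            // Matches of the left branch (quoted strings) are simply skipped.
        }
        fields.add(subject.substring(start)); // the last field
        return fields;
    }

    public static void main(String[] args) {
        String subject = "\"Messages,Hello\",World,Hobbies,Java\",Programming\"";
        for (String field : split(subject)) {
            System.out.println(field);
        }
    }
}
```

Unlike String.split with its default limit, this sketch keeps trailing empty fields, which may or may not be what you want.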

Reference

  1. How to match pattern except in situations s1, s2, s3

Please try this:

(?<!\G\s*"[^"]*),

If you put this regex in your program, it should be:

String regex = "(?<!\\G\\s*\"[^\"]*),";


But two things are not clear:

  1. Does a " only start right next to a , , or can it start in the middle of the content, such as AAA, BB"CC,DD" ? The regex above only deals with a " that starts near a , .

  2. If the content itself contains a " , how is it escaped? With "" or \" ? The regex above does not deal with any escaped " format.
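
For comparison, here is a sketch of a different lookaround formulation, one that uses a lookahead instead of the lookbehind above; it is not this answer's regex. A comma counts as a separator only when an even number of " characters follows it up to the end of the input. Under that assumption, a " may open in the middle of a field, and a doubled "" inside quotes does not change the parity, so it is tolerated as well. A backslash-escaped \" is not handled. The names LookaheadSplit, SEPARATOR, and split are hypothetical.

```java
import java.util.Arrays;

class LookaheadSplit {
    // A comma followed by an even number of double quotes (possibly zero) up to the end of the input.
    static final String SEPARATOR = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Hypothetical helper: the -1 limit keeps trailing empty fields.
    static String[] split(String subject) {
        return subject.split(SEPARATOR, -1);
    }

    public static void main(String[] args) {
        // A quote that opens in the middle of a field.
        System.out.println(Arrays.toString(split("AAA, BB\"CC,DD\"")));
        // A doubled quote used as an escape inside a quoted field.
        System.out.println(Arrays.toString(split("\"a \"\"b\"\", c\",d")));
    }
}
```

Because the lookahead rescans the rest of the input at every comma, this is best suited to short strings; for long inputs a single-pass approach or a CSV parser is likely a better fit.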
