
Regex to capture comma separated groups of text in parentheses [Java]

I have a string that contains one or more (comma-separated) values, surrounded by quotes and enclosed in parentheses. So it can be of the type os IN ('WIN', 'MAC', 'LNU') (for multiple values) or just os IN ('WIN') for a single value.

I need to extract the values into a List.

I have tried this regex, but it captures all the values as a single list element, the whole string 'WIN', 'MAC', instead of the two separate strings WIN and MAC:

        List<String> matchList = new ArrayList<>();

        Pattern regex = Pattern.compile("\\((.+?)\\)");
        Matcher regexMatcher = regex.matcher(processedFilterString);

        while (regexMatcher.find()) { // find each parenthesized group in the string
            matchList.add(regexMatcher.group(1)); // add the text captured between the parentheses
        }

Result:

Input: os IN ('WIN', 'MAC')
Output:
['WIN', 'MAC']
length: 1

In its current form, the regex matches one or more characters surrounded by parentheses and captures them in a single group, which is probably why the result is just one string. How can I adapt it to capture each of the values separately?

Edit: Just adding some more details. The input string can have multiple IN clauses containing other criteria, such as id IN ('xxxxxx') AND os IN ('WIN', 'MAC'). Also, the matched values are not necessarily all the same length, so it could be os IN ('WIN', 'MAC', 'LNUX').

You may try splitting the CSV string from the IN clause:

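// Needs Arrays and List (java.util), Collectors (java.util.stream), Matcher and Pattern (java.util.regex)
List<String> matchList = null; // stays null if the input has no parenthesized group

Pattern regex = Pattern.compile("\\((.+?)\\)");
Matcher regexMatcher = regex.matcher(processedFilterString);

if (regexMatcher.find()) {
    // Strip the leading and trailing quote, then split on the quote-comma-quote separators
    String match = regexMatcher.group(1).replaceAll("^'|'$", "");
    String[] terms = match.split("'\\s*,\\s*'");
    matchList = Arrays.stream(terms).collect(Collectors.toList());
}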

Note that if your input string could contain multiple IN clauses, then the above would need to be modified to use a while loop.
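A minimal sketch of that while-loop variant, for reference. The class name InClauseSplitter and the helper extractGroups are hypothetical names chosen for illustration. It assumes that no value contains a closing parenthesis and that every value is wrapped in single quotes, as in the question's examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InClauseSplitter {

    // Same pattern as above: lazily captures whatever sits between "(" and the next ")".
    private static final Pattern PARENS = Pattern.compile("\\((.+?)\\)");

    // Hypothetical helper: returns one inner list per parenthesized group, in input order.
    static List<List<String>> extractGroups(String filter) {
        List<List<String>> groups = new ArrayList<>();
        Matcher matcher = PARENS.matcher(filter);
        while (matcher.find()) {
            // Drop the outermost quotes, then split on the quote-comma-quote separators.
            String inner = matcher.group(1).trim().replaceAll("^'|'$", "");
            String[] terms = inner.split("'\\s*,\\s*'");
            groups.add(new ArrayList<>(Arrays.asList(terms)));
        }
        return groups;
    }

    public static void main(String[] args) {
        String filter = "id IN ('xxxxxx') AND os IN ('WIN', 'MAC', 'LNUX')";
        System.out.println(extractGroups(filter)); // prints [[xxxxxx], [WIN, MAC, LNUX]]
    }
}
```

Keeping one inner list per clause preserves which values belong to which IN clause. If only a flat list is needed, calling addAll on a single list inside the loop would do.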

From the examples in your question, it looks like your regular expression needs to find strings of at least three upper-case letters enclosed in single quotes.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {

    public static void main(String[] args) {
        String s = "os IN ('WIN', 'MAC', 'LNUX')";
        Pattern pattern = Pattern.compile("'([A-Z]{3,})'");
        Matcher matcher = pattern.matcher(s);
        List<String> list = new ArrayList<>();
        while (matcher.find()) {
            list.add(matcher.group(1));
        }
        System.out.println(list);
    }
}

Running the above code produces the following output:

[WIN, MAC, LNUX]
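Note that [A-Z]{3,} will skip values such as 'xxxxxx' from the edit in the question, because they are not upper-case. If the values are not guaranteed to be upper-case letters, a looser pattern that captures anything between two single quotes may be a better fit. The following is only a sketch: the class name QuotedValueExtractor and the method extract are hypothetical, and it assumes that values never contain a single quote themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedValueExtractor {

    // Captures any run of non-quote characters between two single quotes.
    // Assumption: values never contain an embedded or escaped single quote.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    // Hypothetical helper: returns every quoted value in the input, in order of appearance.
    static List<String> extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String s = "id IN ('xxxxxx') AND os IN ('WIN', 'MAC')";
        System.out.println(extract(s)); // prints [xxxxxx, WIN, MAC]
    }
}
```

This returns the values of all IN clauses in one flat list, so it does not tell you which clause a value came from.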
