
Java regex to match all words in a string

I am looking for a regex to match the following pattern:

(abc|def|ghi|abc+def+ghi|def+ghi)

Essentially, everything separated by | is an OR search, and for everything joined with + all of the words must be present.

I have to construct the regex dynamically based on an input string in the above format.

I tried the following for AND searches:

(?=.*?\babc\b)(?=.*?\bdef\b)(?=(.*?\bghi\b)

and the following for OR searches:

.*(abc|def).*

Is a single regex possible? Any examples would help.

(abc|def|ghi)

This will match every string that contains one of the words you are looking for.

AND searches

You list the following:

(?=.*?\babc\b)(?=.*?\bdef\b)(?=(.*?\bghi\b)

My version:

(?=.*?\babc\b)(?=.*?\bdef\b)(?=.*?\bghi\b).

Note that your version has an extra ( before the ghi test.

Also note that I include a . at the end (it matches any single character). This is so the regular expression can actually match something; otherwise the expression consists only of lookaheads, which consume no characters, so any match would be empty.
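To make the AND case concrete, here is a minimal Java sketch of the lookahead pattern above. The class and method names (AndSearchDemo, containsAllThree) are made up for illustration, and the sketch assumes single-line input, since . does not match line terminators unless Pattern.DOTALL is set.

```java
import java.util.regex.Pattern;

class AndSearchDemo {
    // All three lookaheads are evaluated at the same position; the trailing '.'
    // consumes one character so that the match is not empty.
    static final Pattern ALL_THREE = Pattern.compile(
            "(?=.*?\\babc\\b)(?=.*?\\bdef\\b)(?=.*?\\bghi\\b).");

    static boolean containsAllThree(String input) {
        // find() succeeds if the pattern matches starting at any position
        return ALL_THREE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAllThree("ghi then def then abc")); // word order does not matter
        System.out.println(containsAllThree("abc def"));               // ghi is missing
        System.out.println(containsAllThree("abcdef ghi"));            // \b rejects abc and def inside "abcdef"
    }
}
```

Using find() rather than matches() matters here: matches() would require the single . to cover the whole input.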

OR searches

For a search for "abc" OR "def" I would use the following regular expression:

\babc\b|\bdef\b

or

\b(?:abc|def)\b
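As a rough illustration of the OR form in Java (the names OrSearchDemo and containsEither are hypothetical), the non-capturing group can be used with find() like this:

```java
import java.util.regex.Pattern;

class OrSearchDemo {
    // Either word, as a whole word; (?: ... ) groups without capturing
    static final Pattern EITHER = Pattern.compile("\\b(?:abc|def)\\b");

    static boolean containsEither(String input) {
        return EITHER.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsEither("xx def yy"));    // def appears as a whole word
        System.out.println(containsEither("abcdef"));       // neither word stands alone
        System.out.println(containsEither("nothing here")); // no match at all
    }
}
```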

Combined

So for your example of (abc|def|ghi|abc+def+ghi|def+ghi) the actual regular expression might look like this:

\babc\b|\bdef\b|\bghi\b|(?=.*?\babc\b)(?=.*?\bdef\b)(?=.*?\bghi\b).|(?=.*?\bdef\b)(?=.*?\bghi\b).

It's kind of a bad example, because it would match abc on its own through the first OR case, regardless of the requirement specified by the AND case in the middle.

Remember to specify the case sensitivity for the regular expression too.
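The following sketch compiles the combined expression above with and without Pattern.CASE_INSENSITIVE, to show both points: a lone abc already satisfies the first OR case, and the flag decides whether ABC counts. The class name CombinedSearchDemo and the ignoreCase parameter are assumptions made for the example.

```java
import java.util.regex.Pattern;

class CombinedSearchDemo {
    // The combined expression from above, split across lines for readability
    static final String COMBINED =
            "\\babc\\b|\\bdef\\b|\\bghi\\b"
            + "|(?=.*?\\babc\\b)(?=.*?\\bdef\\b)(?=.*?\\bghi\\b)."
            + "|(?=.*?\\bdef\\b)(?=.*?\\bghi\\b).";

    static boolean matches(String input, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(COMBINED, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("only abc here", false)); // first OR case is enough
        System.out.println(matches("ONLY ABC HERE", false)); // case-sensitive by default
        System.out.println(matches("ONLY ABC HERE", true));  // flag makes ABC count
    }
}
```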

I wrote this sample method, match(String input, String searchFilter):

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the sample compile on its own.
public class FilterMatcher {

    public static void main(String[] args) {
        String input = " dsfsdf Invalid Locatio sdfsdff Invalid c Test1 xx Test2";
        String searchFilter = "Invalid Pref Code|Invalid Location+Invalid company|Test|Test1+Test2";
        System.out.println(match(input, searchFilter));
    }

    /**
     * @param input        the text to search
     * @param searchFilter alternatives separated by '|'; terms joined by '+' must all be present
     * @return true if at least one alternative is satisfied
     */
    private static boolean match(String input, String searchFilter) {
        List<String> alternatives = new ArrayList<>();
        for (String part : searchFilter.split("\\|")) {
            if (part.contains("+")) {
                // AND group: one lookahead per term, then a single '.' to consume a character
                StringBuilder sb = new StringBuilder();
                for (String term : part.split("\\+")) {
                    // Pattern.quote treats the term as literal text, not as regex syntax
                    sb.append("(?=.*?\\b").append(Pattern.quote(term.trim())).append("\\b)");
                }
                alternatives.add(sb.append('.').toString());
            } else {
                // Plain OR term, matched as a whole word or phrase
                alternatives.add("\\b" + Pattern.quote(part.trim()) + "\\b");
            }
        }
        // Joining all alternatives at once avoids a dangling '|' when one kind is absent
        Pattern p = Pattern.compile(String.join("|", alternatives), Pattern.CASE_INSENSITIVE);
        return p.matcher(input).find();
    }
}
assertTrue(Pattern.matches("\\((\\w+(\\||\\+))+\\w+\\)", "(abc|def|ghi|abc+def+ghi|def+ghi)"));
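
The assertion above validates the filter string itself, not the text being searched. A small sketch of that check as a reusable method follows; FilterSyntaxDemo and isWellFormed are invented names. Note that this pattern, as written, requires the surrounding parentheses, at least two terms, and terms made only of \w characters, so multi-word terms with spaces would be rejected.

```java
import java.util.regex.Pattern;

class FilterSyntaxDemo {
    // A parenthesised list of \w+ terms, each pair separated by '|' or '+'
    static final Pattern FILTER_SYNTAX = Pattern.compile("\\((\\w+(\\||\\+))+\\w+\\)");

    static boolean isWellFormed(String filter) {
        // matches() requires the whole filter string to fit the pattern
        return FILTER_SYNTAX.matcher(filter).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("(abc|def|ghi|abc+def+ghi|def+ghi)")); // the filter from the question
        System.out.println(isWellFormed("(abc||def)"));                        // empty term between separators
        System.out.println(isWellFormed("(Invalid Location|Test)"));           // space is not a \w character
    }
}
```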
