Matching a regex against three comma-separated words in Java

I have a two-character query string for a search algorithm, and I have a string that consists of three words separated by commas. What I want is to search among these three params.

E.g. "String, Text,Search"

If the input is "Te", the search should match; "Str" and "Se" should match as well.

I implemented this using regex, but it works only for the first word. Note that there is a space before the second word.

        stringInput="String, Text,Search";
        word="St";
        String pattern1=word+"\\w*,\\s\\w*,\\w";

        String pattern2="\\w*,\\."+word+"\\w*,\\w";

        String pattern3="\\w*,\\w*,"+word+"\\w";

        Pattern patternCompiled1=Pattern.compile(pattern1);
        Pattern patternCompiled2=Pattern.compile(pattern2);
        Pattern patternCompiled3=Pattern.compile(pattern3);
        Matcher matcher1= patternCompiled1.matcher(stringInput);

        Matcher matcher2= patternCompiled2.matcher(stringInput);

        Matcher matcher3= patternCompiled3.matcher(stringInput);

            if(matcher1.find() || matcher2.find() || matcher3.find()){
                return true;
            }

Can you help me understand why it doesn't work for the second and the third word?

Some clarifications

The input has the form Word1, String1, String2. The first param is always a single word. The second and third params can each be several words separated by spaces, e.g. Text, Some Text,Other Text Text. It can be anything, e.g. Text, Some,Other Text, and it can also contain other symbols. What I want is for the search to match the first letters of the first word of each param.

Your patterns are incorrect. I highly recommend you learn some more about regex:

Your first pattern: word+"\\w*,\\s\\w*,\\w" matches:

  • The string to match
  • Followed by 0 or more word characters
  • Followed by a comma
  • Followed by a single white space character
  • Followed by 0 or more word characters
  • Followed by a comma
  • Followed by a single word character

This pattern works for the given input string, but it would fail if there were a space after the last comma.

The second pattern: "\\w*,\\."+word+"\\w*,\\w" matches:

  • 0 or more word characters
  • Followed by a comma
  • Followed by a literal .
  • Followed by the string to match
  • Followed by 0 or more word characters
  • Followed by a comma
  • Followed by a single word character

This will not work because you have escaped the . character (\\.), which means it will match a literal ., which your string does not contain.

Your final pattern: "\\w*,\\w*,"+word+"\\w" matches:

  • 0 or more word characters
  • Followed by a comma
  • Followed by 0 or more word characters
  • Followed by a comma
  • Followed by the string to match
  • Followed by a single word character

This will fail because you have not accounted for white space after the commas.
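To make the analysis above concrete, here is a minimal sketch that runs the three original patterns against the sample input. The class name `OriginalPatternsDemo` and the helper `found` are made up for illustration; only `Pattern` and `Matcher.find()` come from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the helper are illustrative only.
class OriginalPatternsDemo {

    // True if the regex is found anywhere in the input (Matcher.find() semantics).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "String, Text,Search";

        // Pattern 1, word = "St": the input has exactly one space after the first comma.
        System.out.println(found("St" + "\\w*,\\s\\w*,\\w", input));      // true

        // Pattern 2, word = "Te": "\\." demands a literal dot after the first comma.
        System.out.println(found("\\w*,\\." + "Te" + "\\w*,\\w", input)); // false

        // Pattern 3, word = "Se": nothing can consume the space before "Text".
        System.out.println(found("\\w*,\\w*," + "Se" + "\\w", input));    // false

        // Pattern 1 again, with a space after the last comma: no match any more.
        System.out.println(found("St" + "\\w*,\\s\\w*,\\w", "String, Text, Search")); // false
    }
}
```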

A single, correct regex pattern would be something like:

^(?:%s.*,.*,.*)|(?:.*,\\s*%s.*,.*)|(?:.*,.*,\\s*%s.*)$

Where %s is your string to search for.

Explanation:

  • ^ matches the start of the string, and $ the end of it. Because | has the lowest precedence, ^ applies only to the first group and $ only to the last.
  • There are three non-capturing groups (?:)
  • Each group is separated by a | which means or. So only one of these groups needs to match.
  • The first group matches the search text at the start of the first word: the search text, followed by 0 or more of any character, a comma, 0 or more of any character, another comma, and 0 or more of any character.
  • The second group matches the search text at the start of the second word. It is similar to the first group, except that only whitespace (rather than any character) is allowed between the first comma and the search text.
  • The third group matches the search text at the start of the third word. It is much the same as the second group, shifted along by one comma.

Usage:

String pattern = String.format("^(?:%s.*,.*,.*)|(?:.*,\\s*%s.*,.*)|(?:.*,.*,\\s*%s.*)$", 
            searchText, searchText, searchText);

Matcher m = Pattern.compile(pattern).matcher(stringInput);
System.out.println(m.find());
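Since the question says the input can contain other symbols, one thing to watch is that the search text is pasted into the regex as-is, so characters such as . or ( would be read as regex syntax. A possible sketch, assuming you want the search text treated literally, is to wrap it in `Pattern.quote` first. The class name `QuotedSearchDemo`, the `matches` helper and its `quote` flag are hypothetical and exist only to show the difference.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class QuotedSearchDemo {

    // The pattern from the answer, unchanged.
    static final String TEMPLATE =
            "^(?:%s.*,.*,.*)|(?:.*,\\s*%s.*,.*)|(?:.*,.*,\\s*%s.*)$";

    // quote = true treats the search text literally via Pattern.quote.
    static boolean matches(String input, String searchText, boolean quote) {
        String s = quote ? Pattern.quote(searchText) : searchText;
        String regex = String.format(TEMPLATE, s, s, s);
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "String, Text,Search";

        // Unquoted, the dot in "S.a" matches any character, so "Sea" in "Search" matches.
        System.out.println(matches(input, "S.a", false)); // true

        // Quoted, "S.a" must appear literally, which it does not.
        System.out.println(matches(input, "S.a", true));  // false

        // Ordinary prefixes behave the same either way.
        System.out.println(matches(input, "Te", true));   // true
    }
}
```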

However, there is a simpler solution without the need for a complex regex pattern.

Alternative solution (split into words and check if any start with the search text):

private boolean anyWordStartsWith(final String words, final String search) {
    for (final String word : words.split("\\s*,\\s*")) {
        if(word.startsWith(search)) return true;
    }
    return false;
}

Alternative solution (Java 8):

boolean anyMatch = Arrays.stream(stringInput.split("\\s*,\\s*"))
                         .anyMatch(word -> word.startsWith(searchText));
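As a rough check against the clarified inputs in the question (params made of several words), the split approach can be wrapped in a small runnable class. This is only a sketch: the class name `PrefixSearchDemo`, the method name `anyParamStartsWith` and the extra `trim()` call are assumptions added here, not part of the answer above.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
class PrefixSearchDemo {

    // Splits on commas (with optional surrounding whitespace) and checks
    // whether any param begins with the search text.
    static boolean anyParamStartsWith(String params, String search) {
        return Arrays.stream(params.trim().split("\\s*,\\s*"))
                     .anyMatch(param -> param.startsWith(search));
    }

    public static void main(String[] args) {
        String input = "Text, Some Text,Other Text Text";

        System.out.println(anyParamStartsWith(input, "So")); // true
        System.out.println(anyParamStartsWith(input, "Ot")); // true

        // "Text" is only the second word of the second param here, so no match.
        System.out.println(anyParamStartsWith("Word, Some Text,Other", "Tex")); // false
    }
}
```

Because each param starts with its first word, `startsWith` on the whole param only ever matches the first letters of that first word. Note that an empty search string matches every input, since `startsWith("")` is always true.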

For pattern2, the \\. will match a literal dot character, but there is no dot at this point (you might want to use just a dot, without the \\, to match any character).

For pattern3, you forgot the same dot (or the \\s you used in pattern1).

So this should look like:

String pattern1=word+"\\w*,\\s\\w*,\\w";
String pattern2="\\w*,."+word+"\\w*,\\w"; // Or replace dot with \\s
String pattern3="\\w*,.\\w*,"+word+"\\w"; //Same here

if you want it to work with stringInput="String, Text,Search";
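A quick way to try these three patterns is the sketch below. The class name `CorrectedPatternsDemo` and its helpers are hypothetical. One caveat it illustrates: a plain dot requires exactly one character after the comma, so the same patterns stop matching the second word when the input has no space there. Using \\s* in place of the dot would accept both forms.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class CorrectedPatternsDemo {

    static boolean find(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Tries the three corrected patterns in turn, as in the question's if-statement.
    static boolean anyMatch(String input, String word) {
        String pattern1 = word + "\\w*,\\s\\w*,\\w";
        String pattern2 = "\\w*,." + word + "\\w*,\\w";
        String pattern3 = "\\w*,.\\w*," + word + "\\w";
        return find(pattern1, input) || find(pattern2, input) || find(pattern3, input);
    }

    public static void main(String[] args) {
        String input = "String, Text,Search";
        System.out.println(anyMatch(input, "St")); // true
        System.out.println(anyMatch(input, "Te")); // true
        System.out.println(anyMatch(input, "Se")); // true

        // Without the space after the first comma, the dot swallows the "T" of "Text".
        System.out.println(anyMatch("String,Text,Search", "Te")); // false
    }
}
```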
