
Java regex must match at beginning or end of String

I'm writing a program that takes two Strings as input and checks whether the first occurs in the second. To return true, the first String must appear at the beginning or end of a word inside the second String. An occurrence in the middle of a word does not count.

Example 1 (must return false):

String s1 = "press";
String s2 = "Regular expressions is hard to read";

Example 2 (must return true):

String s1 = "ONE";
String s2 = "ponep,onep!";

Example 3 (must return true):

String s1 = "ho";
String s2 = "Wow! How awesome is that!";

Here is my code. It returns false instead of true for the third example:

public static void main(String[] args) {    
    Scanner scanner = new Scanner(System.in);
    String part = scanner.nextLine();
    String line = scanner.nextLine();

    Pattern pattern = Pattern.compile("((.+\\s+)*|(.+,+)*"+part+"\\w.*)"+"|"+"(.+"+part+"(\\s+.+)*)",Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(line);
    System.out.println(matcher.matches());
}

Please help.

Check out the word boundary matcher, \\b. It is a zero-length matcher that matches only at the boundary of a word, that is, at a position between a word character (\\w) and a non-word character (\\W), or at the start or end of the input next to a word character.
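To make the boundary idea concrete, here is a minimal sketch that lists the positions at which \\b matches in a string. The class and method names (BoundaryPositions, boundaries) are made up for this illustration and are not part of the answer's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class BoundaryPositions {
    // Returns every index in text at which the zero-length matcher \b matches.
    static List<Integer> boundaries(String text) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile("\\b").matcher(text);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(boundaries("Wow! How"));    // [0, 3, 5, 8]
        System.out.println(boundaries("expressions")); // [0, 11]
    }
}
```

In "expressions" the only boundaries are at 0 and 11, while "press" occupies indices 2 to 6, so it touches neither edge of the word. That is why the first example has to return false.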

Your regex is then essentially \\bkeyword|keyword\\b: the keyword at either the beginning or the end of a word.

boolean check(String s1, String s2) {
    Pattern pattern = Pattern.compile("\\b" + Pattern.quote(s1) + "|" + Pattern.quote(s1) + "\\b", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(s2);
    return matcher.find();
}

The key changes: I added Pattern.quote(s1) so that if the first string is something like ab|c, it matches those 4 characters literally instead of being interpreted as a regex. I also switched the final check to matcher.find(), which lets us write a simpler regex, since all we care about is whether a matching substring exists.
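A small sketch of both points, using hypothetical helper names (findRaw, findQuoted, matchesQuoted) that exist only for this demonstration:

```java
import java.util.regex.Pattern;

// Hypothetical helpers, for illustration only.
class QuoteAndFindDemo {
    // Keyword used as-is: regex metacharacters such as | keep their meaning.
    static boolean findRaw(String keyword, String text) {
        return Pattern.compile(keyword).matcher(text).find();
    }

    // Keyword quoted: every character is matched literally.
    static boolean findQuoted(String keyword, String text) {
        return Pattern.compile(Pattern.quote(keyword)).matcher(text).find();
    }

    // matches() succeeds only if the whole input matches the pattern.
    static boolean matchesQuoted(String keyword, String text) {
        return Pattern.compile(Pattern.quote(keyword)).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(findRaw("ab|c", "xcx"));            // true: "ab|c" means "ab" or "c"
        System.out.println(findQuoted("ab|c", "xcx"));         // false: literal "ab|c" is absent
        System.out.println(findQuoted("ab|c", "x ab|c x"));    // true: substring found
        System.out.println(matchesQuoted("ab|c", "x ab|c x")); // false: whole input must match
    }
}
```

With matches(), the pattern has to account for everything before and after the keyword, which is what made the regex in the question so hard to get right. With find(), the pattern only needs to describe the keyword and its boundary.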

My suggestion would be:

  1. Split the second string on the relevant delimiters (whitespace or commas, in your case).
  2. Create regexes that match the given word at the beginning or at the end of a word.
  3. Map each split word through the regexes to get a list of boolean results.
  4. Return true if the list contains at least one true.

Sample code:

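import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class Test {
    public static void main(String[] args) {
        String first = "ho";
        String second = "Wow! How awesome is that!";

        // Split on whitespace or commas; other punctuation stays attached to the word.
        String[] words = second.split("\\s|,");
        List<Boolean> results = Arrays.stream(words)
                .map(String::toLowerCase)
                .map(word -> match(first.toLowerCase(), word))
                .collect(Collectors.toList());
        System.out.println(results);
        System.out.println(results.contains(true));
    }

    // Returns true if matchedWord starts or ends with patternWord.
    private static boolean match(String patternWord, String matchedWord) {
        // Quote the keyword so regex metacharacters in it are matched literally.
        String quoted = Pattern.quote(patternWord);

        Pattern pattern1 = Pattern.compile("^" + quoted + "\\S*");
        Matcher matcher1 = pattern1.matcher(matchedWord);

        Pattern pattern2 = Pattern.compile("\\S*" + quoted + "$");
        Matcher matcher2 = pattern2.matcher(matchedWord);
        return matcher1.matches() || matcher2.matches();
    }
}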
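One thing to watch with splitting on whitespace and commas only: punctuation stays attached to the word, so "ow" would not be found at the end of "Wow!". The following is a possible variant, not the answer's own code, that splits on runs of non-word characters and uses startsWith/endsWith instead of regexes. The class name SplitWordsDemo and the method name atWordEdge are hypothetical, and the sketch assumes that "word" means a run of \\w characters (ASCII letters, digits and underscore).

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical variant, for illustration only.
class SplitWordsDemo {
    // True if keyword starts or ends some word of text, ignoring case.
    static boolean atWordEdge(String keyword, String text) {
        String k = keyword.toLowerCase(Locale.ROOT);
        // Split on runs of non-word characters so punctuation is dropped.
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .anyMatch(word -> word.startsWith(k) || word.endsWith(k));
    }

    public static void main(String[] args) {
        System.out.println(atWordEdge("press", "Regular expressions is hard to read")); // false
        System.out.println(atWordEdge("ONE", "ponep,onep!"));                           // true
        System.out.println(atWordEdge("ho", "Wow! How awesome is that!"));              // true
        System.out.println(atWordEdge("ow", "Wow! How awesome is that!"));              // true
    }
}
```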

