
Match any word of a List/Array in a sentence (Java)

I have a List of words like the one below:

List<String> forbiddenWordList = Arrays.asList("LATE", "S/O", "SO", "W/O", "WO");

How can I check whether a String contains any of the words in the List? For example:

String name1 = "Adam Smith";      // false (not found)
String name2 = "Late H Milton";   // true  (found Late)
String name3 = "S/O Furi Kerman"; // true  (found S/O)
String name4 = "Conl Faruk";      // false (not found)
String name5 = "Furi Kerman WO";  // true  (found WO)

A regular-expression solution would be highly appreciated.

boolean containsForbiddenName = forbiddenWordList.stream()
     .anyMatch(forbiddenName -> name.toLowerCase()
          .contains(forbiddenName.toLowerCase()));
  1. Turn the list into a string with the | delimiter

    String listDelimited = String.join("|", forbiddenWordList);

  2. Create the regex

    Pattern forbiddenWordPattern = Pattern.compile(listDelimited, Pattern.CASE_INSENSITIVE);

  3. Test your text

    boolean hasForbiddenWord = forbiddenWordPattern.matcher(text).find();

(similar to the answer of @Maurice Perry)
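The three steps above can be combined into one helper. What follows is a minimal sketch, not the answerer's code: the class name ForbiddenWordMatcher and the method names are hypothetical, and the use of Pattern.quote is an added assumption, included so that any regex metacharacters in a list entry are treated literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ForbiddenWordMatcher {

    // Steps 1 and 2: join the (quoted) words with '|' and compile once.
    // Pattern.quote is an addition here; the original answer joins the raw words.
    static Pattern buildPattern(List<String> words) {
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation, Pattern.CASE_INSENSITIVE);
    }

    // Step 3: find() succeeds if any alternative occurs anywhere in the text.
    static boolean hasForbiddenWord(String text, List<String> words) {
        return buildPattern(words).matcher(text).find();
    }

    public static void main(String[] args) {
        List<String> forbiddenWordList = Arrays.asList("LATE", "S/O", "SO", "W/O", "WO");
        System.out.println(hasForbiddenWord("Late H Milton", forbiddenWordList)); // true
        System.out.println(hasForbiddenWord("Adam Smith", forbiddenWordList));    // false
    }
}
```

If the same list is checked against many names, it is probably worth calling buildPattern once and reusing the compiled Pattern.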

You can do it like this:

Iterate over the words with a stream and return true if any word (named w) satisfies the condition (contains):

public static boolean isForbidden(String word, List<String> words) {
     return words.stream().anyMatch(w -> (word.toLowerCase().contains(w.toLowerCase())));
}

Using a regex, where the pattern is built from the List itself:

public static boolean isForbidden1(String word, List<String> words) {
     String forbiddenWordPattern = String.join("|", words);

     return Pattern.compile(forbiddenWordPattern, Pattern.CASE_INSENSITIVE)
                   .matcher(word)
                   .find();
 }
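One caveat with both variants above is that they match substrings rather than whole words. The sketch below illustrates this with a name that is not in the question ("Jason Wong" is a made-up example, and the class name SubstringCaveat is hypothetical): the name is flagged only because "so" and "wo" occur inside longer words.

```java
import java.util.Arrays;
import java.util.List;

public class SubstringCaveat {

    // Same logic as isForbidden above: case-insensitive substring test.
    static boolean isForbidden(String word, List<String> words) {
        return words.stream()
                .anyMatch(w -> word.toLowerCase().contains(w.toLowerCase()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("LATE", "S/O", "SO", "W/O", "WO");
        // "Jason" contains "so", so the name is reported as forbidden
        // even though no forbidden word appears as a separate word.
        System.out.println(isForbidden("Jason Wong", words)); // true
        System.out.println(isForbidden("Adam Smith", words)); // false
    }
}
```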

The list can be expressed as a pattern:

Pattern forbiddenWordPattern
        = Pattern.compile("LATE|S/O|SO|W/O|WO", Pattern.CASE_INSENSITIVE);

To test for the presence of a word in a text, you would do:

boolean hasForbiddenWord = forbiddenWordPattern.matcher(text).find();

Finally, I found a solution myself, with help from all of you:

    String regex = String.join("|", forbiddenWordList.stream().map(word -> "\\b" + word + "\\b").collect(Collectors.toList()));
    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    System.out.println(pattern.matcher(name).find());

The word boundary ( \b, written "\\b" in a Java string literal ) makes the pattern match whole words only, rather than any text that merely contains them. Thanks, everyone, for helping.
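To check the word-boundary approach against the five names from the question, a small harness along these lines could be used. This is a sketch under stated assumptions: the class name WordBoundaryCheck and the method names are hypothetical, and Pattern.quote is an addition that is not part of the solution above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WordBoundaryCheck {

    // Wrap every word in \b...\b so it only matches as a whole word.
    // Pattern.quote (an addition) keeps list entries literal.
    static Pattern buildPattern(List<String> words) {
        String regex = words.stream()
                .map(word -> "\\b" + Pattern.quote(word) + "\\b")
                .collect(Collectors.joining("|"));
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    static boolean containsForbiddenWord(String name, List<String> words) {
        return buildPattern(words).matcher(name).find();
    }

    public static void main(String[] args) {
        List<String> forbiddenWordList = Arrays.asList("LATE", "S/O", "SO", "W/O", "WO");
        List<String> names = Arrays.asList(
                "Adam Smith",       // false
                "Late H Milton",    // true
                "S/O Furi Kerman",  // true
                "Conl Faruk",       // false
                "Furi Kerman WO");  // true
        for (String name : names) {
            System.out.println(name + " -> " + containsForbiddenWord(name, forbiddenWordList));
        }
    }
}
```

With the boundaries in place, a name such as "Jason Wong" (a made-up example) is no longer flagged, because "so" and "wo" appear there only inside longer words.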
