
Check if the text is not a question and also contains some specific words

I want to make sure that a text is not a question and that it contains at least one of these phrases: watch live, watch speech live, #breaking, #breaking news.

So I wrote the following code:

private static void containsQuestion(String commentstr) {
    String urlPattern = "^(?!.*?\\?)(watch live|watch speech live|#breaking|#breaking news)";
    Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(commentstr);
    if (m.find()) {
        System.out.println("yes");
    }
}

But when I try it with, for example:

They say 2's company; is 3 a crowd watch live on...

I expect to see yes in the console, since the text should match, but nothing is printed. Why?

The problem is your use of the start anchor ^.

Either remove it:

String urlPattern = 
        "(?!.*?\\?)(watch live|watch speech live|#breaking|#breaking news)";

Or place .*? before your keywords to match any number of characters before your phrases:

String urlPattern =
       "^(?!.*?\\?).*?(watch live|watch speech live|#breaking|#breaking news)";

Because of the ^ anchor, your regex only tries to match those phrases at the very start of the text.
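To make the difference between the two variants above concrete, here is a minimal sketch that runs both against a few sample strings. The class name KeywordCheck, the helper method names, and the second and third sample strings are made up for illustration; only the patterns and the first sample come from the question. One thing it suggests: with the unanchored variant the lookahead is evaluated where the keyword starts, so it only sees a ? that comes after the keyword, whereas the anchored variant with .*? checks the whole text.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class KeywordCheck {
    // Anchored: the lookahead runs at the start of the text, so any '?' rejects it;
    // ".*?" then lets the keyword appear anywhere.
    private static final Pattern ANCHORED = Pattern.compile(
            "^(?!.*?\\?).*?(watch live|watch speech live|#breaking|#breaking news)",
            Pattern.CASE_INSENSITIVE);

    // Unanchored: the lookahead runs at the position where the keyword starts,
    // so only a '?' from that position onward rejects the match.
    private static final Pattern UNANCHORED = Pattern.compile(
            "(?!.*?\\?)(watch live|watch speech live|#breaking|#breaking news)",
            Pattern.CASE_INSENSITIVE);

    static boolean anchoredMatch(String text) {
        return ANCHORED.matcher(text).find();
    }

    static boolean unanchoredMatch(String text) {
        return UNANCHORED.matcher(text).find();
    }

    public static void main(String[] args) {
        String statement = "They say 2's company; is 3 a crowd watch live on...";
        String questionBefore = "Is 3 a crowd? watch live on...";   // assumed sample
        String questionAfter = "watch live on... is 3 a crowd?";    // assumed sample

        System.out.println(anchoredMatch(statement));        // true
        System.out.println(unanchoredMatch(statement));      // true
        System.out.println(anchoredMatch(questionBefore));   // false
        System.out.println(unanchoredMatch(questionBefore)); // true: '?' precedes the keyword
        System.out.println(anchoredMatch(questionAfter));    // false
        System.out.println(unanchoredMatch(questionAfter));  // false
    }
}
```

If a ? anywhere in the text should disqualify it, the anchored form with .*? looks like the safer choice of the two.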

You need to allow more characters before and after your keywords. Try this:

/^(?!.*?\?).*(watch live|watch speech live|\#breaking|\#breaking news).*/gm

https://regex101.com/r/uS1xQ4/2
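The pattern above is written in regex101 notation with the g and m flags. A rough Java equivalent, sketched under the assumption that the input may hold several lines and each line should be judged separately, uses Pattern.MULTILINE for m and a find() loop for g. The class name LineFilter, the method matchingLines, and the sample text are hypothetical. The \# escape is dropped here because # has no special meaning in a Java pattern unless Pattern.COMMENTS is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class LineFilter {
    // MULTILINE makes ^ match at the start of every line; '.' does not cross
    // line breaks by default, so the lookahead only inspects the current line.
    private static final Pattern LINE = Pattern.compile(
            "^(?!.*?\\?).*(watch live|watch speech live|#breaking|#breaking news).*",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Returns every line that contains a keyword and no question mark.
    static List<String> matchingLines(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = LINE.matcher(text);
        while (m.find()) {          // repeated find() plays the role of the g flag
            result.add(m.group());  // the trailing .* makes the match span the whole line
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Is this live? watch live now\n"
                + "#Breaking news from the capital\n"
                + "nothing to see here";
        System.out.println(matchingLines(text)); // [#Breaking news from the capital]
    }
}
```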

