
Get sentence containing specific word with regex

I am trying to find the sentence that contains a specific word. I define a sentence as text that starts after, and ends with, one of the following characters: . ! ?

var str = "Hello, how is it going. This is the bus we have to take!";
var regex = /[^.?!]*(?:[.?,\s!])(bus)(?=[\s.?!,])[^.?!]*[.?!]/igm;

var result = regex.exec(str);


Output: `This is the bus we have to take!`

Now, I run into trouble when I try to find the sentence that contains the word hello, because it is the first word of the sentence. How can I change my regex to cover that case? I am not used to regex, and it is quite hard to get into, even with the docs in front of me!

Remember that splitting text into linguistic sentences is a very specific, difficult task that is usually performed with the help of NLP packages.

If you only need to handle strings that follow your own definition of a sentence:

  • Split the string with the /[.?!]/ regex
  • Check whether each entry contains the substring with RegExp#test(), since you need a case-insensitive check

var str = "Hello, how is it going. This is the bus we have to take!";
var chunks = str.split(/[.?!]/).filter(function(n) {
  return /hello/i.test(n);
});
console.log(chunks);
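One thing to be aware of: split() drops the terminating character, while the output in the question keeps the final `!`. If you need the punctuation, a possible variation is to collect the chunks with String#match() instead of split(). This is only a sketch under the question's definition of a sentence; the helper name sentencesContaining is made up for illustration and is not part of any library.

```javascript
// Hypothetical helper (name invented for this example): returns every
// "sentence" of `text` that contains `word` as a case-insensitive substring.
// A sentence is a run of characters other than . ? ! plus any terminators
// that directly follow it.
function sentencesContaining(text, word) {
  // match() returns null when nothing matches, so fall back to an empty array.
  var chunks = text.match(/[^.?!]+[.?!]*/g) || [];
  var needle = word.toLowerCase();
  return chunks
    .map(function (s) { return s.trim(); }) // drop the space after the previous terminator
    .filter(function (s) { return s.toLowerCase().indexOf(needle) !== -1; });
}

var str = "Hello, how is it going. This is the bus we have to take!";
console.log(sentencesContaining(str, "hello")); // ["Hello, how is it going."]
console.log(sentencesContaining(str, "bus"));   // ["This is the bus we have to take!"]
```

Since the word is compared as a plain string here, it does not need to be escaped for use in a regex, but it also matches inside longer words.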

Note that to check for a whole word you may use the /\bhello\b/i or /(?:^|\s)hello(?!\S)/i regexps, depending on further requirements.
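To see how the two whole-word patterns differ, here is a small comparison on a few sample strings (the samples and the helper name check are made up for illustration). The \b version accepts a word that touches punctuation, while the whitespace version requires whitespace or the start/end of the string on both sides.

```javascript
// \b matches between a word character and a non-word character (or string edge).
var wordBoundary = /\bhello\b/i;
// This one requires start-of-string or whitespace before, and no
// non-whitespace character after.
var whitespaceBoundary = /(?:^|\s)hello(?!\S)/i;

// Hypothetical helper: reports how each pattern treats a string.
function check(s) {
  return { wordBoundary: wordBoundary.test(s), whitespaceBoundary: whitespaceBoundary.test(s) };
}

console.log(check("Hello, how is it going")); // { wordBoundary: true,  whitespaceBoundary: false } (comma follows)
console.log(check("Say hello to her"));       // { wordBoundary: true,  whitespaceBoundary: true }
console.log(check("Othello is a play"));      // { wordBoundary: false, whitespaceBoundary: false }
console.log(check("hello-world"));            // { wordBoundary: true,  whitespaceBoundary: false }
```

For the sentence in the question, the \b version is the one that finds "Hello", because the word is immediately followed by a comma.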
