Javascript RegEx Remove Multiple words from string

Using JavaScript. (Note: there is a similar post, but the OP requested Java; this one is for JavaScript.)

I'm trying to remove a list of words from an entire string without looping (preferably using regular expressions).

This is what I have so far. It removes some of the words but not all of them. Can someone help identify what I'm doing wrong with my RegEx function?

    // Remove all instances of the words in the array
    var removeUselessWords = function(txt) {
        var uselessWordsArray = [
            "a", "at", "be", "can", "cant", "could", "couldnt",
            "do", "does", "how", "i", "in", "is", "many", "much", "of",
            "on", "or", "should", "shouldnt", "so", "such", "the",
            "them", "they", "to", "us", "we", "what", "who", "why",
            "with", "wont", "would", "wouldnt", "you"
        ];

        var expStr = uselessWordsArray.join(" | ");
        return txt.replace(new RegExp(expStr, 'gi'), ' ');
    }

    var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";

    console.log(removeUselessWords(str));
    // The result should be: "person going walk park. person told need park."

Three points:

  • join the array items with | and no surrounding spaces
  • enclose the regex alternation in a parenthesized group (...|...)
  • add the word boundary \b on both sides so that only whole words match
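To see why the space-delimited pattern skips some words, here is a small sketch (the shortened word list and sample text are made up for illustration). Each alternative in the " | " version consumes the spaces around it, so the next word no longer has the leading space it needs. \b is zero-width, so neighbouring words do not compete for the same space.

```javascript
// Illustrative data only: a shortened word list and a sample sentence.
const words = ["is", "on", "a", "in", "the"];
const text = "person is on a walk in the park";

// Original approach: the pattern becomes "is | on | a | in | the",
// so most alternatives need a literal space on each side.
const spaced = new RegExp(words.join(" | "), "gi");
const spacedResult = text.replace(spaced, " ");

// Word-boundary approach: \b matches a position, not a character.
const bounded = new RegExp("\\b(" + words.join("|") + ")\\b", "gi");
const boundedResult = text.replace(bounded, " ").replace(/\s{2,}/g, " ");

console.log(spacedResult);  // "on" and "the" are still present
console.log(boundedResult); // "person walk park"
```

In the first result, "is " swallows the space that " on " needed, and " in " swallows the space that " the" needed, which is why only some words disappear.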

    var removeUselessWords = function(txt) {
        var uselessWordsArray = [
            "a", "at", "be", "can", "cant", "could", "couldnt",
            "do", "does", "how", "i", "in", "is", "many", "much", "of",
            "on", "or", "should", "shouldnt", "so", "such", "the",
            "them", "they", "to", "us", "we", "what", "who", "why",
            "with", "wont", "would", "wouldnt", "you"
        ];

        // Builds \b(a|at|...|you)\b; the backslash is doubled inside a string literal
        var expStr = uselessWordsArray.join("|");
        return txt.replace(new RegExp('\\b(' + expStr + ')\\b', 'gi'), ' ')
                  .replace(/\s{2,}/g, ' ');   // collapse the runs of whitespace left behind
    }

    var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";

    console.log(removeUselessWords(str));
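If the word list might ever contain regex metacharacters, or the function is called often, a variation along these lines may help. This is only a sketch: `escapeRegExp` and `makeWordRemover` are hypothetical helper names, not part of the answer above, and the sample words are invented.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical factory: build the pattern once, then reuse it for every call.
function makeWordRemover(words) {
  const pattern = new RegExp(
    "\\b(?:" + words.map(escapeRegExp).join("|") + ")\\b",
    "gi"
  );
  return function (txt) {
    return txt.replace(pattern, " ").replace(/\s{2,}/g, " ").trim();
  };
}

// Assumed sample data; "e.g" contains a dot that must be taken literally.
const removeWords = makeWordRemover(["a", "in", "the", "e.g"]);

console.log(removeWords("A walk in the park, e.g at noon")); // "walk park, at noon"
console.log(removeWords("egg in a pan"));                    // "egg pan"
```

Without the escaping step, the entry "e.g" would also match "egg", because an unescaped dot matches any character. Note that \b only behaves as expected when each word starts and ends with a letter, digit, or underscore.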

Maybe this is what you want:

    // Remove all instances of the words in the array
    var removeUselessWords = function(txt) {
        var uselessWordsArray = [
            "a", "at", "be", "can", "cant", "could", "couldnt",
            "do", "does", "how", "i", "in", "is", "many", "much", "of",
            "on", "or", "should", "shouldnt", "so", "such", "the",
            "them", "they", "to", "us", "we", "what", "who", "why",
            "with", "wont", "would", "wouldnt", "you"
        ];

        // Joining alone leaves the first and last word without an outer \b,
        // so add one at each end: \ba\b|\bat\b|...|\byou\b
        var expStr = "\\b" + uselessWordsArray.join("\\b|\\b") + "\\b";
        return txt.replace(new RegExp(expStr, 'gi'), '').trim().replace(/ +/g, ' ');
    }

    var str = "The person is going on a walk in the park. The person told us to do what we need to do in the park";

    console.log(removeUselessWords(str));
    // Prints: "person going walk park. person told need park"
