
Remove an array of words from a given string in the most efficient way

I want the most efficient (in terms of speed) solution to remove an array of given words from a given string.

So far I have this solution, which does not work:

const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
    for (let a = 0; a < excludeWordList.length; a++) {
        speech = speech.replaceAll(excludeWordList[a], '');
    }
    console.log(speech);
}

As you can see, the code has three major issues:

  1. It removes characters inside words, not just whole words.

  2. The result is not trimmed, and it also has extra spaces between the remaining words.

  3. I don't think the code is the most efficient approach, because I need to loop through the whole excludeWordList array.
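The first two issues can be reproduced in isolation. This is a minimal sketch, not part of the original attempt: the name naiveRemove is hypothetical, and the function returns the string instead of logging it so the result can be inspected.

```javascript
const excludeWordList = ['the', 'in', 'a', 'an'];

// Same logic as the attempt above, but returning the result.
// naiveRemove is a hypothetical name used only for this illustration.
function naiveRemove(speech) {
  for (const word of excludeWordList) {
    // replaceAll with a plain string matches substrings, not whole words.
    speech = speech.replaceAll(word, '');
  }
  return speech;
}

// JSON.stringify makes the leading space visible.
console.log(JSON.stringify(naiveRemove('the wall into')));
// prints " wll to"
```

"wall" loses its "a", "into" loses its "in", and the space that followed "the" is left at the start of the string.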

I included my own function in the benchmark under the name my, and as the results show, the Gainza function is the most efficient in this case:

[Image: benchmark results]

I'd use a filter and Set approach to minimize computation time (instead of includes or indexOf, which iterate over the whole array):

const excluded = new Set(['the', 'in', 'a', 'an']);


function run(speech) {
    return speech.split(' ')
           .filter(word => !excluded.has(word))
           .join(' ');
}


run("the wall into")
run("paintings covered the wall another words into this")

Here's a way that doesn't loop: it replaces all excluded words with a single regex and finishes with a trim and a cleanup of extra spaces. This should be the fastest method. You can map the array into a regex and preserve word boundaries with \b; the backslash has to be escaped (\\b) when the pattern is built dynamically from a string.

const excludeWordList = ['the', 'in', 'a', 'an'];
// Build one alternation; \b is written as \\b inside the template string.
const reg = new RegExp(excludeWordList.map(w => `\\b${w}\\b`).join('|'), 'g');

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
    // Remove the words, trim the ends, then collapse runs of whitespace.
    speech = speech.replaceAll(reg, '').trim().replace(/\s\s+/g, ' ');
    console.log(speech);
}
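If the excluded words can contain characters that are special in a regex (a dot, a plus sign, and so on), they need escaping before they are joined into the pattern. The sketch below assumes that case; escapeRegExp, buildExcludeRegex and removeWords are hypothetical helper names, and the word 'a.m' is only an illustration.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build one alternation with word boundaries.
function buildExcludeRegex(words) {
  return new RegExp(words.map(w => `\\b${escapeRegExp(w)}\\b`).join('|'), 'g');
}

// Remove the matches, collapse whitespace, and trim the ends.
function removeWords(speech, regex) {
  return speech.replace(regex, '').replace(/\s+/g, ' ').trim();
}

const reg = buildExcludeRegex(['the', 'in', 'a.m']);

console.log(removeWords('the meeting is at 9 a.m in the hall', reg));
// prints "meeting is at 9 hall"

// Without escaping, the dot in 'a.m' would also match "arm".
console.log(removeWords('my arm hurts', reg));
// prints "my arm hurts"
```

Note that \b only works as expected when the word starts and ends with a word character (letter, digit, or underscore); a word such as 'c++' would need a different boundary check.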

const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
    const result = speech.split(' ')
        .filter(word => !excludeWordList.includes(word))
        .join(' ');
    console.log(result);
}

You can try using split() and filter().

Edit: indexOf() has O(N) complexity because it performs a linear search. Since we have a fixed set of words to exclude, converting the array to a Set is ideal.

new Set() is also O(N), but it is done only once while run() will be called more often, so it makes sense here. With a Set, .has() has O(1) complexity.

const excludeWordList = ['the', 'in', 'a', 'an'];
const excludeWordSet = new Set(excludeWordList);

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
    speech = speech.split(' ')
        .filter((a) => !excludeWordSet.has(a))
        .join(' ');
    console.log(speech);
}

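filter() still has O(N) complexity in the number of words, and join() is O(N) as well. These steps run one after another rather than nested, so with a Set the whole function is O(N), not O(N^2). The initial attempt, and the includes() or indexOf() variants, cost roughly O(N·M) for M excluded words.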
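One rough way to see how the lookup choice affects running time is to measure both variants on the same input. The sketch below is only an illustration: the list sizes and the generated words (stop0, stop1, ...) are arbitrary assumptions chosen to make the difference measurable, and the timings will vary by machine and engine.

```javascript
// Hypothetical sizes and generated words, for illustration only.
const excludeWordList = Array.from({ length: 1000 }, (_, i) => `stop${i}`);
const excludeWordSet = new Set(excludeWordList);
const speech = Array.from({ length: 20000 }, (_, i) => `stop${i % 2000}`).join(' ');

// Linear search in the array for every word.
function withIncludes(text) {
  return text.split(' ').filter(w => !excludeWordList.includes(w)).join(' ');
}

// Hash lookup in the Set for every word.
function withSet(text) {
  return text.split(' ').filter(w => !excludeWordSet.has(w)).join(' ');
}

// Run fn once on the sample input and log the elapsed time.
function time(label, fn) {
  const start = Date.now();
  const result = fn(speech);
  console.log(label, Date.now() - start, 'ms');
  return result;
}

const viaIncludes = time('includes:', withIncludes);
const viaSet = time('Set.has: ', withSet);
console.log('same output:', viaIncludes === viaSet);
```

With only four excluded words, as in the question, the difference between the two lookups is likely to be negligible; it grows with the size of the exclude list.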

The problem with a solution that uses whitespace as the word separator is that it will most likely fail when the text contains punctuation, for example:

'the, Beattles'.split(' ').includes('the')
//=> false

Instead you should use \b (word boundary):

const excludes = ['the', 'in', 'a', 'an'];
// \b is written as \\b inside a string literal.
const re = new RegExp('\\b(?:' + excludes.join('|') + ')\\b', 'g');

console.log("the.wall.into".replace(re, ''));
console.log("paintings covered the, wall another words into this".replace(re, ''));
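The word-boundary regex can be combined with the whitespace cleanup from the earlier answer. The following is a sketch under two assumptions that are not in the original: the 'i' flag is added so that capitalized forms such as "The" are removed too, and removeWords is a hypothetical wrapper name.

```javascript
const excludes = ['the', 'in', 'a', 'an'];
// Assumption: the 'i' flag is added so "The", "An", etc. are removed as well.
const re = new RegExp('\\b(?:' + excludes.join('|') + ')\\b', 'gi');

// Hypothetical wrapper: remove the words, then tidy the whitespace.
function removeWords(speech) {
  return speech
    .replace(re, '')        // drop whole-word matches only
    .replace(/\s+/g, ' ')   // collapse runs of whitespace
    .trim();                // drop leading and trailing spaces
}

console.log(removeWords('The paintings covered the wall in an old house'));
// prints "paintings covered wall old house"

// Punctuation that followed a removed word is left in place.
console.log(removeWords('paintings covered the, wall'));
// prints "paintings covered , wall"
```

Whether stray punctuation like the comma in the second call should also be removed depends on the use case, so it is left untouched here.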
