Remove an array of words from a given string in the most efficient way
I want the most efficient (in terms of speed) solution to remove an array of given words from a given string.
So far I have this non-working solution:
const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into");
run("paintings covered the wall another words into this");

function run(speech) {
  for (let a = 0; a < excludeWordList.length; a++) {
    speech = speech.replaceAll(excludeWordList[a], '');
  }
  console.log(speech);
}
As you can see, the code has three major issues:
1. It removes characters inside words, not just whole words.
2. The result is not trimmed, and extra spaces are left between the words of the result.
3. I don't think the code is the most efficient approach, because it has to loop through the whole excludeWordList array.
I wrote my function as my, and as you can see, the Gainza function is the most efficient in this case.
I'd use a filter and Set approach to minimize computation time (instead of includes or indexOf, which iterate over the whole array):
const excluded = new Set(['the', 'in', 'a', 'an']);
function run(speech) {
return speech.split(' ')
.filter(word => !excluded.has(word))
.join(' ');
}
run("the wall into")
run("paintings covered the wall another words into this")
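The Set-based filter above splits on a single space, so repeated spaces or leading and trailing whitespace would leave empty strings in the result. Below is a minimal sketch of a variant, assuming words may be separated by arbitrary whitespace and that matching should be case-insensitive. Both are assumptions and not part of the original answer, and the names removeWords and excludedWords are hypothetical.

```javascript
// Hypothetical variant of the Set-based filter; names are illustrative only.
const excludedWords = new Set(['the', 'in', 'a', 'an']);

function removeWords(speech) {
  return speech
    .trim()                // drop leading/trailing whitespace
    .split(/\s+/)          // split on runs of whitespace, not a single space
    .filter(word => word !== '' && !excludedWords.has(word.toLowerCase()))
    .join(' ');
}

console.log(removeWords('  The wall   into '));
console.log(removeWords('paintings covered the wall another words into this'));
```

Lowercasing each word before the lookup keeps the Set small. Whether case should be ignored at all depends on the input.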
Here's a way that doesn't loop: it replaces all excluded words with a single regex and finishes with a trim and extra-space cleanup. This should be the fastest method. You can map the array into a regex and preserve word boundaries with \b, whose backslash has to be escaped (written as \\b) when building the pattern dynamically from a string:
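const excludeWordList = ['the', 'in', 'a', 'an'];
// Build one alternation such as \bthe\b|\bin\b|... (backslashes are doubled inside the template literal)
const reg = new RegExp(excludeWordList.map(w => `\\b${w}\\b`).join('|'), 'g');

run("the wall into");
run("paintings covered the wall another words into this");

function run(speech) {
  // Remove excluded words, trim the ends, then collapse runs of whitespace
  speech = speech.replaceAll(reg, '').trim().replace(/\s\s+/g, ' ');
  console.log(speech);
}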
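If the excluded words can contain characters that are special in a regular expression (a dot or a plus sign, for example), they would need escaping before being joined into the pattern. Below is a sketch under that assumption. The helpers escapeRegExp, buildExcludeRegex and removeWords are hypothetical names and not part of the original answer, and the entry 'e.g' is added only to illustrate escaping.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a single pattern of the form \b(?:word1|word2|...)\b
function buildExcludeRegex(words) {
  const alternatives = words.map(escapeRegExp).join('|');
  return new RegExp(`\\b(?:${alternatives})\\b`, 'g');
}

// Remove matches, collapse whitespace runs, then trim the ends.
function removeWords(speech, regex) {
  return speech.replace(regex, '').replace(/\s+/g, ' ').trim();
}

const regex = buildExcludeRegex(['the', 'in', 'a', 'an', 'e.g']);
console.log(removeWords('the wall into', regex));
console.log(removeWords('paintings covered the wall another words into this', regex));
```

Building the regex once and reusing it across calls avoids recompiling the pattern on every invocation.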
const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into");
run("paintings covered the wall another words into this");

function run(speech) {
  const result = speech.split(' ').filter(word => !excludeWordList.includes(word)).join(' ');
  console.log(result);
}
You can try using split() and filter().
Edit: indexOf() has O(N) complexity because it performs a linear search. Since we have a fixed set of words to exclude, converting the array to a Set is ideal. new Set() is also O(N), but since it is done only once and your run() will be called more often, it makes sense here. With a Set, .has() has O(1) complexity.
const excludeWordList = ['the', 'in', 'a', 'an'];
const excludeWordSet = new Set(excludeWordList);

run("the wall into");
run("paintings covered the wall another words into this");

function run(speech) {
  speech = speech.split(' ').filter((a) => !excludeWordSet.has(a)).join(' ');
  console.log(speech);
}
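The difference between the array lookup and the Set lookup can be checked with a rough timing sketch like the one below. The input size and the helper names withIncludes, withSet and time are arbitrary assumptions, and timings vary by engine. With only four excluded words the gap is likely small; the Set matters more as the exclusion list grows.

```javascript
// Rough, illustrative comparison; not a rigorous benchmark.
const excludeWordList = ['the', 'in', 'a', 'an'];
const excludeWordSet = new Set(excludeWordList);

const withIncludes = s => s.split(' ').filter(w => !excludeWordList.includes(w)).join(' ');
const withSet = s => s.split(' ').filter(w => !excludeWordSet.has(w)).join(' ');

// Build a long input by repeating a sample sentence (size chosen arbitrarily).
const sample = Array(10000)
  .fill('paintings covered the wall another words into this')
  .join(' ');

function time(label, fn) {
  const start = Date.now();
  const result = fn(sample);
  console.log(label, Date.now() - start, 'ms');
  return result;
}

const a = time('includes:', withIncludes);
const b = time('Set.has: ', withSet);
console.log('same output:', a === b);
```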
filter() still has O(N) complexity, and join() has O(N), so this is still O(N^2), same as your initial attempt.
The problem with a solution that uses whitespace as the word separator is that it will most likely fail when the text contains punctuation, for example:
'the, Beattles'.split(' ').includes('the')
//=> false
Instead, you should use \b (word boundary):
const excludes = ['the', 'in', 'a', 'an'];
// \\b in the string literal becomes \b (word boundary) in the compiled pattern
const re = new RegExp('\\b(?:' + excludes.join('|') + ')\\b', 'g');

console.log("the.wall.into".replace(re, ''));
console.log("paintings covered the, wall another words into this".replace(re, ''));