
How can we filter the elements of an array using an array of regexes in JavaScript?

Let's say I have two arrays: one holds the regexes and the other holds the input. What, then, is the best way, in terms of performance and readability, to produce something like the output below?

var regex = [
    '/rat/',
    '/cat/',
    '/dog/',
    '/[1-9]/'
]

var texts = [
    'the dog is hiding',
    'cat',
    'human',
    '1'
]

The end result should be:

result = [
    'human'
]

Well, what I was thinking was to do something with reduce:

// loop by text
for (var i = texts.length - 1; i >= 0; i--) {
    // loop by regex
    texts[i] = regex.reduce(function (previousValue, currentValue) {
        var filterbyRegex = new RegExp("\\b" + currentValue + "\\b", "g");  
        if (previousValue.toLowerCase().match(filterbyRegex)) {
            delete texts[i];
        };
        return previousValue;
    }, texts[i]);
}

But that isn't very readable, is it? Maybe there is another way that I haven't thought of.

I would probably go with something like this:

var regexs = [
    /rat/i,
    /cat/i,
    /dog/i,
    /[1-9]/i
]

var texts = [
    'the dog is hiding',
    'cat',
    'human',
    '1'
]

var goodStuff = texts.filter(function (text) {
    return !regexs.some(function (regex) {
         return regex.test(text);
    });
});

But realistically, the performance differences are negligible here unless you are doing it 10,000 times.

Please note that this uses ES5 methods, which are easily shimmable (I made up a word, I know).
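To illustrate the "shimmable" point, here is a rough sketch of loop-based stand-ins for the two ES5 methods used above. The names someOf and filterOf are made up for this example, and they are simplified helpers rather than spec-complete polyfills (no thisArg, no sparse-array handling).

```javascript
// Hypothetical stand-ins for Array.prototype.some and Array.prototype.filter,
// written with plain loops so they do not depend on ES5 array methods.
function someOf(array, predicate) {
    for (var i = 0; i < array.length; i++) {
        if (predicate(array[i], i, array)) {
            return true; // stop at the first match, as the native method does
        }
    }
    return false;
}

function filterOf(array, predicate) {
    var kept = [];
    for (var i = 0; i < array.length; i++) {
        if (predicate(array[i], i, array)) {
            kept.push(array[i]);
        }
    }
    return kept;
}

var shimPatterns = [/rat/i, /cat/i, /dog/i, /[1-9]/i];
var shimTexts = ['the dog is hiding', 'cat', 'human', '1'];

// Keep only the texts that match none of the patterns.
var shimResult = filterOf(shimTexts, function (text) {
    return !someOf(shimPatterns, function (pattern) {
        return pattern.test(text);
    });
});
// shimResult is ['human']
```

On any engine that already has the native methods, the filter/some version above is the one to use; this only shows that nothing in the approach depends on ES5.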

Here's my solution:

var words = [ 'rat', 'cat', 'dog', '[1-9]' ];

var texts = [ ... ];

// normalise (and compile) the regexps just once
var regex = words.map(function(w) {
    return new RegExp('\\b' + w + '\\b', 'i');
});

// nested .filter calls: keep only the texts that
// match none of the regexes in the list
texts = texts.filter(function(t) {
    return regex.filter(function(re) {
        return re.test(t);
    }).length === 0;
});

http://jsfiddle.net/SPAKK/

You clearly have to process the texts array element by element. However, you could combine your regexps into a single one by joining them with '|'.

The entries in the regexps array you show are actually simple strings. I would remove the leading and trailing / characters and then construct a single regexp. Something like:

function reduce (texts, re) {
  re = new RegExp (re.join ('|'));
  for (var r = [], t = texts.length; t--;)
    !re.test (texts[t]) && r.unshift (texts[t]);
  return r;
}

alert (reduce (['the dog is hiding', 'cat', 'human', '1'], ['rat', 'cat', 'dog', '[1-9]']))
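The call above passes bare pattern strings. If the input still has the '/rat/' form from the question, the slashes could be stripped first. A minimal sketch, where stripSlashes is a hypothetical helper name and the patterns are assumed to carry no flags after the closing slash:

```javascript
// Hypothetical helper: removes one leading and one trailing "/" when both are present.
function stripSlashes(source) {
    var match = /^\/(.*)\/$/.exec(source);
    return match ? match[1] : source;
}

var slashed = ['/rat/', '/cat/', '/dog/', '/[1-9]/'];

// Join the bare patterns into a single alternation.
var combined = new RegExp(slashed.map(stripSlashes).join('|'));
// combined.source is 'rat|cat|dog|[1-9]'

// Keep only the texts the combined regexp does not match.
var kept = ['the dog is hiding', 'cat', 'human', '1'].filter(function (text) {
    return !combined.test(text);
});
// kept is ['human']
```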

Be aware that if your re strings contain RegExp special characters like . { [ ^ $ etc., you will need to escape them, either in the strings themselves or by processing them in the function.
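One way to do the escaping inside the function is a small helper like the one below. This is a sketch: escapeRegExp is a hypothetical name, and it only makes sense when the strings are meant as literal words. A string such as '[1-9]' that is intended as a pattern must not be escaped.

```javascript
// Hypothetical helper: backslash-escapes every character that has a
// special meaning in a regular expression, so the text matches literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var literalWords = ['c.t', 'a+b', '$5'];
var literalRegex = new RegExp(literalWords.map(escapeRegExp).join('|'));

var literalKept = ['cat', 'c.t', 'a+b', 'aab', 'costs $5', 'human'].filter(function (text) {
    return !literalRegex.test(text);
});
// literalKept is ['cat', 'aab', 'human']
```

Without the escaping, 'c.t' would also match 'cat' and 'a+b' would also match 'aab'.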

See jsfiddle: http://jsfiddle.net/jstoolsmith/D3uzW/

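Just an idea: merge the regex array into one new regex, and join the second array into one new string with the values separated by a delimiter such as @ or #, then use the regex to replace the matching parts.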
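One possible reading of that idea, as a sketch only. It assumes the delimiter '#' never occurs in any text or pattern, and it also drops texts that were empty strings to begin with. All variable names here are made up for the example.

```javascript
var ideaPatterns = ['rat', 'cat', 'dog', '[1-9]'];
var ideaTexts = ['the dog is hiding', 'cat', 'human', '1'];
var SEP = '#'; // assumption: never appears in the texts or the patterns

// Matches a whole delimited item that contains at least one of the patterns.
var itemRegex = new RegExp(
    '[^' + SEP + ']*(?:' + ideaPatterns.join('|') + ')[^' + SEP + ']*',
    'g'
);

// Join everything into one string, blank out the matching items,
// then split again and discard the blanks.
var ideaResult = ideaTexts
    .join(SEP)
    .replace(itemRegex, '')
    .split(SEP)
    .filter(function (text) {
        return text !== '';
    });
// ideaResult is ['human']
```

This trades the nested loops for a single replace call, but the delimiter assumption makes it more fragile than the filter-based answers.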

