Javascript dynamic regex

After looking here I came up with a pattern to test an array of words against a string.

$.each(data, function(index, val) {
    var pattern = new RegExp('?:^|\s'+ val + '?=\s|$', 'g');
    console.log(pattern.test(comment));
    if (!pattern.test(comment)) {                           
           yay = true;
         }
});

The problem here is that it returns true all the time. Any suggestions? Thanks!

Fix this line:

var pattern = new RegExp('(?:^|\\s)' + val + '(?=\\s|$)', 'g');
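One detail worth checking when a pattern is built from a string: inside a JavaScript string literal, '\s' is just the letter 's', so the backslash has to be doubled for the regex to see \s. The following is a minimal sketch of that line in isolation; the helper name `containsWord` and the sample sentences are made up for illustration and are not from the original code.

```javascript
// Hypothetical helper: true if `word` appears in `text` delimited by
// whitespace or by the start/end of the string.
function containsWord(text, word) {
  // '\\s' in the string literal becomes \s in the compiled regex.
  // The 'g' flag is left out here because only a yes/no answer is needed.
  var pattern = new RegExp('(?:^|\\s)' + word + '(?=\\s|$)');
  return pattern.test(text);
}

console.log(containsWord('this is bad stuff', 'bad'));  // true
console.log(containsWord('this is badly done', 'bad')); // false
console.log('\s' === 's');                              // true: the single backslash is dropped
```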

You may also find it useful to debug/validate your regex using this online utility:

http://gskinner.com/RegExr/

You will need to replace your val variable with a sample value in order to debug.

From your JsFiddle, I forked and created one of my own, and your solution (with my regular expression from the comments) works quite well once all the minor typos are cleared up. However, it could be much, much cleaner and faster. Here's what I did differently:

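$('#send-btn').on('click', function() {
    $('#error').hide();
    // `list` is the array of words to look for, defined elsewhere in the fiddle.
    // One pattern covers every word: \b(word1|word2|...)\b, case-insensitive.
    var pattern = new RegExp('\\b(' + list.join('|') + ')\\b', 'i');
    var comment = $('#comment').val();
    if (pattern.test(comment)) {
        $('#error').show();
    }
});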

Specifically, the pattern I generated takes advantage of Javascript's built-in Array.join, which pastes an Array of strings together with a prescribed separator string. This builds a string with all of your search words separated by the regular expression alternation operator ( | ). Then, by surrounding that group with parentheses to contain the alternation, I can apply the word boundary assertion ( \b, written as '\\b' inside a string literal) to either end to make sure we're matching only entire words.

In other news: You really don't need the g (global) modifier if you're just doing a simple test. You may need it in other applications - such as if you wanted to highlight the offending word - but for this I dropped it. You SHOULD be using the i modifier for case-insensitive behaviour.
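To make the joined pattern and the effect of the g modifier concrete, here is a small sketch that runs without jQuery. The word list and the sample strings are placeholders chosen for illustration, not the list from the fiddle.

```javascript
// Placeholder word list (an assumption; the real list lives in the fiddle).
var list = ['foo', 'bar', 'baz'];

// Join the words with | and wrap the group in word boundaries.
var pattern = new RegExp('\\b(' + list.join('|') + ')\\b', 'i');
console.log(pattern.source);                // \b(foo|bar|baz)\b
console.log(pattern.test('Some FOO here')); // true (case-insensitive)
console.log(pattern.test('foobar'));        // false (not a whole word)

// With the g flag, test() remembers where it stopped via lastIndex,
// so repeated calls on the same regex object can alternate results.
var globalPattern = new RegExp('\\b(' + list.join('|') + ')\\b', 'gi');
console.log(globalPattern.test('foo')); // true, lastIndex is now 3
console.log(globalPattern.test('foo')); // false, the search resumed at index 3
```

That statefulness is one reason calling test() twice in a row on a g-flagged pattern, as in the question's loop, can give surprising answers.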

The biggest upside to this is that you could, if you wanted to, choose to define your regular expression outside this function, and you'll see pretty significant speed gains.
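A rough sketch of what defining the expression outside the function could look like, assuming the word list does not change after page load. The names `list` (with placeholder words) and `hasListedWord` are illustrative only.

```javascript
// Placeholder word list (an assumption for this sketch).
var list = ['foo', 'bar', 'baz'];

// Compiled once, when the script loads, instead of on every click.
var pattern = new RegExp('\\b(' + list.join('|') + ')\\b', 'i');

// Hypothetical check that a click handler could call with the comment text.
function hasListedWord(comment) {
  // Reusing the same regex object is safe here because it has no g flag,
  // so test() keeps no state between calls.
  return pattern.test(comment);
}

console.log(hasListedWord('nothing to see'));  // false
console.log(hasListedWord('there is a bar'));  // true
console.log(hasListedWord('there is a bar'));  // true again on a repeat call
```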

Downside: There are diminishing returns as your list of foul words gets longer. But given this benchmark, it'll be a while before your way is better (a long while).

NOTE

You should be made aware that you ought to escape your words before you use them in a regular expression. In your list, for instance, any word containing a character such as . or $ is read as regex syntax rather than as literal text, so it can match strings you never intended. A false match on gibberish may be harmless, but you can easily see how such a problem could extrapolate into finding profanity where there is none. However, you may choose to do this outside the function, perhaps even leveraging the power of regular expressions in your word definitions (define '[a@][$s]{2}' instead of 'ass', '@ss', 'a$s', 'as$', '@$s', '@s$', 'a$$', and '@$$'), so I'm not going to address that here.
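For readers who do want plain-text words treated literally, one common approach is to escape the regex metacharacters before joining. The sketch below is not part of the original answer; `escapeRegExp` is a hypothetical helper name and the sample words are made up.

```javascript
// Hypothetical helper: put a backslash in front of every regex
// metacharacter so the word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Sample words containing metacharacters (illustrative only).
var words = ['a.b', 'c+d'];

var escaped = new RegExp('\\b(' + words.map(escapeRegExp).join('|') + ')\\b', 'i');
var unescaped = new RegExp('\\b(' + words.join('|') + ')\\b', 'i');

console.log(escapeRegExp('a.b'));        // a\.b
console.log(escaped.test('x a.b y'));    // true: the literal word is found
console.log(escaped.test('x axb y'));    // false: . no longer matches any character
console.log(unescaped.test('x axb y'));  // true: unescaped . matches the x
```

Note that \b only behaves as a whole-word check when an entry starts and ends with a letter, digit, or underscore, so entries that begin or end with symbols need a different delimiter strategy.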

Good luck, and happy regexing.
