
RegEx - Exclude Matched Patterns

I have the patterns below, which I want to exclude.

make it cheaper
make it cheapere
makeitcheaper.com.au
makeitcheaper
making it cheaper
www.make it cheaper
ww.make it cheaper.com

I've created a regex to match any of these. However, I want to get everything other than these. I am not sure how to invert the regex I've created.

mak(e|ing) ?it ?cheaper

The above pattern matches all the strings listed. Now I want it to match everything else. How do I do it?

From searching, it seems I need something like a negative lookahead / lookbehind. But I don't really get it. Can someone point me in the right direction?

You can just put it in a negative look-ahead like so:

(?!mak(e|ing) ?it ?cheaper)

Just like that, it isn't going to work, though. If you do a matches (see footnote 1), it won't match, since a look-ahead only checks what follows and doesn't actually consume anything. If you do a find (see footnote 1), it will match many times, since a match can start at any of the many positions in the string where the following characters don't match the pattern above.
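To make the problem concrete, here is a minimal Java sketch of both behaviours. The class name BareLookaheadDemo and the helper countFinds are hypothetical names chosen for illustration; only java.util.regex is assumed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class BareLookaheadDemo {
    // The bare negative look-ahead from above, with nothing around it.
    static final Pattern BARE = Pattern.compile("(?!mak(e|ing) ?it ?cheaper)");

    // Counts how many (empty) matches find() reports for the input.
    static int countFinds(String input) {
        Matcher m = BARE.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // matches() must consume the whole input, but a look-ahead consumes nothing.
        System.out.println(BARE.matcher("something else").matches()); // false
        // find() succeeds at every position where the phrase does not start,
        // even though this input is one of the strings we wanted to exclude.
        System.out.println(countFinds("make it cheaper") > 1); // true
    }
}
```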

To fix this, depending on what you wish to do, we have 2 choices:

  1. If you want to exclude all strings that are exactly one of those (i.e. "make it cheaperblahblah" is not excluded), check for the start ( ^ ) and end ( $ ) of the string:

     ^(?!mak(e|ing) ?it ?cheaper$).* 

    The .* (zero or more wild-cards) is what does the actual matching. The negative look-ahead is checked from the first character.

  2. If you want to exclude all strings containing one of those, you can make sure the look-ahead doesn't match before each character we consume:

     ^((?!mak(e|ing) ?it ?cheaper).)*$ 

    An alternative is to add wild-cards to the beginning of your look-ahead (i.e. exclude all strings that, from the start of the string, contain anything followed by your pattern), but I don't currently see any advantage to this (arbitrary-length look-ahead is also less likely to be supported by any given tool):

     ^(?!.*mak(e|ing) ?it ?cheaper).* 

Because of the ^ and $ , either a find or a matches will work for any of the above (though, in the case of matches , the ^ is optional and, in the case of find , the .* outside the look-ahead is optional).
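The difference between the two choices can be checked with a short Java sketch. The class ExcludeDemo and the method accepts are hypothetical names for illustration; the three patterns are the ones given above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class ExcludeDemo {
    // Choice 1: reject only strings that are exactly the phrase.
    static final Pattern EXACT =
            Pattern.compile("^(?!mak(e|ing) ?it ?cheaper$).*");
    // Choice 2: reject any string that contains the phrase.
    static final Pattern CONTAINS =
            Pattern.compile("^((?!mak(e|ing) ?it ?cheaper).)*$");
    // Alternative form of choice 2, with the wild-cards inside the look-ahead.
    static final Pattern CONTAINS_ALT =
            Pattern.compile("^(?!.*mak(e|ing) ?it ?cheaper).*");

    // True when the whole string is accepted, i.e. it is NOT excluded.
    static boolean accepts(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts(EXACT, "make it cheaper"));            // false
        System.out.println(accepts(EXACT, "make it cheaperblahblah"));    // true
        System.out.println(accepts(CONTAINS, "make it cheaperblahblah")); // false
        System.out.println(accepts(CONTAINS, "www.make it cheaper"));     // false
        System.out.println(accepts(CONTAINS_ALT, "something else"));      // true
        // Thanks to the anchors, find() gives the same verdict as matches().
        System.out.println(CONTAINS.matcher("makeitcheaper.com.au").find()); // false
    }
}
```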


1: Although they may not be called that, many languages have functions equivalent to matches and find for regex.


The above is the strictly-regex answer to this question.

A better approach might be to stick to the original regex ( mak(e|ing) ?it ?cheaper ) and see if you can negate the matches directly with the tool or language you're using.

In Java, for example, this would involve doing if (!string.matches(originalRegex)) (note the ! , which negates the returned boolean) instead of if (string.matches(negLookRegex)) .
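As a rough sketch of that approach in Java: note that String.matches requires the whole string to match, so it only excludes strings that are exactly the phrase. To exclude strings that merely contain it, this sketch assumes Matcher.find is what you want instead. The class NegateDemo and the method keepUnrelated are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
final class NegateDemo {
    // The original, un-negated regex from the question.
    static final Pattern ORIGINAL = Pattern.compile("mak(e|ing) ?it ?cheaper");

    // Keeps only the inputs that do NOT contain the phrase anywhere.
    static List<String> keepUnrelated(List<String> inputs) {
        return inputs.stream()
                .filter(s -> !ORIGINAL.matcher(s).find()) // negate in code, not in the regex
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList(
                "make it cheaper", "makeitcheaper.com.au", "cheap flights");
        System.out.println(keepUnrelated(inputs)); // [cheap flights]
        // String.matches is a whole-string match, so a near miss is not excluded:
        System.out.println(!"make it cheapere".matches("mak(e|ing) ?it ?cheaper")); // true
    }
}
```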

A negative lookahead, I believe, is what you're looking for. Maybe try:

(?!.*mak(e|ing) ?it ?cheaper)

And maybe a bit more flexible:

(?!.*mak(e|ing) *it *cheaper)

Just in case there is more than one space.
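A small Java sketch of the difference between the two spacing variants. To get a usable whole-string test, this assumes the look-ahead is anchored with ^ and followed by .* , which is an addition not present in the patterns as written in this answer. The class FlexibleSpacesDemo and the method isKept are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class FlexibleSpacesDemo {
    // Assumption: ^ and a trailing .* are added so matches() can be used.
    static final Pattern ONE_SPACE =
            Pattern.compile("^(?!.*mak(e|ing) ?it ?cheaper).*");
    static final Pattern MANY_SPACES =
            Pattern.compile("^(?!.*mak(e|ing) *it *cheaper).*");

    // True when the string is kept, i.e. it does not contain the phrase.
    static boolean isKept(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String doubleSpaced = "make  it cheaper"; // two spaces after "make"
        System.out.println(isKept(ONE_SPACE, doubleSpaced));   // true: slips through
        System.out.println(isKept(MANY_SPACES, doubleSpaced)); // false: excluded
    }
}
```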
