REGEX for search and exclude combined

Overview:

I am trying to combine two REGEX queries into one:

  • \d+\.\d+\.\d+\.\d+
  • ^(?!(10\.|169\.)).*$

I wrote this as a two-part query. The first part isolates the IPs in a block of text. After copying and pasting those results, I use the second part to select everything that does not begin with 10 or 169.

Questions:

It seems like I am overcomplicating this:

  • Can anybody see a better way to do this?
  • Is there a way to combine these two queries?

Sure. Just put the anchored negative lookahead at the start:

^(?!10\.|169\.)\d+\.\d+\.\d+\.\d+$

Note: Unnecessary brackets have been removed.
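As a rough illustration of the anchored version, here is a minimal Python sketch. The sample strings are made up for the demonstration, and it assumes each candidate has already been isolated as its own string:

```python
import re

# Anchored pattern from the answer: the whole string must look like an IP
# whose first chunk is not 10 or 169.
ANCHORED = re.compile(r"^(?!10\.|169\.)\d+\.\d+\.\d+\.\d+$")

# Hypothetical sample candidates, one IP-like string each.
candidates = ["10.0.0.1", "169.254.1.1", "192.168.1.5", "8.8.8.8", "100.1.2.3"]

kept = [ip for ip in candidates if ANCHORED.match(ip)]
print(kept)
```

Note that 100.1.2.3 is kept: the lookahead only rejects a first chunk that is exactly 10 or 169, because the dot is part of each alternative.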


To match within a line, remove the anchors and use a "word boundary" \b as the anchor instead:

\b(?!10\.|169\.)\d+\.\d+\.\d+\.\d+
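The word-boundary version can be tried on a block of text in one pass. This is only a sketch in Python; the sample text is invented for the demonstration:

```python
import re

# Unanchored pattern from the answer: \b replaces ^ so it can match mid-line.
WITHIN_LINE = re.compile(r"\b(?!10\.|169\.)\d+\.\d+\.\d+\.\d+")

# Hypothetical block of text containing several IP-like strings.
text = "gw 10.0.0.1, dns 8.8.8.8, link-local 169.254.1.1, host 192.168.1.5"

found = WITHIN_LINE.findall(text)
print(found)
```

The excluded addresses do not leak partial matches here, because what remains after the first chunk (for example 0.0.1) has only three chunks.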

A quick-and-gimme-regex style answer

Basic one (whole string looks like an IP): ^\d+\.\d+\.\d+\.\d+$

Lite (4 period-separated digit chunks, as a whole word): \b\d+\.\d+\.\d+\.\d+\b

Medium (excluding junk like 1.2.4.6.7.9.0): (?<!\d\.)\b\d+\.\d+\.\d+\.\d+\b(?!\.\d+)

Advanced 1 (not starting with 10 or 169): (?<!\d\.)\b(?!(?:1(?:0|69))\.)\d+\.\d+\.\d+\.\d+\b(?!\.\d+)

Advanced 2 (not ending with 8 or 10): (?<!\d\.)\b\d+\.\d+\.\d+\.(?!(?:8|10)\b)\d+\b(?!\.\d+)

Details for the curious

The \b is a word boundary that makes it possible to match exact "words" (entities consisting of [a-zA-Z0-9_] characters) inside a longer text. So, if we do not want to match 12.12.23.56 inside g12.12.23.56g, we use the Lite version.
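To see what the word boundaries change, the following Python sketch compares the bare pattern with the Lite one. The variable names and the second sample string are illustrative only:

```python
import re

# Same digit/dot pattern, without and with word boundaries.
BARE = re.compile(r"\d+\.\d+\.\d+\.\d+")
LITE = re.compile(r"\b\d+\.\d+\.\d+\.\d+\b")

embedded = "g12.12.23.56g"        # IP-like text glued to letters
standalone = "ip 12.12.23.56 ok"  # IP-like text as a separate word

bare_embedded = BARE.findall(embedded)
lite_embedded = LITE.findall(embedded)
lite_standalone = LITE.findall(standalone)
print(bare_embedded, lite_embedded, lite_standalone)
```

The bare pattern picks the digits out of g12.12.23.56g, while the Lite pattern finds nothing there: there is no boundary between g and 1, and no later starting point leaves four chunks that end on a boundary.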

The lookarounds, together with the word boundary, make it possible to further restrict the matches. (?<!\d\.) - a negative lookbehind - and (?!\.\d+) - a negative lookahead - will fail a match if the IP-resembling substring is preceded by a digit + . or followed by a . + digit. So, we do not match 12.12.34.56.78.90899-like entities with this regex. Choose the Medium regex for that case.
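The effect of the two lookarounds can be checked against the Lite pattern with a short Python sketch. The sample text is made up; Python's re module accepts this lookbehind because it has a fixed width of two characters:

```python
import re

LITE = re.compile(r"\b\d+\.\d+\.\d+\.\d+\b")
MEDIUM = re.compile(r"(?<!\d\.)\b\d+\.\d+\.\d+\.\d+\b(?!\.\d+)")

# Hypothetical text: one over-long dotted number and one real-looking IP.
text = "junk 12.12.34.56.78.90899 and real 12.12.23.56"

lite_hits = LITE.findall(text)
medium_hits = MEDIUM.findall(text)
print(lite_hits)
print(medium_hits)
```

Lite still takes the first four chunks of the long dotted number. Medium rejects every starting point inside it: the lookahead rejects the first one, and the lookbehind rejects the rest.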

Now, you need to restrict the matches to those that do not start with some numeric value. You need to make use of either a lookbehind or a lookahead. When choosing between a lookbehind and a lookahead solution, prefer the lookahead, because 1) it is less resource consuming, and 2) more flavors support it. Thus, to fail all matches where the first number of the IP is equal to 10 or 169, we can use a negative lookahead anchored after the leading word boundary: (?!(?:1(?:0|69))\.). The syntax is (?!...), and inside it we match either 1 followed by 0 and then a ., or 1 followed by 69 and then a .. Note that we could write (?!10\.|169\.), but then there is some redundant backtracking overhead, as the 1 part is repeated. Best practice is to "contract" alternations so that the beginning of each branch does not repeat, which makes the alternation group more linear. So, use the Advanced 1 regex version to get those IPs.
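The contracted and the plain alternation should accept exactly the same strings. The Python sketch below checks that on an invented sample; it says nothing about the performance difference, which is not measured here:

```python
import re

# Advanced 1 with the contracted alternation, and the same regex with the
# plain alternation written out.
ADV1 = re.compile(r"(?<!\d\.)\b(?!(?:1(?:0|69))\.)\d+\.\d+\.\d+\.\d+\b(?!\.\d+)")
ADV1_PLAIN = re.compile(r"(?<!\d\.)\b(?!10\.|169\.)\d+\.\d+\.\d+\.\d+\b(?!\.\d+)")

# Hypothetical sample: two excluded prefixes and three look-alikes.
text = "10.1.1.1 169.254.0.1 172.16.0.1 100.64.0.1 16.9.0.1"

contracted_hits = ADV1.findall(text)
plain_hits = ADV1_PLAIN.findall(text)
print(contracted_hits)
```

Addresses such as 100.64.0.1 and 16.9.0.1 survive, since the lookahead requires the dot right after 10 or 169.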

A similar case is the Advanced 2 regex, which gets the IPs that do not end with some value.
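A small Python sketch of Advanced 2, again on an invented sample, shows that the \b inside the lookahead is what limits the exclusion to a last chunk of exactly 8 or 10:

```python
import re

# Advanced 2: the lookahead sits before the last chunk and rejects it
# when it is exactly 8 or 10.
ADV2 = re.compile(r"(?<!\d\.)\b\d+\.\d+\.\d+\.(?!(?:8|10)\b)\d+\b(?!\.\d+)")

# Hypothetical sample addresses.
text = "8.8.8.8 8.8.4.4 192.168.1.10 192.168.1.100 192.168.1.80"

adv2_hits = ADV2.findall(text)
print(adv2_hits)
```

Last chunks such as 100 and 80 are kept, because no word boundary follows the 10 or the 8 in them.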
