Python - Regex to exclude certain lines in a file

Question

The IP file looks like this:

# 111.111.111.111     <= exclude starting with # 
112.112.112.112 1     <= exclude one which has 1 next to it after space(s)
113.113.113.113 2     <= exclude one which has 2 next to it after space(s)
114.114.114.114 3     <= print this 
115.115.115.115 4     <= print this and so on

My attempt so far:

ip = re.findall(r".*\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}[\s](?!1|2)", x)

This does not give me the correct IPs. I am a JS developer and would appreciate some help.

Answer 1

You can use a character class to match either 0 or a digit from 3 to 9 as the trailing number.

^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s+[03-9]$

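As a minimal sketch of how this pattern could be applied to the whole file content at once: the capture group around the address, the `data` string, and the variable names are assumptions added here for illustration. `re.MULTILINE` is needed so that `^` and `$` anchor at each line rather than only at the ends of the string.

```python
import re

# Sample content mirroring the file in the question (annotations removed).
data = """# 111.111.111.111
112.112.112.112 1
113.113.113.113 2
114.114.114.114 3
115.115.115.115 4"""

# Same pattern as above, with a capture group around the address so that
# findall() returns only the IP and not the trailing number.
pattern = re.compile(
    r"^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+[03-9]$",
    re.MULTILINE,
)

ips = pattern.findall(data)
print(ips)  # ['114.114.114.114', '115.115.115.115']
```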

If the trailing number can have more than one digit, you can use an alternation that also matches 2 or more digits.

^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s+(?:[03-9]|\d{2,})$

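To see what the alternation changes, here is a small sketch that filters lines one at a time. The sample lines with 22 and 10 are made up for illustration and are not part of the question's file.

```python
import re

# Pattern with the alternation: a single 0 or 3-9, or any number
# that has two or more digits.
PATTERN = re.compile(
    r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s+(?:[03-9]|\d{2,})$"
)

# Hypothetical sample lines, including multi-digit trailing numbers.
lines = [
    "112.112.112.112 1",
    "113.113.113.113 2",
    "114.114.114.114 3",
    "116.116.116.116 22",
    "117.117.117.117 10",
]

kept = [line for line in lines if PATTERN.match(line)]
print(kept)
# ['114.114.114.114 3', '116.116.116.116 22', '117.117.117.117 10']
```

A lone 1 or 2 is still rejected, while 22 and 10 pass because they are matched by the `\d{2,}` branch.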

Answer 2

How to return the lines that do not start with # and do not end with 1 or 2:

import re

with open("ip.txt") as infile:  # change it to your real file name
    for line in infile:
        ip = line.strip()
        match = re.match(r"#(.+)|(.+?)\s+[12]$", ip)
        if not match:
            print(ip)

Output

114.114.114.114 3
115.115.115.115 4
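The same idea can be wrapped in a small generator so it works on any iterable of lines, which makes it easy to try without a real file. This is only a sketch: the function name `kept_lines`, the `io.StringIO` stand-in for the file, the extra line ending in 12, and the choice to skip blank lines are assumptions added here, not part of the code above.

```python
import io
import re

# A line is excluded if it starts with '#' or ends with whitespace
# followed by a lone 1 or 2.
EXCLUDE = re.compile(r"#|.+?\s+[12]$")


def kept_lines(lines):
    """Yield stripped lines that are neither comments nor end in ' 1' / ' 2'."""
    for line in lines:
        text = line.strip()
        # Assumption: blank lines are skipped rather than printed.
        if text and not EXCLUDE.match(text):
            yield text


# io.StringIO stands in for open("ip.txt") in this sketch.
sample = io.StringIO(
    "# 111.111.111.111\n"
    "112.112.112.112 1\n"
    "113.113.113.113 2\n"
    "\n"
    "114.114.114.114 3\n"
    "115.115.115.115 4\n"
    "116.116.116.116 12\n"
)

result = list(kept_lines(sample))
print(result)
# ['114.114.114.114 3', '115.115.115.115 4', '116.116.116.116 12']
```

Because `[12]` must be immediately followed by the end of the line, a number such as 12 is not treated as an excluded 2.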

Answer 3

This regex may help:

^(?:\d{1,3}\.){3}\d{1,3}\s+([3-9]|\d{2,})\b

Note: because the '^' anchor is used, re.MULTILINE is needed so that it matches at the start of every line.

Comment: this option also handles numbers greater than 9. A tricky case is, for example, the 22 in '116.116.116.116 22'.
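As a rough check of this kind of pattern, the sketch below runs it with re.MULTILINE over the sample lines plus the '116.116.116.116 22' case. The pattern in the code is written with `\d{1,3}` octets and `[3-9]` for the single-digit case, so that a second space before the number cannot be matched in place of a digit. The capture group, the two-space test line, and the variable names are assumptions added for illustration.

```python
import re

# Sample lines; note the two spaces before the 2 on the third line.
data = """# 111.111.111.111
112.112.112.112 1
113.113.113.113  2
114.114.114.114 3
115.115.115.115 4
116.116.116.116 22"""

# [3-9] accepts a single digit other than 0, 1, 2; \d{2,} accepts any
# number with two or more digits; \b ends the match at the number's edge.
pattern = re.compile(
    r"^((?:\d{1,3}\.){3}\d{1,3})\s+(?:[3-9]|\d{2,})\b",
    re.MULTILINE,
)

found = pattern.findall(data)
print(found)
# ['114.114.114.114', '115.115.115.115', '116.116.116.116']
```

Unlike the patterns that use `[03-9]`, this one also rejects a trailing 0.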

Answer 4

I would do:

import re
data = '''# 111.111.111.111      
112.112.112.112 1
113.113.113.113 2
114.114.114.114 3 
115.115.115.115 4'''
ip = re.findall(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b(?![\s]+[12])', data, re.MULTILINE)
print(ip)

Output:

['114.114.114.114', '115.115.115.115']

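Explanation: ^ combined with re.MULTILINE matches substrings that start at the beginning of a line; \b ensures that the whole address is captured (for example, it prevents getting 112.112.112.11); (?![\s]+[12]) is a negative lookahead that rejects an address followed by one or more whitespace characters and then 1 or 2. Note that in Python a lookahead may have variable length, whereas a lookbehind must have fixed length.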
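The two points in the explanation can be checked with a short sketch. The split of the pattern into an `IP` prefix and the variable names are assumptions made here for illustration; the sample data is the same as above without trailing spaces.

```python
import re

data = """# 111.111.111.111
112.112.112.112 1
113.113.113.113 2
114.114.114.114 3
115.115.115.115 4"""

IP = r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# With \b, the last octet cannot be shortened to dodge the lookahead.
with_boundary = re.findall(IP + r"\b(?!\s+[12])", data, re.MULTILINE)

# Without \b, backtracking shrinks "112" to "11"; the lookahead then sees
# "2 1" instead of " 1", so the truncated address is accepted.
without_boundary = re.findall(IP + r"(?!\s+[12])", data, re.MULTILINE)

print(with_boundary)
# ['114.114.114.114', '115.115.115.115']
print(without_boundary)
# ['112.112.112.11', '113.113.113.11', '114.114.114.114', '115.115.115.115']

# The re module rejects a variable-length lookbehind at compile time.
try:
    re.compile(r"(?<=\s+)\d")
    lookbehind_ok = True
except re.error as exc:
    lookbehind_ok = False
    print("lookbehind rejected:", exc)
```

One thing to keep in mind with this lookahead: `[12]` only checks the first digit after the whitespace, so a trailing number such as 22 would be excluded as well.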


Answer 1: 1, accepted, 2020-11-06 11:22:09
Answer 2: 1, 2020-11-06 11:29:11
Answer 3: 1, 2020-11-06 11:30:08
Answer 4: 1, 2020-11-06 11:31:06
