Find all occurrences of a pattern in a text file

I have a text file that looks like this:

Nmap scan report for 192.168.2.1
Host is up (0.023s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
MAC Address: EC:1A:59:A2:84:80 (Belkin International)

Nmap scan report for 192.168.2.2
Host is up (0.053s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
MAC Address: 94:35:0A:F0:47:C2 (Samsung Electronics Co.)

Nmap scan report for 192.168.2.3  
Host is up (0.18s latency).  
PORT     STATE    SERVICE  
5001/tcp filtered commplex-link  
MAC Address: 00:13:CE:C0:E5:F3 (Intel Corporate)  

Nmap scan report for 192.168.2.6
Host is up (0.062s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
MAC Address: 90:21:55:7D:53:4F (HTC)

I want to find all the IPs with port 5001 closed (not filtered). I tried the following logic to find them:

import re

fp = open('nmap_op.txt').read()
ip = re.compile(r'([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(.*)closed', re.S)
matched = ip.findall(fp)
for item in matched:
    print(item)

I was expecting the output to be:

192.168.2.1

192.168.2.2

192.168.2.6

But I'm not getting the desired output. The output is just one item, which looks like this:

('192.168.2.1', '\\nHost is up (0.023s latency).\\nPORT STATE SERVICE\\n5001/tcp closed commplex-link\\nMAC Address: EC:1A:59:A2:84:80 (Belkin International)\\n\\nNmap scan report for 192.168.2.2\\nHost is up (0.053s latency).\\nPORT STATE SERVICE\\n5001/tcp closed commplex-link\\nMAC Address: 94:35:0A:F0:47:C2 (Samsung Electronics Co.)\\n\\nNmap scan report for 192.168.2.3\\nHost is up (0.18s latency).\\nPORT STATE SERVICE\\n5001/tcp filtered commplex-link\\nMAC Address: 00:13:CE:C0:E5:F3 (Intel Corporate)\\n\\nNmap scan report for 192.168.2.6\\nHost is up (0.062s latency).\\nPORT STATE SERVICE\\n5001/tcp ')

Where am I going wrong?

Solution: The logic below worked for me. If anyone has a better answer, please let me know.

import re

fp = open('nmap_op.txt').read()
entries = re.split(r'\n\n', fp)  # one entry per host block
ip = re.compile(r'([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*?closed', re.S)
matched = []
for item in entries:
    m = ip.search(item)  # search once and reuse the match object
    if m:
        matched.append(m.group(1))
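The block-splitting approach above can be wrapped in a small reusable function. This is only a sketch under stated assumptions: the name `hosts_with_port_state`, its parameters, and the shortened `SAMPLE` text are made up for illustration, and the input is assumed to follow the format shown in the question, with one blank line between hosts.

```python
import re

# Hypothetical sample in the same format as the file in the question.
SAMPLE = """Nmap scan report for 192.168.2.1
Host is up (0.023s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
MAC Address: EC:1A:59:A2:84:80 (Belkin International)

Nmap scan report for 192.168.2.3
Host is up (0.18s latency).
PORT     STATE    SERVICE
5001/tcp filtered commplex-link
MAC Address: 00:13:CE:C0:E5:F3 (Intel Corporate)

Nmap scan report for 192.168.2.6
Host is up (0.062s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
MAC Address: 90:21:55:7D:53:4F (HTC)
"""

# The IP is taken only from the report line of each block.
IP_RE = re.compile(r'^Nmap scan report for (\d+\.\d+\.\d+\.\d+)', re.M)


def hosts_with_port_state(text, port=5001, state='closed'):
    """Return the IPs whose block lists `port`/tcp in the given state."""
    port_re = re.compile(r'^%d/tcp\s+%s\b' % (port, re.escape(state)), re.M)
    found = []
    # Split on blank lines (tolerating stray whitespace): one block per host.
    for block in re.split(r'\n\s*\n', text):
        ip = IP_RE.search(block)
        if ip and port_re.search(block):
            found.append(ip.group(1))
    return found


print(hosts_with_port_state(SAMPLE))  # ['192.168.2.1', '192.168.2.6']
```

Because every search is confined to a single block, an IP can never be paired with a "closed" line that belongs to a different host.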

You don't need re.S here. The s modifier changes the meaning of the dot metacharacter (.) from "match everything except newline characters" to "match everything including newline characters". Combined with the greedy .*, that lets a single match run from the first IP to the last "closed" in the file, which is why you get only one item.

The second capturing group isn't required either. You can remove it so that only the IPs are returned:

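>>> # Each '.+\n' consumes one non-empty line, so a match cannot cross the blank line between hosts.
>>> matched = re.findall(r'([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*\n(?:.+\n)*?5001/tcp closed', fp)
>>> matched
['192.168.2.1', '192.168.2.2', '192.168.2.6']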
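To see how re.S and greediness interact, here is a small experiment on a made-up three-host string (the addresses and the "port ..." lines are illustrative only, not real Nmap output). It runs the same IP pattern three ways.

```python
import re

# Made-up text: hosts 1 and 3 are closed, host 2 is filtered.
text = (
    "10.0.0.1\nport closed\n\n"
    "10.0.0.2\nport filtered\n\n"
    "10.0.0.3\nport closed\n"
)
ip = r'(\d+\.\d+\.\d+\.\d+)'

# Without re.S the dot stops at every newline, so the IP line can never
# be joined to the "closed" line below it.
no_flag = re.findall(ip + r'.*closed', text)

# With re.S and a greedy .*, one match runs from the first IP to the
# LAST "closed" in the text.
greedy = re.findall(ip + r'.*closed', text, re.S)

# With re.S and a lazy .*?, each match stops at the NEXT "closed", which
# may belong to a later host.
lazy = re.findall(ip + r'.*?closed', text, re.S)

print(no_flag)  # []
print(greedy)   # ['10.0.0.1']
print(lazy)     # ['10.0.0.1', '10.0.0.2']
```

Note the last result: the lazy version reports 10.0.0.2, the filtered host, because its match runs on into the third block. Restricting each match to a single block, by splitting on blank lines or by matching line by line, avoids that.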

Since the line format seems to always be the same (the IP starts at offset 21 of each block and runs to the end of the first line), you can also do this without a regex:

data = open('nmap_op.txt').read()
for block in data.split("\n\n"):
    if '5001/tcp closed' in block:
        # "Nmap scan report for " is 21 characters long, so the IP starts at offset 21
        print(block[21:block.find('\n', 21)].strip())
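The fixed offset works only while the report line has exactly this shape. A slightly more forgiving regex-free variant walks the file line by line and remembers the most recent report line. This is a sketch, not a full Nmap parser: `closed_hosts`, `PREFIX`, and the sample text are illustrative names, and it assumes the report line carries nothing but the IP after the prefix, as in the question.

```python
PREFIX = 'Nmap scan report for '


def closed_hosts(text, port='5001/tcp', state='closed'):
    """Yield the IP of each host whose port row reads `port` `state`."""
    current_ip = None
    for raw in text.splitlines():
        line = raw.strip()  # tolerate trailing spaces
        if line.startswith(PREFIX):
            # Assumption: everything after the prefix is the IP.
            current_ip = line[len(PREFIX):].strip()
        else:
            fields = line.split()
            if current_ip and fields[:2] == [port, state]:
                yield current_ip


sample = (
    "Nmap scan report for 192.168.2.2\n"
    "Host is up (0.053s latency).\n"
    "PORT     STATE  SERVICE\n"
    "5001/tcp closed commplex-link\n"
    "\n"
    "Nmap scan report for 192.168.2.3  \n"
    "Host is up (0.18s latency).  \n"
    "PORT     STATE    SERVICE  \n"
    "5001/tcp filtered commplex-link  \n"
)

print(list(closed_hosts(sample)))  # ['192.168.2.2']
```

Splitting each row on whitespace means the varying column padding between "5001/tcp" and the state does not matter.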

You can do:

>>> # '.' does not match '\n', so step over the following non-empty lines explicitly
>>> re.findall(r'^Nmap.*?(\d+\.\d+\.\d+\.\d+).*\n(?:.+\n)*?5001/tcp closed', fp, re.M)
['192.168.2.1', '192.168.2.2', '192.168.2.6']
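If you also want to know the state of every host, not just the closed ones, the same idea can capture the state with a named group and leave the filtering to ordinary Python. A minimal sketch, assuming the format from the question; `HOST_RE`, `sample`, and `states` are illustrative names.

```python
import re

# Verbose pattern: report line, then any non-empty lines, then the port row.
HOST_RE = re.compile(
    r'''^Nmap\ scan\ report\ for\ (?P<ip>\d+\.\d+\.\d+\.\d+).*\n
        (?:.+\n)*?                   # lines before the port row
        5001/tcp\s+(?P<state>\w+)    # e.g. closed, filtered
    ''',
    re.M | re.X,
)

sample = """Nmap scan report for 192.168.2.1
Host is up (0.023s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link

Nmap scan report for 192.168.2.3
Host is up (0.18s latency).
PORT     STATE    SERVICE
5001/tcp filtered commplex-link

Nmap scan report for 192.168.2.6
Host is up (0.062s latency).
PORT     STATE  SERVICE
5001/tcp closed commplex-link
"""

# Map each IP to the state reported for port 5001.
states = {m.group('ip'): m.group('state') for m in HOST_RE.finditer(sample)}
closed = [ip for ip, state in states.items() if state == 'closed']

print(closed)  # ['192.168.2.1', '192.168.2.6']
```

Since `.+` needs at least one character, the repeated `(?:.+\n)*?` cannot step over the blank line between hosts, so each match stays inside one block.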
