
Python log parsing for IPs

I'm new to Python and have been going through some tutorials on log parsing with regular expressions. The code below parses a log and writes the remote IPs that connected to the server to a file. What I'm missing is the piece that eliminates duplicate IPs from the output file it creates. Thanks

import re
import sys

infile = open("/var/log/user.log","r")
outfile = open("/var/log/intruders.txt","w")

pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
regexp = re.compile(pattern, re.VERBOSE)

for line in infile:
  result = regexp.search(line)
  if result:
    outfile.write("%s\n" % (result.group()))

infile.close()
outfile.close()

You can save the results seen so far in a set() and then write out only the results that have not yet been seen. This logic is easy to add to your existing code:

import re
import sys

seen = set() 

infile = open("/var/log/user.log","r")
outfile = open("/var/log/intruders.txt","w")

pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
regexp = re.compile(pattern, re.VERBOSE)

for line in infile:
    mo = regexp.search(line)
    if mo is not None:
        ip_addr = mo.group()
        if ip_addr not in seen:
            seen.add(ip_addr)
            outfile.write("%s\n" % ip_addr)

infile.close()
outfile.close()
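
As a variation on the same idea, here is a minimal sketch that wraps the set-based check in a generator and uses with blocks so the files are closed automatically. The names unique_ips and write_unique_ips and the sample log lines are hypothetical, made up for illustration. Note one deliberate difference from the code above: it uses findall(), so it picks up every IP-like match on a line rather than only the first.

```python
import re

# Same pattern as in the post; it matches dotted quads but does not
# validate that each octet is in the range 0-255.
IP_PATTERN = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def unique_ips(lines):
    """Yield each IP-like string once, in first-seen order.

    unique_ips is a hypothetical helper name, not part of the original post.
    """
    seen = set()
    for line in lines:
        # findall() returns every match on the line, not just the first.
        for ip_addr in IP_PATTERN.findall(line):
            if ip_addr not in seen:
                seen.add(ip_addr)
                yield ip_addr


def write_unique_ips(in_path, out_path):
    """Read a log file and write one unique IP per line to out_path."""
    # The with statement closes both files even if an error occurs.
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for ip_addr in unique_ips(infile):
            outfile.write("%s\n" % ip_addr)


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = [
        "Jan  1 sshd: Failed password from 10.0.0.5 port 22",
        "Jan  1 sshd: Failed password from 10.0.0.7 port 22",
        "Jan  1 sshd: Failed password from 10.0.0.5 port 22",
    ]
    print(list(unique_ips(sample)))
```

With the paths from the question, the call would be write_unique_ips("/var/log/user.log", "/var/log/intruders.txt"). Because a set has no order, the sketch writes each address at the moment it is first seen, which keeps the output in the order the addresses appear in the log.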


 