Trying to create a Python Script to extract data from .log files

I'm trying to create a Python Script but I'm a bit stuck and can't find what I'm looking for on a Google search as it's quite specific.

I need to run a script on two .log files (auth.log and access.log) to view the following information:

Find how many login attempts were made with the bin account.

That is, how many times the bin account was used to try to get into the server.

The logs come from a server that was hacked, and I need to identify how it happened and who is responsible.

Would anyone be able to give me some help in how I go about doing this? I can provide more information if needed.

Thanks in advance.

Edit:

I've managed to print all the times 'bin' appears in the log which is one way of doing it. Does anyone know if I can count how many times 'bin' appears as well?

with open("auth.log") as f:
    for line in f:
        if "bin" in line:
            print(line)

If you want to use a tool, you can use ELK (Elasticsearch, Logstash and Kibana). If not, you have to read the log file first and then apply a regex according to your requirements.
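As a rough sketch of the regex route, the snippet below counts the lines that record a failed login for the bin account. The pattern assumes sshd-style wording ("Failed password for ..."), and the sample lines are made up for illustration; neither comes from the original logs, so check a few real lines of auth.log and adjust the pattern. The name count_bin_attempts is hypothetical.

```python
import re

# Assumed sshd-style wording; adjust to the lines actually present in auth.log.
FAILED_BIN = re.compile(r"Failed password for (?:invalid user )?bin\b")


def count_bin_attempts(lines):
    """Count lines that record a failed login for the 'bin' account."""
    return sum(1 for line in lines if FAILED_BIN.search(line))


# Hypothetical sample lines, for illustration only.
sample = [
    "Nov 27 11:21:09 host sshd[1234]: Failed password for bin from 10.0.0.5 port 4242 ssh2",
    "Nov 27 11:21:12 host sshd[1234]: Failed password for invalid user bin from 10.0.0.5 port 4243 ssh2",
    "Nov 27 11:21:15 host sshd[1234]: Accepted password for alice from 10.0.0.7 port 5000 ssh2",
    "Nov 27 11:21:20 host CRON[99]: (root) CMD (/usr/bin/true)",
]

print(count_bin_attempts(sample))
```

Because a file object yields its lines when iterated, the same function can be given the result of open("auth.log") directly. Note that a plain "bin" in line test would also count the CRON line above, since /usr/bin/true contains the substring "bin".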

Given that you work with system logs and their format is known and stable, my approach would be something like:

  • identify a set of keywords (either common, or one per log)
  • for each log, iterate line by line
  • once keywords match, add the relevant information from each line to, e.g., a dictionary
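The steps above could be sketched roughly as follows. The keywords and sample lines are assumptions chosen for illustration, not taken from the actual logs, and collect_matches is a hypothetical helper name.

```python
from collections import defaultdict


def collect_matches(lines, keywords):
    """Group the lines that contain each keyword under that keyword."""
    hits = defaultdict(list)
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                hits[keyword].append(line.rstrip("\n"))
    return hits


# Hypothetical keywords and log lines, for illustration only.
keywords = ["Failed password", "Accepted password"]
sample = [
    "Nov 27 11:21:09 host sshd[1234]: Failed password for bin from 10.0.0.5 port 4242 ssh2\n",
    "Nov 27 11:21:12 host sshd[1234]: Failed password for bin from 10.0.0.5 port 4243 ssh2\n",
    "Nov 27 11:21:15 host sshd[1234]: Accepted password for alice from 10.0.0.7 port 5000 ssh2\n",
]

hits = collect_matches(sample, keywords)
for keyword in keywords:
    # Print each keyword with the number of lines that matched it.
    print(keyword, len(hits[keyword]))
```

Keeping the matching lines (rather than only a count) makes it possible to pull out further details later, such as the source IP address of each attempt.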

You could use shell tools (like grep, cut and/or awk) to pre-process the log and extract the relevant lines (I assume you only need, e.g., error entries).

You can use something like this as a starting point.

In case you are interested in extracting some data and saving it to a .txt file, the following sample code might be helpful:

import re

expDate = '2018-11-27'
expTime = '11-21-09'

infile = r"/home/xenial/Datasets/CIVIT/Nov_27/rover/NMND17420010S_" + expDate + "_" + expTime + ".LOG"
outfile = '/home/xenial/Datasets/CIVIT/Nov_27/rover/GPS_' + expDate + '_' + expTime + '.txt'

# Capture the two fields that directly follow the keyword:
# the GPS week (integer) and the GPS seconds of week (decimal).
pattern = re.compile(r'FINESTEERING,(\d+),(\d+\.\d*)')

with open(infile) as src, open(outfile, 'w') as dst:
    dst.write("gpsWeek,gpsSOW\n")
    for line in src:
        match = pattern.search(line)
        if match:  # skip lines without the keyword or the expected fields
            gpsWeek, gpsSOW = match.groups()
            dst.write(gpsWeek + ',' + gpsSOW + '\n')

print("------------------------------------")

In my case, FINESTEERING was the keyword in my .log file that marks the lines to extract numbers from, including GPS_Week and GPS_Seconds_of_Weeks. You may modify this code to suit your own application.
