Python: Identify Times In File

Question

I'm very new to Python, so apologies, but what's the best way to get Python to identify times and use them as integers? I have a file and need to count the number of lines between two times, e.g. per hour. The file looks like this:

Feb  3 08:17:01 j4-be02 CRON[32735]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb  3 08:17:01 j4-be02 CRON[32735]: pam_unix(cron:session): session closed for user root
Feb  3 08:35:21 j4-be02 sshd[32741]: reverse mapping checking getaddrinfo for reserve.cableplus.com.cn [211.167.103.172] failed - POSSIBLE BREAK-IN ATTEMPT!
Feb  3 08:35:21 j4-be02 sshd[32741]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=211.167.103.172  user=root
Feb  3 08:35:23 j4-be02 sshd[32741]: Failed password for root from 211.167.103.172 port 34583 ssh2
Feb  3 08:35:27 j4-be02 sshd[32744]: reverse mapping checking getaddrinfo for reserve.cableplus.com.cn [211.167.103.172] failed - POSSIBLE BREAK-IN ATTEMPT!

So far I've managed to split the times by ':' (see code below), but I don't know how to save the HH, MM or SS as a variable so that Python knows when the next hour is up. For instance, if the file starts at 08:17:01, I would need it to count the number of lines in the file between 08:17:01 and 09:17:01.

  failedPass = 'Failed password for'
  for line in authStrings:
    if ":" in line and failedPass in line:
      time = line.split(':')
      print(time)

Many thanks!

Answer 1

First, pull the time field (the third whitespace-separated item) out of each line into a list:

string = """Feb  3 08:17:01 j4-be02 CRON[32735]: pam_unix(cron:session): session opened for user root by (uid=0)
Feb  3 08:17:01 j4-be02 CRON[32735]: pam_unix(cron:session): session closed for user root
Feb  3 08:35:21 j4-be02 sshd[32741]: reverse mapping checking getaddrinfo for reserve.cableplus.com.cn [211.167.103.172] failed - POSSIBLE BREAK-IN ATTEMPT!
Feb  3 08:35:21 j4-be02 sshd[32741]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=211.167.103.172  user=root
Feb  3 08:35:23 j4-be02 sshd[32741]: Failed password for root from 211.167.103.172 port 34583 ssh2
Feb  3 08:35:27 j4-be02 sshd[32744]: reverse mapping checking getaddrinfo for reserve.cableplus.com.cn [211.167.103.172] failed - POSSIBLE BREAK-IN ATTEMPT!"""
# Skip blank lines so that line.split()[2] cannot raise IndexError on them
times = [line.split()[2] for line in string.splitlines() if line.strip()]
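
If plain integers are all you need, as the question asks, the time string can also be split and converted by hand. This is a minimal sketch that assumes every time is a well-formed HH:MM:SS string like the ones in the sample; the helper name to_seconds is made up for illustration.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to whole seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

first = to_seconds("08:17:01")
last = to_seconds("08:35:27")
print(last - first)  # seconds elapsed between the two times
```

Comparing such integers works within a single day, but it breaks once the log crosses midnight, which is one reason to prefer datetime objects.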

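Next, we convert them to datetime objects. Because only the time of day is parsed, every object gets the default date 1900-01-01, so the comparison below is only valid while the log stays within a single day: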

from datetime import datetime, timedelta
datetimes = [datetime.strptime(time, '%H:%M:%S') for time in times]
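
If the log can span more than one day, the month and day at the start of each line can be parsed as well. The sketch below is an illustration rather than part of the original answer: the function name parse_syslog_stamp is hypothetical, the year has to be supplied by the caller because the log lines do not contain one, and %b assumes English month abbreviations (the default C locale).

```python
from datetime import datetime

def parse_syslog_stamp(line, year):
    """Parse the leading 'Mon D HH:MM:SS' fields of a syslog-style line.

    `year` is passed in by the caller because the line itself has no year.
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")

line = "Feb  3 08:35:23 j4-be02 sshd[32741]: Failed password for root from 211.167.103.172 port 34583 ssh2"
stamp = parse_syslog_stamp(line, year=2018)  # the year 2018 is an assumption
print(stamp.hour, stamp.minute, stamp.second)  # each part is a plain int
```

The resulting datetime exposes hour, minute and second as integers, and subtracting two of them gives a timedelta that handles day boundaries correctly.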

Then, given a starting time, we can count the lines that fall within a set period after it. Here we use the first time as the start and an offset of 10 minutes; for the one-hour window asked about in the question, use {"hours": 1} instead:

start = datetimes[0]
offset = {"minutes": 10}
end = start + timedelta(**offset)
# Count the lines whose time falls in [start, end); prints 2 for the sample above
print(sum(start <= time < end for time in datetimes))
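
To get a count for every consecutive hour rather than a single window, the same idea can be extended by bucketing each line by how many whole windows have elapsed since the first timestamp. The following is a rough sketch under a few assumptions: the function name count_per_window and its must_contain parameter are invented here, the sample lines are made up for the demonstration, and the log is assumed to stay within one day because only the time field is parsed.

```python
from collections import Counter
from datetime import datetime, timedelta

def count_per_window(lines, window=timedelta(hours=1), must_contain=None):
    """Count lines per window, with windows measured from the first kept line."""
    counts = Counter()
    start = None
    for line in lines:
        if must_contain is not None and must_contain not in line:
            continue  # optional filter, e.g. 'Failed password for'
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        try:
            stamp = datetime.strptime(fields[2], "%H:%M:%S")
        except ValueError:
            continue  # third field is not a time
        if start is None:
            start = stamp
        # timedelta // timedelta yields the integer window index
        counts[(stamp - start) // window] += 1
    return counts

# Made-up sample lines in the same layout as the log in the question
sample = [
    "Feb  3 08:17:01 host CRON[1]: example line",
    "Feb  3 08:35:23 host sshd[2]: Failed password for root",
    "Feb  3 09:17:00 host sshd[2]: Failed password for root",
    "Feb  3 09:17:01 host sshd[3]: Failed password for root",
]
counts = count_per_window(sample)
for index in sorted(counts):
    print(f"window {index}: {counts[index]} lines")
```

Since an open file is iterable line by line, it can be passed straight to the function, for example inside a with open(...) block for whichever log file you are reading. Passing must_contain='Failed password for' restricts the count to the lines the question filters on, and the first window then starts at the first matching line.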

Answer 1 (accepted), 2018-01-19 20:08:49
