
How to compare two numbers in a line and print the line if their lengths differ

I have a file from which I need to extract the lines that contain both "TCP 0.0.0.0" and the text "ongoing", then compare the two numbers that follow "ongoing" and print the line only if their lengths are not equal.

The code below extracts only the lines containing "TCP 0.0.0.0" and "ongoing", but I still need to filter them by comparing the two numbers that follow and print a line only if their lengths differ:

import re

counter = 0
print("=" * 20)
# "with" closes the file even if an error occurs while reading
with open("log.txt", "r") as f:
    for line in f:
        # raw string so the backslashes reach the regex engine unchanged
        match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
        if match:
            counter += 1
            print("-" * 10)

            # If you want to print the whole line
            print("Count {}:[F] {}".format(counter, line.rstrip()))

            # If you want to print just the matched section
            # print("Count {}:[M] {}".format(counter, match.groups()[1].rstrip()))

print("=" * 20)
print("Total Found: {}".format(counter))

log.txt:

Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

 07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

 D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

I need to print the three lines below from the file. Each contains "TCP 0.0.0.0" and "ongoing", and the two numbers right after "ongoing" (for example 53408 and 533837) differ in length:

 07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

 07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78

 D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78

You can use split(' ongoing ')[1] to get all the text after "ongoing", and then split(' ')[:2] to get the two numbers that follow it:

import re

data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''

f = data.split('\n')

for line in f:
    match = re.search(r"(TCP 0\.0\.0\.0) (.*) (ongoing)", line)
    if match:
        second_part = line.split(' ongoing ')[1]
        numbers = second_part.split(' ')[:2]

        number1 = numbers[0]
        number2 = numbers[1]

        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))

        if len(number1) != len(number2):
            print('different lengths')

        print('---')

Result:

77010 len: 5
76760 len: 5
---
53408 len: 5
533837 len: 6
different lengths
---
770124 len: 6
76762 len: 5
different lengths
---
535 len: 3
533822 len: 6
different lengths
---
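
The code above prints the numbers rather than the lines the question asks for. As a minimal sketch of the same split idea that keeps only the wanted lines, the helper names below (numbers_after_ongoing, has_unequal_lengths) are hypothetical, and the sample lines are shortened versions of the log lines. It uses str.partition instead of split(...)[1] so that a line without " ongoing " does not raise an IndexError. It does not check that the two tokens are digits, nor that "TCP 0.0.0.0" comes before "ongoing".

```python
def numbers_after_ongoing(line):
    """Return the two tokens that follow ' ongoing ', or None if absent."""
    # partition never raises: sep is '' when ' ongoing ' is not in the line
    _, sep, rest = line.partition(' ongoing ')
    tokens = rest.split()
    if not sep or len(tokens) < 2:
        return None
    return tokens[0], tokens[1]


def has_unequal_lengths(line):
    """True if the line has 'TCP 0.0.0.0', ' ongoing ', and two tokens of different length."""
    if 'TCP 0.0.0.0' not in line:
        return False
    numbers = numbers_after_ongoing(line)
    return numbers is not None and len(numbers[0]) != len(numbers[1])


# Shortened versions of the log lines (illustrative only)
sample = [
    'TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0',
    'TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0',
    'TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0',
]

# Keep only the lines whose two numbers differ in length
selected = [line for line in sample if has_unequal_lengths(line)]
for line in selected:
    print(line)
```

Only the third sample line is printed here: the first has no "ongoing", and the second has two five-digit numbers.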

EDIT: Alternatively, you can use a more specific regex that captures both numbers directly:

re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)

Code:

import re

data = '''Dash#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
Do#07-06-2019 18:04:32 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 53408 533837 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
07-06-2019 18:04:37 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-002A-00005CFADC78
D#07-06-2019 18:04:42 WARNING 240 Anomalies "TCP handshake violation, first packet not syn" TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0 N/A low drop FFFFFFFF-FFFF-FFFF-0029-00005CFADC78
'''

f = data.split('\n')

for line in f:
    match = re.search(r"TCP 0\.0\.0\.0 (.*) ongoing (\d+) (\d+)", line)
    if match:
        number1 = match.group(2)
        number2 = match.group(3)

        print(number1, 'len:', len(number1))
        print(number2, 'len:', len(number2))

        if len(number1) != len(number2):
            print('different lengths')

        print('---')
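
To get the output the question asks for (the whole line, printed only when the two lengths differ), the regex version could be wrapped in a small generator. This is a sketch under assumptions: the names PATTERN and unequal_length_lines are hypothetical, the first wildcard group is dropped so the numbers become groups 1 and 2, and the demo list stands in for the file with shortened log lines.

```python
import re

# Raw string: "TCP 0.0.0.0", anything, then " ongoing " and two runs of digits
PATTERN = re.compile(r"TCP 0\.0\.0\.0 .* ongoing (\d+) (\d+)")


def unequal_length_lines(lines):
    """Yield each matching line whose two captured numbers differ in length."""
    for line in lines:
        match = PATTERN.search(line)
        if match and len(match.group(1)) != len(match.group(2)):
            yield line.rstrip()


# Shortened versions of the log lines (illustrative only)
demo = [
    'TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" tetet 534049 533799 0 0\n',
    'TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 77010 76760 0 0\n',
    'TCP 0.0.0.0 0 0.0.0.0 80 15 Regular "policy2" ongoing 770124 76762 0 0\n',
    'TCP 0.0.0.0 0 0.0.0.0 0 15 Regular "policy1" ongoing 535 533822 0 0\n',
]

result = list(unequal_length_lines(demo))
for count, line in enumerate(result, start=1):
    print("Count {}:[F] {}".format(count, line))
print("Total Found: {}".format(len(result)))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of demo, for example inside a with open("log.txt") as f: block.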

