How to count all matches of a regex in Python 3

I am trying to run a regex against a text file containing MAC addresses. It matches all four lines but only counts 2. I am new to Python and for some reason I cannot figure this out. Code below:

import re

regmac = re.compile(r"^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$")

regmac1 = r"^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"

with open(file, 'r') as i:
    for line in i.read().split('\n'):
        #matches = re.findall(regmac1, line)
        matches = regmac.findall(line)
        print(matches.count(regmac1))
    
        macmatch = len(matches)
        macmatch += 1
    print(macmatch)

This returns the following:

C:/Users/jonat/desktop/mactest1.txt
['AE:30:5B:AA:65:7B', '9C:30:5B:BB:66:7B', 'AE:30:5B:CC:67:7B', '9C:30:5B:DD:68:7B']
0
0
0
0
2

I have checked forums and similar resources, and this is what I ended up with. Any nudge in the right direction would be appreciated, thanks!

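You are resetting macmatch on every iteration of the loop, so the final print only reflects the last line: one match, plus the extra macmatch += 1, gives 2. Initialize macmatch outside the for loop and add to it, and the count will work. The zeros come from matches.count(regmac1), which counts list elements equal to the pattern string, and there are none. You also have several capturing groups in your regex; with more than one group, findall returns a tuple per match instead of the matched string, which makes the results harder to work with. You can put ?: at the start of a group to make it non-capturing, like this: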

^((?:[a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}|(?:[a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$
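
To make the effect of the capturing groups concrete, here is a small sketch that runs both patterns through findall on one address from the question's output. The names with_groups and without_groups are made up for this illustration and do not appear in the question.

```python
import re

# Pattern from the question: four capturing groups in total.
with_groups = re.compile(
    r"^(([a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}"
    r"|([a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}"
    r"|([0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
)

# Same pattern with the inner groups made non-capturing via (?:...).
without_groups = re.compile(
    r"^((?:[a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}"
    r"|(?:[a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
)

line = "AE:30:5B:AA:65:7B"

tuples = with_groups.findall(line)      # one tuple per match, one item per group
strings = without_groups.findall(line)  # one string per match (the single group)

print(tuples)   # [('AE:30:5B:AA:65:7B', '', '65:', '')]
print(strings)  # ['AE:30:5B:AA:65:7B']
```

Both lists have length 1, so len() gives the same count either way; the difference is only in what each element looks like.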

If you do not need to enforce a consistent separator and only want strings that look like MAC addresses (so that a mixed form such as 9C:30:5B:BB-66-7B is also accepted), you can shorten the regex considerably:

^((?:[a-fA-F0-9]{2}[:-]){5}[a-fA-F0-9]{2}|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$
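
As a quick check of the difference, the sketch below compares the stricter pattern with the shortened one on the mixed-separator example. The names strict and loose are illustrative only, and the dotted sample address is invented for the test.

```python
import re

# Stricter pattern: one separator style per address.
strict = re.compile(
    r"^((?:[a-fA-F0-9]{2}-){5}[a-fA-F0-9]{2}"
    r"|(?:[a-fA-F0-9]{2}:){5}[a-fA-F0-9]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
)

# Shortened pattern: ':' and '-' may be mixed within one address.
loose = re.compile(
    r"^((?:[a-fA-F0-9]{2}[:-]){5}[a-fA-F0-9]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
)

mixed = "9C:30:5B:BB-66-7B"
strict_result = strict.findall(mixed)
loose_result = loose.findall(mixed)

print(strict_result)  # []
print(loose_result)   # ['9C:30:5B:BB-66-7B']

# Consistently formatted addresses are accepted by both patterns.
for sample in ("9C:30:5B:BB:66:7B", "9C-30-5B-BB-66-7B", "9c30.5bbb.667b"):
    print(sample, bool(strict.findall(sample)), bool(loose.findall(sample)))
```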

Then you can run:

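macmatch = 0  # initialize once, before the loop, so the count accumulates
# file is the path from the question; regmac is compiled from one of the patterns above
with open(file, 'r') as i:
    for line in i:  # iterating the file object yields one line at a time
        matches = regmac.findall(line)
        macmatch += len(matches)
        # OR: macmatch += (1 if matches else 0)

print(macmatch)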
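
One caveat worth noting: the trailing ? makes the whole group optional, so a blank line also matches and findall returns [''] for it. The sketch below shows one possible way to keep blank lines out of the count. It uses an in-memory list of lines instead of a real file, and count_macs and MAC_RE are hypothetical names, not part of the question.

```python
import re

MAC_RE = re.compile(
    r"^((?:[a-fA-F0-9]{2}[:-]){5}[a-fA-F0-9]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4})?$"
)


def count_macs(lines):
    """Count lines that consist of a MAC-like string (hypothetical helper)."""
    total = 0
    for line in lines:
        # findall returns [''] for a blank line because of the trailing '?',
        # so drop empty strings before counting.
        total += len([m for m in MAC_RE.findall(line) if m])
    return total


# Stand-in for the lines of a file, including a blank and a non-matching line.
sample = [
    "AE:30:5B:AA:65:7B\n",
    "9C:30:5B:BB:66:7B\n",
    "\n",
    "not a mac\n",
    "AE:30:5B:CC:67:7B\n",
    "9C:30:5B:DD:68:7B\n",
]

naive = sum(len(MAC_RE.findall(line)) for line in sample)
print(naive, count_macs(sample))  # prints: 5 4
```

Because a file object is an iterable of lines, the same helper should work as count_macs(f) inside a with open(...) as f: block. Dropping the trailing ? from the pattern would be another way to avoid counting blank lines.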
