
Extract IP addresses from text file without using REGEX

I am trying to extract IPv4 addresses from a text file and save them as a list to a new file. However, I cannot use regex to parse the file; instead, I have to check the characters individually. I'm not really sure where to start with that, since everything I find seems to have import re as the first line.

So far this is what I have,

#Opens and prints wireShark txt file
fileObject = open("wireShark.txt", "r")
data = fileObject.read()
print(data)

#Save IP addresses to new file
with open('wireShark.txt') as fin, open('IPAdressess.txt', 'wt') as fout:
    list(fout.write(line) for line in fin if line.rstrip())


#Opens and prints IPAdressess txt file    
fileObject = open("IPAdressess.txt", "r")
data = fileObject.read()
print(data)

#Close Files
fin.close()
fout.close()

So I open the file and create the file that I will put the extracted IPs in; I just don't know how to pull them out without using regex.

Thanks for the help.

Here is a possible solution.

The function find_first_digit positions the file index at the next digit in the text, if there is one, and returns True. Otherwise it returns False.

The functions get_num and get_dot read a number or a dot, leave the index at the position just after it, and return it as a str. If either function fails to read the number or dot, it raises a MissMatch exception.

In the main loop, find a digit, save the index, and then try to read an IP address.

If that succeeds, write it to the output file.

If any of the called functions raises a MissMatch exception, set the current index to the saved index plus one and start over.

class MissMatch(Exception):
    """Raised when the text at the current position is not part of an IP."""

INPUT_FILE_NAME = 'text'
OUTPUT_FILE_NAME = 'ip_list'
                

def find_first_digit():
    
    while True:
        c = input_file.read(1)
        if not c: # EOF found!
            return False
        elif c.isdigit():
            input_file.seek(input_file.tell() - 1)
            return True


def get_num():

    num = input_file.read(1)  # 1st digit
    if not num.isdigit():
        raise MissMatch
    if num != '0':            # a leading '0' must stand alone
        for i in range(2):    # 2nd and 3rd digits
            c = input_file.read(1)
            if c.isdigit():
                num += c
            else:
                if c:         # not at EOF: step back over the non-digit
                    input_file.seek(input_file.tell() - 1)
                break
    return num


def get_dot():
    
    if input_file.read(1) == '.':
        return '.'
    else:
        raise MissMatch


with open(INPUT_FILE_NAME) as input_file, open(OUTPUT_FILE_NAME, 'w') as output_file:
    while True:
        ip = ''
        if not find_first_digit():
            break
        saved_position = input_file.tell()
        
        try:
            ip = get_num() + get_dot() \
               + get_num() + get_dot() \
               + get_num() + get_dot() \
               + get_num()
        except MissMatch:
            input_file.seek(saved_position + 1)
        else:
            output_file.write(ip + '\n')
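
As a variation (not part of the answer above), here is a minimal sketch that does the same character-by-character scan on an in-memory string instead of a file object. It assumes the whole file fits in memory, which avoids the seek/tell arithmetic; the Python docs only guarantee text-mode seeks to offsets returned by tell(). It also rejects octets above 255 and addresses glued to neighbouring digits, which the code above lets through. The names DIGITS, read_octet and extract_ipv4 are made up for this illustration.

```python
DIGITS = "0123456789"  # explicit ASCII digits; str.isdigit() also accepts e.g. '²'


def read_octet(text, i):
    """Return (octet, next_index) if a valid 0-255 octet starts at text[i], else None."""
    j = i
    # Consume at most three consecutive digits.
    while j < len(text) and j - i < 3 and text[j] in DIGITS:
        j += 1
    if j == i:
        return None  # no digit at this position
    digits = text[i:j]
    # Reject leading zeros such as "01" and values above 255.
    if (len(digits) > 1 and digits[0] == "0") or int(digits) > 255:
        return None
    return digits, j


def extract_ipv4(text):
    """Collect dotted-quad IPv4 addresses by scanning text one character at a time."""
    found = []
    i = 0
    while i < len(text):
        # Only start at a digit that does not continue a longer digit run.
        if text[i] not in DIGITS or (i > 0 and text[i - 1] in DIGITS):
            i += 1
            continue
        parts = []
        j = i
        ok = True
        for k in range(4):
            result = read_octet(text, j)
            if result is None:
                ok = False
                break
            octet, j = result
            parts.append(octet)
            if k < 3:  # the first three octets must be followed by a dot
                if j < len(text) and text[j] == ".":
                    j += 1
                else:
                    ok = False
                    break
        # The last octet must not be followed by another digit.
        if ok and not (j < len(text) and text[j] in DIGITS):
            found.append(".".join(parts))
            i = j          # continue after the address
        else:
            i += 1         # backtrack: retry from the next character
    return found
```

To use it with the files from the question, you could read wireShark.txt into a string with read(), pass it to extract_ipv4, and write the returned list to IPAdressess.txt one address per line.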


 