Using Python regex to check for string in file

I'm using Python regex to check a log file that contains the output of the Windows command tasklist for anything ending with .exe. The log file contains output from multiple calls to tasklist. After I get a list of strings with .exe in them, I want to write each one to a text file, but only after checking whether that string already exists in the output file. Instead of the desired output, the script writes out duplicates of strings already present in the text file (svchost.exe shows up several times, for example). The goal is a text file that lists each unique process enumerated by tasklist, with no duplicates of processes already written to the file.

import re

file1 = open('taskinfo.txt', 'r')
strings = re.findall(r'.*.exe', file1.read())
file1.close()
file2 = open('exes.txt', 'w+')
for item in strings:
    line_to_write = re.match(item, file2.read())
    if line_to_write == None:
        file2.write(item)
        file2.write('\n')
    else:
        pass

I used print statements to debug and made sure that item is the desired output.

There are some problems with your regex. Try this:

strings = re.findall(r'\b\S*\.exe\b', file1.read())

This matches only the text attached to the .exe: it starts at a word boundary ( \b ) and grabs all non-space characters ( \S* ) up to the extension. Additionally, when you had .exe instead of \.exe , the . was matching as a wildcard rather than as a literal period.
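
The regex fix alone may not remove the duplicates, because the check against exes.txt in the question looks like a separate problem: opening the file with 'w+' truncates it, and file2.read() after a write starts at the current position (the end of the file), so it most likely returns an empty string and re.match never finds anything. Below is a minimal sketch of one way around this, assuming it is acceptable to track names in an in-memory set instead of re-reading the output file. The function names unique_exes and write_unique_exes are made up for illustration; the file names are the ones from the question.

```python
import re

# Non-space characters ending in a literal ".exe", as in the answer above.
EXE_PATTERN = re.compile(r'\b\S*\.exe\b')


def unique_exes(text, already_seen=()):
    """Return the .exe names found in text, in first-seen order.

    Names listed in already_seen are skipped. (Hypothetical helper.)
    """
    seen = set(already_seen)
    result = []
    for name in EXE_PATTERN.findall(text):
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


def write_unique_exes(log_path, out_path):
    """Read a tasklist log and write each unique .exe name on its own line."""
    with open(log_path, 'r') as log_file:
        names = unique_exes(log_file.read())
    # 'w' overwrites the output file on every run.
    with open(out_path, 'w') as out_file:
        for name in names:
            out_file.write(name + '\n')
    return names
```

Calling write_unique_exes('taskinfo.txt', 'exes.txt') rewrites exes.txt from scratch on each run. To keep entries from earlier runs instead, one option is to read the existing names first, pass them as already_seen, and open the output file in append mode ('a').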
