Parse text file and only capture lines between two lines with specific characters

I have to write a Python script that parses a log text file, but the only data of interest is that of the "Test" being examined. The text file has the following general format:

Test 1
[lines of data]

Test 2
[lines of data]

...

The [lines of data] represents what could be either many or few lines of data from the said test, and the log file can have any number of tests. So if I only wanted to look at "Test 1", what I want my script to do is extract all the information between "Test 1" and "Test 2" but have it stop reading before "Test 2".

The catch is that I want my script to do the same thing even if I'm looking to parse the data from, say, Test 12 and have it stop before Test 13, because there can be any number of tests in the file. How would I go about this?

May I suggest using the following code:

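import re

# Read the whole log into memory so the regex can span multiple lines.
with open("1new.txt", "r") as file:
    eaw = file.read()

num_of_tests = 2  # number of "Test N" headers in the file
# The last test has no following header, so the loop stops one short of it.
for i in range(1, num_of_tests):
    # \b stops "Test 1" from matching the start of "Test 12"; the non-greedy
    # .*? stops at the first following header rather than the last one.
    match = re.search(r"(?<=Test %d)\b(.*?)(?=Test %d\b)" % (i, i + 1), eaw, re.DOTALL)
    if match:  # re.search returns None when the header pair is missing
        extract = match.group()
        print(extract)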

The output will be:

[lines of data]
[lines of data]

You can add a few more lines to append the extracted text to a different file:

with open("extracted.txt", "a") as file2:
    file2.write(extract)

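The regex looks for the text between "Test 1" and "Test 2", and so on for each pair of headers. It uses a positive lookbehind "(?<=...)" and a positive lookahead "(?=...)" so that the headers anchor the match without being part of it, and ".*" together with re.DOTALL matches everything in between, newlines included. Be aware that a greedy ".*" runs to the last place the closing header text appears, so if a later header shares the same prefix (for example "Test 20" when looking for "Test 2"), a non-greedy ".*?" with a word boundary "\b" after the number is the safer choice.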
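For what it's worth, here is a minimal sketch of the same idea wrapped in a function, so you can ask for any test number without knowing how many tests the file holds. The names extract_test, log_text and test_number are made up for illustration, and the sketch assumes every header sits alone on its own line as "Test <number>", as in the sample layout from the question. It keeps the lookahead but swaps the lookbehind for a capturing group, and it also accepts the end of the file as a stopping point so the last test can be extracted too.

```python
import re


def extract_test(log_text, test_number):
    """Return the text under the "Test <test_number>" header, or None.

    Assumption: each header is alone on its line, e.g. "Test 12".
    """
    pattern = re.compile(
        # Header line for the requested test (consumed, not returned).
        rf"^Test {test_number:d}[ \t]*\n"
        # Non-greedy body, captured as group 1.
        r"(.*?)"
        # Stop just before the next header line, or at the end of the text.
        r"(?=^Test \d+[ \t]*$|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(log_text)
    if match is None:
        return None  # no such test in the log
    # Drop the blank lines that separate one test from the next.
    return match.group(1).strip()
```

You would call it as extract_test(eaw, 12) on the text read from the log. Because the header must be followed by the end of its line, asking for test 1 does not pick up "Test 12" by accident.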
