
Parse specific data from a text file

I am parsing a text file with a format like this:

    ...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00  
     2X+00  2X+00  1K+00  
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00  
     2Y+00  2Z+00  1E+00  
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this

I am trying to read the values between MY TEST MATRIX and MY TEST END into one array, and the values between MY TEST END and STOP into another.

This is what I have written so far:

    file_open = open(filename, "r")
    all_lines = file_open.readlines()
    for line in all_lines:
        line = line.strip()
        # line[0] is only the first character, so this test is always true
        if line[0] != "MY TEST MATRIX (ROWS)":
            ...  # rest of the attempt not shown

Unfortunately, this reads all the lines. Does anyone have ideas on how I can read the numeric data between those marker lines into arrays? Any suggestions would be helpful.

Using re.findall

Example:

import re
s = """...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00  
     2X+00  2X+00  1K+00  
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00  
     2Y+00  2Z+00  1E+00  
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this"""

firstValue = re.findall(r"(?<=MY TEST MATRIX).*?(?=MY TEST END)", s, flags=re.DOTALL)
print([i.strip() for i in firstValue])

secondValue = re.findall(r"(?<=MY TEST END).*?(?=STOP)", s, flags=re.DOTALL)
print([i.strip() for i in secondValue])

Output:

['(ROWS)\n     2X+00  2X+00  1X+00  \n     2X+00  2X+00  1K+00  \n     2X+00  2X+00  1X+00']
['2Y+00  2Y+00  1E+00  \n     2Y+00  2Z+00  1E+00  \n     2Y+00  2F+00  1E+00']
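Note that the first result still starts with `(ROWS)`, because the lookbehind stops after `MY TEST MATRIX`, and each result is one multi-line string rather than an array. A minimal sketch of one way to get rows of values instead, assuming the header line is always exactly `MY TEST MATRIX (ROWS)`; `block_rows` is a hypothetical helper name, not part of the answer above:

```python
import re

s = """...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00
     2X+00  2X+00  1K+00
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00
     2Y+00  2Z+00  1E+00
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this"""


def block_rows(text, start, end):
    """Return the lines between two marker strings as lists of tokens.

    Hypothetical helper: matches the first occurrence of start ... end only.
    """
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        return []
    # Skip blank/whitespace-only lines, then split each row on whitespace.
    return [line.split() for line in match.group(1).splitlines() if line.strip()]


first_rows = block_rows(s, "MY TEST MATRIX (ROWS)", "MY TEST END")
second_rows = block_rows(s, "MY TEST END", "STOP")
print(first_rows)
print(second_rows)
```

The values are kept as strings here; converting them to numbers depends on what the real file contains, since the sample uses placeholder tokens.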

Without re

Demo:

firstValue = []
secondValue = []
checkFirst = False
checkSecond = False
with open(filename, "r") as infile:
    for line in infile:
        line = line.strip()
        if line.startswith("MY TEST MATRIX"):
            checkFirst = True
            continue  # skip the marker line itself
        if line.startswith("MY TEST END"):
            checkFirst = False
            checkSecond = True
            continue
        if line.startswith("STOP"):
            checkSecond = False
            continue

        if checkFirst:
            firstValue.append(line)

        if checkSecond:
            secondValue.append(line)

print(firstValue)
print(secondValue)
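The same line-by-line idea can be written with a single state variable instead of two flags. This is only a sketch under a few assumptions: the markers appear once and in the order shown in the question, reading can stop at `STOP`, and each data line should be split on whitespace. `read_sections` and the section names are hypothetical. The sample text stands in for the file so the snippet runs on its own.

```python
def read_sections(lines):
    """Collect token rows for the two blocks from an iterable of lines."""
    sections = {"matrix": [], "tail": []}
    current = None  # name of the block being read, or None outside both
    for raw in lines:
        line = raw.strip()
        if line.startswith("MY TEST MATRIX"):
            current = "matrix"
        elif line.startswith("MY TEST END"):
            current = "tail"
        elif line.startswith("STOP"):
            break  # assumption: nothing of interest follows STOP
        elif current is not None and line:
            sections[current].append(line.split())
    return sections["matrix"], sections["tail"]


sample = """...some lines before this...
    MY TEST MATRIX (ROWS)
     2X+00  2X+00  1X+00
     2X+00  2X+00  1K+00
     2X+00  2X+00  1X+00
    MY TEST END
     2Y+00  2Y+00  1E+00
     2Y+00  2Z+00  1E+00
     2Y+00  2F+00  1E+00
    STOP
    ---some lines after this"""

first_rows, second_rows = read_sections(sample.splitlines())
print(first_rows)
print(second_rows)
```

Because the function only iterates over lines, an open file object can be passed in the same way, for example `with open(filename) as infile: first_rows, second_rows = read_sections(infile)`.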

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
