Read lines between two keywords in Python

I am trying to read the contents between two keywords in a file and write them to a new file, cutting out the rest of the file that I don't need.

** ASSEMBLY
**
*Assembly, name=Assembly
**  
*Instance, name=Part-1-1, part=Part-1
*Node
**Node
1,          12.,           0.
2,          12.,          -6.
3,           9.,         -15.
4,   7.99721575,         -53.
** Section: t3
*Shell Section, elset=Set-3, material=PET
2., 5
** Section: t4
*Shell Section, elset=Set-4, material=PET
2., 5
*End Instance
**  
*End Assembly
** 
** MATERIALS
** 
*Material, name=PET

The file I am trying to read is shown above, with the middle section cut out. The code I am using is:

inFile = open("Exp_1.inp")
outFile = open("Exp_12.inp", "w")
keepCurrentSet = False
for line in inFile:
    if line.startswith("*Node"):
        keepCurrentSet = False

    if keepCurrentSet:
        outFile.write(line)

    if line.startswith("*End instance"):
        keepCurrentSet = True
inFile.close()
outFile.close()

However, I cannot figure out why it writes a blank file.

EDIT

inFile = open("Exp_1.inp")
outFile = open("Exp_12.inp", "w")
keepCurrentSet = True
for line in inFile:
    if line.startswith("*Node"):
        keepCurrentSet = True

    if keepCurrentSet:
        outFile.write(line)

    if line.startswith("*End Instance"):
        keepCurrentSet = False
inFile.close()
outFile.close()

The above code is the solution to what I need. Could someone suggest how I could edit this code so that it does not include the final keyword '*End Instance'?

Thanks in advance.

A bit of elementary debugging would have shown you two things:

  • keepCurrentSet is set to False initially (it should probably be True)

  • You have a typo: if line.startswith("*End instance"): should be if line.startswith("*End Instance"):
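The typo matters because str.startswith compares characters exactly, including case. A minimal check, using a line written the way it appears in the sample file, shows why the flag never changed in the original code:

```python
# A line as it appears in the sample file, including the trailing newline.
line = "*End Instance\n"

# str.startswith is case-sensitive, so the lowercase "i" never matches.
typo_matches = line.startswith("*End instance")
fixed_matches = line.startswith("*End Instance")

print(typo_matches, fixed_matches)  # prints: False True
```

If the input might mix upper and lower case, comparing against line.lower() is one possible way to make the test more forgiving.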

You may not pick this solution, but I would use a regex, as it can do all the work for you. Using a regex, you could reduce your code to this:

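import re

# Drop everything from **Node up to (but not including) *End Instance.
with open("Exp_1.inp") as f, open("Exp_12.inp", "w") as x:
    # Raw string so the backslashes reach the regex engine unchanged.
    x.write(re.sub(r"\*\*Node[\s\S]*?(?=\*End Instance)", "", f.read()))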

Here it replaces everything from **Node up to, but not including, *End Instance with an empty string. [\s\S]*? lazily matches any characters, including newlines, and the lookahead (?=\*End Instance) stops the match just before that keyword without consuming it.

Output:

** ASSEMBLY
**
*Assembly, name=Assembly
**  
*Instance, name=Part-1-1, part=Part-1
*Node
*End Instance
**  
*End Assembly
** 
** MATERIALS
** 
*Material, name=PET
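If the goal is to keep the block between the two keywords rather than delete it, the same idea can be turned around with re.search. This is only a sketch: the function name extract_between and the inline sample text are assumptions made for illustration and are not part of the answer above.

```python
import re

def extract_between(text, start="*Node", end="*End Instance"):
    """Return the lines after the first line beginning with `start`,
    up to (not including) the next line beginning with `end`."""
    pattern = re.compile(
        # ^ anchors each keyword to the start of a line (MULTILINE);
        # (.*?) lazily captures everything in between, newlines included (DOTALL).
        r"^" + re.escape(start) + r"[^\n]*\n(.*?)^" + re.escape(end),
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(text)
    return match.group(1) if match else ""

# Hypothetical, shortened stand-in for the contents of Exp_1.inp.
sample = (
    "*Instance, name=Part-1-1, part=Part-1\n"
    "*Node\n"
    "1, 12., 0.\n"
    "2, 12., -6.\n"
    "*End Instance\n"
    "*End Assembly\n"
)

print(extract_between(sample))
```

Because the keywords are anchored to the start of a line, a comment line such as **Node is not mistaken for the *Node keyword.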

You are almost correct, but you made two mistakes:

First, the flag assignments are swapped. When you first encounter *Node, you should set keepCurrentSet to True so that the following lines are written to your output file, and once you reach *End Instance you should set it back to False so that writing stops there:

if line.startswith("*Node"):
    keepCurrentSet = True

Second, you have a typo: "*End instance" should be "*End Instance".

Which would give you:

inFile = open("Exp_1.inp")
outFile = open("Exp_12.inp", "w")
keepCurrentSet = False
for line in inFile:
    if line.startswith("*Node"):
        keepCurrentSet = True  # CHANGE HERE: False => True

    if keepCurrentSet:
        outFile.write(line)

    if line.startswith("*End Instance"):  # CHANGE HERE: instance => Instance
        keepCurrentSet = False  # CHANGE HERE: True => False, so writing stops after this line
inFile.close()
outFile.close()
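For the follow-up in the question's edit (leaving out the final *End Instance line), one option is to test for the end keyword before writing the line rather than after. The sketch below assumes that approach. The function name copy_between and its parameters are hypothetical, chosen only for illustration.

```python
def copy_between(in_path, out_path, start="*Node", end="*End Instance"):
    """Copy lines from the first `start` line up to, but not including, the `end` line."""
    keep = False
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            if line.startswith(end):
                keep = False   # tested before writing, so the end keyword is dropped
            elif line.startswith(start):
                keep = True    # tested before writing, so the start keyword is kept
            if keep:
                out_file.write(line)
```

With the file names from the question, this would be called as copy_between("Exp_1.inp", "Exp_12.inp"). Using with also closes both files even if an error occurs partway through.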
