
How to add a word after a particular line in a text?

I am trying to add a word after a particular line in my file. I am using infile and a reference and trying to create outfile. The infile and reference are the same type, but the reference contains the TER word at a particular position. I want to add the TER word into the outfile (basically create a copy of infile and add the TER from the reference). I am trying to search by a number (resnum in the code) but there will be a problem as many consecutive lines have the same number. Can anybody help?

from sys import argv
import argparse

script,infile,outfile, reference = argv
Ter = []
res = []


def get_Ter(reference):
    reference_1 = open(reference,"r")
    for line in reference_1:
        contents = line.split(" ")
        if contents[0] == "TER":
        resnum = line[22:27]
        resname = line[17:20]
        chain = line[21]
        Ter.append(resnum)

        def find_TER(infile,outfile):
            with open(infile, "r") as infile_1:
                content = infile_1.readlines()
            with open(outfile, "w+") as outfile_1:
                outfile_1.write(content)
                if line[0:6] == "ATOM  ":
                    resnum_1 = line[22:27]
                    res.append(resnum_1)
                    if resnum_1 in res == resnum in Ter:
                        outfile_1.write(line + "\nTER")

        find_TER(infile,outfile)
get_Ter(reference)

Example of a file (this is the reference; the infile is the same but missing the TER). In the real file the columns are all nicely lined up underneath each other; the formatting shown here has collapsed them:
ATOM 992 SG CYX D 452 23.296 45.745 28.572 1.00 0.00
ATOM 993 C CYX D 452 20.742 42.431 27.841 1.00 0.00
ATOM 994 O CYX D 452 20.689 41.447 28.565 1.00 0.00
ATOM 995 OXT CYX D 452 19.788 42.822 27.185 1.00 0.00
TER 995 CYS D 452
ATOM 996 N ARG D 492 27.510 26.357 34.041 1.00 0.00
ATOM 997 H1 ARG D 492 26.590 26.591 33.694 1.00 0.00
ATOM 998 H2 ARG D 492 28.138 27.135 34.182 1.00 0.00
ATOM 999 H3 ARG D 492 27.422 26.030 34.993 1.00 0.00
ATOM 1000 CA ARG D 492 28.179 25.410 33.192 1.00 0.00

Now I have this:

from sys import argv
import argparse

   script,infile,outfile, reference = argv
   Ter = []
   res = []

def get_Ter(reference):
    reference_1 = open(reference,"r")
    for line in reference_1:
        contents = line.split(" ")
    if contents[0] == "TER":
        ternum = line[22:27]

        def find_TER(infile,outfile):
            with open(infile, "r") as infile_1:
                content = infile_1.readlines()
            with open(outfile, "w+") as outfile_1:
                for line in content:
                    outfile_1.write(line)
                    line = line.split(" ")
                    if line[0] == "ATOM":
                        resnum = line[22:27]
                        if ternum == resnum:




                            find_TER(infile,outfile)
get_Ter(reference)

The basic logic is twofold:

  1. Determine when you need the TER line and generate it. (You've done this.)
  2. Detect when it's time to write that line to the output.

All you really need to do for the second part is to recognize that you have a pending TER output for resnum 452 (or whatever number it is). You can do this with a simple variable: keep it at -1 until you have a valid resnum.

As you read, check each line's resnum against that variable. If the variable is positive and differs from the resnum of the line you have just read, the residue has ended, so you have to write the TER line before doing anything else. Something like this:

# ternum is an int; -1 means no TER line is pending
contents = line.split()
resnum = line[22:27]
if ternum > 0 and ternum != int(resnum):
    # write out the TER line
    ternum = -1

# continue with rest of the program.
if contents[0] == "TER":
    ...

You might also need to check at end of file, in case the last resnum has a hanging TER line to print out.
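Putting the two parts and the end-of-file check together, here is a minimal sketch rather than a definitive solution. The function names (residue_key, read_ter_records, add_ter, main) are made up for illustration. It assumes the files use the fixed-width columns from your code (chain at line[21], residue number at line[22:27]) and that the infile contains no TER lines yet. Matching on the chain as well as the residue number is my own addition, to avoid confusing equal numbers in different chains.

```python
def residue_key(line):
    """Return (chain, residue number) read from the fixed-width columns."""
    # Slices instead of indexing, so short lines do not raise IndexError.
    return line[21:22], line[22:27].strip()


def read_ter_records(reference_lines):
    """Map the residue key of every TER line in the reference to that line."""
    return {
        residue_key(line): line.rstrip("\n") + "\n"
        for line in reference_lines
        if line.startswith("TER")
    }


def add_ter(in_lines, ter_records):
    """Yield the input lines, with a TER line after each flagged residue."""
    pending = None  # key of a residue that still needs its TER written
    for line in in_lines:
        is_atom = line.startswith("ATOM")
        key = residue_key(line) if is_atom else None
        # The flagged residue has ended: write its TER before this line.
        if pending is not None and key != pending:
            yield ter_records[pending]
            pending = None
        yield line.rstrip("\n") + "\n"
        if is_atom and key in ter_records:
            pending = key
    # The file ended while a TER was still pending.
    if pending is not None:
        yield ter_records[pending]


def main(infile, outfile, reference):
    """Copy infile to outfile, adding the TER lines found in reference."""
    with open(reference) as ref:
        ter_records = read_ter_records(ref)
    with open(infile) as src, open(outfile, "w") as dst:
        dst.writelines(add_ter(src, ter_records))
```

With the argument order from your script you would call main(infile, outfile, reference). Because the pending key is only cleared when a line with a different residue arrives, consecutive lines that share the same number are handled without any extra bookkeeping.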

Is that enough to move you along?
