
How to read files from an input folder and save each output file under the same name in an output folder in Python

I want my code to read input files from an input directory and save each result, under the same name as its input file, in a separate output directory.


import sys
import glob
import errno
import os

d = {}
chainIDs = ('A', 'B')
atomIDs = ('C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B', 'O4B', 'C1B')
count = 0
for doc in os.listdir('/C:/Users/Vishnu/Desktop/Test_folder/Input'):
doc1 = "doc_path" + doc
doc2 = "/C:/Users/Vishnu/Desktop/Test_folder/Output" + doc1
if doc1.endswith(".pdb"):
with open(doc) as pdbfile:
       single_line = ''.join([line for line in f])
       single_space = ' '.join(single_line.split())
       for line in map(str.rstrip, pdbfile):
            if line[:6] != "HETATM":
            chainID = line[21:22]
            atomID = line[13:16].strip()
            if chainID not in chainIDs:
            if atomID not in atomIDs:
                d[chainID][atomID] = line
            except KeyError:
                d[chainID] = {atomID: line}

    n = 4
    for chainID in chainIDs:
        for i in range(len(atomIDs)-n+1):
            for j in range(n):
                   with open(doc2.format(count) , "w") as doc2:
                         count += 1   


I get the error below when running the code above. I am new to Python and still learning; can anyone please help? Error:

with open(doc) as pdbfile:
IndentationError: expected an indented block

Input file:

HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  
HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  
HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  
HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C  
HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
HETATM15252  C4B NAD B 501      36.455 115.053  36.671  1.00 11.25           C  
HETATM15253  O4B NAD B 501      35.930 114.469  35.492  1.00 11.25           O  
HETATM15254  C3B NAD B 501      35.307 115.837  37.367  1.00 11.25           C  
HETATM15256  C2B NAD B 501      34.172 114.876  37.039  1.00 11.25           C  
HETATM15258  C1B NAD B 501      34.524 114.613  35.551  1.00 11.25           C  
HETATM15297  C4B NAD C 501      98.229 130.106  18.332  1.00 12.28           C  
HETATM15298  O4B NAD C 501      98.083 131.545  18.199  1.00 12.28           O  
HETATM15299  C3B NAD C 501      99.346 129.675  17.343  1.00 12.28           C  
HETATM15301  C2B NAD C 501     100.220 130.922  17.375  1.00 12.28           C  
HETATM15303  C1B NAD C 501      99.125 132.008  17.317  1.00 12.28           C  
HETATM15342  C4B NAD D 501      77.335 156.939  25.788  1.00 11.99           C  
HETATM15343  O4B NAD D 501      78.705 156.544  25.901  1.00 11.99           O  
HETATM15344  C3B NAD D 501      77.106 158.059  26.824  1.00 11.99           C  
HETATM15346  C2B NAD D 501      78.536 158.632  26.878  1.00 11.99           C  
HETATM15348  C1B NAD D 501      79.351 157.345  26.900  1.00 11.99           C  

Column 2 is the atom name (C4B, O4B, ...), column 3 is the residue name (NAD), and column 4 (A, B, C, D) is the chain ID:

Expected output for each chain ID. Chain IDs can range from A to Z, but are mostly A to H:

For chain A:

HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C 
HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  
HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C   

HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  
HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C  
HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  

HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C  
HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  
HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  

HETATM15211  C2B NAD A 501      46.447  98.835   6.988  1.00 11.48           C  
HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  
HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  
HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  

HETATM15209  C3B NAD A 501      47.659  99.689   6.567  1.00 11.48           C  
HETATM15207  C4B NAD A 501      47.266 101.038   7.214  1.00 11.48           C  
HETATM15208  O4B NAD A 501      46.466 100.713   8.371  1.00 11.48           O  
HETATM15213  C1B NAD A 501      46.221  99.300   8.426  1.00 11.48           C  
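
The expected groups above look like a sliding window of four atoms moved along the atomIDs sequence from the question. A minimal sketch of that grouping, assuming this reading of the expected output is right (the helper name sliding_windows is made up for illustration):

```python
def sliding_windows(items, size):
    """Return every run of `size` consecutive items, in order."""
    return [tuple(items[i:i + size]) for i in range(len(items) - size + 1)]

# Atom sequence copied from the question.
atomIDs = ('C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B', 'O4B', 'C1B')

groups = sliding_windows(atomIDs, 4)
for group in groups:
    print(group)
```

With eight atom names and a window of four this yields five groups, which matches the five blocks listed for chain A.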

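The IndentationError is raised because the with open(doc) as pdbfile: line is not indented under the if doc1.endswith(".pdb"): line before it; Python expects an indented block after every line that ends in a colon. The lines underneath the with statement are also indented inconsistently, so keep one indentation width per level (four spaces is the convention).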
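
To see the rule in isolation, here is a small sketch that compiles two tiny snippets and reports which one Python rejects. The snippets and the helper name check are invented for illustration and are not the asker's script.

```python
# Hypothetical snippets: the only difference is the indentation of the last line.
bad = (
    "for doc in ['a.pdb']:\n"
    "    if doc.endswith('.pdb'):\n"
    "    print(doc)\n"            # not indented deeper than the if
)
good = (
    "for doc in ['a.pdb']:\n"
    "    if doc.endswith('.pdb'):\n"
    "        print(doc)\n"        # one level deeper than the if
)

def check(source):
    """Return None if source compiles, else the name of the error raised."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError as exc:    # IndentationError is a subclass of SyntaxError
        return type(exc).__name__
    return None

print(check(bad))
print(check(good))
```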

Hope this helps!

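import os

# Folder paths taken from the question; adjust them to your machine.
doc_path = r"C:\Users\Vishnu\Desktop\Test_folder\Input"    # input folder
tar_path = r"C:\Users\Vishnu\Desktop\Test_folder\Output"   # output folder

chainIDs = ('A', 'B')
atomIDs = ('C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B', 'O4B', 'C1B')
n = 4  # number of atoms per output group

os.makedirs(tar_path, exist_ok=True)
for doc in os.listdir(doc_path):
    if not doc.endswith(".pdb"):
        continue
    doc1 = os.path.join(doc_path, doc)   # full path of the input file
    doc2 = os.path.join(tar_path, doc)   # output file with the same name

    d = {}  # chainID -> {atomID: line}, rebuilt for every input file
    with open(doc1) as pdbfile:          # open doc1 (the full path), not doc
        for line in map(str.rstrip, pdbfile):
            if line[:6] != "HETATM":
                continue
            chainID = line[21:22]
            atomID = line[13:16].strip()
            if chainID not in chainIDs or atomID not in atomIDs:
                continue
            d.setdefault(chainID, {})[atomID] = line

    with open(doc2, "w") as out:
        for chainID in chainIDs:
            atoms = d.get(chainID, {})
            for i in range(len(atomIDs) - n + 1):
                group = atomIDs[i:i + n]
                # Write a group only if every atom in it was found for this chain.
                if all(a in atoms for a in group):
                    out.write("\n".join(atoms[a] for a in group) + "\n\n")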
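
The folder handling on its own, separate from the PDB logic, could also be written with pathlib. This is only a sketch under assumptions: the function name copy_filtered and its keep parameter are hypothetical, and the default filter (keep HETATM lines) stands in for whatever processing each file really needs.

```python
from pathlib import Path

def copy_filtered(in_dir, out_dir, keep=lambda line: line.startswith("HETATM")):
    """Write the kept lines of each .pdb file in in_dir to a same-named file in out_dir.

    copy_filtered and keep are illustrative names, not part of any library.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)   # create the output folder if missing
    written = []
    for src in sorted(in_dir.glob("*.pdb")):
        dst = out_dir / src.name                 # same file name, different folder
        with src.open() as fin, dst.open("w") as fout:
            fout.writelines(line for line in fin if keep(line))
        written.append(dst)
    return written
```

Calling copy_filtered with the Input and Output folders would leave one output file per input .pdb file, each under its original name. Building paths with the / operator also avoids joining path pieces by hand with '\\'.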


