Using Biopython to parse a PDB file

I need to parse a PDB file using Biopython in order to extract each line that pertains to an alpha carbon (CA). Here is the code that I use:

from Bio.PDB import *

parser=PDBParser()
io = PDBIO()

structure_2 = parser.get_structure('Y', 'A.pdb')

for l in structure_2:
   if atom.get_id() == 'CA':
       io.set_structure(atom)
       io.save("alpha.pdb")

My idea is that the for loop will go through each line of the PDB file and write each line that pertains to an alpha carbon ('CA') to a new PDB file called alpha.pdb. Here is a short preview of the file A.pdb that is loaded into structure_2:

ATOM      1  N   LYS A  35      -5.054  29.359  -1.504  1.00 61.86           N  
ATOM      2  CA  LYS A  35      -5.430  28.077  -0.842  1.00 61.30           C  
ATOM      3  C   LYS A  35      -4.188  27.450  -0.230  1.00 59.47           C  
ATOM      4  O   LYS A  35      -3.142  27.339  -0.875  1.00 59.94           O  
ATOM      5  CB  LYS A  35      -6.055  27.113  -1.860  1.00 63.54           C  
ATOM      6  CG  LYS A  35      -7.354  26.443  -1.409  1.00 65.88           C  
ATOM      7  CD  LYS A  35      -7.126  25.382  -0.339  1.00 66.83           C  
ATOM      8  CE  LYS A  35      -8.363  24.507  -0.172  1.00 67.47           C  
ATOM      9  NZ  LYS A  35      -8.010  23.158   0.355  1.00 68.07           N  
ATOM     10  N   TYR A  36      -4.293  27.093   1.042  1.00 56.18           N  
ATOM     11  CA  TYR A  36      -3.183  26.472   1.741  1.00 52.61           C  
ATOM     12  C   TYR A  36      -3.455  24.992   1.893  1.00 51.51           C  
ATOM     13  O   TYR A  36      -4.561  24.580   2.250  1.00 51.93           O  
ATOM     14  CB  TYR A  36      -2.986  27.111   3.117  1.00 49.10           C  
ATOM     15  CG  TYR A  36      -2.305  28.456   3.074  1.00 45.23           C 

As you can see, the relevant information (CA) is in the third column of the PDB file. Whenever I run my code, it does not write any new files, but it doesn't give me any errors. What could I be doing wrong here?
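
A note on the "third column" observation: PDB is a fixed-width format, so the atom name sits in character columns 13-16 of each ATOM record rather than in a whitespace-separated field. The sketch below filters on that slice using plain Python and no Biopython. The helper name `ca_lines` and the `SAMPLE` string (a few lines copied from the preview above) are illustrative assumptions, not part of the original post.

```python
def ca_lines(pdb_lines):
    """Yield ATOM records whose atom-name field (columns 13-16) is 'CA'."""
    for line in pdb_lines:
        # Columns are 1-indexed in the PDB format, so 13-16 is the slice [12:16].
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            yield line


# A few records copied from the A.pdb preview (hypothetical test input).
SAMPLE = """\
ATOM      1  N   LYS A  35      -5.054  29.359  -1.504  1.00 61.86           N
ATOM      2  CA  LYS A  35      -5.430  28.077  -0.842  1.00 61.30           C
ATOM      3  C   LYS A  35      -4.188  27.450  -0.230  1.00 59.47           C
ATOM     10  N   TYR A  36      -4.293  27.093   1.042  1.00 56.18           N
ATOM     11  CA  TYR A  36      -3.183  26.472   1.741  1.00 52.61           C
"""

selected = list(ca_lines(SAMPLE.splitlines()))
for line in selected:
    print(line)
```

Slicing by column is generally safer than `line.split()`, because adjacent fixed-width fields in a PDB file can run together without a separating space.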

Below you can find a script that loads the protein structure 1p49.pdb (from the script's directory), parses it, and saves only the alpha carbon atom records to the file 1p49_out.pdb:

#!/usr/bin/env python3
import Bio
print("Biopython v" + Bio.__version__)

from Bio.PDB import PDBParser
from Bio.PDB import PDBIO

# Parse and get basic information
parser=PDBParser()
protein_1p49 = parser.get_structure('STS', '1p49.pdb')
protein_1p49_resolution = protein_1p49.header["resolution"]
protein_1p49_keywords = protein_1p49.header["keywords"]

print("Sample name: " + str(protein_1p49))
print("Resolution: " + str(protein_1p49_resolution))
print("Keywords: " + str(protein_1p49_keywords))
print("Model: " + str(protein_1p49[0]))

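# Initialize the PDB writer
io = PDBIO()

# Custom selector: PDBIO.save() calls these accept_* methods on every
# model, chain, residue and atom, and writes only those that return True.
# This class defines all four methods itself rather than inheriting
# from Bio.PDB.Select.
class Select():
    def accept_model(self, model):
        return True
    def accept_chain(self, chain):
        return True
    def accept_residue(self, residue):
        return True
    def accept_atom(self, atom):
        # Debug output; for an Atom, get_id() and get_name() normally return the same string
        print("atom id:" + atom.get_id())
        print("atom name:" + atom.get_name())
        if atom.get_name() == 'CA':
            print("True")
            return True
        else:
            return False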

#write to output file
io.set_structure(protein_1p49)
io.save("1p49_out.pdb", Select())
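
As for why the loop in the question probably writes nothing useful: iterating over a Structure yields Model objects, not lines or atoms, and the name `atom` is never assigned in the snippet as shown. The following is a minimal sketch, not the only way to do it, that shows both points and then writes only the CA atoms by subclassing `Bio.PDB.Select`. The class name `CAOnly`, the `PDB_TEXT` sample (lines taken from the question's preview) and the use of in-memory `io.StringIO` handles instead of real files are assumptions made to keep the example self-contained.

```python
import io

from Bio.PDB import PDBParser, PDBIO, Select

# A few records copied from the question's A.pdb preview (illustrative input).
PDB_TEXT = """\
ATOM      1  N   LYS A  35      -5.054  29.359  -1.504  1.00 61.86           N
ATOM      2  CA  LYS A  35      -5.430  28.077  -0.842  1.00 61.30           C
ATOM      3  C   LYS A  35      -4.188  27.450  -0.230  1.00 59.47           C
ATOM     10  N   TYR A  36      -4.293  27.093   1.042  1.00 56.18           N
ATOM     11  CA  TYR A  36      -3.183  26.472   1.741  1.00 52.61           C
"""


class CAOnly(Select):
    """Accept only atoms named 'CA'; the inherited methods accept everything else."""

    def accept_atom(self, atom):
        return atom.get_name() == "CA"


parser = PDBParser(QUIET=True)
# get_structure() accepts a file name or an open handle.
structure = parser.get_structure("Y", io.StringIO(PDB_TEXT))

# Iterating a Structure gives Model objects, which is why the original loop never sees atoms.
first_child = next(iter(structure))
print(type(first_child).__name__)

# To visit atoms directly, use get_atoms() (or nest loops over model/chain/residue).
ca_residues = [a.get_parent().get_resname()
               for a in structure.get_atoms() if a.get_name() == "CA"]
print(ca_residues)

# Write the filtered structure once, passing the selector to save().
writer = PDBIO()
writer.set_structure(structure)
out = io.StringIO()
writer.save(out, CAOnly())
ca_records = [l for l in out.getvalue().splitlines() if l.startswith("ATOM")]
```

With real files, the same calls would take 'A.pdb' and "alpha.pdb" in place of the `StringIO` handles. Calling `save()` once on the whole structure also avoids overwriting the output file on every loop iteration, as the original snippet would have done.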
