
BioJava: HET amino acids

While reading PDB structure 2a65, I ran into an amino acid residue that should be considered a "ligand of the protein" rather than a "part of the protein".

In the PDB file as well as in the CIF file, this LEU.601 residue is tagged as HET. Unfortunately, because its name is LEU, BioJava seems to tag it automatically as ATOM. Does anybody know a way to discriminate between "protein chain A" and the ligand "LEU.601"?

A sample of 2a65.pdb:

HETATM 4149  N   LEU A 601      24.537  32.416  18.866  1.00 15.26           N
HETATM 4150  CA  LEU A 601      25.812  31.696  18.815  1.00 16.66           C
HETATM 4151  C   LEU A 601      25.693  30.381  18.046  1.00 16.48           C
...

A snippet of my BioJava code:

Group g=s.findGroup("A", "601");
System.out.println(g);
System.out.println(g.getType());

g=s.findGroup("A", "701");
System.out.println(g);
System.out.println(g.getType());

And what it generates:

AminoAcid ATOM:LEU L 601 true ATOM atoms: 9
amino
Hetatom 701 BOG true atoms: 20
hetatm

In BioJava 4, this is handled through seqres groups and atom groups. Groups that belong to a ligand are not in the seqres groups at all. This snippet demonstrates how to loop through both:

import org.biojava.nbio.structure.Chain;
import org.biojava.nbio.structure.Group;
import org.biojava.nbio.structure.Structure;
import org.biojava.nbio.structure.StructureIO;

public class StackOverflowIssue {

    public static void main(String[] args) throws Exception {

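        // Load PDB entry 2a65
        Structure s = StructureIO.getStructure("2a65");

        Chain c = s.getChainByPDB("A");

        // Residues of the polymer sequence (SEQRES records): no ligands here
        for (Group gr : c.getSeqResGroups()) {
            System.out.println(gr.getResidueNumber()+" "+gr.getPDBName());
        }

        // Groups with coordinates (ATOM and HETATM records), waters skipped
        for (Group gr : c.getAtomGroups()) {
            if (!gr.isWater())
                System.out.println(gr.getResidueNumber()+" "+gr.getPDBName());
        }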

    }

}

The output will show that the atom groups contain the LEU 601 you refer to, whilst the seqres groups don't contain it.

In BioJava 5 (not released yet, but you can use the SNAPSHOT builds or grab the master branch directly from GitHub: https://github.com/biojava/biojava), polymer and non-polymer entities are handled in a much better way. Every ligand molecule is assigned to its own chain, so it is easy to separate what's polymer (protein or nucleic acid) from what's ligand.

If you keep using 4, do use the latest 4.2.1 (or wait a few days until 4.2.2 is released).
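As a library-independent cross-check, the HET tag can also be read straight from the coordinate records, since the record name sits in columns 1-6 of every PDB line. The following is only a minimal sketch and is not part of BioJava: the class name HetRecordScan and the method hetResidues are hypothetical, and it assumes the standard fixed-column layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26). The sample lines are the ones quoted in the question.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a BioJava class: collects residues tagged HETATM.
final class HetRecordScan {

    /** Returns one "chain:resName:resSeq" key per residue found in HETATM records. */
    static Set<String> hetResidues(List<String> pdbLines) {
        Set<String> het = new LinkedHashSet<>();
        for (String line : pdbLines) {
            // Skip ATOM and other records, and lines too short to hold columns 1-26
            if (line.length() < 26 || !line.startsWith("HETATM")) {
                continue;
            }
            String resName = line.substring(17, 20).trim(); // columns 18-20
            char chain = line.charAt(21);                   // column 22
            String resSeq = line.substring(22, 26).trim();  // columns 23-26
            het.add(chain + ":" + resName + ":" + resSeq);
        }
        return het;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "HETATM 4149  N   LEU A 601      24.537  32.416  18.866  1.00 15.26           N",
            "HETATM 4150  CA  LEU A 601      25.812  31.696  18.815  1.00 16.66           C",
            "HETATM 4151  C   LEU A 601      25.693  30.381  18.046  1.00 16.48           C");
        System.out.println(hetResidues(sample)); // prints [A:LEU:601]
    }
}
```

Note that waters and other small molecules are HETATM records too, so with a full file the resulting set would contain more than the LEU ligand.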
