I want to sort data from a PDB file in Perl or Python
I want to print the ribose puckering sequence.
Script in Perl:
open(my $list_fh, '<', "List_NAD_ID.txt") or die $!;   # Input file
my @file1 = <$list_fh>;
my $OutputDir = 'C:\Users\result';   # Output directory path
foreach my $line (@file1)
{
    chomp $line;
    open(my $fh, '<', $line) or die $!;
    open(my $out, '>', "$OutputDir/$line.pdb") or die $!;
    print $out "\n", "$line ";
    print $out "\n";
    while (my $file = <$fh>)
    {
        # Keep only the ribose ring atoms (this filters, it does not reorder)
        if ($file =~ /^HETATM.{7}(?:C4B|O4B|C1B|C2B|C3B)/)
        {
            print $out $file;
        }
    }
    print "Completed", "\n";
}
I have this PDB input file:
HETATM 3934 C4B NAD A 255 10.495 -11.444 1.016 1.00 50.46 C
HETATM 3935 O4B NAD A 255 10.768 -11.615 2.448 1.00 48.17 O
HETATM 3936 C3B NAD A 255 10.445 -12.867 0.431 1.00 49.69 C
HETATM 3938 C2B NAD A 255 10.431 -13.759 1.675 1.00 48.46 C
HETATM 3940 C1B NAD A 255 11.323 -12.898 2.593 1.00 46.97 C
HETATM 3978 C4B NAD B 256 14.596 1.733 33.219 1.00 50.48 C
HETATM 3979 O4B NAD B 256 14.370 0.578 32.357 1.00 48.22 O
HETATM 3980 C3B NAD B 256 14.940 1.177 34.603 1.00 49.64 C
HETATM 3982 C2B NAD B 256 14.987 -0.347 34.401 1.00 48.48 C
HETATM 3984 C1B NAD B 256 14.066 -0.517 33.189 1.00 46.98 C
Expected result:
I want to copy the following atoms and then paste them in the following sequence. Everything should be done chain by chain (chain A, B, C, ...).
HETATM 3934 **C4B** NAD **A** 255 10.495 -11.444 1.016 1.00 50.46 C
HETATM 3935 **O4B** NAD **A** 255 10.768 -11.615 2.448 1.00 48.17 O
HETATM 3940 **C1B** NAD **A** 255 11.323 -12.898 2.593 1.00 46.97 C
HETATM 3938 **C2B** NAD **A** 255 10.431 -13.759 1.675 1.00 48.46 C
HETATM 3935 **O4B** NAD **A** 255 10.768 -11.615 2.448 1.00 48.17 O
HETATM 3940 **C1B** NAD **A** 255 11.323 -12.898 2.593 1.00 46.97 C
HETATM 3938 **C2B** NAD **A** 255 10.431 -13.759 1.675 1.00 48.46 C
HETATM 3936 **C3B** NAD **A** 255 10.445 -12.867 0.431 1.00 49.69 C
.
.
.
I have five paste sequences: v0, v1, v2, v3, v4.
The sequences are:
C4B-O4B-C1B-C2B
O4B-C1B-C2B-C3B
C1B-C2B-C3B-C4B
C2B-C3B-C4B-O4B
C3B-C4B-O4B-C1B
For all of these sequences, I want to print the data in the order shown above. I have also edited the expected result.
I want to sort the data according to the sequences above, chain by chain. I am not getting the expected result. I have tried in Perl. I am new to Perl and Python, so please help me solve my problem.
It is like a matrix problem:
For example, we have five values: 1, 2, 3, 4, 5
Row 1 - 1 2 3 4
Row 2 - 2 3 4 5
Row 3 - 3 4 5 1
Row 4 - 4 5 1 2
I want to print the data like that for each chain, chain A to Z.
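The matrix pattern above amounts to a sliding window of four over a circular list of five. A minimal Python sketch of that idea, assuming the ring order C4B, O4B, C1B, C2B, C3B implied by the five sequences listed earlier; the function name `cyclic_windows` is made up for illustration:

```python
def cyclic_windows(items, width):
    """Return one window of `width` consecutive items per start position,
    wrapping around to the beginning of the list when needed."""
    n = len(items)
    return [[items[(start + k) % n] for k in range(width)] for start in range(n)]


# Ring order assumed from the sequences v0..v4 in the question
RING = ["C4B", "O4B", "C1B", "C2B", "C3B"]

for window in cyclic_windows(RING, 4):
    print("-".join(window))
```

Each printed line is one of the four-atom sequences, so the same helper can drive the reordering of the atom records.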
If you want to use Biopython, you have to create the chains yourself and insert the atoms into them. The atoms must be held in a Residue for this to work:
from Bio.PDB import PDBParser, PDBIO, Chain, Residue

# This is your source structure
pdb = PDBParser().get_structure("UGLY", "ugly.pdb")

# Now you cycle all your chains
for chain in pdb.get_chains():
    # Load all the atoms and residues in each Chain
    atoms = list(chain.get_atoms())
    residues = list(chain.get_residues())

    # Start a new structure to save the output
    io = PDBIO()
    # Reuse the source chain ID instead of hard-coding "A"
    this_chain = Chain.Chain(chain.id)
    this_residue = Residue.Residue(residues[0].id,
                                   residues[0].resname,
                                   residues[0].segid)

    # Now get the atoms in your source structure that match your sort keys
    # You should refactor this out to a function that accepts a sort key
    # and returns a list of atoms or a residue with the atoms added.
    for atom_name in "O4B-C1B-C2B-C3B".split("-"):
        for atom in atoms:
            if atom.get_name() == atom_name:
                this_residue.add(atom)

    # Add the residue to a structure and save it
    this_chain.add(this_residue)
    io.set_structure(this_chain)

    # And now write your output file, one per chain so that
    # later chains do not overwrite earlier ones. Change the name as needed!
    io.save(f"temp_{chain.id}.pdb")
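If repeating the same atom line in several sequences is all that is needed, a plain-text approach without Biopython may be simpler. The following is only a sketch under stated assumptions: fields are whitespace-separated as in the sample shown in the question (a strict fixed-column PDB file would need column slicing instead), and the names `torsion_sequences`, `reorder_ring_atoms`, and `sample` are hypothetical helpers, not part of any library.

```python
from collections import defaultdict

# Ring order assumed from the five sequences (v0..v4) in the question
RING = ["C4B", "O4B", "C1B", "C2B", "C3B"]


def torsion_sequences(ring=RING, width=4):
    """Build the five cyclic four-atom sequences from the ring order."""
    n = len(ring)
    return [[ring[(i + k) % n] for k in range(width)] for i in range(n)]


def reorder_ring_atoms(lines):
    """Return HETATM lines repeated in v0..v4 order, grouped by chain.

    Assumes whitespace-separated fields in this order:
    record, serial, atom name, residue name, chain, residue number, ...
    """
    by_residue = defaultdict(dict)  # (chain, residue number) -> {atom name: line}
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or fields[0] != "HETATM":
            continue
        name, chain, resseq = fields[2], fields[4], fields[5]
        if name in RING:
            by_residue[(chain, resseq)][name] = line.rstrip("\n")

    out = []
    for key in sorted(by_residue):  # chain A first, then B, and so on
        atoms = by_residue[key]
        for seq in torsion_sequences():
            # Skip a sequence if any of its atoms is missing for this residue
            if all(name in atoms for name in seq):
                out.extend(atoms[name] for name in seq)
    return out


sample = """\
HETATM 3934 C4B NAD A 255 10.495 -11.444 1.016 1.00 50.46 C
HETATM 3935 O4B NAD A 255 10.768 -11.615 2.448 1.00 48.17 O
HETATM 3936 C3B NAD A 255 10.445 -12.867 0.431 1.00 49.69 C
HETATM 3938 C2B NAD A 255 10.431 -13.759 1.675 1.00 48.46 C
HETATM 3940 C1B NAD A 255 11.323 -12.898 2.593 1.00 46.97 C
HETATM 3978 C4B NAD B 256 14.596 1.733 33.219 1.00 50.48 C
HETATM 3979 O4B NAD B 256 14.370 0.578 32.357 1.00 48.22 O
HETATM 3980 C3B NAD B 256 14.940 1.177 34.603 1.00 49.64 C
HETATM 3982 C2B NAD B 256 14.987 -0.347 34.401 1.00 48.48 C
HETATM 3984 C1B NAD B 256 14.066 -0.517 33.189 1.00 46.98 C
""".splitlines()

for line in reorder_ring_atoms(sample):
    print(line)
```

Keeping the original lines as strings means the coordinates are copied through untouched; only the order and repetition of records change.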