
Adding properties to an SDF file in Python 3

I am new to RDKit, so excuse me if the question sounds very basic. I have an SDF file with several molecules, and I would like to add certain properties to each entry. How can I achieve this? My sample data looks like this:

D00AAN
  -OEChem-10101305022D

100108  0     1  0  0  0  0  0999 V2000
    2.0000    5.1929    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    5.2896    2.9173    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0
    6.3905   -0.2731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.8629   -5.1929    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  1 53  1  0  0  0  0
  2  5  1  0  0  0  0
  2  6  2  0  0  0  0
M  END

$$$$

D00AAU
  -OEChem-10101305022D

 42 43  0     1  0  0  0  0  0999 V2000
    6.3301    3.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.0000   -3.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    4.5981    0.2500    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
  1 15  1  0  0  0  0
  1 41  1  0  0  0  0
  2 16  1  0  0  0  0
  2 42  1  0  0  0  0
  3  4  1  0  0  0  0
  3  5  1  0  0  0  0
  3  8  1  0  0  0  0
M  END

$$$$

I would like to add a line after each molecule entry:

>  <ID>  id

The expected output is:

D00AAN
  -OEChem-10101305022D

100108  0     1  0  0  0  0  0999 V2000
    2.0000    5.1929    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
    5.2896    2.9173    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0
    6.3905   -0.2731    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.8629   -5.1929    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  1 53  1  0  0  0  0
  2  5  1  0  0  0  0
  2  6  2  0  0  0  0
M  END
>  <ID>  D00AAN
$$$$

D00AAU
  -OEChem-10101305022D

 42 43  0     1  0  0  0  0  0999 V2000
    6.3301    3.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.0000   -3.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    4.5981    0.2500    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
  1 15  1  0  0  0  0
  1 41  1  0  0  0  0
  2 16  1  0  0  0  0
  2 42  1  0  0  0  0
  3  4  1  0  0  0  0
  3  5  1  0  0  0  0
  3  8  1  0  0  0  0
M  END
>  <ID>  D00AAU
$$$$

To get the title and turn it into a property:

  • read the .sdf with Chem.SDMolSupplier()

  • write the result to an .sdf with Chem.SDWriter('new.sdf') (pass 'old.sdf' instead to overwrite the input, but only once all molecules have been read from it)

  • get the title with GetProp('_Name')

  • set the title as a property with SetProp('ID', title)

This code should work:

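from rdkit import Chem

suppl = Chem.SDMolSupplier('old.sdf')

# Write to a new file. The supplier reads lazily, so to overwrite
# old.sdf, load all molecules into a list before opening the writer.
w = Chem.SDWriter('new.sdf')

for m in suppl:
    if m is None:             # entry RDKit could not parse; skip it
        continue
    n = m.GetProp('_Name')    # title (first line of the molblock)
    m.SetProp('ID', n)        # stored as an SD data item named ID
    w.write(m)

w.close()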
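
If RDKit is not available, or some records fail to parse and you still want every entry tagged, the same result can be sketched with plain text processing. This is a different technique from the RDKit code above, not a drop-in equivalent. The function name add_id_from_title is hypothetical, and the sketch assumes each record's title is its first line and that records end with a $$$$ line. It writes the value on the line after the tag and follows it with a blank line, which is the usual SD data item layout, rather than putting the value on the tag line as in the expected output of the question.

```python
def add_id_from_title(sdf_text: str, tag: str = "ID") -> str:
    """Return sdf_text with a data item holding the record title added to every record."""
    out = []
    record = []  # lines of the record currently being read
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            # Assumption: the title is the first line of the record.
            title = record[0].strip() if record else ""
            out.extend(record)
            out.append(f">  <{tag}>")  # data item header
            out.append(title)          # data item value
            out.append("")             # blank line ends the data item
            out.append("$$$$")
            record = []
        else:
            record.append(line)
    out.extend(record)  # keep anything after the last $$$$ unchanged
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Tiny in-memory record with no atoms, used only for illustration.
    sample = (
        "D00AAN\n"
        "  -OEChem-10101305022D\n"
        "\n"
        "  0  0  0     0  0  0  0  0  0999 V2000\n"
        "M  END\n"
        "$$$$\n"
    )
    print(add_id_from_title(sample))
```

For real files, read the input with open('old.sdf').read(), pass the text to the function, and write the returned string to a new file. Because the molblocks are copied line by line, coordinates and bond tables stay exactly as they were in the input.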
