How do I import an SDF or SD file in Python?

Question    Votes: 0    Answers: 1

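This is my first time asking a question here, and I'm fairly new to programming, so please bear with me. I'm trying to import a list of compounds from ChEMBL so that I can display them in a grid and classify them. The problem is that every time I try to import the SDF, I get an error saying the input file is bad.

I have tried giving the path of the SDF I downloaded, and I have also tried entering the file's actual name, but I still get the same error. I feel like I'm making some silly mistake, but I can't pin it down.

Do I have to feed in each compound's data as a separate .sdf file and then print it?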

INPUT
from rdkit import Chem
from rdkit.Chem import AllChem
sd_supplier = Chem.SDMolSupplier(r'C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf')  # (*) this line raises the error
for mol in sd_supplier:
    name = mol.GetProp('Name')
    smiles = Chem.MolToSmiles(mol)
    print(f'Molecule: {name}')
    print(f'SMILES: {smiles}')
OUTPUT
OSError                                   Traceback (most recent call last)
<ipython-input-33-a625f28bcf9e> in <cell line: 2>()
---->3 sd_supplier = Chem.SDMolSupplier(r'C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf')
OSError: File error: Bad input file C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf

(* This line throws the error every time. I tried passing the SDF's actual file name, and I tried the full file path, but either way it still fails.)

python chemistry rdkit cheminformatics
1 Answer
0 votes

I tried your code with the following input file:

PNN_ideal.sdf

PNN
  -OEChem-10252417133D

 41 43  0     1  0  0  0  0  0999 V2000
   -2.6110   -0.3980    0.9770 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4630   -0.7530    1.1450 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6430   -0.8790    2.2690 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2640    0.3720    2.9340 C   0  0  1  0  0  0  0  0  0  0  0  0
   -1.0640    0.5150    4.2040 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3510    1.7360    4.6820 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4470   -0.4680    4.7910 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.2550    0.4140    3.2550 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5830   -0.3670    4.5290 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.7730    1.8510    3.3350 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8810   -0.4510    1.7460 S   0  0  0  0  0  0  0  0  0  0  0  0
    0.4280   -1.5480    1.4930 C   0  0  2  0  0  0  0  0  0  0  0  0
   -0.3690   -1.2470    0.2230 C   0  0  2  0  0  0  0  0  0  0  0  0
    0.2220   -0.1980   -0.6100 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.0470   -0.2190   -1.9460 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5990   -1.1080   -2.4580 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.6560    0.8590   -2.8030 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3140    0.6050   -4.2490 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8360    1.1440   -4.7920 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.1490    0.9110   -6.1180 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3110    0.1390   -6.9010 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.8400   -0.3980   -6.3580 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1550   -0.1610   -5.0330 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.4920    1.2090    2.2740 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8640    1.8290    5.4960 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.6630   -0.3860    4.6760 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.1080    0.1130    5.3830 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.2120   -1.3880    4.4340 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.8540    1.8400    3.4780 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.5340    2.3750    2.4100 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.3000    2.3620    4.1740 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.6030   -2.6020    1.7090 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6420   -2.1370   -0.3440 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7400    0.5120   -0.2000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.2600    1.8290   -2.5010 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.7390    0.8550   -2.6790 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4910    1.7470   -4.1810 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.0490    1.3310   -6.5430 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5560   -0.0430   -7.9370 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.4950   -1.0010   -6.9690 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.0550   -0.5820   -4.6080 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  2 13  1  0  0  0  0
  3  4  1  0  0  0  0
  3 12  1  0  0  0  0
  4  5  1  0  0  0  0
  4  8  1  0  0  0  0
  4 24  1  0  0  0  0
  5  6  1  0  0  0  0
  5  7  2  0  0  0  0
  6 25  1  0  0  0  0
  8  9  1  0  0  0  0
  8 10  1  0  0  0  0
  8 11  1  0  0  0  0
  9 26  1  0  0  0  0
  9 27  1  0  0  0  0
  9 28  1  0  0  0  0
 10 29  1  0  0  0  0
 10 30  1  0  0  0  0
 10 31  1  0  0  0  0
 11 12  1  0  0  0  0
 12 13  1  0  0  0  0
 12 32  1  0  0  0  0
 13 14  1  0  0  0  0
 13 33  1  0  0  0  0
 14 15  1  0  0  0  0
 14 34  1  0  0  0  0
 15 16  2  0  0  0  0
 15 17  1  0  0  0  0
 17 18  1  0  0  0  0
 17 35  1  0  0  0  0
 17 36  1  0  0  0  0
 18 19  2  0  0  0  0
 18 23  1  0  0  0  0
 19 20  1  0  0  0  0
 19 37  1  0  0  0  0
 20 21  2  0  0  0  0
 20 38  1  0  0  0  0
 21 22  1  0  0  0  0
 21 39  1  0  0  0  0
 22 23  2  0  0  0  0
 22 40  1  0  0  0  0
 23 41  1  0  0  0  0
M  END
> <OPENEYE_ISO_SMILES>
CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)Cc3ccccc3)C(=O)O)C

> <OPENEYE_INCHI>
InChI=1S/C16H18N2O4S/c1-16(2)12(15(21)22)18-13(20)11(14(18)23-16)17-10(19)8-9-6-4-3-5-7-9/h3-7,11-12,14H,8H2,1-2H3,(H,17,19)(H,21,22)/t11-,12+,14-/m1/s1

> <OPENEYE_INCHIKEY>
JGSARLDLIJGVTE-MBNYWOFBSA-N

> <FORMULA>
C16H18N2O4S

$$$$

Code:

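#INPUT
from rdkit import Chem
from rdkit.Chem import AllChem

# Relative path: the file must sit in the current working directory.
sd_supplier = Chem.SDMolSupplier('PNN_ideal.sdf')

for mol in sd_supplier:
    if mol is None:  # records RDKit cannot parse come back as None
        continue
    smiles = Chem.MolToSmiles(mol)
    try:
        # This file has no 'Name' data field, so GetProp raises KeyError.
        name = mol.GetProp('Name')
        print(f'Molecule: {name}')
    except Exception as excpt:
        print('error --> ', excpt, '____', repr(excpt))
    print(f'SMILES: {smiles}')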

Output:

error -->  'Name' ____ KeyError('Name')
SMILES: CC1(C)S[C@@H]2[C@H](NC(=O)Cc3ccccc3)C(=O)N2[C@H]1C(=O)O
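
The KeyError in that output is expected for this file: it has no data field called Name. In RDKit, the title line of each record (PNN here) is exposed as the special property _Name, so that is probably what the loop was after. Below is a minimal sketch under that assumption. The helper read_names_and_smiles and the throwaway ethanol file are made up for illustration and are not part of the original code.

```python
import os
import tempfile

from rdkit import Chem


def read_names_and_smiles(sdf_path):
    """Return (title, SMILES) pairs for every parsable record in an SDF file."""
    results = []
    for mol in Chem.SDMolSupplier(sdf_path):
        if mol is None:  # RDKit yields None for records it cannot parse
            continue
        # '_Name' holds the molfile title line; a 'Name' data field may not exist.
        title = mol.GetProp('_Name') if mol.HasProp('_Name') else ''
        results.append((title, Chem.MolToSmiles(mol)))
    return results


# Build a tiny throwaway SDF so the sketch runs without external data.
demo_mol = Chem.MolFromSmiles('CCO')
demo_mol.SetProp('_Name', 'ethanol')  # hypothetical title, for the demo only
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.sdf')
writer = Chem.SDWriter(demo_path)
writer.write(demo_mol)
writer.close()

records = read_names_and_smiles(demo_path)
print(records)
```

Calling read_names_and_smiles('PNN_ideal.sdf') on the file shown earlier should give the title PNN together with the same SMILES as in the output.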

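Since the same code reads this file without raising an OSError, I guess the problem is with your input file or with its name/path.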
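
One way to narrow this down is to check the path with plain Python before handing it to RDKit. As far as I can tell, "Bad input file" is what RDKit reports when it cannot open the file at all, so a missing file, a directory, or a wrong working directory are the usual suspects. If the notebook runs on a hosted service rather than on your own machine, a C:\Users\... path will not exist there either. The helper diagnose_sdf_path below is a hypothetical name, and the demo files are created only so the sketch runs on its own.

```python
import tempfile
from pathlib import Path


def diagnose_sdf_path(path):
    """Return a short verdict on whether `path` points at an openable file."""
    p = Path(path)
    if not p.exists():
        # Relative names are resolved against the current working directory.
        return f"not found: {p} (current directory: {Path.cwd()})"
    if p.is_dir():
        return f"is a directory, not a file: {p}"
    if p.stat().st_size == 0:
        return f"file is empty: {p}"
    return f"looks readable: {p}"


# Demo on throwaway files (made-up names, not the ChEMBL download).
demo_dir = Path(tempfile.mkdtemp())
missing = demo_dir / "missing.sdf"
present = demo_dir / "present.sdf"
present.write_text("demo\n\n\nM  END\n$$$$\n")

print(diagnose_sdf_path(missing))
print(diagnose_sdf_path(present))
print(diagnose_sdf_path(demo_dir))
```

Once the check reports the file as readable, pass the same path (as a string) to Chem.SDMolSupplier.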
