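This is my first question here and I'm fairly new to programming, so please bear with me. I'm trying to import a list of compounds from ChEMBL so that I can display them in a grid and classify them. The problem is that every time I try to import the SDF, I get an error saying the input file is bad. I have tried giving the full path to the SDF I downloaded, and I have also tried giving just the file name, but I get the same error either way. I feel like I'm making some silly mistake, but I can't pin it down. Do I have to feed in each compound's data as a separate .sdf file and then print it?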
INPUT
from rdkit import Chem
from rdkit.Chem import AllChem
sd_supplier = Chem.SDMolSupplier(r'C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf')  # (*)
for mol in sd_supplier:
    name = mol.GetProp('Name')
    smiles = Chem.MolToSmiles(mol)
    print(f'Molecule: {name}')
    print(f'SMILES: {smiles}')
OUTPUT
OSError Traceback (most recent call last)
<ipython-input-33-a625f28bcf9e> in <cell line: 2>()
---->3 sd_supplier = Chem.SDMolSupplier(r'C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf')
OSError: File error: Bad input file C:\Users\abc\Downloads\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=\DOWNLOAD-kFj-7sO59mDkb3HHrtKd0_C9Qh8SCpVsEFJstjEfSUw=.sdf
(*This line throws the error every time. I tried using just the SDF's file name and I tried the full file path, and both still raise the error.)
I tried your code with this input file, PNN_ideal.sdf:
PNN
-OEChem-10252417133D
41 43 0 1 0 0 0 0 0999 V2000
-2.6110 -0.3980 0.9770 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.4630 -0.7530 1.1450 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6430 -0.8790 2.2690 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.2640 0.3720 2.9340 C 0 0 1 0 0 0 0 0 0 0 0 0
-1.0640 0.5150 4.2040 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.3510 1.7360 4.6820 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.4470 -0.4680 4.7910 O 0 0 0 0 0 0 0 0 0 0 0 0
1.2550 0.4140 3.2550 C 0 0 0 0 0 0 0 0 0 0 0 0
1.5830 -0.3670 4.5290 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7730 1.8510 3.3350 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8810 -0.4510 1.7460 S 0 0 0 0 0 0 0 0 0 0 0 0
0.4280 -1.5480 1.4930 C 0 0 2 0 0 0 0 0 0 0 0 0
-0.3690 -1.2470 0.2230 C 0 0 2 0 0 0 0 0 0 0 0 0
0.2220 -0.1980 -0.6100 N 0 0 0 0 0 0 0 0 0 0 0 0
0.0470 -0.2190 -1.9460 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5990 -1.1080 -2.4580 O 0 0 0 0 0 0 0 0 0 0 0 0
0.6560 0.8590 -2.8030 C 0 0 0 0 0 0 0 0 0 0 0 0
0.3140 0.6050 -4.2490 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8360 1.1440 -4.7920 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.1490 0.9110 -6.1180 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.3110 0.1390 -6.9010 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8400 -0.3980 -6.3580 C 0 0 0 0 0 0 0 0 0 0 0 0
1.1550 -0.1610 -5.0330 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4920 1.2090 2.2740 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.8640 1.8290 5.4960 H 0 0 0 0 0 0 0 0 0 0 0 0
2.6630 -0.3860 4.6760 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1080 0.1130 5.3830 H 0 0 0 0 0 0 0 0 0 0 0 0
1.2120 -1.3880 4.4340 H 0 0 0 0 0 0 0 0 0 0 0 0
2.8540 1.8400 3.4780 H 0 0 0 0 0 0 0 0 0 0 0 0
1.5340 2.3750 2.4100 H 0 0 0 0 0 0 0 0 0 0 0 0
1.3000 2.3620 4.1740 H 0 0 0 0 0 0 0 0 0 0 0 0
0.6030 -2.6020 1.7090 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.6420 -2.1370 -0.3440 H 0 0 0 0 0 0 0 0 0 0 0 0
0.7400 0.5120 -0.2000 H 0 0 0 0 0 0 0 0 0 0 0 0
0.2600 1.8290 -2.5010 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7390 0.8550 -2.6790 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.4910 1.7470 -4.1810 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.0490 1.3310 -6.5430 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.5560 -0.0430 -7.9370 H 0 0 0 0 0 0 0 0 0 0 0 0
1.4950 -1.0010 -6.9690 H 0 0 0 0 0 0 0 0 0 0 0 0
2.0550 -0.5820 -4.6080 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 0 0
2 3 1 0 0 0 0
2 13 1 0 0 0 0
3 4 1 0 0 0 0
3 12 1 0 0 0 0
4 5 1 0 0 0 0
4 8 1 0 0 0 0
4 24 1 0 0 0 0
5 6 1 0 0 0 0
5 7 2 0 0 0 0
6 25 1 0 0 0 0
8 9 1 0 0 0 0
8 10 1 0 0 0 0
8 11 1 0 0 0 0
9 26 1 0 0 0 0
9 27 1 0 0 0 0
9 28 1 0 0 0 0
10 29 1 0 0 0 0
10 30 1 0 0 0 0
10 31 1 0 0 0 0
11 12 1 0 0 0 0
12 13 1 0 0 0 0
12 32 1 0 0 0 0
13 14 1 0 0 0 0
13 33 1 0 0 0 0
14 15 1 0 0 0 0
14 34 1 0 0 0 0
15 16 2 0 0 0 0
15 17 1 0 0 0 0
17 18 1 0 0 0 0
17 35 1 0 0 0 0
17 36 1 0 0 0 0
18 19 2 0 0 0 0
18 23 1 0 0 0 0
19 20 1 0 0 0 0
19 37 1 0 0 0 0
20 21 2 0 0 0 0
20 38 1 0 0 0 0
21 22 1 0 0 0 0
21 39 1 0 0 0 0
22 23 2 0 0 0 0
22 40 1 0 0 0 0
23 41 1 0 0 0 0
M END
> <OPENEYE_ISO_SMILES>
CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)Cc3ccccc3)C(=O)O)C
> <OPENEYE_INCHI>
InChI=1S/C16H18N2O4S/c1-16(2)12(15(21)22)18-13(20)11(14(18)23-16)17-10(19)8-9-6-4-3-5-7-9/h3-7,11-12,14H,8H2,1-2H3,(H,17,19)(H,21,22)/t11-,12+,14-/m1/s1
> <OPENEYE_INCHIKEY>
JGSARLDLIJGVTE-MBNYWOFBSA-N
> <FORMULA>
C16H18N2O4S
$$$$
Code:
#INPUT
from rdkit import Chem
from rdkit.Chem import AllChem
sd_supplier = Chem.SDMolSupplier('PNN_ideal.sdf')
for mol in sd_supplier:
    smiles = Chem.MolToSmiles(mol)
    try:
        # Raises KeyError if the record has no 'Name' data field
        name = mol.GetProp('Name')
        print(f'Molecule: {name}')
    except Exception as excpt:
        print('error --> ', excpt , '____', repr(excpt))
    print(f'SMILES: {smiles}')
Output:
error --> 'Name' ____ KeyError('Name')
SMILES: CC1(C)S[C@@H]2[C@H](NC(=O)Cc3ccccc3)C(=O)N2[C@H]1C(=O)O
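The KeyError above most likely means that this particular file has no data field called Name; the listing only shows fields such as OPENEYE_ISO_SMILES and FORMULA. A minimal sketch for checking which data fields an SDF actually contains, using plain text scanning rather than RDKit, could look like the following. The helper name `summarize_sdf_text` and the shortened sample record are made up for illustration, and the molblock is deliberately elided, so the sample is not a complete SDF record.

```python
import re
from collections import Counter

# Matches SDF data-field headers such as "> <FORMULA>" or ">  <FORMULA> (1)".
FIELD_HEADER = re.compile(r'^>.*<([^>]+)>')


def summarize_sdf_text(text):
    """Return (record_count, Counter of data-field names) for SDF text."""
    records = 0
    fields = Counter()
    for line in text.splitlines():
        if line.strip() == '$$$$':  # '$$$$' closes one record
            records += 1
        else:
            match = FIELD_HEADER.match(line)
            if match:
                fields[match.group(1)] += 1
    return records, fields


# Hypothetical, shortened record: the atom and bond blocks are left out.
sample = """PNN
M  END
> <OPENEYE_INCHIKEY>
JGSARLDLIJGVTE-MBNYWOFBSA-N

> <FORMULA>
C16H18N2O4S

$$$$
"""

records, fields = summarize_sdf_text(sample)
print(records, sorted(fields))
print('Name' in fields)
```

If Name is not among the listed fields, `mol.GetProp('Name')` will fail for that file. In RDKit the title line of each record (PNN here) is normally available as `mol.GetProp('_Name')` instead.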
The file itself parses fine on my side, so I guess the problem is with your input file or with its name/path.
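To narrow down a name/path problem, it can help to check what Python itself sees at that location before handing the path to `Chem.SDMolSupplier`. The sketch below uses only the standard library; the helper name `diagnose_sdf_path` is made up for illustration. One possibility, which is only a guess from the traceback, is that the notebook is not running on the machine that holds the Downloads folder, in which case a `C:\Users\...` path would not exist there and the file would have to be uploaded first.

```python
from pathlib import Path


def diagnose_sdf_path(path_str):
    """Return a short message describing whether path_str looks loadable."""
    path = Path(path_str)
    if not path.exists():
        parent = path.parent
        if not parent.is_dir():
            return f'folder not found: {parent}'
        return f'file not found in {parent}: {path.name}'
    if not path.is_file():
        return f'not a regular file: {path}'
    if path.stat().st_size == 0:
        return f'file is empty: {path}'
    return f'ok: {path.resolve()}'


# The result depends on the current working directory of the Python process.
print(diagnose_sdf_path('PNN_ideal.sdf'))
```

If this reports that the folder or the file was not found, the error comes from where the code is looking rather than from the SDF contents. `Path.cwd()` shows the directory that relative file names are resolved against.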