Professional Documents
Culture Documents
Chemical Space Travel: Ruud Van Deursen and Jean-Louis Reymond
Chemical Space Travel: Ruud Van Deursen and Jean-Louis Reymond
DOI: 10.1002/cmdc.200700021
636
Compound
Formula
mass
n[a]
With aromatic[c]
N[d]
Steps to
MeOH[b]
N[d]
Cubane
Fluorouracil
Metheneamine
3-Tetrazene-2-carboximidamide
Aspirine
9-Ethyl-carbazole
Vitamin H
VX (van)
Adenosine
b-estradiol
Retinal
Morphine
Aspartame
Cocaine
Tetrodotoxin
Sucrose
Penicillin G
Strychnine
Papaverin
Colchicine
Calcitriol
Dipicrylamine
Tetracycline
Vitamin K
Epothilone
Vitamin E
Reserpine
Taxotere
C8H8
C4H3FN2O2
C6H12N4
C2H6N10
C9H8O4
C14H13N
C10H16N2O3S
C11H26NO2PS
C10H13N5O4
C18H24O2
C20H28O
C17H19NO3
C14H18N2O5
C17H21NO4
C11H17N3O8
C12H22O11
C16H18N2O4S
C21H22N2O2
C20H21NO4
C22H25NO6
C27H44O3
C12H5N7O12
C22H24N2O8
C31H46O2
C27H41NO6S
C29H50O2
C33H40N2O9
C45H55NO15
104
130
140
170
180
197
244
267
267
272
284
285
294
303
319
342
334
332
339
399
417
439
428
451
508
531
609
808
8
9
10
12
13
15
16
16
19
20
21
21
21
22
22
23
23
25
25
29
30
31
31
33
35
38
44
58
12
16
12
12
15
n.f.[e]
18*
21
n.f.[e]
23
23
26
303
n.f.[e]
28
25*
n.f.[e]
n.f.[e]
n.f.[e]
37
37*
n.f.[e]
36
55
n.f.[e]
71
n.f.[e]
n.f.[e]
9*
8(1)
20(2)
25(2)
15(2)
18(2)
16(2)
20(2)*
20(2)
26(2)
25(3)
32(3)
21(2)
30(1)
42(3)
62(4)
40(2)
68(5)
74(4)
6638
2456
6157
4685
2567
20 501
27 161
29 460
23680
43 089
45 176
69 113
34 172
70 807
106 158
67 052
70 497
176721
53 099
136 519
298 327
21 015
173 734
411107
709 250
443 477
286 342
1128 960
7
7*
9
11*
12
16
14*
14*
19*
20
19*
20*
20*
22
20*
21
23*
25
25
28
28*
26
30
32*
34*
37*
62
57*
994
560
1768
2007
2582
5357
6304
3954
13 639
19067
15 100
16 247
11 430
17 993
16 757
19 552
15 748
32479
28 449
33 592
65 595
13 950
34 883
77 337
75219
140 017
230 646
304 172
[a] n is the number of nonhydrogen atoms in the molecule. [b] The step number is the number of structural mutations (see Table 1) done to reach the
target. Numbers with * indicate that the target is reached through a mutation sequence using only the 10 fittest mutants in each round, which implies
that the trajectory is reproducible. MeOH was used as target B for the back-trajectory because CH4 has no pharmacophore fingerprint. [c] Numbers in parentheses are the number of aromatic ring addition mutations used. [d] N is the number of molecules stored in the trajectory after reduction by applying
filters for chemical stability and synthetic feasibility rules.[15] These filters eliminated on average 89 % of the compounds generated by mutations. Population size is based on the optimal run, that is, with aromatic ring addition if it was applied. [e] n.f. = not found after 500 iterations.
www.chemmedchem.org
637
MED
was taxotere (n = 58 atoms). Buckminster-fullerene (n = 60
atoms) and brevetoxin (n = 64 atoms) were not found from
methane.
The spaceship program also efficiently travelled between different molecules (Table 3). The number of steps and the
number of molecules generated for these trajectories increased
with molecular size and complexity of the targets. 55 of the
156 trajectories shown in Tables 2 and 3 always used one of
the 10 fittest mutants in each step and are therefore reproducible (marked *). The other 101 trajectories passed through at
least one nonoptimal mutant selected by roulette wheel selection, implying that a different trajectory may be selected for
each journey.
We next investigated in more detail chemical space travel
between AMPA ((S)-2-amino-3-(3-hydroxy-5-methyl-isoxazol4-yl)-propionic acid, 1), an agonist of the AMPA-receptor, and
CNQX (7-nitro-2,3,-dioxo-1,4-dihydroquinoxaline-6-carbonitrile,
2), an antagonist of the same receptor (Figure 3). The AMPA receptor is an ionotropic glutamate receptor in the central nervous system for which both antagonists and agonists may have
therapeutic application.[5] The spaceship used 14 3 steps to
travel from AMPA to CNQX and 16 2 steps for the reverse trajectory. In both cases only nearest neighbour mutations were
used. A total of 559656 unique compounds were obtained
after 500 runs.[16] Approximately 90 % of the trajectory compounds occupied a hyperbolic cloud in a two-dimensional
property space representing the similarity measure of AMPA
and CNQX, indicating that these compounds combined features of both start and target (Figure 2). The remaining 10 % of
the trajectory compounds occupied the low-similarity island at
bottom left, and were probably produced by mutations causing large structural changes.
As a control, we also let the propulsion device run away
from either AMPA or CNQX for 20 mutational rounds without
compass, selecting only for intermediate molecular size (n =
1317 atoms) and chemical stability and synthetic feasibility
criteria, resulting in 152916 unique compounds representing a
broad selection of chemical space in the size range of the ligands. More than 98 % of the run-away compounds occupied
the low similarity island in the AMPA/CNQX plot.
A small fraction of the compounds produced were converted to all stereoisomers using CORINA,[17] and docked into the
glutamate binding site of the GluR2-subunit (1FTK.pdb)[18]
using AutoDock.[19] Such docking programs locate the optimal
and usually crystallographically correct binding conformation
of ligands within a protein binding site and estimate the binding energy by quantifying proteinligand interactions, which
provides a useful tool for virtual screening.[20] Binding energies
from 5.7 to 13.9 kcal mol1 were obtained. Compounds
from the run-away trajectories docked with an average energy
To:
Cubane
Aspirine
VX
Adenosine
Sucrose
Penicillin G
Strychnine
Colchicine
Tetracycline
Vitamin K
10*
13
17*
18*
19*
21*
27
28*
30*
10
17 (1)
27
22 (1)
13*
17*
22*
20
24*
18
14
18*
22*
14*
20
21
25*
30*
23 (1)
21
31 (1)
29 (1)
23
26
26
49
34*
19
15
18
14
19*
22
18
19
28*
18 (1)
16
15 (1)
15
25
16*
22
19*
27*
18 (1)
24
21 (1)
24
26 (1)
20
23
16
19*
22 (1)
22
20 (2)
23
31 (1)
19*
30*
28
30*
24 (1)
22
24* (1)
27*
25 (1)
21*
17*
22*
22*
26 (1)
33
25* (1)
29
25 (1)
29
22*
21*
17
[a] The number of structural mutations done for the cross trajectory from the line entry to the column entry is shown. Numbers with * indicate that the
target is reached through a mutation sequence using only the 10 fittest mutants in each round, which implies that the trajectory is reproducible. Numbers
in parentheses are the number of aromatic ring addition mutations used.
638
www.chemmedchem.org
Figure 3. Structures of AMPA (1), CNQX (2), and the strongest docking trajectory compound (3).
Figure 4. Predicted binding mode for the trajectory compound 3 (green) in the active site of the AMPA receptor (1FTK.pdb), overlaid with AMPA (A, yellow)
and CNQX (B, yellow). The position of the reference ligands were obtained from a docking experiment. AMPA occupies the same position as found in the experimental crystal structure of its complex. A crystal structure of the AMPA-receptor with CNQX is not available.18 The images are generated using Pymol.
www.chemmedchem.org
639
MED
Acknowledgement
This work was financially supported by the Swiss National Science Foundation. The authors gratefully acknowledge Chemaxon
Ltd. for donation of the academic license to the JChem package.
We thank Dr. Tobias Fink and Sacha Javor for critical comments
on the project and manuscript.
[12]
[13]
640
www.chemmedchem.org
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]