27.08.2013 Views

Virtual Exploration of Chemical Space by Database Generation


Virtual Exploration of Chemical Space by Database Generation

Jean-Louis Reymond
University of Berne
Switzerland


The Chemical Universe Database (GDB)

> Chemical Space Travel

Swiss National Science Foundation

Tobias Fink


The Chemical Universe Project

[Structure drawing not reproduced.]

J. Lederberg, C. Djerassi et al. J. Am. Chem. Soc. 1969, 91, 2973
T. Fink et al. Angew. Chem. Int. Ed. 2005, 44, 1504-1508; J. Chem. Inf. Model. 2007, 47, 342-353


Structure Generator

Graphs from GENG: 843,335
↓ I. Graph Selection
15,726
↓ II. Structure Generation
276,220
↓
1.7 billion
↓ III. Filters
26.4 million (GDB)
IV. Stereoisomers


I. Graph Selection

Table 1. Graph selection table. Each graph corresponds to a saturated hydrocarbon.

Nodes  Graphs (a)  Passed Topo I (b)  Passed Topo II (c)  Planar graphs (d)  Unstrained graphs (e)  With unsaturations
1               1                  1                   1                  1                      1                   1
2               1                  1                   1                  1                      1                   3
3               2                  2                   2                  2                      2                   4
4               6                  4                   4                  4                      4                  13
5              21                  8                   8                  8                      8                  33
6              78                 20                  20                 20                     20                 123
7             353                 57                  57                 57                     57                 445
8           1'929                199                 194                194                    194               1'956
9          12'207                780                 712                708                    705               8'863
10         89'402              3'600               2'893              2'845                  2'822              43'443
11        739'335             19'215              12'575             12'169                 11'912             221'336
Total     843'335             23'887              16'467             16'009                 15'726             276'220

(a) Generated by the program GENG using a maximum connectivity of 4. (b) The Topo I filter eliminates any graph with a node present in two different 3- or 4-membered rings. (c) The Topo II filter eliminates graphs with a tetravalent bridgehead in a small ring. (d) Non-planar graphs cannot be drawn in a plane without crossing edges, e.g. 3.


I. Graph Selection

[Graph drawings not reproduced. Examples shown:]
– Tetrahedrane
– Claus benzene
– K3,3 graph
– Tricyclo[2.2.2.2]decane


I. Graph Selection

[Figure: Steric Energy Distribution over Ring Systems. Histogram of Count (0 to 350) vs. Highest Atomic Contribution [kcal/mol] (1 to 96).]


II. Structure Generation

> Multiple bonds (276,220 graphs with unsaturations)
– no allenes
– no double bonds (DB) at tetravalent nodes
– no DB at bridgeheads
– no triple bonds (TB) at tri- or tetravalent nodes
– no TB inside rings smaller than 9
– no duplicates (symmetry, graph automorphism)
> Elements
– C, N, O, F at chemically meaningful nodes (valency rules)

1.7 billion "molecules"
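The "valency rules" bullet can be illustrated with a minimal sketch. It assumes the standard maximum valences of neutral C, N, O and F and a graph given as a list of `(i, j, bond_order)` tuples. Both the representation and the function names are hypothetical, not the GDB generator itself. Symmetry duplicates are deliberately not removed here.

```python
from itertools import product

# Assumed maximum valences of the neutral elements used in GDB.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def allowed_elements(bond_order_sum):
    """Elements whose valence can accommodate all bonds at one node."""
    return [el for el, v in MAX_VALENCE.items() if v >= bond_order_sum]


def enumerate_assignments(n_nodes, bonds):
    """Yield every element assignment that satisfies the valency rule.

    bonds: list of (i, j, order) tuples; bond orders are already fixed.
    Duplicates arising from graph symmetry are NOT filtered out.
    """
    load = [0] * n_nodes
    for i, j, order in bonds:
        load[i] += order
        load[j] += order
    choices = [allowed_elements(s) for s in load]
    for combo in product(*choices):
        yield combo


# Propene-like graph: node 0 = node 1 (double bond), node 1 - node 2 (single bond).
propene = [(0, 1, 2), (1, 2, 1)]
assignments = list(enumerate_assignments(3, propene))
print(len(assignments))  # 3 * 2 * 4 = 24 assignments before symmetry filtering
```

A real generator would additionally canonicalize each assignment to drop symmetry-equivalent duplicates, as the "no duplicates" bullet requires.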


III. Filters

> "Problematic" heteroatom constellations (X-X)
> Hydrolytically labile functional groups (FGs)
> Tautomers + aromaticity

[Structure drawings illustrating the filters are not reproduced. Surviving labels: F (O, N); -CHO + -NH2.]




IV. Stereoisomers

– Nourse, J. G.; Carhart, R. E.; Smith, D. H.; Djerassi, C. Exhaustive Generation of Stereoisomers for Structure Elucidation. J. Am. Chem. Soc. 1979, 101, 1216-1223.


GDB Overview

Table 2. Overview of the structure generation process.

Nodes  Graphs (a)  Generated (b)   Accepted (c)  Unique tautomers (GDB) (d)  All tautomers  Stereoisomers (e)
1               1              4              4                           4              4                  4
2               1             10              9                           9              9                  9
3               2             52             20                          20             21                 20
4               4            332             80                          80             88                 87
5               8          2'294            357                         352            397                469
6              20         18'066          1'906                       1'850          2'135              2'911
7              57        154'542         10'953                      10'568         12'438             19'904
8             194      1'445'073         69'563                      66'706         79'899            153'601
9             705     14'213'741        464'402                     444'313        540'002          1'258'963
10          2'822    146'004'340      3'259'036                   3'114'041      3'827'907         10'898'065
11         11'912  1'558'491'448     23'875'101                  22'796'628     28'240'425         98'645'474
Total      15'726  1'720'329'902     27'681'431                  26'434'571     32'703'325        110'979'507

log N = a x n 2
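As a quick consistency check on Table 2, the following sketch recomputes the column totals and the fraction of generated structures that survive the filters for each node count. The dictionary layout and function names are illustrative choices; the numbers are copied from the table.

```python
# Rows of Table 2: nodes -> (graphs, generated, accepted, unique tautomers in GDB)
TABLE_2 = {
    1: (1, 4, 4, 4),
    2: (1, 10, 9, 9),
    3: (2, 52, 20, 20),
    4: (4, 332, 80, 80),
    5: (8, 2294, 357, 352),
    6: (20, 18066, 1906, 1850),
    7: (57, 154542, 10953, 10568),
    8: (194, 1445073, 69563, 66706),
    9: (705, 14213741, 464402, 444313),
    10: (2822, 146004340, 3259036, 3114041),
    11: (11912, 1558491448, 23875101, 22796628),
}


def column_total(index):
    """Sum one column of TABLE_2 over all node counts."""
    return sum(row[index] for row in TABLE_2.values())


def acceptance_rate(nodes):
    """Fraction of generated structures that pass the filters."""
    _, generated, accepted, _ = TABLE_2[nodes]
    return accepted / generated


for nodes in sorted(TABLE_2):
    print(f"{nodes:2d} nodes: {acceptance_rate(nodes):.2%} accepted")
print("GDB size:", column_total(3))
```

The acceptance rate drops steadily with size, which is why the filters reduce 1.7 billion generated structures to 26.4 million.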


GDB by Graph

Monocyclic 43%
Bicyclic 32%
Acyclic 15%
Tricyclic 9%
Polycyclic 1%

[Figure: Number of compounds (log scale, 1 to 100'000) per graph, with graphs grouped as Acyclic, Monocyclic, Bicyclic, Tricyclic and Polycyclic. CH4 is labelled.]


Most Prolific Graphs

Monocyclic 43%
Bicyclic 32%
Acyclic 15%
Tricyclic 9%
Polycyclic 1%

[Graph drawings not reproduced. Compound counts shown: 94'903 cpds, 64'641 cpds, 48'836 cpds, 12'983 cpds.]


GDB by Elemental Composition

Table 3. Number of compounds in GDB as a function of elemental composition (rows) and number of heavy atoms (columns).

Elemental                              Number of heavy atoms
composition  1  2   3   4    5      6       7       8        9         10          11       Total
C            1  3   4  12   29    102     347   1'468    6'413     30'582     152'117     191'078
N            1  0   0   0    0      0       0       0        0          0           0           1
O            1  1   0   0    0      0       0       0        0          0           0           2
F            1  1   0   0    0      0       0       0        0          0           0           2
CN           0  1   6  22   91    443   2'255  12'832   76'063    472'756   3'049'435   3'613'904
CO           0  2   5  19   74    338   1'671   9'302   54'733    337'024   2'164'860   2'568'028
CF           0  1   3  12   39    151     622   2'954   14'598     77'225     429'664     525'269
CNO          0  0   2  11   82    526   3'578  24'858  176'888  1'299'010   9'819'032  11'323'987
CNF          0  0   0   2   17    122     818   5'594   39'052    279'803   2'059'976   2'385'384
COF          0  0   0   2   18    126     809   5'437   37'148    260'489   1'883'487   2'187'516
CNOF         0  0   0   0    2     42     468   4'401   39'910    358'528   3'236'049   3'639'400
Total        4  9  20  80  352  1'850  10'568  66'846  444'805  3'115'417  22'794'620  26'434'571


GDB Analysis

> Novelty vs. RDB
> Ring systems
> Stereochemistry
> Physico-chemical properties
> Compound classes
> Drugs and leads


Novelty vs. RDB

RDB = 63'857 cpds
– organic molecules up to 11 atoms from PubChem, ChemACX, ChemSCX, NCI open database, Merck Index

In RDB only: 26'464
– Sulfur (14'048)
– Acyl halides etc. (7'782)
– S, P, Si (3'787)
– Allenes etc. (629)
– Topology (218)

In both RDB and GDB: 37'393

GDB: 26.4 million


Ring Systems

Number of             Number of 4-membered rings
3-membered rings  0          1          2         Total
0                 124 [3]    189 [60]   103 [67]  416 [130]
1                 225 [50]   238 [177]  20 [19]   483 [246]
2                 201 [88]   55 [48]    -         256 [136]
3                 53 [26]    -          -         53 [26]
Total             603 [167]  482 [285]  123 [86]  1'208 [538]


Stereochemistry

> Ring systems
– Chiral: 677 (367 unknown)
– Achiral: 455 (150 unknown)
– Both: 76 (21 unknown)
> Acyclic graphs
– Chiral: 166 (all known)
– Achiral: 143 (all known)
> All graphs
– Chiral: 12'771
– Achiral: 2'955

Compounds: 19.2 million chiral, 7.1 million achiral
134'390 (including tartaric acid!)

[Figure: Fraction of SCU Compounds (0% to 100%), chiral vs. achiral, as a function of the number of heavy atoms (1 to 19).]


Stereochemistry

Table 5. Number of stereocenters vs. number of stereoisomers for GDB compounds.

Number of                                              Number of stereocenters (a)
stereoisomers (b)  0              1          2            3          4          5          6        7       8       9      10   Total
1                  3'805'751 (c)  0          127'959 (e)  0          25'340     186        10'245   68      1'125   58     66   3'970'798
2                  2'193'503 (d)  4'023'245  1'160'206    189'290    317'283    33'036     88'602   7'945   7'488   228    158  8'020'984
3                  2'481 (d)      0          13'182       0          2'493      0          115      0       31      0      0    18'302
4                  572'362 (d)    1'664'209  3'524'711    2'137'679  380'906    320'836    53'880   43'573  3'240   889    5    8'702'290
5                  0              0          0            0          61         0          14       0       0       0      0    75
6                  229 (d)        38         1'860        1'253      422        87         256      14      3       0      0    4'162
7                  0              0          0            0          6          0          0        0       0       0      0    6
8                  49'980 (d)     188'701    733'641      1'869'820  1'265'196  362'856    93'685   19'829  3'794   130    0    4'587'632
10                 62 (d)         0          300          0          1'537      52         521      12      26      0      0    2'510
12                 0              0          142          164        510        44         4        0       0       0      0    864
16                 991 (d)        3'822      29'787       165'346    443'339    286'119    108'252  8'965   1'096   89     0    1'047'806
20                 4 (d)          0          0            0          6          0          18       0       1       0      0    29
24                 0              0          0            0          14         107        0        0       0       0      0    121
32                 19 (d)         0          118          2'208      11'041     41'237     17'192   5'605   134     1      0    77'555
36                 0              0          0            0          0          0          42       0       0       0      0    42
64                 0              0          0            0          0          186        1'209    0       0       0      0    1'395
Total              6'625'382      5'880'015  5'591'906    4'365'760  2'448'154  1'044'746  374'035  86'011  16'938  1'395  229  26'434'571

(a) A stereocenter is defined as an atom at which the interchange of two groups produces a stereoisomer, with the exception of E/Z isomers. (b) Total number of stereoisomers resulting from a compound. (c) Molecules containing no stereocenters and no E/Z double bonds. (d) Molecules resulting in E/Z isomers only.
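Table 5 can be read against the usual upper bound of 2^k stereoisomers for k independent stereogenic elements. The sketch below assumes that stereocenters and E/Z double bonds are the only stereogenic elements, in line with footnote (a). The function names are hypothetical.

```python
def max_stereoisomers(n_stereocenters, n_ez_bonds=0):
    """Upper bound 2^k, where k counts stereocenters plus E/Z double bonds."""
    return 2 ** (n_stereocenters + n_ez_bonds)


def is_consistent(n_stereocenters, n_stereoisomers, n_ez_bonds=0):
    """True if the stereoisomer count does not exceed the 2^k bound."""
    return 1 <= n_stereoisomers <= max_stereoisomers(n_stereocenters, n_ez_bonds)


# 6 stereocenters and 64 stereoisomers: the bound is reached exactly.
print(is_consistent(6, 64))
# 10 stereocenters and 1 stereoisomer: allowed, symmetry removes all others.
print(is_consistent(10, 1))
# 5 stereocenters and 64 stereoisomers: needs at least one E/Z double bond.
print(is_consistent(5, 64), is_consistent(5, 64, n_ez_bonds=1))
```

Symmetry (as in meso compounds) only ever lowers the count, which is why many table entries lie well below the bound.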


Stereochemistry

[Structure drawings not reproduced. Examples shown: a compound with 10 stereocenters and 1 stereoisomer; a compound with 6 stereocenters and 64 stereoisomers.]




Physico-chemical properties

> MW ~ 153 Da
> H-bond Acceptors ~ 3
> H-bond Donors ~ 1.5
> log P, NRB, TPSA

[Figure: frequency histograms (Frequency x10^3) of GDB compounds, with panels including Number of Rotatable Bonds, logP and Topological Polar Surface Area [Å²].]


Physico-chemical properties

                MW     logP  HBD  HBA   NRB  TPSA
RO5 (Lipinski)  ≤ 500  ≤ 5   ≤ 5  ≤ 10  -    -        All structures OK
RO3 (Congreve)  ≤ 300  ≤ 3   ≤ 3  ≤ 3   ≤ 3  ≤ 60 Å²  Half of all structures OK

Lipinski et al., Adv. Drug. Deliv. Rev. 1997, 23, 3
Congreve et al., Drug Discov. Today 2003, 8, 876
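The two rule sets above translate directly into threshold checks. This sketch assumes the property values have already been computed by some cheminformatics tool. The `Properties` container and the example values are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed descriptors of one compound (hypothetical container)."""
    mw: float    # molecular weight [Da]
    logp: float  # octanol/water partition coefficient
    hbd: int     # H-bond donors
    hba: int     # H-bond acceptors
    nrb: int     # number of rotatable bonds
    tpsa: float  # topological polar surface area [Å²]


def passes_ro5(p):
    """Rule of five (Lipinski): MW, logP, HBD and HBA thresholds."""
    return p.mw <= 500 and p.logp <= 5 and p.hbd <= 5 and p.hba <= 10


def passes_ro3(p):
    """Rule of three (Congreve): stricter limits plus NRB and TPSA."""
    return (p.mw <= 300 and p.logp <= 3 and p.hbd <= 3
            and p.hba <= 3 and p.nrb <= 3 and p.tpsa <= 60)


# Illustrative values only, not measured data.
small = Properties(mw=153.0, logp=0.5, hbd=1, hba=3, nrb=2, tpsa=50.0)
polar = Properties(mw=153.0, logp=-2.0, hbd=4, hba=5, nrb=4, tpsa=110.0)
print(passes_ro5(small), passes_ro3(small))
print(passes_ro5(polar), passes_ro3(polar))
```

With at most 11 heavy atoms, every GDB compound stays far below the RO5 limits, so only the stricter RO3 thresholds discriminate within the database.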


Property Space

[Figure: property space of RDB, GDB and DMU.]




Autocorrelation Descriptors

\[
AC_d = \sum_{i=1}^{N} \sum_{j=1}^{N} \delta_{ij} \cdot (p_i \cdot p_j),
\qquad
\delta_{ij} =
\begin{cases}
1 & \text{if } d_{ij} = d \\
0 & \text{if } d_{ij} \neq d
\end{cases}
\quad \text{(Kronecker delta)}
\]

where
N = number of atoms
p = atomic property: partial σ- and π-charges, atomic polarizability, topological steric effect index, atomic number, identity function
d = considered topological distance: 0-7 bonds

Moreau, G.; Broto, P. Autocorrelation of Molecular Structures. Application to SAR Studies. Nouv. J. Chim. 1980, 4, 757-764.
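The double sum above can be sketched in a few lines. The example takes topological distances as shortest-path bond counts obtained by breadth-first search, and represents a molecule as a list of atomic property values plus a list of bonded atom index pairs. That representation and the function names are assumptions made for illustration.

```python
from collections import deque


def topological_distances(n_atoms, bonds):
    """All-pairs shortest path lengths (in bonds) via breadth-first search."""
    adjacency = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    distances = []
    for start in range(n_atoms):
        row = [None] * n_atoms
        row[start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in adjacency[a]:
                if row[b] is None:
                    row[b] = row[a] + 1
                    queue.append(b)
        distances.append(row)
    return distances


def autocorrelation(props, bonds, max_distance=7):
    """AC_d for d = 0..max_distance, summing p_i * p_j over all ordered pairs."""
    n = len(props)
    dist = topological_distances(n, bonds)
    ac = [0.0] * (max_distance + 1)
    for i in range(n):
        for j in range(n):
            d = dist[i][j]
            if d is not None and d <= max_distance:
                ac[d] += props[i] * props[j]
    return ac


# Heavy-atom chain C-C-O with atomic number as the property.
print(autocorrelation([6, 6, 8], [(0, 1), (1, 2)]))
```

Concatenating the eight values for each of six atomic properties would give a 48-component vector.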


Kohonen Map

> 48-dimensional AC vector
> 200x200 neurons, toroidal
> 1'000'000 GDB molecules
> 100 molecules/epoch, 250'000 epochs
> map 26.4 million
> Color code for properties
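A toroidal Kohonen map wraps around at its edges, so neighbourhood distances are measured modulo the grid size. The following pure-Python sketch shows one training step with a Gaussian neighbourhood. The learning rate, radius, random initialization and small demo grid are illustrative assumptions, not the settings used for GDB.

```python
import math
import random


def toroidal_distance_sq(a, b, rows, cols):
    """Squared grid distance between neurons a and b on a wrapped map."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    dr = min(dr, rows - dr)  # wrap vertically
    dc = min(dc, cols - dc)  # wrap horizontally
    return dr * dr + dc * dc


def make_map(rows, cols, dim, seed=0):
    """Random initial weight vectors, keyed by (row, col)."""
    rng = random.Random(seed)
    return {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}


def squared_error(weight, sample):
    return sum((w - s) ** 2 for w, s in zip(weight, sample))


def best_matching_unit(weights, sample):
    """Neuron whose weight vector is closest to the sample."""
    return min(weights, key=lambda pos: squared_error(weights[pos], sample))


def train_step(weights, sample, rows, cols, rate=0.5, radius=2.0):
    """Pull the winner and its toroidal neighbours towards the sample."""
    bmu = best_matching_unit(weights, sample)
    for pos, w in weights.items():
        d2 = toroidal_distance_sq(pos, bmu, rows, cols)
        influence = math.exp(-d2 / (2 * radius * radius))
        for k in range(len(w)):
            w[k] += rate * influence * (sample[k] - w[k])
    return bmu


# Small demo grid; the slide uses 200x200 neurons and 48-dimensional vectors.
som = make_map(rows=10, cols=10, dim=48)
sample = [0.5] * 48
winner = train_step(som, sample, rows=10, cols=10)
print(winner)
```

Once training is finished, mapping the full database only needs `best_matching_unit` per molecule, which is how a map trained on a subset can place all 26.4 million compounds.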


Occupancy

[Figure: Kohonen map colored by occupancy (log scale: 1, 10, 100, 1000, 10000).]

Molecular size

[Figure: Kohonen map colored by number of heavy atoms: < 7, 7, 8, 9, 10, 11; Mixed; Empty.]

Compound Classes

[Figure: Kohonen map colored by compound class: Heteroaromatic, Aromatic, Fused heterocyclic, Fused carbocyclic, Heterocyclic, Carbocyclic, Heteroacyclic, Carboacyclic; Mixed; Empty.]

Lead-likeness and Chirality

[Figure: Kohonen maps colored by compound class (classes as above), by chirality (Chiral, Achiral, Mixed, Empty) and by lead-likeness (Lead-like, Not lead-like, Mixed, Empty).]


Fragrances

[Figure: Kohonen map colored by compound class (Heteroaromatic, Aromatic, Fused heterocyclic, Fused carbocyclic, Heterocyclic, Carbocyclic, Heteroacyclic, Carboacyclic; Mixed; Empty) with fragrance structures. Structure drawings not reproduced.]

Camphor Analogs

[Structure drawings not reproduced.]

Rose Oxide Analogs

[Structure drawings not reproduced.]




Known Drugs in GDB

[Structure drawings not reproduced. Compounds shown:]
39 (Paracetamol)
40 (Pregabalin)
41 (Amantadine)
42 (Fluorouracil)
43 (Desflurane)
44 (GABA)
45 (Glutamate)
46 (Dopamine)
47 (Niacin)
48 (Frontalin)
49 (Brevicomin)
50 (Carvone)
51 (Menthol)
52 (Benzaldehyde)
53 (Cinnamaldehyde)


Bayesian Virtual Screening

[Figure: Kohonen map colored by compound class (Heteroaromatic, Aromatic, Fused heterocyclic, Fused carbocyclic, Heterocyclic, Carbocyclic, Heteroacyclic, Carboacyclic; Mixed; Empty), with the positions of compounds 31 to 39 marked for three target classes: Kinase Inhibitors, GPCR ligands, Ion-channel modulators. Structure drawings of compounds 31 to 39 not reproduced.]


The Chemical Universe Database (GDB)

> Chemical Space Travel

Swiss National Science Foundation

R. Van Deursen, J.-L. Reymond ChemMedChem 2007, 2, 636-640

Ruud van Deursen


Chemical Space Travel

> Move continuously via nearest neighbours
> Unknown space
> Structural vs. property space

[Figure: path from A to B in n steps.]


Chemical Space as a Graph

[Figure: compounds A and B connected through nearest neighbours.]

Nearest neighbour mutations
– Atom type exchange
– Atom inversion
– Atom removal
– Atom addition
– Bond saturation
– Bond unsaturation
– Bond rearrangement


Spaceship

[Flow chart: Start A → Generator → Mutants → Ranking (Target B) → Ranked Mut. → Selector (Selection) → Target B found? No: continue; Yes: Stop. Store in DB.]


Spaceship

[Figure: travel from A to B by alternating mutation and selection steps.]

\[ F(c) = \left( T_{SF}(\mathrm{target}, c) \times T_{PF}(\mathrm{target}, c) \right)^{5} \]
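The mutation/selection cycle can be illustrated with a deliberately crude toy model. It assumes molecules are plain strings over the alphabet C, N, O, F, uses only three of the mutation types (exchange, addition, removal), and reads T_SF and T_PF as Tanimoto coefficients over bigram and single-character sets. All of these are illustrative assumptions and not the fingerprints or structure handling of the actual program.

```python
ALPHABET = "CNOF"


def tanimoto(a, b):
    """Tanimoto coefficient of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structure_fp(s):
    """Toy 'structural fingerprint': the set of adjacent character pairs."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def property_fp(s):
    """Toy 'property fingerprint': the set of elements present."""
    return set(s)


def fitness(target, c):
    """F(c) = (T_SF(target, c) x T_PF(target, c))^5 on the toy fingerprints."""
    t_sf = tanimoto(structure_fp(target), structure_fp(c))
    t_pf = tanimoto(property_fp(target), property_fp(c))
    return (t_sf * t_pf) ** 5


def mutants(s):
    """All nearest neighbours of s under exchange, removal and addition."""
    out = set()
    for i in range(len(s)):
        for ch in ALPHABET:
            if ch != s[i]:
                out.add(s[:i] + ch + s[i + 1:])  # atom type exchange
        if len(s) > 1:
            out.add(s[:i] + s[i + 1:])           # atom removal
    for i in range(len(s) + 1):
        for ch in ALPHABET:
            out.add(s[:i] + ch + s[i:])          # atom addition
    return sorted(out)


def travel(start, target, max_steps=50):
    """Greedy trajectory: keep the single fittest mutant in each step."""
    path = [start]
    current = start
    while current != target and len(path) <= max_steps:
        current = max(mutants(current), key=lambda c: fitness(target, c))
        path.append(current)
    return path


print(travel("C", "CCO"))
```

A realistic implementation would keep several mutants per generation rather than one, since a purely greedy walk can stall on a fitness plateau.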


Trajectory Libraries

[Figure: trajectories between A and B before and after filtering.]

> Filters
> "crude" possible


Trajectory Examples (1)

Compound                       Formula      Mass  n [a]  Steps from CH4 [b]  With aromatic [c]  N [d]   Steps to MeOH [b]  N [d]
Cubane                         C8H8         104   8      12                  -                  6'638   7                  994
Fluorouracil                   C4H3FN2O2    130   9      16                  9*                 2'456   7*                 560
Methenamine                    C6H12N4      140   10     12                  -                  6'157   9                  1'768
3-Tetrazene-2-carboximidamide  C2H6N10      170   12     12                  -                  4'685   11*                2'007
Aspirin                        C9H8O4       180   13     15                  8(1)               2'567   12                 2'582
9-Ethyl-carbazole              C14H13N      197   15     n.f. [e]            20(2)              20'501  16                 5'357
Vitamin H                      C10H16N2O3S  244   16     18*                 -                  27'161  14*                6'304
VX (van)                       C11H26NO2PS  267   16     21                  -                  29'460  14*                3'954
Adenosine                      C10H13N5O4   267   19     n.f. [e]            25(2)              23'680  19*                13'639
β-estradiol                    C18H24O2     272   20     23                  15(2)              43'089  20                 19'067
Retinal                        C20H28O      284   21     23                  -                  45'176  19*                15'100
Morphine                       C17H19NO3    285   21     26                  18(2)              69'113  20*                16'247
Aspartame                      C14H18N2O5   294   21     303                 16(2)              34'172  20*                11'430
Cocaine                        C17H21NO4    303   22     n.f. [e]            20(2)*             70'807  22                 17'993

[The header label "Nearest neighbours" also appears on the slide; its column assignment is not recoverable.]


Trajectory Examples (2)

Compound       Formula      Mass  n [a]  Steps from CH4 [b]  With aromatic [c]  N [d]      Steps to MeOH [b]  N [d]
Tetrodotoxin   C11H17N3O8   319   22     28                  -                  106'158    20*                16'757
Sucrose        C12H22O11    342   23     25*                 -                  67'052     21                 19'552
Penicillin G   C16H18N2O4S  334   23     n.f. [e]            20(2)              70'497     23*                15'748
Strychnine     C21H22N2O2   332   25     n.f. [e]            26(2)              176'721    25                 32'479
Papaverine     C20H21NO4    339   25     n.f. [e]            25(3)              53'099     25                 28'449
Colchicine     C22H25NO6    399   29     37                  32(3)              136'519    28                 33'592
Calcitriol     C27H44O3     417   30     37*                 -                  298'327    28*                65'595
Dipicrylamine  C12H5N7O12   439   31     n.f. [e]            21(2)              21'015     26                 13'950
Tetracycline   C22H24N2O8   428   31     36                  30(1)              173'734    30                 34'883
Vitamin K      C31H46O2     451   33     55                  42(3)              411'107    32*                77'337
Epothilone     C27H41NO6S   508   35     n.f. [e]            62(4)              709'250    34*                75'219
Vitamin E      C29H50O2     531   38     71                  40(2)              443'477    37*                140'017
Reserpine      C33H40N2O9   609   44     n.f. [e]            68(5)              286'342    62                 230'646
Taxotere       C45H55NO15   808   58     n.f. [e]            74(4)              1'128'960  57*                304'172

[The header label "Nearest neighbours" also appears on the slide; its column assignment is not recoverable.]


Cross-Trajectories

From \ To     Cubane  Aspirin  VX   Adenosine  Sucrose  Penicillin G  Strychnine  Colchicine  Tetracycline  Vitamin K
Cubane        -       10       18   23 (1)     19       18 (1)        18 (1)      22 (1)      24 (1)        26 (1)
Aspirin       10*     -        14   21         15       16            24          22          22            33
VX            13      17 (1)   -    31 (1)     18       15 (1)        21 (1)      20 (2)      24* (1)       25* (1)
Adenosine     17*     27       18*  -          14       15            24          23          27*           29
Sucrose       18*     22 (1)   22*  29 (1)     -        25            26 (1)      31 (1)      25 (1)        25 (1)
Penicillin G  19*     13*      14*  23         19*      -             20          19*         21*           29
Strychnine    21*     17*      20   26         22       16*           -           30*         17*           22*
Colchicine    27      22*      21   26         18       22            23          -           22*           21*
Tetracycline  28*     20       25*  49         19       19*           16          28          -             17
Vitamin K     30*     24*      30*  34*        28*      27*           19*         30*         22*           -


AMPA ↔ CNQX

14 ± 3
16 ± 2

[Structure drawings of AMPA and CNQX not reproduced.]

Trajectory

[Figure: trajectory from AMPA to CNQX through intermediate structures 1 to 11. Structure drawings not reproduced.]


AMPA-CNQX Libraries

> AMPA → CNQX
> CNQX → AMPA
(500 runs)
559,656 cpds

> Runaway from AMPA
> Runaway from CNQX
(20 steps, 100 runs)
152,916 cpds


Mutation Statistics

[Pie chart:]
Rearrangement 40%
Saturation 7%
Desaturation 1%
Identity 1%
Inversion 1%
Insertion, Removal, Substitution: 24%, 13%, 13%

Bond type operations
Desaturation: 0.735 ± 0.054
Saturation: 0.437 ± 0.274
Rearrangement: 0.512 ± 0.260

Atom type operations
Insertion: 0.788 ± 0.210
Removal: 0.550 ± 0.268
Identity: 1.000 ± 0.000
Inversion: 0.768 ± 0.062
Substitution: 0.726 ± 0.149


Docking

Table 4. Average estimated binding energies by docking of trajectory compounds using AutoDock.

Trajectory          N generated  N selected SMILES [a]  N stereoisomers [b]  BE (kcal/mol) [c]
AMPA to CNQX        353'036      3'571                  8'570                -8.8 ± 0.9
CNQX to AMPA        206'622      2'051                  5'754                -10.1 ± 1.0
Run-away from AMPA  91'417       992                    7'586                -9.1 ± 0.9
Run-away from CNQX  61'499       974                    5'751                -8.5 ± 0.9

[a] Randomly selected structures (as SMILES codes) from the trajectory library. [b] All possible stereoisomers were generated using CORINA. [c] Average binding energy of the most stable bound conformation located by AutoDock.


Docking

[Structure drawings not reproduced.]
1 (BE = -10.4)
2 (BE = -9.3)
3 (BE = -13.9)

Docking

[Figure: docking into 1FTK.pdb.]


Outlook

> Docking, synthesis and testing
> Expand GDB
> Graphical user interface

Swiss National Science Foundation
