27.08.2013 Views

pdf 3.1 M - OpenEye Scientific Software


Considerations of small molecule strain energy
Johannes Hermann
Ken Brameld
Deborah Reuter


Why calculate strain energies?
• Highly strained conformations probably do not bind to the target protein
• Strained conformations might bind but lose activity
• Conformational strain information could be used to guide/prioritize designs
Problems:
• Data sets to define thresholds are not clean
• Forcefields have holes in the parameter set
  Q. Wang, Y.-P. Pang, PLoS ONE 2007, 2(9), e820
  J. Tirado-Rives, W.L. Jorgensen, J. Med. Chem. 2006, 49, 5880-5884
  E. Perola, P.S. Charifson, J. Med. Chem. 2004, 47, 2499-2510
• Forcefields are not ideal for calculating non-minimum energies
Scientifically accurate investigation/calculation of strain energies is difficult, but pragmatic conclusions may be feasible


Strain Energy Calculation Strategy
CSD: test methods
• Source of low energy conformations
• Relatively clean
PDB: use methods
• Probably more strained conformations
• Less clean


Strain Energy Calculation Scheme CSD/PDB
Input: xray conformation
• E(xray relaxed): minimization with 0.2 Å wall box restraints
• E(local minimum): free minimization
• E(global minimum): conformer generation, free minimizations of the conf-ensemble, energy ranking
ΔE local = E(xray relaxed) - E(local minimum)
ΔE global = E(xray relaxed) - E(global minimum)
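The two strain definitions in this scheme reduce to simple differences once the minimizations are done. A minimal sketch, assuming all energies come from the same forcefield setup and are in kcal/mol; the function and class names are illustrative and not from the slides:

```python
from dataclasses import dataclass


@dataclass
class StrainResult:
    local: float    # ΔE local, kcal/mol
    global_: float  # ΔE global, kcal/mol


def strain_energies(e_xray_relaxed: float,
                    e_local_min: float,
                    ensemble_energies: list[float]) -> StrainResult:
    """Local and global strain from already minimized energies.

    e_xray_relaxed    : energy after the restrained minimization
    e_local_min       : energy after the free minimization
    ensemble_energies : energies of the freely minimized conformer ensemble
    """
    if not ensemble_energies:
        raise ValueError("need at least one minimized conformer energy")
    # Energy ranking: the lowest ensemble energy is taken as the global minimum.
    e_global_min = min(ensemble_energies)
    return StrainResult(local=e_xray_relaxed - e_local_min,
                        global_=e_xray_relaxed - e_global_min)
```

A negative global strain from this function means the conformer search did not find a conformation below the relaxed xray structure.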


Strain Energy Calculation Scheme PDB
Input: xray conformation
• E(xray relaxed): minimization with 0.2 Å wall box restraints
• E(local minimum): free minimization
• E(global minimum): conformer generation, free minimizations of the conf-ensemble, energy ranking
ΔE local = E(xray relaxed) - E(local minimum)
ΔE global = E(xray relaxed) - E(global minimum)
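The slides do not give the functional form of the 0.2 Å wall box restraint. One common reading is a flat-bottom penalty: zero while each atom stays within 0.2 Å of its xray position on every axis, rising outside. A minimal sketch under that assumption; the harmonic wall and the force constant `k` are hypothetical choices, only the 0.2 Å width comes from the slide:

```python
def box_restraint_energy(coords, ref_coords, half_width=0.2, k=100.0):
    """Flat-bottom box restraint (assumed form, not taken from the slides).

    coords, ref_coords : sequences of (x, y, z) tuples in Å
    half_width         : box half-width in Å (0.2 on the slide)
    k                  : hypothetical force constant, kcal/mol/Å^2
    """
    energy = 0.0
    for atom, ref in zip(coords, ref_coords):
        for x, x0 in zip(atom, ref):
            # Distance by which this coordinate leaves the box (if at all).
            excess = abs(x - x0) - half_width
            if excess > 0.0:
                energy += k * excess ** 2
    return energy
```

Inside the box the penalty is exactly zero, so the relaxed structure can remove bad contacts without drifting away from the xray conformation.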


Databases / methods
filtered to contain drug-like molecules
Filters:
• 200 < MW < 700
• No P, B, As, …
• Rotatable bonds < 10
• Ring size < 10
• Other internal filters

Database                  Entries
PDB (inhouse + public)    3119
CSD                       15528 (4592 for DFT)

Package      Forcefield / basis set   Electrostatics / solvent
Macromodel   MMFF94s, OPLS2005        constant dielectric, with/without GBSA water model
MOE          MMFF94s                  constant dielectric
MOE          MMFF94x                  distance dependent dielectric
Jaguar       B3LYP/6-31G*             vacuum
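The published filters can be written as a single predicate. A minimal sketch, assuming the descriptors have already been computed by a cheminformatics toolkit; the `Descriptors` class is hypothetical, the excluded element list on the slide is truncated (only P, B and As are named), and the internal filters are not public, so both are incomplete here:

```python
from dataclasses import dataclass, field

# Only the elements named on the slide; the original list continues ("…").
EXCLUDED_ELEMENTS = {"P", "B", "As"}


@dataclass
class Descriptors:
    mol_weight: float
    elements: set = field(default_factory=set)
    rotatable_bonds: int = 0
    largest_ring: int = 0  # 0 for acyclic molecules


def passes_filters(d: Descriptors) -> bool:
    """Apply the four filters listed on the slide."""
    return (200 < d.mol_weight < 700
            and not (d.elements & EXCLUDED_ELEMENTS)
            and d.rotatable_bonds < 10
            and d.largest_ring < 10)
```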


Local strain in the CSD
different methods suggest low overall strain energy

Package      Forcefield / basis   Median   sd     (kcal/mol)
Jaguar       B3LYP/6-31G*         0.1      1.8
MOE          MMFF94s              1.8      4.9
MOE          MMFF94x              1.7      4.8
Macromodel   MMFF94s              0.5      2.1
Macromodel   MMFF94s_GBSA         0.4      1.9
Macromodel   OPLS2005             0.4      1.8
Macromodel   OPLS2005_GBSA        0.4      1.7
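Each row of this table is a median and a standard deviation over the local strain energies of one method. A minimal sketch with the Python standard library; the slides do not say whether the sd is the sample or the population value, so the sample standard deviation is an assumption here:

```python
import statistics


def summarize(strains):
    """Return (median, sd) of a list of local strain energies in kcal/mol."""
    if len(strains) < 2:
        raise ValueError("need at least two values for a standard deviation")
    return statistics.median(strains), statistics.stdev(strains)
```

Reporting the median next to the sd is useful for these data because a few very strained entries inflate the sd while barely moving the median.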


Local strain energy distribution
More than 99% of DFT local strain energies are below 4 kcal/mol
[Figure: stacked bar chart, 0% to 100% of compounds with local strain < 1 kcal/mol, 1 to 4 kcal/mol, and > 4 kcal/mol, one bar per method: Jaguar (DFT); MOE (MMFF94s, MMFF94x); Macromodel (MMFF94s, MMFF94sw, OPLS2005, OPLS2005w)]
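The three classes in this chart can be counted directly from a list of local strain energies. A minimal sketch; how the boundary values 1 and 4 kcal/mol were assigned is not stated on the slide, so placing them in the middle class is an assumption:

```python
def strain_fractions(strains):
    """Fraction of compounds per class: < 1, 1 to 4, > 4 kcal/mol."""
    n = len(strains)
    if n == 0:
        raise ValueError("empty input")
    low = sum(1 for s in strains if s < 1.0)
    high = sum(1 for s in strains if s > 4.0)
    mid = n - low - high  # 1 <= s <= 4 (boundary handling is an assumption)
    return {"< 1": low / n, "1 - 4": mid / n, "> 4": high / n}
```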


B3LYP/6-31G* can be used as a control
high local strain with DFT identifies problematic compounds
Gross Structural Errors
JIYGED    69 kcal/mol
Hydrogen Placement Errors
EKIWAW    53 kcal/mol
BEWWIJ    19 kcal/mol
BATYIE    24 kcal/mol
QUADWUO   13 kcal/mol
TAKNIC    22 kcal/mol
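Using DFT as a control amounts to listing the entries whose DFT local strain is implausibly high and inspecting them by hand. A minimal sketch; the 4 kcal/mol cutoff is borrowed from the distribution slide as an assumption, and the entry named `EXAMPLE` is made up to show an unflagged case:

```python
def flag_for_inspection(dft_strain, threshold=4.0):
    """Return entry codes with DFT local strain above threshold, highest first."""
    flagged = [code for code, e in dft_strain.items() if e > threshold]
    return sorted(flagged, key=lambda code: dft_strain[code], reverse=True)


# Values in kcal/mol as printed on the slide, plus one hypothetical clean entry.
slide_values = {
    "JIYGED": 69.0, "EKIWAW": 53.0, "BEWWIJ": 19.0,
    "BATYIE": 24.0, "QUADWUO": 13.0, "TAKNIC": 22.0,
    "EXAMPLE": 0.3,
}
print(flag_for_inspection(slide_values))
```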


Identification of poor forcefield parameterization
large discrepancy of FF vs. DFT suggests problematic substructures in compound
[Figure: scatter plot of ΔE MMFF94s MOE (0 to 70) against ΔE B3LYP/6-31G* (0 to 2.5), n = 4500; points marked as > median + sd or < median + sd. Inset: zoom on ΔE MMFF94s MOE 0 to 8 and ΔE B3LYP/6-31G* 0 to 2, ~3000 (of 4500)]
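The legend splits compounds at median + sd. A minimal sketch of that split, assuming the cutoff is computed on the forcefield strain values and that the sample standard deviation is meant; neither detail is stated on the slide:

```python
import statistics


def split_by_cutoff(ff_strain):
    """Split compounds at median + sd of the forcefield local strain.

    ff_strain : dict mapping compound name to ΔE (kcal/mol)
    Returns (cutoff, names_above, names_at_or_below).
    """
    values = list(ff_strain.values())
    cutoff = statistics.median(values) + statistics.stdev(values)
    above = [name for name, e in ff_strain.items() if e > cutoff]
    below = [name for name, e in ff_strain.items() if e <= cutoff]
    return cutoff, above, below
```

The compounds in the upper set, whose DFT strain is low by comparison, are the ones worth passing to a substructure analysis.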


Problematic substructures MMFF94s
fingerprint based learner identifies frequent substructures in problematic sets
[Figure: chemical structures of the frequent problematic substructures (N+, N-H, O and S-R containing fragments); for the highlighted torsion, CSD structures show 40-75° while MMFF94s gives 0°]


Comparison Macromodel & MOE MMFF94s
MOE minimizer finds lower local minima; the same molecules appear as outliers
[Figure: scatter plot of ΔE MMFF94s Macromodel (0 to 10) against ΔE MMFF94s MOE (0 to 10), n = 15500]


Learnings from CSD strain energy calculations
• Different forcefields / methods give different strain energies
• Strain energies are not transferable; a strain energy analysis is needed for each software setup
• The drug-like subset of the CSD is very clean, with only a few artifacts (gross errors)
• Some relevant substructures are poorly parameterized in major forcefields
• Although less accurate, forcefields can be used for strain energy assessment
• Solvation models and the treatment of electrostatics are less important for strain energy calculations in the absence of protein


Local strain energy in the PDB
only slightly higher local strain observed than in the CSD
Method: MOE / MMFF94x

Relaxation                  Median (CSD)   sd (CSD)
0.2 box wall                2.5 (1.7)      6.8 (4.8)
0.2 box wall in protein     7.8            7.2
free in protein             7.3            5.5


Many more problems in the PDB
many protein-related and protein-unrelated errors, public & inhouse
2g1n    8/18 kcal/mol
1ikv    961 kcal/mol
1bqm    17 kcal/mol
2pxk    4/474 kcal/mol
1ya4    93 kcal/mol
3da2    14/58 kcal/mol
2pmo    31 kcal/mol


Local strain energies (no protein)
no correlation of strain energy with resolution, dpi, MW or number of rotors
[Figure: four scatter plots of ΔE local MMFF94x MOE (0 to 14) against resolution (1 to 3.5), dpi (0 to 1), rotors (0 to 9) and MW (200 to 600); a second version of the slide overlays binned plots of ΔE against rotatable bonds and binned MW]
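The slide judges correlation from scatter plots. One way to put a number on the same question is a Pearson coefficient between local strain and each descriptor. A minimal sketch in plain Python; the slides do not say which statistic, if any, the authors used, so this is only one possible check:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Values near zero for resolution, dpi, MW and rotor count would match the statement on the slide; a rank-based coefficient would be more robust to the extreme outliers seen in the PDB set.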


Global strain is distributed similarly to the CSD
OMEGA finds >95% of global minima in the PDB (1/3 = local)
[Figure: histogram of binned ΔE global MMFF94s MOE/OMEGA for CSD and PDB, y-axis 0% to 20%; bins: x ≤ -0.5, -0.5 < x ≤ 0.5, 0.5 < x ≤ 3, 3 < x ≤ 6, 6 < x ≤ 10, 10 < x ≤ 15, 15 < x ≤ 20, 20 < x]
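The histogram uses eight unequal bins for ΔE global. A minimal sketch of that binning with the standard library, assuming the upper edge of each bin is inclusive (the comparison signs are garbled in the extracted axis labels, so this is a guess):

```python
from bisect import bisect_left
from collections import Counter

# Upper edges of the first seven bins in kcal/mol; the eighth bin is x > 20.
EDGES = [-0.5, 0.5, 3.0, 6.0, 10.0, 15.0, 20.0]


def bin_fractions(values):
    """Fraction of values per bin, bins ordered as on the slide."""
    if not values:
        raise ValueError("empty input")
    # bisect_left puts a value equal to an edge into the bin that ends there.
    counts = Counter(bisect_left(EDGES, v) for v in values)
    n = len(values)
    return [counts.get(i, 0) / n for i in range(len(EDGES) + 1)]
```

Computing the fractions for the CSD and PDB sets separately gives the two series that the slide compares.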


CSD global strain energy (MOE/MMFF94s, OMEGA)
spiro- and aliphatic bridgehead atoms explain 2/3 of wrong global minima
[Figure: scatter plot of ΔE global MMFF94s MOE/OMEGA (0 to -500) against ΔE local MMFF94s MOE (0 to 10), n = 15500; points classified as spiro-center, > 1 aliphatic bridgeheads, or other]
OMEGA 2.3.2 settings:
-searchff & buildff MMFF94s
-rms 0.8
-noestat false




Local strain for isolated proteins/series
within a series strain energies become meaningful

Target                       # compounds   Mean   Spread E    Spread MW
Kinase target 1, Series 1    11            11.5   8 - 16      250
Kinase target 1, Series 2    7             7.5    2 - 13      220
Kinase target 2              7             1.4    0.3 - 3.8   100
Viral target                 6             2.9    1.4 - 3.4   90
Serine protease              6             2.1    0.6 - 4.3   115

All compounds have IC50 < 500 nM
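The per-series columns (number of compounds, mean, spread of E) can be produced by grouping local strain energies by series. A minimal sketch; the record layout and the series labels in the usage are hypothetical, and the spread is taken as minimum to maximum, which is how the table appears to report it:

```python
from collections import defaultdict
from statistics import mean


def summarize_series(records):
    """Group (series_name, local_strain) pairs and summarize each series."""
    grouped = defaultdict(list)
    for series, strain in records:
        grouped[series].append(strain)
    return {
        series: {
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for series, values in grouped.items()
    }
```

Comparing a new design against the minimum and maximum of its own series, rather than against a global threshold, follows the slide's point that strain energies become meaningful within a series.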


Conclusions
• Calculating strain energies is not as straightforward as one might think
• Individual inspection of PDB structures is necessary
• Strain energy does not correlate strongly with resolution, dpi, MW or number of rotatable bonds
• PDB and CSD strain energy distributions are surprisingly similar
• PDB structures are very heterogeneous: there is no universal threshold for tolerated strain energy
• Isolated series may provide guidance on what strain energies are acceptable
