pdf 3.1 M - OpenEye Scientific Software
Considerations of small molecule strain energy<br />
Johannes Hermann<br />
Ken Brameld<br />
Deborah Reuter
Why calculate strain energies?<br />
• Highly strained conformations probably do not bind to the target protein<br />
• Strained conformations might bind but lose activity<br />
• Conformational strain information could be used to guide/prioritize designs<br />
Problems:<br />
• Data sets used to define thresholds are not clean<br />
• Forcefields have holes in the parameter set<br />
• Forcefields are not ideal for calculating non-minimum energies<br />
Q. Wang, Y.-P. Pang, PLoS ONE 2007, 2(9), e820<br />
J. Tirado-Rives, W.L. Jorgensen, J. Med. Chem. 2006, 49, 5880-5884<br />
E. Perola, P.S. Charifson, J. Med. Chem. 2004, 47, 2499-2510<br />
Scientifically accurate investigation/calculation<br />
of strain energies is difficult,<br />
but pragmatic conclusions may be feasible
Strain Energy Calculation Strategy<br />
CSD<br />
• Source of low-energy conformations<br />
• Relatively clean<br />
• Test methods<br />
PDB<br />
• Probably more strained conformations<br />
• Less clean<br />
• Use methods
Strain Energy Calculation Scheme CSD/PDB<br />
Starting from the X-ray structure:<br />
• Minimization with 0.2 Å wall box restraints → E(xray relaxed)<br />
• Free minimization → E(local minimum)<br />
• Conformer generation → conf-ensemble → free minimizations → energy ranking → E(global minimum)<br />
ΔE local = E(xray relaxed) − E(local minimum)<br />
ΔE global = E(xray relaxed) − E(global minimum)
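The two energy differences in the scheme can be written down in a few lines once the minimizations have been run. The following is only a minimal sketch: it assumes the energies (in kcal/mol) have already been exported from whichever package is used, and the function name `strain_energies` and the example numbers are hypothetical, not taken from the slides.

```python
def strain_energies(e_xray_relaxed, e_local_min, ensemble_energies):
    """Return (dE_local, dE_global) in the units of the inputs.

    e_xray_relaxed    : energy after the restrained (0.2 A wall box) minimization
    e_local_min       : energy after the free minimization (local minimum)
    ensemble_energies : energies of the freely minimized conformer ensemble
    """
    if not ensemble_energies:
        raise ValueError("need at least one conformer energy")
    # Energy ranking: the lowest ensemble energy is taken as the global minimum.
    e_global_min = min(ensemble_energies)
    d_e_local = e_xray_relaxed - e_local_min
    d_e_global = e_xray_relaxed - e_global_min
    return d_e_local, d_e_global


# Hypothetical energies, for illustration only.
d_local, d_global = strain_energies(12.0, 10.5, [11.0, 9.0, 10.0])
print(d_local, d_global)
```

With this definition, a negative ΔE global means that no conformer in the ensemble was lower in energy than the relaxed X-ray structure, i.e. the conformer search did not reach the global minimum.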
Databases / methods<br />
filtered to contain drug-like molecules<br />
Databases:<br />
• PDB (in-house + public): 3119 entries<br />
• CSD: 15528 entries, 4592 for DFT<br />
Filters:<br />
• 200 < MW < 700<br />
• No P, B, As…<br />
• Rotatable bonds < 10<br />
• Ring size < 10<br />
• Other internal filters<br />
Package | Forcefield / basis set | Electrostatics / solvent<br />
Macromodel | MMFF94s, OPLS2005 | constant dielectric, with/without GBSA water model<br />
MOE | MMFF94s, MMFF94x | constant dielectric, distance dependent dielectric<br />
Jaguar | B3LYP/6-31G* | vacuum
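The explicit filter criteria listed on this slide could be applied along the following lines. This is a sketch under stated assumptions: the `Molecule` record and its field names are hypothetical, the descriptor values would have to come from a cheminformatics toolkit, the element exclusion covers only the three elements actually named (the slide's list is elided), and the unspecified internal filters are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical record holding precomputed descriptors."""
    name: str
    mol_weight: float
    elements: frozenset
    rotatable_bonds: int
    max_ring_size: int


# Only the elements named on the slide; the remainder of the list is elided there.
EXCLUDED_ELEMENTS = frozenset({"P", "B", "As"})


def passes_filters(mol):
    """Apply the explicit drug-likeness criteria from the slide."""
    return (
        200 < mol.mol_weight < 700
        and not (mol.elements & EXCLUDED_ELEMENTS)
        and mol.rotatable_bonds < 10
        and mol.max_ring_size < 10
    )


# Hypothetical example entries.
candidates = [
    Molecule("ok", 350.0, frozenset({"C", "H", "N", "O"}), 5, 6),
    Molecule("phosphate", 350.0, frozenset({"C", "H", "O", "P"}), 5, 6),
    Molecule("too_flexible", 450.0, frozenset({"C", "H", "O"}), 12, 6),
]
print([m.name for m in candidates if passes_filters(m)])
```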
Local strain in the CSD<br />
different methods suggest low overall strain energy<br />
Package | Forcefield / basis | Median (kcal/mol) | sd (kcal/mol)<br />
Jaguar | B3LYP/6-31G* | 0.1 | 1.8<br />
MOE | MMFF94s | 1.8 | 4.9<br />
MOE | MMFF94x | 1.7 | 4.8<br />
Macromodel | MMFF94s | 0.5 | 2.1<br />
Macromodel | MMFF94s_GBSA | 0.4 | 1.9<br />
Macromodel | OPLS2005 | 0.4 | 1.8<br />
Macromodel | OPLS2005_GBSA | 0.4 | 1.7
Local strain energy distribution<br />
> 99% of DFT local strain less than 4 kcal/mol<br />
[Figure: stacked bars (0 – 100%) showing the fraction of compounds with local strain < 1, 1 – 4 and > 4 kcal/mol for each method: Jaguar (DFT), MOE (MMFF94s, MMFF94x), Macromodel (MMFF94s, MMFF94sw, OPLS2005, OPLS2005w)]
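The three strain categories used in this chart (< 1, 1 – 4, > 4 kcal/mol) can be tallied as in the sketch below. The function name is hypothetical, and the handling of values lying exactly on 1 or 4 kcal/mol (placed in the middle category here) is an assumption, since the slide does not say.

```python
def strain_fractions(energies):
    """Fraction of local strain energies in each of the three chart categories."""
    if not energies:
        raise ValueError("need at least one strain energy")
    n = len(energies)
    low = sum(1 for e in energies if e < 1.0)
    high = sum(1 for e in energies if e > 4.0)
    # Assumption: boundary values (exactly 1 or 4 kcal/mol) count as "1 - 4".
    mid = n - low - high
    return {"< 1": low / n, "1 - 4": mid / n, "> 4": high / n}


# Hypothetical strain energies in kcal/mol.
print(strain_fractions([0.2, 0.5, 2.0, 5.0]))
```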
B3LYP/6-31G* can be used as a control<br />
high local strain with DFT identifies problematic compounds<br />
Gross Structural Errors<br />
• JIYGED: 69 kcal/mol<br />
Hydrogen Placement Errors<br />
• EKIWAW: 53 kcal/mol<br />
• BEWWIJ: 19 kcal/mol<br />
• BATYIE: 24 kcal/mol<br />
• QUADWUO: 13 kcal/mol<br />
• TAKNIC: 22 kcal/mol
Identification of poor forcefield parameterization<br />
a large discrepancy between forcefield and DFT strain energies suggests problematic substructures in the compound<br />
[Figure: scatter plot of ΔE MMFF94s MOE (0 – 70) against ΔE B3LYP/6-31G* (0 – 2.5), n = 4500; points are split into > median + sd and < median + sd]
Identification of poor forcefield parameterization<br />
[Figure: scatter plot of ΔE MMFF94s MOE against ΔE B3LYP/6-31G*, n = 4500, with a zoomed inset (ΔE MMFF94s 0 – 8, ΔE B3LYP/6-31G* 0 – 2) annotated "~3000 (of 4500)"; points are split into > median + sd and < median + sd]
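The "median + sd" split shown in the plot legend can be sketched as follows. The slides do not state which quantity the threshold is applied to or whether a population or sample standard deviation was used; the sketch applies it to whatever per-compound value is passed in and assumes a population standard deviation. The function name and the identifiers in the example are hypothetical.

```python
import statistics


def flag_outliers(energies):
    """Return (threshold, sorted names of compounds above median + sd).

    energies maps a compound identifier to a strain energy in kcal/mol.
    """
    values = list(energies.values())
    # Assumption: population standard deviation; the slides do not specify.
    threshold = statistics.median(values) + statistics.pstdev(values)
    flagged = sorted(name for name, e in energies.items() if e > threshold)
    return threshold, flagged


# Hypothetical compounds and energies.
threshold, flagged = flag_outliers({"A": 1.0, "B": 2.0, "C": 3.0, "D": 50.0})
print(round(threshold, 1), flagged)
```

Compounds flagged this way would then be the input set for a substructure analysis of the kind described on the next slide.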
Problematic substructures MMFF94s<br />
fingerprint based learner identifies frequent substructures in problematic sets<br />
[Figure: drawings of the frequent problematic substructures (not recoverable from the extracted text); one structure is annotated "CSD 40 – 75°, MMFF94s 0°"]
Comparison Macromodel & MOE MMFF94s<br />
MOE minimizer finds lower local minima; the same molecules appear as outliers<br />
[Figure: scatter plot of ΔE MMFF94s Macromodel (0 – 10) against ΔE MMFF94s MOE (0 – 10), n = 15500]
Learnings from CSD strain energy calculations<br />
• Different forcefields / methods give different strain energies<br />
• Strain energies are not transferable;<br />
a strain energy analysis is needed for each software setup<br />
• The drug-like subset of the CSD is very clean: only a few artifacts (gross errors)<br />
• Some relevant substructures are poorly parameterized in major forcefields<br />
• Although less accurate, forcefields can be used for strain energy assessment<br />
• Solvation models and the treatment of electrostatics are less important for<br />
strain energy calculations in the absence of protein
Local strain energy in the PDB<br />
only slightly higher local strain observed than in the CSD<br />
Method: MOE / MMFF94x<br />
Relaxation | Median (CSD) | sd (CSD)<br />
0.2 Å wall box | 2.5 (1.7) | 6.8 (4.8)<br />
0.2 Å wall box in protein | 7.8 | 7.2<br />
free in protein | 7.3 | 5.5
Many more problems in the PDB<br />
many protein-related and protein-unrelated errors, public & in-house<br />
• 2g1n: 8/18 kcal/mol<br />
• 1ikv: 961 kcal/mol<br />
• 1bqm: 17 kcal/mol<br />
• 2pxk: 4/474 kcal/mol<br />
• 1ya4: 93 kcal/mol<br />
• 3da2: 14/58 kcal/mol<br />
• 2pmo: 31 kcal/mol
Local strain energies (no protein)<br />
no correlation of strain energy with resolution, dpi, MW or number of rotors<br />
[Figure: four scatter plots of ΔE local MMFF94x MOE (0 – 14) against resolution (1 – 3.5), dpi (0 – 1), rotors (0 – 9) and MW (200 – 600)]
Local strain energies (no protein)<br />
no correlation of strain energy with resolution, dpi, MW or number of rotors<br />
[Figure: the four plots of ΔE local MMFF94x MOE against resolution, dpi, rotatable bonds and MW, with additional binned panels for rotatable bonds (0 – 9) and binned MW (0 – 8)]
Global strain is distributed similarly to the CSD<br />
OMEGA finds >95% of global minima in the PDB (1/3 = local)<br />
[Figure: histogram (0 – 20%) of binned ΔE global MMFF94s MOE/OMEGA for CSD and PDB; bins: x ≤ -0.5, -0.5 < x ≤ 0.5, 0.5 < x ≤ 3, 3 < x ≤ 6, 6 < x ≤ 10, 10 < x ≤ 15, 15 < x ≤ 20, 20 < x]
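The binning behind this histogram can be reproduced from the edges on its axis (-0.5, 0.5, 3, 6, 10, 15, 20). The sketch below assumes right-closed bins, reading the "=" in the extracted axis labels as "≤"; the function name and the example values are hypothetical.

```python
from bisect import bisect_left

# Bin edges taken from the histogram axis; bins are assumed to be right-closed.
BIN_EDGES = [-0.5, 0.5, 3, 6, 10, 15, 20]
BIN_LABELS = [
    "x <= -0.5",
    "-0.5 < x <= 0.5",
    "0.5 < x <= 3",
    "3 < x <= 6",
    "6 < x <= 10",
    "10 < x <= 15",
    "15 < x <= 20",
    "20 < x",
]


def bin_percentages(values):
    """Percentage of global strain energies (kcal/mol) falling in each bin."""
    if not values:
        raise ValueError("need at least one value")
    counts = [0] * len(BIN_LABELS)
    for x in values:
        # bisect_left puts a value equal to an edge into the bin that ends there.
        counts[bisect_left(BIN_EDGES, x)] += 1
    total = len(values)
    return {label: 100.0 * c / total for label, c in zip(BIN_LABELS, counts)}


# Hypothetical global strain energies.
for label, pct in bin_percentages([-1.0, 0.0, 0.5, 2.0, 25.0]).items():
    print(f"{label:>16}: {pct:.0f}%")
```

Computing the same percentages separately for the CSD and PDB sets gives the two series that the histogram compares.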
CSD global strain energy (MOE/MMFF94s, OMEGA)<br />
spiro- and aliphatic bridgehead atoms explain 2/3 of wrong global minima<br />
[Figure: scatter plot of ΔE global MMFF94s MOE/OMEGA (0 to -500) against ΔE local MMFF94s MOE (0 – 10), n = 15500; points are labelled spiro-center, > 1 aliphatic bridgeheads, or other]<br />
OMEGA 2.3.2 settings:<br />
-searchff & buildff MMFF94s<br />
-rms 0.8<br />
-noestat false
Local strain for isolated proteins/series<br />
within a series strain energies become meaningful<br />
Target / series | # compounds | Mean | Spread E | Spread MW<br />
Kinase target 1, Series 1 | 11 | 11.5 | 8 – 16 | 250<br />
Kinase target 1, Series 2 | 7 | 7.5 | 2 – 13 | 220<br />
Kinase target 2 | 7 | 1.4 | 0.3 – 3.8 | 100<br />
Viral target | 6 | 2.9 | 1.4 – 3.4 | 90<br />
Serine protease | 6 | 2.1 | 0.6 – 4.3 | 115<br />
all compounds' IC50 < 500 nM
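A per-series summary of this kind (number of compounds, mean strain, and spread as a minimum-to-maximum range) can be computed as sketched below. The series names and energies are hypothetical and are not the in-house data behind the slide; reading "Spread E" as the min – max range is an assumption based on how the values are reported.

```python
import statistics


def summarize_series(strain_by_series):
    """Per-series compound count, mean strain and (min, max) spread in kcal/mol."""
    summary = {}
    for name, values in strain_by_series.items():
        summary[name] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "spread": (min(values), max(values)),
        }
    return summary


# Hypothetical strain energies for two series.
example = {
    "series A": [8.0, 12.0, 16.0],
    "series B": [0.5, 1.5, 2.5],
}
for name, stats in summarize_series(example).items():
    print(name, stats)
```

Comparing a new design against the spread of its own series, rather than against a global cut-off, is the use the slide suggests.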
Conclusions<br />
• Calculating strain energies is not as straightforward as one might think<br />
• Individual inspection of PDB structures necessary<br />
• Strain energy does not correlate strongly with resolution, dpi, MW or number of<br />
rotatable bonds<br />
• PDB and CSD strain energy distributions surprisingly similar<br />
• PDB structures are very heterogeneous: there is no universal threshold for<br />
tolerated strain energy<br />
• Isolated series may provide guidance on what strain energies are acceptable