pdf 1.95M - OpenEye Scientific Software
Shape Descriptors<br />
Sometimes Better than Nothing<br />
Brian Kelley<br />
Senior Developer/Scientist
Information Reduction – the<br />
myth of fingerprints<br />
First there were experts...
Information Reduction – the<br />
myth of fingerprints<br />
…then there were med-chems<br />
Hydrophobe<br />
Donor<br />
Acceptor<br />
Ring<br />
Charge<br />
Hydrogen<br />
Bonds
Shape: I want to believe...<br />
UFO Symposium 1968: Shepard UFO Shape Array
Shape is a Fundamental Physical<br />
Property of a Molecule.<br />
Thermal Motion (BFactor)
T=<br />
Our Definition of Shape:<br />
Shape Tanimoto<br />
Overlap(q,t) = Intersection of Volumes q and t<br />
\[ \mathrm{ShapeTanimoto}(q,t) = \frac{\mathrm{Overlap}(q,t)}{\mathrm{Overlap}(q,q) + \mathrm{Overlap}(t,t) - \mathrm{Overlap}(q,t)} \]
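As a rough illustration of the definition above, the sketch below scores two fixed poses. It assumes every atom is an identical spherical Gaussian and keeps only pairwise (first-order) overlaps. The names `gaussian_overlap` and `shape_tanimoto` and the parameters `alpha` and `p` are hypothetical choices for this sketch, not the ROCS implementation, and no overlay optimization is attempted.

```python
import math

def gaussian_overlap(coords_a, coords_b, alpha=0.8, p=2.7):
    """Sum of pairwise overlap integrals of identical atom-centred Gaussians.

    Each atom is modelled as p * exp(-alpha * r^2). The values of alpha and p
    are illustrative assumptions, not fitted parameters.
    """
    # Integral of the product of two such Gaussians separated by distance d:
    # p^2 * (pi / (2 * alpha))^(3/2) * exp(-alpha * d^2 / 2)
    prefactor = p * p * (math.pi / (2.0 * alpha)) ** 1.5
    total = 0.0
    for a in coords_a:
        for b in coords_b:
            d2 = sum((x - y) ** 2 for x, y in zip(a, b))
            total += prefactor * math.exp(-alpha * d2 / 2.0)
    return total

def shape_tanimoto(q, t):
    """Overlap(q,t) / (Overlap(q,q) + Overlap(t,t) - Overlap(q,t))."""
    o_qt = gaussian_overlap(q, t)
    return o_qt / (gaussian_overlap(q, q) + gaussian_overlap(t, t) - o_qt)

# Toy coordinates (hypothetical): the target is the query shifted by 0.5 in y.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
target = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
self_score = shape_tanimoto(query, query)
cross_score = shape_tanimoto(query, target)
```

A molecule compared with itself scores 1, and any displacement lowers the score, which is why maximizing this quantity over rigid-body moves produces an alignment.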
Shape Tanimoto: Gold Standard<br />
• Is a metric (follows the triangle inequality)<br />
• Is an objective function (has gradients,<br />
first and second derivatives).<br />
• Fast to compute for single overlays.<br />
• Easy extension to color.<br />
• Optimization generates alignments.<br />
• Is its own descriptor.
Shape Tanimoto: Gold Standard<br />
95 % confidence<br />
for 1000 random trials
The “Shape” of Ligand-based<br />
Design<br />
Co-Crystal Ligand-Based<br />
Docking<br />
Hypothesis<br />
Generation<br />
FILTER remove undesirables<br />
OMEGA conformations<br />
ROCS<br />
shape similarity
Generating Conformations,<br />
where do shapes come from?<br />
FILTER remove undesirables<br />
set protonation<br />
OMEGA conformations<br />
Conformations are required for all molecular<br />
shape descriptors!
ROCS Virtual Screening Workflow<br />
Query molecule<br />
Shape Query<br />
FILTER remove undesirables<br />
set protonation<br />
OMEGA conformations<br />
ROCS shape similarity<br />
(color similarity)<br />
ROCS 10x faster than OMEGA
What Are Shape Descriptors?<br />
Start with atom positions, or start with a volume or surface
What Are Shape Descriptors?<br />
Start with atom positions<br />
Reduce atom positions to<br />
an array of numbers that<br />
can be compared using<br />
various distances<br />
L1 norm: Manhattan distance<br />
L2 norm: Euclidean distance<br />
Distance Measure: Distance=0 identity, unbounded range?<br />
Similarity Measure: Similarity of 1=identity, 0=max difference.
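A minimal sketch of the two norms named above, applied to descriptor arrays. The function `to_similarity` shows one possible way to map an unbounded distance onto a similarity where 1 means identity; that particular mapping is an assumption of this sketch, not something the slides prescribe.

```python
import math

def manhattan(u, v):
    """L1 norm of the difference between two descriptor arrays."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """L2 norm of the difference between two descriptor arrays."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1]; 0 distance gives 1."""
    return 1.0 / (1.0 + distance)

d1 = manhattan([0.0, 0.0], [3.0, 4.0])
d2 = euclidean([0.0, 0.0], [3.0, 4.0])
```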
Distance Metrics in real life<br />
[Figure: "Pirates expand their activity farther off the coast in the Indian Ocean in 2008". Map of the Gulf of Aden and the Indian Ocean marking hijacked ships and attempted hijackings, with distance from the Somali coast (0 to 700 nautical miles) plotted from Feb '08 to Apr '09. Source: New York Times, April 17 2009]
Types of Reduced<br />
Representations<br />
• Distance Based<br />
– Binned Histograms<br />
– Atom Triplets<br />
– Pharmacophore<br />
– BCUTS<br />
– EShape3D
Types of Reduced<br />
Representations (cont)<br />
• Moment Based<br />
– Shape Multipoles<br />
– Moments of Inertia<br />
– Central/Zernike Moments<br />
– USR (Ultrafast Shape Recognition)
Required Properties of Shape<br />
Descriptors<br />
• Needs to be a measure: greater<br />
distances imply greater differences in<br />
shape.<br />
• Useful to be a metric and follow the<br />
triangle inequality (otherwise harder to<br />
cluster):<br />
if a
Start Simple - Distance<br />
Histograms<br />
Bin interatomic distances into a histogram
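A minimal sketch of the binning step, assuming 3D coordinates for a single conformation. The function name `distance_histogram`, the bin width, and the choice to clamp long distances into the last bin are all assumptions made for illustration.

```python
import math
from itertools import combinations

def distance_histogram(coords, bin_width=1.0, n_bins=10):
    """Count interatomic distances per bin; the result is the descriptor array."""
    counts = [0] * n_bins
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)
        # Distances beyond the last bin edge are clamped into the final bin.
        index = min(int(d // bin_width), n_bins - 1)
        counts[index] += 1
    return counts

# Toy coordinates (hypothetical), distances are 1.5, 2.5 and about 2.92.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.5, 0.0)]
hist = distance_histogram(coords, bin_width=1.0, n_bins=4)
```

Because only distances are used, the histogram does not change when the conformation is translated or rotated, but it does change when the conformation itself changes.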
Distance Histograms from<br />
“Special Points”<br />
Used in Flexophores for atom matching.<br />
Center Of Mass<br />
Inertial Arms
Shape Multipoles<br />
• Start with Simple expansions of the<br />
positions of atomic Gaussians.<br />
‣ 0th Pole is the Volume<br />
‣ Three Dipoles (x, y, z)<br />
‣ Six Quadrupoles (xx, xy, xz...)<br />
‣ Ten Octupoles (xxx, xxy, xxz...)
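The counts 1, 3, 6 and 10 follow from enumerating the distinct Cartesian components at each order. The sketch below lists them and evaluates one component as a simple weighted sum over atom positions. Treating atoms as weighted points is a simplifying assumption of this sketch (the slides expand atomic Gaussians), and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

def cartesian_components(order):
    """Distinct Cartesian index combinations, e.g. order 2 gives xx, xy, xz, yy, yz, zz."""
    return ["".join(c) for c in combinations_with_replacement("xyz", order)]

def cartesian_moment(coords, component, weights=None):
    """Sum of w * (product of the named coordinates) over atoms, e.g. 'xxy' gives x*x*y."""
    axis = {"x": 0, "y": 1, "z": 2}
    if weights is None:
        weights = [1.0] * len(coords)
    total = 0.0
    for w, c in zip(weights, coords):
        term = w
        for letter in component:
            term *= c[axis[letter]]
        total += term
    return total

# Number of components at orders 0 (volume), 1 (dipole), 2 (quadrupole), 3 (octupole).
counts = [len(cartesian_components(n)) for n in range(4)]
```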
Shape Multipoles (cont)<br />
• Can generate an irreducible descriptor<br />
(invariant to translation and rotation)<br />
• Poles rotated into inertial frame (also<br />
provides alignment)
Ultrafast Shape Descriptors<br />
For each conformation, find four special<br />
center points:<br />
1. The geometric center<br />
2. The atom closest to the center<br />
3. The atom farthest from the center<br />
4. The atom farthest from (3)
Ultrafast Shape Descriptors (cont)<br />
For each special point, get the distances<br />
to all atoms and:<br />
1. Compute the mean of all distances<br />
2. Compute the variance of all distances<br />
3. Compute the skew of all distances
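The two steps above (four special points, three moments each) can be sketched as a 12-number descriptor. This is a minimal illustration, not the published implementation: the function names are hypothetical, the skew is taken here as the third standardized moment, and a zero-variance guard is added as an assumption.

```python
import math

def _moments(values):
    """Mean, variance and skew of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        skew = 0.0  # assumption: define skew as 0 when all distances are equal
    else:
        skew = sum((v - mean) ** 3 for v in values) / n / var ** 1.5
    return [mean, var, skew]

def usr_descriptor(coords):
    """Twelve numbers: three distance moments from each of four special points."""
    n = len(coords)
    center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    closest = min(coords, key=lambda c: math.dist(c, center))
    farthest = max(coords, key=lambda c: math.dist(c, center))
    far_from_farthest = max(coords, key=lambda c: math.dist(c, farthest))
    descriptor = []
    for ref in (center, closest, farthest, far_from_farthest):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor

# Toy conformation (hypothetical coordinates).
conformation = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0),
                (3.1, 1.4, 0.7), (4.0, 0.2, 1.1)]
descriptor = usr_descriptor(conformation)
```

Two descriptors computed this way can then be compared with any array distance, such as the Manhattan distance.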
Ultrafast Shape Descriptors (cont)<br />
Ballester PJ, Richards WG. Ultrafast shape recognition for similarity,<br />
Proc. R. Soc. A, 2007, Vol. 463, 1307-1321
[Figure: descriptor methods arranged by Information Compression against Speed (1x, 10x, 100x, 1000x): Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and Docking (FRED)]
USR Versus Shape Tanimoto<br />
[Figure: scatter plot of USR against Shape Tanimoto, r² = 0.42]
Multipole versus Shape Tanimoto
Rank Correlations To Shape<br />
Kendall Tau: pairwise disagreements (number<br />
of discordant pairs)<br />
Number of operations a bubble sort would take<br />
to order list B into list A
Rank Correlations To Shape<br />
Molecule A B C D E<br />
Shape 1 2 3 4 5<br />
USR 3 4 1 2 5<br />
\[ \text{Normalized Kendall Tau} = \frac{4}{5(5-1)/2} \]<br />
Pair Shape USR<br />
AB 1
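The worked example on this slide can be checked with a short sketch. It counts discordant pairs between the Shape and USR ranks, normalizes by the number of pairs, and confirms that a bubble sort of the USR ranks needs the same number of swaps. The function names are hypothetical, and sorting into ascending order stands in for "ordering list B into list A" because the Shape ranks here are already 1 to 5.

```python
from itertools import combinations

def discordant_pairs(rank_a, rank_b):
    """Number of item pairs that the two rankings order differently."""
    count = 0
    for i, j in combinations(range(len(rank_a)), 2):
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0:
            count += 1
    return count

def bubble_sort_swaps(values):
    """Number of adjacent swaps a bubble sort performs to sort ascending."""
    values = list(values)
    swaps = 0
    for end in range(len(values) - 1, 0, -1):
        for k in range(end):
            if values[k] > values[k + 1]:
                values[k], values[k + 1] = values[k + 1], values[k]
                swaps += 1
    return swaps

# Ranks for molecules A to E, taken from the slide.
shape_rank = [1, 2, 3, 4, 5]
usr_rank = [3, 4, 1, 2, 5]

n = len(shape_rank)
discordant = discordant_pairs(shape_rank, usr_rank)
normalized_tau = discordant / (n * (n - 1) / 2)
swaps = bubble_sort_swaps(usr_rank)
```

With this normalization, 0 means the two rankings agree on every pair and 1 means they disagree on every pair.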
Rank Correlation To Shape<br />
Kendall Tau: pairwise disagreements<br />
(discordant pairs)<br />
Shape Rank vs. USR Rank: 0.4<br />
Shape Rank vs. Multipole Rank: 0.38
Fragility of descriptors<br />
[Figure: scatter plot of USR against Shape Tanimoto, r² = 0.42]
USR Versus Shape Tanimoto
Effect of adding Torsional<br />
Noise
Effect of adding Torsion Noise
Effects of randomizing Torsions
Virtual Screening Performance<br />
against DUD
...using Lowest Energy<br />
Conformation
...adding color
...using alignments + color
How do overlays help?
Using Moment Alignments<br />
[Figure: plot labelled "% Filtered" with values 60%, 30% and 10%]
[Figure: the Information Compression against Speed chart (1x to 1000x), shown again: Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and Docking (FRED)]
Can be used with Electron Density?<br />
[Figure: the Information Compression against Speed chart (1x to 1000x): Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and Docking (FRED)]
Generates Overlays?<br />
[Figure: the Information Compression against Speed chart (1x to 1000x): Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and Docking (FRED)]
Can use chemistry (color)?<br />
[Figure: the Information Compression against Speed chart (1x to 1000x): Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and Docking (FRED)]
Is really an objective function?<br />
[Figure: the Information Compression against Speed chart (1x to 1000x): Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, Atom Triplets, ROCS, Pharmacophore, and Docking (FRED)]
Acknowledgements<br />
• Anthony Nicholls<br />
• Robert Tolbert<br />
• Mark McGann