27.08.2013 Views

pdf 1.95M - OpenEye Scientific Software


Shape Descriptors

Sometimes Better than Nothing

Brian Kelley
Senior Developer/Scientist


Information Reduction – the myth of fingerprints

First there were experts...


Information Reduction – the myth of fingerprints

…then there were med-chems

Hydrophobe
Donor
Acceptor
Ring
Charge
Hydrogen Bonds


Shape: I want to believe...

UFO Symposium 1968: Shepard UFO Shape Array


Shape is a Fundamental Physical Property of a Molecule.

Thermal Motion (BFactor)


Our Definition of Shape: Shape Tanimoto

Overlap(q,t) = Intersection of Volumes q and t

$$T = \frac{\mathrm{Overlap}(q,t)}{\mathrm{Overlap}(q,q) + \mathrm{Overlap}(t,t) - \mathrm{Overlap}(q,t)}$$
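The Shape Tanimoto definition can be illustrated with a minimal sketch. It assumes hard-sphere atoms sampled on a regular grid, purely for simplicity; `voxelize`, `shape_tanimoto`, the grid step and the atom tuples are illustrative names and values, not part of any OpenEye API. Volumes are approximated by counting grid cells, so the cell volume cancels in the ratio.

```python
from itertools import product


def voxelize(atoms, step=0.5):
    """Return the set of grid cells whose centres lie inside any atom sphere.

    atoms: iterable of (x, y, z, radius) tuples (hard spheres, an assumption).
    """
    cells = set()
    for x, y, z, r in atoms:
        lo = [int((c - r) // step) - 1 for c in (x, y, z)]
        hi = [int((c + r) // step) + 1 for c in (x, y, z)]
        for i, j, k in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            cx, cy, cz = (i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step
            if (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2 <= r * r:
                cells.add((i, j, k))
    return cells


def shape_tanimoto(q_cells, t_cells):
    """T = O(q,t) / (O(q,q) + O(t,t) - O(q,t)), with overlaps as cell counts."""
    o_qt = len(q_cells & t_cells)
    o_qq, o_tt = len(q_cells), len(t_cells)
    denom = o_qq + o_tt - o_qt
    return o_qt / denom if denom else 0.0


query = voxelize([(0.0, 0.0, 0.0, 1.5), (1.0, 0.0, 0.0, 1.5)])
target = voxelize([(0.5, 0.0, 0.0, 1.5)])
print(shape_tanimoto(query, query))   # identical volumes
print(shape_tanimoto(query, target))  # partial overlap
```

The sketch scores one fixed pose only. Finding the pose that maximizes the overlap is a separate optimization step that this example does not attempt.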


Shape Tanimoto: Gold Standard

• Is a metric (follows the triangle inequality)
• Is an objective function (has gradients, first and second derivatives).
• Fast to compute for single overlays.
• Easy extension to color.
• Optimization generates alignments.
• Is its own descriptor.


Shape Tanimoto: Gold Standard

[Figure: 95% confidence for 1000 random trials]


The “Shape” of Ligand-based Design

Co-Crystal Ligand-Based
Docking
Hypothesis Generation

FILTER remove undesirables
OMEGA conformations
ROCS shape similarity


Generating Conformations, where do shapes come from?

FILTER remove undesirables, set protonation
OMEGA conformations

Conformations are required for all molecular shape descriptors!


ROCS Virtual Screening Workflow

Query molecule
Shape Query

FILTER remove undesirables, set protonation
OMEGA conformations
ROCS shape similarity (color similarity)

ROCS 10x faster than OMEGA


What Are Shape Descriptors?

Start with atom positions
Start with a volume or surface


What Are Shape Descriptors?

Start with atom positions

Reduce atom positions to an array of numbers that can be compared using various distances:

L1 norm: Manhattan distance
L2 norm: Euclidean distance

Distance measure: a distance of 0 means identity; unbounded range?
Similarity measure: a similarity of 1 means identity, 0 means maximum difference.
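As a small illustration of the two norms and of the difference between a distance and a similarity, here is a sketch over plain lists of numbers. The function names are illustrative, and the `1 / (1 + d)` mapping is only one common way to turn an unbounded distance into a bounded similarity; it is an assumption here, not something the slides prescribe.

```python
import math


def manhattan(a, b):
    """L1 norm of the difference between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 norm of the difference between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1]; 0 maps to 1."""
    return 1.0 / (1.0 + distance)


a, b = [0.0, 0.0], [3.0, 4.0]
print(manhattan(a, b))                 # 7.0
print(euclidean(a, b))                 # 5.0
print(to_similarity(manhattan(a, b)))  # 0.125
```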


Distance Metrics in real life

[Figure: "Pirates expand their activity farther off the coast in the Indian Ocean in 2008". Map of the Indian Ocean and Gulf of Aden marking hijacked ships and attempted hijackings, with distance from the Somali coast in nautical miles (0 to 700) shown for Feb ’08 through Apr ’09. Source: New York Times, April 17 2009]


Types of Reduced Representations

• Distance Based
– Binned Histograms
– Atom Triplets
– Pharmacophore
– BCUTS
– EShape3D


Types of Reduced Representations (cont)

• Moment Based
– Shape Multipoles
– Moments of Inertia
– Central/Zernike Moments
– USR (Ultrafast Shape Recognition)


Required Properties of Shape Descriptors

• Needs to be a measure: greater distances imply greater differences in shape.
• Useful to be a metric and follow the triangle inequality (otherwise harder to cluster):

if a
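The triangle inequality referred to above has a standard form. For a distance $d$ and any three shapes $a$, $b$ and $c$ (symbols chosen here for illustration):

```latex
d(a, c) \le d(a, b) + d(b, c)
```

In clustering terms: if a is close to b and b is close to c, then a cannot be far from c.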


Start Simple - Distance Histograms

Bin interatomic distances into a histogram
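A minimal sketch of such a histogram, assuming atoms are given as bare (x, y, z) tuples. The bin width, the bin count and the choice to fold long distances into the last bin are illustrative assumptions.

```python
import math
from itertools import combinations


def distance_histogram(coords, bin_width=1.0, n_bins=10):
    """Count interatomic distances per bin; the last bin collects overflow."""
    hist = [0] * n_bins
    for p, q in combinations(coords, 2):
        d = math.dist(p, q)
        hist[min(int(d // bin_width), n_bins - 1)] += 1
    return hist


coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.5, 0.0)]
print(distance_histogram(coords, bin_width=1.0, n_bins=5))  # [0, 1, 2, 0, 0]
```

Because only pairwise distances enter the histogram, the result does not change when the molecule is translated or rotated, so no alignment is needed before comparing two histograms.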


Distance Histograms from “Special Points”

Used in Flexophores for atom matching.

Center of Mass
Inertial Arms


Shape Multipoles

• Start with simple expansions of the positions of atomic Gaussians.
‣ 0th Pole is the Volume
‣ Three Dipoles (x, y, z)
‣ Six Quadrupoles (xx, xy, xz...)
‣ 10 Octupoles (xxx, xxy, xxz...)
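The component counts (1, 3, 6, 10) are the numbers of distinct coordinate products at each order. The sketch below enumerates them for point atoms with unit weight. That is a simplifying assumption: with point atoms the 0th moment is the atom count rather than a Gaussian volume. `raw_moments` is an illustrative name.

```python
from itertools import combinations_with_replacement


def raw_moments(points, order):
    """Return {axis label: sum over points of the coordinate product}.

    order 0 gives {"": n_points}, order 1 gives x, y, z,
    order 2 gives xx, xy, xz, yy, yz, zz, and so on.
    """
    moments = {}
    for axes in combinations_with_replacement(range(3), order):
        label = "".join("xyz"[a] for a in axes)
        total = 0.0
        for p in points:
            term = 1.0
            for a in axes:
                term *= p[a]
            total += term
        moments[label] = total
    return moments


points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
print([len(raw_moments(points, n)) for n in range(4)])  # [1, 3, 6, 10]
print(raw_moments(points, 1))  # {'x': 1.0, 'y': 2.0, 'z': 3.0}
```

These raw moments depend on where the molecule sits and how it is oriented, so on their own they are not yet a comparable descriptor.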


Shape Multipoles (cont)

• Can generate an irreducible descriptor (invariant to translation and rotation)
• Poles rotated into inertial frame (also provides alignment)


Ultrafast Shape Descriptors

For each conformation, find four special center points:

1. The geometric center
2. The atom closest to the center
3. The atom farthest from the center
4. The atom farthest from (3)


Ultrafast Shape Descriptors (cont)

For each special point, get the distances to all atoms and:

1. Compute the mean distance to all atoms
2. Compute the variance of all distances
3. Compute the skew of all distances
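The two steps above (four reference points, three moments each) give a 12-number descriptor. The sketch below assumes bare (x, y, z) atom tuples and takes the "skew" to be the unnormalized third central moment; implementations differ in how they normalize it. The similarity function uses the inverse of one plus the mean absolute difference, which is an assumption about the scoring form rather than something stated on the slides.

```python
import math


def _moments(values):
    """Mean, variance and third central moment of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, var, third]


def usr_descriptor(coords):
    """12 numbers: 3 moments of the distances to each of 4 reference points."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    closest = min(coords, key=lambda c: math.dist(c, centroid))
    farthest = max(coords, key=lambda c: math.dist(c, centroid))
    far_from_far = max(coords, key=lambda c: math.dist(c, farthest))
    descriptor = []
    for ref in (centroid, closest, farthest, far_from_far):
        descriptor += _moments([math.dist(c, ref) for c in coords])
    return descriptor


def usr_similarity(d1, d2):
    """1 / (1 + mean absolute difference): 1.0 for identical descriptors."""
    mad = sum(abs(a - b) for a, b in zip(d1, d2)) / len(d1)
    return 1.0 / (1.0 + mad)


mol = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
desc = usr_descriptor(mol)
print(len(desc))                   # 12
print(usr_similarity(desc, desc))  # 1.0
```

Since every entry is built from distances between points of the same conformation, the descriptor is unchanged by translation and rotation, and comparing two molecules costs only 12 subtractions.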


Ultrafast Shape Descriptors (cont)

Ballester, P. J.; Richards, W. G. Ultrafast shape recognition for similarity search in molecular databases. Proc. R. Soc. A, 2007, Vol. 463, 1307-1321


[Figure: methods arranged on two axes, Information Compression and Speed (1x, 10x, 100x, 1000x). Methods shown: Moments of Inertia, Multipole, Central/Zernike Moments, USR, Spherical Harmonics, Fourier Coefficients, BCUTS/ESShape3D, Distance Histograms, ROCS, Pharmacophore, Atom Triplets, and DOCKING (FRED)]


USR Versus Shape Tanimoto

[Plot: USR against Shape Tanimoto, r² = 0.42]


Multipole versus Shape Tanimoto


Rank Correlations To Shape

Kendall Tau: pairwise disagreements (number of discordant pairs)

Equivalently, the number of swaps a bubble sort would take to order list B into list A


Rank Correlations To Shape

Molecule   A  B  C  D  E
Shape      1  2  3  4  5
USR        3  4  1  2  5

$$\text{Normalized Kendall Tau} = \frac{4}{5(5-1)/2}$$

Pair  Shape  USR
AB    1
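The worked example can be checked with a short sketch that counts discordant pairs directly and also counts bubble-sort swaps. The function names are illustrative; the rank lists are the Shape and USR rows for molecules A to E.

```python
from itertools import combinations


def discordant_pairs(rank_a, rank_b):
    """Count item pairs that the two rankings order differently."""
    count = 0
    for i, j in combinations(range(len(rank_a)), 2):
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0:
            count += 1
    return count


def normalized_kendall_tau(rank_a, rank_b):
    """Discordant pairs divided by the total number of pairs n(n-1)/2."""
    n = len(rank_a)
    return discordant_pairs(rank_a, rank_b) / (n * (n - 1) / 2)


def bubble_sort_swaps(seq):
    """Number of adjacent swaps bubble sort needs to sort seq ascending."""
    seq, swaps = list(seq), 0
    for end in range(len(seq) - 1, 0, -1):
        for k in range(end):
            if seq[k] > seq[k + 1]:
                seq[k], seq[k + 1] = seq[k + 1], seq[k]
                swaps += 1
    return swaps


shape = [1, 2, 3, 4, 5]  # ranks of molecules A..E by shape
usr = [3, 4, 1, 2, 5]    # ranks of the same molecules by USR
print(discordant_pairs(shape, usr))        # 4
print(normalized_kendall_tau(shape, usr))  # 0.4
print(bubble_sort_swaps(usr))              # 4
```

The four discordant pairs are AC, AD, BC and BD. With this normalization, 0 means the two rankings agree completely and 1 means one is the reverse of the other.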


Rank Correlation To Shape

Kendall Tau: pairwise disagreements (discordant pairs)

Shape Rank vs. USR Rank: 0.4
Shape Rank vs. Multipole Rank: 0.38


Fragility of descriptors

[Plot: USR against Shape Tanimoto, r² = 0.42]


USR Versus Shape Tanimoto


Effect of adding Torsional Noise


Effect of adding Torsion Noise


Effects of randomizing Torsions


Virtual Screening Performance against DUD


...using Lowest Energy Conformation


...adding color


...using alignments + color


How do overlays help?


Using Moment Alignments

[Figure: % Filtered at 60%, 30% and 10%]


Can be used with Electron Density?

[Figure: the Information Compression versus Speed chart of methods (Moments of Inertia through DOCKING with FRED), annotated for this question]


Generates Overlays?

[Figure: the Information Compression versus Speed chart of methods (Moments of Inertia through DOCKING with FRED), annotated for this question]


Can use chemistry (color)?

[Figure: the Information Compression versus Speed chart of methods (Moments of Inertia through DOCKING with FRED), annotated for this question]


Is really an objective function?

[Figure: the Information Compression versus Speed chart of methods (Moments of Inertia through DOCKING with FRED), annotated for this question]


Acknowledgements

• Anthony Nicholls
• Robert Tolbert
• Mark McGann
