
FRED, Fast Rigid Exhaustive Docking


FRED
Fast Rigid Exhaustive Docking
version 2.2.5

OpenEye Scientific Software, Inc.
April 7, 2009

9 Bisbee Ct, Suite D
Santa Fe, NM 87508
www.eyesopen.com
support@eyesopen.com


Copyright © 1997-2006 OpenEye Scientific Software, Santa Fe, New Mexico. All rights reserved. This material contains proprietary information of OpenEye Scientific Software. Use of copyright notice is precautionary only and does not imply publication or disclosure.

The information supplied in this document is believed to be true but no liability is assumed for its use or the infringement of the rights of others resulting from its use. Information in this document is subject to change without notice and does not represent a commitment on the part of OpenEye Scientific Software.

This package is sold/licensed/distributed subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out or otherwise circulated without OpenEye Scientific Software's prior consent, in any form of packaging or cover other than that in which it was produced. No part of this manual or accompanying documentation, may be reproduced, stored in a retrieval system on optical or magnetic disk, tape, CD, DVD or other medium, or transmitted in any form or by any means, electronic, mechanical, photocopying recording or otherwise for any purpose other than for the purchaser's personal use without a legal agreement or other written permission granted by OpenEye.

This product should not be used in the planning, construction, maintenance, operation or use of any nuclear facility nor the flight, navigation or communication of aircraft or ground support equipment. OpenEye Scientific software, shall not be liable, in whole or in part, for any claims arising from such use, including death, bankruptcy or outbreak of war.

Windows is a registered trademark of Microsoft Corporation. Apple and Macintosh are registered trademarks of Apple Computer, Inc. AIX and IBM are registered trademarks of International Business Machines Corporation. UNIX is a registered trademark of the Open Group. RedHat is a registered trademark of RedHat, Inc. Linux is a registered trademark of Linus Torvalds. Alpha is a trademark of Digital Equipment Corporation. SPARC is a registered trademark of SPARC International Inc. SYBYL is a registered trademark of TRIPOS, Inc. MDL is a registered trademark and ISIS is a trademark of MDL Information Systems, Inc. SMILES, SMARTS, and SMIRKS may be trademarks of Daylight Chemical Information Systems. Macromodel is a trademark of Schrödinger, Inc. Schrödinger, Inc may be a wholly owned subsidiary of the Columbia University, New York. Python is a trademark of the Python Software Foundation. Java is a trademark or registered trademark of Sun Microsystems, Inc. in the U.S. or other countries. "The forefront of chemoinformatics" is a trademark of Daylight Chemical Information Systems, Inc. Other products and software packages referenced in this document are trademarks and registered trademarks of their respective vendors or manufacturers.


CONTENTS

1 Version

2 Introduction
    2.1 Overview
    2.2 Key technologies

3 Installation and Platform Notes
    3.1 Licenses
    3.2 Installation
    3.3 PVM

4 Theory
    4.1 Input
    4.2 Docking
    4.3 Scoring
    4.4 Output

5 Running FRED
    5.1 Specifying Parameters
    5.2 Command Line Help
    5.3 Receptor file
    5.4 Constraints
    5.5 Preparing ligand database
    5.6 Screen output
    5.7 Complete list of output files

6 Example Command Lines
    6.1 Simple command lines
    6.2 Creating a receptor
    6.3 Finding the right pose
    6.4 Scoring
    6.5 Using MASC
    6.6 Ligand+Structure based design

7 Parameters
    7.1 Execute Options
    7.2 Input Ligands
    7.3 Receptor Site
    7.4 Docking
    7.5 Scoring
    7.6 Output

A Release Notes
    A.1 FRED 2.2.5 Change Log
    A.2 FRED 2.2.4 Change Log
    A.3 FRED 2.2.3 Change Log
    A.4 FRED 2.2.2 Change Log
    A.5 FRED 2.2.1 Change Log
    A.6 Initial FRED 2.2 release

B File formats
    B.1 Valid file extensions for Both Reading and Writing
    B.2 Valid File Extensions for Reading
    B.3 Valid File Extensions for Writing

C Glossary

D Standard MASC reference sites

E FRED 1.2.10 to FRED 2.1 parameter dictionary
    E.1 Parameter Input
    E.2 PVM
    E.3 Molecule Loader
    E.4 Output
    E.5 Omega
    E.6 Receptor

F FRED 2.0 to FRED 2.2 parameter dictionary

G FRED 2.1 to FRED 2.2 parameter dictionary

H Accessing scores in oeb files from OEChem
    H.1 Overview
    H.2 Tags


CHAPTER ONE

Version

This document is for version 2.2.5 of FRED.


CHAPTER TWO

Introduction

2.1 Overview

F.R.E.D. (Fast Rigid Exhaustive Docking) is a protein-ligand docking program that takes a multiconformer library/database and a receptor file as input and outputs the molecules of the input database most likely to bind to the receptor. FRED is a command line program, although a GUI is available to set up and create receptor files prior to docking. Typical docking time for FRED is a few seconds per ligand. FRED jobs can also be easily distributed over multiple computers/processors using PVM to further reduce docking time.

2.2 Key technologies

Combined ligand and structure based design
    While primarily a structure based design program, FRED can also use a known bound ligand within the active site to improve results.

Receptor Site Shape
    FRED has a very effective method for determining the shape of an active site (one that works well even on very shallow/open binding sites), which allows it to reject poses in incorrect positions very efficiently. This technology can also be used to detect active sites on a protein when the site is not known a priori.

Consensus Structure
    This method of selecting a correctly docked pose from a set of likely candidates uses the consensus of multiple scoring functions. The consensus structure method improves the chances that the pose selected is correct (relative to using any single scoring function). Please note that "consensus structure" uses multiple scoring functions to compare different poses of the same ligand, unlike "consensus scoring", which is used for ligand-ligand comparison.

M.A.S.C. scoring
    MASC stands for Multiple Active Site Correction and is a method of correcting for systematic biases in any scoring function. The MASC method corrects for systematic bias by comparing a ligand's score to the same ligand's score in a set of standard reference sites. Ligands which score well in the reference sites are assumed to be promiscuous, and are penalized.


Chemgauss scoring
    Chemgauss is a scoring function developed by OpenEye which uses Gaussian functions to describe the shape and chemistry of molecules. Chemgauss is currently at version 3 and is described in more detail in the Scoring section.

Exhaustive Docking
    FRED's docking strategy is to exhaustively score all possible positions of each ligand in the active site. The exhaustive search is based on rigid rotations and translations of each conformer. This novel strategy completely avoids the sampling issues associated with the stochastic methods used by many other docking programs.

Most of these technologies are described in more detail in the Theory section.
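The MASC idea from section 2.2 (comparing a ligand's score in the target site with the same ligand's scores in a set of reference sites) can be illustrated with a small Python sketch. The normalization used here, a z-score against the reference-site scores, is an assumption made for illustration only; this section does not give the formula. The function name masc_like_correction and all numbers are hypothetical, and the sketch assumes that a lower score is better.

```python
from statistics import mean, pstdev


def masc_like_correction(target_score, reference_scores):
    """Return a bias-corrected score for one ligand.

    Assumption (not taken from this manual): the correction is a z-score
    of the target-site score against the same ligand's scores in the
    reference sites. Lower scores are assumed to be better.
    """
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference-site scores")
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference-site scores have zero spread")
    return (target_score - mu) / sigma


# Hypothetical raw scores. Both ligands score -60 in the target site,
# but the second one scores just as well in the reference sites.
selective = masc_like_correction(-60.0, [-30.0, -20.0, -40.0])
promiscuous = masc_like_correction(-60.0, [-60.0, -50.0, -70.0])
```

In this toy example the ligand that scores well everywhere ends up with a corrected score near zero, while the ligand that scores well only in the target site keeps a strongly favorable corrected score.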


CHAPTER THREE

Installation and Platform Notes

3.1 Licenses

To run FRED you will need to obtain a license file for FRED from OpenEye Scientific Software (support@eyesopen.com). The license file must be made available in one of two ways:

1. As a file pointed to by the environment variable $OE_LICENSE.
2. As a file named oe_license.txt placed in the directory specified by the environment variable $OE_DIR.

If you intend to run multi-processor FRED via PVM, only the master machine needs access to the license file.

3.2 Installation

3.2.1 Linux/Unix

By default, all OpenEye applications are installed in a single tree, with the latest script of each app installed in openeye/bin. This script determines the actual platform at run time and calls the actual executable under the openeye/arch directory.

Assuming the installation is in /usr/local/openeye, all that is necessary to run FRED (and the other included apps) is to add /usr/local/openeye/bin to your PATH.
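The two license locations listed in section 3.1 can be checked before a job is launched. The following Python sketch is not part of FRED: the function name find_license_file is hypothetical, and the order in which the two locations are tried (OE_LICENSE first, then OE_DIR) is an assumption.

```python
import os


def find_license_file(env=None):
    """Return the path of the first license file that exists, or None.

    Looks at the file named by OE_LICENSE, then at oe_license.txt inside
    the directory named by OE_DIR. The lookup order is an assumption made
    for this sketch, not documented FRED behavior.
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get("OE_LICENSE"):
        candidates.append(env["OE_LICENSE"])
    if env.get("OE_DIR"):
        candidates.append(os.path.join(env["OE_DIR"], "oe_license.txt"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Calling find_license_file() with no argument inspects the current process environment; a return value of None means neither location holds a readable file.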


3.2.2 Windows

On Windows, we now provide a standard EXE setup installer. By default, all OpenEye apps will install into C:\OpenEye. In order to run FRED (and the other included apps) in a command shell or Cygwin terminal, you must add C:\OpenEye\bin to your PATH environment variable in the System Control Panel.

3.2.3 OS X

3.3 PVM

PVM (Parallel Virtual Machine) is a freely available library for running processes on more than one processor on one or more machines. FRED can take advantage of PVM to distribute jobs over multiple processors. To do this, PVM must be installed on all the machines FRED will be distributed over. The PVM source is freely available from

    http://www.csm.ornl.gov/pvm/pvm_home.html

however many Linux distributions, and some Unix versions, include PVM by default. FRED is built with the current PVM version 3.4.5, but should also work with PVM version 3.4.4. FRED does not support PVM under Windows.

To use FRED with PVM you must do one of the following:

1. Place a link to the "fred" executable in $PVM_ROOT/bin/$PVM_ARCH.
2. Define the environment variable PVM_PATH, which names the directory where the FRED executable resides.

The environment variables PVM_ROOT and PVM_ARCH should be defined globally as part of the PVM installation. PVM_PATH is generally a user-defined environment variable, and must be defined for all shells (i.e., you can't just define it in the shell you're launching FRED in).

NOTE: There is no specific slave executable. The executable distributed for this program serves as both a master and slave PVM program as well as a single processor version.

3.3.1 PVM hosts file

After PVM and FRED are installed, each FRED job run via PVM must be supplied a file (via -pvmconf) that describes the hosts and number of CPUs to use. Each line in this file consists of the word "host", followed by the hostname of the slave, followed by the number of CPUs to use on that slave. To use 10 slaves on a Linux cluster (where nodes are named "linux1", "linux2", etc.):

    host linux1 2
    host linux2 2
    host linux3 2
    host linux4 2
    host linux5 2

To use 32 CPUs on a single-image system like an SGI Altix, a single line can be used:

    host altix 32

3.3.2 PVM Scaling

PVM jobs involve a certain amount of network traffic: multi-conformer molecules are sent from the master to the slaves, and results are sent from the slaves back to the master. The single FRED slave jobs are relatively fast compared to the network I/O time, so for scaling beyond 128 CPUs, additional effort is required in setting up a job.

As of version 2.0, Omega can write rotor-offset-compressed OEB files (via the -roc command-line flag) that greatly reduce the I/O on the master and allow FRED to scale to a larger number of CPUs.
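The hosts file format described in section 3.3.1 (the word "host", a hostname, and a CPU count on each line) is simple enough to generate from a script. The Python sketch below is an illustration only; the function name pvm_hosts_lines is hypothetical, and the validation it performs is an assumption rather than something FRED or PVM requires.

```python
def pvm_hosts_lines(hosts):
    """Build the lines of a hosts file for FRED's -pvmconf option.

    hosts is an iterable of (hostname, cpu_count) pairs. Each output line
    has the form "host <hostname> <cpu_count>".
    """
    lines = []
    for name, cpus in hosts:
        # Sanity checks chosen for this sketch; they are assumptions.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid hostname: {name!r}")
        if int(cpus) < 1:
            raise ValueError(f"CPU count must be positive for {name}")
        lines.append(f"host {name} {int(cpus)}")
    return lines


# Five dual-CPU nodes named linux1 .. linux5, as in the cluster example.
cluster = [(f"linux{i}", 2) for i in range(1, 6)]
hosts_file_text = "\n".join(pvm_hosts_lines(cluster)) + "\n"
```

Writing hosts_file_text to a file and passing that file name to -pvmconf reproduces the five-node example; a single entry such as ("altix", 32) produces the single-image case.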


CHAPTER FOUR

Theory

4.1 Input

FRED takes the following input:

Multiconformer ligand database
    This is a file or files with multiconformer expansions of the ligands FRED will dock. FRED has been validated using conformer databases produced with the OpenEye conformer generator OMEGA. Other conformer generators that produce output in a format readable by FRED are also permissible.

Receptor file
    This is a special type of .oeb file (OEB is OpenEye's standard molecule file format) which describes the receptor site to FRED.

    A receptor file always contains the following:

    1. The structure of the target protein.
    2. The location of the receptor site on the protein.
    3. A shape potential grid that describes the shape complementarity of the active site.

    Additionally a receptor file may contain:

    1. One or two isocontour levels of the shape potential grid which create shapes complementary to the active site.
    2. The structure of a known bound ligand.
    3. Docking constraints.

    The receptor file can be created by using the FRED command line executable or a stand-alone GUI (fred_receptor) which is designed to create and edit receptor files.

Command line arguments
    These can also be entered in a parameter file.


4.2 Docking

By default FRED returns a single docked structure for each molecule in the input database. FRED uses the following process to generate this pose within the active site defined by the user.

1. Exhaustive Docking
    (a) Enumerate all possible poses of the ligand around the active site by rigidly rotating and translating each conformer within the site.
    (b) Filter the resulting pose ensemble by rejecting poses that do not fit within the larger of the two volumes specified by the receptor file's shape potential grid and a contour level (also referred to as the outer contour).
    (c) Filter the resulting pose ensemble by rejecting poses that do not have at least one heavy atom within the smaller of the two volumes specified by the receptor file's shape potential grid and a contour level (also referred to as the inner contour).
    (d) Filter the pose ensemble by rejecting poses that do not match any user-defined docking constraints.
    (e) Rank all remaining poses using either the Shapegauss, PLP, Chemgauss2, Chemgauss3 or CGO scoring functions (described in section 4.3). Retain the N top scoring candidate poses and discard the rest (by default N is 100).
2. Perform a systematic solid body optimization of the top ranked candidate poses using either Shapegauss, PLP, Chemgauss2, Chemgauss3, CGO, CGT, Chemscore, OEChemscore or Screenscore.
3. Rank poses via the Consensus Structure method and discard all but the top ranked pose, unless the user requests that FRED retain alternate poses (in which case as many alternate poses as requested are retained, up to the number of candidate poses).
4. Force Field refinement
    (a) Do a full coordinate optimization of the pose vs. the MMFF force field. If alternate poses were retained in the previous step, those poses are refined as well.
    (b) Check that the refined pose passes the user-defined constraints (if any are specified), and discard the pose if it does not.

Note: The force field refinement step is skipped by default.

These steps are described in more detail in this section.

4.2.1 Exhaustive Docking

The purpose of Exhaustive Docking is to take a multiconformer ligand and generate N candidate poses of the ligand within the receptor site (by default N is 100).
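Steps 1(b), 1(c) and 1(e) of the process above can be sketched in Python as follows. This is an illustration under stated assumptions, not FRED's implementation: the contour volumes are stood in for by simple spheres (in FRED they are isocontours of the shape potential grid), a lower score is assumed to be better, and the names sphere, passes_contour_filters and top_candidates are hypothetical.

```python
import heapq


def sphere(center, radius):
    """Return a point-in-volume test for a sphere (a stand-in for a contour)."""
    cx, cy, cz = center
    r2 = radius * radius

    def contains(point):
        x, y, z = point
        return (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2

    return contains


def passes_contour_filters(heavy_atoms, in_outer, in_inner):
    """Every heavy atom must lie in the outer volume, at least one in the inner."""
    return (all(in_outer(a) for a in heavy_atoms)
            and any(in_inner(a) for a in heavy_atoms))


def top_candidates(poses, score, n=100):
    """Keep the n best poses, assuming a lower score is better."""
    return heapq.nsmallest(n, poses, key=score)


# Toy volumes and poses (coordinates in Angstroms, all hypothetical).
outer = sphere((0.0, 0.0, 0.0), 5.0)
inner = sphere((0.0, 0.0, 0.0), 1.0)
pose_ok = [(0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]       # passes both filters
pose_no_inner = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]  # no atom in inner volume
pose_outside = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)]   # one atom outside outer volume
```

The asymmetry between the two tests (all atoms for the outer volume, any atom for the inner volume) is the point of the sketch: the outer contour excludes poses that stick out of the site, while the inner contour requires the pose to reach into its core.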


Enumerate all possible poses

The first step of the docking process is the generation of the pose ensemble. The ensemble is constructed by enumerating rigid rotations and translations of each conformer within the active site. The translations are generated by systematically translating the conformer within the active site using a specified step size. Rotations are generated such that a single rotational step does not displace any atom by more than a specified rotational step size.

The default translational step size is 1 Angstrom and the default rotational step size is 1.5 Angstroms, which gives roughly a 1 Angstrom RMSD change for any rotational step. The ensemble size can range from tens of millions of poses, in the case of large active sites and highly flexible molecules, to a few thousand poses for small enclosed sites and rigid molecules.

Inner and Outer Contour Filter

The pose ensemble is filtered by an inner and an outer contour filter. These filters reject poses that do not have sufficient shape complementarity to the protein's active site. All heavy atoms of the pose must fit within the outer contour and at least one heavy atom of the pose must fit within the inner contour.

Both complementary volumes are created by taking two isocontours, at different contour levels, of the same shape potential grid contained in the receptor file. The outer contour level is generally low, resulting in a large volume (typically around 1500 cubic Angstroms), while the inner contour level is high, resulting in a small volume (typically around 50 cubic Angstroms). More pose atoms will fit within the large outer contour than within the inner contour; however, to satisfy the outer contour filter every heavy atom of the pose must fit within the outer volume, while satisfying the inner contour filter requires only that one heavy atom fit within the inner volume.

Note that both the inner and outer contour filters can be disabled by the user. In this case FRED will filter poses based on clashes with the protein structure. You can also enable clash checking in combination with these filters by using the -clash_scale parameter, although in general this is not required as the outer contour filter has a shape that rejects clashes.

User defined constraint filter

FRED filters the pose ensemble with two types of user-defined constraints: protein constraints and custom constraints. Constraint information is stored within the receptor file and there can be any number of each constraint type. Any pose that does not meet all of the user-specified constraints is removed from the pose ensemble.

Custom constraints
    These are user-defined constraints that use spheres and associated SMARTS patterns to specify regions within the active site where certain chemical functionality is required. FRED will reject any poses that do not match this functionality. Each custom constraint is referred to as a constraint feature.

    Each constraint feature consists of one or more spheres, and optionally a list of SMARTS patterns. A feature without SMARTS patterns will be satisfied if any heavy atom of the pose falls within one of the feature's spheres. If the feature has SMARTS pattern(s), only atoms which match the SMARTS pattern(s) can satisfy the constraint.

Protein constraints
    These are new in FRED 2.2 and can only be set up with the receptor setup GUI. Protein constraints are placed on individual protein atoms and come in three basic types:

    Hydrogen Bond constraints
        These tell FRED that a pose must make a hydrogen bond interaction with the specified protein atom to pass the constraint filter. These constraints have more geometric specificity than custom constraints (they are inherently directional), and can be specified as either acceptor or donor constraints (or both). They function by recognizing "lone pair" and "polar hydrogen" positions around acceptor and donor atoms respectively, and require that one of the acceptor's "lone pair" positions is within 1.0 Angstrom of the donor's "polar hydrogen" position. Note that the actual position of a donor's polar hydrogen is not used; rather, FRED generates its own set of likely polar hydrogen positions.

    Metal constraints
        These tell FRED that a pose must make a coordinating interaction with the metal the constraint is placed on to pass the constraint filter (metals are treated as being part of the protein by FRED). These constraints have more geometric specificity than custom constraints. Similar to hydrogen bond constraints, metal constraints work by defining "coordinating positions" around a coordinating atom, and requiring that a "coordinating position" be within 1.0 Angstrom of the metal.

    Contact constraints
        These tell FRED that a pose must make a contact interaction with the atom the constraint is placed upon to pass the constraint. Contact is defined as having a heavy atom within 4 Angstroms of the protein atom the constraint is placed on. These constraints have the same geometric specificity as custom constraints (i.e. making a custom constraint with a 4 Angstrom radius centered on the protein atom will perform the same function).

Mini Constraint F.A.Q.

Which constraint type should I use?
    In general protein constraints are designed for simplicity and constrain the geometry of hydrogen bond and metal interactions more realistically than custom constraints can. Custom constraints, on the other hand, are more flexible in the sense that they allow users to specify, via SMARTS patterns, the chemistry required to satisfy the constraint.

Is there a maximum number of constraints I can use?
    There is no direct limit on the number of constraints that the user can specify; however, there is a modest increase in memory requirements for each additional constraint specified. FRED is very efficient at restricting the search space of possible poses based on user-defined constraints. Accordingly, adding constraints will generally decrease run time.

Why do some of the poses FRED is generating violate my constraints?
    The constraint spheres are mapped onto a grid during the docking process, and the resulting interpolation error can allow atoms slightly outside a sphere (by approximately half the translational step size, or 0.5 Angstroms by default) to satisfy a constraint.

Ranking

All poses of the ensemble that pass the previous filtering steps are scored by the exhaustive scoring function (see flag -exhaustive_scoring), which can be either Chemgauss3 (the default, see section 4.3.5), Chemgauss2 (see section 4.3.4), Shapegauss (see section 4.3.2), PLP (see section 4.3.3) or CGO (see section 4.3.10). The poses are then ordered by score and the top N scoring poses are retained (N is 100 by default, see also the flag -num_poses). Accordingly, this list of N poses may contain several different poses for the same conformer and does not necessarily contain a pose for any given input conformer.

4.2.2 Optimization

The top ranked poses from Exhaustive Docking (by default 100) may be optimized using a systematic solid body optimization against either Shapegauss, PLP, CGO, CGT, Chemgauss2, Chemgauss3, Chemscore, OEChemscore or Screenscore. Alternatively the optimization may be skipped (by default Chemgauss3 optimization is used). The systematic solid body optimization is done by rigidly rotating and translating the molecule at half the step size used in Exhaustive Docking (section 4.2.1). One positive and one negative step is taken in each translational and rotational direction. Each of the three translational and three rotational degrees of freedom therefore takes one of three values (negative step, no step, positive step), so 3^6 = 729 (i.e. 27 x 27) poses are tested, from which the optimal one is selected.

4.2.3 Consensus Structure

The poses returned from exhaustive docking (and optional optimization) are scored by one or more scoring functions (Shapegauss, PLP, Chemgauss, Chemgauss2, Chemscore, Screenscore, CGO or CGT). For each scoring function a list of the poses is created, ordered by rank. Each pose is then assigned a consensus structure score equal to the sum of that pose's ranks in each list. The pose with the top consensus structure score is then retained and all other poses are discarded, unless the user has requested that alternate poses be saved (see flag -num_alt_poses), in which case poses are ordered by consensus structure score and a number of poses up to one plus the number of alternate poses requested are passed on.

Note that the user can specify the weights that different scoring functions are given in this calculation. See the flags -pose_select_weight_xxx, where xxx is the name of the scoring function.

4.2.4 Force Field Refinement

Optionally, after consensus structure selection, poses can be refined using the Merck Molecular Mechanics Force Field. The refinement is a full coordinate optimization of all ligand atoms (the protein is held rigid). This step is optional and very CPU intensive, and it is only recommended when using the Zapbind scoring function, which is very sensitive to small atom-atom clashes.

User-defined constraints (see section 4.2.1) are ignored during the refinement process. However, by default any poses that violate the constraints after the refinement process are discarded (if this is the only pose, the entire molecule will be discarded and will not appear in the output hitlists at all).

Note that by default Force Field Refinement and the constraint re-checking are skipped; see the flag -refine if you wish to turn this step on.
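The rank-sum selection described in section 4.2.3 can be sketched in Python as follows. This is an illustration, not FRED's code. It assumes that a lower raw score is better, that rank 1 is the best rank, that the best consensus structure score is the lowest weighted rank sum, that weights multiply ranks, and that ties are broken by pose order. The function names and the example scores are hypothetical.

```python
def consensus_structure_scores(scores_by_function, weights=None):
    """Return one weighted rank sum per pose.

    scores_by_function maps a scoring function name to a list of raw
    scores, one per pose, with poses in the same order in every list.
    """
    if not scores_by_function:
        raise ValueError("at least one scoring function is required")
    weights = weights or {}
    n_poses = len(next(iter(scores_by_function.values())))
    totals = [0.0] * n_poses
    for name, scores in scores_by_function.items():
        if len(scores) != n_poses:
            raise ValueError(f"{name} has a score list of the wrong length")
        # Pose indices ordered best-first (lower raw score assumed better).
        order = sorted(range(n_poses), key=lambda i: scores[i])
        w = weights.get(name, 1.0)
        for rank, pose_index in enumerate(order, start=1):
            totals[pose_index] += w * rank
    return totals


def select_poses(scores_by_function, num_alt_poses=0, weights=None):
    """Return indices of the top pose plus up to num_alt_poses alternates."""
    totals = consensus_structure_scores(scores_by_function, weights)
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    return order[: 1 + num_alt_poses]


# Three candidate poses scored by three functions (hypothetical numbers).
example_scores = {
    "chemgauss3": [-50.0, -55.0, -40.0],
    "plp": [-35.0, -30.0, -20.0],
    "shapegauss": [-400.0, -380.0, -390.0],
}
example_totals = consensus_structure_scores(example_scores)
```

In the example, pose 1 is preferred by one function but pose 0 ranks first in the other two, so pose 0 has the lowest rank sum and is the one retained when no alternate poses are requested.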


4.3 Scoring

The final stage of docking a ligand is to score the pose or poses generated by the docking steps described above.

4.3.1 Overview

The following structure-based scoring functions are available in FRED. These scoring functions also have MASC variants; see section 4.3.12.

Shapegauss
    A shape-based scoring function that uses smooth Gaussian functions to represent the shapes of molecules. Details of this scoring function can be found in reference [4].

PLP
    The Piecewise Linear Potential, which is described in detail in reference [7].

Chemgauss2
    Version 2 of the Chemgauss scoring function, which uses smooth Gaussian functions to represent the shape and chemistry of molecules. Chemgauss2 has been superseded by the newer Chemgauss3 in the new version of FRED.

Chemgauss3
    Version 3 of the Chemgauss scoring function, which uses smooth Gaussian functions to represent the shape and chemistry of molecules.

Chemscore
    Described in reference [8]. This implementation of Chemscore adheres as faithfully as possible to the referenced paper.

OEChemscore
    An OpenEye variant of Chemscore which is similar to the Chemscore implementation found in the original 1.2.x version of FRED.

Screenscore
    Described in reference [9].

Zapbind
    A scoring function which uses PB electrostatic calculations in combination with an area contact term.

The following ligand-based scoring functions are also available in FRED. Use of these scoring functions requires that the receptor file contain a bound ligand.

CGO
    Chemical Gaussian Overlay. This scoring function represents molecular shape and chemistry with smooth Gaussian functions, as the Chemgauss functions do. However, this function scores a pose by measuring how well it overlays onto the bound ligand, rather than scoring against the protein structure.

CGT
    Chemical Gaussian Tanimoto. Similar to CGO, but calculates a Tanimoto overlay rather than a volume overlay.


The following table gives a quick overview of the interactions the various scoring functions in FRED are aware of. None of the functions have intramolecular terms. This is because FRED relies on the conformer generator that provides its input conformer database to ensure that all the input conformers have reasonable geometries and do not have significant intramolecular contacts and/or high strain energies.

    Scoring function   Shape   Hydrogen Bonds   Metal   Aromatic   Desolvation
    Shapegauss         Yes     No               No      No         No
    PLP                Yes     Yes              Yes     No         No
    Chemgauss2         Yes     Yes              Yes     Yes        No
    Chemgauss3         Yes     Yes              Yes     No         Yes
    Chemscore          Yes     Yes              Yes     No         No
    OEChemscore        Yes     Yes              Yes     No         No
    Screenscore        Yes     Yes              Yes     No         No
    Zapbind            Yes     No               No      No         Yes
    CGO                Yes     Yes              Yes     No         No
    CGT                Yes     Yes              Yes     No         No

    Table 4.1: Summary of the interactions scoring functions are aware of

The scoring functions also vary in their speed, and only the faster functions can be used in the Exhaustive Docking and Optimization stages, as detailed in the following table. All the scoring functions can be used to give the final score for a pose.

    Scoring function   Speed       Exhaustive Docking   Optimization
    Shapegauss         Very Fast   Yes                  Yes
    PLP                Very Fast   Yes                  Yes
    Chemgauss2         Fast        Yes                  Yes
    Chemgauss3         Fast        Yes                  Yes
    Chemscore          Medium      No                   Yes
    OEChemscore        Medium      No                   Yes
    Screenscore        Medium      No                   Yes
    Zapbind            Slow        No                   No
    CGO                Fast        Yes                  Yes
    CGT                Medium      No                   Yes

    Table 4.2: Summary of speed and places scoring function can be used

4.3.2 Shapegauss

Reference

This scoring function is described in detail in Ref [4].


Typing

All heavy atoms are typed as steric atoms. Hydrogen atoms are ignored.

Components

There is only one component of Shapegauss, the shape score.

Description

The Shapegauss scoring function represents all atoms as smooth Gaussian functions. A pairwise potential between ligand and protein atoms is applied that attempts to maximize their surface contact and minimize their volume overlap (i.e., the potential is most favorable when the atoms are touching but not overlapping). A correction term is then applied to further penalize atoms which significantly overlap the protein.

4.3.3 PLP

Reference

The Piecewise Linear Potential is an implementation of the scoring function described in Ref [7].

Typing

The following atom types are recognized by PLP.

Donor
    Hydrogen bond donors: primary and secondary amines.
Acceptor
    Hydrogen bond acceptors: oxygen and nitrogen atoms with no bound hydrogens.
Hydroxyl
    Hydroxyl groups are treated as both acceptors and donors.
NonPolar
    Carbon, chlorine, fluorine, bromine, iodine, and nitrogen or sulfur with more than two attached hydrogens.
Sulfur
    Sulfurs with fewer than two attached hydrogen atoms.
Metal
    Iron, magnesium or zinc.

Components

The total PLP score is a sum of the following components.


NonPolar
    Interactions of all ligand non-polar atoms.
Hydrogen Bond
    Interactions of all ligand acceptors and donors.
Sulfur
    Interactions of all ligand sulfurs.
Metal
    Interactions of all ligand metals.

Description

PLP is a heavy atom scoring function, meaning all potentials are based on distances from heavy atom centers (i.e., hydrogen position is irrelevant, although the presence or absence of hydrogen is not, as it can affect the atom typing). The PLP implementation in FRED adheres to the reference as faithfully as possible, with the caveat that the implementation in FRED has been extended to include favorable interactions between acceptor and metal atoms.

4.3.4 Chemgauss2

Overview

Chemgauss2 has been deprecated and will likely be removed in the next minor (not bugfix) release of FRED. Users are encouraged to use the new Chemgauss3 scoring function in place of Chemgauss2.

Typing

The following atom types are recognized by Chemgauss2.

Strong Hydrogen Bond Acceptors are defined as any of the following:
    1. Oxygens with two single bonded heavy atoms.
    2. Oxygens double bonded to a carbon.
    3. Oxygens single bonded to a carbon that are part of a carboxylic acid group.
    4. Triple bonded nitrogens without an attached hydrogen.
    5. Non-aromatic nitrogens with no attached hydrogens and two single bonds, one of which is to a carbon and another to carbon, sulfur or nitrogen.
    6. Nitrogens with one single bond to another heavy atom and with 1 or 2 attached hydrogens.
    7. Oxygens double bonded to phosphorus.
    8. Oxygens single bonded to a metal.

Weak Hydrogen Bond Acceptors are defined as any of the following:
    1. Aromatic nitrogens with no attached hydrogens.


    2. Oxygens single bonded to a carbon that are not part of a carboxylic acid group.
    3. Oxygens double bonded to sulfur.
    4. Sulfurs single bonded to a carbon.
    5. A sulfur with two single bonded carbons and no attached hydrogens.

Strong Hydrogen Bond Donors are defined as any of the following:
    1. Non-aromatic nitrogens with two single bonds to heavy atoms and one attached hydrogen.
    2. Nitrogens with two attached hydrogens and one single bond to a carbon.

Weak Hydrogen Bond Donors are defined as any of the following:
    1. Aromatic nitrogens with 1 attached hydrogen.
    2. Non-aromatic nitrogens with 3 single bonds to carbons and one attached hydrogen.
    3. Oxygens with an attached hydrogen and a single bond to a carbon.

Aromatic
    Heavy atoms in an aromatic ring.
Metal
    Any heavy atom except He, B, C, N, O, F, Ne, Si, P, S, Cl, Ar, As, Se, Br, Kr, Te, I, Xe, At, and Rn.
Small Non Polar
    Fluorines, oxygens and nitrogens that are not also hydrogen bonding atoms.
Large Non Polar
    Iodines, sulfurs and phosphorus atoms that are not also hydrogen bonding atoms.
Medium Non Polar
    Any heavy atom that does not fit one of the above types.

In addition to these atom centered types, Chemgauss also determines the following positions around a molecule:

Polar Hydrogens
    One or more possible positions for a hydrogen involved in a hydrogen bond. Note that this can be different from the explicit position of the hydrogen atom.
Lone Pairs
    One or more possible positions around a hydrogen bond acceptor for a polar hydrogen from a donor.
Pi electron positions
    Pi electron positions of an aromatic atom above and below the plane of the aromatic ring.

Components

The Chemgauss2 function is the sum of the following potentials, all of which are based on smooth Gaussian functions:

Shape
    Shape based interactions between all heavy atoms.


Hydrogen Bond
    Hydrogen bonding interactions based on favorable interactions between polar hydrogens and lone pairs and a mild repulsion between donor heavy atoms and acceptor heavy atoms (which tends to make the hydrogen bonds linear).
Aromatic
    Aromatic ring interactions based on favorable interactions between aromatic atoms and the pi-electron positions plus repulsive aromatic atom to aromatic atom and pi-electron to pi-electron interactions.

Description

The Chemgauss scoring function combines the Shapegauss scoring function with additional potentials between chemically matched positions around the ligand pose. These chemically complementary positions are generally not also atom positions, but rather are placed near specific functional groups. For instance, acceptors have "lone pair" positions around them which denote positions where a polar hydrogen could be placed to create a hydrogen bonding interaction. Similarly, donors have "polar hydrogen" positions, which denote positions the donor's hydrogen could be in. For simple donors without rotatable bonds these positions correspond to the actual polar hydrogen position, but for rotatable bonds such as hydroxyls there are several positions representing the ring of possible positions for the polar hydrogen. A favorable hydrogen bond score is obtained when a "polar hydrogen" position on one molecule overlaps a "lone pair" position on another molecule.

4.3.5 Chemgauss3

Typing

Chemgauss3 recognizes the following heavy atom types (a single atom may have multiple types):

Steric
    All heavy atoms are typed as steric.
Acceptor
    Acceptors are classified as strong, moderate or weak in strength as follows:
    Strong: Phosphate, NOxide, Carboxylate, Het6N, Phosphinyl and Oxyanion.
    Moderate: Water, Sulfoxide, Primary Amine, Het5N, Thiocarbonyl, Sulfate, Tertiary Amine, Amide, Carbamate and Urea.
    Weak: Nitrile, Ketone, Ester, Nitro, Het5O, Imine, Phenol, Hydroxyl, Sulfone, Primary Aniline, Secondary Amine and Ether.
Donor
    Donors are classified as strong, moderate or weak as follows:
    Strong: Primary Amine NpH, Secondary Amine NpH, Tertiary Amine NpH, Amidine NpH, Guanidine NpH, Het5NH and AcidOH.
    Moderate: Water, Primary Amide, AnilineNH, AmidineNH, Secondary Amide, Aniline NH2 and Hydroxyl.


    Weak: Hydrazine NH, Imine NH, Phenyl OH, Primary Amine and Secondary Amine.
Coordinating Groups
    Carboxylate, Oxanion, PyridineN, SulfonamideNAnion, AromNAnion, Thianion and Hydroxamate are considered coordinating groups.
Metals
    Calcium, Magnesium and Zinc.

In addition to these heavy atom types, Chemgauss3 also recognizes positions around some of the heavy atoms. The positions are not required to be located at atom centers (they generally are not) and are used to represent the directionality of certain interactions. These positions are typed as follows.

Lone Pairs
    Placed around acceptor heavy atoms. They represent places where a polar hydrogen from a donor could be placed to form a hydrogen bond interaction with the acceptor.
Polar Hydrogen
    Placed around donor heavy atoms. They represent possible positions of the donor's polar hydrogens. These positions will often correspond to the position of the actual polar hydrogen attached to the donor heavy atom; however, this is not required. For example, in the case of a hydroxyl there are six polar hydrogen positions used to represent the ring of possible positions the attached hydrogen can be in.
Water Positions
    Placed around both donor and acceptor atoms. They represent positions where a solvent water can make a hydrogen bonding interaction with the donor or acceptor.
Chelator coord
    Located around metal-binding atoms. They represent the positions where a metal could be placed to form a coordinating interaction.

Note that the presence or absence of hydrogen can affect how a heavy atom is typed. For example, an oxygen with an attached hydrogen may be classified as a donor, but it would not be classified as a donor if the hydrogen were removed. However, if the hydrogen is present its position is ignored (i.e., polar hydrogen positions will be created for donors, but the actual hydrogen's position is not used in that calculation).

Components

The final Chemgauss3 score is a sum of the following components:

Steric
    This component is based on the number of protein heavy atoms that contact heavy atoms of the ligand, with a correction term. The base potential accounts for two effects. The first is the increase in VdW type interactions when the ligand docks to the active site. The second is the protein desolvation energy from displacing water from the active site into solvent, ignoring any favorable hydrogen bonding interactions water could make with the site. The correction term of this component accounts for the favorable interactions waters can make with the site, by applying a penalty to lipophilic ligand heavy atoms placed near acceptors or donors of the active site. Note that in principle this desolvation penalty should be applied to any heavy atom of the ligand, not just lipophilic ones; however, in practice we have found this correction to be ineffective when applied to polar atoms.


Acceptor
    Measures the interactions acceptors on the ligand are making with donors on the protein.
Donor
    Measures the interactions donors on the ligand are making with acceptors on the protein.
Metal
    Measures the interactions coordinating atoms on the ligand are making with metals in the active site.
Desolvation
    A penalty assessed when donors and acceptors on the ligand are blocked from interacting with solvent waters by the active site.

Description

All Chemgauss3 scoring function interactions are created from a base function that is smoothed by convolution with a Gaussian function. The base functions for the various interactions are described below.

Steric
    A combination of a clash step function and two hard sphere potentials (representing short and long range Van der Waals interactions). This setup is designed to roughly approximate a VdW potential.
Hydrogen bond
    A hard sphere function based on the distance between a donor "polar hydrogen" position and an acceptor "lone pair" position. The function has a constant favorable value if the distance is less than 1.0 Angstrom and zero otherwise.
Metal
    A hard sphere function based on the distance between a "chelator coordinate" of the ligand and a metal on the protein. The function has a constant favorable value if the distance is less than 1.0 Angstrom and zero otherwise.
Ligand Desolvation
    A step function based on the distance between a "water position" of the ligand and the active site surface. Any water position within the active site surface (i.e., that clashes with the protein) is assessed a constant penalty. This effectively penalizes the ligand for breaking hydrogen bonds with solvent upon binding.
Protein Desolvation
    An estimation of the chemical potential of water within the active site. Areas where water can make multiple hydrogen bonds are more favorable than those where it can form fewer hydrogen bonds.
Aromatic
    A hard sphere function based on the distance between "ring negative" and "ring positive" positions. The function has a constant favorable value if the distance is less than 1.0 Angstrom and zero otherwise.

4.3.6 Chemscore

Reference

The Chemscore scoring function is an implementation of the scoring function described in Ref [8].


Typing

See reference [8].

Components

The Chemscore score is a sum of the following components:

Lipophilic
    Interaction between lipophilic atoms.
Hydrogen Bond
    Interactions between donors and acceptors.
Metal
    Interactions between metals and acceptors (which are considered chelators for the purposes of this term).
Clash
    Penalty for clashes between ligand and protein.
Frozen Rotatable Bond
    Penalty for loss of entropy due to rotatable bonds that can no longer rotate upon binding to the active site.

Description

When this scoring function is used, the position of hydrogens involved in hydrogen bonding is optimized with respect to the hydrogen bond energy (this is done for both protein and ligand donors). However, no optimization of heavy atoms on either the protein or ligand is done.

4.3.7 OEChemscore

This scoring function is identical to the Chemscore scoring function (see section 4.3.6), except that it lacks the rotatable bond term and has slightly different atom typing rules.

4.3.8 Screenscore

Reference

The Screenscore scoring function is an implementation of the scoring function described in Ref [9].

Typing

See reference [9].


Components

The Screenscore score is a sum of the following components:

Lipophilic
    Steric interactions of lipophilic atoms on the ligand.
Ambiguous
    Steric interactions of ambiguous atoms on the ligand.
Clash
    Penalty for clashes with the protein.
PLP
    A steric contribution identical to the steric component of PLP.
Hydrogen Bond
    Interactions between acceptors and donors.
Metal
    Interactions between metals and acceptors (which are treated as coordinating atoms for the purposes of this term).
Aromatic
    Interactions between phenyl groups and methyls, amides or hydrogens on aromatic rings.
Rotatable Bond
    A penalty term proportional to the number of rotatable bonds the ligand has.

4.3.9 Zapbind

Reference

See reference [3] for information about the Zap PB method.

Typing

Zapbind uses the charge, radius and position of all atoms in the system. It does not type them further as the other scoring functions do.

Components

The Zapbind function is a sum of the following components.

ZAP
    Electrostatic binding energy from a ZAP (PB) calculation.
AREA
    Buried area contribution term.


Description

Zapbind is a combination of a surface area contact term and an electrostatic interaction calculated using the Poisson-Boltzmann solvent approximation. The surface area is calculated using a Gaussian-based method, while the PB energy is calculated using ZAP (see Ref [3]).

While not required, it is highly recommended that refinement against the Merck Molecular Mechanics Force Field be done when using this scoring function (this is done by setting "-refine lig mmff"). Electrostatic calculations are extremely sensitive to the exact position of the ligand, and thus require highly refined structures.

4.3.10 CGO

CGO (short for Chemical Gaussian Overlay) is a ligand-based scoring function. It measures a pose's fitness (i.e., scores it) by testing how well the pose overlays a known bound ligand, rather than how well it complements the active site.

Typing

CGO recognizes the following heavy atom types (note that a single atom is allowed to have multiple types):

Steric
    All heavy atoms are typed as steric.
Acceptor
    Acceptors are classified as strong, moderate or weak in strength as follows:
    Strong: Phosphate, NOxide, Carboxylate, Het6N, Phosphinyl and Oxyanion.
    Moderate: Water, Sulfoxide, Primary Amine, Het5N, Thio Carbonyl, Sulfate, Tertiary Amine, Amide, Carbamate and Urea.
    Weak: Nitrile, Ketone, Ester, Nitro, Het5O, Imine, Phenol, Hydroxyl, Sulfone, Primary Aniline, Secondary Amine and Ether.
Donor
    Donors are classified as strong, moderate or weak as follows:
    Strong: Primary Amine NpH, Secondary Amine NpH, Tertiary Amine NpH, Amidine NpH, Guanidine NpH, Het5NH and AcidOH.
    Moderate: Water, Primary Amide, AnilineNH, AmidineNH, Secondary Amide, Aniline NH2 and Hydroxyl.
    Weak: Hydrazine NH, Imine NH, Phenyl OH, Primary Amine and Secondary Amine.
Coordinating Atoms
    Carboxylate, Oxanion, PyridineN, SulfonamideNAnion, AromNAnion, Thianion and Hydroxamate are considered coordinating atoms.
Metals
    Calcium, Magnesium and Zinc.


Aromatic
    Heavy atoms in 5 and 6 member aromatic rings.

In addition to the heavy atom types, CGO also recognizes positions around some of the heavy atoms. The positions are not required to be located at atom centers (they generally are not) and are used to represent the directionality of certain interactions. These positions are typed as follows:

Lone Pairs
    Placed around acceptor heavy atoms. They represent places where a polar hydrogen from a donor could be placed to form a hydrogen bond interaction with the acceptor.
Polar Hydrogen
    Placed around donor heavy atoms. They represent possible positions of the donor's polar hydrogens. These positions will often correspond to the position of the actual polar hydrogen attached to the donor heavy atom; however, this is not required. For example, in the case of a hydroxyl there are six polar hydrogen positions used to represent the ring of possible positions the attached hydrogen can be in.
Coordinating
    Located around metal-binding atoms. They represent the positions in which a metal could be placed to form a coordinating interaction.
Ring positive
    Placed roughly on the position of the hydrogen attached to the aromatic heavy atom. They represent areas of slight positive charge around an aromatic ring system.
Ring negative
    Two ring negative positions are placed 3 Angstroms above and below each aromatic ring center. They represent areas of slight negative charge around aromatic ring systems. Unlike other extended positions, ring negative positions are associated with several heavy atoms (the aromatic ring) rather than just one.

Note that the presence or absence of hydrogen can affect how a heavy atom is typed. For example, an oxygen with an attached hydrogen may be classified as a donor, but it would not be classified as a donor if the hydrogen were removed. However, if the hydrogen is present its position is ignored (i.e., polar hydrogen positions will be created for donors, and the hydrogen's actual position is not used in that calculation).

Components

The CGO score is a sum of the following components:

Shape
    Measures the overlay between the shape of the pose and the shape of the bound ligand.
Acceptor
    Measures the overlay between "lone pair" positions on the pose and those on the bound ligand. Only "lone pair" positions on acceptors that are making a hydrogen bond interaction with the protein are considered in this calculation.
Donor
    Measures the overlay between "polar hydrogen" positions on the pose and those on the bound ligand. Only "polar hydrogen" positions on donors that are making a hydrogen bond interaction with the protein are considered in this calculation.


Chelator
    Measures the overlay between the "coordinating" positions of the pose and those of the bound ligand. Only "coordinating" positions of chelators that are making a metal interaction with the protein are considered in this calculation.
Aromatic
    Measures the overlay between the "ring positive" and "ring negative" positions on the pose and those on the bound ligand. By default this term is disabled.

Description

CGO is essentially the ligand-based design version of the Chemgauss3 scoring function. The overlay of any two positions is calculated with the following formula.

\[ \text{Overlay} = \int g_1 * g_2 \qquad (4.1) \]

where g_1 and g_2 are Gaussian functions centered on positions one and two respectively.

4.3.11 CGT

CGT is identical to CGO (see section 4.3.10), except that it converts the overlay information into a Tanimoto similarity.

Note that the total CGT score is not a sum of the components, but rather the components are a measure of the similarity of the respective positions alone.

4.3.12 Multiple Active Site Correction

Multiple Active Site Correction, or MASC, is not a scoring function, but rather a method for correcting for systematic errors in any scoring function by comparing the score of each ligand in the target protein to the same ligand's score in several standard protein targets (see ref [11]). The MASC score correction is

\[ \text{Corrected} = \frac{\text{Uncorrected} - \text{Average}}{\text{StandardDeviation}} \qquad (4.2) \]

where Uncorrected is the uncorrected score of the ligand in the current protein target, while Average and StandardDeviation are calculated for each ligand based upon its scores in a set of reference protein targets. Qualitatively, the MASC corrected score measures how much better the ligand is scoring against the target protein than it does against a typical protein.

This is a computationally expensive docking strategy to set up, because the ligand must be docked into each of a set of standard protein targets (the original MASC publication used 12, see Appendix D for a listing). However, dockings to the standard proteins can be done independently of any target protein for a given ligand dataset and the precalculated values stored in the ligand file. Once this is done the MASC method can be used on any protein target with a negligible decrease in docking speed.

There are MASC variants of each of the scoring functions in FRED, see section 4.3.

4.4 Output

FRED's primary outputs are hitlist(s) of top scoring molecules (FRED also outputs several auxiliary files during the course of the run, see section 5.7 for a complete list of output files).

4.4.1 Docked molecules

A hitlist will be maintained for each of the scoring functions the user selected. If multiple scoring functions are selected, FRED will also maintain a consensus scoring hitlist using all the scoring functions selected. If no scoring functions are selected by the user, FRED will default to using Chemgauss3 scoring.

Each hitlist can be output to up to 4 files, each with a different representation of the hitlist. Those are:

Structure
    This file contains the conformer and location of the top scoring pose of each ligand, sorted by score. The format is one of the standard molecular file formats supported by FRED (see section B), which can be selected using the -oformat flag. The name of the file is derived from docked and includes the name of the scoring function and the file extension corresponding to the file format.
    For file formats that support storing scoring information (.oeb and .sdf at the time of this writing) the total score and score components are written into the structure file.
Alternate Structure
    This file contains the structure of the top scoring pose and alternate poses for each ligand. The top scoring ligand's pose and its alternates are listed first, followed by the second best scoring ligand's top pose and its alternates and so on.
    The format is one of the standard molecular formats supported by FRED (see section B), which can be selected using the -oformat flag. The name of the file is derived from alt docked and includes the name of the scoring function and the file extension corresponding to the file format.
    For file formats that support storing scoring information (.oeb and .sdf at the time of this writing) the total score and score components are written into the structure file.
Score
    This text file lists the name, total score, score components and SMILES representation of each ligand's top ranked pose. The name of the file is derived from scores.txt


    The name of the scoring function is included in the file name.
Alternate Score
    This text file lists the name, total score, score components and SMILES representation of each ligand's top ranked pose and alternates. The name of the file is derived from alt scores.txt and includes the name of the scoring function.

Note that the names of these files can be modified by adding a prefix specified with the -prefix flag.

4.4.2 Undocked molecules

Not every molecule that FRED reads can be docked and scored. Molecules that FRED cannot dock have no score or docked structure and thus are not inserted into the output hitlists described in the preceding section. Instead they are written as SMILES to a file named

    undocked_code.smi

where code is an integer return code describing why the molecules in that file failed to dock. The meaning of each return code is as follows.

1 : Not Initialized
    FRED failed to initialize the exhaustive search properly.
2 : Empty Ligand
    The molecule is empty (i.e., has no atoms).
3 : Empty Protein
    The protein molecule is empty (i.e., has no atoms).
4 : Typing Error
    FRED could not properly type the atoms of the molecule.
5 : Grid Setup Error
    FRED could not create the grids it needs for the exhaustive search. Generally this is due to a lack of memory.
6 : Coordinate Error
    There was an error reading one or more of the atom coordinates, or generating extended coordinates for the molecule (i.e., "lone pair" and "polar hydrogen" positions for the Chemgauss scoring functions).
7 : No Constraint Match
    The molecule does not have the required functionality to match one or more constraints (e.g., a constraint requires that a donor on the ligand interact with the protein, but the ligand does not have any donors).
8 : No Valid Poses
    During the exhaustive search, no poses remained after filtering the initial pose ensemble using the inner/outer contours and any specified constraints.
9 : Aborted
    The user aborted the docking process. This can happen in the fred_receptor GUI's trial docking mode, but not with the FRED command line program.
10 : Fredcore Error
    This return code is not used.
11 : Outside Grid
    This return code is not used.


12 : Optimization Error
    Solid body refinement of the molecule failed.
13 : Consensus Error
    The consensus structure stage of docking failed.
14 : Refinement Error
    FRED was unable to refine the molecule. This error generally occurs because the molecule has atoms not recognized by the Merck Molecular Mechanics Force Field.
15 : Invalid Refinement
    Refinement of the molecule caused it to move outside the site box defining the active site, or caused it to violate user specified constraints.
16 : Missing MASC Data
    A MASC variant scoring function was used, but the molecule did not have the required MASC data.
17 : Invalid Score
    An error occurred during scoring that resulted in an invalid floating point value for one of the scores.
18 : Lost
    This error indicates that the molecule was never returned from the slave during a multiprocessor run. This error generally occurs when one of the nodes used in a multiprocessor run goes down or loses contact with the network.

Having undocked molecules does not necessarily indicate that an error has occurred. For example, undocked molecules with return code 8, No Valid Poses, will commonly occur when docking a typical ligand database to a small internal active site, because FRED recognizes that larger molecules simply cannot fit within the active site.
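When post-processing a run, it can be convenient to translate the undocked file names back into the return codes listed above. The following sketch is not part of FRED; it assumes that the integer return code appears literally in place of "code" in undocked_code.smi, and the helper name and the file names in the usage example are hypothetical.

```python
import re

# Return codes and names as listed in section 4.4.2.
RETURN_CODES = {
    1: "Not Initialized", 2: "Empty Ligand", 3: "Empty Protein",
    4: "Typing Error", 5: "Grid Setup Error", 6: "Coordinate Error",
    7: "No Constraint Match", 8: "No Valid Poses", 9: "Aborted",
    10: "Fredcore Error", 11: "Outside Grid", 12: "Optimization Error",
    13: "Consensus Error", 14: "Refinement Error", 15: "Invalid Refinement",
    16: "Missing MASC Data", 17: "Invalid Score", 18: "Lost",
}

# Assumption: the file name ends in undocked_<integer>.smi
_UNDOCKED_RE = re.compile(r"undocked_(\d+)\.smi$")

def describe_undocked_file(filename):
    """Return (code, name) for an undocked-molecule file name.

    Raises ValueError if the name does not match the assumed pattern.
    """
    match = _UNDOCKED_RE.search(filename)
    if match is None:
        raise ValueError("not an undocked file name: %r" % filename)
    code = int(match.group(1))
    return code, RETURN_CODES.get(code, "Unknown")

if __name__ == "__main__":
    for name in ("undocked_8.smi", "undocked_16.smi"):  # hypothetical files
        code, reason = describe_undocked_file(name)
        print("%s: return code %d (%s)" % (name, code, reason))
```

Counting the molecules in each file by reason gives a quick view of whether failures are expected (for example, code 8 for ligands too large for the site) or point to a setup problem.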


CHAPTER FIVE

Running FRED

5.1 Specifying Parameters

FRED is a command line driven program. Parameters are entered as key-value pairs after the executable name on the command line (parameters can also be entered using a parameter file, described later in this section, using the -param flag).

For example:

    fred

The order in which the parameter key-value pairs appear on the command line is unimportant, except when the same key is specified twice, in which case the second value specified is used.

For example:

    fred

is the same as the preceding example.

Keys are always preceded by a "-" character, e.g., -dbase. There are several general types of parameters that restrict the allowable values that can be given to them.

Boolean Parameters
    Allowed values are on/true/t/yes/y and off/false/f/no/n, indicating true or false respectively (case insensitive). As a special case, boolean parameters can optionally not be followed by a value, in which case the parameter is set to true.
    For example, FRED -x is the same as FRED -x true, provided -x is a boolean parameter.
Integer Parameters
    Value is any integer.
Float Parameters
    Value is any real number and must be in decimal format (e.g., 1 or 5.23 is allowed, but 1e3 is not).
String Parameters
    Value is a single text word.


File Parameters
    Value is the name of a file.
Molecule Parameters
    Value is the name of a file containing one molecule record. Many standard file formats are supported (e.g., MOL2, SDF, PDB) and the format is determined by reading the file extension.
"Parameters File" Parameter
    This is a specialized kind of file parameter; its value is the name of a parameter file. A parameter file is a text file containing one or more parameter key-value combinations that FRED will use in addition to the parameters specified on the command line. The following rules apply to parameter files:
    1. One key-value pair per line.
    2. Blank lines and lines beginning with a # character are ignored.
    3. A parameter file parameter cannot appear in a parameter file (i.e., do not specify -param in your parameter file).
    4. If a parameter is specified both on the command line and in the parameter file, the value specified on the command line is used.
    5. Boolean parameters must be specified with a key-value pair. The command line shortcut that interprets a boolean key without a corresponding value as true is not valid in a parameter file. Accordingly, "-x" may not be used as a shortcut for "-x true" for boolean parameters in the parameter file.

Some parameters are restricted to specific ranges or values, as listed in the individual parameter's documentation (see chapter 7).

5.2 Command Line Help

All of FRED's parameters are documented in chapter 7. The same documentation is available from the FRED executable on the command line. Typing

    fred --help

will return the following list of help options:

    Help functions :
      fred --help simple   : Get a list of simple parameters
      fred --help all      : Get a complete list of parameters
      fred --help defaults : List the defaults for all parameters
      fred --help          : Get detailed help on a parameter
      fred --help html     : Create an html help file for this program

The different help options do the following


simple
    Provides a minimal list and a brief description of parameters required to run FRED. By default long lines will be wrapped at 80 columns, but a wider terminal can be accommodated by entering the number of columns in your terminal after "simple". For example:

        fred --help simple 120

    will display the simple list of parameters formatted for a 120 column terminal.
all
    Provides a complete list and brief description of all parameters used by FRED. By default long lines will be wrapped at 80 columns; if you are using a wider terminal you can enter the number of columns in your terminal after "all". For example:

        fred --help all 120

    will display the list of parameters formatted for a 120 column terminal.
defaults
    Lists the default value of all parameters, if any.
(parameter name)
    Provides a detailed description of the specified parameter (e.g., "fred --help -dbase" will provide a detailed description of the -dbase parameter).
html
    Writes all information available from the command line help to the html file FRED help.html.

5.3 Receptor file

To perform its primary purpose of docking molecules, FRED needs a description of the active site contained in a receptor file (see section 4.1). A stand-alone GUI application for creating receptor files, fred_receptor, is available from the OpenEye website's download section and uses the same license as FRED. It is recommended that you use this application to set up your receptor file; however, a receptor file can also be set up using the FRED executable.

5.3.1 File Format

A receptor file always uses a special version of the OEB format (i.e., it will always be an .oeb or .oeb.gz file). FRED and fred_receptor fully support this file format. Other OpenEye programs, including Vida 2.1, only read the target protein structure from this file and cannot currently see the extra receptor specific data in the file.

5.3.2 Querying

The FRED command line executable can list basic properties of a receptor file simply by passing the receptor file to the executable using the -rec flag. The properties listed will be:


1. The size of the site box.
2. The size of the outer contour (if there is one).
3. The size of the inner contour (if there is one).
4. Whether a bound ligand is present.
5. The names of any protein constraints and whether they are enabled.
6. The names of any custom constraints and whether they are enabled.

Reasonable contour values

There is no direct limit to the size of an active site; however, you should generally expect the inner contour volume to be around 50-100 cubic Angstroms and the outer contour volume to be in the range of 500-2000 cubic Angstroms.

Example

    ninja:~/TESTING/<strong>FRED</strong>-docs> fred -rec rec1ppx.oeb.gz

    [ASCII-art <strong>FRED</strong> logo]
    Copyright (c) 2003,2004,2005,2006
    OpenEye Scientific Software, Inc.
    Licensed to OpenEye Scientific Software
    Version: 2.2 (Build date 20060810)
    OEChem version: 1.4.2 debug 20060810
    Platform: linux-2.6-g++4.1-i586
    Supported Run Modes:
      Single processor
      PVM Multiprocessor (PVM Slavename : <strong>FRED</strong>)
    ----------------------------------------
    #Interface settings
    #Receptor Site :
    -rec rec1ppx.oeb.gz
    Writing settings to : setup.txt
    Run status will periodically be written to : status.txt
    -----Receptor Information-----
    Site box volume : 2373
    Outer contour volume : 697
    Inner contour volume : 103
    Standard Constraints
    "SER496 HB" is ENABLED
    Writing a copy of the receptor to receptor.oeb.gz
    ----------------------------------------

5.3.3 GUI Creation

See the documentation that comes with the fred_receptor GUI distribution, available from the Downloads section of the OpenEye website.

5.3.4 Command Line Creation

The <strong>FRED</strong> executable can create a receptor file when it is passed a target protein (using the -pro parameter) along with the following additional parameters:

-bound_ligand
  <strong>FRED</strong> will create a receptor file using the supplied target protein, setting up the active site around the bound ligand. The extent of the site is determined automatically using the bound ligand in combination with a shape based site detection routine (the -addbox flag, if supplied, is ignored).

-box
  <strong>FRED</strong> will create a receptor file using the supplied target protein and assume that the active site is centered around the box file passed to the -box flag. The box file can be in any standard molecular format. <strong>FRED</strong> will create a site box aligned along the x, y and z coordinate axes, with maximum and minimum extents equal to the maximum and minimum x, y and z values of any heavy atom in the molecule. If the -addbox flag is specified, each side of the box is extended by the value of the -addbox flag.

-box and -bound_ligand
  This works identically to supplying just the -box flag, except that the supplied bound ligand will also be attached to the receptor file.

neither -box nor -bound_ligand
  If nothing is known about the location of the protein's active site you can opt to supply only the -pro flag to <strong>FRED</strong>. In this case <strong>FRED</strong> will create a receptor and determine the active site using an automatic shape based site detection routine. While this automatic method of determining the active site is fairly effective, it is not foolproof, and the chosen site should be independently inspected and verified before a docking run is undertaken. A good rule of thumb is that the automatic site detection has about an 80% success rate.

Once <strong>FRED</strong> creates the receptor file it will write out basic information about the receptor to stderr and write the receptor to the file receptor.oeb.gz, or [prefix]_receptor.oeb.gz if you have specified the -prefix flag.
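The box construction described for the -box and -addbox flags can be sketched in a few lines of Python. This is only an illustration of the stated rule (axis-aligned extents of the heavy atoms, with each side then pushed outward by the -addbox value) and not <strong>FRED</strong>'s actual implementation. The function names site_box and box_dimensions are hypothetical, and the coordinates are made up for the example.

```python
def site_box(heavy_atom_coords, addbox=0.0):
    """Return (mins, maxs) of an axis-aligned box around the atoms.

    Each side is pushed outward by `addbox`, so every dimension
    grows by 2 * addbox in total.
    """
    xs, ys, zs = zip(*heavy_atom_coords)
    mins = tuple(min(axis) - addbox for axis in (xs, ys, zs))
    maxs = tuple(max(axis) + addbox for axis in (xs, ys, zs))
    return mins, maxs


def box_dimensions(mins, maxs):
    """Edge lengths of the box along x, y and z."""
    return tuple(hi - lo for lo, hi in zip(mins, maxs))


# Made-up heavy atom coordinates spanning 12 Angstroms on each axis.
atoms = [(0.0, 0.0, 0.0), (12.0, 5.0, 3.0), (4.0, 12.0, 12.0)]

print(box_dimensions(*site_box(atoms)))              # (12.0, 12.0, 12.0)
print(box_dimensions(*site_box(atoms, addbox=4.0)))  # (20.0, 20.0, 20.0)
```

Because the padding is applied to both the minimum and the maximum side of each axis, an -addbox value of 4.0 turns a 12x12x12 box into a 20x20x20 box.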


If you supplied <strong>FRED</strong> with a database of ligands to dock, <strong>FRED</strong> will begin docking those molecules after the receptor has been created and written out; otherwise <strong>FRED</strong> will terminate.

Preparing the protein

Prior to creating a receptor, the protein that will be passed to the -pro flag should be prepared as follows:

Bound molecules
  Many protein files contain crystallographic waters and other solvent molecules. The user may choose to keep or strip these molecules from the protein file. Any bound molecules present in the protein file will be treated by <strong>FRED</strong> like any other part of the protein and allowed to interact with docked ligands. Any active molecules bound to the protein active site should be removed from the protein file, as they will block ligands from docking into the active site.

Protonation state
  The target protein should be properly protonated before being given to <strong>FRED</strong>. <strong>FRED</strong> accepts the input protonation state of the protein, except when the protonation state is not fully specified by the input file (such as with PDB and MOL2 files with implicit hydrogens). In these cases <strong>FRED</strong> will take a best guess as to the correct protonation state.

Charging protein
  Charges are only needed by <strong>FRED</strong> when using the Zapbind scoring function. By default <strong>FRED</strong> will accept the input charges on the protein. However, if the -assign_protein_charges flag is set to true, <strong>FRED</strong> will assign MMFF charges to the protein.

Custom constraints can also be added to the receptor file; see the next section for details.

5.4 Constraints

<strong>FRED</strong> can optionally restrict the poses it will examine during the exhaustive docking step to poses that match user defined constraints. Constraint information (if present) is contained in the receptor file.

5.4.1 Protein constraints

Protein constraints can only be set up with the fred_receptor GUI program. The fred_receptor GUI can also add, remove and modify constraints on existing receptor files.

5.4.2 Custom constraints

Custom constraints can be added, created and removed with the fred_receptor GUI, but can also be added to a receptor on the command line by using the -pharm flag to pass a constraint file. The format of this file is described in the -pharm flag documentation. If a receptor file with existing custom constraints is passed to <strong>FRED</strong> and the -pharm flag is used, the constraints in the receptor file will be ignored and the constraints in the -pharm file will be used.

This subsection has several examples of constraint files that demonstrate some of their subtleties.

Basic constraint file

The following constraint file, which would be passed to the -pharm parameter, specifies a single constraint with no SMARTS pattern.

    SPHERE 1 4.0 10.0 10.0 10.0

This tells <strong>FRED</strong> that custom constraint feature '1' has a sphere associated with it. The sphere has radius 4.0 and is centered at the coordinate (10.0,10.0,10.0). With this file every potential pose that <strong>FRED</strong> examines will be rejected unless the pose has at least one heavy atom within 4.0 Angstroms of the coordinate (10.0,10.0,10.0). A feature can have more than one sphere, as in the following example.

Constraints with multiple spheres

The following constraint file specifies one constraint feature with two spheres, only one of which needs to be satisfied.

    SPHERE 1 4.0 10.0 10.0 10.0
    SPHERE 1 3.0 9.0 12.0 8.0

This tells <strong>FRED</strong> that custom constraint feature '1' has two spheres associated with it. The effect is that every pose must have at least one heavy atom that is either within 4.0 Angstroms of (10.0,10.0,10.0) OR within 3.0 Angstroms of (9.0,12.0,8.0). This is different from the following.

Multiple constraints

The following constraint file specifies two constraints, both of which must be satisfied.

    SPHERE 1 4.0 10.0 10.0 10.0
    SPHERE 2 3.0 9.0 12.0 8.0

This tells <strong>FRED</strong> to create two custom constraint features, '1' and '2', both of which must be satisfied. The effect here is that every pose must have at least one heavy atom within 4.0 Angstroms of (10.0,10.0,10.0) AND at least one heavy atom within 3.0 Angstroms of (9.0,12.0,8.0).

By default any heavy atom on the ligand can satisfy a custom constraint feature. Supplying a SMARTS pattern along with the sphere definition ensures that only poses that place at least one atom matching that pattern within the sphere will satisfy the constraint. This is illustrated in the following example.

Constraints with SMARTS

The following constraint file specifies a location and a SMARTS pattern.

    SPHERE 1 4.0 10.0 10.0 10.0
    SMARTS 1 F

This tells <strong>FRED</strong> that every pose must have at least one fluorine atom within 4.0 Angstroms of (10.0,10.0,10.0).

Complex example

    SPHERE 1 4.0 10.0 10.0 10.0
    SMARTS 1 F
    SPHERE 2 5.0 0.0 0.0 0.0
    SMARTS 2 C=O
    SPHERE 3 3.0 -10.0 -10.0 -10.0
    SPHERE 3 3.0 -12.0 -10.0 -10.0
    SMARTS 3 [#7]
    SMARTS 3 [#6]

This tells <strong>FRED</strong> to create three custom constraint features. The first feature requires a fluorine atom within 4.0 Angstroms of (10.0,10.0,10.0). The second feature requires a carbon double bonded to an oxygen within 5.0 Angstroms of (0.0,0.0,0.0). The third feature requires a carbon or nitrogen either within 3.0 Angstroms of (-10.0,-10.0,-10.0) or within 3.0 Angstroms of (-12.0,-10.0,-10.0).

Centering spheres on atoms

Every custom constraint feature must have at least one sphere. The distance constraints are always measured from atom center to sphere center. To define a sphere on a residue atom use the ATOM command. It works as follows:

    ATOM 1 4.0 150

This command is equivalent to

    SPHERE 1 4.0 x y z

where x, y and z are the coordinates of atom 150 on the target protein.

5.5 Preparing ligand database

5.5.1 Protonation state

Ligands should be properly protonated. <strong>FRED</strong> accepts the input protonation state of the ligands, except when the protonation state is not fully specified by the input file (such as with PDB and MOL2 files with implicit hydrogens). In these cases <strong>FRED</strong> will use the OEChem function OEReadMolecule to assign protonation states according to the OpenEye valence model. Within the OpenEye toolset, Quacpac will enumerate protonation states, or FILTER can set a single protonation state.
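As an aside on the custom constraint files of section 5.4.2, the way spheres and features combine can be sketched in Python. This is one reading of the rules stated there (spheres that share a feature number are alternatives, and every feature must be satisfied), not <strong>FRED</strong>'s own code. The names parse_spheres and pose_satisfies are hypothetical, only SPHERE lines are handled (SMARTS and ATOM lines are skipped), and treating a distance exactly equal to the radius as inside the sphere is an assumption.

```python
from collections import defaultdict
from math import dist


def parse_spheres(text):
    """Group SPHERE lines by feature: {feature: [(radius, (x, y, z)), ...]}."""
    features = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: SPHERE <feature> <radius> <x> <y> <z>
        if len(fields) == 6 and fields[0] == "SPHERE":
            radius, x, y, z = map(float, fields[2:])
            features[fields[1]].append((radius, (x, y, z)))
    return dict(features)


def pose_satisfies(features, heavy_atoms):
    """True if every feature has a heavy atom inside at least one of its spheres."""
    return all(
        any(
            dist(atom, center) <= radius
            for radius, center in spheres
            for atom in heavy_atoms
        )
        for spheres in features.values()
    )


either = parse_spheres("SPHERE 1 4.0 10.0 10.0 10.0\nSPHERE 1 3.0 9.0 12.0 8.0")
both = parse_spheres("SPHERE 1 4.0 10.0 10.0 10.0\nSPHERE 2 3.0 9.0 12.0 8.0")

# A made-up pose with a single heavy atom 3.0 Angstroms from (10, 10, 10).
pose = [(13.0, 10.0, 10.0)]
print(pose_satisfies(either, pose))  # True: one of the two spheres is enough
print(pose_satisfies(both, pose))    # False: feature 2 has no atom in its sphere
```

The nesting mirrors the OR/AND distinction in the examples: the inner any() ranges over the spheres of one feature, while the outer all() ranges over the features.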


5.5.2 Charging molecules

Charges are only needed by <strong>FRED</strong> when using the Zapbind scoring function. By default <strong>FRED</strong> will accept the input charges on the ligands. However, if the -assign_ligand_charges flag is set to true, <strong>FRED</strong> will assign AM1BCC charges to the ligands [5] [6]. The AM1BCC calculation has a small but non-zero cost (approximately 1-5% of the total docking time), so users should set charges on their ligand database once rather than recalculating the charges each time <strong>FRED</strong> is run.

5.5.3 Generating conformers

Conformers of the input ligand database must be generated prior to running <strong>FRED</strong>. Conformers of a given molecule should appear in sequence in the input file. It is not necessary to name conformers of the same molecule in any special fashion; <strong>FRED</strong>'s conformer perception matches the chemistry, not the title.

To ensure that <strong>FRED</strong> can reproduce the binding pose of a ligand accurately, each molecule should have at least one conformer in its ensemble that is within 1.0 Angstrom RMSD of the bound structure. Molecules whose closest conformer is between 1.0 and 1.5 Angstrom RMSD can be docked with moderate success.

5.6 Screen output

All screen output in <strong>FRED</strong> is sent to stderr.

The following is example screen output from a simple <strong>FRED</strong> run.

    ninja:~/TESTING/<strong>FRED</strong>-docs> fred -rec rec1ppx.oeb.gz

    [ASCII-art <strong>FRED</strong> logo]
    Copyright (c) 2003,2004,2005,2006
    OpenEye Scientific Software, Inc.
    Licensed to OpenEye Scientific Software
    Version: 2.2 (Build date 20060810)
    OEChem version: 1.4.2 debug 20060810
    Platform: linux-2.6-g++4.1-i586
    Supported Run Modes:
      Single processor
      PVM Multiprocessor (PVM Slavename : <strong>FRED</strong>)
    ----------------------------------------
    #Interface settings
    #Inputting Ligands :
    -dbase ph_conf.oeb.gz
    #Receptor Site :
    -rec rec1azm.oeb.gz
    Writing settings to : setup.txt
    Run status will periodically be written to : status.txt
    -----Receptor Information-----
    Site box volume : 5579
    Outer contour volume : 1454
    Inner contour volume : 96
    Has bound ligand "_"
    Writing a copy of the receptor to receptor.oeb.gz
    -----Ligand Database Information-----
    Parsing ligands Done.
    1 molecules
    152 conformers
    -----<strong>Docking</strong> Summary-----
    1) <strong>Exhaustive</strong> search will generate 100 using Chemgauss3
    2) Poses will be solid body optimized with Chemgauss3
    -----Docked hitlists summary-----
    Sorted hitlists : TRUE
    Hitlist size : 1000
    --Structure files--
    Chemgauss3 : chemgauss3_docked.oeb.gz
    --Tab delimited score files--
    Chemgauss3 : chemgauss3_scores.txt
    Note : Chemgauss3 scoring selected by default
    Initializing docking
    <strong>Exhaustive</strong> Search ..........Done.
    Chemgauss3 ..........Done.
    ............x.x.............................x....x...................................x.....................................x.....x......x.......................................x.............................................................................................x..........x.....................x...............x....x.x.....................xx....................................x........................................x.....................x.............................................x..............x....................................x........x....x........x...x..........................................................
    ....x................................x...xx........x.x..................x..............x.........................................................................xx...............x......................x...x......x..........xx..x...........x..x...x..........x.............................x.........................x.........x...................x.......x..............
    ----------------------------------------
    Outputing hitlists
    Finished

The large block of .'s and x's is written to the screen as <strong>FRED</strong> docks molecules. Each . represents a successfully docked molecule, while each x represents a molecule that could not be docked and was sent to one of the undocked lists (see section 4.4.2). Any W characters that appear during docking indicate that a warning was issued and sent to the warning log (see section 5.7.4). The remaining screen output is generally self explanatory.

5.7 Complete list of output files

This is a complete list of all output files <strong>FRED</strong> can write during a run. Some of these files may not be written, depending upon user input.

    [prefix]_setup.txt : Setup File
    [prefix]_status.txt : Status file
    [prefix]_info_log.txt : Information message log
    [prefix]_warning_log.txt : Warning log
    [prefix]_receptor.oeb.gz : Protein file
    [prefix]_[score]_scores.txt : Score File
    [prefix]_[score]_docked.[format] : Docked Structure File
    [prefix]_[score]_alt_scores.txt : Alternate Score File
    [prefix]_[score]_alt_docked.[format] : Alternate Docked Structure File
    [prefix]_consensus_scores.txt : Non-masc consensus hitlist scores.
    [prefix]_consensus_docked.[format] : Non-masc consensus hitlist docked structures.
    [prefix]_masc_consensus_scores.txt : MASC consensus hitlist scores.
    [prefix]_masc_consensus_docked.[format] : MASC consensus hitlist docked structures.
    [prefix]_undocked_[code].smi : Smiles file listing the molecules <strong>FRED</strong> could not dock.

Here [prefix] is the setting of the -prefix flag and [format] is the setting of the -oformat flag, which also determines the molecular format used. [score] is the name of one of the selected scoring functions (see section 7.5). When more than one scoring function is used, a separate file is written for each scoring function.

5.7.1 Setup File

    [prefix]_setup.txt

This file is automatically written at the beginning of each <strong>FRED</strong> run. It lists the settings of every parameter used during the run, including the default settings of parameters that were not explicitly set by the user. This file can be used as a parameter file to duplicate the original <strong>FRED</strong> run that generated it (provided all of the required files are still present).

5.7.2 Status File

    [prefix]_status.txt

This file is written at the beginning of a run and updated every few seconds during the course of the run. It lists information about the status of the current <strong>FRED</strong> run.

5.7.3 Information message log

    [prefix]_info_log.txt

This file contains informational messages that are sent during the run. It is primarily used by <strong>FRED</strong> during a multiprocessor run to report the progress of the slaves.

5.7.4 Warning log

    [prefix]_warning_log.txt

This file contains a log of any warnings issued during the run. The first time a warning is issued <strong>FRED</strong> will print a message to stderr indicating that a warning log is being created. All subsequent warnings will be redirected to the warning log, and a 'W' character will be inserted into the screen output to indicate that a new warning was issued.

5.7.5 Receptor file

    [prefix]_receptor.oeb.gz


This file contains a copy of the receptor file <strong>FRED</strong> used (or created).

5.7.6 Score Files

    [prefix]_[score]_scores.txt

A separate score file will be written for each of the selected scoring functions. The file is a tab separated text file that contains the name, score, score components and a smiles representation of each ligand in the hitlist (one per line). The number and ordering of ligands in this file depend upon the settings of the -serial and -hitlist_size flags, as follows.

If -serial is false (the default setting), the ligands will be sorted by score and the maximum number of ligands appearing in the list is set by the -hitlist_size flag, which defaults to 1000.

If -serial is true, every ligand that <strong>FRED</strong> successfully docks will appear in the score file. If <strong>FRED</strong> is not being run in PVM mode, the order of molecules in the output file will be the order in which they appear in the input file. Information for each ligand will be written as the ligand is docked, rather than at the end of the docking run as occurs when -serial is false.

5.7.7 Docked Structure Files

    [prefix]_[score]_docked.[format]

Docked structure files will only be written if the -output_structs flag is true (the default setting). A separate docked structure file will be written for each of the selected scoring functions. The top scoring pose of each ligand will appear in the file. The format of the file is determined by the -oformat flag. Which ligands are listed in the file depends upon the settings of the -serial and -hitlist_size flags.

If -serial is false (the default setting), the ligands will be sorted by score and the maximum number of ligands appearing in the list is set by the -hitlist_size flag, which defaults to 1000.

If -serial is true, every ligand that <strong>FRED</strong> successfully docks will appear in the docked structure file. If <strong>FRED</strong> is not being run in PVM mode, the order of molecules in the output file will be the order in which they appear in the input file. The structure of each ligand will be written to the file as the ligand is docked, rather than at the end of the docking run.

With certain file formats (see the -oformat flag), scores are stored within the molecule record. This is independent of, and in addition to, the output of the scores in the score and alternate score files. See appendix H.

5.7.8 Alternate Score Files

    [prefix]_[score]_alt_scores.txt

Alternate score files will only be written if the -num_alt_poses flag is set to a non-zero value. The file is a tab separated text file that contains the ligand name, pose number, score, score components and a smiles representation of each pose. Poses of the same ligand are grouped together. The ordering of ligands is the same as in the standard score file (see section 5.7.6).

5.7.9 Alternate Docked Structure Files

    [prefix]_[score]_alt_docked.[format]

Alternate docked structure files will only be written if -num_alt_poses is set to a non-zero value. The file contains the structures of different possible poses of each ligand in the active site. Poses of the same ligand are grouped together. Ligand ordering is the same as in the standard docked structure file (see section 5.7.7).

5.7.10 Consensus Hitlist

    [prefix]_consensus_scores.txt
    [prefix]_consensus_docked.[format]

These files will be written if the following conditions are met:

1. -serial is false (the default value).
2. -consensus is true (the default value).
3. More than one non-masc scoring function has been selected.

The scores file contains the names of the top ranked molecules by consensus score of the non-masc variant scoring functions, as well as their rank in each of the individual scoring functions and their overall consensus score.

The docked file contains the top consensus structure pose of each of the top ranked ligands, ranked by consensus score. The order will be the same as in the scores file.

5.7.11 MASC Consensus Hitlists

    [prefix]_MASC_consensus_scores.txt
    [prefix]_MASC_consensus_docked.[format]

These files will be written if the following conditions are met:

1. -serial is false (the default value).
2. -consensus is true (the default value).
3. More than one MASC scoring function has been selected.

The scores file contains the names of the top ranked molecules by consensus score of the MASC variant scoring functions, as well as their rank in each of the individual scoring functions and their overall consensus score.

The docked file contains the top consensus structure pose of each of the top ranked ligands, ranked by consensus score. The order will be the same as in the scores file.

5.7.12 Undocked Ligand Files

    undocked_code.smi

These smiles files contain all the molecules that <strong>FRED</strong> read in but could not dock. Here code is an integer value that specifies why <strong>FRED</strong> could not dock the ligands listed in the particular file. See section 4.4.2 for the meaning of individual return codes.
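The naming scheme used throughout section 5.7 can be summarized with a short Python sketch. It assumes, based on the file names shown in this manual's examples, that the -prefix value (when set) is joined to the rest of the name with an underscore and that the scoring function name appears in lowercase. The helper output_filenames is hypothetical and covers only the files written by a default run; it is not part of <strong>FRED</strong>.

```python
def output_filenames(scoring_functions, prefix="", oformat="oeb.gz"):
    """List the default output files for a run, following section 5.7.

    `prefix` stands in for the -prefix setting and `oformat` for the
    -oformat setting; one docked/scores pair is listed per scoring function.
    """
    lead = f"{prefix}_" if prefix else ""
    names = [f"{lead}setup.txt", f"{lead}status.txt", f"{lead}receptor.oeb.gz"]
    for score in scoring_functions:
        names.append(f"{lead}{score}_docked.{oformat}")
        names.append(f"{lead}{score}_scores.txt")
    return names


for name in output_filenames(["chemgauss3"], prefix="myrun", oformat="sdf.gz"):
    print(name)
# myrun_setup.txt
# myrun_status.txt
# myrun_receptor.oeb.gz
# myrun_chemgauss3_docked.sdf.gz
# myrun_chemgauss3_scores.txt
```

Note that the receptor copy keeps the .oeb.gz extension regardless of the -oformat setting, since receptor files always use the OEB format.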


CHAPTER SIX

Example Command Lines

This chapter has example command lines for <strong>FRED</strong>. The purpose is to connect some concepts of docking to specific parameters in <strong>FRED</strong>. To that end almost all of these examples focus on one idea or concept and the associated parameters (although all examples are complete, valid command lines for <strong>FRED</strong>). It is expected that users will often take multiple examples and combine them on their own command line.

6.1 Simple command lines

6.1.1 <strong>Docking</strong> with an existing receptor

Command Line

    fred -dbase my_ligands.oeb.gz -rec my_rec.oeb.gz

Description

<strong>FRED</strong> will dock the ligands in the file my_ligands.oeb.gz to the receptor my_rec.oeb.gz. No scoring function has been specified, so <strong>FRED</strong> will use its default scoring function, Chemgauss3 (see section 4.3.5).

Parameters

-dbase my_ligands.oeb.gz
  This tells <strong>FRED</strong> where the ligands it is going to dock are, in this case the file my_ligands.oeb.gz. The ligand file should have multiple conformers of all flexible ligands (see section 5.5 for more detail on preparing the ligand database).

-rec my_rec.oeb.gz
  This specifies the receptor file to dock the ligands into. The receptor file must be created prior to running <strong>FRED</strong> with this command line. This file can be created by <strong>FRED</strong> or by the fred_receptor GUI.

Output Files

setup.txt
  A file with the settings of all of <strong>FRED</strong>'s parameters. Note that there will be many more parameters in this file than those specified on the command line, because many parameters have default values. This file will be written at the beginning of the run.

receptor.oeb.gz
  This file contains a copy of the receptor site used for the run. This file will be written at the beginning of the run.

status.txt
  This file will be written every few seconds by <strong>FRED</strong> and indicates the status of the run.

chemgauss3_docked.oeb.gz
  Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3.

chemgauss3_scores.txt
  Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.

6.1.2 Creating a receptor and docking

Command Line

    fred -dbase my_ligands.oeb.gz \
         -pro my_protein.pdb \
         -box bound_ligand.mol2 \
         -addbox 4.0

Description

<strong>FRED</strong> will create a receptor and then dock the ligands in the file my_ligands.oeb.gz to it. No scoring function has been specified, so <strong>FRED</strong> will use its default scoring function, Chemgauss3 (see section 4.3.5).

Parameters

-dbase my_ligands.oeb.gz
  This tells <strong>FRED</strong> where the ligands it is going to dock are, in this case the file my_ligands.oeb.gz. The ligand file should have multiple conformers of all flexible ligands (see section 5.5 for more detail on preparing the ligand database).

-pro my_protein.pdb
  This specifies the protein structure used to create the receptor file. This file should not contain a bound ligand.

-box bound_ligand.mol2
  This specifies a box defining the extents of the active site. The file can be in any supported 3D molecule format. The box will always be aligned along the x, y and z axes. The maximum and minimum x, y and z values denoting the sides of the box will be the maximum and minimum x, y and z values of any heavy atom in the molecule passed.
  This example uses a bound ligand. Without modification the box created around it would be unreasonably small; however, the -addbox flag is also used here, which increases the size of the box created.

-addbox 4.0
  This flag tells <strong>FRED</strong> to extend each side of the box it created with the -box flag by 4 Angstroms (if the dimensions of the box were initially 12x12x12, the modified box would have dimensions 20x20x20).

Output Files

setup.txt
  A file with the settings of all of <strong>FRED</strong>'s parameters. Note that there will be many more parameters in this file than those specified on the command line, because many parameters have default values. This file will be written at the beginning of the run.

receptor.oeb.gz
  This file contains a copy of the receptor site <strong>FRED</strong> created. This file will be written at the beginning of the run.

status.txt
  This file will be written every few seconds by <strong>FRED</strong> and indicates the status of the run.

chemgauss3_docked.oeb.gz
  Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3.

chemgauss3_scores.txt
  Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.

6.1.3 Changing the output file format

Command Line

    fred -dbase my_ligands.oeb.gz -rec my_rec.oeb.gz -oformat sdf.gz

Description

<strong>FRED</strong> will dock the ligands in the file my_ligands.oeb.gz to the receptor my_rec.oeb.gz. No scoring function has been specified, so <strong>FRED</strong> will use its default scoring function, Chemgauss3 (see section 4.3.5).

The output format of docked structures will be gzipped sdf.


46 Chapter 6. Example Command LinesParameters-dbase my ligands.oeb.gzThis tells <strong>FRED</strong> where the ligands it is going to dock are, in this case the file my ligands.oeb.gz.The ligand file should have multiple conformers of all flexible ligands (see section 5.5 for moredetail on preparing the ligand database).-rec my rec.oeb.gzThis specifies a receptor file to dock to the ligand too. The receptor file must be created prior torunning <strong>FRED</strong> with this command line. This file can be created by <strong>FRED</strong> or by the fred receptorGUI.-oformat sdf.gzThis flag tells <strong>FRED</strong> that the docked structures it outputs should be in sdf gzipped format, ratherthan the default OEB gzipped format. If sdf were specified instead of sdf.gz regular non-gzippedSDF format would be used.Output Filessetup.txt A file with the settings of all of <strong>FRED</strong>’s parameters. Note that there will be many moreparameters in this file than those specified on the command line because many parameters havedefault values. This file will be written at the beginning of the run.receptor.oeb.gz This file contains a copy of the receptor site used for the run. This file will bewritten at the beginning of the run.status.txt This file will be written every few seconds by <strong>FRED</strong> and indicates the status of the run.chemgauss3 docked.sdf.gz contains the structure of the top 1000 docked ligands as ranked byChemgauss3 in gzipped sdf format. This file can be unzipped like any gzipped file after the run,although almost all OpenEye programs can read the zipped version of the file.chemgauss3 scores.txt list the names, scores and SMILES representation of the top 1000docked ligands as ranked by Chemgauss3 in a text format.6.1.4 Setting a prefix and/or an output directoryCommand Linefred −dbase my_ligands . oeb . gz \−rec my_rec . oeb . gz \−prefix myrun \−output_dir / results / site / <strong>FRED</strong>


Description

FRED will dock the ligands in the file my_ligands.oeb.gz to the receptor my_rec.oeb.gz. No scoring function has been specified, so FRED will use its default scoring function, Chemgauss3 (see section 4.3.5).

All output files will be prefixed with the text "myrun_" and placed in the directory /results/site/FRED (which must exist). It is not necessary to specify both -prefix and -output_dir: you can specify just one of them (or neither) if you only want to set a prefix for the output filenames or only an output directory.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).

-rec my_rec.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-prefix myrun
    Tells FRED to prepend the text "myrun_" to the name of every output file it writes (the underscore is added automatically).

-output_dir /results/site/FRED
    Tells FRED to write all output files to the directory /results/site/FRED.

Output Files

These files will all be written to /results/site/FRED rather than to the working directory, as they normally would be.

myrun_setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

myrun_receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

myrun_status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.


myrun_chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3. Because -oformat is not given, the file is written in the default gzipped OEB format.

myrun_chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.

6.2 Creating a receptor

This section has several examples of how to use the FRED executable to create a receptor site. Note that a receptor file can also be created using the fred_receptor GUI, available from the download section of the OpenEye website (www.eyesopen.com).

In all of these examples FRED creates and writes a receptor site file and then terminates. If you add the -dbase flag and pass it the name of a multiconformer ligand file, FRED will begin docking those molecules after creating the receptor site and writing it to a file.

6.2.1 With a box

Command Line

    fred -pro protein.pdb -box box.pdb -addbox 4.0

Description

FRED will create a receptor site with the given protein, using a user-supplied box to locate the site on the protein.

Parameters

-pro protein.pdb
    Tells FRED to create a receptor using the protein in the file protein.pdb. If the protein structure contains a bound ligand, it should be stripped from the file before running FRED.

-box box.pdb
    Defines the extents of the active site using a box given in a molecular file format. The maximum and minimum x, y and z values of any heavy atom in box.pdb define the maximum and minimum x, y and z values of the box, which is always aligned with the coordinate axes.

-addbox 4.0
    Tells FRED to modify the box specified by the -box flag by adding 4 Angstroms to every side. This flag is optional, but it is very commonly used with this setting when box.pdb is a bound ligand (passing a bound ligand to -box without the -addbox flag generally results in a receptor site that is too small).

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

receptor.oeb.gz
    A receptor file that contains the binding site defined by the -box and -addbox parameters.

6.2.2 With a bound ligand

Command Line

    fred -pro protein.pdb -bound_ligand bound_ligand.pdb

Description

FRED will create a receptor site by using the supplied protein structure in combination with a shape-based site detection algorithm and the position of a known bound ligand. The extent of the site is determined by the site detection algorithm, with some guidance from the position of the known bound ligand.

Parameters

-pro protein.pdb
    Tells FRED to create a receptor using the protein in the file protein.pdb. If the protein structure contains a bound ligand, it should be stripped from this file before running FRED.

-bound_ligand bound_ligand.pdb
    Specifies a file containing a ligand in a pose bound to the active site (generally from the same x-ray crystallography data used to determine the protein structure). FRED uses this bound ligand in conjunction with a shape-based site detection algorithm to determine the extent of the active site.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.


status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

receptor.oeb.gz
    A receptor file. In addition to the required receptor data, this receptor file will contain a copy of the bound ligand.

6.2.3 With a box and a bound ligand

Command Line

    fred -pro protein.pdb -box box.pdb -addbox 4.0 -bound_ligand ligand.pdb

Description

FRED will create a receptor site with the given protein, using a user-supplied box to locate the site on the protein. FRED will also attach a bound ligand (ligand.pdb) to the receptor site. Because the -box flag is specified, however, the ligand is not used to locate the receptor site as it was in the previous example.

Parameters

-pro protein.pdb
    Tells FRED to create a receptor using the protein in the file protein.pdb. If the protein structure contains a bound ligand, it should be stripped from the file before running FRED.

-box box.pdb
    Defines the extents of the active site using a box given in a molecular file format. The maximum and minimum x, y and z values of any heavy atom in box.pdb define the maximum and minimum x, y and z values of the box, which is always aligned with the coordinate axes.

-addbox 4.0
    Tells FRED to modify the box specified by the -box flag by adding 4 Angstroms to every side. This flag is optional, but it is very commonly used with this setting when box.pdb is a bound ligand (passing a bound ligand to -box without the -addbox flag generally results in a receptor site that is too small).

-bound_ligand ligand.pdb
    Tells FRED to attach ligand.pdb to the receptor file as a known bound ligand. Because the -box flag is specified, the structure of the bound ligand is not used to locate the receptor site.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.


status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

receptor.oeb.gz
    A receptor file that contains the binding site defined by the -box and -addbox parameters, with the bound ligand given by -bound_ligand attached.

6.2.4 Without a box or bound ligand

Command Line

    fred -pro protein.pdb

Description

FRED will create a receptor site using the supplied protein and will guess the location of the active site using a shape-based site detection algorithm. This method of setting up a receptor site is not foolproof, and it is possible that FRED will not locate the active site correctly. It is therefore recommended that you either supply a bound ligand or specify the location of the active site with a box if that information is available.

Success at determining the active site with this method is fairly binary, i.e. site detection either works well or fails completely. A good rule of thumb is that the active site will be correctly located 4 out of 5 times.

Parameters

-pro protein.pdb
    Tells FRED to create a receptor using the protein in the file protein.pdb. If the protein structure contains a bound ligand, it should be stripped from this file before running FRED.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

receptor.oeb.gz
    A receptor file.
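The box construction described for the -box and -addbox flags in sections 6.2.1 and 6.2.3 can be illustrated with a short Python sketch. It is not FRED's implementation: the function name padded_box and the example coordinates are hypothetical, and the sketch assumes the heavy-atom coordinates have already been read from the box file.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def padded_box(heavy_atoms: Iterable[Point], addbox: float = 0.0) -> Tuple[Point, Point]:
    """Return (min_corner, max_corner) of an axis-aligned box.

    The box spans the minimum and maximum x, y and z of the heavy atoms
    and is then grown by `addbox` Angstroms on every side.
    """
    coords = list(heavy_atoms)
    if not coords:
        raise ValueError("at least one heavy atom is required")
    xs, ys, zs = zip(*coords)
    lo = (min(xs) - addbox, min(ys) - addbox, min(zs) - addbox)
    hi = (max(xs) + addbox, max(ys) + addbox, max(zs) + addbox)
    return lo, hi


# Hypothetical heavy-atom coordinates of a small bound ligand (Angstroms).
ligand = [(1.0, 2.0, 3.0), (4.0, 0.5, 3.5), (2.5, 1.0, 6.0)]
lo, hi = padded_box(ligand, addbox=4.0)
print(lo, hi)
```

With addbox set to 0.0 the box hugs the ligand atoms exactly, which is why a bound ligand passed to -box without -addbox tends to give a site that is too small.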


6.3 Finding the right pose

Examples in this section deal with adjusting how FRED docks ligands (i.e. how it determines the pose, or poses if alternate poses are requested, of a ligand within the active site). None of the examples in this section address how the final ligand poses are scored and written to the hitlists. Each example therefore returns the same set of output files, although the docked structures, and hence the scores of the output ligands, will differ (since these examples adjust how FRED generates the poses to score). The next section has examples of adjusting the final scoring and hitlists.

See section 4.2 for a general explanation of how FRED docks ligands.

6.3.1 Rescoring - scoring without docking

Command Line

    fred -dbase my_poses.mol2 -rec my_receptor.oeb.gz -no_dock -opt none

Description

With this command line FRED assumes that the ligand molecules passed to -dbase are already positioned within the active site, and it rescores and re-ranks the ligands. Because no scoring function is specified, the default scoring function Chemgauss3 is used (see section 4.3.5).

Parameters

-dbase my_poses.mol2
    Specifies the poses to read in for rescoring and re-ranking.

-rec my_receptor.oeb.gz
    Specifies the receptor site to score the ligands in.

-no_dock
    Tells FRED to skip the exhaustive search (which is how poses are initially generated).

-opt none
    Tells FRED not to perform solid body optimization on the poses that are passed in.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.


status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 ligands as ranked by Chemgauss3, in gzipped OEB format.

chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 ligands as ranked by Chemgauss3, in a text format.

6.3.2 Setting the exhaustive scoring and optimization functions

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -exhaustive_scoring plp \
         -opt plp

Description

The initial exhaustive search, which generates candidate poses, and the optimization step each have a setting for the scoring function they use to rank or optimize poses, respectively. The default scoring function for both operations is Chemgauss3 (see section 4.3.5). Either operation can be set to use a different scoring function, as shown in this example, which sets both the exhaustive scoring and the optimization scoring functions to PLP (see section 4.3.3).

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-exhaustive_scoring plp
    Sets the exhaustive scoring function to PLP.

-opt plp
    Sets the optimization scoring function to PLP.

Note that -exhaustive_scoring and -opt are independent (i.e. you can specify one without the other).
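Because -exhaustive_scoring and -opt are independent, a script that launches FRED can add each flag only when a non-default function is wanted. The Python sketch below assembles such a command line. The helper name fred_command is hypothetical and not part of FRED, and only flags that appear in this section are used.

```python
from typing import List, Optional


def fred_command(dbase: str, rec: str,
                 exhaustive_scoring: Optional[str] = None,
                 opt: Optional[str] = None) -> List[str]:
    """Build a FRED argument list, adding each optional flag only if set."""
    args = ["fred", "-dbase", dbase, "-rec", rec]
    if exhaustive_scoring is not None:
        args += ["-exhaustive_scoring", exhaustive_scoring]
    if opt is not None:
        args += ["-opt", opt]
    return args


# Only the exhaustive scoring function is changed; -opt keeps its default.
cmd = fred_command("my_ligands.oeb.gz", "my_receptor.oeb.gz",
                   exhaustive_scoring="plp")
print(" ".join(cmd))
```

The resulting list can be passed to a process launcher such as Python's subprocess.run; leaving an argument as None simply omits the flag so that FRED falls back to its default for that step.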


Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3.

chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.

6.3.3 Adjusting consensus structure - changing weights

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -pose_select_weight_shapegauss 1 \
         -pose_select_weight_plp 0 \
         -pose_select_weight_Chemgauss2 1 \
         -pose_select_weight_Chemgauss3 4 \
         -pose_select_weight_Chemscore 0 \
         -pose_select_weight_oeChemscore 2 \
         -pose_select_weight_screenscore 0 \
         -pose_select_weight_cgo 0

Description

Once candidate poses have been generated by the exhaustive search and optimization step, FRED selects a single best pose from the set of candidates. This pose is then scored, and the score is used to rank ligands in the output hitlist. The consensus structure step allows multiple scoring functions to vote for the best docked structure in a rank-by-vote approach. This example shows how to adjust the weight of each scoring function's vote.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).


-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-pose_select_weight_shapegauss 1
    Gives Shapegauss a weight of 1 for consensus structure voting.

-pose_select_weight_plp 0
    Gives PLP a weight of 0 for consensus structure voting (i.e. PLP is ignored for consensus structure).

-pose_select_weight_Chemgauss2 1
    Gives Chemgauss2 a weight of 1 for consensus structure voting.

-pose_select_weight_Chemgauss3 4
    Gives Chemgauss3 a weight of 4 for consensus structure voting.

-pose_select_weight_Chemscore 0
    Gives Chemscore a weight of 0 for consensus structure voting (i.e. Chemscore is ignored for consensus structure).

-pose_select_weight_oeChemscore 2
    Gives OEChemscore a weight of 2 for consensus structure voting.

-pose_select_weight_screenscore 0
    Gives Screenscore a weight of 0 for consensus structure voting (i.e. Screenscore is ignored for consensus structure).

-pose_select_weight_cgo 0
    Gives CGO a weight of 0 for consensus structure voting (i.e. CGO is ignored for consensus structure). Note that CGO is a ligand-based scoring function, which means that if CGO is used with a non-zero weight for consensus structure, the receptor file must contain a bound ligand.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3.

chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.
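The effect of the consensus structure weights can be illustrated with a small Python sketch of weighted voting. This section does not spell out FRED's exact voting procedure, so the sketch rests on stated assumptions: each scoring function casts its whole weight for the candidate pose it scores best, lower scores are better, ties go to the earliest candidate, and candidate 0 is the top-ranked pose from the preceding step. The function name select_pose and all score values are hypothetical.

```python
from typing import Dict, List


def select_pose(scores: Dict[str, List[float]], weights: Dict[str, int]) -> int:
    """Return the index of the pose with the most weighted votes.

    scores[name][i] is the score of candidate pose i under scoring
    function `name` (assumed: lower is better).
    """
    n_poses = len(next(iter(scores.values())))
    votes = [0] * n_poses
    for name, pose_scores in scores.items():
        weight = weights.get(name, 0)
        if weight == 0:
            continue  # a zero weight means the function is ignored
        best = min(range(n_poses), key=lambda i: pose_scores[i])
        votes[best] += weight
    if not any(votes):
        return 0  # no function votes: keep the top-ranked candidate
    return max(range(n_poses), key=lambda i: votes[i])


# Weights from the command line above (zero-weight functions omitted or ignored).
weights = {"shapegauss": 1, "plp": 0, "chemgauss2": 1,
           "chemgauss3": 4, "oechemscore": 2}
# Made-up scores for three candidate poses.
scores = {
    "shapegauss": [-5.0, -7.0, -6.0],
    "plp": [-50.0, -1.0, -1.0],
    "chemgauss2": [-3.0, -2.0, -1.0],
    "chemgauss3": [-10.0, -12.0, -20.0],
    "oechemscore": [-8.0, -9.0, -4.0],
}
winner = select_pose(scores, weights)
print(winner)
```

In this made-up case pose 1 collects the Shapegauss and OEChemscore votes (total 3), but the single Chemgauss3 vote with weight 4 is enough for pose 2 to be selected, which shows how a large weight lets one scoring function dominate the choice.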


6.3.4 Turning consensus structure off

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -pose_select_weight_shapegauss 0 \
         -pose_select_weight_plp 0 \
         -pose_select_weight_Chemgauss2 0 \
         -pose_select_weight_Chemgauss3 0 \
         -pose_select_weight_Chemscore 0 \
         -pose_select_weight_oeChemscore 0 \
         -pose_select_weight_screenscore 0 \
         -pose_select_weight_cgo 0

Description

This example disables the consensus structure step by setting all the scoring function weights for consensus structure to zero. In this case the top-ranked pose from the optimization step (or from the exhaustive search if optimization is also disabled) will be scored with Chemgauss3 and written to the hitlist(s).

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-pose_select_weight_shapegauss 0
    Gives Shapegauss a weight of 0 for consensus structure voting.

-pose_select_weight_plp 0
    Gives PLP a weight of 0 for consensus structure voting.

-pose_select_weight_Chemgauss2 0
    Gives Chemgauss2 a weight of 0 for consensus structure voting.

-pose_select_weight_Chemgauss3 0
    Gives Chemgauss3 a weight of 0 for consensus structure voting.

-pose_select_weight_Chemscore 0
    Gives Chemscore a weight of 0 for consensus structure voting.


-pose_select_weight_oeChemscore 0
    Gives OEChemscore a weight of 0 for consensus structure voting.

-pose_select_weight_screenscore 0
    Gives Screenscore a weight of 0 for consensus structure voting.

-pose_select_weight_cgo 0
    Gives CGO a weight of 0 for consensus structure voting.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by Chemgauss3.

chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by Chemgauss3, in a text format.

6.4 Scoring

By default FRED outputs a Chemgauss3 hitlist, which holds the top 1000 ligands as scored by Chemgauss3. This section has examples of how to have FRED use other scoring functions to rank ligands.

6.4.1 One scoring function

Command Line

    fred -dbase my_ligands.oeb.gz -rec my_receptor.oeb.gz -plp -hitlist_size 5000

Description

FRED will maintain and output a hitlist that uses the PLP scoring function in place of the default Chemgauss3 scoring function. This example also increases the size of the hitlist to 5000, from the default value of 1000.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-plp
    Enables the PLP output hitlist.

-hitlist_size 5000
    Tells FRED to keep and output the top 5000 ligands in the PLP hitlist.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

plp_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by PLP.

plp_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands as ranked by PLP, in a text format.

6.4.2 Multiple scoring functions

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -plp \
         -chemgauss3 \
         -shapegauss \
         -oechemscore \
         -screenscore \
         -hitlist_size 5000


Description

This example enables multiple scoring functions and output hitlists and increases the size of the output hitlists to 5000, from the default value of 1000. Because multiple scoring functions are enabled, FRED will also automatically maintain a consensus scoring hitlist using all the selected scoring functions.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-hitlist_size 5000
    Tells FRED to keep and output the top 5000 ligands in each hitlist.

-plp
    Enables the PLP output hitlist.

-chemgauss3
    Enables the Chemgauss3 output hitlist.

-oechemscore
    Enables the OEChemscore output hitlist.

-screenscore
    Enables the Screenscore output hitlist.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

plp_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by PLP.

plp_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands as ranked by PLP, in a text format.

chemgauss3_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by Chemgauss3.


chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands as ranked by Chemgauss3, in a text format.

oechemscore_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by OEChemscore.

oechemscore_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands as ranked by OEChemscore, in a text format.

screenscore_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by Screenscore.

screenscore_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands as ranked by Screenscore, in a text format.

consensus_docked.oeb.gz
    Contains the structures of the top 5000 docked ligands as ranked by a consensus score using PLP, Chemgauss3, OEChemscore and Screenscore (i.e. the selected scoring functions).

consensus_scores.txt
    Lists the names, scores and SMILES representations of the top 5000 docked ligands, in a text format, as ranked by a consensus score using PLP, Chemgauss3, OEChemscore and Screenscore (i.e. the selected scoring functions).

6.5 Using MASC

This section has examples of using the MASC (see section 4.3.12) variants of the scoring functions available in FRED. MASC requires that ligands be specially prepared by docking them to a set of reference sites and storing the resulting scores with the ligands. If FRED detects that the input ligands do not have the required MASC data, it does this automatically using a standard set of reference sites built into FRED. Calculating the MASC data is computationally expensive, since each ligand must be docked to the reference sites (of which there are 12 in FRED's standard set) before docking to the target receptor. However, this calculation needs to be done only once for a given set of ligands, and the same data can be re-used when those ligands are docked to any number of different targets (in which case the computational cost of using MASC is negligible after the initial calculation).

6.5.1 Scoring with MASC

Using a single MASC scoring function

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -chemgauss3_masc


Description

Docks my_ligands.oeb.gz to my_receptor.oeb.gz and outputs the top 1000 molecules as scored by the MASC variant of the Chemgauss3 scoring function.

The contents of my_ligands.oeb.gz affect how the run proceeds as follows.

my_ligands.oeb.gz contains:

a standard set of multiconformer molecules
    FRED will automatically generate the needed MASC data by docking the ligands to 12 internal reference sites before docking the ligands to my_receptor. The run time of the job will be approximately 13-fold longer than that of a standard FRED run because FRED must dock each ligand 13 times (12 reference sites + the target protein). FRED will also output a copy of the my_ligands.oeb.gz file to masc_ligands.oeb.gz with all the calculated MASC data tagged to it. The masc_ligands.oeb.gz file can then be used in future runs against this or other targets when the MASC variant of Chemgauss3 is used (it may also be used for standard non-MASC scoring).

multiconformer molecules with the Chemgauss3 MASC data
    FRED will use the existing MASC data and will not need to dock the molecules to the reference sites. Run time will be the same as for a normal run using Chemgauss3.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand. It may or may not contain MASC data. The run will proceed differently depending on whether MASC data is present; see the description above.

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-chemgauss3_masc
    Tells FRED to score ligands with the MASC variant of Chemgauss3.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.


masc_chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by the MASC variant of Chemgauss3.

masc_chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by the MASC variant of Chemgauss3, in a text format.

Additionally, if my_ligands.oeb.gz did not contain the required MASC data, the following file will be written.

masc_ligands.oeb.gz
    A copy of the my_ligands.oeb.gz file that also holds the MASC data FRED calculated. The masc_ligands.oeb.gz file contains all the conformers and data of the original my_ligands.oeb.gz file and can be used in place of the original file in any program.

6.5.2 Using multiple MASC and other scoring functions

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -plp_masc \
         -plp \
         -chemgauss3_MASC \
         -oechemscore

Description

my_ligands.oeb.gz will be docked to my_receptor.oeb.gz. Docked ligands will be scored and written to hitlists using PLP, the MASC variant of PLP, the MASC variant of Chemgauss3, and OEChemscore. Two consensus hitlists will also be maintained, one that uses the MASC variant scoring functions and one that uses the standard non-MASC scoring functions. If only one MASC or only one non-MASC scoring function were selected, that type of consensus hitlist would not be maintained.

The contents of my_ligands.oeb.gz affect how the run proceeds as follows.

my_ligands.oeb.gz contains:

a standard set of multiconformer molecules
    FRED will automatically generate the needed MASC data by docking the ligands to 12 internal reference sites before docking the ligands to my_receptor. The run time of the job will be approximately 13-fold longer than that of a standard FRED run because FRED must dock each ligand 13 times (12 reference sites + the target). FRED will also output a copy of the my_ligands.oeb.gz file to masc_ligands.oeb.gz with all the calculated MASC data tagged to it. The masc_ligands.oeb.gz file can then be used in future runs against this or other targets when the MASC variant of Chemgauss3 or PLP is used (you can also use it for standard non-MASC scoring).


multiconformer molecules with MASC data for only one of PLP and Chemgauss3
    The run will proceed as if the input molecules did not have any MASC data (i.e. docking to the reference sites will be required, and the run time will be long).

multiconformer molecules with PLP and Chemgauss3 MASC data
    FRED will use the existing MASC data and will not need to dock the molecules to the reference sites. Run time will be the same as for a normal run.

Parameters

-dbase my_ligands.oeb.gz
    Tells FRED where to find the ligands to dock, in this case the file my_ligands.oeb.gz. The ligand file should contain multiple conformers of every flexible ligand. It may or may not contain MASC data. The run will proceed differently depending on whether MASC data is present; see the description above.

-rec my_receptor.oeb.gz
    Specifies the receptor file to dock the ligands to. The receptor file must be created before running FRED with this command line, either by FRED itself or with the fred_receptor GUI.

-plp
    Tells FRED to score ligands with PLP.

-plp_masc
    Tells FRED to score ligands with the MASC variant of PLP.

-chemgauss3_masc
    Tells FRED to score ligands with the MASC variant of Chemgauss3.

-oechemscore
    Tells FRED to score ligands with OEChemscore.

Output

setup.txt
    Lists the settings of all of FRED's parameters. This file contains many more parameters than those given on the command line because many parameters have default values. It is written at the beginning of the run.

receptor.oeb.gz
    A copy of the receptor site used for the run. It is written at the beginning of the run.

status.txt
    Indicates the status of the run. FRED rewrites this file every few seconds.

plp_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by PLP.

plp_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by PLP, in a text format.

masc_plp_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by the MASC variant of PLP.


masc_plp_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by the MASC variant of PLP, in a text format.

masc_chemgauss3_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by the MASC variant of Chemgauss3.

masc_chemgauss3_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by the MASC variant of Chemgauss3, in a text format.

oechemscore_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by OEChemscore.

oechemscore_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by OEChemscore, in a text format.

masc_consensus_docked.oeb.gz
    Contains the structures of the top 100 docked ligands as ranked by consensus of MASC PLP and MASC Chemgauss3.

masc_consensus_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by MASC PLP and MASC Chemgauss3. The file is in a text format.

consensus_docked.oeb.gz
    Contains the structures of the top 100 docked ligands as ranked by consensus of PLP and OEChemscore.

consensus_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by PLP and OEChemscore. The file is in a text format.

Additionally, if my_ligands.oeb.gz did not contain the required MASC data, the following file will be written.

masc_ligands.oeb.gz
    A copy of the my_ligands.oeb.gz file that holds the MASC data FRED calculated. The masc_ligands.oeb.gz file contains all the conformers and data of the original my_ligands.oeb.gz file and can be used in place of the original file.

6.5.3 Preparing ligands

Command Line

    fred -dbase my_ligands.oeb.gz \
         -shapegauss_MASC \
         -plp_MASC \
         -chemgauss2_MASC \
         -chemgauss3_MASC \
         -chemscore_MASC \
         -oechemscore_MASC \
         -screenscore_MASC


Description

This command tells FRED to calculate Shapegauss, PLP, Chemgauss2, Chemgauss3, Chemscore, OEChemscore and Screenscore MASC data for all the ligands in my_ligands.oeb.gz by docking them to 12 internal reference sites (hence the total time for this process will be about 12 times that of a normal docking run).

A copy of my_ligands.oeb.gz will be written to masc_ligands.oeb.gz, which contains all the original information of my_ligands.oeb.gz (i.e. masc_ligands.oeb.gz can be used anywhere my_ligands.oeb.gz could be) plus MASC data for the scoring functions. The masc_ligands.oeb.gz file can then be used in future FRED runs against any target, and the use of MASC will not slow those runs down (for the scoring functions the data is calculated for).

Note that the run time for this calculation is roughly the same whether one or all of the scoring functions are used, so in general there is no reason not to prepare your ligands with as much MASC data as possible. One exception is Zapbind, whose calculations are lengthy, so calculating Zapbind MASC data will increase the run time significantly.

Parameters

-dbase my_ligands.oeb.gz

-shapegauss_masc Tells FRED to calculate Shapegauss MASC data.

-plp_masc Tells FRED to calculate PLP MASC data.

-chemgauss2_masc Tells FRED to calculate Chemgauss2 MASC data.

-chemgauss3_masc Tells FRED to calculate Chemgauss3 MASC data.

-chemscore_masc Tells FRED to calculate Chemscore MASC data.

-oechemscore_masc Tells FRED to calculate OEChemscore MASC data.

-screenscore_masc Tells FRED to calculate Screenscore MASC data.

Output

setup.txt
    A file with the settings of all of FRED's parameters. Note that there will be many more parameters in this file than those specified on the command line because many parameters have default values. This file will be written at the beginning of the run.

status.txt
    This file will be written every few seconds by FRED and indicates the status of the run.

masc_ligands.oeb.gz
    A copy of the my_ligands.oeb.gz file that also holds the MASC data FRED calculated. The masc_ligands.oeb.gz file contains all the conformers and data of the original my_ligands.oeb.gz file and can be used in place of the original file in any program.
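When the preparation run is launched from a script, the argument list can be assembled programmatically. The following is a minimal Python sketch, not part of FRED: the helper name masc_prep_command is hypothetical, and the flags are spelled in lower case as in the parameter reference of chapter 7 (the example command line above writes the suffix as _MASC). It only builds the argument list and does not start FRED.

```python
from typing import Iterable, List

# Scoring functions whose MASC data is prepared in the example above.
MASC_PREP_FUNCTIONS = (
    "shapegauss", "plp", "chemgauss2", "chemgauss3",
    "chemscore", "oechemscore", "screenscore",
)


def masc_prep_command(dbase: str,
                      functions: Iterable[str] = MASC_PREP_FUNCTIONS,
                      fred_exe: str = "fred") -> List[str]:
    """Return an argument list for a MASC preparation run (hypothetical helper)."""
    functions = list(functions)
    if not functions:
        raise ValueError("at least one scoring function is required")
    args = [fred_exe, "-dbase", dbase]
    for name in functions:
        # A boolean flag given without a value on the command line means true.
        args.append(f"-{name.lower()}_masc")
    return args


cmd = masc_prep_command("my_ligands.oeb.gz")
print(" \\\n     ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd) on a machine where the fred executable is on the PATH; that step is left out here because it depends on the local installation.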


6.5.4 Preparing ligands with your own reference sites

6.5.5 Preparing ligands

Command Line

    fred -dbase my_ligands.oeb.gz \
         -reference_receptors my_reference_receptors.txt \
         -shapegauss_MASC \
         -plp_MASC \
         -chemgauss2_MASC \
         -chemgauss3_MASC \
         -chemscore_MASC \
         -oechemscore_MASC \
         -screenscore_MASC

Description

This command tells FRED to calculate Shapegauss, PLP, Chemgauss2, Chemgauss3, Chemscore, OEChemscore and Screenscore MASC data for all the ligands in my_ligands.oeb.gz by docking them to the reference sites listed in the text file my_reference_receptors.txt.

A copy of my_ligands.oeb.gz will be written to masc_ligands.oeb.gz, which contains all the original information of my_ligands.oeb.gz (i.e. masc_ligands.oeb.gz can be used anywhere my_ligands.oeb.gz could be) plus MASC data for the scoring functions. The masc_ligands.oeb.gz file can then be used in future FRED runs against any target, and the use of MASC will not slow those runs down (for the scoring functions the data is calculated for).

Note that the run time for this calculation is roughly the same whether one or all of the scoring functions are used, so in general there is no reason not to prepare your ligands with as much MASC data as possible. One exception is Zapbind, whose calculations are lengthy, so calculating Zapbind MASC data will increase the run time significantly. Note that if you want to calculate MASC data for the ligand based scoring functions (i.e. CGO and CGT), your reference receptors must all contain bound ligands.

Parameters

-dbase my_ligands.oeb.gz

-reference_receptors my_reference_receptors.txt
    Tells FRED to use the receptors listed in my_reference_receptors.txt to calculate the MASC data. my_reference_receptors.txt is a text file with the name of one receptor file per line. The minimum number of receptor files is 3; 6-10 is recommended.

    Example reference file

        my_reference_rec1.oeb.gz
        my_reference_rec2.oeb.gz
        my_reference_rec3.oeb.gz
        my_reference_rec4.oeb.gz
        my_reference_rec5.oeb.gz
        my_reference_rec6.oeb.gz
        my_reference_rec7.oeb.gz
        my_reference_rec8.oeb
        my_reference_rec9.oeb

-shapegauss_masc Tells FRED to calculate Shapegauss MASC data.

-plp_masc Tells FRED to calculate PLP MASC data.

-chemgauss2_masc Tells FRED to calculate Chemgauss2 MASC data.

-chemgauss3_masc Tells FRED to calculate Chemgauss3 MASC data.

-chemscore_masc Tells FRED to calculate Chemscore MASC data.

-oechemscore_masc Tells FRED to calculate OEChemscore MASC data.

-screenscore_masc Tells FRED to calculate Screenscore MASC data.

Output

setup.txt
    A file with the settings of all of FRED's parameters. Note that there will be many more parameters in this file than those specified on the command line because many parameters have default values. This file will be written at the beginning of the run.

status.txt
    This file will be written every few seconds by FRED and indicates the status of the run.

masc_ligands.oeb.gz
    A copy of the my_ligands.oeb.gz file that also holds the MASC data FRED calculated. The masc_ligands.oeb.gz file contains all the conformers and data of the original my_ligands.oeb.gz file and can be used in place of the original file in any program.

6.6 Ligand+Structure based design

Two of FRED's scoring functions, CGO and CGT, are ligand based design scoring functions. That is, they score by measuring how well a pose matches a known bound ligand in the active site, rather than measuring how well the ligand interacts with the site itself. By using these two functions you can do ligand based design with FRED or, more interestingly, combine ligand and structure based design by using a combination of the structure based and ligand based scoring functions.
This section has several examples of using the ligand based design scoring functions.

Note that even when all the docking and scoring functions are ligand based design functions (i.e. CGO or CGT), information from the protein is still used during the exhaustive search to screen the initial ensemble of poses using the inner and outer contour, as well as any constraints. In other words, when you use CGO and CGT, FRED will still ensure that the poses do not clash with the protein and obey any constraints the user has specified.


6.6.1 Scoring with ligand based design

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -cgo \
         -cgt

Description

FRED will dock my_ligands.oeb.gz to my_receptor.oeb.gz and then score and output the resulting poses to a CGO and a CGT hitlist.

The docking process is still structure based in this example; only the final scoring is ligand based. See the next example for how to do ligand based docking and scoring.

Note that CGO and CGT are treated the same as any other scoring function by FRED, provided the receptor has a bound ligand.

Parameters

-dbase my_ligands.oeb.gz
    This tells FRED where the ligands it is going to dock are, in this case the file my_ligands.oeb.gz. The ligand file should have multiple conformers of all flexible ligands (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    This specifies the receptor file to dock the ligands to. The receptor file must be created prior to running FRED with this command line, and it must contain a bound ligand.

-cgo Tells FRED to score with CGO.

-cgt Tells FRED to score with CGT.

Output

setup.txt
    A file with the settings of all of FRED's parameters. Note that there will be many more parameters in this file than those specified on the command line because many parameters have default values. This file will be written at the beginning of the run.

receptor.oeb.gz
    This file contains a copy of the receptor site used for the run. This file will be written at the beginning of the run.

status.txt
    This file will be written every few seconds by FRED and indicates the status of the run.

cgo_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by CGO.

cgo_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by CGO, in a text format.


6.6.2 Docking and scoring with ligand based design

Command Line

    fred -dbase my_ligands.oeb.gz \
         -rec my_receptor.oeb.gz \
         -exhaustive_scoring cgo \
         -opt cgo \
         -pose_select_weight_shapegauss 0 \
         -pose_select_weight_plp 0 \
         -pose_select_weight_Chemgauss2 0 \
         -pose_select_weight_Chemgauss3 0 \
         -pose_select_weight_Chemscore 0 \
         -pose_select_weight_oeChemscore 0 \
         -pose_select_weight_screenscore 0 \
         -cgo \
         -cgt

Description

This command line tells FRED to dock my_ligands.oeb.gz to my_receptor.oeb.gz using the ligand based CGO scoring function and then score and output the ligands to CGT and CGO hitlists. Consensus structure is also disabled, since the purpose of this example is to maximize the amount of ligand based design and there is only one scoring function for consensus structure that is ligand based, CGO.

Parameters

-dbase my_ligands.oeb.gz
    This tells FRED where the ligands it is going to dock are, in this case the file my_ligands.oeb.gz. The ligand file should have multiple conformers of all flexible ligands (see section 5.5 for more detail on preparing the ligand database).

-rec my_receptor.oeb.gz
    This specifies the receptor file to dock the ligands to. The receptor file must be created prior to running FRED with this command line, and it must contain a bound ligand.

-exhaustive_scoring cgo
    Tells FRED to use CGO as the scoring function in the exhaustive search.

-opt cgo
    Tells FRED to optimize the candidate poses with CGO.

-pose_select_weight_shapegauss 0
    Sets the weight of Shapegauss to 0 in consensus structure.

-pose_select_weight_plp 0
    Sets the weight of PLP to 0 in consensus structure.


-pose_select_weight_Chemgauss2 0
    Sets the weight of Chemgauss2 to 0 in consensus structure.

-pose_select_weight_Chemgauss3 0
    Sets the weight of Chemgauss3 to 0 in consensus structure.

-pose_select_weight_Chemscore 0
    Sets the weight of Chemscore to 0 in consensus structure.

-pose_select_weight_oeChemscore 0
    Sets the weight of OEChemscore to 0 in consensus structure.

-pose_select_weight_screenscore 0
    Sets the weight of Screenscore to 0 in consensus structure.

-cgo Tells FRED to score with CGO.

-cgt Tells FRED to score with CGT.

Output

setup.txt
    A file with the settings of all of FRED's parameters. Note that there will be many more parameters in this file than those specified on the command line because many parameters have default values. This file will be written at the beginning of the run.

receptor.oeb.gz
    This file contains a copy of the receptor site used for the run. This file will be written at the beginning of the run.

status.txt
    This file will be written every few seconds by FRED and indicates the status of the run.

cgo_docked.oeb.gz
    Contains the structures of the top 1000 docked ligands as ranked by CGO.

cgo_scores.txt
    Lists the names, scores and SMILES representations of the top 1000 docked ligands as ranked by CGO, in a text format.
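The Output lists in this chapter follow a regular naming pattern, which a post-processing script can use to know which hitlist files to look for. The sketch below is an illustration only: the function name expected_hitlist_files is hypothetical, it assumes the underscore-separated names (for example cgo_docked.oeb.gz and cgo_scores.txt), and it infers names for scoring functions whose files are not listed explicitly above. It does not cover the consensus hitlists, setup.txt, status.txt, receptor.oeb.gz, or the effect of -prefix and -output_dir.

```python
from typing import Iterable, List


def expected_hitlist_files(scoring_flags: Iterable[str],
                           oformat: str = "oeb.gz") -> List[str]:
    """Guess the per-scoring-function hitlist file names for a run.

    Pattern taken from the Output lists of the examples: -plp gives
    plp_docked.oeb.gz and plp_scores.txt, while -plp_masc gives
    masc_plp_docked.oeb.gz and masc_plp_scores.txt.
    """
    files: List[str] = []
    for flag in scoring_flags:
        name = flag.lstrip("-").lower()
        if name.endswith("_masc"):
            # MASC variants are written with a leading "masc_" in the file name.
            name = "masc_" + name[: -len("_masc")]
        files.append(f"{name}_docked.{oformat}")
        files.append(f"{name}_scores.txt")
    return files


for filename in expected_hitlist_files(["-cgo", "-cgt"]):
    print(filename)
```

Checking for these names after a run is a cheap way to detect a run that stopped before writing its hitlists, but the list should be compared against an actual output directory before it is relied on.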


CHAPTER SEVEN

Parameters

This chapter documents all the command line flags FRED can accept.

7.1 Execute Options

-param A parameter file.
    Type param file.
    Default No default.

    A parameter file is a text file that lists parameter settings to be used during a run. The parameter settings are listed in key value pairs (i.e. ) in the file, with the following rules:

    1. One key-value pair per line.
    2. Blank lines and lines beginning with a # character are ignored.
    3. A parameter file parameter cannot appear in a parameter file (i.e. don't specify -param in your parameter file).
    4. If a parameter is specified both on the command line and in the parameter file, the value specified on the command line is used.
    5. Boolean parameters must be specified with a key-value pair. The shortcut that interprets a boolean key without a corresponding value as true on the command line is not valid in a parameter file. Accordingly "-x" may not be used as a shortcut for "-x true" for boolean parameters in the parameter file.

    Additionally, some parameters are restricted to specific ranges or values, as listed in the individual parameter's documentation.

-pvmconf A text file specifying a PVM configuration.
    Type file.


    Default No default.

    This flag specifies a text file that tells which hosts to launch PVM slaves on. If this flag is not specified the program runs in single processor mode.

    The format of the PVM configuration file is:

        host [ hostname]
        host [ hostname]

    Any number of host lines can be used. Lines not beginning with the keyword host are ignored. 'hostname' should exactly match the names of the hosts listed by the PVM daemon.

7.2 Input Ligands

-dbase File of multiconformer ligands.
    Type string.
    Default No default.

    Molecule format is determined by file extension. The list of recognized extensions includes but is not limited to oeb, oeb.gz, sdf, sdf.gz, mol2, mol2.gz, pdb, pdb.gz, .list and .lst. Both .list and .lst are text files which contain the names of one or more files to load in sequence.

    See Appendix B for a complete list of supported file formats.

-conftest Set the test for detecting if sequential molecule records in the ligand database are conformers.
    Type string.
    Default isomeric.
    Legal Values none, isomeric, absolute and canonical

    The settings for this flag have the following meanings:

    none : No conformer test is done; the database is treated as single conformer.

    isomeric : Molecules are conformers if they:
        1. Have the same numbers of atoms and bonds in the same order.
        2. Each atom and bond has identical properties with its order correspondent in the subsequent connection table.
        3. Have the same atom and bond stereochemistry.

    absolute :
        1. Have the same numbers of atoms and bonds in the same order.


        2. Each atom and bond has identical properties with its order correspondent in the subsequent connection table.

    canonical :
        1. Have the same absolute (non-isomeric) graph.

-molnames Tells FRED to only dock molecules with names specified in a text file.
    Type string.
    Default No default.

    This flag specifies a text file with the names of one or more molecules in the database specified with -dbase. If specified, FRED will ignore molecules in the database that do not have names matching those listed in the text file. If this flag is not specified, FRED will dock all molecules in the database(s) normally. Molecule names should be listed one per line in the text file.

    The general purpose of this flag is to provide an easy mechanism for docking a few specific molecules that are contained in a large database, without having to extract those molecules by hand from the database.

-assign_ligand_charges Assign AM1BCC charges to all input ligands.
    Type bool.
    Default false.

    If this flag is false (the default) any existing charges will be used. Charges are only required by FRED when using the Zapbind scoring function.

7.2.1 MASC Preparation

-reference_receptors Text file listing custom MASC reference receptor files.
    Alias -ref_rec
    Type string.
    Default No default.

    By default, when MASC data is calculated, ligands are docked to 12 standard reference sites. This flag allows users to substitute their own set of reference sites by passing a text file to this flag which lists the filenames of 3 or more reference sites to use (6-10 sites is a typical number to use). Each file should be on its own line. If CGO or CGT MASC data will be calculated then each reference receptor must have a bound ligand.

-no_masc_data_calc Don't calculate any MASC data for this run.
    Type bool


    Default false

    When this flag is set to true FRED will not calculate any MASC data for the input ligand database. If MASC scoring is used, any ligands without MASC data will fail to dock.

    The general use of this flag is for cases where the input database has MASC data for 99% of the molecules and you'd rather skip the 1% of ligands missing MASC data than do the MASC preparation on the entire database again.

-recalculate_masc_data Force re-calculation of MASC data on ligands with existing data.
    Type bool.
    Default false.

    By default existing MASC data on ligands is used if it is available. This flag tells FRED to re-calculate all MASC data that it needs, even if that data is already present on the molecule.

-report_masc_failures Report failure of ligands to dock to MASC reference structures.
    Type bool.
    Default false.

    By default FRED suppresses warning information about ligands not being able to dock to reference sites, because it is expected that not all ligands will be able to fit in all sites (FRED requires that each ligand fit in at least 3). This flag can be used to unsuppress those warnings.

7.3 Receptor Site

-rec Receptor site file molecules will be docked into.
    Type string.
    Default No default.
    Legal Values *.oeb and *.oeb.gz

    Receptor site files are always in .oeb or .oeb.gz format. A receptor file can be created either with the command line flags for creating a receptor (see flags -pro, -bound_ligand, -box and -addbox) or with the stand alone GUI app fred_receptor.

-pharm File of custom docking constraints.
    Alias -custom_constraints
    Type file.
    Default No default.


    A constraint file is used to define one or more constraint features that restrict what poses FRED will consider during the exhaustive search. See section 4.2.1 for more detail on how these constraints affect the docking process.

    FRED recognizes lines with the following format in the constraint file:

        NAME ""
        ENABLED ""
        SPHERE
        SMARTS

    Lines not beginning with SPHERE or SMARTS are ignored. is an integer value unique to a given constraint feature.

    A SPHERE line defines the center and radius of a sphere associated with a given feature. A SMARTS line defines a SMARTS pattern associated with a particular feature. The NAME line sets the name for the particular feature (and is optional). The ENABLED line specifies if the feature is enabled (this line is optional, and if it is not specified the feature is assumed to be enabled).

    All constraint features must have at least one sphere associated with them to be valid. A constraint feature need not have a SMARTS pattern associated with it, in which case any heavy atom will be allowed to match the feature. If one or more SMARTS patterns are associated with a constraint feature, only atoms matching at least one of the SMARTS patterns will be allowed to satisfy the constraint feature.

-assign_protein_charges Assign MMFF charges to receptor (otherwise accept input).
    Type bool.
    Default false.

    Note that if Zapbind scoring is selected and the receptor does not have partial charges assigned, FRED will automatically attempt to assign MMFF partial charges to the receptor.

7.3.1 Creating Receptor Site

-pro Protein molecule to convert into a receptor site.
    Type Molecule.
    Default No default.

    This protein should have any bound ligands present in the active site stripped before being passed to this flag.

    This flag is generally used in combination with the -bound_ligand, -box or -addbox flags.

-bound_ligand Known ligand bound to the protein.
    Type Molecule.
    Default No default.


    FRED will use the bound ligand in combination with a shape based detection method to determine the location and extents of the active site when setting up a receptor.

-box A box defining the receptor site.
    Type Molecule.
    Default No default.

    The box format is any molecular file format. The box created is always aligned along the x, y and z axes of the coordinate system, and its maximum and minimum x, y and z values will be the maximum and minimum x, y and z values of any heavy atom in the molecule file.

-addbox Adjusts the box created with the -box flag by extending all sides by this value.
    Type float.
    Default No default.

    This parameter can only be used in combination with the -box flag.

-no_inner_contour Create the receptor without an inner contour.
    Type bool
    Default false

7.4 Docking

7.4.1 Exhaustive Search

-no_dock Flag to skip the docking process.
    Alias -nodock
    Type bool.
    Default false.

    If this flag is turned on, FRED assumes the molecules in the input database have already been placed in the active site and skips the docking process, passing the molecules directly to refinement and rescoring. All other parameters in the Exhaustive Search category are ignored if this parameter is set to true.

-exhaustive_scoring Scoring function used during the exhaustive search.
    Alias -init_scr -exh_scr
    Type string.


    Default Chemgauss3.
    Legal Values shapegauss, plp, chemgauss2, chemgauss3 and cgo.

    The scoring function used during the exhaustive search must be grid based, hence not all the scoring functions available in the final scoring stage are available here.

-num_poses Maximum number of poses to generate per ligand.
    Type int.
    Default 100.
    Legal Range 1 to 10000.

    This flag controls the number of candidate poses generated by the exhaustive search. The number of alternate poses cannot exceed this value, although this is not the flag that tells FRED to output alternate poses (see -num_alt_poses).

-clash_scale Specifies a fraction of the sum of VdW radii that will be considered a clash.
    Alias -clash_checking.
    Type float.
    Default No default.
    Legal Range 0 to 1.0.

    If not specified, no explicit VdW clash checking is done during the exhaustive search. However, this is generally not required, as poses that fit within the outer contour of the active site shape potential will not clash with the protein.

7.4.2 Optimization

-opt Scoring function to do solid body optimization against.
    Alias -optimization.
    Type string.
    Default Chemgauss3.
    Legal Values none, shapegauss, plp, chemgauss2, chemgauss3, chemscore, oechemscore, screenscore and cgo.

7.4.3 Consensus Structure

-pose_select_weight_shapegauss Weight Shapegauss is given in the consensus pose selection.


    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_plp Weight PLP is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_chemgauss2 Weight Chemgauss2 is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_chemgauss3 Weight Chemgauss3 is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_chemscore Weight Chemscore is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_oechemscore Weight OEChemscore is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.

-pose_select_weight_screenscore Weight Screenscore is given in the consensus pose selection.
    Type int.
    Default 0.
    Legal Range 0 to inf.
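Because the seven weight flags share one naming pattern and one legal range, they are convenient to generate from a small table rather than type by hand. The following Python sketch is an illustration under stated assumptions: pose_select_weight_args is a hypothetical helper, and it only knows the seven flags documented in this section, so it rejects any other scoring function name.

```python
from typing import Dict, List

# Scoring functions with a documented -pose_select_weight_* flag in this section.
CONSENSUS_FUNCTIONS = (
    "shapegauss", "plp", "chemgauss2", "chemgauss3",
    "chemscore", "oechemscore", "screenscore",
)


def pose_select_weight_args(weights: Dict[str, int]) -> List[str]:
    """Turn {scoring function: weight} into -pose_select_weight_* arguments."""
    args: List[str] = []
    for name, weight in weights.items():
        key = name.lower()
        if key not in CONSENSUS_FUNCTIONS:
            raise ValueError(f"no consensus weight flag documented for {name!r}")
        # Type int, legal range 0 to inf; bool is excluded although it subclasses int.
        if isinstance(weight, bool) or not isinstance(weight, int) or weight < 0:
            raise ValueError(f"weight for {name!r} must be an integer >= 0")
        args += [f"-pose_select_weight_{key}", str(weight)]
    return args


# Set every documented weight to 0, as in the example of section 6.6.2.
zero_all = pose_select_weight_args({name: 0 for name in CONSENSUS_FUNCTIONS})
print(" ".join(zero_all))
```

Validating the range before the run starts gives an earlier and clearer error than a rejected command line partway through a batch job.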


7.4.4 Force Field Refinement

-refine Specifies a refinement method to use.
    Type string.
    Default no_refinement.
    Legal Values no_refinement and lig_mmff.

    This flag specifies what kind of refinement (optimization) should be performed on all the poses from the docking stage. Allowable settings are as follows:

    no_refinement No refinement is done.

    lig_mmff A full coordinate optimization of the ligand vs. the Merck Molecular Mechanics Force Field.

7.5 Scoring

7.5.1 Standard Scoring Functions

-shapegauss Score molecules with Shapegauss.
    Type bool.
    Default false.

-plp Score molecules with PLP.
    Type bool.
    Default false.

-chemgauss2 Score molecules with Chemgauss2.
    Type bool.
    Default false.

-chemgauss3 Score molecules with Chemgauss3.
    Type bool.
    Default false.

-chemscore Score molecules with Chemscore.
    Type bool.


    Default false.

-oechemscore Score molecules with OEChemscore.
    Type bool.
    Default false.

-screenscore Score molecules with Screenscore.
    Type bool.
    Default false.

-cgo Score molecules with CGO.
    Type bool.
    Default false.

    When scoring with CGO the receptor file must contain a bound ligand.

-cgt Score molecules with CGT.
    Type bool.
    Default false.

    When scoring with CGT the receptor file must contain a bound ligand.

7.5.2 MASC Scoring Functions

-shapegauss_masc Score molecule with the MASC variant of Shapegauss.
    Alias -masc_shapegauss.
    Type bool.
    Default false.

-plp_masc Score molecule with the MASC variant of PLP.
    Alias -masc_plp.
    Type bool.
    Default false.

-chemgauss2_masc Score molecule with the MASC variant of Chemgauss2.
    Alias -masc_chemgauss2.
    Type bool.


    Default false.

-chemgauss3_masc Score molecule with the MASC variant of Chemgauss3.
    Alias -masc_chemgauss3.
    Type bool.
    Default false.

-chemscore_masc Score molecule with the MASC variant of Chemscore.
    Alias -masc_chemscore.
    Type bool.
    Default false.

-oechemscore_masc Score molecule with the MASC variant of OEChemscore.
    Alias -masc_oechemscore.
    Type bool.
    Default false.

-screenscore_masc Score molecule with the MASC variant of Screenscore.
    Alias -masc_screenscore.
    Type bool.
    Default false.

-cgo_masc Score molecule with the MASC variant of CGO.
    Alias -masc_cgo.
    Type bool.
    Default false.

-cgt_masc Score molecule with the MASC variant of CGT.
    Alias -masc_cgt.
    Type bool.
    Default false.

-zapbind_masc Score molecule with the MASC variant of Zapbind.
    Alias -masc_zapbind.
    Type bool.
    Default false.
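Every MASC scoring flag in this section has an alias with the word order reversed (for example -plp_masc and -masc_plp). Scripts that collect settings from several sources may find it useful to reduce both spellings to one. The function below is a small Python sketch of that idea; canonical_masc_flag is a hypothetical name, and the choice of the -<function>_masc spelling as the canonical one simply follows the order in which the flags are listed here.

```python
def canonical_masc_flag(flag: str) -> str:
    """Map the alias form -masc_<function> to the primary form -<function>_masc."""
    name = flag.lstrip("-").lower()
    if name.startswith("masc_"):
        # Move the "masc" marker from the front of the name to the end.
        name = name[len("masc_"):] + "_masc"
    return "-" + name


for spelling in ("-masc_plp", "-plp_masc", "-masc_zapbind"):
    print(spelling, "->", canonical_masc_flag(spelling))
```

Flags that do not start with masc_ pass through unchanged apart from lower-casing, so the function is safe to apply to a whole argument list of flag names.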


7.6 Output

-output_dir Directory to place all output files in.
    Alias -outputdir.
    Type string.
    Default No default.

    If this flag is not specified, output files will be placed in the current working directory.

-prefix Prefix applied to all output files.
    Type string.
    Default No default.

-serial Output all ligands as they are docked, do not keep sorted hitlists.
    Type bool.
    Default false.

-oformat File extension of the format for docked output structures.
    Type string.
    Default oeb.gz.

    This flag determines the extension and format of the docked structures. Recognized extensions include but are not limited to oeb, oeb.gz, sdf, sdf.gz, mol2 and mol2.gz.

    See the program documentation for a complete list of file formats.

-hitlist_size Size of the output hitlists.
    Alias -hitlistsize.
    Type int.
    Default 1000.
    Legal Range 1 to inf.

    This flag applies to all hitlists FRED outputs.

-num_alt_poses Number of alternate poses to output.
    Alias -numaltposes.
    Type int.
    Default 0.
    Legal Range 0 to inf.

    The number of alternate poses can be set to any value; however, the actual number of poses output will never exceed the number of candidate poses returned by the exhaustive search (which is limited by the -num_poses flag).
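The parameter file rules given for -param in section 7.1 can be checked before a run with a few lines of Python. This is a sketch of the stated rules, not FRED's own parser: the function names are hypothetical, and it assumes keys are written with their leading dash and separated from the value by whitespace, as the "-x true" wording of rule 5 suggests. It does not validate flag names, types or legal ranges.

```python
from typing import Dict, Iterable


def parse_param_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse parameter-file lines into {flag: value} following rules 1, 2, 3 and 5."""
    params: Dict[str, str] = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):   # rule 2: blanks and comments are ignored
            continue
        parts = line.split(None, 1)            # rule 1: one key-value pair per line
        if len(parts) != 2:                    # rule 5: no implicit "true" for a bare key
            raise ValueError(f"line {number}: {parts[0]!r} needs an explicit value")
        key, value = parts
        if key == "-param":                    # rule 3: -param may not appear in the file
            raise ValueError(f"line {number}: -param is not allowed in a parameter file")
        params[key] = value.strip()
    return params


def merge_with_command_line(file_params: Dict[str, str],
                            cli_params: Dict[str, str]) -> Dict[str, str]:
    """Rule 4: a value given on the command line wins over the parameter file."""
    merged = dict(file_params)
    merged.update(cli_params)
    return merged


example = """\
# docking settings
-dbase my_ligands.oeb.gz
-rec my_receptor.oeb.gz

-plp true
-hitlist_size 500
""".splitlines()

settings = merge_with_command_line(parse_param_lines(example), {"-hitlist_size": "100"})
for key, value in settings.items():
    print(key, value)
```

Keeping the merge separate from the parsing makes the precedence rule explicit: the file is read on its own terms first, and the command line overrides are applied afterwards.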


APPENDIX A

Release Notes

A.1 FRED 2.2.5 Change Log

1. The FRED licenser has been updated to work with licenses that expire past the year 2009.
2. The FRED licenser supports having the license file located in the user's home directory.
3. On Microsoft Windows platforms, the installer adds the ability to open command prompts that set up the user environment to run specific versions of FRED or the latest version of FRED.
4. SD files are now written with the 3D flag set.
5. Fixed an issue where FRED could crash if nodock is specified and the poses given to FRED to score are not within the receptor site.
6. FRED and FRED RECEPTOR are now versioned and shipped together.

A.2 FRED 2.2.4 Change Log

1. Fixed a crash when -refine and -no_dock were used together.
2. Fixed a crash when -clash_scale and a receptor without an inner contour are used.

A.3 FRED 2.2.3 Change Log

1. Fixed a bug when using two custom constraints that caused all molecules to fail to dock with a NoConstraintMatch code.


A.4 FRED 2.2.2 Change Log

1. Fixed a bug that caused the -pharm flag to be ignored.
2. Cleaned up a minor formatting issue when informing the user that a warning log has been opened.

A.5 FRED 2.2.1 Change Log

1. Important: Fixed a bug when using -pro and -box to create a receptor that caused the inner contour to be set to an extremely small value. The resulting receptor would produce extremely poor docking results without warning. In this release newly created receptors will have reasonable inner contours. Additionally, FRED now checks if the inner contour volume is extremely small (i.e., one that was produced by this bug), and if one is detected the inner contour is turned off and a warning is issued before proceeding with the run.
2. Fixed a bug when requesting alternate poses during a PVM run, which caused the run to shut down and not dock any molecules.
3. Fixed a bug in Screenscore when the receptor/protein it was initialized with did not have explicit hydrogens. The bug caused the initial setup to fail and the run to stop if Screenscore was used. The error reported when this bug occurred was "Error! Screenscore::FindAcceptors (OH)".
4. Fixed a crash bug when calculating MASC data for ligands on a 64bit machine.
5. Fixed a minor bug when reporting how many molecules in a database have MASC data. The percentage reported was erroneously divided by 100.
6. Fixed a bug when using both the MASC and non-MASC variants of a scoring function in the same run. The bug caused a shift in the non-MASC score related to the precalculated MASC data, while the MASC variant score was correctly calculated. The error was especially damaging to the CGT score.
7. Fixed a spelling error by changing the parameter -recaculate_masc_data to -recalculate_masc_data. The original misspelling is now an alias to minimize impact on users with existing parameter files.
8. Fixed a bug in checking for charges when -zapbind and -pro are used together. (It did not affect runs where -zapbind and -rec were being used.) The bug caused the run to stop.
9. Fixed a bug in the -clash_scale flag that caused the value passed to the flag to be ignored and a value of one to be used. If the -clash_scale flag was not used no bug occurred in this regard.
10. Silenced the warning

        OEInterface::Get, requesting value of unset parameter -addbox

    when using the -box parameter without also specifying -addbox. The warning was spurious; no error occurred when it was issued.


11. Corrected a deficiency in the Chemgauss3 metal term and metal constraints. The metal-chelator interaction functions were picking up on some but not all of the allowable geometries for metal-chelator interactions.
12. Improved Chemscore's, OEChemscore's and Screenscore's handling of rotatable hydrogens involved in hydrogen bonding by replacing the brute-force torsion-driving search for the optimal hydrogen position with an analytic solution for the best position.
13. Chemscore's, OEChemscore's and Screenscore's hydrogen bonding terms now allow only one hydrogen bond per hydrogen. Also prevented two hydroxyls from making both an acceptor-donor interaction and a donor-acceptor interaction.
14. Modified OEChemscore's hydrogen bonding term to be more forgiving of non-ideal geometries. The range of geometries considered ideal is unchanged.
15. Added a new flag, -no_masc_data_calc, which prevents FRED from calculating MASC data for ligands. Any ligands that are missing needed MASC data will not be docked. The purpose of this flag is to prevent FRED from doing a lengthy MASC calculation when all but a handful of ligands have the required MASC data.
16. Extended the initial list of atom types known to Chemgauss3. This should improve the startup speed of runs using Chemgauss3 by avoiding costly grid recalculations each time a new atom type is encountered. This change is only for speed; there is no change to the Chemgauss3 score value FRED calculates.
17. Added support for OEB rotor offset compression when writing the MASC tagged version of the input database (provided the initial input database used rotor offset compression).
18. Slaves of a multiprocessor run no longer require a license; only the master process requires a license now.
19. When using constraints, FRED now defaults to effectively using a -clash_scale value of 0.6 if the -clash_scale flag has not been set by the user. This helps eliminate unusually close protein-ligand contacts that can occasionally occur when constraints are used. The original behavior can be obtained by explicitly setting -clash_scale to 0. Runs without constraints are unaffected and behave as before.
20. Chemgauss2 is no longer used by default in the consensus pose stage of docking. This has no real effect on docking performance and reduces FRED's startup time.

A.6 Initial FRED 2.2 release

A.6.1 New Features / Improvements

1. Optional GUI setup and preparation of the active site (the actual docking remains command line). The GUI allows users to
   (a) Separate bound ligands and solvent molecules from the protein structure.
   (b) Detect active sites, and adjust the box defining the active site.
   (c) Manually tweak residue protonation states.
   (d) Visualize and adjust the complementary shapes FRED uses during the exhaustive search.
   (e) Specify constraints.
2. A new version of the Chemgauss scoring function, version 3, which includes new desolvation terms as well as improved typing.
3. Ligand-based design scoring functions, C.G.O. (Chemical Gaussian Overlay) and C.G.T. (Chemical Gaussian Tanimoto). These functions score by measuring how well a molecule's shape and chemistry overlay a known bound ligand placed in the active site.
4. An all-new algorithm for generating the negative image of the active site using molecular shape probes, as opposed to the atomic probes used earlier.
5. On-the-fly preparation of MASC data for runs using the MASC variant scoring functions is now an option. Pre-calculating MASC data is still available and is most efficient when doing multiple runs.

A.6.2 Changes

1. FRED now uses a special receptor file to describe the active site. This file can be created interactively using the new fred_receptor GUI program, or on the fly from the command line using the same flags as the previous version of FRED (2.1.x).
2. The functionality of the masc_prep and ligand_info programs distributed with the previous version (2.1.x) has been merged into the main fred executable.
3. Version 1 of Chemgauss has been removed. Version 2 is deprecated but still available.
4. The individual hitlist *size and *cut flags have been replaced by a single -hitlist_size flag.


APPENDIX B

File formats

FRED supports a wide range of input file formats for molecules. FRED determines the file format by parsing the extension of input files. The following is a list of recognized extensions:

B.1 Valid File Extensions for Both Reading and Writing

dat      Macromodel
ent      PDB
mdl      MDL Mol
mmd      Macromodel
mmod     Macromodel
mol2h    MOL2 with H
mol2     Tripos MOL2
mol      MDL Mol
mopac    MOPAC
oeb      OEBinary v2
pac      MOPAC
pdb      PDB
rxn      MDL Mol
sd       MDL SDF
sdf      MDL SDF
syb      Tripos MOL2
xyz      XYZ

B.2 Valid File Extensions for Reading

bin      OEBinary v1
rd       MDL RDF
rdf      MDL RDF

B.3 Valid File Extensions for Writing

smi      SMILES
can      Canonical SMILES
ism      Isomeric SMILES
mf       Molecular Formula
sln      Tripos SLN
fasta    FASTA
seq      FASTA
cdx      ChemDraw CDX
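The extension lookup described in this appendix can be sketched in a few lines of Python. This is only an illustration built from the tables in B.1 to B.3, not FRED's actual parser. The function name classify_extension is hypothetical, and two behaviors are assumptions: that only the final extension matters and that case is ignored. The sketch also reads B.2 as "reading only" and B.3 as "writing only".

```python
import os

# Extension tables copied from sections B.1-B.3 of this appendix.
READ_WRITE = {
    "dat": "Macromodel", "ent": "PDB", "mdl": "MDL Mol", "mmd": "Macromodel",
    "mmod": "Macromodel", "mol2h": "MOL2 with H", "mol2": "Tripos MOL2",
    "mol": "MDL Mol", "mopac": "MOPAC", "oeb": "OEBinary v2", "pac": "MOPAC",
    "pdb": "PDB", "rxn": "MDL Mol", "sd": "MDL SDF", "sdf": "MDL SDF",
    "syb": "Tripos MOL2", "xyz": "XYZ",
}
READ_ONLY = {"bin": "OEBinary v1", "rd": "MDL RDF", "rdf": "MDL RDF"}
WRITE_ONLY = {
    "smi": "SMILES", "can": "Canonical SMILES", "ism": "Isomeric SMILES",
    "mf": "Molecular Formula", "sln": "Tripos SLN", "fasta": "FASTA",
    "seq": "FASTA", "cdx": "ChemDraw CDX",
}


def classify_extension(filename):
    """Return (format_name, can_read, can_write), or None if unrecognized."""
    # Assumption: only the final extension matters and case is ignored.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext in READ_WRITE:
        return (READ_WRITE[ext], True, True)
    if ext in READ_ONLY:
        return (READ_ONLY[ext], True, False)
    if ext in WRITE_ONLY:
        return (WRITE_ONLY[ext], False, True)
    return None
```

A check such as classify_extension("hits.smi") can be used before a run to catch an output format that cannot be read back in as input.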


APPENDIX C

Glossary

active site: The area of the target protein that ligands will be docked into.

conformer: A unique arrangement of a ligand's atoms with respect to each other.

constraint feature: A user-defined set of at least one sphere and optionally one or more SMARTS patterns. Any pose that does not have an atom matching one of the SMARTS patterns (or any heavy atom if no SMARTS patterns are specified) within one of the spheres will be discarded by FRED. See section 4.2.1 for more detail.

consensus scoring: A method of using multiple scoring functions to rank one ligand against another.

consensus structure: A method of using multiple scoring functions to rank one pose of a chemical entity against another pose of the same chemical entity.

docking constraints: A set of one or more constraint features that restrict what poses FRED will examine during the docking process. See section 4.2.1 for more detail.

inner contour: A shape complementary to the active site, used during the exhaustive search. Any pose that does not have a heavy atom within this shape is discarded.

MASC: Multiple Active Site Correction.

negative image: This is the same as the outer contour.

optimization: In this document, optimization refers to solid body optimization against one of the available scoring functions.

outer contour: A shape complementary to the active site, used during the exhaustive search. Any pose that has a heavy atom falling outside this shape is discarded.

pose: A unique arrangement of a ligand's atoms within the active site.

PVM: Parallel Virtual Machine. A set of tools for distributing jobs over multiple machines and/or processors.

refinement: In this document, refinement refers to refining the molecule in the active site using a force field.
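The inner contour and outer contour entries above each state a discard rule, and the two rules can be combined into a single pose filter. The following Python sketch shows the logic only. The names keep_pose and sphere are hypothetical, and the spheres are stand-ins: the real contours are shapes complementary to the active site, not spheres. Passing no inner contour models a receptor whose inner contour is absent or turned off.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Shape = Callable[[Point], bool]  # True when the point lies inside the shape


def sphere(center: Point, radius: float) -> Shape:
    """Hypothetical stand-in shape; real contours are not simple spheres."""
    def contains(p: Point) -> bool:
        return math.dist(p, center) <= radius
    return contains


def keep_pose(heavy_atoms: Sequence[Point], outer: Shape,
              inner: Optional[Shape] = None) -> bool:
    """Apply the outer and inner contour rules from the glossary."""
    # Outer contour: discard if any heavy atom falls outside the shape.
    if any(not outer(atom) for atom in heavy_atoms):
        return False
    # Inner contour (optional): discard if no heavy atom lies inside it.
    if inner is not None and not any(inner(atom) for atom in heavy_atoms):
        return False
    return True
```

With an outer sphere of radius 5 and an inner sphere of radius 1 around the origin, a pose with atoms at (0.5, 0, 0) and (3, 0, 0) is kept, while a pose with an atom at (6, 0, 0) is discarded by the outer contour rule.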


APPENDIX D

Standard MASC reference sites

By default FRED uses the 12 proteins described in the original MASC paper [11]. Those structures have the following PDB codes:

1ABE    arabinose binding protein.
1AZM    carbonic anhydrase.
1CBX    carboxypeptidase A.
1EPB    retinoic acid binding protein.
1HYT    thermolysin.
1MRK    ribosome inactivating protein.
1PHF    cytochrome P450-CAM.
1POC    phospholipase A2.
1SRJ    streptavidin.
1TPP    trypsin.
4PHV    HIV-1 protease.
8DFR    dihydrofolate reductase.


APPENDIX E

FRED 1.2.10 to FRED 2.1 parameter dictionary

This appendix lists all the parameters in FRED 1.2.10 and their corresponding FRED 2.2 parameters (if any).

The following abbreviations are used in the dictionary:

NC : No Change. The parameter exists in both FRED 1.2.10 and FRED 2.1 and functions identically.
DE : Direct Equivalent. A parameter in FRED 1.2.10 functions identically to a parameter of a different name in FRED 2.1.
NE : No Equivalent. The FRED 1.2.10 parameter has no equivalent parameter in FRED 2.0.
HID : Hidden. This flag has been hidden from the standard command line interface and is now undocumented, but still exists.

E.1 Parameter Input

-param : NC

E.2 PVM

-pvmconf : NC
-pvmpass : NE
-slavelog : NE

E.3 Molecule Loader

-db : DE (-dbase)
-dbfiles : NE. However, the functionality is now part of -dbase, which can be passed a file with a .list or .lst extension that FRED will assume to be a text file listing ligand files to dock.
-combine_conf : This flag has been replaced by -conftest, which has several options for testing whether two molecule records are conformers.
-molnames : NC

E.4 Output

-Nreturn : DE (-hitlist_size), except that the new flag does not accept size 0 (although the -serial flag can be used to achieve the same goal).
-pref : DE (-prefix)
-docked_format : DE (-oformat)
-alt_poses : NE, although the equivalent output is automatically written if -num_alt_poses is non-zero.
-write_mol : NE
-write_scores : NE
-write_pose_scores : NE, although the equivalent output is automatically written if -num_alt_poses is set to a non-zero value.
-write_energy : NE
-write_recfile : NE
-write_box : NE
-write_pharm : NE

E.5 Omega

Note : On-the-fly generation of conformers with Omega is not available in FRED 2.1.

-use_omega : NE
-maxconfs : NE
-maxrot : NE
-ewindow : NE
-rms : NE
-torlib : NE
-ringlib : NE
-from2d : NE
-maxpool : NE

E.6 Receptor

-recfile : NE
-pro : NC
-box : NC
-addbox : NC
-ref : NE

E.6.1 FRED

-no_dock : NC

Exhaustive Search

-tstep : HID
-rstep : HID

Poses Returned

-NScore : DE (-num_poses)
-cut : NE

Pharmacophore

-pharm : NC

Shapefit Grid

Note : FRED 2.0 uses a chemically aware Gaussian scoring function during the exhaustive search, rather than the Gaussian shape-only function that FRED 1.2.10 uses.

-kappa : NE
-gamma : NE
-rad : NE

Inclusion Grid

-excvol : NE
-excrad : NE

Advanced Parameters

-res : NE. Grid resolution during the exhaustive search is now always one third the translational step size.
-gridbuf : NE
-max_hev : NE
-bb : NE
-cube_scr : NE
-temp : NE
-Eo : NE

E.6.2 Sequential Screen

-smethod : NE. FRED 1.2.10 allowed only one final scoring function; FRED 2.1 allows multiple. They can be selected with the flags listed in section 7.5.
-custom_score_config : NE

Notes about screens

FRED 2.2 replaces the concept of three serial screens in FRED 1.2.10 with a single refinement step that is applied to the results of the exhaustive search, followed by parallel scoring of the refined poses.

ScreenA

-use_screenA : NE
-scoreA : NE
-OHRotA : NE
-sb_optA : NE
-tor_optA : NE
-clusterA : NE
-NcutA : NE

ScreenB

-use_screenB : NE
-scoreB : NE
-OHRotB : NE
-sb_optB : NE
-tor_optB : NE
-clusterB : NE
-NcutB : NE

ScreenC

-use_screenC : NE
-scoreC : NE
-OHRotC : NE
-sb_optC : NE
-tor_optC : NE
-clusterC : NE
-NcutC : NE
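When migrating old parameter files, the NC / DE / NE / HID codes in this appendix can be applied mechanically. The Python sketch below encodes a subset of the dictionary as a lookup table. The function name translate_flag and the "UNKNOWN" status are hypothetical, and the underscores in multi-word flag names are an assumption about the original spelling. The table is deliberately incomplete and should be extended from the entries above.

```python
# A subset of the Appendix E dictionary: flag -> (status, new name or None).
# Underscores in flag names are an assumption about the original spelling.
DICTIONARY = {
    "-param": ("NC", None),
    "-pvmconf": ("NC", None),
    "-pvmpass": ("NE", None),
    "-db": ("DE", "-dbase"),
    "-molnames": ("NC", None),
    "-Nreturn": ("DE", "-hitlist_size"),
    "-pref": ("DE", "-prefix"),
    "-docked_format": ("DE", "-oformat"),
    "-recfile": ("NE", None),
    "-pro": ("NC", None),
    "-box": ("NC", None),
    "-addbox": ("NC", None),
    "-tstep": ("HID", None),
    "-rstep": ("HID", None),
    "-NScore": ("DE", "-num_poses"),
    "-pharm": ("NC", None),
}


def translate_flag(old_flag):
    """Return (new_flag, status); new_flag is None when nothing carries over."""
    status, new_name = DICTIONARY.get(old_flag, ("UNKNOWN", None))
    if status == "DE":
        return (new_name, status)  # renamed flag with identical function
    if status in ("NC", "HID"):
        return (old_flag, status)  # same name; HID flags are undocumented
    return (None, status)          # NE, or flag not in this subset
```

Note that a DE translation does not capture every caveat: for example, -Nreturn maps to -hitlist_size, but the new flag does not accept size 0, so values still need to be reviewed by hand.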


APPENDIX F

FRED 2.0 to FRED 2.2 parameter dictionary

The following parameters that existed in FRED 2.0 have changed in FRED 2.2.

-scdbase : This parameter has been replaced by the -conftest flag. Setting -conftest none is now equivalent to -scdbase false, and -conftest isomeric is now equivalent to -scdbase true.

-refine : This flag is now used only to specify force field refinement. In FRED 2.0 this flag was also used to specify solid body optimization against a scoring function; that functionality has been assumed by the new -optimization flag.

-output_alt_scores : This parameter has been removed. Alternate scores are now written automatically if -output_alt_scores is true and -num_alt_poses is set to a non-zero value.

-output_alt_structs : This parameter has been removed. Alternate docked structures are now written automatically if -output_alt_structs is true and -num_alt_poses is set to a non-zero value.
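The -scdbase to -conftest replacement is the one change in this appendix with an exact value mapping, so it can be applied automatically to an old argument list. The Python sketch below does that. The function name migrate_scdbase is hypothetical, and the sketch assumes that the flag and its value are separate tokens and that the value is spelled true or false.

```python
from typing import List

# Mapping stated in the -scdbase entry of this appendix.
SCDBASE_TO_CONFTEST = {"false": "none", "true": "isomeric"}


def migrate_scdbase(args: List[str]) -> List[str]:
    """Rewrite '-scdbase <bool>' pairs as '-conftest <mode>'; copy the rest."""
    out = []
    i = 0
    while i < len(args):
        if args[i] == "-scdbase":
            if i + 1 >= len(args):
                raise ValueError("-scdbase is missing its value")
            value = args[i + 1].lower()
            if value not in SCDBASE_TO_CONFTEST:
                raise ValueError("unexpected -scdbase value: " + args[i + 1])
            out += ["-conftest", SCDBASE_TO_CONFTEST[value]]
            i += 2  # skip the flag and its value
        else:
            out.append(args[i])
            i += 1
    return out
```

The -refine change is not handled here because it depends on intent: an old -refine setting may need to become either -refine or -optimization, which cannot be decided from the flag alone.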


APPENDIX G

FRED 2.1 to FRED 2.2 parameter dictionary

The following parameters from FRED 2.1 have been removed from the FRED 2.2 documented interface, but still exist as undocumented features for expert users.

-consensus_struct_by_product
-consensus_struct_by_score
-rstep
-tstep
-set_shapegauss_gamma
-set_shapegauss_kappa
-scale_plp_clash
-scale_plp_hb
-scale_plp_metal
-scale_plp_steric
-scale_plp_sulphur
-include_plp_desolvation_penalty
-scale_chemgauss2_aromatic
-scale_chemgauss2_hb
-scale_chemgauss2_metal
-scale_chemgauss2_steric
-scale_chemscore_hb
-scale_chemscore_lipophilic
-scale_chemscore_metal
-scale_chemscore_rb
-set_chemscore_use_oelib_rad
-set_chemscore_use_oelib_typing
-scale_screenscore_ambiguous
-scale_screenscore_aromatic
-scale_screenscore_clash
-scale_screenscore_hb
-scale_screenscore_lipo
-scale_screenscore_metal
-scale_screenscore_plp
-scale_screenscore_rb
-consensus_size
-consensus_pool_size
-sort_by_top_consensus_pose

The following parameters from FRED 2.1 have been removed altogether from FRED 2.2.

-sqrt_poses
-neg_img_size
-output_scores
-output_structs
-shapegauss_size
-shapegauss_cut
-shapegauss_consensus_score_weight
-plp_size
-plp_cut
-plp_consensus_score_weight
-chemgauss_size
-chemgauss_cut
-chemgauss_consensus_score_weight
-chemgauss2_size
-chemgauss2_cut
-chemgauss2_consensus_score_weight
-chemscore_size
-chemscore_cut
-chemscore_consensus_score_weight
-screenscore_size
-screenscore_cut
-screenscore_consensus_score_weight
-zapbind_size
-zapbind_cut
-zapbind_consensus_score_weight
-shapegauss_masc_size
-shapegauss_masc_cut
-shapegauss_masc_consensus_score_weight
-plp_masc_consensus_score_weight
-plp_masc_size
-plp_masc_cut
-chemgauss_masc_size
-chemgauss_masc_cut
-chemgauss_masc_consensus_score_weight
-chemgauss2_masc_size
-chemgauss2_masc_cut
-chemgauss2_masc_consensus_score_weight
-chemscore_masc_size
-chemscore_masc_cut
-chemscore_masc_consensus_score_weight
-screenscore_masc_size
-screenscore_masc_cut
-screenscore_masc_consensus_score_weight
-zapbind_masc_size
-zapbind_masc_cut
-zapbind_masc_consensus_score_weight


APPENDIX H

Accessing scores in oeb files from OEChem

H.1 Overview

Docked molecule structures that are output in oeb format have the scores of each pose tagged to them. This score data can be accessed by users who have OEChem. When using oeb output, the molecules written to the file are OEGraphMol objects, while the molecules written to the alternate structure file are OEMol objects. Scores are tagged to the generic data (of type float) of the OEGraphMol, or of the OEMol's conformers.

H.2 Tags

The tags are as follows:

"Shapegauss"  Total Shapegauss score.

"PLP"  Total PLP score.
The total PLP score is a sum of contributions from different ligand atoms. That data is also attached with the following generic data tags.
    "PLP HB"  Contribution from hydrogen bond atoms.
    "PLP NP"  Contribution from non-polar atoms.
    "PLP METAL"  Contribution from metal atoms on the ligand. The contribution from metal atoms of the protein is part of the PLP HB component.
    "PLP sulfur"  Contribution from sulfur atoms.

"Chemgauss2"  Total Chemgauss2 score.
The total Chemgauss2 score is a sum of contributions from different components. That data is attached with the following generic data tags:
    "Chemgauss2 Steric"  Contribution from shape complementarity between the ligand and protein.
    "Chemgauss2 Acc/Metal"  Contribution from acceptors on the ligand interacting with donors and metals on the protein.
    "Chemgauss2 Donor"  Contribution from donors on the ligand interacting with acceptors on the protein.
    "Chemgauss2 Aromatic"  Aromatic interactions.

"Chemgauss3"  Total Chemgauss3 score.
The total Chemgauss3 score is a sum of contributions from different components. That data is attached with the following generic data tags:
    "Chemgauss3 Steric"  Steric interactions.
    "Chemgauss3 ProDesolv"  Penalty for displacing water from the site.
    "Chemgauss3 LigDesolv"  Penalty for breaking the ligand's hydrogen bond(s) with water.
    "Chemgauss3 Acc"  Interactions between ligand acceptors and donors of the protein.
    "Chemgauss3 Donor"  Interactions between ligand donors and acceptors of the protein.
    "Chemgauss3 Aromatic"  Aromatic interactions between the ligand and protein.
    "Chemgauss3 Metal"  Interaction of ligand chelators with a metal in the site.

"Chemscore"  Total Chemscore score.
The total Chemscore score is a sum of contributions from different components. That data is attached with the following generic data tags.
    "Chemscore RB"  Rotatable bond penalty.
    "Chemscore LIPO"  Contribution from lipophilic interactions.
    "Chemscore METAL"  Contribution from acceptor-metal interactions.
    "Chemscore HB"  Contribution from hydrogen bonds.
    "Chemscore Clash"  Clash penalty.

"OEChemscore"  Total OEChemscore score.
The total OEChemscore score is a sum of contributions from different components. That data is attached with the following generic data tags.
    "OEChemscore LIPO"  Contribution from lipophilic interactions.
    "OEChemscore METAL"  Contribution from acceptor-metal interactions.
    "OEChemscore HB"  Contribution from hydrogen bonds.
    "OEChemscore Clash"  Clash penalty.

"Screenscore score"  Total Screenscore score.
The total Screenscore score is a sum of contributions from different components. That data is attached with the following generic data tags.
    "Screenscore RB"  Rotatable bond penalty.
    "Screenscore LIPO"  Lipophilic interactions.
    "Screenscore AMBIG"  Ambiguous interactions.
    "Screenscore CLASH"  Clash penalty.
    "Screenscore PLP"  Contribution from PLP.
    "Screenscore HB"  Hydrogen bonds.
    "Screenscore METAL"  Metal acceptor interactions.
    "Screenscore AROMATIC"  Aromatic interactions.

"CGO"  Total CGO score.
The total CGO score is a sum of contributions from different components. That data is attached with the following generic data tags.
    "CGO Shape"  Overlap of molecular shape.
    "CGO Donor"  Overlap of "polar hydrogen" positions.
    "CGO Acceptor"  Overlap of "lone pair" positions.
    "CGO Chelator"  Overlap of "chelator" positions.
    "CGO Aromatic"  Overlap of aromatic "ring positive" and "ring negative" positions.

"CGT"  Total CGT score.
Unlike all other scoring functions in FRED, the components of CGT do not sum to the total score. The individual components, however, are calculated and reported as follows.
    "CGO Shape"  Tanimoto of molecular shape.
    "CGO Donor"  Tanimoto of "polar hydrogen" positions.
    "CGO Acceptor"  Tanimoto of "lone pair" positions.
    "CGO Chelator"  Tanimoto of "chelator" positions.
    "CGO Aromatic"  Tanimoto of aromatic "ring positive" and "ring negative" positions.

If a particular type of scoring was not performed, the generic data associated with that score will not be present.
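The tag layout above can be exercised with a short Python sketch that collects whichever scores are present and checks that components add up to a total. This is not OEChem code. It is duck-typed against any object offering HasData(tag) and GetData(tag) accessors, which is an assumption about how generic data is exposed. FakeMol is a hypothetical stand-in so the example runs without OEChem or an oeb file. The tag strings are copied as printed in this appendix and should be verified against a real output file.

```python
# Chemgauss3 component tags, copied as printed in section H.2.
CHEMGAUSS3_COMPONENTS = [
    "Chemgauss3 Steric", "Chemgauss3 ProDesolv", "Chemgauss3 LigDesolv",
    "Chemgauss3 Acc", "Chemgauss3 Donor", "Chemgauss3 Aromatic",
    "Chemgauss3 Metal",
]


class FakeMol:
    """Hypothetical stand-in for a molecule or conformer with generic data."""

    def __init__(self, data):
        self._data = dict(data)

    def HasData(self, tag):
        return tag in self._data

    def GetData(self, tag):
        return self._data[tag]


def read_scores(mol, tags):
    """Return {tag: value} for only those tags actually present on mol."""
    return {t: float(mol.GetData(t)) for t in tags if mol.HasData(t)}


def components_match_total(mol, total_tag, component_tags, tol=1e-3):
    """True/False if the total is present; None if that scoring was not run."""
    if not mol.HasData(total_tag):
        return None
    parts = read_scores(mol, component_tags)
    return abs(sum(parts.values()) - float(mol.GetData(total_tag))) <= tol
```

Checking HasData before GetData matters because, as stated above, the generic data for a score is simply absent when that scoring function was not used. The sum check applies to every function listed here except CGT, whose components do not sum to the total.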


BIBLIOGRAPHY

[1] Jonas Boström, “Reproducing the Conformations of Protein-bound Ligands: A Critical Evaluation of Several Popular Conformational Searching Tools”, Journal of Computer-Aided Molecular Design (JCAMD), Vol. 15, No. 12, pp. 1137–1152, 2001.

[2] Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek and Vaidy Sunderam, “PVM - Parallel Virtual Machine: A User’s Guide and Tutorial for Networked Parallel Computing”, MIT Press, 1994.

[3] J. Andrew Grant, Barry T. Pickup and Anthony Nicholls, “A Smooth Permittivity Function for Poisson-Boltzmann Solvation Methods”, Journal of Computational Chemistry, Vol. 22, No. 6, pp. 608–640, 2001.

[4] Mark R. McGann, Harold R. Almond, Anthony Nicholls, J. Andrew Grant and Frank K. Brown, “Gaussian Docking Functions”, Biopolymers, Vol. 68, pp. 76–90, 2003.

[5] Araz Jakalian, Bruce L. Bush, David B. Jack and Christopher I. Bayly, “Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: I: Method”, Journal of Computational Chemistry, Vol. 21, pp. 132–146, 2000.

[6] Araz Jakalian, David B. Jack and Christopher I. Bayly, “Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: II: Parameterization and Validation”, Journal of Computational Chemistry, Vol. 23, pp. 1623–1641, 2002.

[7] Gennady M. Verkhivker, Djamal Bouzida, Daniel K. Gehlhaar, Paul A. Rejto, Sandra Arthurs, Anthony B. Colson, Stephan T. Freer, Veda Larson, Brock A. Luty, Tami Marrone and Peter W. Rose, “Deciphering common failures in molecular docking of ligand-protein complexes”, Journal of Computer-Aided Molecular Design (JCAMD), Vol. 14, pp. 731–751, 2000.

[8] Matthew D. Eldridge, Christopher W. Murray, Timothy R. Auton, Gaia V. Paolini and Roger P. Mee, “Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes”, Journal of Computer-Aided Molecular Design (JCAMD), Vol. 11, pp. 425–445, 1997.

[9] Martin Stahl and Matthias Rarey, “Detailed Analysis of Scoring Functions for Virtual Screening”, Journal of Medicinal Chemistry, Vol. 44, pp. 1035–1042, 2001.

[10] Tanja Schulz-Gasch and Martin Stahl, “Binding Site Characteristics in Structure-Based Virtual Screening: Evaluation of Current Docking Tools”, Journal of Molecular Modeling, Vol. 9, pp. 47–57, 2003.

[11] G.P. Vigers and J.P. Rizzi, “Multiple Active Site Corrections for Docking and Virtual Screening”, Journal of Medicinal Chemistry, Vol. 47, pp. 80–89, 2004.
