Prime User Manual - ISP
Prime User Manual - ISP
Prime User Manual - ISP
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
<strong>Prime</strong> 2.1<br />
<strong>User</strong> <strong>Manual</strong><br />
Schrödinger Press
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong> Copyright © 2009 Schrödinger, LLC. All rights reserved.<br />
While care has been taken in the preparation of this publication, Schrödinger<br />
assumes no responsibility for errors or omissions, or for damages resulting from<br />
the use of the information contained herein.<br />
Canvas, CombiGlide, ConfGen, Epik, Glide, Impact, Jaguar, Liaison, LigPrep,<br />
Maestro, Phase, <strong>Prime</strong>, <strong>Prime</strong>X, QikProp, QikFit, QikSim, QSite, SiteMap, Strike, and<br />
WaterMap are trademarks of Schrödinger, LLC. Schrödinger and MacroModel are<br />
registered trademarks of Schrödinger, LLC. MCPRO is a trademark of William L.<br />
Jorgensen. Desmond is a trademark of D. E. Shaw Research. Desmond is used with<br />
the permission of D. E. Shaw Research. All rights reserved. This publication may<br />
contain the trademarks of other companies.<br />
Schrödinger software includes software and libraries provided by third parties. For<br />
details of the copyrights, and terms and conditions associated with such included<br />
third party software, see the Legal Notices for Third-Party Software in your product<br />
installation at $SCHRODINGER/docs/html/third_party_legal.html (Linux OS) or<br />
%SCHRODINGER%\docs\html\third_party_legal.html (Windows OS).<br />
This publication may refer to other third party software not included in or with<br />
Schrödinger software ("such other third party software"), and provide links to third<br />
party Web sites ("linked sites"). References to such other third party software or<br />
linked sites do not constitute an endorsement by Schrödinger, LLC. Use of such<br />
other third party software and linked sites may be subject to third party license<br />
agreements and fees. Schrödinger, LLC and its affiliates have no responsibility or<br />
liability, directly or indirectly, for such other third party software and linked sites,<br />
or for damage resulting from the use thereof. Any warranties that we make<br />
regarding Schrödinger products and services do not apply to such other third party<br />
software or linked sites, or to the interaction between, or interoperability of,<br />
Schrödinger products and services and such other third party software.<br />
June 2009
Contents<br />
Document Conventions ..................................................................................................... ix<br />
Chapter 1: Introduction ....................................................................................................... 1<br />
1.1 The <strong>Prime</strong> Suite in Maestro ..................................................................................... 1<br />
1.2 <strong>Prime</strong> Documentation............................................................................................... 2<br />
1.3 Running Schrödinger Software .............................................................................. 3<br />
1.4 Citing <strong>Prime</strong> in Publications.................................................................................... 3<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction .................................................... 5<br />
2.1 <strong>Prime</strong> Runs and Maestro Projects.......................................................................... 5<br />
2.2 General Panel Layout................................................................................................ 7<br />
2.3 The <strong>Prime</strong> Menu Bar ................................................................................................. 9<br />
2.3.1 The File Menu ..................................................................................................... 9<br />
2.3.2 The Edit Menu................................................................................................... 10<br />
2.3.3 The Display Menu ............................................................................................. 10<br />
2.3.4 The Step Menu ................................................................................................. 12<br />
2.4 The <strong>Prime</strong> Toolbar ................................................................................................... 13<br />
2.5 <strong>Prime</strong> Job Options and Control ............................................................................ 14<br />
2.5.1 The Job Options Dialog Box ............................................................................. 14<br />
2.5.2 Setting Up and Launching Jobs ........................................................................ 14<br />
2.5.3 Monitoring Jobs................................................................................................. 15<br />
2.5.4 Adding Structures to the Project Table.............................................................. 16<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps ................................... 17<br />
3.1 Input Sequence Step............................................................................................... 18<br />
3.2 Find Homologs Step ............................................................................................... 19<br />
3.2.1 Importing Homologs.......................................................................................... 19<br />
3.2.2 Searching for Homologs.................................................................................... 20<br />
3.2.3 Find Homologs Search Options ........................................................................ 21<br />
3.2.4 Search Results.................................................................................................. 22<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> iii
iv<br />
Contents<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
3.2.5 Multiple Templates and Structure Alignment..................................................... 23<br />
3.2.6 Continuing to the Next Step .............................................................................. 25<br />
3.2.7 Saving Runs With Different Templates.............................................................. 25<br />
3.2.8 Error Messages................................................................................................. 25<br />
Chapter 4: Comparative Modeling: Edit Alignment....................................... 27<br />
4.1 The Comparative Modeling Path .......................................................................... 27<br />
4.2 Overview of the Edit Alignment Step................................................................... 27<br />
4.3 Initial and Imported Alignments ........................................................................... 29<br />
4.3.1 Accepting the BLAST/PSI-BLAST Alignment ................................................... 29<br />
4.3.2 Exporting an Alignment..................................................................................... 29<br />
4.3.3 Importing an Alignment..................................................................................... 29<br />
4.4 Secondary Structure Predictions (SSPs)............................................................ 29<br />
4.4.1 Generating SSPs .............................................................................................. 29<br />
4.4.2 Exporting and Importing SSPs.......................................................................... 30<br />
4.4.3 Deleting or Editing an SSP ............................................................................... 31<br />
4.4.4 Resetting SSPs................................................................................................. 31<br />
4.5 Generated and Modified Alignments ................................................................... 31<br />
4.5.1 Running the Align Program............................................................................... 31<br />
4.5.2 Align Program Technical Details ....................................................................... 33<br />
4.5.3 <strong>Manual</strong>ly Editing the Alignment......................................................................... 33<br />
4.5.4 Updating the Step View Table ........................................................................... 34<br />
4.6 Error Messages........................................................................................................ 34<br />
Chapter 5: Comparative Modeling: Build Structure ...................................... 35<br />
5.1 The Build Process ................................................................................................... 35<br />
5.2 Preparing to Build ................................................................................................... 36<br />
5.2.1 Multiple Templates (Set Template Regions)...................................................... 36<br />
5.2.2 Including Ligands and Cofactors....................................................................... 37<br />
5.2.3 Build Options..................................................................................................... 37<br />
5.2.4 Omitting Structural Discontinuities .................................................................... 37
Contents<br />
5.3 Running the Build Structure Job.......................................................................... 38<br />
5.4 Saving and Exporting Built Structures ............................................................... 40<br />
5.5 Technical Notes for the Build Structure Step..................................................... 40<br />
Chapter 6: Fold Recognition ......................................................................................... 43<br />
6.1 Overview of the Fold Recognition Step .............................................................. 43<br />
6.2 Producing an SSP Profile ...................................................................................... 43<br />
6.2.1 Importing and Exporting SSPs.......................................................................... 43<br />
6.2.2 Generating SSPs .............................................................................................. 44<br />
6.2.3 Editing or Deleting SSPs................................................................................... 45<br />
6.2.4 Resetting SSPs................................................................................................. 46<br />
6.3 Searching for Templates ........................................................................................ 46<br />
6.3.1 Search Program Options .................................................................................. 46<br />
6.3.2 Search Program Technical Details.................................................................... 47<br />
6.3.3 Search Output................................................................................................... 47<br />
6.4 Evaluating Templates ............................................................................................. 49<br />
6.5 Building a Model...................................................................................................... 50<br />
Chapter 7: <strong>Prime</strong>–Refinement..................................................................................... 51<br />
7.1 Preparing Structures for Refinement .................................................................. 51<br />
7.2 Using the Refinement Panel.................................................................................. 55<br />
7.3 Setting Up an Implicit Membrane ......................................................................... 56<br />
7.4 <strong>Prime</strong> Energy Analysis ........................................................................................... 58<br />
7.5 Refining Loops ........................................................................................................ 58<br />
7.6 Predicting Side Chains........................................................................................... 61<br />
7.7 Minimizing Structures ............................................................................................ 62<br />
7.8 Refinement Panel Options..................................................................................... 63<br />
7.8.1 General Options................................................................................................ 63<br />
7.8.2 Dielectric Options.............................................................................................. 64<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> v
vi<br />
Contents<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
7.8.3 Loop Refinement Options ................................................................................. 64<br />
7.9 Technical Details for Refinement ......................................................................... 66<br />
7.9.1 Loop Refinement............................................................................................... 66<br />
7.9.2 Cooperative Loop Refinement .......................................................................... 67<br />
7.9.3 Side-Chain Prediction ....................................................................................... 68<br />
7.10 Refinement with Nonstandard Residues .......................................................... 68<br />
7.11 Output of Refinement Jobs ................................................................................. 68<br />
Chapter 8: Maestro Protein Structure Alignment ........................................... 71<br />
8.1 The Protein Structure Alignment Panel............................................................... 71<br />
8.2 The Align Binding Sites Panel .............................................................................. 74<br />
Chapter 9: Docking Covalently Bound Ligands............................................... 77<br />
9.1 Running Covalent Docking from Maestro........................................................... 77<br />
9.1.1 Selecting the Ligands........................................................................................ 77<br />
9.1.2 Defining the Ligand Reactive Groups ............................................................... 78<br />
9.1.3 Specifying the Receptor Bond to Break ............................................................ 79<br />
9.1.4 Specifying Receptor Residues to Sample......................................................... 79<br />
9.1.5 Running the Job................................................................................................ 80<br />
9.2 Running Covalent Docking from the Command Line....................................... 80<br />
Chapter 10: <strong>Prime</strong> MM-GBSA ..................................................................................... 81<br />
Chapter 11: Command Syntax.................................................................................... 85<br />
11.1 blast—Run Blast Searches.................................................................................. 85<br />
11.2 bldstruct—Build a Protein Model from an Alignment..................................... 88<br />
11.3 fr—Fold Recognition ............................................................................................ 91<br />
11.4 multirefine—Multi-Step Extended Sampling .................................................... 94<br />
11.5 refinestruct—Refine a Protein Structure .......................................................... 98<br />
11.6 ska—Protein Structure Alignment ................................................................... 102
Contents<br />
11.7 ssp—Secondary Structure Prediction............................................................. 107<br />
11.8 sta—Single Template Alignment ...................................................................... 109<br />
11.9 tfm—Tertiary Folding Module ........................................................................... 112<br />
Chapter 12: Command-Line Utilities ..................................................................... 115<br />
12.1 Multiple Template Alignment: structalign....................................................... 115<br />
12.2 Format Conversion: seqconvert....................................................................... 116<br />
12.3 Structure Preparation: primefix.py .................................................................. 118<br />
12.4 Structure Alignment: align_binding_sites...................................................... 119<br />
12.5 Database Update Utilities .................................................................................. 120<br />
12.5.1 update_BLASTDB......................................................................................... 120<br />
12.5.2 rsync_pdb ..................................................................................................... 120<br />
Appendix A: Third-Party Programs ........................................................................ 121<br />
A.1 Location of Third-Party Programs .................................................................... 121<br />
A.1.1 BLAST/PSI-BLAST, Pfam, and the PDB......................................................... 121<br />
A.1.2 Secondary Structure Prediction...................................................................... 121<br />
A.2 Third-Party Program Use Agreement ............................................................... 122<br />
Appendix B: Environment Variables ...................................................................... 123<br />
Appendix C: File Formats ............................................................................................. 125<br />
C.1 Input File Formats ............................................................................................... 125<br />
C.2 Output File Formats ............................................................................................ 125<br />
Appendix D: Error Messages and Warnings ................................................... 127<br />
D.1 Input Sequence .................................................................................................... 127<br />
D.2 Find Homologs ..................................................................................................... 127<br />
D.2.1 Error Messages .............................................................................................. 127<br />
D.2.2 Warnings in Homologs Table .......................................................................... 128<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> vii
viii<br />
Contents<br />
D.3 Edit Alignment ...................................................................................................... 129<br />
D.4 Build Structure ..................................................................................................... 129<br />
D.5 Refinement ............................................................................................................ 129<br />
Getting Help ........................................................................................................................... 131<br />
Glossary.................................................................................................................................... 135<br />
Index ............................................................................................................................................ 137<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Document Conventions<br />
In addition to the use of italics for names of documents, the font conventions that are used in<br />
this document are summarized in the table below.<br />
Font Example Use<br />
Sans serif Project Table Names of GUI features, such as panels, menus,<br />
menu items, buttons, and labels<br />
Monospace $SCHRODINGER/maestro File names, directory names, commands, environment<br />
variables, and screen output<br />
Italic filename Text that the user must replace with a value<br />
Sans serif<br />
uppercase<br />
CTRL+H Keyboard keys<br />
Links to other locations in the current document or to other PDF documents are colored like<br />
this: Document Conventions.<br />
In descriptions of command syntax, the following UNIX conventions are used: braces { }<br />
enclose a choice of required items, square brackets [ ] enclose optional items, and the bar<br />
symbol | separates items in a list from which one item must be chosen. Lines of command<br />
syntax that wrap should be interpreted as a single command.<br />
File name, path, and environment variable syntax is generally given with the UNIX conventions.<br />
To obtain the Windows conventions, replace the forward slash / with the backslash \ in<br />
path or directory names, and replace the $ at the beginning of an environment variable with a<br />
% at each end. For example, $SCHRODINGER/maestro becomes %SCHRODINGER%\maestro.<br />
In this document, to type text means to type the required text in the specified location, and to<br />
enter text means to type the required text, then press the ENTER key.<br />
References to literature sources are given in square brackets, like this: [10].<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong> ix
x<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 1: Introduction<br />
Chapter 1<br />
<strong>Prime</strong> 2.1 is a highly accurate protein structure prediction suite of programs that integrates<br />
Comparative Modeling and Fold Recognition into a single user-friendly, wizard-like interface.<br />
The Comparative Modeling path incorporates the complete protein structure prediction process<br />
from template identification, to alignment, to model building. Refinement can then be done<br />
from a separate panel, and involves side-chain prediction, loop prediction, and minimization.<br />
The <strong>Prime</strong> interface was designed to accommodate both novice and expert users, while the<br />
underlying programs were designed to produce superior results in a variety of applications,<br />
including:<br />
• High-resolution homology modeling<br />
• Refinement of active sites<br />
• Induced-fit optimization<br />
• Fold Recognition<br />
• Structure-based functional annotation<br />
• Generation of alternate loop conformations<br />
1.1 The <strong>Prime</strong> Suite in Maestro<br />
To run <strong>Prime</strong> you use Maestro, the graphical user interface (GUI) for all of Schrödinger’s products.<br />
For an introduction to Maestro, see the Maestro Overview. For help with a Maestro panel,<br />
click the Help button. For more information, see the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
The <strong>Prime</strong> submenu of the Maestro Applications menu has four items, Structure Prediction,<br />
Refinement, Covalent Docking, and MM-GBSA, which provide access to the Structure Prediction<br />
(<strong>Prime</strong>–SP), Refinement, covalent docking and MM-GBSA modules of the <strong>Prime</strong> suite.<br />
<strong>Prime</strong>–SP consists of a series of steps leading from a protein sequence through Comparative<br />
Modeling to the construction of a 3D structure. If a template of sufficiently high sequence<br />
identity exists in the database of known structures, the Comparative Modeling path is used. If a<br />
suitable template cannot be identified using sequence information, however, Fold Recognition<br />
can be used to identify potential templates.<br />
<strong>Prime</strong>–Refinement is a facility for refining protein structures that have been imported into<br />
Maestro or added to the Project Table from the Comparative Modeling workflow or another<br />
Schrödinger application.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 1
2<br />
Chapter 1: Introduction<br />
Covalent Docking is a facility for “docking” ligands that are covalently bound to the receptor.<br />
It uses designated reactive bonds in the ligand and a receptor bond to sample the possible<br />
attachment points, and performs a loop prediction to sample the ligand and specified parts of<br />
the receptor. The result is a set of “poses” of the ligand, docked to the receptor.<br />
<strong>Prime</strong> MM-GBSA is a facility for calculating ligand binding energies within the MM-GBSA<br />
continuum solvation model.<br />
1.2 <strong>Prime</strong> Documentation<br />
The <strong>Prime</strong> <strong>User</strong> <strong>Manual</strong> describes the <strong>Prime</strong> suite, including information about:<br />
• Essentials of using the <strong>Prime</strong>–SP module: runs and projects, job control, and the layout<br />
and common features of step panels<br />
• Starting a structure prediction: the Input Sequence and Find Homologs steps<br />
• Using the Comparative Modeling path, including the Edit Alignment and Build Structure<br />
steps<br />
• Using Fold Recognition<br />
• Using the <strong>Prime</strong>–Refinement module<br />
• Running <strong>Prime</strong> MM-GBSA calculations<br />
• Running <strong>Prime</strong> jobs from the command line<br />
This document expands on the information provided in the Maestro online help. In addition to<br />
describing steps, features, and options, this document suggests ways to determine which<br />
choices are the most useful, describes results, and provides some background and technical<br />
detail. Terms pertinent to <strong>Prime</strong> are defined in the Glossary.<br />
For a tutorial introduction to <strong>Prime</strong>, see the <strong>Prime</strong> Quick Start Guide. For information on<br />
installing <strong>Prime</strong>, see the Installation Guide.<br />
For information about Schrödinger’s Induced Fit Docking protocol, which uses <strong>Prime</strong> and<br />
Glide to optimize docking by including ligand-induced flexibility in the receptor active site,<br />
see the document Induced Fit Docking.<br />
Maestro online help includes topics for <strong>Prime</strong>. See page 131 for a discussion of help in<br />
Maestro, including tooltips (Balloon Help) and technical support contacts. For a fuller description<br />
of the help facility, see Chapter 15 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
1.3 Running Schrödinger Software<br />
Chapter 1: Introduction<br />
To run any Schrödinger program on a UNIX platform, or start a Schrödinger job on a remote<br />
host from a UNIX platform, you must first set the SCHRODINGER environment variable to the<br />
installation directory for your Schrödinger software. To set this variable, enter the following<br />
command at a shell prompt:<br />
csh/tcsh: setenv SCHRODINGER installation-directory<br />
bash/ksh: export SCHRODINGER=installation-directory<br />
Once you have set the SCHRODINGER environment variable, you can start Maestro with the<br />
following command:<br />
$SCHRODINGER/maestro &<br />
It is usually a good idea to change to the desired working directory before starting Maestro.<br />
This directory then becomes Maestro’s working directory. For more information on starting<br />
Maestro, including starting Maestro on a Windows platform, see Section 2.1 of the Maestro<br />
<strong>User</strong> <strong>Manual</strong>. There is no need to set SCHRODINGER on Windows: it is set automatically.<br />
<strong>Prime</strong> jobs, with the exception of Fold Recognition, can be launched from a Windows platform<br />
to a remote UNIX host. You can run <strong>Prime</strong> MM-GBSA jobs locally on a Windows host.<br />
To launch Structure Prediction jobs from Maestro, you must have access to the PDB. If the<br />
PDB database is not installed into $SCHRODINGER, you must set the SCHRODINGER_PDB environment<br />
variable to the path to the PDB installation. For information on how to set environment<br />
variables on Windows, see Appendix A of the Installation Guide.<br />
1.4 Citing <strong>Prime</strong> in Publications<br />
The use of this product should be acknowledged in publications as:<br />
<strong>Prime</strong>, version 2.1, Schrödinger, LLC, New York, NY, 2009.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 3
4<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
This chapter presents the essentials of using <strong>Prime</strong>–Structure Prediction (<strong>Prime</strong>–SP):<br />
• Starting, naming, and saving <strong>Prime</strong>–SP runs in Maestro projects<br />
• Navigating within and between runs<br />
• Job control, job monitoring, and <strong>Prime</strong>–SP job options<br />
• Using <strong>Prime</strong>–SP step panels, including their common layout and features<br />
• Using the <strong>Prime</strong> menu bar and <strong>Prime</strong> toolbar<br />
2.1 <strong>Prime</strong> Runs and Maestro Projects<br />
Chapter 2<br />
A single execution of the <strong>Prime</strong> workflow using a particular set of choices of templates, paths,<br />
and settings is called a run.<br />
<strong>Prime</strong> runs are stored within Maestro projects. Structures generated in completed <strong>Prime</strong> runs<br />
are added to the Maestro Project Table. If you are unfamiliar with Maestro projects and the<br />
Maestro Project Table, you may want to review the Project Table and Project Facility online<br />
help topics. For more information, see the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
Maestro always has a project open. If you begin to work without selecting or naming a project,<br />
Maestro creates a scratch (unnamed) project.<br />
<strong>Prime</strong> always has a run open, which belongs to the current project (whether it is a named<br />
project or a scratch project). If you begin the <strong>Prime</strong> workflow without specifying an existing<br />
run, the default name for the current run is run1. Multiple runs can be performed within a<br />
project, but you must save the project in order to save the data in the runs.<br />
Each project can be displayed in the Project Table panel. In <strong>Prime</strong>, Project Table entries are<br />
usually finished model structures with their properties. <strong>Prime</strong> stores data in the project as each<br />
step in a run is performed, but most of this data is not displayed in the Project Table, which can<br />
remain empty until a run is completed. For this reason you should save scratch projects in<br />
which you have done work, even if the Project Table is empty.<br />
Two Maestro projects can be merged into one. If you merge two Maestro projects, each of<br />
which contains one or more <strong>Prime</strong> runs, all the runs are included in the merged project. If two<br />
runs have the same name, Maestro automatically makes the names unique by appending -mrg1<br />
to one of them.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 5
6<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
Maestro projects containing multiple runs are useful for comparing the results of workflows in<br />
which different selections and options were chosen: for example, a different template selected<br />
for building, or a job performed with a different set of options. Creating these multiple runs<br />
may involve navigating to an earlier step in your current run, making different selections, and<br />
then navigating forward again in a new run. For a description of the features used in navigation<br />
(the Next button, the Back button, and the Guide), see Section 2.2 on page 7.<br />
If you return to an earlier step in your current run and make different choices as you proceed<br />
forward, you will be asked whether you want to overwrite your existing data. If you want to<br />
keep the data in your current run for comparison rather than overwriting it, you can start a new<br />
run. The commands for saving the old run, starting a new one, and switching between <strong>Prime</strong><br />
runs are found under the <strong>Prime</strong> File menu.<br />
The File menu commands for managing runs are:<br />
New<br />
Asks for a new run name (all run names must be unique), and starts a new run at the Input<br />
Sequence step.<br />
Save As<br />
Copies the current run under a new run name and switches to the new run. The original run is<br />
saved under the original name. If you open a new project and begin a workflow without explicitly<br />
giving the run a name, it will be given the default name of run1.<br />
Rename<br />
Asks for a new name for the current run. All run names must be unique.<br />
Delete<br />
Deletes the current run, on confirmation.<br />
Open<br />
Displays a list of options for switching to another run. The list includes the four most recently<br />
used runs. If there are more than four runs in the current project, click the More option to select<br />
a run by name.<br />
To backtrack and try a different set of choices without overwriting your current workflow:<br />
1. Choose Save As from the <strong>Prime</strong> File menu.<br />
This saves the current run under the old name (run1 is the default for the first run in a<br />
new project) and creates a copy under the new name.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
You are now in a new workflow identical to the first except for its name.<br />
2. Navigate to the step where you want to make a different choice. If you want to go back<br />
more than one step, click the step name in the Guide.<br />
2.2 General Panel Layout<br />
Each step in <strong>Prime</strong>–SP has its own panel. The panels have many elements in common and<br />
share a standard layout. This section discusses these common elements, their location, and<br />
their function. An example of a <strong>Prime</strong>–SP panel is shown in Figure 2.1.<br />
Figure 2.1. The Find Homologs step of the Structure Prediction panel.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 7
8<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
In the upper part of the panel is the <strong>Prime</strong> menu bar, and below that is the <strong>Prime</strong> toolbar. Below<br />
the <strong>Prime</strong> toolbar is the <strong>Prime</strong> sequence viewer. You can resize the sequence viewer by dragging<br />
the small box in the right-hand corner of the horizontal line that separates the sequence<br />
viewer from the main panel. You can select a single residue in a sequence by clicking on it, and<br />
you can select multiple residues using shift-click to select a range and control-click to select or<br />
deselect individual residues.<br />
In addition to displaying sequences for the query and for templates, in some steps the sequence<br />
viewer shows secondary structure predictions (SSPs). Right-click on an SSP for a menu of<br />
commands that includes Hide All, Delete, and Export. Right-clicking on a sequence also<br />
produces a shortcut menu of commands appropriate to the current step. You can export<br />
sequences using the Export Sequences command on the <strong>Prime</strong> File menu.<br />
The sequence viewer includes a “ruler”, which marks the residue position relative to the leftmost<br />
residue displayed in the sequence viewer. This residue position is displayed in parentheses<br />
in the tooltip for each residue. The positions are arbitrary, and can change when you<br />
display residues in the viewer. They are intended to help you keep track of the sequence as you<br />
scroll the sequence viewer.<br />
The sequence viewer can be configured to display a single line for the sequence that can be<br />
scrolled horizontally, or the sequence can be wrapped so that it scrolls vertically. The configuration<br />
is controlled by the Wrap Sequences option on the sequence viewer shortcut menu.<br />
If you want to save an image of the sequence viewer, choose Save Image from the sequence<br />
viewer shortcut menu. A file selector opens, in which you can select a format, from PNG,<br />
TIFF, and JPEG, and save the image.<br />
Below the sequence viewer are the step name and a brief description of the step, followed by a<br />
row of buttons. The first button on the left runs the main action of the step, which may involve<br />
launching a job. See Section 2.5 on page 14 for more information on running <strong>Prime</strong> jobs. An<br />
Options button is usually located to the right of the action buttons. Most buttons, including<br />
those in the <strong>Prime</strong> toolbar, the <strong>Prime</strong> Guide, and the Maestro toolbar, have tooltips that you can<br />
view by pausing the cursor over the button.<br />
The job status icon in the upper right corner of the panel turns green and spins when you<br />
launch a job. When the icon turns pink again, the job is finished. Click the icon to open the<br />
Monitor panel. For more information on job control, see Section 2.5 on page 14.<br />
The center of the panel contains the Step View, which shows data relevant to the step in table<br />
format. In the Build Structure step, the Step View contains output from the Build Structure<br />
job’s log file. You can view the complete text of a truncated table entry as a tooltip by pausing<br />
the cursor over the text. To resize a table column, drag a column border using the middle<br />
mouse button. You can sort the tables in the Step View by clicking the column heading, once<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
for ascending order, twice for descending order. You can also export data from any table in<br />
CSV or HTML format from the table shortcut menu.<br />
Below the Step View are the Back and Next buttons, the primary way to navigate between<br />
steps. You can perform a simple workflow by clicking Next after completing each step.<br />
However, it is sometimes useful to go back to previous steps, make different choices, and<br />
proceed forward again using the Next button or the Guide, explained next.<br />
The Guide, below the Back and Next buttons, is a diagram of the <strong>Prime</strong> workflow as a series of<br />
steps, each step represented by a button. The diagram branches to show the steps of both paths,<br />
Comparative Modeling and Threading.<br />
To hide the Guide, deselect the Guide check box in the Step menu.<br />
The Guide has two purposes: to show your current location in the <strong>Prime</strong>–SP process, and to let<br />
you navigate back and review or change the input at previous steps in the process. The current<br />
step is highlighted in the Guide diagram. All previously accessed steps, forward or backward,<br />
are available. You can go to any of these steps by clicking them in the Guide.<br />
Clicking on an available step button in the Guide displays the panel for that step. If a step is not<br />
yet available, its button is dimmed. Available steps include those that have already passed<br />
through in the current run, the present step, and (once step-specific criteria have been met) the<br />
next step beyond the current location. When the next step is available, clicking on it behaves<br />
the same as clicking the Next button.<br />
Note: You cannot launch searches and jobs by clicking the steps in the Guide. To launch<br />
actions pertaining to a step, click the appropriate buttons within the step panel.<br />
The Close button appears below the Guide on the lower left of the panel. To the right is the<br />
Help button, which opens online help related to the current step.<br />
2.3 The <strong>Prime</strong> Menu Bar<br />
The <strong>Prime</strong> menu bar consists of four menus: File, Edit, Display, and Step. Some of the options<br />
in these menus are also available as toolbar buttons.<br />
2.3.1 The File Menu<br />
The commands on the File menu for working with <strong>Prime</strong> runs (New, Save As, Rename, Delete,<br />
and Open) are discussed in Section 2.1 on page 5. The menu also contains the commands<br />
Export Sequences and Job Options. See Section 2.5 on page 14 for more information on job<br />
options.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 9
10<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
2.3.2 The Edit Menu<br />
The Edit menu contains commands for editing sequences and secondary structure predictions.<br />
These editing tools are also available from the <strong>Prime</strong> toolbar, and are described with the<br />
toolbar in Section 2.4 on page 13.<br />
2.3.3 The Display Menu<br />
The Display menu includes the following commands for controlling the appearance of the<br />
sequence viewer and structures in the Maestro Workspace.<br />
Color Scheme<br />
This command opens a menu of coloring options. In <strong>Prime</strong>–SP, the coloring schemes act as<br />
modes; that is, only one color scheme can be used at any given time. Each step in <strong>Prime</strong> has a<br />
default color scheme. You can use this menu to choose a different one. When amino acid<br />
sequences are added to the sequence viewer, they are colored according to the scheme that is<br />
currently selected.<br />
When the structure associated with a sequence is displayed in the Workspace, it is generally<br />
colored using the same scheme as the sequence. However, protein structures in the Workspace<br />
are sometimes colored using other color schemes:<br />
• When a PDB structure is imported, it is displayed in the Workspace using an error reporting<br />
color scheme in which complete recognized residues are gray, incomplete residues<br />
are red, unknown or nonstandard residues are orange, residues with multiple location<br />
indicators are green, and recognized residues with some unrecognized atoms are blue.<br />
For more information about conversion of PDB structures, see the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
• When the Build Structure step is complete, the color scheme applied to the structure in<br />
the Workspace is Atom PDB B Factor (Temperature Factor), in which blue atoms are those<br />
derived directly from the template, and red atoms have been predicted or modeled. To<br />
restore this color scheme, choose Atom PDB B Factor (Temperature Factor) from the<br />
Color all atoms by scheme Maestro toolbar button.<br />
The <strong>Prime</strong> color schemes are:<br />
• None: All sequences are colored standard gray.<br />
• Color by Sequence: The query sequence is colored standard gray. Each of the other<br />
sequences uses a different color.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
• Residue Type: Each amino acid sequence is colored according to the array of residue type<br />
colors.<br />
• Residue Property: Each amino acid sequence is colored according to the array of residue<br />
property colors (fewer colors than residue type).<br />
• Residue Matching: The query and templates are colored in a pairwise manner. Matching<br />
residues are colored according to their residue type, and nonmatching residues are colored<br />
standard gray. If more than one template is chosen, the query will have more colored<br />
residues because the effect is cumulative. If only the query sequence is shown, it is colored<br />
standard gray for all residues.<br />
• Residue Homology: Aligned query and template residues that are identical (highly conserved)<br />
are colored red. Aligned query and template residues that are similar according to<br />
the BLOSUM62 scoring matrix (conserved) are colored yellow. Gray is used for residues<br />
with no homology.<br />
• Secondary Structure: The query sequence is colored standard gray. All other AA<br />
sequences should have SSAs, which are colored according to the SSA colors.<br />
• Template ID: Only available for the Comparative Modeling path. The query sequence is<br />
colored standard gray. The predicted structure is colored according to template ID for<br />
each region (using the cycle colors). Each template sequence is colored with its color, but<br />
only in the region that was used (the rest of the template is colored dark gray to indicate it<br />
is not used).<br />
• Proximity: At any step in which the sequence viewer is displaying sequences that have<br />
associated structure (all steps except Input Sequence), color by proximity mode is available.<br />
When you select this color scheme, the current sequence colors do not change<br />
immediately. Instead, when the mouse is over a sequence that has structure, the cursor<br />
changes to indicate that proximity can be calculated. You can then select one or more<br />
contiguous residues in the sequence, and the system will color that sequence based on the<br />
proximity to the selected residues. No other sequences will have modified colors (even if<br />
they have structure too and are in proximity of the selected sequence).<br />
Proximity is calculated based on the shortest distance between any two atoms in two residues,<br />
not including the bonded C and N atoms in the residues next to the selected<br />
residues.<br />
To change the proximity cutoff, choose Preferences from the Display menu and change<br />
the number in the Proximity cutoff text box.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 11
12<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
Font Size<br />
The font size of letters and numbers in the sequence viewer can be changed using this option.<br />
The font size options are:<br />
• Small<br />
• Medium<br />
• Large<br />
• Huge<br />
View Structure<br />
The options for showing structures in the Maestro Workspace include:<br />
• All<br />
• None<br />
• Predicted Only<br />
View SSA<br />
This option controls whether secondary structure assignments are shown in the sequence<br />
viewer. The default is not to show secondary structure assignments.<br />
Legend<br />
Choose this menu option to open the Legend panel, which includes the assignment of each<br />
color and symbol used in each color scheme and sequence representation. Click the Colors tab<br />
and select a Color scheme from the list, or click the Symbols tab and select a type of Sequence<br />
data, such as Pfam/HMMER or SSP. This option is also available when you right-click on a<br />
sequence in the sequence viewer.<br />
Preferences<br />
Choose this Display menu option to open the Proximity dialog box:<br />
• Proximity cutoff: The value entered in this text box defines proximity for the Proximity<br />
color scheme. The default is 4.00 Å.<br />
2.3.4 The Step Menu<br />
The Step menu provides another way to navigate between steps in <strong>Prime</strong>. It includes the<br />
following options:<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Guide<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
When this check box is selected, the <strong>Prime</strong> Guide is displayed (the default). Deselect the check<br />
box to hide the Guide.<br />
Input Sequence<br />
Click this option to go to the Input Sequence step.<br />
Find Homologs<br />
Click this option to go to the Find Homologs step.<br />
There is an option to click to go to each step in <strong>Prime</strong>–SP. A red diamond is displayed next to<br />
the option to go to the current step. Options to go to steps that are not available are dimmed.<br />
2.4 The <strong>Prime</strong> Toolbar<br />
The <strong>Prime</strong> toolbar is the row of buttons just under the <strong>Prime</strong> menu bar. The <strong>Prime</strong> toolbar<br />
provides convenient access to commonly used <strong>Prime</strong> tasks. Pause the mouse over a button to<br />
see its description. These task modes are also available as options in the Edit and Display<br />
menus. The toolbar buttons are shown below as they appear on the toolbar, with their names<br />
and functions.<br />
Select (Edit menu): Select a region of the sequence to see the corresponding residues<br />
highlighted in the 3D Workspace.<br />
Crop (Edit menu): The cropping tool, which can be used in the Build Structure step<br />
to select ends of the query sequence that are not to be used for building.<br />
Edit SSP (Edit menu): Edit secondary structure predictions. Highlight a region of<br />
the SSP, and then type either H for α-helix, E for β-strand, or - for loop.<br />
Set Template Regions (Edit menu): If more than one template was selected in the<br />
Find Homologs step, use this button in the Build Structure step to proceed to the<br />
mode that allows selection of template regions for building.<br />
Slide Freely (Edit menu): Use this button to edit the alignment between query and<br />
template. Selecting a residue and sliding to the right creates N-terminal gaps.<br />
Selecting a residue and sliding to the left creates C-terminal gaps.<br />
Slide As Block (Edit menu): Use this button to edit the alignment between query and<br />
template. Selecting a residue and sliding to the right creates N-terminal gaps.<br />
Selecting a residue and sliding to the left slides all residues as a block to the left.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 13
14<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
Add Anchors (Edit menu): Set alignment constraints by setting anchors at residue<br />
positions. To remove anchors, click on the same residue again.<br />
Lock Gaps (Edit menu): To prevent gaps from being collapsed during manual<br />
editing, lock the gaps using this button. Locked gaps are indicated with a - symbol.<br />
Unlock Gaps (Edit menu) To allow gaps to collapse during manual editing, unlock<br />
the gaps using this button. Unlocked gaps are indicated with a ~ symbol.<br />
Color Scheme (Display menu): Select a color scheme for coloring sequences in the<br />
sequence viewer. If there is a structure in the 3D Workspace associated with the<br />
sequence, it will also be colored.<br />
View Structure (Display menu): Use this button to display and undisplay structures<br />
in the Workspace.<br />
View SSA (Display menu): Use this button to turn the secondary structure assignment<br />
(SSA) in the sequence viewer on and off.<br />
2.5 <strong>Prime</strong> Job Options and Control<br />
Before launching a <strong>Prime</strong> calculation from a step panel, you may want to choose which<br />
machine host the job will run on and check or modify job settings. When a run is complete,<br />
you can incorporate the results into the Maestro Project Table and save them as a project.<br />
2.5.1 The Job Options Dialog Box<br />
The Structure Prediction–Job Options dialog box, shown in Figure 2.2, allows you to specify<br />
which machine each type of <strong>Prime</strong> job will run on, and under which user name. To open the<br />
dialog box, choose Job Options from the File menu on the <strong>Prime</strong> menu bar. The Job Options<br />
dialog box lists 10 types of jobs, from Find Family (run from the Find Homologs step) to Run<br />
Refinement (run from the Refine Structure step). Specify the host machine and the user name<br />
for each type of job. The default host is localhost. Make sure your schrodinger.hosts file<br />
contains a list of available machines to run on. See the Installation Guide for information on<br />
the schrodinger.hosts file.<br />
2.5.2 Setting Up and Launching Jobs<br />
Though you specify the machines used to run jobs in the Job Options panel, you select <strong>Prime</strong><br />
job settings and launch jobs from the <strong>Prime</strong> step panels. This can be as simple as clicking an<br />
action button, such as Search. In other steps, you select the type of job from a Task menu and<br />
then launch the job by clicking Run.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
Figure 2.2. The Structure Prediction–Job Options dialog box.<br />
You can modify settings for step actions or tasks using the Options button. You select the residues<br />
or secondary structure features on which a refinement task will operate from lists, tables,<br />
and the Atom Selection dialog box in the Refine Structure step panel. Until you make a selection,<br />
the Run button remains unavailable.<br />
For more information on setting up and launching jobs, see the discussions of specific steps in<br />
the chapters that follow.<br />
2.5.3 Monitoring Jobs<br />
After you launch a job, the job status icon in the upper right corner of the step panel turns green<br />
and spins.<br />
To monitor a job using the Monitor panel, click the job status icon. You can use this panel to<br />
monitor jobs from the current session or your previous sessions.<br />
To kill a job, select the job name and then click Kill. If the job still appears to be running, you<br />
may have to go to the Project directory and remove the job file.<br />
For more information about the Monitor panel and monitoring jobs, see Section 3.2 of the Job<br />
Control Guide and the online help.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 15
16<br />
Chapter 2: Using <strong>Prime</strong>–Structure Prediction<br />
2.5.4 Adding Structures to the Project Table<br />
In the Build Structure step, predicted structures can be added to the Maestro Project Table as<br />
new entries. This is useful for comparing predicted structures produced from different runs,<br />
and for refining structures.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial<br />
Steps<br />
Chapter 3<br />
This chapter outlines the initial steps in the <strong>Prime</strong>–SP workflow. In the Input Sequence step, a<br />
query sequence is imported. In the Find Homologs step, potential templates are identified, and<br />
a path is chosen: either Comparative Modeling or Threading. Later chapters describe the steps<br />
in each path in which a model is built.<br />
Figure 3.1. The Input Sequence step, initial view.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 17
18<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
3.1 Input Sequence Step<br />
This step is used to read in an amino acid sequence from a file or from a protein structure in the<br />
Workspace.<br />
To read a sequence from a file:<br />
• Click From File and specify a file containing the desired sequence.<br />
Several file formats are supported, including FASTA, EMBL, GENBANK, and PIR. For a<br />
complete list, see Appendix C. File formats are auto-detected by the program. Sequences<br />
must be in standard uppercase one-letter code.<br />
To read a sequence from the Workspace:<br />
• Click From Workspace to read sequences from proteins in the Workspace.<br />
If there are multiple chains with the same sequence, the sequence will be read in only<br />
once.<br />
After input, the Sequences table in the Step View area shows all available unique sequences<br />
with their corresponding chain lengths in the order in which they are read. By default, the first<br />
sequence read in is displayed in the sequence viewer in the upper section of the panel. If<br />
multiple sequences have been read in, only one can be brought forward to the next step in a<br />
given <strong>Prime</strong> run.<br />
To select a query from the Sequences table:<br />
• Click on a sequence to select it for this run.<br />
This brings the selected sequence into the sequence viewer. The sequences are named by<br />
extracting the name from either the PDB header or the FASTA file header. Non-unique<br />
sequences are ignored. Non-unique names are modified to make them unique.<br />
Note: Sequences longer than 1,000 residues generally include more than one domain. Divide<br />
these sequences into probable domains and treat each domain as an individual query.<br />
To continue:<br />
• When you are satisfied with the query sequence, click Next to go to the Find Homologs<br />
step.<br />
If an error message appears when you click Next, you cannot proceed until the problem is<br />
resolved. Error messages are listed in Appendix D. See Section D.1 on page 127 for help<br />
with Input Sequence error messages.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
3.2 Find Homologs Step<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
This step is used to search for homologs of the query sequence. These homologs can then be<br />
used as templates to build a model structure.<br />
3.2.1 Importing Homologs<br />
To import a homolog not in the NCBI sequence database, click Import. This button is also used<br />
to import homologs that have been rotated into a common reference frame when multiple<br />
templates have been selected.<br />
Figure 3.2. The Input Sequence step, after input.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 19
20<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 3.3. The Find Homologs step, initial view.<br />
3.2.2 Searching for Homologs<br />
To launch a search for homologs:<br />
• Click Search (near the middle of the panel).<br />
Depending on what is selected in the Find Homologs–Options dialog box, BLAST or PSI-<br />
BLAST is used to search for templates. The default is to use BLAST to search the nonredundant<br />
PDB database that has been loaded with <strong>Prime</strong>. For search options, see<br />
Section 3.2.3.
To identify a family for the query (optional):<br />
• Click Find Family to run HMMER/Pfam.<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
This helps identify the family that best matches the query sequence. This job should take<br />
only a few minutes.<br />
The query family name found appears in the Query family name text box with an E-value in<br />
parentheses. If the E-value of the family is too large, it is not meaningful to consider the family.<br />
The sequence viewer shows the match between the query and the consensus sequence of the<br />
family. If this sequence, labeled NAME_pfam, is not visible, check for a plus sign to the left of<br />
the query name: +NAME. Click the plus sign to show any hidden sequences. The match<br />
sequence displays information about which residues are conserved in the multiple sequence<br />
alignment used to generate the Hidden Markov Model (HMM). Capital letters indicate highly<br />
conserved residues, lowercase letters indicate a match to the HMM, + means the match is<br />
conservative, and a blank indicates that the residue does not match the HMM.<br />
3.2.3 Find Homologs Search Options<br />
To change search options:<br />
• Click Options to open the Find Homologs–Options dialog box.<br />
You can use this dialog box to specify a PSI-BLAST or BLAST search, or change the settings<br />
for either type of search.<br />
General options include:<br />
• Database<br />
• Similarity Matrix<br />
• Gap Costs<br />
• Expectation value<br />
• Word size<br />
• Filter query (off by default)<br />
PSI-Blast options include:<br />
• Inclusion threshold<br />
• Number of iterations<br />
The default database selection is NCBI PDB (all). Other selections include:<br />
• NCBI PDB (non-redundant)—Does not include PDB files with identical sequences.<br />
• NCBI NR—Recommended for PSI-BLAST runs. With this selection, only the sequences<br />
with structures available in the PDB are displayed in the Homologs table.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 21
22<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
3.2.4 Search Results<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 3.4. The Find Homologs–Options dialog box.<br />
The output from the homolog search appears in the table labeled Homologs: (N found, n<br />
selected). For each potential template, the row contains the ID, the name, score, expectation<br />
value, % identities, % positives, % gaps, and header information (if taken from a PDB file). All<br />
percentages are rounded to the nearest whole number, which may be zero. A description of the<br />
table columns is given in Table 3.1.<br />
You can view the complete text of each table cell as a tooltip by pausing the pointer over the<br />
cell. Columns can be sorted by clicking on the column heading. Click once to sort in<br />
descending order, and click twice to sort in ascending order. The column widths can be<br />
adjusted using the left mouse button to drag the column border.<br />
If no structural coordinates are available for a template, that row is dimmed and cannot be<br />
selected.<br />
If a chain ID is given for a homolog (e.g., Chain A in 1EMS_A), then that chain, not the entire<br />
protein, is the potential template. When aligning multiple templates with multiple chains, you<br />
may need to extract single chains using the getpdb utility, as described in Section D.2.3 of the<br />
Maestro <strong>User</strong> <strong>Manual</strong>.<br />
Once homologs are found, click on a row to view the structure of that template in the Maestro<br />
workspace. If no structural coordinates are available for selection, the row is dimmed.<br />
Selecting a template will also bring the sequence alignment of the template and the query<br />
sequence into the sequence viewer. Click on another template to deselect the first one and<br />
display the selection in the Workspace window.<br />
After you review the alignments, select the template that you want to build on by clicking on<br />
its row in the table. To select multiple templates, see Section 3.2.5.
Table 3.1. Homologs table data<br />
Column Header Description<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
ID The PDB ID code, including chain name. If a chain name is specified, then<br />
the chain, not the entire protein, is the potential template.<br />
Name Sequence name provided by BLAST.<br />
Score BLAST bit score.<br />
Expect BLAST expectation value.<br />
Identities (%) Percentage of residues that are identical between the sequences.<br />
Positives (%) Percentage of residues that are positive matches according to the similarity<br />
matrix selected in the Options dialog (default matrix is BLOSUM62).<br />
Gaps (%) Percentage of gaps in both query and homolog as returned by BLAST.<br />
Title From the PDB TITLE field.<br />
Compound From the PDB COMPND field.<br />
Source From the PDB SOURCE field.<br />
Experiment From the PDB EXPDTA field. This gives the experimental technique used<br />
to obtain the structure, e.g., X-RAY DIFFRACTION or NMR.<br />
Resolution From the PDB REMARK2 record, where applicable.<br />
Ligands/Cofactors From the PDB HET records: hetID (three alphanumeric characters) and<br />
name for each HET group.<br />
Warning A warning appears in this column for sequences that may be unusable for<br />
model building. See Section D.2.2, for a list of warnings.<br />
3.2.5 Multiple Templates and Structure Alignment<br />
To select more than one template for building, hold down the CTRL key and select another<br />
homolog in the table, or hold down the SHIFT key and select a set of homologs. If more than<br />
one template is selected, they will need to be structurally aligned, that is, rotated to bring them<br />
into a common reference frame, before they can be used in the next step.<br />
• If you have already created a file with structurally aligned multiple templates, import it<br />
using the Import button, as described in Section 3.2.1.<br />
• If you have not yet performed structural alignment on the selected templates, click the<br />
Align Structures button.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 23
24<br />
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 3.5. The Find Homologs step with search results.<br />
If structure alignment fails or does not give the intended result, the problem may be one of the<br />
following:<br />
• The selected templates are structurally too dissimilar for a meaningful alignment.<br />
• One or more of the template structures includes multiple chains, and either<br />
• Two or more templates are chains from one .pdb file (the structure alignment facility<br />
will not align chains from the same file), or<br />
• The chains that were aligned were not the chains you intended to align.
Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />
If the problem involves a template with multiple chains, you can use the getpdb utility (see<br />
Section D.2.3 of the Maestro <strong>User</strong> <strong>Manual</strong>) to extract each chain into a separate .pdb file, then<br />
run structure alignment again.<br />
For information on running structure alignment from the command line, see Section 12.1.<br />
3.2.6 Continuing to the Next Step<br />
When you are satisfied with the template (or structurally aligned multiple templates) you have<br />
selected, follow the Comparative Modeling Path by proceeding to the Edit Alignment step.<br />
If no satisfactory templates can be identified by BLAST/PSI-BLAST and you have not<br />
imported a template, you can follow the Threading Path by proceeding to the Fold Recognition<br />
step.<br />
Note: The maximum length of a query in the Threading Path is 1,000 residues. It is recommended<br />
that such long sequences be divided into queries by probable domains, if this<br />
has not been done already, and the homolog search be repeated for these shorter<br />
queries before entering the Threading Path.<br />
3.2.7 Saving Runs With Different Templates<br />
If you decide later to try a different template, you can avoid overwriting your existing workflow<br />
by starting a new one before you navigate back to Find Homologs to choose a different<br />
homolog. Choose Save As from the File menu. This creates a copy of the current run, allowing<br />
you to select a new template and build on it. You can then compare results from each run.<br />
3.2.8 Error Messages<br />
If there is an error, a message appears when you click Next to go to the next step. See<br />
Section D.2 on page 127 for a list of error messages for Find Homologs.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 25
26<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 4<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
4.1 The Comparative Modeling Path<br />
In the initial steps of <strong>Prime</strong>–SP, you imported a query sequence (Input Sequence) and identified<br />
possible templates (Find Homologs). If a query-template match with a substantial<br />
percentage of identical residues has been selected, it is appropriate to continue the structure<br />
prediction process by following the Comparative Modeling path, a series of two <strong>Prime</strong>–SP<br />
steps used to build a model structure of the query sequence based on homology to one or more<br />
templates.<br />
The structure prediction process in the Comparative Modeling path includes the following<br />
steps:<br />
1. A satisfactory query-templates sequence alignment is produced in Edit Alignment.<br />
2. The selected query-template alignment is used to build a model structure of the query in<br />
Build Structure.<br />
4.2 Overview of the Edit Alignment Step<br />
In order to build a model structure of the query, the Build Structure step requires a satisfactory<br />
alignment between templates and query. The alignment of templates to query generated in the<br />
Find Homologs step is based only on sequence. Therefore, there is room for improvement. The<br />
Edit Alignment step allows you to improve alignments using various tools before building a<br />
model in the Build Structure step.<br />
There are several ways to produce an alignment for use in Build Structure:<br />
• Accept the BLAST/PSI-BLAST alignment generated in the Find Homologs step and currently<br />
displayed in the sequence viewer.<br />
• Import an alignment.<br />
• Generate or import secondary structure predictions, then use the Align program to create<br />
a new alignment.<br />
• Edit any of these alignments manually.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 27
28<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 4.1. The Edit Alignment panel, initial view.<br />
If you want to search for a family that best matches the query sequence, but did not do so in the<br />
Find Homologs step, click Find Family to run HMMER/Pfam. For more information on this<br />
option, see Section 3.2.2 on page 20.<br />
When you are satisfied with the alignment, click the Next button to proceed to the Build Structure<br />
step.
4.3 Initial and Imported Alignments<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
4.3.1 Accepting the BLAST/PSI-BLAST Alignment<br />
When you start the Edit Alignment step, the BLAST/PSI-BLAST alignment for the template<br />
you selected in Find Homologs is displayed in the sequence viewer. See Figure 4.1.<br />
If this is the template and the alignment you want to use, click Next and go to Build Structure.<br />
To accept a different template, using its existing alignment, click on the desired template and<br />
then click Next.<br />
4.3.2 Exporting an Alignment<br />
You may want to save the existing alignment for later use before making changes. Click the<br />
Export button to the right of the Alignments table to open the Export Alignments To File panel.<br />
To import the alignment again, see Section 4.3.3.<br />
4.3.3 Importing an Alignment<br />
<strong>Prime</strong> can also use an alignment generated in another program (or exported previously from<br />
<strong>Prime</strong>). The sequences in the alignment file must be exactly the same as those used by <strong>Prime</strong>,<br />
and so must the sequence and structure (PDB) names.<br />
Note: The template PDB ID must be uppercase in the alignment file.<br />
To import an alignment, click the Import button to the right of the Alignments table. This opens<br />
the file selection window (Import Alignments From File). For details of the allowed formats, see<br />
Appendix C.<br />
4.4 Secondary Structure Predictions (SSPs)<br />
4.4.1 Generating SSPs<br />
A secondary structure prediction for the query is required to run the Align program. To run all<br />
available SSP programs, click Run SSP. One secondary structure prediction program (SSpro)<br />
is bundled with <strong>Prime</strong>. However, the optional third-party program PSIPRED is highly recommended<br />
for optimal results, especially for GPCRs. PSIPRED is not available on Windows. See<br />
Appendix A for more information on this program.<br />
When the SSP run is finished, the secondary structure predictions are displayed in the sequence<br />
viewer below the query sequence. If there is a problem running the structure prediction<br />
programs, an error message is displayed.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 29
30<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 4.2. The Edit Alignment step, after Run SSP.<br />
4.4.2 Exporting and Importing SSPs<br />
You may want to save an SSP for later use. Right-click on an SSP for a menu of options,<br />
including deleting the SSP, hiding all SSPs, or opening the Export SSP panel. By default, the<br />
SSP is saved in native Schrödinger format. If there is a problem exporting a file, an error<br />
message is displayed.<br />
To import a secondary structure prediction for the query, click Import SSP and select the SSP<br />
file, which may be in FASTA or Maestro format. If necessary, use the seqconvert utility to<br />
change the file format. (See Section 12.2 for more information on seqconvert.) If the file
Chapter 4: Comparative Modeling: Edit Alignment<br />
contains bad data or the sequence doesn’t match the query sequence, an error message is<br />
displayed.<br />
4.4.3 Deleting or Editing an SSP<br />
If you believe an SSP is incorrect, you may delete it entirely or edit the incorrect portions. To<br />
delete an SSP, right-click on the SSP and choose Delete from the shortcut menu.<br />
To edit an SSP:<br />
1. Click the Edit SSP toolbar button:<br />
2. Highlight the part of the SSP you want to change.<br />
3. Type the character e, h, or -.<br />
E represents strand, H represents helix, and – represents loop.<br />
The highlighted region of the prediction is changed to your choice.<br />
4.4.4 Resetting SSPs<br />
To retrieve an unmodified SSP after editing, click Run SSP. This immediately resets the SSP,<br />
but does not rerun the prediction programs.<br />
4.5 Generated and Modified Alignments<br />
4.5.1 Running the Align Program<br />
Before running an alignment, if you want to take advantage of the special capabilities for<br />
aligning GPCRs, select Align GPCRs. These capabilities include fingerprint matching, a<br />
customized GPCR sequence database, and identification of transmembrane helices.<br />
To take advantage of expert knowledge, one or more pairs of query/template residues that you<br />
know should be aligned can be constrained to remain so, by using alignment constraints.<br />
To constrain a pair of residues to remain aligned:<br />
1. Click the Add Anchors toolbar button.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 31
32<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
2. Select one of the residues in the pair that you want to remain aligned during the Align<br />
calculation.<br />
An anchor symbol is displayed in the ruler at the corresponding residue position.<br />
When you are satisfied with the secondary structure predictions for the query, click Align to<br />
start the alignment job.<br />
When Align completes the generation of the new alignment, you may accept it and continue to<br />
the next step or edit it manually as described below.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 4.3. The Edit Alignment step after alignment.
4.5.2 Align Program Technical Details<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
The goal of the program launched from the Edit Alignment step, Single Template Alignment,<br />
is to generate an accurate alignment between proteins with medium to high sequence identity<br />
(20-90%). Features of the program include:<br />
• A position-specific substitutional matrix (PSSM) for the query sequence, derived from<br />
PSI-BLAST, is used to match the template sequence.<br />
• In order to minimize the inaccuracy in a single secondary structure prediction, a novel<br />
and robust algorithm has been designed to derive a composite secondary structure for the<br />
query sequence from multiple predictions. The composite SSP is aligned to the SSA of<br />
the template.<br />
4.5.3 <strong>Manual</strong>ly Editing the Alignment<br />
You can edit any alignment, however obtained, using combinations of these <strong>Prime</strong> toolbar<br />
buttons: Slide Freely, Slide as Block, Add(/Remove) Anchors, Lock Gaps, and Unlock Gaps.<br />
Slide Freely (Edit menu): Edit the alignment between query and template. Selecting<br />
a residue and sliding to the right creates N-terminal gaps. Selecting a residue and<br />
sliding to the left creates C-terminal gaps.<br />
Slide As Block (Edit menu): Edit the alignment between query and template.<br />
Selecting a residue and sliding to the right slides all residues as a block to the right.<br />
Selecting a residue and sliding to the left slides all residues as a block to the left.<br />
Add Anchors (Edit menu): Set alignment constraints by setting anchors at residue<br />
positions. To remove anchors, click on the same residue again.<br />
Lock Gaps (Edit menu): To prevent gaps from being collapsed during manual<br />
editing, lock the gaps using this button. Locked gaps are indicated with a - symbol.<br />
Unlock Gaps (Edit menu): To allow gaps to collapse during manual editing, unlock<br />
the gaps using this button. Unlocked gaps are indicated with a ~ symbol.<br />
To assist in this task, you can view a preliminary model of the query in the Workspace. Select<br />
<strong>Manual</strong> Threading and click Update. The Workspace shows the template structure colored by<br />
query residue property. Each template residue is colored according to the property of the corresponding<br />
query residue in the alignment, revealing the location of charged, polar, and hydrophobic<br />
query residues. Use this option to make sure that insertions/gaps are not in the middle<br />
of regions of secondary structure such as alpha helices or beta sheets. It can also be used to<br />
check that hydrophobic residues point into hydrophobic regions and polar residues point<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 33
34<br />
Chapter 4: Comparative Modeling: Edit Alignment<br />
towards solvent. As you make changes to the alignment, click Update again to see how each<br />
change has altered the mapping of the query’s residue properties onto the template structure.<br />
If there are obvious errors in the alignment (such as gaps in regions of secondary structure), a<br />
warning will appear in the Build Structure dialog box. <strong>Manual</strong> editing can be used to adjust the<br />
alignment before attempting to build the structure again.<br />
4.5.4 Updating the Step View Table<br />
After the Align job is complete, the Step View table is updated with the new alignment score,<br />
the percent identity, the percent similarity, and the percent gaps (all rounded to the nearest<br />
integer). This information can be updated after any change, including manual editing of the<br />
alignment, by clicking the Update button. The changed alignment is also reflected in the<br />
Maestro Workspace display of the structure.<br />
4.6 Error Messages<br />
When you click Next at the end of the Edit Alignment step, the Check Alignment warning may<br />
appear. The warning lists possible alignment problems that you may want to fix before<br />
continuing to the next step. For example, gaps in secondary structure elements of the template,<br />
or gaps in the query that are aligned to secondary structure elements of the template.<br />
If you do not want to make corrections to the alignment, click Continue to go to the next step.<br />
To make corrections based on the information in the error message, make a note of the gap<br />
locations given, click Cancel to close the message box while remaining in Edit Alignment, and<br />
change the alignment accordingly.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 5<br />
Chapter 5: Comparative Modeling: Build Structure<br />
5.1 The Build Process<br />
The Build Structure step builds a model structure of the query sequence based on the templates<br />
selected in the previous step. Figure 5.1 shows the initial view of the Build Structure step.<br />
Figure 5.1. The Build Structure step, initial view.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 35
36<br />
Chapter 5: Comparative Modeling: Build Structure<br />
The Build process includes the following steps:<br />
1. Coordination of the copying of backbone atoms for aligned regions and side chains of<br />
conserved residues<br />
2. Building insertions and closing deletions in the alignment<br />
3. Building tails out to the last residue in the template<br />
The Build process may also include the following steps:<br />
• Side-chain prediction of non-conserved residues<br />
• Minimization of all atoms not derived directly from the template<br />
You can specify which optional steps to perform by clicking Options and making selections in<br />
the dialog box. See Section 5.2.3.<br />
5.2 Preparing to Build<br />
If you want to build a model using multiple templates or including ligands or cofactors, you<br />
must specify how these are to be treated before clicking the Build button to start the building<br />
process.<br />
5.2.1 Multiple Templates (Set Template Regions)<br />
If you intend to use multiple templates to build a model of the query, you will have already<br />
selected them and ensured that they are structurally aligned (i.e., rotated into a common coordinate<br />
frame; see Section 3.2.5). In the Build Structure step, before starting the structure build<br />
by clicking the Build button, you must specify which template to use for each region of the<br />
query. If you do not select regions from other templates, the topmost template in the table will<br />
be used for the entire query.<br />
To enter the mode that allows you to set the template regions, click the Set Template Regions<br />
toolbar button:<br />
Highlight the region of each template in the sequence viewer that you want to use. Click the<br />
Set Template Regions button again to end the selection. (You may need to scroll over and<br />
select again if the desired region is wider than the display window.)<br />
The Set Template Regions tool helps make sure that only one template residue is assigned to a<br />
query residue at one time. Regions of a template not being used to model the query are colored<br />
dark gray (regardless of coloring schemes). When you begin to set template regions, only the<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 5: Comparative Modeling: Build Structure<br />
first template is in use. As you select regions in the second template, the tool automatically<br />
deselects the corresponding residues in the first template. If you select regions for use from a<br />
third template, the corresponding regions of the first and second templates are deselected, and<br />
so on.<br />
5.2.2 Including Ligands and Cofactors<br />
Any cofactors and ligands present in the template you have chosen (except HOH) are listed in<br />
the Include ligands and cofactors box. Click in the list to select one or more ligands and cofactors.<br />
Your selections are highlighted in yellow in the Workspace.<br />
5.2.3 Build Options<br />
Click Options to open the Build Structure–Options dialog box and specify which optional<br />
model building steps to perform:<br />
• Retain rotamers for conserved residues—retain side-chain rotamers for conserved residues.<br />
• Optimize side chains—Optimize the side chains.<br />
• Minimize residues not derived from templates—minimize residues that are not derived<br />
from the templates. Only available if Optimize side chains has been selected.<br />
By default, all three of the above options are selected.<br />
5.2.4 Omitting Structural Discontinuities<br />
You can disable the building of structural discontinuities arising from insertions, template<br />
junctions, and deletions. The discontinuities are capped with NMA and ACE, so the final structure<br />
is discontinuous. If you choose any of these options, you should check that your final<br />
structure is reasonable.<br />
Figure 5.2. The Build Structure–Options dialog box.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 37
38<br />
Chapter 5: Comparative Modeling: Build Structure<br />
If there is an insertion in the query (a template gap) that requires the building of a long loop,<br />
you can instead cut the query sequence and cap it with NMA and ACE, and not build the loop,<br />
by selecting the Insertions (template gaps) option under Omit structural discontinuities, and<br />
enter the maximum length of loops that will be built in the text box. This option could be<br />
useful where there are long insertions that are not in the region of interest.<br />
Likewise, if there are deletions in the query (a query gap), or if there is a junction between two<br />
templates, you can choose to cap the ends of the two pieces and not try to join them in building<br />
the structure. To do this, select Deletions (query gaps) or Template junctions under Omit structural<br />
discontinuities.<br />
When the structure is built with discontinuities, the results appear to be misaligned in the<br />
sequence viewer. To display the correct alignment, right-click in the sequence viewer and<br />
choose Align by Residue Number.<br />
5.3 Running the Build Structure Job<br />
When you click Build, the model-building program begins, and its log is displayed in real time<br />
in the Log file window (see Figure 5.3). It can also be viewed in the Monitor panel (click the job<br />
status icon in the upper right corner of any <strong>Prime</strong>–SP panel to open the Monitor panel). Finally,<br />
the build log is saved to a log file in the directory where <strong>Prime</strong> is running.<br />
First, the log lists the chosen templates. If you did not intend to choose more than one template,<br />
or if you neglected to align the templates, stop the job and restart it with the correct template.<br />
Jobs can be stopped, paused, and resumed from the Monitor panel. See Section 2.5 on page 14<br />
for more information on Job Control.<br />
The next part of the log file gives information about the build process for template transition<br />
regions, then for insertions, and then for side chains, followed by non-template regions. The<br />
last line of the log describes the diagnosis and success of model building.<br />
Once the model building calculations are complete, the model is displayed in the Workspace<br />
superimposed on the template, using the color scheme Atom PDB B Factor (Temperature<br />
Factor). Blue atoms are those derived directly from the template; red atoms have been<br />
predicted or modeled. To restore this color scheme, choose Atom PDB B Factor (Temperature<br />
Factor) from the Color all atoms by scheme toolbar button menu in the main window.<br />
Build Structure also creates an atom set named non_template_residues which can be found in<br />
the Sets tab of the Atom Selection dialog box. This set of inserted residues is generally chosen<br />
to undergo refinement.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 5: Comparative Modeling: Build Structure<br />
Figure 5.3. The Build Structure step after building.<br />
The sequence viewer contains the query sequence, the templates, and the sequence of the<br />
newly predicted structure in that order. To view only the model structure, select View Structure<br />
from the Display menu and then select Predicted Only. (The other options are All and None.)<br />
You can also run the Build Structure job from the command line. To do so, click Write to write<br />
the input file. The file is written to the current working directory, and is named according to the<br />
run and project. A dialog box informs you of the file name and the command to use, which is:<br />
$SCHRODINGER/bldstruct input-file<br />
For more information on this command, see Section 11.2 on page 88.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 39
40<br />
Chapter 5: Comparative Modeling: Build Structure<br />
5.4 Saving and Exporting Built Structures<br />
When building is complete, the built structures can be added to the Maestro project by clicking<br />
Add to Project Table. Once you have added the structure to the Project Table, you can refine the<br />
structure by choosing Applications > <strong>Prime</strong> > Refinement in the main window. You can also<br />
export any Project Table entry, such as a built structure, to a file of another format. For information<br />
on exporting structures, see Section 3.2 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
5.5 Technical Notes for the Build Structure Step<br />
The Build Structure step uses the following resources and methods:<br />
• The OPLS_2005 all-atom force field for energy scoring of proteins as well as for ligands<br />
and other non-amino-acid residues.<br />
• A Surface Generalized Born (SGB) continuum solvation model for treating solvation<br />
energies and effects.<br />
• Residue-specific side-chain rotamer and backbone dihedral libraries, derived from the<br />
non-redundant data sets extracted from the PDB, and representing values most commonly<br />
observed in protein crystal structures. The libraries allow for both the evaluation of existing<br />
rotamer/dihedral values and the systematic prediction of new ones.<br />
Both Build Structure and the step that follows, Refine Structure, can treat the 20 standard<br />
amino acids as well as modified residues and ligands.<br />
Model building uses the following procedure:<br />
1. Non-conserved residues are mutated to the desired identity. Side chains are added by<br />
finding the first rotamer in the library that does not produce a clash.<br />
2. Insertions, deletions, and template transitions are built. These are cases in which the<br />
backbone itself needs to be reconstructed, either due to gaps in the alignment or template<br />
transitions that produce gaps in the 3D structure. These gaps are closed by reconstruction<br />
of the affected region ab initio, using a backbone dihedral library. If no reasonable conformation<br />
can be generated that closes the loop structure, the region being reconstructed<br />
will be expanded until closure is possible.<br />
3. Gap reconstruction is done by finding any single loop conformation that closes the structure<br />
and is physically reasonable. If multiple conformations are found, the one that deviates<br />
from the template structure as little as possible is chosen. No attempt is made to<br />
perform an exhaustive optimization of the region.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 5: Comparative Modeling: Build Structure<br />
4. Side-chain prediction (optional). Side chains may be optimized. The two choices are<br />
either: (a) optimize all side chains from scratch, or (b) optimize only those that are not in<br />
the template.<br />
5. Minimization of non-template regions (optional). Any portions of the backbone that were<br />
not derived directly from one of the templates (and were thus reconstructed ab initio) are<br />
subject to a local minimization as described in Section 7.7 on page 62.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 41
42<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 6: Fold Recognition<br />
Chapter 6<br />
When sequence searches in the Find Homologs step of <strong>Prime</strong>–SP fail to find templates, or<br />
query-template sequence identity is low, structure prediction may be continued using the<br />
Threading path. The Threading path involves use of Fold Recognition to find candidate “seed”<br />
templates, followed by a return to the Comparative Modeling path with a selection of<br />
templates. Fold recognition is not available on Windows.<br />
6.1 Overview of the Fold Recognition Step<br />
The Fold Recognition step is designed to find structural templates that could not be found by a<br />
sequence search. The use of secondary structure matching is important in finding remote<br />
(
44<br />
Chapter 6: Fold Recognition<br />
6.2.2 Generating SSPs<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 6.1. The Fold Recognition panel, initial view.<br />
To generate SSPs for the SSP profile, run all available SSP programs by clicking the Run SSP<br />
button. One secondary structure prediction program (SSpro) is bundled with <strong>Prime</strong>. However,<br />
an optional third party program is recommended for optimal results. See Chapter A for more<br />
information on this program.
Figure 6.2. The Fold Recognition panel after Run SSP.<br />
Chapter 6: Fold Recognition<br />
When the secondary structure prediction jobs finish, the SSPs are displayed in the sequence<br />
viewer below (as children of) the query sequence. (If an error occurs when running a structure<br />
prediction program, an error message is printed to the log file displayed in the Monitor panel.)<br />
6.2.3 Editing or Deleting SSPs<br />
You can use your expert knowledge to improve the query’s SSP profile by deleting or editing<br />
SSPs you believe are incorrect.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 45
46<br />
Chapter 6: Fold Recognition<br />
To delete an SSP, right-click on the SSP in the sequence viewer and choose Delete from the<br />
shortcut menu.<br />
To edit an SSP:<br />
1. Click the Edit SSP toolbar button:<br />
2. Highlight the part of the SSP you want to change.<br />
3. Type the character e, h, or -.<br />
E represents strand, H represents helix, and – represents loop.<br />
The highlighted region of the prediction is changed to your choice.<br />
6.2.4 Resetting SSPs<br />
To retrieve an unmodified SSP after editing, click Run SSP. This immediately resets the SSP<br />
(but does not rerun the prediction programs).<br />
6.3 Searching for Templates<br />
When you are satisfied with the SSP profile for the query, click Search to run the Fold Recognition<br />
program. You may prefer to change the search options described in the following section<br />
before doing so.<br />
6.3.1 Search Program Options<br />
Display of Results<br />
Show Top N Results<br />
By default, the 100 top-ranked templates found are shown in the Templates table. This number<br />
can be changed in the Show Top N Results text box before clicking Search.<br />
Z-Scoring<br />
Searching can be done with or without Z-scoring. The Z-score shows how significant a match<br />
is relative to a random match. It is calculated by shuffling the query sequence. See Section 9.2<br />
on page 71 for more information on Z-score filtering.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Figure 6.3. The Fold Recognition–Options dialog box.<br />
You may want to turn on Z-scoring if the query protein is any of the following:<br />
• Less than 120 residues long<br />
• More than 90 percent alpha (helical)<br />
• More than 90 percent beta (strand)<br />
Chapter 6: Fold Recognition<br />
Click Options to open the Fold Recognition–Options dialog box and turn on Z-scoring.<br />
Note: Running the search program with Z-scoring on a query with 300 residues typically<br />
requires three to four hours. Without Z-scoring, running a search on a protein of<br />
similar length typically takes only 10-15 minutes.<br />
6.3.2 Search Program Technical Details<br />
The search program finds potential templates for model building by searching a database of<br />
structure folds (SCOP domains) generated from the PDB using secondary structure information.<br />
Profile-sequence matching and composite secondary structure information are the same as in<br />
the Align program used in Edit Alignment in the Comparative Modeling path. The search<br />
program used in the Threading path has two additional features:<br />
• Weighting is adjusted by degree of structural conservation inside a family of templates.<br />
Each residue in the template sequence is defined as conserved or variable according to<br />
Multiple Structure Alignment (MSTRA) of the template with its structural neighbors.<br />
When aligning conserved residues, higher weight is given for matching and higher penalty<br />
is given for opening gaps, and vice versa for variable residues.<br />
• Optional Z-scoring, as described above and discussed in more detail in Section 9.2 on<br />
page 71.<br />
6.3.3 Search Output<br />
When the search is complete, the Templates table displays candidate templates, ranked from<br />
the closest homolog to the least close. This preliminary ranking is based on a scoring function<br />
that combines sequence, secondary structure, and degree of structural conservation. If Zscoring<br />
is on when Search is launched, rankings take Z-score into account as well.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 47
48<br />
Chapter 6: Fold Recognition<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 6.4. The Fold Recognition panel with Search results.<br />
While it is possible that the top-ranked template will yield good results, it is recommended that<br />
multiple templates be used in model building. By default, the top 25 templates are selected.<br />
You can select the desired templates using control-click and shift-click. To sort the templates<br />
by properties other than rank, click the appropriate column heading.<br />
Optionally, if you have not already searched for the family of the query, you can click Find<br />
Family to run HMMER/Pfam. The query family name found, with an E-value in parentheses,<br />
appears in the Query family name text box. The sequence viewer shows the match between the<br />
query and the consensus sequence of the family, as described in Section 3.2.2 on page 20. Ifa
Chapter 6: Fold Recognition<br />
potentially meaningful (E-value under 1) family identification is found, it may help you select<br />
different or additional seed templates to bring forward to Build Backbone.<br />
To view a template colored by secondary structure, click the check box in the V (Visualization)<br />
column. The template appears in the sequence viewer, and its secondary structure can be<br />
compared to the SSP profile of the query.<br />
Note: The query-template “alignments” that can be visualized in this step are provided only<br />
as a guide in identifying seed templates. Gaps or other unfavorable features will not<br />
affect the quality of the models built using these templates.<br />
6.4 Evaluating Templates<br />
To make use of the seed templates to build a model, they must first be evaluated for their suitability<br />
for comparative modeling. The evaluation is done with the use of two scripts. The<br />
process involves running an alignment for each template, then running the script to do the evaluation.<br />
The first step is to export the SSPs in CASP format (or copy them from the working directory<br />
so that they have a .casp extension). It is recommended that you use both SSPs.<br />
For each template that you want to use, run the alignment script as follows:<br />
$SCHRODINGER/run -FROM psp CMAlign.pl target template seqFile sspFiles<br />
The arguments for this command are described in Table 6.1.<br />
Table 6.1. Arguments for the CMAlign.pl script.<br />
Argument Description<br />
target Name of the query sequence.<br />
template Name of the template, as shown in the Search table of the Fold Recognition step,<br />
including any underscore characters.<br />
seqFile Target sequence file.<br />
sspFiles CASP formatted secondary structure prediction files (.casp), blank-separated.<br />
The alignments are stored in the subdirectory CMAlign. Once you have run CMAlign.pl for<br />
each of the desired templates, you can run the evaluation script:<br />
cd CMAlign<br />
$SCHRODINGER/run -FROM psp checkAlign.pl target<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 49
50<br />
The script provides a recommendation for each template, and provides some information on<br />
the alignment.<br />
6.5 Building a Model<br />
Based on the recommendations of the script, you can use the Comparative Modeling path to<br />
build a model on any of the suitable templates. To do so, the template must be exported and<br />
read back into <strong>Prime</strong>, which you can do with the following procedure:<br />
1. In the Fold Recognition step, include the template in the Workspace (click the V column).<br />
2. Click the Create entry from Workspace main toolbar button to create a project entry.<br />
3. Export the project entry as a PDB file.<br />
4. In the Structure Prediction panel, create a new run and read in the query sequence again.<br />
5. Click Next.<br />
6. In the Find Homolog step, import the saved template PDB file.<br />
There is no need to run Search because a template was manually imported.<br />
7. Select the imported template, and follow the Comparative Modeling path (or click Edit<br />
Alignment in the Guide).<br />
8. In the Edit Alignment step, click Import SSP and select the .casp SSP files that you used<br />
for the analysis.<br />
9. Click Align to run the alignment.<br />
From this point, you can continue to the Build Structure step as normal.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
Chapter 7<br />
<strong>Prime</strong>–Refinement is a module used to refine protein structures from the Maestro Workspace.<br />
The methods can be applied to protein structures from any source, including one built using the<br />
<strong>Prime</strong>–Structure Prediction workflow. It runs independently of <strong>Prime</strong>–SP from its own<br />
Maestro panel, the Refinement panel. The Refinement panel offers the following refinement<br />
protocols: loop refinement, side-chain prediction, minimization, and a single-point energy<br />
calculation at the current geometry of the model structure.<br />
In many cases, <strong>Prime</strong>–Refinement is used on protein structures from sources other than <strong>Prime</strong>,<br />
but it can also be used with structures added to a project from the Build Structure step in the<br />
<strong>Prime</strong>–SP module. A model structure built in the Build Structure step may require further<br />
refinement. If there are insertions or deletions in the alignment between query and template, it<br />
is recommended that you perform a loop refinement, since insertions and deletions are most<br />
frequently found in loop regions. As another example, if you have added a water molecule to<br />
the binding site of the protein, you can use <strong>Prime</strong>–Refinement on the composite entry.<br />
<strong>Prime</strong>–Refinement can also refine structures with covalently bound ligands. The ligands can be<br />
multiply connected, covalently bound to each other and to the protein. This capability allows<br />
the refinement of proteins with phosphorylated residues or attached sugars, for example.<br />
<strong>Prime</strong>–Refinement has an implicit membrane model, in which the membrane is modeled by a<br />
slab of low dielectric constant in which the protein is immersed. The width of the slab and the<br />
orientation of the protein with respect to the slab can be adjusted.<br />
7.1 Preparing Structures for Refinement<br />
<strong>Prime</strong>–Refinement has a great deal of flexibility in the types of structures it can handle, and can<br />
fix many structural problems. For standard residues, <strong>Prime</strong>–Refinement can fix formal charges<br />
and bond orders, and correct disparities between the sequence and the structure. For example,<br />
if a residue has the coordinates of ALA but is called SER, the 3HB will be ignored and the OG<br />
and HG added during refinement. The standard residues are the 20 canonical amino acids;<br />
ACE and NMA; HOH; CYX (disulfide); and the acid/base variants ASH/AS1 (ASP), GLH/<br />
GL1 (GLU), ARN (ARG), LYN (LYS), HIE/HIP/HID (HIS), CYT (CYS), SRO (SER), TYO<br />
(TYR).<br />
<strong>Prime</strong>–Refinement has certain conditions that must be met by the input structures for a job to<br />
be run successfully. Some of these conditions only apply to covalently bound ligands.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 51
52<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
For all kinds of structures, the following condition must be met:<br />
• The structure must be an all-atom structure.<br />
This condition applies to the protein and any ligand, waters or cofactors present in the<br />
structure: all hydrogens must be present.<br />
You can add hydrogens to a structure by displaying the structure in the Workspace and<br />
double-clicking the Add hydrogens button in the toolbar.<br />
However, for structures containing nonstandard residues, you must correct bond orders<br />
and formal charges first.<br />
For structures with nonstandard residues, the following conditions must be met:<br />
• No residue can consist of disconnected pieces without formal bonds.<br />
This means, among other things, that metals should not have formal bonds to the rest of<br />
the structure, and should be treated as ionic. For example, in hemes the Fe must be a separate<br />
residue with a +2 formal charge and no bonds, and the “bonded” N atoms must have<br />
a –1 formal charge.<br />
• Bound residues must contain at least 3 atoms between bonds to other residues.<br />
This means that, for example, a dihedral angle cannot span more than 2 residues.<br />
• Bond orders and formal charges must be corrected.<br />
For nonstandard residues the supplied structure will be used, and you must check for correct<br />
bond orders and formal charges.<br />
When you rename residues, you should be aware that OPLS_2005 parameters are used for<br />
standard residues and nonstandard residues.<br />
Proteins can also be used as ligands, and the same conditions apply. No special treatment is<br />
needed for two distinct chains connected by a disulfide or similar side-chain linkage. If the<br />
“ligand” chain has nonstandard terminal groups or is connected to the main chain by its backbone,<br />
two steps are required to make sure that the “ligand” chain is treated correctly:<br />
1. Rename standard residues appearing in the ligand to nonstandard names.<br />
2. Rename the atoms in inter-residue C–N bonds between renamed (ligand) residues. For<br />
example, change “_C__ “to “_C*_”, where the underscores represent spaces. Other<br />
atoms can keep their PDB names; only one of C or N needs to be changed.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 7: <strong>Prime</strong>–Refinement<br />
If your structure does not meet one or more of the conditions outlined above, you must fix it<br />
before you can successfully run a refinement. Some of the problems only become apparent<br />
when you have run a refinement job, so it can be a useful diagnostic procedure to run a <strong>Prime</strong><br />
energy calculation first. This job is quick, and produces warnings in the log file, which you can<br />
check in the Monitor panel.<br />
Many of the preparation tasks are performed automatically or interactively with the Protein<br />
Preparation Wizard. The Protein Preparation Wizard panel can be opened from the Workflows<br />
menu on the main toolbar. When the preparation has finished, you should check that all<br />
changes made are correct. You can fix any remaining errors with the procedures below. For<br />
more information on the Protein Preparation Wizard, see the Protein Preparation Guide.<br />
Procedures for fixing bond orders, formal charges, and atom names are given below. These<br />
procedures use the Build toolbar and the Build panel, which you can display by clicking Show/<br />
Hide the Build toolbar on the main toolbar or opening the panel from this button menu.<br />
To assign bond orders:<br />
1. Choose Assign Bond Orders from the Tools menu.<br />
2. Inspect the residues in the structure for any remaining bond orders that are incorrect.<br />
3. If there are still incorrect bond orders, click the Increment bond order or Decrement bond<br />
order button on the Build toolbar.<br />
4. Click the bonds in the Workspace structure that needs correction.<br />
You can also right-click the bonds in the Workspace structure, and choose the correct<br />
order from the Order submenu of the shortcut menu.<br />
To view and change formal charges:<br />
1. From the Label atoms button menu on the main toolbar, choose Formal Charge.<br />
Charges are displayed for atoms that are charged.<br />
2. Click the Increment formal charge or Decrement formal charge button on the Build toolbar.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 53
54<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
3. Click on an incorrectly charged atom in the Workspace until its charge is correct.<br />
The formal charge is incremented or decremented by one unit for each click.<br />
To change PDB atom names:<br />
1. From the Label atoms button menu on the main toolbar, choose PDB Atom Name.<br />
2. In the Atom Properties tab of the Build panel, choose PDB Atom Name from the Property<br />
option menu.<br />
3. Ensure that Pick is selected in the Apply PDB atom name section, and that Atoms is chosen<br />
from the Pick option menu.<br />
4. Zoom in on one of the residues (middle+right mouse buttons or mouse wheel).<br />
5. For each atom that needs renaming, enter the desired PDB atom name in the PDB atom<br />
name text box, then click on the atom.<br />
6. Repeat Step 4 and Step 5 for each residue whose atoms need renaming.<br />
To change residue names:<br />
1. From the Label atoms button menu on the main toolbar, choose Residue information.<br />
2. In the Residue Properties tab of the Build panel, choose Residue Name from the Property<br />
option menu.<br />
3. Ensure that Pick is selected in the Apply residue PDB name section, and that Residues is<br />
chosen from the Pick option menu.<br />
4. For each residue that needs renaming, enter the desired PDB name in the Residue PDB<br />
name text box, then click on the residue.<br />
You might need to zoom in to select the residues (middle+right mouse buttons or mouse<br />
wheel).<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
7.2 Using the Refinement Panel<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
The four tasks that can be performed in the Refinement panel are single-point energy calculation,<br />
loop refinement, side-chain prediction, and minimization. These tasks are discussed in the<br />
following sections. The general procedure for running a structure refinement is given below.<br />
To run structure refinement tasks:<br />
1. Display the structure you want to refine in the Workspace.<br />
If the structure has alternate positions, choose the desired set of alternate positions.<br />
2. Choose Applications > <strong>Prime</strong> > Refinement from the main Maestro window.<br />
The Refinement panel is displayed.<br />
3. Select a task from the Task menu.<br />
The default task is Refine loops.<br />
4. If you want to use an implicit membrane model, select Use implicit membrane, then click<br />
Set Up Membrane to define the thickness and orientation of the membrane.<br />
5. Select the region for refinement, as appropriate.<br />
When you have selected a region for refinement, the Start and Write buttons become<br />
available.<br />
6. Click Start.<br />
The Start dialog box is displayed. If you want to run the job later, using refinestruct,<br />
you can click Write to create the input files.<br />
7. Make changes to the job settings, as appropriate.<br />
8. Click Start.<br />
The refinement job begins and the Monitor panel is displayed.<br />
The Start and Write buttons remain dimmed until you have specified some part of the structure<br />
for refinement. When you have made a selection, the Start and Write buttons become available.<br />
The Write button writes the input file for the job, which you can run at a later time<br />
When selecting regions for refinement, you may want to view information about possible<br />
structural problems. This information is available in the Protein Reports panel, which can be<br />
opened from the Tools menu on the main menu bar. For more information on this panel, see<br />
Section 9.5.3 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
The OPLS_2005 force field is used for all refinement tasks.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 55
56<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 7.1. The <strong>Prime</strong> Membrane Setup panel.<br />
7.3 Setting Up an Implicit Membrane<br />
The <strong>Prime</strong> implicit membrane model is intended for proteins embedded in a membrane. The<br />
implicit membrane is a low-dielectric slab-shaped region, which is treated in the same way as<br />
the high-dielectric implicit solvent region. Hydrophobic groups, which normally pay a solvation<br />
penalty for creating their hydrophobic pocket in the high dielectric region, do not have to<br />
pay that penalty while in the membrane slab. Conversely, hydrophilic groups lose any shortranged<br />
solvation energy from the high dielectric region when moving into the low dielectric<br />
region.<br />
The implicit membrane model is intended for use with proteins that span the membrane. The<br />
slab is trimmed for efficiency, so that any structure that is too small or too far away from the<br />
membrane is likely to result in too much trimming and might not produce useful results.<br />
The membrane is marked in the Workspace by a semitransparent cube with a white line that is<br />
perpendicular to the surfaces of the membrane (which are planar). The membrane surfaces are<br />
colored red, and the other faces of the cube (which are interior to the membrane) are colored<br />
green. You can control whether the sides of the cube are displayed with the Show cube sides<br />
option. The direction of the line shows the orientation of the membrane.You can adjust both the<br />
thickness and the orientation of the membrane.<br />
To set up an implicit membrane:<br />
1. Click Set Up Membrane.<br />
The Membrane Setup panel opens.
2. Click New Membrane (Auto-Place).<br />
A membrane is placed, marked in the Workspace as described above.<br />
3. Select Adjust membrane position.<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
4. Use the middle mouse button to rotate the membrane and the right mouse button to translate<br />
the membrane.<br />
5. Adjust the thickness by entering a value in the text box or using the arrow buttons to<br />
adjust the thickness up or down by 0.1 Å at a time.<br />
6. Deselect Adjust membrane position.<br />
This is necessary to exit membrane adjustment and return to normal Workspace<br />
operations.<br />
If you want to add the membrane to the entries that are selected in the Project Table, click Save<br />
to Selected Entries. The membrane is stored as a set of coordinates for each end of the white<br />
line. The entries that you select should therefore contain proteins that occupy approximately<br />
the same region of space as the protein used to define the membrane. Otherwise, you will have<br />
to adjust the membrane location for these entries.<br />
If you already have a membrane defined for another project entry, you can load it for the<br />
current entry by selecting the entry that has the membrane and clicking Load from Selected<br />
Entry.<br />
If the protein has pockets that would be occupied by solvent (water), you can select these<br />
regions to exclude from the implicit membrane using the tools in the Region to exclude from<br />
membrane section. The solvent dielectric constant is then used for these excluded regions.<br />
Excluded regions are defined by a set of spheres that are placed on atoms adjacent to the<br />
region. The spheres can overlap, and should cover the entire excluded region. It does not matter<br />
that they overlap the protein, because the protein dielectric is not affected. To define a region,<br />
select Place exclusion sphere and pick atoms in the Workspace that are adjacent to the region.<br />
For each pick a sphere is placed on the atom. You can remove the sphere by picking the atom<br />
again. To adjust the sphere volume, use the Exclusion sphere radius slider. This slider only uses<br />
integer values, because the exact size of the sphere isn’t critical: the spheres only has to cover<br />
the excluded region.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 57
58<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 7.2. The Energy Analysis task in the Refinement panel.<br />
7.4 <strong>Prime</strong> Energy Analysis<br />
To perform a single-point molecular mechanics energy calculation, choose Energy Analysis<br />
from the Task option menu. The calculation uses the OPLS_2005 all-atom force field for<br />
protein residues as well as for ligands and cofactors. If desired, set up an implicit membrane<br />
model. Click Start, make job settings in the Start dialog box, then click Start to run the job.<br />
The output includes a breakdown of the energy into covalent, Coulombic, van der Waals and<br />
solvation energy contributions. Each of these is added as a Maestro property to the output<br />
structure file, named <strong>Prime</strong> Covalent, <strong>Prime</strong> Coulomb, <strong>Prime</strong> vdW, and <strong>Prime</strong> Solvation. The<br />
output also includes a breakdown of the energy on a per-residue basis.<br />
7.5 Refining Loops<br />
<strong>Prime</strong>–Refinement is capable of refining loop structures of various lengths, and provides algorithms<br />
for different loop lengths. In addition, loops whose structure affects other loops can be<br />
cooperatively refined in pairs.<br />
To refine one or more loops serially:<br />
1. Choose Refine Loops from the Task menu.
Figure 7.3. The Refine loops task in the Refinement panel.<br />
2. Click Load from Workspace.<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
The loop features of the Workspace structure appear in the Loops table. For each loop, the<br />
table includes a Run check box, a feature name such as loop1, and the numbers of the first<br />
(Res1) and last (Res2) residues in the loop. When you click a row in the table to select a<br />
loop, the Workspace shows the loop residues highlighted in yellow. Note that loops of<br />
one or two residues are not listed, as they are unsuitable for the refinement protocol.<br />
3. Select a loop by clicking its name in the Feature column.<br />
The loop row is highlighted in yellow. The default settings for the loop are displayed.<br />
4. If necessary, shorten the loop region to be refined by changing the first and last residue<br />
numbers (Res1 and Res2).<br />
The time required to refine a loop scales approximately linearly with the length of the<br />
loop, and the sampling options recommended for loops longer than five residues are also<br />
more time-intensive.<br />
5. (Optional) Click Options to select a sampling method and make settings for this loop in<br />
the Structure Refinement Options dialog box.<br />
If the loop is longer than five residues, you should select a sampling method other than<br />
the default.<br />
See Section 7.8.3 for information on the loop refinement options.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 59
60<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
6. To designate this loop for refinement, select the check box in the Run column.<br />
The Start and Write buttons becomes available.<br />
7. Repeat Step 3 through Step 6 for each loop you want to refine.<br />
8. (Optional) Set up and enable an implicit membrane model.<br />
9. Click Start.<br />
The Start dialog box opens.<br />
10. Make any job settings, then click Start.<br />
The loop refinement job is started.<br />
When you choose several loops for refinement, they are refined in series. The first selected<br />
loop is refined, the best scoring structure for that loop is determined, that structure is used in<br />
refining the next loop, and so on. When all loops have been refined, one final structure is<br />
returned. If only one loop is refined, the four highest scoring structures are returned by default.<br />
If you want to refine two loops cooperatively, you should replace Step 3 through Step 7 above<br />
with the following steps:<br />
1. Click the Run column for the two loops you want to refine cooperatively.<br />
2. If necessary, shorten the loop region to be refined by changing the first and last residue<br />
numbers (Res1 and Res2).<br />
3. Click Options to open the Structure Refinement Options dialog box.<br />
4. Select Cooperative loop sampling in the Loop refinement section.<br />
5. Set any other options, then click OK.<br />
For technical details about loop refinement, see Section 7.9.1.<br />
When a loop refinement begins, a validation program checks the loop features of the structure.<br />
Apparent errors are reported in a warning dialog box. You may choose Run All Features, Run<br />
Only Valid Features,orCancel. It is recommended that you make a note of the invalid features,<br />
click Cancel, and correct the structure if appropriate.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 7: <strong>Prime</strong>–Refinement<br />
Figure 7.4. The Predict side chains task in the Refinement panel.<br />
7.6 Predicting Side Chains<br />
To predict side-chain conformations:<br />
1. Select Predict side chains from the Task options menu.<br />
The panel displays atom selection options under the heading Residues for side chain<br />
refinement.<br />
2. Use the atom selection options to specify the residues for which you want to predict or<br />
refine side-chain conformations (See Chapter 5 of the Maestro <strong>User</strong> <strong>Manual</strong> for information<br />
on selecting atoms).<br />
If you imported a PDB structure that has residues with missing atoms, you can select<br />
those residues by clicking Select Residues with Missing Atoms. The selection is based on<br />
the information generated when the structure was imported, not on the current structure.<br />
The selection therefore includes any residues that were fixed since import.<br />
3. Click Options to change the Number of structures to return from the default, which is to<br />
return a single side-chain prediction.<br />
4. Click the Run button to start the side-chain prediction job.<br />
You can also use this process to add side chains to a backbone structure from Refine Backbone.<br />
For technical details about side-chain prediction, see Section 7.9.3.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 61
62<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 7.5. The Minimize task in the Refinement panel.<br />
7.7 Minimizing Structures<br />
The Minimize refinement task performs a truncated-Newton energy minimization, using the<br />
OPLS_2005 all-atom force field for proteins as well as for ligands and cofactors, and treating<br />
solvation energies and effects via the Surface Generalized Born (SGB) continuum solvation<br />
model.<br />
To minimize all or part of the structure selected:<br />
1. Select Minimize from the Task menu.<br />
Atom selection options are displayed under the heading Atoms for minimization.<br />
2. Click All to minimize all residues, or specify a set of atoms or residues.<br />
For more information on atom selection, see Chapter 5 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
3. Click Start to start the minimization job.
Figure 7.6. The Structure Refinement Options dialog box.<br />
7.8 Refinement Panel Options<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
The Structure Refinement Options dialog box offers three sets of options: General options,<br />
Dielectric options, and options for Loop Refinement. Click Options in the Refinement panel to<br />
view or change the options and settings.<br />
7.8.1 General Options<br />
The General section supplies options that apply to all types of refinement. The options include<br />
specification of a random-number seed for starting the side-chain prediction and use of crystal<br />
symmetry.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 63
64<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
Seed<br />
Specify the seed for the random-number generator used in side-chain predictions. The seed is<br />
used for side-chain prediction in both the loop refinement and the side-chain prediction tasks.<br />
The options are:<br />
• Random: use a random seed.<br />
• Constant: use the integer given in the text box as the seed. The default is 0.<br />
Use crystal symmetry<br />
If crystal symmetry is known for this protein, apply periodic boundary conditions so that the<br />
crystal symmetry is satisfied.<br />
7.8.2 Dielectric Options<br />
The Dielectric section allows you to set the dielectric constants used in the calculations, in the<br />
following text boxes.<br />
Internal dielectric<br />
Set the dielectric constant used for the interior of the protein and any implicit membrane.<br />
External dielectric<br />
Set the dielectric constant for the (continuum) solvent.<br />
7.8.3 Loop Refinement Options<br />
The Loop Refinement section provides options for selecting the sampling method, job options,<br />
output options, and some general options.<br />
<strong>Prime</strong> can perform cooperative loop sampling as well as serial loop sampling. Both the size of<br />
the loop and the sampling options chosen affect the time needed for loop refinement. The<br />
options are:<br />
Cooperative loop sampling<br />
Select this option to refine two loops together. Each loop is refined in turn until the refinement<br />
has converged for both loops. You must have only two loops selected to use this option.<br />
Serial loop sampling<br />
Select this option to refine each loop independently. The option menu provides five levels of<br />
sampling accuracy: Default, Extended Low, Extended Medium, Extended High, Ultra Extended.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 7: <strong>Prime</strong>–Refinement<br />
The text below the option and menu displays the recommended loop length for the selected<br />
accuracy.<br />
Extended sampling samples loop conformations more thoroughly, using a series of loop<br />
constraint settings. Up to eight processors can be used simultaneously to perform loop refinement.<br />
Extended sampling can be run at four different sampling levels: Low, Medium, High, and<br />
Ultra. If you select Ultra Extended, you can only refine one loop at a time.<br />
Job<br />
These options allow you to set a limit on the number of processors over which to distribute the<br />
loop sampling subjobs. Each sampling method runs in several stages, and each stage can use a<br />
different number of processors. The maximum number of processors that can be used by the<br />
different sampling methods are:<br />
Default 1<br />
Extended Low 2<br />
Extended Medium 4<br />
Extended High 8<br />
Ultra Extended 8*L (L is the loop length)<br />
• To make use of as many processors as are available, select Distribute subjobs over maximum<br />
available processors.<br />
• To set the maximum number of processors, select Distribute subjobs over and enter a<br />
value in the text box.<br />
Output for single loop feature<br />
These options determine how many structures are returned:<br />
• Maximum number of structures to return: Select this option to return more than one structure<br />
for each loop specified for refinement. The default when this option is selected is to<br />
return 4 structures for each loop.<br />
• Energy cutoff: Select this option to prevent higher-energy structures from being returned.<br />
Structures whose energy is higher than that of the lowest-energy structure by more than<br />
this amount are not returned. The default cutoff when this option is selected is 10 kcal/<br />
mol.<br />
The following settings are applied to the loops you have selected for refinement.<br />
• Maximum CA movement: To limit the extent to which alpha carbon atoms can move from<br />
their original positions, select this option, and enter a limit in the text box. The default<br />
distance is 3.0 Å. The extended sampling procedure tests multiple values for Maximum<br />
CA movement, so this option is unavailable when extended sampling is selected.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 65
66<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
• Unfreeze side chains within: Select this option to allow side chains near the specified loop<br />
to move, and specify a distance in the text box. The default distance is 7.5 Å.<br />
• Minimum allowed vdW overlap: By default, this value is set to 0.70, meaning that the distance<br />
between atoms must be at least 70% of their ‘ideal’ van der Waals separation. The<br />
extended sampling procedure tests multiple values for the minimum overlap, and therefore<br />
this option is unavailable when extended sampling has been selected.<br />
7.9 Technical Details for Refinement<br />
7.9.1 Loop Refinement<br />
Loop refinement using the Default option proceeds as follows:<br />
1. The loop is reconstructed using the backbone dihedral library, by building up half from<br />
each direction. The resolution starts off very coarsely, becoming finer until it manages to<br />
produce a required number of physically realistic loop conformations (based on the<br />
length of the loop).<br />
2. The large number of loops generated in the first step are clustered, and representatives of<br />
each cluster are selected.<br />
3. These cluster representatives are then scored as follows:<br />
a. Side chains are re-added to the loop residues, as well as any residues within a specified<br />
cutoff, according to the side-chain optimization procedure.<br />
b. The loop and contact side chains are minimized.<br />
c. The energy is calculated.<br />
4. The best scoring loop structures are returned.<br />
Loop prediction with extended sampling is specifically designed to overcome sampling problems<br />
with long loops, in which the number of residues results in an unwieldy number of<br />
possible loop conformations.<br />
Extended sampling uses the following procedure:<br />
1. Two initial predictions are run on the loop. These are intended to coarsely sample conformational<br />
space and are carried out with two slightly different sets of prediction parameters.<br />
Each initial prediction returns up to four top-scoring conformations, resulting in up<br />
to eight structures. These eight structures are intended as initial points for finer sampling<br />
of what appears to be favorable regions (basins) of conformational space.<br />
2. Each of the eight structures generated in Step 1 is run through another stage of refine-<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 7: <strong>Prime</strong>–Refinement<br />
ment, with a 4 Å restriction on Cα movement. This is intended to sample the basins more<br />
finely.<br />
3. The eight refined structures from Step 2 are combined with the original structures from<br />
Step 1, and this set of 16 structures is ranked by energy. The five top-scoring structures<br />
are then subjected to a final stage of refinement, with a 2 Å restriction on Cα movement.<br />
This is intended to perform fine-grained sampling of the best basins.<br />
This procedure defines the “Extended High” sampling procedure. The “Extended Medium”<br />
and “Extended Low” carry four and two structures, respectively, into Step 2 and Step 3.<br />
In ultra-extended sampling, the three steps of extended sampling are followed by the steps<br />
below.<br />
4. The top four structures are selected, and two predictions run on each, with part of the loop<br />
frozen. Runs are done with “moving windows” with varying numbers of contiguous residues:<br />
all n residues, the two possible contiguous sets of n-1 residues, the three sets of n-2<br />
residues, and so on. The structures are sorted after each cycle of a given number of residues.<br />
5. Step 2 is repeated with the top eight structures cumulatively generated, then Step 3 is<br />
repeated with the top eight structures cumulatively generated.<br />
7.9.2 Cooperative Loop Refinement<br />
Cooperative loop refinement proceeds as follows:<br />
1. A list of side chains that will be optimized is generated. By default this list includes any<br />
residue with an atom within 7.5 Å of either loop 1 or loop 2.<br />
2. Loop 1 is sampled, allowing side chains in the list from Step 1 to move, holding the backbone<br />
of loop 2 fixed at its original conformation. Eight loop conformations are generated.<br />
3. Loop 2 is sampled, allowing side chains in the list from Step 1 to move, holding the backbone<br />
of loop 1 fixed at its original conformation. Eight loop conformations are generated.<br />
This step runs concurrently with Step 2.<br />
4. Conformations of loop 1 from Step 2 with loop 2 from Step 3 are merged into a set of<br />
structures.<br />
5. All side chains in the list determined in Step 1 are reoptimized, keeping only those structures<br />
whose energy is below the energy cutoff.<br />
6. All residues in both loops are then minimized, including the backbone, keeping only<br />
those structures whose energy is below the energy cutoff.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 67
68<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
7. The 4 best new structures are taken as input for resampling, starting from Step 2. The process<br />
is repeated until convergence is obtained.<br />
7.9.3 Side-Chain Prediction<br />
The structure refinement program uses the following procedure to re-predict conformations for<br />
the side chain set that you select.<br />
1. Side-chain rotamers are randomized for nonconserved residues (the default) or for all residues.<br />
2. Beginning with the first residue to be predicted, the side-chain rotamer library is used to<br />
find the rotamer with the lowest energy while keeping all other side chains fixed. Once<br />
the process is complete for the first residue, the next residue is treated, and so forth until<br />
all have been done once (a single pass has been completed).<br />
3. Once the pass is complete, Step 2 is repeated from the beginning, and then repeated again<br />
until the side-chain rotamers appear to be converged (no more changes are occurring).<br />
4. Minimization is run on all of the side-chain atoms (but not backbone atoms) of the residues<br />
being treated.<br />
7.10 Refinement with Nonstandard Residues<br />
If you want to refine a structure with nonstandard residues, you can make changes to the residues<br />
in the structures before refinement. Several nonstandard residues are supported in the<br />
Build panel: the neutralized forms of ASP, LYS, ARG, and GLU, which are named ASH, LYN,<br />
ARN, and GLH; and the tautomers of HIS (HID=HIS and HIE) and its protonated form, HIP.<br />
Ionization (deprotonation) of four other residues are supported, CYS, SER, THR, and TYR.<br />
The names of the ionized residues are CYT (thiolate), SRO (alkoxide), THO (alkoxide), and<br />
TYO (phenoxide). In addition, the disulfide-bridged CYS is supported as CYX. To change one<br />
of these residues, you can either change the name, or change the structure. For example, you<br />
can generate an ionized SER by either of the following methods:<br />
• Changing the residue name to SRO. The HG is deleted and a formal negative charge is<br />
added to the OG.<br />
• Deleting the HG (if present) and adding a formal charge to the OG. The residue is<br />
renamed to SRO.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
7.11 Output of Refinement Jobs<br />
Chapter 7: <strong>Prime</strong>–Refinement<br />
By default, the structures created in <strong>Prime</strong>–Refinement are appended to the Project Table for<br />
the project currently open. If you would prefer the output structure to replace the input structure,<br />
or would rather not incorporate the output structure into the project at all, select the<br />
appropriate option from the Incorporate option menu in the Start dialog box:<br />
• Append new entries as a new group<br />
• Append new entries individually<br />
• Replace existing entries<br />
• Do not incorporate<br />
In addition to returning the structures, the output of a refinement job include several Maestro<br />
properties. The main properties are <strong>Prime</strong> Energy and <strong>Prime</strong> Energy Difference, which reports<br />
the difference in energy between the input and output structures. The energy difference is<br />
cumulative: if the property already exists, the difference is incremented in the current job. For<br />
loop refinement and side-chain prediction jobs, the lipophilic term is reported separately as<br />
<strong>Prime</strong> Lipo; the sum of the <strong>Prime</strong> energy and the lipophilic term is used to rank loop conformations.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 69
70<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 8: Maestro Protein Structure Alignment<br />
Chapter 8<br />
It is sometimes useful to be able to align proteins outside the usual <strong>Prime</strong> workflow. Maestro<br />
provides access to the structure alignment facilities of <strong>Prime</strong> so that you can perform such<br />
alignments. Two panels are available from the tools menu: the Protein Structure Alignment<br />
panel, and the Align Binding Sites panel.<br />
8.1 The Protein Structure Alignment Panel<br />
To open the Protein Structure Alignment panel, choose Protein Structure Alignment from the<br />
Tools menu in the main window.<br />
In the Protein Structure Alignment panel, structure alignment is performed on Project Table<br />
entries that have been included in the Workspace. The reference structure, whose frame of<br />
reference is used for the alignment, must be the first included entry (the entry with the lowest<br />
row number). To make another protein the reference, move that protein above all the other<br />
included entries in the Project Table.<br />
By default, the alignment includes all residues. However, it is often useful to align a subset of<br />
residues that have greater similarity than the structures as a whole, or are more informative to<br />
compare. Because this program uses matching of secondary structure elements, the subset<br />
should have enough contiguous residues to include at least one helix or strand. Very small<br />
numbers of residues do not produce useful alignments. Selection of subsets is discussed further<br />
under Residues to align, below.<br />
When the alignment calculation is complete, the structures are placed in the same frame of<br />
reference in the Workspace, and the results of the calculation are listed in the text display area<br />
of the Protein Structure Alignment panel. The aligned residues are listed and an RMSD and<br />
Alignment Score are reported. The RMSD is calculated based only on those residues that are<br />
considered to have been successfully aligned, and therefore does not represent an overall<br />
comparison of the structures. It is computed from the C-alpha atoms of the aligned residues.<br />
An Alignment Score lower than 0.6–0.7 indicates a good alignment. Alignment Scores greater<br />
than about 0.7–0.8, or a failure of the structural alignment calculation, indicates there is insufficient<br />
structural similarity for a meaningful alignment. For details on the algorithm, see the<br />
paper by Honig and Yang (J. Mol. Biol. 2000, 301, 665).<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 71
72<br />
Chapter 8: Maestro Protein Structure Alignment<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 8.1. The Protein Structure Alignment panel.<br />
To run a default alignment job, include two or more protein structures in the Workspace, open<br />
the Protein Structure Alignment panel, and click Align to start the alignment calculation.<br />
The Protein Structure Alignment panel provides some tools for controlling which parts of the<br />
protein are aligned and how they are aligned. These tools are described below.<br />
Residues to align Section<br />
Alignment can be performed for all residues or for a subset of residues. You might need to try<br />
different subsets to obtain the most useful alignment. You can use the picking controls, Workspace<br />
selection, or the Atom Selection dialog box to specify subsets. Subsets should include at<br />
least one secondary structure element (helix or strand). Clicking the Select button opens the<br />
Atom Selection dialog box to the Residue tab. Here you can select a range of residue numbers,<br />
specify portions of the residue sequence, or click Secondary Structure in the list of options on<br />
the left to choose residues based on their location in Helix, Strand, or Loop structures. Subsets<br />
can be combined, intersected, or otherwise modified in the Atom Selection dialog box.
Alignment Section<br />
Chapter 8: Maestro Protein Structure Alignment<br />
Gap penalty<br />
In the structure alignment calculation, this is the alignment score penalty for initiating a gap in<br />
the alignment. The default gap penalty is 2.00. The main purpose of this parameter is to<br />
prevent the generation of very poor initial alignments.<br />
Deletion penalty<br />
In the structure alignment calculation, this is the penalty for extending a gap. The default deletion<br />
penalty is 1.00.<br />
Secondary Structure Section<br />
Use scanning alignment<br />
This option is deselected by default. When it is selected, structure alignment is carried out in<br />
scanning rather than the default global mode. Generally, the default global mode is more<br />
useful. However, if you do not find any similarity between structures with global alignment,<br />
you can try scanning alignment.<br />
Global alignment involves running double dynamic programming on the full scoring matrix<br />
built from all secondary structural elements (SSEs) in both structures. In scanning alignment<br />
double dynamic programming is run on successively smaller subregions of the full scoring<br />
matrix (window sizes of 20, 15, 10 and 5 SSEs) until a reasonable alignment is produced.<br />
Assuming a reasonable alignment is found it is used as a starting point for building a global<br />
alignment (involving all SSEs).<br />
Window length<br />
Number of successive SSEs that are to be used as a baseline for the global alignment. The<br />
default is 5. This option is significant only if Use scanning alignment has been selected.<br />
Minimum similarity<br />
This is the alignment score above which alignments are reported as unsuccessful. The default<br />
is 1.00, where alignment scores of 0.7 or less mean greater similarity, and alignment scores<br />
above 1.0 mean less similarity. When alignment scores are greater than 1.0, the alignment is<br />
unlikely to be meaningful.<br />
Structural similarity of proteins is measured by the Protein Structural Distance (PSD). This<br />
measure combines the similarity scores for the secondary structural elements obtained from<br />
dynamic programming and the RMSD of aligned residues for the optimal alignment. For<br />
details see Yang, A.; Honig, B. Journal of Molecular Biology 2000, 301, 665-678.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 73
74<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 8.2. The Align Binding Sites panel.<br />
Minimum length<br />
Minimum number of SSEs that are to be aligned in scanning alignment. The default is 2.<br />
Results Text Area<br />
Results from the alignment job are displayed in this text area.<br />
8.2 The Align Binding Sites Panel<br />
The Align Binding Sites panel allows you to align the binding sites (or other structural features)<br />
of members of a family of proteins. The Align Binding Sites job first runs a Protein Structure<br />
Alignment to obtain the global alignment and then automatically generates the list of atoms to<br />
use in a pairwise alignment of the C-alpha atoms from the residues selected.<br />
To open the Align Binding Sites panel, choose Tools > Align Binding Sites in the main window.
Chapter 8: Maestro Protein Structure Alignment<br />
The Use proteins from option menu provides choices of the source of protein structures. If you<br />
choose Project Table, the entries that are selected in the Project Table are used. If you choose<br />
File, the Input file text box and Browse button become available, and you can browse for the<br />
input file.<br />
In the Residues for alignment section you select the residues that are to be used to align the<br />
proteins. There are two options, described below.<br />
• Automatically detect binding site residues—Align residues within a specified distance of<br />
the ligand. The distance is specified in the Align residues within N Å from the ligand text<br />
box. Two options are available for specifying the ligand:<br />
• Detect ligand automatically—Detect the ligand by selecting the molecule that is<br />
most ligand-like (number of atoms).<br />
• Use molecule number—Specify the ligand by molecule number. You can enter a<br />
number in the text box, or select Pick ligand and pick a ligand atom in the Workspace.<br />
• <strong>Manual</strong>ly select residues—Specify the residues to align by using any of the following<br />
methods:<br />
• Enter the residue specification in the Align residues text box, in the format<br />
chain:number.<br />
• Click Select residues to open the Atom Selection dialog box and define the residues.<br />
• Select Pick a residue to include/exclude and pick residues in the Workspace. Picking<br />
an already included residue excludes it.<br />
The residues are listed in the Align residues text box, which you can edit to remove or add<br />
residues.<br />
The alignment is performed on atom pairs. You can ignore atom pairs in the alignment process<br />
if they are further apart than a specified distance, by entering the distance in the Ignore pairs<br />
greater than N Å apart text box.<br />
If you have proteins that are prealigned, you can select Structures are prealigned to suppress<br />
the global alignment of the protein structures, and align only by using the selected residues.<br />
When you have finished making settings, click Start to start the job.<br />
You can also run a job to align binding sites from the command line—see Section 12.4 on<br />
page 119 for more information.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 75
76<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 9: Docking Covalently Bound Ligands<br />
Chapter 9<br />
While Glide is highly efficient at docking ligands that are bound to a receptor through<br />
hydrogen bonds or various nonbonded interactions, it is not able to dock ligands that are<br />
covalently bound to the receptor. <strong>Prime</strong> provides a covalent docking facility that allows you to<br />
select the attachment point to the receptor and the possible attachment points on the ligand, and<br />
run a <strong>Prime</strong> loop prediction to find poses for the ligand.<br />
9.1 Running Covalent Docking from Maestro<br />
Covalent docking calculations can be set up and run from the Covalent Docking panel. To open<br />
the Covalent Docking panel, choose Applications > <strong>Prime</strong> > Covalent Docking in the main<br />
window.<br />
The Covalent Docking panel has two main sections. Above these sections, you can specify the<br />
source of the ligands. In the Ligand reactive groups section, you can specify the kinds of<br />
groups that will react at the receptor attachment site. In the Receptor options section, you can<br />
specify the receptor bond that is broken and choose receptor residues to sample.<br />
9.1.1 Selecting the Ligands<br />
The ligands that you dock to the receptor can be taken from the Project Table or from a file.<br />
• To use the structures that are selected in the Project Table, choose Project Table from the<br />
Use ligands from option menu.<br />
• To read the ligands from a file, choose File from the Use ligands from option menu. You<br />
can then either enter the file name in the Input file text box, or click Browse and navigate<br />
to the file in the file selector that opens. The file must be a Maestro file, and can be compressed<br />
(.maegz, .mae.gz) or uncompressed (.mae).<br />
You should ensure that the structures are all-atom, 3D structures that are properly prepared, for<br />
example by using LigPrep. See the LigPrep <strong>User</strong> <strong>Manual</strong> for more information on ligand preparation.<br />
The poses that are returned include both the ligand and the receptor, so you should consider<br />
carefully how many ligands you want to dock.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 77
78<br />
Chapter 9: Docking Covalently Bound Ligands<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 9.1. The Covalent Docking panel.<br />
9.1.2 Defining the Ligand Reactive Groups<br />
In the Ligand reactive groups section, you specify the reactive groups on the ligands, in which<br />
a bond is broken to form a bond with the receptor. Each group is specified by a SMARTS<br />
pattern and the index of the atom that leaves, with its attachments, when the ligand bond is<br />
broken. You can specify multiple reactive groups. For each ligand, all groups are tried for a<br />
match to the ligand, and each match is docked.<br />
To define each SMARTS pattern, you can enter the pattern in the SMARTS pattern text box, or<br />
you can select atoms on a typical ligand molecule in the Workspace, and click Get From Selection<br />
to place a pattern in the SMARTS pattern text box. You can then edit the SMARTS pattern<br />
in the text box if you want.
Chapter 9: Docking Covalently Bound Ligands<br />
When you are satisfied with the SMARTS pattern, you must then designate an atom on the<br />
ligand that is removed, along with the atoms attached to it, when the ligand forms a bond with<br />
the receptor. The fragment with the smaller number of atoms is the part that is removed. The<br />
atom is specified by its index (position) in the SMARTS pattern; the first atom index is 1. This<br />
index must be entered into the Leaving atom text box.<br />
Having defined a SMARTS pattern and the leaving group, click Add. The SMARTS pattern is<br />
added to the table below, and will be used when processing the ligands. You can repeat this<br />
process for as many SMARTS patterns as you want to use. If you want to remove a pattern<br />
from the table, select it and click Remove.<br />
9.1.3 Specifying the Receptor Bond to Break<br />
In the Receptor options section you specify the bond in the receptor that is broken to form a<br />
bond with the ligand. The receptor must be displayed in the Workspace, and is written to a file<br />
when you start the job.<br />
To define the bond that is broken, select Pick receptor bond to break, and pick the desired bond<br />
on the receptor in the Workspace. The Staying atom and Leaving atom text boxes are filled with<br />
the atom numbers of the atom in the bond that remains and the atom in the bond that is<br />
removed when the bond is broken. The bond is marked in the Workspace with yellow markers.<br />
If you pick the wrong bond, you can simply pick again, and the new bond replaces the old.<br />
You can also enter or edit the atom numbers in the Staying atom and Leaving atom text boxes.<br />
To ensure that you have the correct atom numbers, you should label the atoms. For more information<br />
on labeling atoms, see Section 6.3 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />
9.1.4 Specifying Receptor Residues to Sample<br />
In the Receptor options section you can also choose residues that are allowed to adjust to the<br />
formation of the covalent complex in the loop sampling procedure.<br />
The first choice is whether to sample the residue that is attached to the ligand. To do so, select<br />
Sample attachment residue.<br />
The next step is to pick other residues in the Workspace for sampling. To do so, select Pick<br />
additional residue to sample, and pick residues in the Workspace. You can pick any number of<br />
residues. As well as picking, you can select the desired residues in the Atom Selection dialog<br />
box, which you open by clicking Select. The residues that you select by either method are<br />
listed in the Additional residues text box.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 79
80<br />
Chapter 9: Docking Covalently Bound Ligands<br />
You can extend your selection to include residues that are within a given distance of the<br />
selected residues. To do so, enter a value in the Include residues within N Å of picked residues<br />
text box. These residues are also added to the list in the Additional residues text box.<br />
You can remove residues from the list to sample by deleting them from the Additional residues<br />
text box.<br />
9.1.5 Running the Job<br />
When you have finished making settings, click Start to open the Start dialog box. In this dialog<br />
box you can make job settings and start the job.<br />
The results are returned in a Maestro file. Each pose includes both the ligand and the receptor,<br />
one pose for each ligand attachment point found.<br />
9.2 Running Covalent Docking from the Command Line<br />
It is also possible to run covalent docking jobs from the command line. To do so, you should<br />
first set up the input file in the Covalent Docking panel, as described in the previous section,<br />
and then click Write, to write the input file for the job. This button opens a file selector, in<br />
which you can browse to the location and choose the file name.<br />
The covalent docking job is run with the same pipelining workflow scripts as other workflows,<br />
such as the Virtual Screening Workflow (VSW) and Quantum-Polarized Ligand Docking<br />
(QPLD). Covalent docking does not have its own script, so you can use the script for VSW, and<br />
run the job as<br />
$SCHRODINGER/vsw [job-options] input-file<br />
The job options are the usual Job Control options that specify the host and other resources.<br />
These options are described in Section 2.3 of the Job Control Guide.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 10: <strong>Prime</strong> MM-GBSA<br />
Chapter 10<br />
The <strong>Prime</strong> MM-GBSA panel can be used to calculate ligand binding energies and ligand strain<br />
energies for a set of ligands and a single receptor, using the MM-GBSA technology available<br />
with <strong>Prime</strong>.<br />
The ligands and the receptor must be properly prepared beforehand, for example, by using<br />
LigPrep and the Protein Preparation Wizard. The ligands must be pre-positioned with respect<br />
to the receptor, and the receptor must be prepared as for a <strong>Prime</strong> refinement calculation.<br />
To open the <strong>Prime</strong> MM-GBSA panel, choose MM-GBSA from the <strong>Prime</strong> submenu of the Applications<br />
menu.<br />
After preparing your structures, you can set up the calculation as follows:<br />
1. Specify the source of the structures.<br />
You can take structures from a Pose Viewer file (Glide output), or from separated ligand<br />
and protein structures. If you choose the latter option, you must ensure that the ligands<br />
and the protein are properly prepared and aligned.<br />
2. (optional) Choose calculation settings.<br />
If you want to evaluate ligand strain energies as well as ligand binding energies, select<br />
Calculate ligand strain energies. With this option, an extra calculation on the ligand is<br />
performed at its geometry in the complex, and this result is combined with the free ligand<br />
calculation to calculate the strain energy.<br />
If you want to use ligand partial charges evaluated by some other program, such as QSite,<br />
select Use ligand input partial charges. The input partial charges on the ligand will then<br />
be used instead of those assigned using the default force field.<br />
If you want to use the <strong>Prime</strong> implicit membrane model, select Use implicit membrane.<br />
The receptor you choose must have been already set up with the membrane, as described<br />
in Section 7.3 on page 56.<br />
3. (optional) Select a region within a certain distance of the ligand for which the protein<br />
structure will be relaxed in the calculation.<br />
The atoms in all residues within the specified distance of the first ligand processed are<br />
included in the flexible region. By default, all protein atoms are frozen, and only the<br />
ligand structure is relaxed. The larger the flexible region, the longer the calculation takes.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 81
82<br />
Chapter 10: <strong>Prime</strong> MM-GBSA<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Figure 10.1. The <strong>Prime</strong> MM-GBSA panel.<br />
4. Click Start to start the job, or click Write to write the input file and run the job from the<br />
command line.<br />
When you click Start, the Start dialog box opens, in which you can choose to incorporate<br />
the results into the Maestro project, set the job name, select the host, and set the number<br />
of CPUS and number of subjobs, if you are running the job on a multiprocessor host or<br />
submitting it to a queuing system. You can divide the ligand set between a specified number<br />
of subjobs, to achieve load balancing. The number of subjobs must be no smaller than<br />
the number of CPUs, and for optimal load balancing, should be several times the number<br />
of CPUs.<br />
You can also run <strong>Prime</strong> MM-GBSA from the command line on Unix, with the input file written<br />
from the panel. The syntax is as follows:<br />
$SCHRODINGER/prime_mmgbsa [options] input-file<br />
The input file must be a Maestro file with the receptor as the first entry followed by one or<br />
more ligands (i.e. pose viewer file format). The options are described in Table 10.1.
Table 10.1. Options for the prime_mmgbsa command<br />
Option Description<br />
Chapter 10: <strong>Prime</strong> MM-GBSA<br />
-b jobname Use alternative job basename. Default is to use the input file basename.<br />
-cmae Use atomic partial charges from the input Maestro file in simulations<br />
-extdiel value Set the external dielectric to value.<br />
-h Print help message and exit<br />
-intdiel value Set the internal dielectric (including the membrane) to value.<br />
-keep Save all minimized ligand, complex, and protein structures (default is to save<br />
only minimized protein-ligand complexes)<br />
-lstrain Include an estimate of the ligand strain energy in the output<br />
-membrane Use the <strong>Prime</strong> implicit membrane model in protein and complex simulations<br />
-n subjobs Number of subjobs to run.<br />
-rflex file File listing protein residues to be treated flexibly. The residues are specified<br />
in the format chain:residue.<br />
-v Print version information and exit<br />
The output Maestro structure file includes a number of properties, which are described in<br />
Table 10.2.<br />
Table 10.2. Output properties from a <strong>Prime</strong> MM-GBSA calculation.<br />
Property Description<br />
<strong>Prime</strong> Coulomb Coulomb energy of the complex<br />
<strong>Prime</strong> Covalent Covalent binding energy of the complex<br />
<strong>Prime</strong> vdW Van der Waals energy of the complex<br />
<strong>Prime</strong> Solv SA Solvation surface area of the complex<br />
<strong>Prime</strong> Solv GB Solvation energy of the complex<br />
<strong>Prime</strong> Energy <strong>Prime</strong> energy of the complex<br />
<strong>Prime</strong> MMGBSA<br />
Complex Energy<br />
<strong>Prime</strong> MMGBSA Ligand<br />
Energy<br />
<strong>Prime</strong> MMGBSA<br />
Receptor Energy<br />
MMGBSA energy of the complex<br />
MMGBSA energy of the free ligand<br />
MMGBSA energy of the uncomplexed receptor<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 83
84<br />
Chapter 10: <strong>Prime</strong> MM-GBSA<br />
Table 10.2. Output properties from a <strong>Prime</strong> MM-GBSA calculation. (Continued)<br />
Property Description<br />
<strong>Prime</strong> MMGBSA Ligand<br />
in Complex Energy<br />
<strong>Prime</strong> MMGBSA Ligand<br />
Strain Energy<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
MMGBSA energy of the ligand in the complex<br />
MMGBSA ligand strain energy<br />
<strong>Prime</strong> MMGBSA DG bind MMGBSA free energy of binding<br />
<strong>Prime</strong> MMGBSA DG bind<br />
no Ligand Strain<br />
MMGBSA free energy of binding without including ligand strain
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 11: Command Syntax<br />
Chapter 11<br />
This chapter summarizes the command syntax for various programs that are part of <strong>Prime</strong>. All<br />
the programs are executed by driver scripts, which run under the Schrödinger Job Control<br />
facility. For more information on this facility, see the Job Control Guide. The conventions used<br />
in describing the command syntax are outlined in the Document Conventions section at the<br />
front of this manual.<br />
Residue specifications are used with a number of the input file keywords for several of the<br />
scripts. These have the format C:N, where C is a one-character chain ID (‘_’ if the chain ID is<br />
blank) and N is the residue number and insertion code (if any). Atom specifications similarly<br />
have the format C:N:A, where A is the 4-character PDB atom name with spaces replaced by<br />
underscores, e.g. A:185:_SG_.<br />
Most of these scripts accept the standard Schrödinger Job Control command line options and<br />
other commonly used options, listed in Table 11.1. Exceptions are noted in the program<br />
descriptions.<br />
Table 11.1. Standard Schrödinger job control options.<br />
Option Description<br />
-HOST host run the job on the specified host. The default is the local host.<br />
-LOCAL Run the job in the local directory rather than the temporary directory<br />
-TMPDIR tmpdir Run the job in the temporary directory tmpdir<br />
-WAIT Do no return control to the shell until the job finishes<br />
-INTERVAL n Update the output every n seconds<br />
-NICE Run the job at reduced priority<br />
-DEBUG Turn on debugging for job control as well as the program<br />
-HELP Show information on command-line usage<br />
11.1 blast—Run Blast Searches<br />
The blast script provides an interface to the Blast and PSIBlast programs from NCBI, while<br />
also supporting standard Schrodinger Job Control options. All options are passed on to the<br />
script via an input file.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 85
86<br />
Chapter 11: Command Syntax<br />
Syntax<br />
$SCHRODINGER/blast jobname [options]<br />
Input is read from the file jobname.inp. Output is written by default to jobname.out.<br />
Options<br />
The blast script accepts the options listed in Table 11.1.<br />
Input File<br />
Keywords control the operation of the blast script, and are listed in Table 11.2. Most<br />
keywords are simply a translation of options available from the blastall or blastpgp<br />
executables distributed by NCBI. For optional keywords, the Blast default values are used.<br />
Since these defaults can differ between different version of Blast they are not listed here.<br />
Table 11.2. Keywords for the blast script<br />
Keyword Description<br />
QUERY_FILE filename Required. Name of the FASTA file containing the input/query<br />
sequence<br />
DATABASE {nr | pdb} Database to be used in the search. In a default installation the two<br />
allowed values are nr and pdb.<br />
ALIGNMENT_FORMAT value Alignment view options (Blast -m option). Allowed values are:<br />
0 pairwise<br />
1 query-anchored showing identities<br />
2 query-anchored no identities<br />
3 flat query-anchored, show identities<br />
4 flat query-anchored, no identities<br />
5 query-anchored no identities and blunt ends<br />
6 flat query-anchored, no identities and blunt ends<br />
7 XML Blast output<br />
8 tabular<br />
9 tabular with comment lines<br />
m2io Maestro format (default)<br />
EXPECT value Expectation value (E) (Blast -e option).<br />
FILTER {true | false} Filter query sequence: DUST with blastn, SEG with others. (Blast<br />
–F option).<br />
GAP_OPEN_COST value Cost to open a gap (Blast -G option)<br />
GAP_EXTEND_COST value Cost to extend a gap (Blast -E option)<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Table 11.2. Keywords for the blast script (Continued)<br />
Keyword Description<br />
Return Value and Errors<br />
blast returns 0 if successful and a non-zero value in all other cases.<br />
Examples<br />
Chapter 11: Command Syntax<br />
MATRIX type Sequence similarity matrix to use for the alignments. Allowed values<br />
of type are BLOSUM45, BLOSUM62, BLOSUM80, PAM30, PAM70<br />
MAX_ROUNDS n Maximum number of passes to use in multipass version (PSI-Blast<br />
–j option); only applies to runs using PROGRAM blastpgp<br />
E_VALUE_THRESHOLD value e-value threshold for inclusion in multipass model (PSIBlast -h<br />
option).<br />
OUTPUT_FILE filename Name of the output file. Default: jobname.out.<br />
PROGNAME name Type of blast search to run. The allowed values are blastp,<br />
blastpgp, and bl2seq (for pairwise alignments). Default: blastp.<br />
TEMPLATE_FILE filename Name of the FASTA file that contains the template sequence<br />
(bl2seq –j option). Only applies to pairwise alignments using<br />
bl2seq<br />
PSSM_ASCII_OUTFILE<br />
filename<br />
EXPAND_HITS {true |<br />
false}<br />
The following is an example input file; query.aa is the query sequence in FASTA format.<br />
QUERY_FILE query.aa<br />
PROGNAME blastp<br />
DATABASE pdb<br />
Environment Variables<br />
ASCII output file for PSI-BLAST matrix (PSIBlast -Q option).<br />
Expand redundant hits, based on information in the NCBI define for a<br />
particular hit. Only available when using Maestro format.<br />
SHOW {all | pdb} Allows the user to restrict the types of hits that are included in the<br />
output. all includes all hits; pdb only includes hits from the PDB.<br />
Only available when using Maestro format.<br />
The following environment variables are supported by the blast script:<br />
PSP_BLASTDB directory containing the customized Blast databases<br />
PSP_BLAST_DIR directory containing a custom Blast installation<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 87
88<br />
Chapter 11: Command Syntax<br />
11.2 bldstruct—Build a Protein Model from an Alignment<br />
The bldstruct script can be used to build initial (unrefined) protein models from an alignment<br />
to one or more templates via the Protein Local Optimization Program (PLOP).<br />
bldstruct handles all the tasks associated with the building of homology models via PLOP<br />
(i.e. parsing and preparation of input files, interaction with PLOP, and handling of output structures,<br />
etc.). All jobs handled by bldstruct are run under Schrödinger Job Control.<br />
Multithreaded execution (with Open MP) can be enabled for bldstruct jobs by setting the<br />
environment variable OMP_NUM_THREADS. It is recommended to set this environment variable<br />
in the schrodinger.hosts file for hosts on which multithreading is supported—see<br />
Section 6.1 of the Installation Guide.<br />
Syntax<br />
$SCHRODINGER/bldstruct input-file [input-file2 ...] [options]<br />
The input file can be specified with or without the .inp extension. If multiple input files are<br />
specified, the chains are combined to build a multimer. Each input file should correspond to a<br />
single chain of the multimer, and be accompanied by the usual associated PDB template structure<br />
and alignment files. The output is a single structure that contains all of the chains, refined<br />
together; the job and the output file are named after the first input file.<br />
Options<br />
The -DEBUG and -HOST options are recognized by bldstruct. See Section 2.3 of the Job<br />
Control Guide for more information.<br />
Command Input File<br />
The command input file is named jobname.inp. This input file contains keyword-value pairs,<br />
one pair per line separated by one or more spaces. The keywords are given in Table 11.3.<br />
Table 11.3. Keywords for the bldstruct script<br />
Keyword syntax Description<br />
BUILD_DELETIONS {true |<br />
false}<br />
BUILD_TRANSITIONS {true |<br />
false}<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Turn on or off closure of the chain breaks near deletions in the<br />
alignment. If not closed, they will remain as chain breaks in the<br />
output structure. Default: true.<br />
Turn on or off closure of the junctions between templates in multitemplate<br />
homology modelling jobs. If not closed, they will remain<br />
as chain breaks in the output structure. Default: true.
Table 11.3. Keywords for the bldstruct script (Continued)<br />
Keyword syntax Description<br />
Chapter 11: Command Syntax<br />
COMPOSITE_ARRAY string A string of integers indicating from which template (or, more specifically,<br />
from which alignment) coordinates for a given residue<br />
should be obtained. Thus, this string of integers must be equal in<br />
length to the query sequence being used in the alignment files.<br />
COMPOSITE_ARRAY is required for multiple template jobs. It is not<br />
required for single template jobs, but if it is present, it is ignored.<br />
KEEP_ROTAMERS {true |<br />
false}<br />
Specifies which side chains to optimize. If true, the rotamers for<br />
conserved side chains are preserved, and only non-conserved side<br />
chains are predicted. If false, all side chains are predicted. This<br />
keyword is ignored if SIDE_OPT is false. Default: true.<br />
MAX_INSERTION_SIZE n Specifies the longest insertion in the alignment that will be built.<br />
Insertions longer than this will be omitted, and not appear in the<br />
output structure. Residue numbering will reflect the full query<br />
sequence however. Default: 1000.<br />
MINIMIZE {true | false} Turn on or off minimization of regions that were not directly copied<br />
from the template structure. These regions primarily include portions<br />
of the structure involved in closing gaps or building insertions,<br />
but also include any side chains optimized due to the<br />
SIDE_OPT keyword. Default: true.<br />
SIDE_OPT {true | false} Turn on or off a side chain optimization stage. Default: true.<br />
template_ALIGN_FILE<br />
filename<br />
template_HETERO_i<br />
ligand-spec<br />
Required. The name of the file containing the alignment between<br />
this template and the query. Details on the format are provided in<br />
the Examples section below.<br />
Required if a ligand is present. Specifies a ligand to include from<br />
this template. The format of the specification is<br />
AAA C:###<br />
where AAA is the ligand’s three letter code, C is its chain ID, ### is<br />
its residue number. Multiple ligands from the template are numbered<br />
using i as an index, starting from zero.<br />
TEMPLATE_NAME template Required. A label used to identify a given template in subsequent<br />
data fields. For example, if TEMPLATE_NAME is set to 1BPJ_A,<br />
other template-specific fields are given as 1BPJ_A_STRUCT_FILE,<br />
1BPJ_A_ALIGN_FILE, and so on.The chain to be used is specified<br />
by an underscore and letter suffix: for example, for 1BPJ_A, chain<br />
A will be used. The chain suffix must be included in the label.<br />
template_NUMBER n Required. Used to identify a given template in the<br />
COMPOSITE_ARRAY (see below). Numbering goes from 1 to 9.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 89
90<br />
Chapter 11: Command Syntax<br />
Table 11.3. Keywords for the bldstruct script (Continued)<br />
Keyword syntax Description<br />
template_STRUCT_FILE<br />
filename<br />
USE_SYMMETRY {true |<br />
false}<br />
Files<br />
In addition to the input files (structure and alignment) and the job files (command input file,<br />
jobname.inp, and log file, jobname.log), bldstruct produces the following output files:<br />
jobname-out.mae Built model in Maestro format.<br />
jobname-out.pdb Built model in PDB format.<br />
Return Value and Errors<br />
bldstruct returns a value of 0 on successful completion, 1 otherwise. All error messages are<br />
printed to the file jobname.log. There is no comprehensive listing of possible errors as these<br />
can arise in a variety of programs and scripts.<br />
Examples<br />
The following is a sample input file for building on two templates:<br />
QUERY_OFFSET 0<br />
1TSV_NUMBER 1<br />
1TSV_STRUCT_FILE pdb1TSV.ent<br />
1TSV_ALIGN_FILE 1TSV.align<br />
TEMPLATE_NAME 1BPJ_A<br />
1BPJ_A_NUMBER 2<br />
1BPJ_A_STRUCT_FILE pdb1BPJ_A.ent<br />
1BPJ_A_ALIGN_FILE 1BPJ_A.align<br />
1BPJ_A_HETERO_0 UMP :317<br />
1BPJ_A_HETERO_1 ZN :35<br />
TAILS 0<br />
SIDE_OPT true<br />
MINIMIZE true<br />
KEEP_ROTAMERS true<br />
MAX_INSERTION_SIZE 1000<br />
BUILD_DELETIONS true<br />
BUILD_TRANSITIONS true<br />
USE_SYMMETRY false<br />
COMPOSITE_ARRAY 111111111111111111111112222222222222222222222222222222222221111111<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Required. The PDB file to be used for this particular template.<br />
Turn on or off use of the crystallographic symmetry of the template<br />
file when constructing the homology model. Default: false.
Chapter 11: Command Syntax<br />
The alignment file format is as follows. The first line corresponds to the query sequence (indicated<br />
by ProbeAA:) and the second line corresponds to the template sequence (indicated by<br />
Fold AA:). A period is used as a gap character, and an X is used for non-standard residues.<br />
Below is a sample alignment file.<br />
ProbeAA: .AAEEKTEFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESA...PAALKEGVSKDDAEALKKALEEAG<br />
AEVEVK<br />
Fold AA: AAQEEKTEFDVVLKSFGQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAG<br />
AEVELK<br />
The sequences are split for formatting reasons, and should be on a single line in the actual file.<br />
11.3 fr—Fold Recognition<br />
fr finds template proteins that are similar to the query sequence at both sequence and structure<br />
levels.<br />
The fold recognition program used in <strong>Prime</strong> differs from standard sequence search program<br />
such as BLAST by taking into account secondary structure matching and profile-sequence<br />
matching. As a result, fr can find structural homologs with low sequence identity to the query<br />
(
92<br />
Chapter 11: Command Syntax<br />
Syntax<br />
Options<br />
$SCHRODINGER/fr jobname [options]<br />
The -DEBUG and -HOST options are recognized by fr. See Section 2.3 of the Job Control<br />
Guide for more information.<br />
Command Input File<br />
The command input file is named jobname.inp. This input file contains keyword-value pairs,<br />
one pair per line separated by one or more spaces. The keywords are given in Table 11.4.<br />
Table 11.4. Keywords for the fr script.<br />
Keyword syntax Description<br />
QUERY_NAME name Name of the query sequence<br />
QUERY_FILE filename Query sequence file, in FASTA format<br />
ALIFORMAT value Select format of alignments file. Default is Maestro format. For “normal”<br />
format set value to dev.<br />
BGNRES n The first residue of the query that is used in search for homologs. Default: 1<br />
(the first residue).<br />
ENDRES n The last residue of the query that is used. Default: the last residue of the<br />
complete query<br />
PREDSS[_i] filename Secondary structure prediction file of the query sequence, in CASP format.<br />
One occurrence of this keyword is required for each file, and there must be<br />
at least one SSP. If only one prediction is used, the keyword is PREDSS; if<br />
more predictions are used, the keyword is suffixed with an index, starting<br />
from zero: PREDSS_0, PREDSS_1, and so on.<br />
Files<br />
• Input file—named jobname.inp. Contains keywords for running the program.<br />
• Query sequence file—File containing the complete sequence of the query, in FASTA format.<br />
Specified in the input file.<br />
• Secondary structure prediction files in CASP format for the query sequence. See below<br />
for an example of CASP format.<br />
• Output Ranking File—File named jobname.out containing a list of templates ranked<br />
according to the FR scoring function as described above.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 11: Command Syntax<br />
• Output alignment file—File named jobname.ali containing pairwise alignment between<br />
the query and a template. By default, the alignment is in Maestro format (m2io). In order<br />
to view alignments in normal format (i.e. aligned residues from query and template are<br />
placed on top of each other), add the keyword ALIFORMAT to the input file with value set<br />
to dev.<br />
• Log file—File named jobname.log containing progress of the program, including warnings<br />
and error messages.<br />
Return Value and Errors<br />
The script returns 0 for success and 1 for failure.<br />
In case of failure, check the log file for details on what might be have caused the program to<br />
fail. To turn on debugging output use the -DEBUG command line option as follows:<br />
$SCHRODINGER/fr -DEBUG jobname<br />
Examples<br />
The following file demonstrates the use of three SSPs.<br />
QUERY_NAME HypProtein_furiosus_truncated<br />
QUERY_FILE HP.seq<br />
PREDSS_0 prime_foldrecog_run1_HP-ssp1.casp<br />
PREDSS_1 prime_foldrecog_run1_HP-ssp2.casp<br />
PREDSS_2 prime_foldrecog_run1_HP-ssp3.casp<br />
BGNRES 1<br />
ENDRES 197<br />
The following is an example of a CASP format file. The CASP format has a minimum of 2<br />
columns per line, with the first being the one-letter code for the amino acid and the second<br />
being the secondary structure type. An optional third column gives the confidence level of the<br />
prediction.<br />
PFRMAT SS<br />
TARGET 1HL6:A<br />
AUTHOR xxxx-xxxx-xxxx<br />
REMARK written by Schrodinger LLC Software<br />
METHOD not applicable<br />
MODEL 1<br />
M C 1.00<br />
A C 1.00<br />
D H 1.00<br />
V H 1.00<br />
...<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 93
94<br />
Chapter 11: Command Syntax<br />
E C 1.00<br />
K C 1.00<br />
R C 1.00<br />
R C 1.00<br />
R C 1.00<br />
END<br />
11.4 multirefine—Multi-Step Extended Sampling<br />
The multirefine script automates extended sampling protocols, which consist of multiple<br />
executions of refinestruct on a set of structures. Multirefine handles the submission of the<br />
subjobs and the compilation of the results. Currently available protocols include extended loop<br />
sampling, ultra-extended loop sampling, and cooperative loop pair sampling.<br />
Syntax<br />
$SCHRODINGER/multirefine input-file [options]<br />
where input-file is the name of the command input file, with or without a .inp extension, and<br />
serves also as the default job name. The input file must be either the first or last command-line<br />
argument, and everything else on the command line is passed as is to Job Control.<br />
Options<br />
The standard Job Control command line options listed in Table 11.1 are accepted. There are no<br />
options specific to this program.<br />
Command Input File<br />
Most of the keywords are the same as the general keywords and the keywords for protocol 1<br />
(loop and helix refinement) for refinestruct: see Section 11.5 on page 98 for details. The<br />
differences and new keywords are described below.<br />
Table 11.5. Keywords for the multirefine input file<br />
Keyword syntax Description<br />
HOST hostname Job host to which the subjobs will be submitted. The top-level driver script<br />
will run on the host specified by the -HOST command-line argument (or on<br />
the launch host if none is specified.) If hostname is the same as that specified<br />
by -HOST or is localhost, the subjobs will run on the same host as the toplevel<br />
driver (or be submitted to the same queuing system if the top-level host<br />
is a queue). Otherwise, the driver will submit subjobs using hostname as their<br />
command-line argument. Not available with refinestruct.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Table 11.5. Keywords for the multirefine input file (Continued)<br />
Keyword syntax Description<br />
Chapter 11: Command Syntax<br />
LOOP_i_RES_j Specify beginning and end of loops. Same as for refinestruct. Required if<br />
SEG_LIST is not given. Depending on the protocol, 1, 2, or an arbitrary number<br />
of loops may be specified. For example, to specify the beginning and end<br />
of loop 0, from residue 10 to residue 20 in chain A, use the following:<br />
LOOP_0_RES_0 A:10<br />
LOOP_0_RES_1 A:20<br />
MAX_JOBS n Maximum number of subjobs that can run at one time. This can be set to a<br />
value less than the value of THREADS if sufficient resources are not available.<br />
Not available with refinestruct.<br />
PRIME_TYPE n Type of refinement to be carried out. Can be specified as either a number or<br />
text string. Different from refinestruct. Available choices are:<br />
0 or default: Equivalent to running a single loop refinement using<br />
refinestruct. Not particularly useful except for completeness and testing<br />
purposes. Does not recognize THREADS keyword.<br />
1 or extended: Used for Extended Sampling from the GUI. Does not recognize<br />
MIN_OVERLAP and MAX_CA_MOVEMENT keywords, as these are used<br />
internally by the protocol. Accepts multiple loops as input which will be<br />
run sequentially.<br />
2 or long_loop: No longer used.<br />
3 or loop_pair: Used for Cooperative Loop Sampling from the GUI.<br />
Exactly two loops must be specified.<br />
2 or long_loop_2: Used for Ultra-extended Sampling from the GUI. Does<br />
not recognize MIN_OVERLAP and MAX_CA_MOVEMENT keywords, as these<br />
are used internally by the protocol. Also does not recognize THREADS keyword,<br />
as the sampling is determined by the loop length. Only one loop can<br />
be specified.<br />
STRUCT_FILE Required. The input structure file in Maestro format. Same as for<br />
refinestruct.<br />
SEG_LIST filename Name of a file containing the list of loops to be refined. For the example given<br />
for the keyword LOOP_i_RES_j, the file would consist of the single line:<br />
loop A:10 A:20<br />
Not available with refinestruct.<br />
THREADS n Number of simultaneous jobs to be run during selected refinement stages.<br />
Determines the sampling level for some protocols. Note: the jobs specified by<br />
THREADS do not have to actually run in parallel—see MAX_JOBS. Not available<br />
with refinestruct.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 95
96<br />
Chapter 11: Command Syntax<br />
Files<br />
In addition to the jobname.log and jobname.err files, multirefine produces the following<br />
output files:<br />
jobname-out.mae Final structures, all in a single file.<br />
jobname.pdb PDB format file corresponding to lowest-energy structure.<br />
Other structures will have corresponding files jobname-<br />
2.pdb, and so on.<br />
jobname.restart Contains information that allows a job to be restarted if it<br />
fails.<br />
jobname-run-N.archive.zip Archives of subjob output files.<br />
Temporary files, which are normally deleted when no longer needed unless the debugging<br />
option is used, are named according to the following patterns:<br />
jobname-conf-N.* Files associated with temporary structures. All active structures in the<br />
ensemble have a unique serial number N.<br />
jobname-run-N.* Files associated with a given subjob. All subjobs have a unique serial<br />
number N.<br />
Return Value and Errors<br />
multirefine returns a value of 0 on successful completion, 1 otherwise. All scripts run by<br />
multirefine that return a value of 1 also print a message to the file jobname.err. This file is<br />
empty otherwise, as all non-fatal messages are printed to the file jobname.log. Output from<br />
Job Control goes to standard output, and any errors arising before the driver script has started<br />
likewise go to standard output or standard error.<br />
Restarting an Incomplete Job<br />
If a job fails, you can restart it by submitting it again from the same directory. The<br />
jobname.restart file and the archives of subjobs must be present in the directory, and the<br />
directory must be accessible from the host on which the driver job runs. Also, the job records<br />
for the failed jobs must not be purged from the job database, so that the restart mechanism has<br />
access to information on the failed subjobs. If the job records are missing, the subjobs are<br />
repeated.<br />
Failed jobs that were started from Maestro must be restarted from the command line. You can<br />
use the -PROJ and -D<strong>ISP</strong> options to assign the job to the Maestro project and set the incorporation<br />
mode.<br />
Once the job finishes successfully, the subjob archives and the restart file are removed.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 11: Command Syntax<br />
It is not necessary to use -LOCAL or -SAVE to ensure that a job can be restarted, but there must<br />
be sufficient space in the working directory for the subjob archives.<br />
Examples<br />
The following is an example of an input file produced by Maestro for a Serial loop sampling<br />
(Extended Medium) job:<br />
STRUCT_FILE prime_refine.mae<br />
PRIME_TYPE 1<br />
LOOP_0_RES_0 _:35<br />
LOOP_0_RES_1 _:40<br />
THREADS 4<br />
MAX_JOBS 2<br />
NUM_OUTPUT_STRUCT 4<br />
RES_SPHERE 7.50<br />
USE_CRYSTAL_SYMMETRY false<br />
USE_RANDOM_SEED true<br />
SEED 0<br />
HOST localhost<br />
Setting PRIME_TYPE to 1 selects the extended sampling protocol; setting THREADS to 4 corresponds<br />
to the Medium sampling level. Setting MAX_JOBS to 2 requires that no more than 2<br />
subjobs can run at once (e.g. if using a 2-processor machine.)<br />
Environment Variables<br />
In addition to the standard Schrödinger environment variables, multirefine recognizes the<br />
following environment variables:<br />
PSP_RB_DEBUG<br />
PSP_DEBUG<br />
PSP_RB_LOCAL<br />
PSP_LOCAL<br />
TFM_QUIET<br />
TFM_VERBOSE<br />
If this environment variable is set, debugging will be turned on and the<br />
subdirectory containing intermediate files will be kept. Equivalent to using<br />
the command-line option -DEBUG.<br />
If this is set, all calculations will run in the local launch directory. Equivalent<br />
to using the command-line option -LOCAL.<br />
These environment variables can be set to decrease or increase the jobrelated<br />
output in the log file.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 97
98<br />
Chapter 11: Command Syntax<br />
11.5 refinestruct—Refine a Protein Structure<br />
The refinestruct script runs protein structure refinement jobs (side chain prediction and<br />
energy minimization as well as loop and helix prediction) using the Protein Local Optimization<br />
Program (PLOP).<br />
refinestruct handles all the tasks associated with the refinement of protein structures via<br />
PLOP: parsing and preparation of input files, interaction with PLOP, and handling of output<br />
structures, etc. It also fixes the structural issues that are fixed by primefix.py (see<br />
Section 12.3 on page 118). All jobs handled by refinestruct are run under Schrödinger Job<br />
Control.<br />
Multithreaded execution (with Open MP) can be enabled for bldstruct jobs by setting the<br />
environment variable OMP_NUM_THREADS. It is recommended to set this environment variable<br />
in the schrodinger.hosts file for hosts on which multithreading is supported—see<br />
Section 6.1 of the Installation Guide.<br />
Syntax<br />
$SCHRODINGER/refinestruct [options] jobname<br />
The input is read from the file jobname.inp and the structural output is written to the file<br />
jobname-out.mae. If the input structure file has alternate coordinates, the refinement is<br />
performed on the first set of alternate coordinates.<br />
Options<br />
The refinestruct script accepts the options listed in Table 11.1.<br />
Command Input File<br />
The command input file contains a list of keywords that control the operation of the<br />
refinestruct script, one per line. The keywords are listed in the tables below. Generally<br />
valid keywords are listed in Table 11.6 and protocol-specific keywords are listed in Table 11.7,<br />
under the protocol to which they apply.<br />
Table 11.6. General keywords for the refinestruct script<br />
Keyword syntax Description<br />
ECUTOFF value Energy cutoff for return of conformations. Returns all conformations<br />
within value kcal/mol of the lowest-energy structure. It can be used in<br />
conjunction with NUM_OUTPUT_STRUCT, to impose a maximum on the<br />
number of such conformations to return. As with<br />
NUM_OUTPUT_STRUCT, it only applies to single loop refinements.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Table 11.6. General keywords for the refinestruct script (Continued)<br />
Keyword syntax Description<br />
Chapter 11: Command Syntax<br />
EXT_DIEL value Dielectric constant of the continuum solvation model. Default: 80.0.<br />
Used with SGB_MOD sgbnp.<br />
INT_DIEL value Dielectric constant used within the radius of an atom. Default: 1.0.<br />
NUM_OUTPUT_STRUCT n Number of conformations to return for single loop refinements. This has<br />
no effect on helix refinements or jobs composed of more than one refinement<br />
(e.g. 2 loops, 1 helix + 1 loop, etc.). All structures are returned in a<br />
single structure file in Maestro format. Default: 1.<br />
PAIR_CONSTRAINT_i<br />
atom1,atom2,value<br />
Specify a constraint on a pair of atoms. The format of the atom specifications<br />
is given at the beginning of this chapter. The value is the distance<br />
between the atoms in angstroms.<br />
PRIME_TYPE type Type of refinement to perform. The available options are:<br />
SIDE_PRED Predict Side Chains (prefix SC)<br />
LOOP_BLD Predict Loops and Helices<br />
REAL_MIN Minimize Energy (prefix MINI)<br />
SITE_OPT Active Site Minimization (including ligand)<br />
ENERGY Energy Calculation<br />
RESIDUE_i spec Specify a residue to refine. The numbering i begins at 0, and increases<br />
incrementally. The format of the residue specification spec is given at<br />
the beginning of this chapter.<br />
RETAIN_CORRECTIONS<br />
{yes | no}<br />
Retain corrections made when duplicate residue numbers are found. The<br />
duplications detected are between non-protein residues or between protein<br />
and non-protein residues. The non-protein residue is assigned a new,<br />
unique chain name temporarily. If you set this keyword to yes, the new<br />
chain name is written to the output file. Default: no.<br />
SEED n The integer to use as the seed for the random number generator if<br />
USE_RANDOM_SEED is false. If it is true, SEED is ignored. Default: -1,<br />
meaning that the program will generate a random seed rather than use<br />
the supplied seed.<br />
SELECT string Specify the manner for specifying which residues to refine. Available<br />
options are:<br />
all Refine all residues<br />
pick Refine the specific residues, indicated by included<br />
RESIDUE_i parameters (see above).<br />
file Read list of residues to refine from a file. The file consists of a<br />
series of lines with a residue specification on each line.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 99
100<br />
Chapter 11: Command Syntax<br />
Table 11.6. General keywords for the refinestruct script (Continued)<br />
Keyword syntax Description<br />
SGB_MOD {sgbnp|<br />
vdgbnp}<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
Specify the implicit solvation model to use. The choices are:<br />
sgbnp—use the standard generalized Born model.<br />
vdgbnp—use the variable-dielectric generalized Born model, which<br />
uses different dielectric constants based on the polar or charged nature<br />
of the protein residue: 4 for Lys, 3 for Glu/Arg, 2 for Asp/His, 1 for all<br />
others. This method improves loop predictions.<br />
Default: vdgbnp.<br />
STRUCT_FILE filename Input structure file in Maestro format.<br />
USE_CRYSTAL_SYMMETRY<br />
{yes | no}<br />
USE_RANDOM_SEED<br />
{yes | no}<br />
USE_MAE_CHARGES<br />
{yes | no}<br />
USE_MEMBRANE<br />
{yes | no}<br />
Set to true if the input structure contains unit cell information and crystal<br />
symmetric atoms are to be included in calculation. Default: no.<br />
Indicates whether to use a random seed for the random number generator.<br />
This keyword has no effect on Minimization tasks. Default: no.<br />
Indicates whether to use atomic partial charges from the Maestro input<br />
file for untemplated residues (ligands, cofactors). Default: no.<br />
Indicates whether to use the implicit membrane model. The model must<br />
be set up from the Setup Membrane panel in Maestro. Default: no.<br />
USE_SGB {yes | no} Use the SGB implicit solvation model. If the model is not used, calculations<br />
are run in vacuum with a dielectric of 6.0. Default: no.<br />
Table 11.7. Protocol-specific keywords for the refinestruct script<br />
Keyword syntax Description<br />
LOOP_BLD<br />
CA_CONSTRAINT_i<br />
spec,[x,y,z]value<br />
Specify a constraint on the position of a specific alpha carbon atom. The<br />
format of the residue specification spec is given at the beginning of this<br />
chapter. The optional coordinates specify the target position; if omitted,<br />
the target position is the initial position. The value is the maximum distance<br />
that the C-alpha atom can move from the target position, in angstroms.<br />
Specifying a target position allows you to move a loop to a<br />
desired location.<br />
LOOP_i_RES_j spec Specify residue j for defining loop i. Loops are numbered sequentially,<br />
beginning at 0. Two occurrences of this keyword are required, with j=0<br />
(the beginning of the loop) and j=1 (the end of the loop). spec is a residue<br />
specification as defined above.
Table 11.7. Protocol-specific keywords for the refinestruct script (Continued)<br />
Keyword syntax Description<br />
HELIX_BUILD<br />
spec1/spec2<br />
Files<br />
Chapter 11: Command Syntax<br />
In addition to the input structure file, defined by the value of STRUCT_FILE, and the job files<br />
(the command input file jobname.inp, and the log file jobname.log), bldstruct produces<br />
the following output file:<br />
Return Value<br />
Build the specified sequence as a helix. The format of the residue specifications<br />
spec1 and spec2 is given at the beginning of this chapter. You<br />
should include enough residues on either side of the helical region in the<br />
loop prediction to ensure that sufficient flexibility is available to build<br />
the loop.<br />
MAX_CA_MOVEMENT value Specify the maximum distance that any alpha carbon atom in the loop<br />
should be allowed to move during refinement. Omitting this parameter<br />
indicates that no restriction should be applied.<br />
RES_SPHERE value All side chains that lie within this distance of the loop will also be optimized<br />
during refinement. This allows these nearby side chains to “react”<br />
to the loop being predicted. Default 0.0.<br />
MIN_OVERLAP value Fraction of van der Waals distance used to define a clash. Smaller values<br />
allow structures with worse potential atomic overlaps to be generated.<br />
Default: 0.70<br />
SITE_OPT<br />
LIGAND string Residue specification for the ligand.<br />
NPASSES n Number of passes through side chain optimization of the protein residues<br />
followed by minimization of the protein residues (including the<br />
backbone). These two steps are repeated the specified number of times<br />
before proceeding to optimization of the protein and the ligand.<br />
Default: 2. Files written from Maestro have NPASSES set to 1.<br />
jobname-out.mae Final structures, all in a single file.<br />
refinestruct returns a value of 0 on successful completion, 17 for license errors, and 1<br />
otherwise. All error messages are printed to the file jobname.log. There is no comprehensive<br />
listing of possible errors as these can arise in a variety of programs and scripts.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 101
102<br />
Chapter 11: Command Syntax<br />
Environment Variables<br />
In addition to those used for all Schrödinger software, refinestruct uses the environment<br />
variable PSP_OPLS_VERSION to define the OPLS force field version. The default value is<br />
2005. It can be set to 2001 to use the earlier set of force field parameters for ligands and nonstandard<br />
residues. The OMP_NUM_THREADS environment variable is also supported for<br />
OpenMP multithreaded execution (see above).<br />
Examples<br />
The following is a sample minimization input file:<br />
STRUCT_FILE input-structure.mae<br />
PRIME_TYPE REAL_MIN<br />
SELECT pick<br />
RESIDUE_0 A:1<br />
RESIDUE_1 A:20<br />
RESIDUE_2 A:23<br />
RESIDUE_3 A:26<br />
RESIDUE_4 A:27<br />
RESIDUE_5 A:30<br />
RESIDUE_6 A:31<br />
RESIDUE_7 A:36<br />
RESIDUE_8 A:40<br />
RESIDUE_9 A:42<br />
RESIDUE_10 A:53<br />
USE_RANDOM_SEED false<br />
SEED 0<br />
The arrangement in columns is for ease of reading only—the spacing is not significant.<br />
11.6 ska—Protein Structure Alignment<br />
The SKA program carries out structural alignment of proteins by aligning protein backbones at<br />
different levels of detail.<br />
The ska script provides an interface to the SKA structural alignment program. As with other<br />
computational programs in <strong>Prime</strong> its operation is controlled through simple input files. The<br />
allowed keywords and their values are discussed in more detail below. The ska script allows<br />
you to run the SKA program in three different modes under Job Control:<br />
• align mode—SKA aligns two or more protein structures to one another.<br />
• dbcreate mode—SKA creates databases for use with scan mode. You should split PDB<br />
files with multiple chains into single-chain files before adding them to the database, as<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
SKA has difficulty aligning to multiple chains.<br />
Chapter 11: Command Syntax<br />
• scan mode—SKA searches a database of protein structures and identifies potential structural<br />
analogues. By default a non-redundant database of protein chains from the PDB is<br />
searched.<br />
Syntax<br />
$SCHRODINGER/ska [options] jobname<br />
By default the input is read from the file jobname.inp and the output is written to<br />
jobname.out.<br />
ska does not generate output other than the explicitly defined database and a log file in<br />
dbcreate mode—see below for details.<br />
Options<br />
The only option accepted by ska is -DEBUG, which turns on debugging mode for both Job<br />
Control and the SKA program.<br />
Command Input File<br />
The input file consists of lines containing keyword-value pairs, one per line. The keywords are<br />
described in Table 11.8. The set of allowed and required keywords depends on the mode,<br />
which is determined by the MODE keyword.<br />
Return Value and Errors<br />
The script returns 0 for success and 1 for failure.<br />
In case of failure, check the log file for details on what might be have caused the program to<br />
fail. To turn on debugging output use the -DEBUG command line option as follows:<br />
$SCHRODINGER/ska -DEBUG jobname<br />
Environment Variables<br />
The ska program handles all required environment variables internally and requires only that<br />
the environment variable SCHRODINGER be set. Should the environment variable TROLLTOP<br />
exist in the current environment it will be overwritten.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 103
104<br />
Chapter 11: Command Syntax<br />
Table 11.8. Keywords syntax for ska input file<br />
Keyword syntax Mode Description<br />
MODE mode all The SKA operating mode; allowed values are:<br />
*align*: structurally align two or more protein structures<br />
to on another. This is the default mode<br />
*scan*: given a query structure scan a database of protein<br />
structures and identify potential structural homologues<br />
*dbcreate*: create a database that can be search in<br />
scan mode from a list of PDB files<br />
QUERY_FILE filename<br />
[name]<br />
TEMPLATE filename<br />
[name]<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong><br />
*align*<br />
*scan*<br />
Required. The input query structure. In *align* mode,<br />
the structure against which other proteins are aligned. In<br />
*scan* mode, the structure used for scanning the database.<br />
The optional name (*align* mode only) sets the<br />
name of this structure to a value other than the filename<br />
(the default). The name is displayed in the output file<br />
(see Examples, below). name cannot contain the file separator<br />
character (‘/’ on UNIX).<br />
*align* Required. PDB structures that are to be aligned to the<br />
query structure, set by QUERY_FILE. The TEMPLATE<br />
keyword can appear multiple times in order to be able to<br />
carry out multiple structure alignments (aligning more<br />
than one template structure to the query structure). The<br />
optional name sets the name of this entry to a value other<br />
than the filename (the default). The name is displayed in<br />
the output file (see Examples, below). name cannot contain<br />
the file separator character (‘/’ on UNIX).<br />
DATABASE filename *scan* The absolute path to the database to scan with the query<br />
structure. The database can be generated in dbcreate<br />
mode. Default: database supplied with <strong>Prime</strong>.<br />
DBNAME filename *dbcreate* The name of the file in which to store the database create<br />
by ska. filename must be the file name without any path<br />
specification.<br />
FILELIST filename *dbcreate* A file containing a list of absolute filenames for structures<br />
(in PDB format) that are to be included in a database<br />
of templates that can be search with ska in scan<br />
mode. Absolute filenames are required to avoid having<br />
to copy large numbers of PDB files to the temporary<br />
directory in which the job is run.<br />
GAP_OPEN value *align*<br />
*scan*<br />
Gap initiation penalty.<br />
GAP_DEL value *align* Deletion penalty.
Table 11.8. Keywords syntax for ska input file (Continued)<br />
Keyword syntax Mode Description<br />
OUTPUT_FILE filename *align*<br />
*scan*<br />
ALISTRUCS_OUTFILE<br />
filename<br />
PSD_THRESHOLD value *align*<br />
*scan*<br />
SSE_MINSIM value *align*<br />
*scan*<br />
SSE_MINLEN value *align*<br />
*scan*<br />
SSE_WINLEN value *align*<br />
*scan*<br />
SWITCH [true | false] *align*<br />
*scan*<br />
Examples<br />
Chapter 11: Command Syntax<br />
The following minimal input file (assumed to be called align.inp) structurally aligns two<br />
protein structures to one another:<br />
QUERY_FILE query.pdb<br />
TEMPLATE template.pdb<br />
File in which to store the structural alignment. Default is<br />
jobname.out.<br />
*align* File in which to store the 3D coordinates of the aligned<br />
structures.<br />
PSD threshold for inclusion in the MSA. Overridden by<br />
RECKLESS.<br />
Minimum Secondary Structural Element (SSE) similarity<br />
for the alignment.<br />
Minimum SSE length for the alignment.<br />
Align only SSE windows of length value.<br />
Carry out global alignment first and then align subwindows<br />
if no similarity was found.<br />
ORDER [seed | psd] *align* psd: order the alignment by PSD to the seed structure.<br />
CLEAN *align* Exclude structures with a PSD < PSD_THRESHOLD from<br />
the alignment.<br />
RECKLESS [true |<br />
false]<br />
*align*<br />
*scan*<br />
Force the alignment of structures even if no similarity is<br />
found.<br />
where query.pdb is the name of the PDB file containing the query structure (the structure to<br />
which we want to align) and template.pdb contains the template structure (the structure we<br />
want to align to the query).<br />
When run using $SCHRODINGER/ska align it should produce the following output:<br />
• align.out—a file containing the structural alignment<br />
• align.log—a log file that contains log information and diagnostics on the run<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 105
106<br />
Chapter 11: Command Syntax<br />
Note that the actual 3D coordinates of the superposed structures are not written out by default<br />
You must use ALISTRUCS_OUTFILE to write out the structures.<br />
The following input file (align.inp):<br />
QUERY_FILE query.pdb<br />
TEMPLATE template.pdb<br />
ALISTRUCS_OUTFILE superposition.pdb<br />
saves the 3D coordinates of the aligned structures in the file superposition.pdb.<br />
The following is a sample input file for database creation in dbcreate mode:<br />
MODE dbcreate<br />
FILELIST filelist.dat<br />
DBNAME mydatabase.dat<br />
The file filelist.dat contains the file names for the templates, including the path, as<br />
follows:<br />
/scr/templates/template-1.pdb<br />
/scr/templates/template-2.pdb<br />
/scr/templates/template-3.pdb<br />
/scr/templates/template-4.pdb<br />
A minimal input file for a scan is as follows:<br />
MODE scan<br />
QUERY_FILE query.pdb<br />
Running ska with this input file would scan the database supplied with <strong>Prime</strong> for structural<br />
analogs. The output from the structural search contains pairwise alignments between the query<br />
structure and every structure in the database (assuming the two structures are sufficiently<br />
similar). The results can be analyzed in a more convenient way with the graphical SKA Results<br />
Viewer utility, $SCHRODINGER/utilities/SkaResultsViewer.<br />
The use of the optional name argument is illustrated next. For the following input file:<br />
QUERY_FILE query.pdb 1ctf<br />
TEMPLATE 1dd3_A.pdb<br />
the output file is as follows:<br />
.........+.........+.........+.........+.........+.........+.....<br />
1ctf 1 ceeeeeecc<br />
1dd3_A 1 cchhhhhhhhhhhchhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhccceeeeeecc<br />
1ctf 1 ---------------------------------------------------------EFDVILKAA<br />
1dd3_A 1 MTIDEIIEAIEKLTVSELAELVKKLEDKFGVTAAAPVAVAAAPVAGAAAGAAQEEKTEFDVVLKSF<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 11: Command Syntax<br />
..........+.........+.........+.........+.........+.........+.<br />
1ctf 62 ccchhhhhhhhhhhhccchhhhhhhhhhc c eeeeeccchhhhhhhhhhhhhhcceeeee<br />
1dd3_A 67 ccchhhhhhhhhhhhcchhhhhhhhhhhcccccccceeccchhhhhhhhhhhhhhcceeeee<br />
1ctf 62 GANKVAVIKAVRGATGLGLKEAKDLVESA-P--AALKEGVSKDDAEALKKALEEAGAEVEVK<br />
1dd3_A 67 GQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAGAEVELK<br />
RMSD: 0.763733<br />
PSD: 0.024524<br />
Without the name argument for the QUERY_FILE keyword, the output would be<br />
.........+.........+.........+.........+.........+.........+.....<br />
query 1 ceeeeeecc<br />
1dd3_A 1 cchhhhhhhhhhhchhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhccceeeeeecc<br />
query 1 ---------------------------------------------------------EFDVILKAA<br />
1dd3_A 1 MTIDEIIEAIEKLTVSELAELVKKLEDKFGVTAAAPVAVAAAPVAGAAAGAAQEEKTEFDVVLKSF<br />
..........+.........+.........+.........+.........+.........+.<br />
query 62 ccchhhhhhhhhhhhccchhhhhhhhhhc c eeeeeccchhhhhhhhhhhhhhcceeeee<br />
1dd3_A 67 ccchhhhhhhhhhhhcchhhhhhhhhhhcccccccceeccchhhhhhhhhhhhhhcceeeee<br />
query 62 GANKVAVIKAVRGATGLGLKEAKDLVESA-P--AALKEGVSKDDAEALKKALEEAGAEVEVK<br />
1dd3_A 67 GQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAGAEVELK<br />
RMSD: 0.763733<br />
PSD: 0.024524<br />
11.7 ssp—Secondary Structure Prediction<br />
The ssp script provides an interface to the secondary structure prediction programs SSPro,<br />
which is bundled with <strong>Prime</strong>, as well as PSIPRED, which is available from a third party.<br />
Syntax<br />
$SCHRODINGER/ssp [options] jobname<br />
By default the input is read from the file jobname.inp and the output is written to<br />
jobname.out.<br />
Options<br />
The standard Job Control command line options listed in Table 11.1 are accepted. There are no<br />
options specific to this program.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 107
108<br />
Chapter 11: Command Syntax<br />
Command Input File<br />
The command input file consists of lines containing keyword-value pairs, one per line. The<br />
keywords are described in Table 11.9.<br />
Table 11.9. Keywords for the ssp input file.<br />
Keyword syntax Description<br />
QUERY_FILE filename Required. Name of the FASTA file containing the query sequence<br />
METHOD program Required. Name of the secondary structure prediction that is to be run.<br />
Allowed values are sspro, psipred, profking, and all. all means<br />
run all available programs. If the input file contains multiple METHOD keywords,<br />
only the last one is used.<br />
OUTPUT_FILE filename Optional. Name of the output file. Default is jobname.out.<br />
Return Value and Errors<br />
ssp returns 0 if successful and a non-zero value in all other cases.<br />
Environment Variables<br />
The environment variables that can be used to control the secondary structure prediction<br />
programs are listed in Table 11.10.<br />
Table 11.10. Environment variables for secondary structure prediction programs.<br />
Environment Variable Description<br />
PSP_SSPRO_DB BLAST database to be searched when assembling the multiple sequence<br />
alignment used in the SSPro prediction<br />
PSP_PSIPRED_DB BLAST database to be searched when assembling the sequence profile<br />
used in the PSIPRED prediction<br />
Examples<br />
The following minimal input file, called sspro.inp, generates an SSPro secondary structure<br />
prediction for the sequence stored in file query.fasta:<br />
QUERY_FILE query.fasta<br />
METHOD sspro<br />
The prediction produced by the back-end is written to the file sspro.out.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 11: Command Syntax<br />
The following input file, assumed to be called all-methods.inp, instructs the ssp script to<br />
produce secondary structure prediction for all methods available—SSPro and, depending on<br />
the local installation, PSIPRED.<br />
QUERY_FILE query.fasta<br />
METHOD all<br />
11.8 sta—Single Template Alignment<br />
The Single Template Alignment (STA) program in PRIME is designed for protein sequences<br />
with medium to high sequence identity (>25%). STA differs from standard sequence alignment<br />
such as BLAST by taking into account secondary structure matching as well as profilesequence<br />
matching. As a result, STA often generates better alignment than BLAST in the<br />
regions where sequence conservation is relatively weak. STA can also take in constraints, such<br />
as user-selected residue pairs to be aligned together and/or residues in either query or template<br />
that are not to be aligned (gaps).<br />
STA can use templates directly from the PDB, as selected by running the Find Homologs step,<br />
and can also use templates from the <strong>Prime</strong> fold library, as selected in the Fold Recognition<br />
step. For the latter, you can export the template from the Fold Recognition table, and save it as<br />
a PDB file.<br />
Syntax<br />
$SCHRODINGER/sta [options] jobname<br />
The input is read from the file jobname.inp and the output is written to jobname.out.<br />
Options<br />
The sta script accepts the -DEBUG and -HOST options, as described in Table 11.1.<br />
Command Input File<br />
The input file consists of lines containing keyword-value pairs, one per line. The keywords are<br />
described in Table 11.11.<br />
Files<br />
• Input file—named jobname.inp. Contains keywords for running the program.<br />
• Query sequence file—File containing the complete sequence of the query, in FASTA format.<br />
Specified in the input file.<br />
• Secondary structure prediction files in CASP format for the query sequence. See page 93<br />
for an example of CASP format.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 109
110<br />
Chapter 11: Command Syntax<br />
Table 11.11. Keywords for the sta program.<br />
Keyword syntax Description<br />
QUERY_NAME name Name of the query sequence.<br />
QUERY_FILE filename File name of the query sequence in FASTA format.<br />
FORMAT string Alignment file format. Default is Maestro. Set to dev for plain text format—see<br />
page 111 for an example.<br />
BGNRES n The first residue of the query that is used in alignment. Default: 1.<br />
ENDRES n The last residue of the query sequence that is used. Default: the last residue<br />
of the original query sequence.<br />
PREDSS[_i] filename Secondary structure prediction file of the query sequence, in CASP format.<br />
One occurrence of this keyword is required for each file, and there<br />
must be at least one SSP file specified. If only one prediction is used, the<br />
keyword is PREDSS; if more predictions are used, the keyword is suffixed<br />
with an index i, starting from zero: PREDSS_0, PREDSS_1, and so<br />
on. For an example of CASP format, see page 93.<br />
TEMPLATE_PDB filename Filename of the template structure, in PDB format.<br />
MODE string Optional. Needed in the input file only when there are user-specified<br />
constraints. In that case, the value is set to user.<br />
PAIR pair-list Optional. <strong>User</strong>-specified residue pairing, used only when MODE is set to<br />
user. All the residue pairs can be added after the keyword in one or<br />
multiple lines. For example, the following lines pair residue 12, 20, 60<br />
from the query sequence with residue 15, 24, 65 from the template.<br />
MODE user<br />
PAIR 12 15 20 24<br />
PAIR 60 65<br />
GAP gap-list Optional. <strong>User</strong>-specified gaps, used only when MODE is set to user. Can<br />
be used with or without PAIR. Gaps are specified by the residues in<br />
either query (P) or template (T) that are aligned with them. For example,<br />
the following opens gaps against query residue 10, 11, template residue<br />
34 and 35.<br />
MODE user<br />
GAP P10 P11 T34 T35<br />
• Template PDB file—Template structure file, in PDB format. The sequence used for the<br />
template is taken from SEQRES lines in the PDB file. If SEQRES is not found in the PDB<br />
file, the template sequence is then taken from ATOM records.<br />
• Output alignment file—File named jobname.out containing pairwise alignment between<br />
the query and the template. By default, the alignment is in Maestro format. To generate<br />
plain text format, include FORMAT dev in the command input file. In plain text format,<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 11: Command Syntax<br />
aligned residues from the query and the template are placed on top of each other—see<br />
below for an example.<br />
• Log file—File named jobname.logcontaining progress of the sta program, including<br />
warnings and error messages.<br />
Return Value and Errors<br />
sta returns 0 for success and non-zero for failure. In case of failure, check the log file for<br />
details, and search for ERROR. To further analyze the failure, use the -DEBUG option when you<br />
run sta.<br />
Examples<br />
Below is an example input file, where query_seq.fasta stores the query sequence in<br />
FASTA format.<br />
QUERY_NAME query<br />
QUERY_FILE query_seq.fasta<br />
PREDSS_0 query-ssp1.casp<br />
PREDSS_1 query-ssp2.casp<br />
PREDSS_3 query-ssp3.casp<br />
BGNRES 1<br />
ENDRES 149<br />
TEMPLATE_PDB pdb1vpe.ent<br />
Below is an example of the plain text format for output alignment files. This format is generated<br />
when you include ALIFORMAT dev in the command input file.<br />
probe length=70======tmplt_T0281_d1dq3a2 length=79 TotalScore= 42 SEQ= -4 SSTR= 102<br />
SEQID= 0.061538 nali= 65<br />
probe bgnRes=1 endRes=70 template bgnRes=3 endRes=75<br />
ProbeAA: ..MWMPPRPEEVARKLRRLGFVERMAKGGHRLYTHPDGRIVVVPFH...SGELP....KG<br />
ProbeSS: LLLLLLHHHHHHHHHHLLLEEEEELLLEEEEELLLLLEEEeLLL LLLLL HH<br />
Fold AA: GNFGLPLNFNAFKEWASEYGVEFKTNGSQTIAIIND...ERISLGQWHTRNRVSKAVLVK<br />
Fold SS: LLLEELLLHHHHHHHHHLLLLEEEEELLEEEEEELL EEEELLLHHHHLLEEHHHHHH<br />
ProbeAA: TFKRILRDAGLTEEEFHNL....<br />
ProbeSS: HHHHHHHHhLLLHHHHHLL<br />
Fold AA: MLRKLYEATK.DEEVKRMLHLIE<br />
Fold SS: HHHHHHHHHL LHHHHHHHHHHL<br />
The file contains header information at the top, then in the alignment section, the query<br />
sequence (ProbeAA), query composite secondary structure (ProbeSS), template sequence<br />
(Fold AA) and template secondary structure (Fold SS) are given. Gaps are denoted by<br />
periods in the sequences and by spaces in the secondary structures.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 111
112<br />
Chapter 11: Command Syntax<br />
11.9 tfm—Tertiary Folding Module<br />
The tfm program generates one or more complete backbone conformations starting from a<br />
primary sequence and secondary structure, along with optional domain boundaries and sheet<br />
topologies. In addition, tfm can accept several different types of constraint including multiple<br />
coordinate templates, multiple types of distance constraints, and dihedral angle constraints. A<br />
large number of run-time parameters allow the code to be used in a variety of ways, from minimizing<br />
a homology model to full ab initio folding.<br />
Syntax<br />
The tfm program has hundreds of user-specified options contained in over a dozen input files.<br />
A detailed description of all the options is beyond the scope of this document. The refine<br />
command can be used as a wrapper to prepare default input files. If suitable input has been<br />
prepared, there is also a top-level wrapper for tfm that can be used to run the program directly<br />
under Schrodinger Job Control. It can be invoked from the command line via:<br />
$SCHRODINGER/tfm input-file [options]<br />
where input-file is typically the job name, with or without a .inp extension. The input file<br />
must be either the first or last command-line argument, and everything else on the command<br />
line is passed as is to Job Control.<br />
Options<br />
All options are specified in the input file, and there are no command-line options, other than<br />
those recognized by Job Control.<br />
Command Input File<br />
There are only three keywords recognized in the command input file, listed in Table 11.12.<br />
Table 11.12. Keywords for tfm program.<br />
Keyword Description<br />
INPUT_FILE filename The input file passed to the executable.<br />
OUTPUT_FILE filename The output produced by the executable.<br />
STRUCT_NAME name Optional. Prefix for files in the local directory that will be automatically<br />
treated as input, such as template or constraint files.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Files<br />
Chapter 11: Command Syntax<br />
In addition to the descriptive output in the file specified by OUTPUT_FILE and the error<br />
messages in name.err, tfm produces the following files, where the first part of the file name<br />
is tfm by default but can be changed by other input options, and N is a serial number.<br />
tfm.N.pdb Output structures, if applicable.<br />
tfm.N.spair Possible strand pair assignments for unpaired strands, if automatic strandpairing<br />
is being used.<br />
Return Value and Errors<br />
tfm returns a value of 0 on successful completion, 1 otherwise. All error messages are printed<br />
to the file name.err, where name is either the STRUCT_NAME value or the SEQNAME parameter<br />
in the file specified by INPUT_FILE. This file is empty otherwise, and all output from the<br />
program itself is written to the file specified by OUTPUT_FILE. Output from Job Control goes<br />
to stdout, and any errors arising before the driver script has started will likewise go to stdout or<br />
stderr.<br />
Examples<br />
Sample input for tfm can be generated by running refine with the -LOCAL and -DEBUG<br />
options. The command<br />
refine sample<br />
creates a directory run_sample with the input file sample.1.in (among other files determined<br />
by the refinement type). An input file sample_tfm.inp containing the following lines can be<br />
used to run the job from the command line<br />
INPUT_FILE sample.1.in<br />
OUTPUT_FILE sample.1.out<br />
with the command<br />
tfm sample_tfm<br />
Environment Variables<br />
In addition to the standard environment variables tfm recognizes the following environment<br />
variables:<br />
TFM_QUIET<br />
TFM_VERBOSE<br />
These environment variables can be set to decrease or increase the jobrelated<br />
output in the log file. The value to which they are set is irrelevant.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 113
114<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Chapter 12: Command-Line Utilities<br />
12.1 Multiple Template Alignment: structalign<br />
Chapter 12<br />
If multiple templates are selected in the Find Homologs step, they must be rotated into the<br />
same coordinate reference frame before they can be used. The Align Structures button is the<br />
usual means of performing structural alignment of templates, but the structalign utility can<br />
be run from the command line.<br />
The input files must be in PDB format and the templates in them should consist of single<br />
chains. To use the output files, each must be imported into Find Homologs and selected as one<br />
of the templates, replacing the original, unaligned version.<br />
Syntax:<br />
$SCHRODINGER/utilities/structalign [options] input-files<br />
The input files must be in PDB format. The output file name is generated by adding the prefix<br />
rot- to the input file name.<br />
Options:<br />
-h Print usage message.<br />
-force Force alignment even if structures are not sufficiently similar.<br />
To use structalign from the command line:<br />
1. Obtain the PDB structure file of each template. This can be done easily by clicking on the<br />
sequence name of each template of interest to display it in the Workspace.<br />
2. For each structure, choose Export from the Project menu of the Maestro main window to<br />
save the coordinates of the structure to a file. Select PDB format.<br />
You should now have the following files:<br />
template1.pdb template2.pdb template3.pdb<br />
3. At the command line, enter:<br />
$SCHRODINGER/utilities/structalign template1.pdb template2.pdb<br />
template3.pdb<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 115
116<br />
Chapter 12: Command-Line Utilities<br />
This command runs the structalign utility and returns one aligned file for each<br />
template:<br />
rot-template1.pdb<br />
rot-template2.pdb<br />
rot-template3.pdb<br />
Note: If you encounter a problem involving templates with multiple chains, use the<br />
getpdb utility (see Section D.2.3 of the Maestro <strong>User</strong> <strong>Manual</strong>) to extract each<br />
chain into a separate .pdb file, then run structalign again.<br />
4. Import each rot-template.pdb file using the Import button.<br />
Once the files are imported and the aligned templates appear in the Homologs table, select<br />
them all using shift-click or control-click. (Do not select the original structures.) Click Next to<br />
continue to the Edit Alignment step.<br />
12.2 Format Conversion: seqconvert<br />
This utility converts sequences between any permitted input file format and output file format.<br />
This may be useful if autoconversion does not proceed as expected.<br />
Syntax:<br />
$SCHRODINGER/utilities/seqconvert [options]<br />
input-format input_filename output-format output_filename<br />
One use of the seqconvert utility is to convert SSPs produced in <strong>Prime</strong> to FASTA format with<br />
the following command:<br />
$SCHRODINGER/utilities/seqconvert -inative infile.seq<br />
-ofasta outfile.fasta<br />
Options are described in Table 12.1.<br />
Table 12.1. Keywords for the seqconvert utility<br />
Option Description<br />
-start Read from this sequence in the input file<br />
-end Read to (and including) this sequence in the input file<br />
-h Display usage message<br />
-total Read at most this number of sequences<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Table 12.1. Keywords for the seqconvert utility (Continued)<br />
Option Description<br />
-ali Read an alignment in the input file<br />
Chapter 12: Command-Line Utilities<br />
-pdb_use_atoms Use ATOM (not SEQRES) records with PDB files<br />
Input File Formats:<br />
-iany Autodetect input file format<br />
-inative Input file in native (Schrödinger) format<br />
-ifasta Input file in FASTA format<br />
-ipdb Input file in PDB format<br />
-inbrf_pir Input file in NBRF-PIR format<br />
-itext_plain Input file in TEXT-PLAIN format<br />
-igcg Input file in GCG format<br />
-iswissprot Input file in SWISS-PROT format<br />
-igenbank Input file in GENBANK format<br />
-incbi Input file in NCBI format<br />
-iphylip Input file in PHYLIP format<br />
-inexus_paup Input file in NEXUS-PAUP format<br />
-iselex Input file in SELEX format<br />
-istaden Input file in STADEN format<br />
-ipfam_stockholm Input file in PFAM-STOCKHOLM format<br />
-icasp Input file in CASP format<br />
Output File Formats:<br />
-onative Output file in native (Schrödinger) format<br />
-ofasta Output file in FASTA format<br />
-onbrf_pir Output file in NBRF-PIR format<br />
-otext_plain Output file in TEXT-PLAIN format<br />
-ogcg Output file in GCG format<br />
-oswissprot Output file in SWISS-PROT format<br />
-ocasp Output file in CASP format<br />
-ogenbank Output file in GENBANK format<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 117
118<br />
Chapter 12: Command-Line Utilities<br />
12.3 Structure Preparation: primefix.py<br />
Python script for preparing structures for use with <strong>Prime</strong>, which requires certain conventions<br />
for residue naming. You should not normally need to use this script, because its functions are<br />
performed when you run a <strong>Prime</strong> refinement job. The script performs the following actions:<br />
• Assigns a unique, single residue name, residue number, and chain name to all ligand residues,<br />
and ensures that new name doesn’t conflict with amino acid residue names.<br />
• Left-justifies 3-character residue names, and ensures that 1- and 2-character residue<br />
names are treated appropriately.<br />
• Converts PDB names to upper case.<br />
• Deletes all bonds to metals and assigns formal charges to the metal and previously<br />
attached atoms. FeS clusters are fixed with the appropriate bonds and formal charges.<br />
• Sets water molecule residue names to "HOH " and PDB atom names to " O ", "1H "<br />
and "2H ". This action is performed for residues named TIP, TIP3, TIP4, SPC, HOH, and<br />
DOD.<br />
• Renames terminal caps: "NME " to "NMA ", pdbname " C " of NMA to " CA ", and<br />
pdbname " CA " of ACE to " CH3".<br />
• Atoms from the same molecule are placed together in the structure output.<br />
<strong>Prime</strong> can fail if a residue is not in order of connectivity. If this is the case in your structure, it<br />
should be exported from Maestro as a PDB file, then reimported and exported in Maestro<br />
format before running this script.<br />
Syntax:<br />
$SCHRODINGER/run primefix.py infile outfile<br />
infile Input file (Maestro format) containing the structure<br />
outfile Output structure file (Maestro format) containing the prepared structure<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Chapter 12: Command-Line Utilities<br />
12.4 Structure Alignment: align_binding_sites<br />
This script performs a pairwise superposition of multiple structures using the C-alpha atoms of<br />
selected residues. The reference structure for the alignment is the first structure in the input<br />
file. The C-alpha atoms used for the alignment come from residues within the specified cutoff<br />
distance from the ligand in the reference structure. Alternatively, a list of residues may be<br />
given for the alignment, which must correspond to residues in the reference structure. C-alpha<br />
atoms in the structures that are aligned that are greater than the specified distance from any Calpha<br />
in the reference are not considered in the alignment.<br />
$SCHRODINGER/utilities/align_binding_sites [options] input-file<br />
The options are given in Table 12.2. The Job Control options described in Table 2.1 and<br />
Table 2.2 of the Job Control Guide are supported, with the exception of -INTERVAL.<br />
Table 12.2. Options for the align_binding_sites command.<br />
Option Description<br />
-v[ersion] Show the program version and exit.<br />
-h[elp] Show usage message and exit.<br />
-l[igmol] molnum Molecule number of the ligand. Default is to automatically detect the<br />
ligand.<br />
-c[utoff] value Binding site cutoff distance from the ligand in angstroms. Residues<br />
within this distance of the ligand are used in the alignment. Default: 5.<br />
-d[ist] value Atom pairs greater than this distance are not considered in the alignment<br />
(default=5)<br />
-p[realigned] Structures are prealigned. Do not run global protein structure alignment<br />
(SKA).<br />
-r[es] list Residues for the alignment. This option overrides the use of -cutoff.<br />
Format for a residue is chain:resnum. If there is no chain name, use an<br />
underscore, e.g. _:999 . At least three residues are required. The<br />
comma-separated list should not have spaces.<br />
-j[obname] jobname Jobname to use. The default is based on the input file name.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 119
120<br />
Chapter 12: Command-Line Utilities<br />
12.5 Database Update Utilities<br />
<strong>Prime</strong> uses two databases, the PDB database and the BLAST database. As these databases are<br />
continually being updated, the versions distributed with the software will not have the latest<br />
contributions. Utilities have been provided for updating these databases from the web sites.<br />
12.5.1 update_BLASTDB<br />
This utility updates the BLAST database from a copy on the Schrödinger web site that is<br />
synchronized nightly with the BLAST web site. The preformatted database files are downloaded.<br />
Syntax:<br />
$SCHRODINGER/utilities/update_BLASTDB<br />
The BLAST database is installed in the first location found in the following list:<br />
• $PSP_BLASTDB<br />
• $SCHRODINGER_THIRDPARTY/database/blast<br />
• $SCHRODINGER/thirdparty/database/blast<br />
These environment variables are described in Appendix B. You must have permission to write<br />
to the destination in order to run this utility.<br />
12.5.2 rsync_pdb<br />
This utility creates or updates a local mirror of the PDB.<br />
Syntax:<br />
$SCHRODINGER/utilities/rsync_pdb<br />
The PDB is installed in the first location found in the following list:<br />
• $SCHRODINGER_PDB<br />
• $SCHRODINGER_THIRDPARTY/database/pdb<br />
• $SCHRODINGER/thirdparty/database/pdb<br />
These environment variables are described in Appendix B. You must have permission to write<br />
to the destination in order to run this utility.<br />
This utility uses an rsync server on the Schrödinger web site that enables the files to be downloaded.<br />
The files are written in the new remediated format. The Schrödinger site is updated<br />
nightly from the PDB.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Appendix A: Third-Party Programs<br />
A.1 Location of Third-Party Programs<br />
Appendix A<br />
As well as internal and bundled software and data, <strong>Prime</strong> uses certain external third-party<br />
programs and databases for sequence and homolog searches, protein family searches, and<br />
secondary structure prediction. These are described under the headings in this topic.<br />
A.1.1 BLAST/PSI-BLAST, Pfam, and the PDB<br />
See the home pages for these programs and servers for more information:<br />
• BLAST/PSI-BLAST<br />
http://www.ncbi.nlm.nih.gov/blast<br />
The BLAST version distributed with <strong>Prime</strong> is version 2.2.16. To download the BLAST<br />
database, you should use the update_BLASTDB utility. This utility uses rsync to download<br />
or update the database, in the correct format for use with <strong>Prime</strong>.<br />
• HMMER/Pfam<br />
http://pfam.sanger.ac.uk<br />
This site has links to mirror sites that may be more convenient for downloads. You can<br />
download the Pfam database from ftp.schrodinger.com/support/hidden/prime/<br />
Pfam_fs.gz.<br />
• PDB<br />
http://www.rcsb.org/pdb/<br />
Note that the PDB format has changed, as of August 1, 2007. You should use the<br />
rsync_pdb utility to update your local copy of the PDB for use with <strong>Prime</strong>. <strong>Prime</strong> uses<br />
the new format, and automatically converts files in the old format.<br />
A.1.2 Secondary Structure Prediction<br />
Two distinct third-party secondary structure prediction programs can be used to increase the<br />
accuracy of, and get better results from, <strong>Prime</strong>. One of these programs (SSpro) is already<br />
bundled with <strong>Prime</strong>. The other (PSIPRED) is highly recommended, and may be manually<br />
installed by you, at your election, and subject to applicable license terms, if any. At the third-<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 121
122<br />
Appendix A: Third-Party Programs<br />
party web site whose address is given below, you may download the program and then install<br />
it. For instructions on how to install third-party programs, see Section 3.7 of the Installation<br />
Guide (Linux) or Section 4.2 of the Installation Guide (Windows), or the Third Party Programs<br />
page of our web site.<br />
The PSIPRED home page is located at:<br />
http://bioinf.cs.ucl.ac.uk/psipred/<br />
and contains information about the program, referrals to terms of use and/or license terms (in<br />
the README file), and a link to download the program. The link to download PSIPRED is<br />
contained in the section of the page labeled “Overview of prediction methods.”<br />
Note: PSIPRED is not available on Windows.<br />
A.2 Third-Party Program Use Agreement<br />
By using the links above or the third-party programs, you acknowledge and agree to the<br />
following:<br />
PSIPRED is a third party program (“Third Party Program”) and may therefore be subject to<br />
third party license agreements and fees. We (i.e., Schrödinger, LLC and its affiliates) have no<br />
responsibility or liability, directly or indirectly, for Third Party Programs or for any third party<br />
Web sites (“Linked Sites”) or for any damage or loss alleged to be caused by or in connection<br />
with use of or reliance thereon. Any warranties that we make regarding our own products and<br />
services do not apply to the Third Party Programs or Linked Sites, or to the interaction<br />
between, or interoperability of, our products and services and the Third Party Programs. Referrals<br />
and links to Third Party Programs and Linked Sites do not constitute an endorsement of<br />
such Third Party Programs or Linked Sites.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Appendix B: Environment Variables<br />
Appendix B<br />
The only environment variable that must be set before you can run <strong>Prime</strong> is SCHRODINGER.<br />
The following list is presented for your convenience in case you prefer a different location for<br />
any <strong>Prime</strong>-related directories.<br />
General<br />
SCHRODINGER_THIRDPARTY: Directory containing third-party software and databases, which<br />
can be installed with the installers supplied by Schrödinger. The databases are installed in the<br />
database subdirectory; the software in the bin subdirectory. Each third-party product has its<br />
own subdirectory under these subdirectories. The default location if this environment variable<br />
is not set is $SCHRODINGER/thirdparty.<br />
PDB<br />
SCHRODINGER_PDB: Full path to a user-supplied copy of the PDB database. The default for<br />
this directory is $SCHRODINGER/thirdparty/database/pdb if only SCHRODINGER is set;<br />
if SCHRODINGER_THIRDPARTY is set, the default is $SCHRODINGER_THIRDPARTY/<br />
database/pdb. The copy of the database must retain the directory hierarchy used by the PDB<br />
and must include the following subdirectories to be usable with <strong>Prime</strong>:<br />
data/structures/all/pdb<br />
data/structures/divided/pdb<br />
data/structures/obsolete/pdb<br />
On Windows the data/structures/all subdirectory is absent (and is not required).<br />
Blast<br />
PSP_BLAST_DIR: Location of Blast executables. Default is $SCHRODINGER_THIRDPARTY/<br />
bin/platform/blast/bin, with the default for SCHRODINGER_THIRDPARTY noted above.<br />
PSP_BLAST_DATA: Location of Blast matrices. Default is $SCHRODINGER_THIRDPARTY/<br />
bin/platform/blast/data, with the default for SCHRODINGER_THIRDPARTY noted above.<br />
PSP_BLASTDB: Location of Blast databases. The BLAST databases distributed with <strong>Prime</strong> are<br />
formatted with version 2.2.16. If your BLAST database does not conform to this format, you<br />
might experience problems with <strong>Prime</strong>. Default is $SCHRODINGER_THIRDPARTY/<br />
database/blast, with the default for SCHRODINGER_THIRDPARTY noted above.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 123
124<br />
Appendix B: Environment Variables<br />
HMMER/Pfam<br />
PSP_HMMER_DIR: Location of HMMER distribution (the executables are assumed to be in the<br />
binaries subdirectory). Default is $SCHRODINGER_THIRDPARTY/bin/platform/hmmer,<br />
with the default for SCHRODINGER_THIRDPARTY noted above.<br />
PSP_HMMERDB: Location of Pfam database. Default is $SCHRODINGER_THIRDPARTY/<br />
database/pfam, with the default for SCHRODINGER_THIRDPARTY noted above.<br />
PSIPRED<br />
PSP_PSIPRED_DIR: Location of PSIPRED installation (top level directory containing the bin<br />
and data directories). Default is $SCHRODINGER_THIRDPARTY/bin/platform/psipred,<br />
with the default for SCHRODINGER_THIRDPARTY noted above.<br />
PSP_PSIPRED_DB: Sequence database to use for assembling the PSI-BLAST profile. Allowed<br />
values are nr and pdb. Default: pdb.<br />
SSPRO<br />
PSP_SSPRO_DB: Sequence database to use for assembling the SSPRO profile. Allowed values<br />
are nr and pdb. Default: pdb.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Appendix C: File Formats<br />
C.1 Input File Formats<br />
The input file formats are:<br />
• native (Schrödinger) format<br />
• FASTA format<br />
• PDB format<br />
Appendix C<br />
Sequences are read from the PDB SEQRES records, if they exist. Otherwise they are read<br />
from the ATOM records. If SEQRES records do not exist and residues are missing from<br />
the ATOM records, the query sequence will of course contain gaps.<br />
• NBRF-PIR format<br />
• TEXT-PLAIN format<br />
• GCG format<br />
• SWISS-PROT format<br />
• GENBANK format<br />
• NCBI format<br />
• PHYLIP format<br />
• NEXUS-PAUP format<br />
• SELEX format<br />
• STADEN format<br />
• PFAM-STOCKHOLM format<br />
• CASP format<br />
C.2 Output File Formats<br />
The output file formats are:<br />
• native (Schrödinger) format<br />
• FASTA format<br />
• NBRF-PIR format<br />
• TEXT-PLAIN format<br />
• GCG format<br />
• SWISS-PROT format<br />
• CASP format<br />
• GENBANK format<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 125
126<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Appendix D: Error Messages and Warnings<br />
D.1 Input Sequence<br />
Invalid File<br />
This message appears if the system fails to extract a valid sequence from the file.<br />
• Check that the sequence is in one of the supported sequence file formats.<br />
Appendix D<br />
• If FASTA format is used, make sure there are no extra spaces between the “>” and the<br />
sequence name, and that there are no extra spaces at the end of the line.<br />
No Sequences Found<br />
This message appears if the system fails to extract any sequences from the workspace.<br />
• Check the Project Table to ensure the correct entries are included in the Workspace.<br />
• Use Open Project in the Maestro Window to open a project.<br />
Invalid Sequence Characters<br />
This message appears if the sequence contains non-standard one-letter code for amino acids.<br />
• Check that sequence uses the one-letter code and is all uppercase.<br />
• Change/edit any non-standard characters from your sequence using an editor, and then<br />
restart job.<br />
D.2 Find Homologs<br />
D.2.1 Error Messages<br />
Multiple Templates Not Aligned<br />
• If you select multiple templates and proceed to Edit Alignment, you are reminded that the<br />
templates must be structurally aligned to one another.<br />
If one of the following messages appears when you click Next to go to the Edit Alignment step,<br />
you will not be able to proceed along the Comparative Modeling path until the problem is<br />
resolved:<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 127
128<br />
Appendix D: Error Messages and Warnings<br />
No Homologs Found<br />
• You may change your search options and continue searching, or proceed to Fold Recognition.<br />
Invalid File<br />
• Check to ensure the homolog file exists in either of the following locations:<br />
$SCHRODINGER/thirdparty/database/pdb/data/structures/divided/pdb/*<br />
$SCHRODINGER_THIRDPARTY/database/pdb/data/structures/divided/pdb/*<br />
and is in the right format (UNIX compressed, pdb*.ent.gz).<br />
D.2.2 Warnings in Homologs Table<br />
These messages appear in the Warning column of the Homologs table for sequences that are<br />
likely to be unusable for model building:<br />
Many missing residues<br />
• Too many consecutive missing residues to guarantee model building will succeed.<br />
!UNUSABLE: Alternate residue type<br />
• The residue type was undetermined, causing the file to include the residue twice but with<br />
different identity (e.g. ALA20 and VAL20). This is not supported by <strong>Prime</strong>.<br />
!UNUSABLE: Bad SEQRES<br />
• General problems with the SEQRES.<br />
!UNUSABLE: CA only<br />
• File contains only alpha C coordinates.<br />
!UNUSABLE: Molecule molname should be labeled as HETATM<br />
• A non-protein molecule has been listed as ATOM when it should be listed as HETATM.<br />
ATOM records are reserved for proteins only.<br />
!UNUSABLE: SEQRES/ATOM mismatch<br />
• The sequence listed in the SEQRES records does not match the actual sequence in the<br />
ATOM records.<br />
!UNUSABLE: SEQRES is missing residues<br />
• There are residues in the ATOM records that are not listed in the SEQRES records. PDB<br />
files require that the ATOM records be a subset of the SEQRES records.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
!UNUSABLE: SEQRES numbering is wrong<br />
Appendix D: Error Messages and Warnings<br />
• The number of residues in a particular chain does not agree with the number of residues<br />
shown in columns 14 to 17 of SEQRES.<br />
D.3 Edit Alignment<br />
Check Alignment<br />
When you click Next at the end of the Edit Alignment step, the Check Alignment warning may<br />
appear. The warning lists possible alignment problems that you may want to fix before<br />
continuing to the next step. For example, there may be gaps in secondary structure elements of<br />
the template, or gaps in the query that are aligned to secondary structure elements of the<br />
template.<br />
If you do not want to make corrections to the alignment, click Continue to go to the next step.<br />
To make corrections based on the information in the error message, make a note of the gap<br />
locations given, click Cancel to close the message box while remaining in Edit Alignment, and<br />
change the alignment accordingly.<br />
D.4 Build Structure<br />
Error messages and warnings for this generally appear in the log of a running job. The log can<br />
be viewed in the job Monitor panel until the job is completed.<br />
D.5 Refinement<br />
Error messages for Refinement generally appear in the log of a running job. The following is<br />
an example of a warning in the log of a loop refinement.<br />
Invalid Features<br />
When a loop refinement begins, a validation program checks the loop features of the structure.<br />
Apparent errors are reported in a warning dialog box. You may choose to Run All Features,<br />
Run Only Valid Features, orCancel. It is recommended that you make a note of the invalid<br />
features, click Cancel, and correct the structure if appropriate.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 129
130<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Getting Help<br />
Schrödinger software is distributed with documentation in PDF format. If the documentation is<br />
not installed in $SCHRODINGER/docs on a computer that you have access to, you should install<br />
it or ask your system administrator to install it.<br />
For help installing and setting up licenses for Schrödinger software and installing documentation,<br />
see the Installation Guide. For information on running jobs, see the Job Control Guide.<br />
Maestro has automatic, context-sensitive help (Auto-Help and Balloon Help, or tooltips), and<br />
an online help system. To get help, follow the steps below.<br />
• Check the Auto-Help text box, which is located at the foot of the main window. If help is<br />
available for the task you are performing, it is automatically displayed there. Auto-Help<br />
contains a single line of information. For more detailed information, use the online help.<br />
• If you want information about a GUI element, such as a button or option, there may be<br />
Balloon Help for the item. Pause the cursor over the element. If the Balloon Help does<br />
not appear, check that Show Balloon Help is selected in the Maestro menu of the main<br />
window. If there is Balloon Help for the element, it appears within a few seconds.<br />
• For information about a panel or the tab that is displayed in a panel, click the Help button<br />
in the panel, or press F1. The help topic is displayed in your browser.<br />
• For other information in the online help, open the default help topic by choosing Online<br />
Help from the Help menu on the main menu bar or by pressing CTRL+H. This topic is displayed<br />
in your browser. You can navigate to topics in the navigation bar.<br />
The Help menu also provides access to the manuals (including a full text search), the FAQ<br />
pages, the New Features pages, and several other topics.<br />
If you do not find the information you need in the Maestro help system, check the following:<br />
• Maestro <strong>User</strong> <strong>Manual</strong>, for detailed information on using Maestro<br />
• Maestro Command Reference <strong>Manual</strong>, for information on Maestro commands<br />
• Maestro Overview, for an overview of the main features of Maestro<br />
• Maestro Tutorial, for a tutorial introduction to basic Maestro features<br />
• <strong>Prime</strong> Quick Start Guide, for a tutorial introduction to <strong>Prime</strong><br />
• <strong>Prime</strong> Frequently Asked Questions pages, at<br />
https://www.schrodinger.com/<strong>Prime</strong>_FAQ.html<br />
• Known Issues pages, available on the Support Center.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 131
132<br />
Getting Help<br />
The manuals are also available in PDF format from the Schrödinger Support Center. Local<br />
copies of the FAQs and Known Issues pages can be viewed by opening the file<br />
Suite_2009_Index.html, which is in the docs directory of the software installation, and<br />
following the links to the relevant index pages.<br />
Information on available scripts can be found on the Script Center. Information on available<br />
software updates can be obtained by choosing Check for Updates from the Maestro menu.<br />
If you have questions that are not answered from any of the above sources, contact Schrödinger<br />
using the information below.<br />
E-mail: help@schrodinger.com<br />
USPS: Schrödinger, 101 SW Main Street, Suite 1300, Portland, OR 97204<br />
Phone: (503) 299-1150<br />
Fax: (503) 299-4532<br />
WWW: http://www.schrodinger.com<br />
FTP: ftp://ftp.schrodinger.com<br />
Generally, e-mail correspondence is best because you can send machine output, if necessary.<br />
When sending e-mail messages, please include the following information:<br />
• All relevant user input and machine output<br />
• <strong>Prime</strong> purchaser (company, research institution, or individual)<br />
• Primary <strong>Prime</strong> user<br />
• Computer platform type<br />
• Operating system with version number<br />
• <strong>Prime</strong> version number<br />
• Maestro version number<br />
• mmshare version number<br />
On UNIX you can obtain the machine and system information listed above by entering the<br />
following command at a shell prompt:<br />
$SCHRODINGER/utilities/postmortem<br />
This command generates a file named username-host-schrodinger.tar.gz, which you<br />
should send to help@schrodinger.com. If you have a job that failed, enter the following<br />
command:<br />
$SCHRODINGER/utilities/postmortem jobid<br />
where jobid is the job ID of the failed job, which you can find in the Monitor panel. This<br />
command archives job information as well as the machine and system information, and<br />
includes input and output files (but not structure files). If you have sensitive data in the job<br />
launch directory, you should move those files to another location first. The archive is named<br />
jobid-archive.tar.gz, and should be sent to help@schrodinger.com instead.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Getting Help<br />
If Maestro fails, an error report that contains the relevant information is written to the current<br />
working directory. The report is named maestro_error.txt, and should be sent to<br />
help@schrodinger.com. A message giving the location of this file is written to the terminal<br />
window.<br />
More information on the postmortem command can be found in Appendix A of the Job<br />
Control Guide.<br />
On Windows, machine and system information is stored on your desktop in the file<br />
schrodinger_machid.txt. If you have installed software versions for more than one<br />
release, there will be multiple copies of this file, named schrodinger_machid-N.txt,<br />
where N is a number. In this case you should check that you send the correct version of the file<br />
(which will usually be the latest version).<br />
If Maestro fails to start, send email to help@schrodinger.com describing the circumstances,<br />
and attach the file maestro_error.txt. If Maestro fails after startup, attach this file and the<br />
file maestro.EXE.dmp. These files can be found in the following directory:<br />
%USERPROFILE%\Local Settings\Application Data\Schrodinger\appcrash<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 133
134<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
<strong>Prime</strong> <strong>User</strong> <strong>Manual</strong><br />
Glossary<br />
Chapter<br />
alignment—The optimal matching of residue positions between sequences, typically a query<br />
sequence and one or more template sequences.<br />
anchor—A constraint on alignment set at a given residue position. Alignment changes must<br />
preserve the query-template pairing at that residue until the anchor is removed.<br />
ASL—Atom Specification Language.<br />
cofactor—The complement of an enzyme reaction, usually a metal ion or metal complex.<br />
Comparative Modeling—Protein structure modeling based on a query-template match with a<br />
substantial percentage of identical residues (usually 50% or greater sequence identity).<br />
composite template—A type of template used in the Threading path, produced from the core<br />
(invariable) and variable regions of a family of structurally similar proteins.<br />
constraints—Tools to keep regions of a sequence (alignment constraints) or structure (during<br />
minimization) in a particular configuration.<br />
deletions—The residues missing from a query sequence that are present in a template<br />
sequence.<br />
Fold Recognition—The use of secondary structure matching and profiles generated from<br />
multiple sequence/structure alignments to find templates when sequence methods are unsuccessful.<br />
gaps—The spaces in an alignment resulting from insertions and deletions.<br />
HETATOMs—The atoms of residues, including amino acids, that are not one of the standard<br />
20 amino acids. In PDB files, HETATM.<br />
homolog—A sequence/structure related to the query sequence; i.e., a sequence with many of<br />
the same residues in the same patterns as the query sequence. Usually these sequences are<br />
derived from the same family and may have similar function.<br />
insertions—The extra residues found in a query sequence that are not found in a template<br />
sequence.<br />
ligand—A small molecule that binds to a protein; e.g., an antigen, hormone, neurotransmitter,<br />
substrate, inhibitor, and so on.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong> 135
136<br />
Glossary<br />
loop—A region of undefined secondary structure.<br />
project—A collection of related data, such as structures with their associated properties. In<br />
<strong>Prime</strong> a project comprises one or more runs (executions of the <strong>Prime</strong> workflow). The project<br />
may include data that does not appear in the project table.<br />
Project Table—The Maestro panel associated with a project, featuring a table with rows of<br />
entries and columns of properties. In <strong>Prime</strong>, project table entries are usually model structures.<br />
query sequence—A sequence of unknown structure or fold.<br />
Ranking Score—The score used to rank composite templates derived from different seed<br />
templates. Generated by the Global Scoring Function in the Threading Path.<br />
refinement—An improvement of a model structure through energy-based optimization of<br />
selected regions.<br />
run—A single execution of the <strong>Prime</strong> workflow using a particular set of choices (of templates,<br />
of paths, and of settings). Each run belongs to a project. Runs cannot be saved without saving<br />
the project to which they belong.<br />
SSA—Secondary structure assignment.<br />
SSP—Secondary structure prediction.<br />
template sequence—A sequence of known structure and fold used as a basis for building a<br />
model of the query.<br />
Threading—A structure prediction process in which Fold Recognition is used to define<br />
templates, then backbone models are built via alignment to composite templates and refined.<br />
May be used when query-template sequence identity is low.<br />
Workspace—The open area in the center of the Maestro window in which structures are<br />
displayed.<br />
Z-Score—Measures the compatibility of the query sequence with the model structure, relative<br />
to the compatibility of randomly shuffled sequences of the same composition.<br />
<strong>Prime</strong> 2.1 <strong>User</strong> <strong>Manual</strong>
Symbols<br />
+ (plus sign), in Pfam sequence ........................ 21<br />
- (hyphen)<br />
for locked gap ....................................... 14, 33<br />
for loop ........................................... 13, 31, 46<br />
~ (tilde), for unlocked gap........................... 14, 33<br />
A<br />
Add Anchors ............................................... 14, 33<br />
Align program................................................... 47<br />
aligned regions .................................................. 36<br />
alignment......................................................... 135<br />
editing................................................... 13, 33<br />
alignment score ................................................. 34<br />
interpretation of values ............................... 71<br />
alpha carbons<br />
aligning structures by ............................... 119<br />
constraining specific ................................. 100<br />
maximum movement .......................... 65, 101<br />
anchors ............................................................ 135<br />
alignment constraints............................ 14, 33<br />
removing............................................... 14, 33<br />
setting ................................................... 14, 33<br />
ASL................................................................. 135<br />
atom names, correcting ..................................... 54<br />
ATOM records ................................................ 125<br />
B<br />
Balloon Help ....................................................... 8<br />
binding sites, aligning ....................................... 74<br />
bit score............................................................. 23<br />
BLAST<br />
database format......................................... 123<br />
expectation value ........................................ 23<br />
percentage gaps .......................................... 23<br />
updating database ..................................... 120<br />
BLAST/PSI-BLAST....................................... 121<br />
blocks, sliding ............................................. 13, 33<br />
BLOSUM62 similarity matrix .......................... 23<br />
bond orders, assigning ...................................... 53<br />
bonds, to metals ................................................ 52<br />
Build Structure step........................................... 35<br />
C<br />
chain lengths ..................................................... 18<br />
Index<br />
chain name ........................................................ 23<br />
chains, multiple......................................... 18, 102<br />
charged residues................................................ 33<br />
code, sequence .................................................. 18<br />
cofactors.............................................. 36, 37, 135<br />
Color by Sequence ............................................ 10<br />
Color Scheme.............................................. 10, 14<br />
<strong>Prime</strong> Display Menu................................... 10<br />
<strong>Prime</strong> toolbar .............................................. 10<br />
color schemes.............................................. 10–11<br />
for Set Template Regions ........................... 36<br />
columns<br />
sorting............................................... 8, 22, 48<br />
V (Visualization) ........................................ 49<br />
Comparative Modeling ............................... 1, 135<br />
Comparative Modeling Path ................... 9, 17, 25<br />
composite SSP .................................................. 33<br />
composite templates........................................ 135<br />
Compound (PDB COMPND) ........................... 23<br />
conserved residues ...................................... 36, 47<br />
constraints ....................................................... 135<br />
alpha carbon.............................................. 100<br />
atom pair..................................................... 99<br />
continuum solvation model............................... 40<br />
conventions, document...................................... ix<br />
cooperative refinement of loops........................ 60<br />
coordinate copying............................................ 36<br />
Covalent Docking panel.................................... 78<br />
covalently bound ligands .................................. 51<br />
crystal symmetry............................................... 64<br />
C-terminal gaps........................................... 13, 33<br />
D<br />
Delete<br />
<strong>Prime</strong> runs..................................................... 6<br />
SSP ......................................................... 8, 31<br />
deletions .............................................. 36, 51, 135<br />
directory<br />
installation .................................................... 3<br />
Maestro working........................................... 3<br />
Display menu (<strong>Prime</strong>)................................. 10, 13<br />
domains....................................................... 18, 25<br />
E<br />
E (for beta strand) ................................. 13, 31, 46<br />
Edit menu (<strong>Prime</strong>)............................................. 13<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong> 137
138<br />
Index<br />
Edit SSP ............................................................ 13<br />
editing alignment .............................................. 33<br />
editing sequences .............................................. 10<br />
EMBL ............................................................... 18<br />
Energy Analysis task......................................... 58<br />
energy minimization ......................................... 62<br />
entries, Project Table......................................... 16<br />
environment variables ..................................... 123<br />
SCHRODINGER .............................................. 3<br />
SCHRODINGER_PDB...................................... 3<br />
error messages................................... 18, 127, 129<br />
Expect column .................................................. 23<br />
expectation value............................................... 23<br />
Export Sequences............................................ 8, 9<br />
Export SSP.............................................. 8, 30, 43<br />
extracting sequence names................................ 18<br />
F<br />
family ................................................................ 21<br />
FASTA<br />
file format ............................................. 18, 30<br />
file header ................................................... 18<br />
file formats ........................................................ 18<br />
File menu ............................................................ 6<br />
Job Options................................................. 14<br />
Find Homologs step .................................... 17, 18<br />
Fold Recognition....................................... 43, 135<br />
folds................................................................... 47<br />
Font Size, in Sequence Viewer ......................... 12<br />
force field<br />
Build Structure............................................ 40<br />
energy minimization................................... 62<br />
nonstandard residues .................................. 52<br />
single-point energy ..................................... 58<br />
formal charges, correcting ................................ 53<br />
G<br />
gaps ................................................................. 135<br />
C-terminal............................................. 13, 33<br />
locking .................................................. 14, 33<br />
N-terminal............................................. 13, 33<br />
percentage................................................... 23<br />
in query....................................................... 23<br />
in secondary structure......................... 34, 129<br />
in template .................................................. 23<br />
unlocking .............................................. 14, 33<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong><br />
GENBANK ....................................................... 18<br />
getpdb utility.................................................. 22<br />
GPCRs<br />
aligning....................................................... 31<br />
SSPs for ...................................................... 29<br />
Guide......................................................... 8, 9, 13<br />
H<br />
H (for alpha helix)................................. 13, 31, 46<br />
header<br />
FASTA file.................................................. 18<br />
PDB ............................................................ 18<br />
helix, building ................................................. 101<br />
HETATMs ....................................................... 135<br />
HETATOMs .................................................... 135<br />
hidden sequences .............................................. 21<br />
Hide All (SSPs)................................................... 8<br />
HMMER/Pfam.................................... 21, 28, 121<br />
homologs......................................................... 135<br />
importing .................................................... 19<br />
Homologs table ................................................. 23<br />
host machine ..................................................... 14<br />
hydrogens, adding............................................. 52<br />
hydrophobic residues ........................................ 33<br />
hyphen symbol<br />
for locked gap ....................................... 14, 33<br />
for loop ........................................... 13, 31, 46<br />
I<br />
ID, PDB chain................................................... 23<br />
Identities column............................................... 23<br />
Import SSP.................................................. 30, 43<br />
importing structurally aligned templates .......... 19<br />
Include ligands and cofactors window.............. 37<br />
Input Sequence step .......................................... 17<br />
insertions....................................... 33, 36, 51, 135<br />
J<br />
Job Monitor panel ....................................... 15, 38<br />
job options......................................................... 14<br />
Job Options dialog box ................................. 9, 14<br />
job status icon ............................................... 8, 15<br />
jobs<br />
killing.......................................................... 15<br />
monitoring .................................................. 15<br />
stopping ...................................................... 38
K<br />
killing jobs ........................................................ 15<br />
L<br />
leaving group<br />
ligand .......................................................... 78<br />
receptor....................................................... 79<br />
libraries<br />
backbone dihedral....................................... 40<br />
side-chain rotamer ...................................... 40<br />
ligands................................................. 36, 37, 135<br />
covalently bound......................................... 51<br />
ligands, covalently bound ........................... 52, 77<br />
lipophilic term................................................... 69<br />
locking gaps ................................................ 14, 33<br />
log file ............................................................... 38<br />
loop refinement ................................................. 51<br />
cooperative.................................................. 60<br />
invalid features check ................................. 60<br />
options ........................................................ 64<br />
order of ....................................................... 60<br />
settings........................................................ 59<br />
time required............................................... 59<br />
loops................................................................ 136<br />
M<br />
Maestro, starting ................................................. 3<br />
<strong>Manual</strong> Threading............................................. 33<br />
maximum length ............................................... 25<br />
maximum sequence length................................ 18<br />
menu bar (<strong>Prime</strong>) ............................................ 8, 9<br />
menus<br />
Display (<strong>Prime</strong>) .................................... 10, 13<br />
Edit (<strong>Prime</strong>) ................................................ 13<br />
File (<strong>Prime</strong>)............................................. 6, 14<br />
merging projects.................................................. 5<br />
metals, bonds to ................................................ 52<br />
minimization ............................................... 36, 62<br />
missing residues.............................................. 125<br />
Monitor panel................................................ 8, 38<br />
monitoring jobs ................................................. 15<br />
mouse functions ............................................ 8, 22<br />
multimers, building ........................................... 88<br />
multiple chains<br />
building as multimer................................... 88<br />
with same sequence .................................... 18<br />
Index<br />
with SKA.................................................. 102<br />
multiple sequences............................................ 18<br />
multiple templates............................................. 36<br />
exporting................................................... 115<br />
N<br />
NAME_pfam sequence ..................................... 21<br />
names<br />
chains.......................................................... 23<br />
extracting sequence..................................... 18<br />
non-unique.................................................. 18<br />
sequences.............................................. 18, 23<br />
native Schrödinger format................................. 30<br />
New command .................................................... 6<br />
non-conserved residues..................................... 36<br />
None color scheme............................................ 10<br />
N-terminal gaps........................................... 13, 33<br />
O<br />
Open command ................................................... 6<br />
OPLS_2005 force field ................... 40, 52, 58, 62<br />
optimizing side chains....................................... 37<br />
options<br />
job........................................................... 9, 14<br />
in panel layout .............................................. 8<br />
P<br />
panel layout......................................................... 7<br />
panels<br />
Job Options................................................. 14<br />
Monitor......................................................... 8<br />
path<br />
Comparative Modeling ................. 1, 9, 17, 25<br />
Threading.................................................... 17<br />
PDB........................................................... 47, 121<br />
ATOM records .......................................... 125<br />
chain ID ...................................................... 23<br />
header ......................................................... 18<br />
ID code ....................................................... 23<br />
SEQRES ................................................... 125<br />
updating database ..................................... 120<br />
PDB atom names, correcting for receptor-ligand<br />
complex.......................................................... 54<br />
Pfam .................................................... 21, 28, 121<br />
PIR .................................................................... 18<br />
plus sign (+), in Pfam sequence ........................ 21<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong> 139
140<br />
Index<br />
polar residues .................................................... 33<br />
position-specific substitutional matrix (PSSM) 33<br />
Positives column ............................................... 23<br />
Preferences........................................................ 12<br />
command .................................................... 11<br />
<strong>Prime</strong> Guide .................................................. 8, 13<br />
<strong>Prime</strong> menu bar............................................... 8, 9<br />
<strong>Prime</strong> runs ........................................................... 5<br />
<strong>Prime</strong> Sequence Viewer ...................................... 8<br />
<strong>Prime</strong> toolbar................................................. 8, 13<br />
<strong>Prime</strong> workflow................................................... 9<br />
<strong>Prime</strong>-Refinement ......................................... 1, 51<br />
<strong>Prime</strong>–Structure Prediction....................... 1, 7, 17<br />
processors, maximum number .......................... 65<br />
product installation.......................................... 131<br />
programs, third-party ...................................... 121<br />
Project Table ............................................. 16, 136<br />
projects............................................................ 136<br />
proximity........................................................... 11<br />
Proximity color scheme .................................... 11<br />
proximity cutoff .......................................... 11, 12<br />
PSIPRED ........................................................ 121<br />
Q<br />
query sequence................................................ 136<br />
query, maximum length .............................. 18, 25<br />
R<br />
random seed, for side-chain prediction............. 64<br />
Ranking Score................................................. 136<br />
read sequence.................................................... 18<br />
records<br />
ATOM....................................................... 125<br />
SEQRES ................................................... 125<br />
Refine Structure ................................................ 16<br />
Refinement .......................................................... 1<br />
refinement ....................................................... 136<br />
cooperative.................................................. 60<br />
protocols ..................................................... 51<br />
Refinement panel .............................................. 51<br />
regions<br />
aligned ........................................................ 36<br />
of multiple templates .................................. 36<br />
of templates, setting.................................... 36<br />
Rename command............................................... 6<br />
Res1................................................................... 59<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong><br />
Res2................................................................... 59<br />
Residue Matching color scheme ....................... 11<br />
residue names, correcting for receptor-ligand<br />
complex.......................................................... 54<br />
Residue Property color scheme......................... 11<br />
Residue Type color scheme .............................. 11<br />
residues<br />
charged........................................................ 33<br />
conserved.............................................. 36, 47<br />
hydrophobic................................................ 33<br />
incomplete .................................................. 61<br />
missing...................................................... 125<br />
non-conserved............................................. 36<br />
nonstandard................................................. 52<br />
polar............................................................ 33<br />
standard, list of ........................................... 51<br />
variable ....................................................... 47<br />
rotamers............................................................. 37<br />
ruler..................................................................... 8<br />
Run SSP ................................................ 31, 44, 46<br />
runs.......................................................... 5, 6, 136<br />
S<br />
sampling accuracy, loop refinement.................. 64<br />
Save As ......................................................... 6, 25<br />
saving <strong>Prime</strong> runs................................................ 6<br />
Schrödinger contact information..................... 132<br />
schrodinger.hosts file............................ 14<br />
SCOP domains.................................................. 47<br />
score<br />
alignment .................................................... 34<br />
BLAST bit .................................................. 23<br />
Z.................................................................. 46<br />
scratch projects.................................................... 5<br />
Search program<br />
in Find Homologs....................................... 20<br />
in Fold Recognition .................................... 47<br />
secondary structure assignment .................. 14, 33<br />
Secondary Structure color scheme.................... 11<br />
secondary structure prediction programs........ 121<br />
See also SSPs<br />
Select toolbar command.................................... 13<br />
seqconvert utility.......................... 30, 43, 116<br />
SEQRES records............................................. 125<br />
sequence identity............................................... 23<br />
sequence positives............................................. 23<br />
Sequence Viewer (<strong>Prime</strong>).............................. 8, 21
sequences<br />
coloring....................................................... 14<br />
command menu for....................................... 8<br />
cropping...................................................... 18<br />
editing......................................................... 10<br />
exporting....................................................... 9<br />
maximum length................................... 18, 25<br />
multiple....................................................... 18<br />
names.................................................... 18, 23<br />
non-unique.................................................. 18<br />
selecting...................................................... 18<br />
standard code.............................................. 18<br />
unusable...................................................... 23<br />
Set Template Regions........................................ 13<br />
color scheme............................................... 36<br />
Set Template Regions toolbar button ................ 36<br />
side chains......................................................... 36<br />
optimizing................................................... 37<br />
predicting.................................................... 36<br />
rotamers ...................................................... 37<br />
unfreezing in loop refinement..................... 66<br />
similarity, BLOSUM62 matrix ......................... 23<br />
Slide As Block ............................................ 13, 33<br />
Slide Freely ................................................. 13, 33<br />
solvation............................................................ 40<br />
sorting tables, by column heading ................ 8, 22<br />
SSA ................................................................. 136<br />
SSP profile, in Fold Recognition ...................... 43<br />
SSpro................................................... 29, 44, 121<br />
SSPs ................................................................ 136<br />
command menu for....................................... 8<br />
composite.................................................... 33<br />
deleting ................................................... 8, 31<br />
editing......................................................... 13<br />
exporting........................................... 8, 30, 43<br />
in Fold Recognition .................................... 43<br />
hiding all....................................................... 8<br />
importing .............................................. 30, 43<br />
retrieving unmodified ........................... 31, 46<br />
in Sequence Viewer ...................................... 8<br />
standard residues............................................... 51<br />
Step View ............................................................ 8<br />
steps<br />
action ............................................................ 8<br />
available and unavailable.............................. 9<br />
Find Homologs ..................................... 17, 18<br />
Input Sequence ........................................... 17<br />
Index<br />
name ............................................................. 8<br />
panel layout .................................................. 7<br />
structalign utility.................................... 115<br />
structurally aligned templates ........................... 23<br />
structure refinement, loop ................................. 51<br />
structures<br />
coloring....................................................... 14<br />
structures, exporting........................................ 115<br />
Surface Generalized Born (SGB)...................... 40<br />
T<br />
tables ................................................................... 8<br />
cells............................................................. 22<br />
sorting........................................................... 8<br />
in Step View.................................................. 8<br />
of templates ................................................ 47<br />
Template ID color scheme ................................ 11<br />
template sequence ........................................... 136<br />
templates<br />
multiple....................................................... 36<br />
regions ........................................................ 36<br />
structurally aligned ..................................... 23<br />
Templates table ................................................. 47<br />
third-party programs ......................... 44, 121, 122<br />
Threading ........................................................ 136<br />
Threading path ............................................ 17, 25<br />
tilde symbol, for unlocked gap.................... 14, 33<br />
Title, PDB header.............................................. 23<br />
toolbar, <strong>Prime</strong>................................................ 8, 13<br />
tooltips .......................................................... 8, 22<br />
for buttons................................................... 13<br />
for table entries............................................. 8<br />
truncated-Newton.............................................. 62<br />
U<br />
unlocking gaps ............................................ 14, 33<br />
unusable sequence............................................. 23<br />
Update (button) ................................................. 33<br />
utilities<br />
getpdb...................................................... 22<br />
seqconvert .............................. 30, 43, 116<br />
structalign........................................ 115<br />
V<br />
V (Visualization) column.................................. 49<br />
View SSA.......................................................... 14<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong> 141
142<br />
Index<br />
View Structure............................................. 12, 14<br />
W<br />
warnings...................................... 18, 23, 127–129<br />
workflow ....................................................... 9, 17<br />
<strong>Prime</strong> 2.5 <strong>User</strong> <strong>Manual</strong><br />
Workspace........................................... 10, 14, 136<br />
Z<br />
Z-Score...................................................... 46, 136
120 West 45th Street, 29th Floor<br />
New York, NY 10036<br />
Zeppelinstraße 13<br />
81669 München, Germany<br />
101 SW Main Street, Suite 1300<br />
Portland, OR 97204<br />
Dynamostraße 13<br />
68165 Mannheim, Germany<br />
8910 University Center Lane, Suite 270<br />
San Diego, CA 92122<br />
Quatro House, Frimley Road<br />
Camberley GU16 7ER, United Kingdom<br />
SCHRÖDINGER ®