25.03.2013 Views

Prime User Manual - ISP


Prime 2.1 User Manual

Schrödinger Press


Prime User Manual Copyright © 2009 Schrödinger, LLC. All rights reserved.

While care has been taken in the preparation of this publication, Schrödinger assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

Canvas, CombiGlide, ConfGen, Epik, Glide, Impact, Jaguar, Liaison, LigPrep, Maestro, Phase, Prime, PrimeX, QikProp, QikFit, QikSim, QSite, SiteMap, Strike, and WaterMap are trademarks of Schrödinger, LLC. Schrödinger and MacroModel are registered trademarks of Schrödinger, LLC. MCPRO is a trademark of William L. Jorgensen. Desmond is a trademark of D. E. Shaw Research. Desmond is used with the permission of D. E. Shaw Research. All rights reserved. This publication may contain the trademarks of other companies.

Schrödinger software includes software and libraries provided by third parties. For details of the copyrights, and terms and conditions associated with such included third party software, see the Legal Notices for Third-Party Software in your product installation at $SCHRODINGER/docs/html/third_party_legal.html (Linux OS) or %SCHRODINGER%\docs\html\third_party_legal.html (Windows OS).

This publication may refer to other third party software not included in or with Schrödinger software ("such other third party software"), and provide links to third party Web sites ("linked sites"). References to such other third party software or linked sites do not constitute an endorsement by Schrödinger, LLC. Use of such other third party software and linked sites may be subject to third party license agreements and fees. Schrödinger, LLC and its affiliates have no responsibility or liability, directly or indirectly, for such other third party software and linked sites, or for damage resulting from the use thereof. Any warranties that we make regarding Schrödinger products and services do not apply to such other third party software or linked sites, or to the interaction between, or interoperability of, Schrödinger products and services and such other third party software.

June 2009


Contents

Document Conventions

Chapter 1: Introduction
  1.1 The Prime Suite in Maestro
  1.2 Prime Documentation
  1.3 Running Schrödinger Software
  1.4 Citing Prime in Publications

Chapter 2: Using Prime–Structure Prediction
  2.1 Prime Runs and Maestro Projects
  2.2 General Panel Layout
  2.3 The Prime Menu Bar
    2.3.1 The File Menu
    2.3.2 The Edit Menu
    2.3.3 The Display Menu
    2.3.4 The Step Menu
  2.4 The Prime Toolbar
  2.5 Prime Job Options and Control
    2.5.1 The Job Options Dialog Box
    2.5.2 Setting Up and Launching Jobs
    2.5.3 Monitoring Jobs
    2.5.4 Adding Structures to the Project Table

Chapter 3: Prime–Structure Prediction: Initial Steps
  3.1 Input Sequence Step
  3.2 Find Homologs Step
    3.2.1 Importing Homologs
    3.2.2 Searching for Homologs
    3.2.3 Find Homologs Search Options
    3.2.4 Search Results
    3.2.5 Multiple Templates and Structure Alignment
    3.2.6 Continuing to the Next Step
    3.2.7 Saving Runs With Different Templates
    3.2.8 Error Messages

Chapter 4: Comparative Modeling: Edit Alignment
  4.1 The Comparative Modeling Path
  4.2 Overview of the Edit Alignment Step
  4.3 Initial and Imported Alignments
    4.3.1 Accepting the BLAST/PSI-BLAST Alignment
    4.3.2 Exporting an Alignment
    4.3.3 Importing an Alignment
  4.4 Secondary Structure Predictions (SSPs)
    4.4.1 Generating SSPs
    4.4.2 Exporting and Importing SSPs
    4.4.3 Deleting or Editing an SSP
    4.4.4 Resetting SSPs
  4.5 Generated and Modified Alignments
    4.5.1 Running the Align Program
    4.5.2 Align Program Technical Details
    4.5.3 Manually Editing the Alignment
    4.5.4 Updating the Step View Table
  4.6 Error Messages

Chapter 5: Comparative Modeling: Build Structure
  5.1 The Build Process
  5.2 Preparing to Build
    5.2.1 Multiple Templates (Set Template Regions)
    5.2.2 Including Ligands and Cofactors
    5.2.3 Build Options
    5.2.4 Omitting Structural Discontinuities
  5.3 Running the Build Structure Job
  5.4 Saving and Exporting Built Structures
  5.5 Technical Notes for the Build Structure Step

Chapter 6: Fold Recognition
  6.1 Overview of the Fold Recognition Step
  6.2 Producing an SSP Profile
    6.2.1 Importing and Exporting SSPs
    6.2.2 Generating SSPs
    6.2.3 Editing or Deleting SSPs
    6.2.4 Resetting SSPs
  6.3 Searching for Templates
    6.3.1 Search Program Options
    6.3.2 Search Program Technical Details
    6.3.3 Search Output
  6.4 Evaluating Templates
  6.5 Building a Model

Chapter 7: Prime–Refinement
  7.1 Preparing Structures for Refinement
  7.2 Using the Refinement Panel
  7.3 Setting Up an Implicit Membrane
  7.4 Prime Energy Analysis
  7.5 Refining Loops
  7.6 Predicting Side Chains
  7.7 Minimizing Structures
  7.8 Refinement Panel Options
    7.8.1 General Options
    7.8.2 Dielectric Options
    7.8.3 Loop Refinement Options
  7.9 Technical Details for Refinement
    7.9.1 Loop Refinement
    7.9.2 Cooperative Loop Refinement
    7.9.3 Side-Chain Prediction
  7.10 Refinement with Nonstandard Residues
  7.11 Output of Refinement Jobs

Chapter 8: Maestro Protein Structure Alignment
  8.1 The Protein Structure Alignment Panel
  8.2 The Align Binding Sites Panel

Chapter 9: Docking Covalently Bound Ligands
  9.1 Running Covalent Docking from Maestro
    9.1.1 Selecting the Ligands
    9.1.2 Defining the Ligand Reactive Groups
    9.1.3 Specifying the Receptor Bond to Break
    9.1.4 Specifying Receptor Residues to Sample
    9.1.5 Running the Job
  9.2 Running Covalent Docking from the Command Line

Chapter 10: Prime MM-GBSA

Chapter 11: Command Syntax
  11.1 blast—Run Blast Searches
  11.2 bldstruct—Build a Protein Model from an Alignment
  11.3 fr—Fold Recognition
  11.4 multirefine—Multi-Step Extended Sampling
  11.5 refinestruct—Refine a Protein Structure
  11.6 ska—Protein Structure Alignment
  11.7 ssp—Secondary Structure Prediction
  11.8 sta—Single Template Alignment
  11.9 tfm—Tertiary Folding Module

Chapter 12: Command-Line Utilities
  12.1 Multiple Template Alignment: structalign
  12.2 Format Conversion: seqconvert
  12.3 Structure Preparation: primefix.py
  12.4 Structure Alignment: align_binding_sites
  12.5 Database Update Utilities
    12.5.1 update_BLASTDB
    12.5.2 rsync_pdb

Appendix A: Third-Party Programs
  A.1 Location of Third-Party Programs
    A.1.1 BLAST/PSI-BLAST, Pfam, and the PDB
    A.1.2 Secondary Structure Prediction
  A.2 Third-Party Program Use Agreement

Appendix B: Environment Variables

Appendix C: File Formats
  C.1 Input File Formats
  C.2 Output File Formats

Appendix D: Error Messages and Warnings
  D.1 Input Sequence
  D.2 Find Homologs
    D.2.1 Error Messages
    D.2.2 Warnings in Homologs Table
  D.3 Edit Alignment
  D.4 Build Structure
  D.5 Refinement

Getting Help

Glossary

Index


Document Conventions

In addition to the use of italics for names of documents, the font conventions that are used in this document are summarized in the table below.

Font | Example | Use
Sans serif | Project Table | Names of GUI features, such as panels, menus, menu items, buttons, and labels
Monospace | $SCHRODINGER/maestro | File names, directory names, commands, environment variables, and screen output
Italic | filename | Text that the user must replace with a value
Sans serif uppercase | CTRL+H | Keyboard keys

Links to other locations in the current document or to other PDF documents are colored like this: Document Conventions.

In descriptions of command syntax, the following UNIX conventions are used: braces { } enclose a choice of required items, square brackets [ ] enclose optional items, and the bar symbol | separates items in a list from which one item must be chosen. Lines of command syntax that wrap should be interpreted as a single command.

File name, path, and environment variable syntax is generally given with the UNIX conventions. To obtain the Windows conventions, replace the forward slash / with the backslash \ in path or directory names, and replace the $ at the beginning of an environment variable with a % at each end. For example, $SCHRODINGER/maestro becomes %SCHRODINGER%\maestro.
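The UNIX-to-Windows rule above can be illustrated with a minimal Python sketch. The function name `to_windows_syntax` is hypothetical and is not part of any Schrödinger tool; the sketch also assumes that environment variable names consist only of letters, digits, and underscores.

```python
import re


def to_windows_syntax(unix_text: str) -> str:
    """Rewrite UNIX path and variable syntax in the Windows convention."""
    # A leading $ on a variable name becomes a % at each end of the name.
    converted = re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", r"%\1%", unix_text)
    # Forward slashes in path or directory names become backslashes.
    return converted.replace("/", "\\")


print(to_windows_syntax("$SCHRODINGER/maestro"))
# %SCHRODINGER%\maestro
```

The same call applied to $SCHRODINGER/docs/html/third_party_legal.html gives the Windows path quoted in the legal notice at the front of this manual.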

In this document, to type text means to type the required text in the specified location, and to enter text means to type the required text, then press the ENTER key.

References to literature sources are given in square brackets, like this: [10].


Chapter 1: Introduction

Prime 2.1 is a highly accurate suite of protein structure prediction programs that integrates Comparative Modeling and Fold Recognition into a single user-friendly, wizard-like interface. The Comparative Modeling path covers the complete protein structure prediction process, from template identification through alignment to model building. Refinement can then be done from a separate panel, and involves side-chain prediction, loop prediction, and minimization.

The Prime interface was designed to accommodate both novice and expert users, while the underlying programs were designed to produce superior results in a variety of applications, including:

• High-resolution homology modeling
• Refinement of active sites
• Induced-fit optimization
• Fold Recognition
• Structure-based functional annotation
• Generation of alternate loop conformations

1.1 The Prime Suite in Maestro

To run Prime, you use Maestro, the graphical user interface (GUI) for all of Schrödinger’s products. For an introduction to Maestro, see the Maestro Overview. For help with a Maestro panel, click the Help button. For more information, see the Maestro User Manual.

The Prime submenu of the Maestro Applications menu has four items: Structure Prediction, Refinement, Covalent Docking, and MM-GBSA. These items provide access to the Structure Prediction (Prime–SP), Refinement, Covalent Docking, and MM-GBSA modules of the Prime suite.

Prime–SP consists of a series of steps leading from a protein sequence through Comparative Modeling to the construction of a 3D structure. If a template of sufficiently high sequence identity exists in the database of known structures, the Comparative Modeling path is used. If a suitable template cannot be identified using sequence information, however, Fold Recognition can be used to identify potential templates.

Prime–Refinement is a facility for refining protein structures that have been imported into Maestro or added to the Project Table from the Comparative Modeling workflow or another Schrödinger application.


Covalent Docking is a facility for “docking” ligands that are covalently bound to the receptor. It uses designated reactive bonds in the ligand and a receptor bond to sample the possible attachment points, and performs a loop prediction to sample the ligand and specified parts of the receptor. The result is a set of “poses” of the ligand, docked to the receptor.

Prime MM-GBSA is a facility for calculating ligand binding energies within the MM-GBSA continuum solvation model.

1.2 Prime Documentation

The Prime User Manual describes the Prime suite, including information about:

• Essentials of using the Prime–SP module: runs and projects, job control, and the layout and common features of step panels
• Starting a structure prediction: the Input Sequence and Find Homologs steps
• Using the Comparative Modeling path, including the Edit Alignment and Build Structure steps
• Using Fold Recognition
• Using the Prime–Refinement module
• Running Prime MM-GBSA calculations
• Running Prime jobs from the command line

This document expands on the information provided in the Maestro online help. In addition to describing steps, features, and options, this document suggests ways to determine which choices are the most useful, describes results, and provides some background and technical detail. Terms pertinent to Prime are defined in the Glossary.

For a tutorial introduction to Prime, see the Prime Quick Start Guide. For information on installing Prime, see the Installation Guide.

For information about Schrödinger’s Induced Fit Docking protocol, which uses Prime and Glide to optimize docking by including ligand-induced flexibility in the receptor active site, see the document Induced Fit Docking.

Maestro online help includes topics for Prime. See Getting Help for a discussion of help in Maestro, including tooltips (Balloon Help) and technical support contacts. For a fuller description of the help facility, see Chapter 15 of the Maestro User Manual.



1.3 Running Schrödinger Software

To run any Schrödinger program on a UNIX platform, or start a Schrödinger job on a remote host from a UNIX platform, you must first set the SCHRODINGER environment variable to the installation directory for your Schrödinger software. To set this variable, enter the following command at a shell prompt:

csh/tcsh: setenv SCHRODINGER installation-directory

bash/ksh: export SCHRODINGER=installation-directory

Once you have set the SCHRODINGER environment variable, you can start Maestro with the following command:

$SCHRODINGER/maestro &

It is usually a good idea to change to the desired working directory before starting Maestro. This directory then becomes Maestro’s working directory. For more information on starting Maestro, including starting Maestro on a Windows platform, see Section 2.1 of the Maestro User Manual. There is no need to set SCHRODINGER on Windows: it is set automatically.

Prime jobs, with the exception of Fold Recognition, can be launched from a Windows platform to a remote UNIX host. You can run Prime MM-GBSA jobs locally on a Windows host.

To launch Structure Prediction jobs from Maestro, you must have access to the PDB. If the PDB is not installed into $SCHRODINGER, you must set the SCHRODINGER_PDB environment variable to the path to the PDB installation. For information on how to set environment variables on Windows, see Appendix A of the Installation Guide.
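The environment setup described in this section can be checked with a short Python sketch before any job is launched. This is an illustration under stated assumptions, not part of the Schrödinger distribution: the function names are hypothetical, the example paths are invented, and the fallback in `pdb_location` simply returns the installation directory because this manual does not say where inside $SCHRODINGER the PDB is placed.

```python
import posixpath
from typing import Mapping


def installation_dir(env: Mapping[str, str]) -> str:
    """Return the value of SCHRODINGER, or fail if it is unset or empty."""
    install_dir = env.get("SCHRODINGER", "")
    if not install_dir:
        raise RuntimeError("SCHRODINGER must be set to the installation directory")
    return install_dir


def maestro_launcher(env: Mapping[str, str]) -> str:
    """Build the UNIX launcher path, that is, $SCHRODINGER/maestro."""
    return posixpath.join(installation_dir(env), "maestro")


def pdb_location(env: Mapping[str, str]) -> str:
    """Prefer SCHRODINGER_PDB; otherwise assume the PDB is under $SCHRODINGER."""
    return env.get("SCHRODINGER_PDB") or installation_dir(env)


# Hypothetical paths, for illustration only.
env = {"SCHRODINGER": "/opt/schrodinger", "SCHRODINGER_PDB": "/data/pdb"}
print(maestro_launcher(env))  # /opt/schrodinger/maestro
print(pdb_location(env))      # /data/pdb
```

In practice the mapping passed in would be `os.environ`, so that the check reflects the shell from which Maestro is about to be started.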

1.4 Citing Prime in Publications

The use of this product should be acknowledged in publications as:

Prime, version 2.1, Schrödinger, LLC, New York, NY, 2009.


Chapter 2: Using Prime–Structure Prediction

This chapter presents the essentials of using Prime–Structure Prediction (Prime–SP):

• Starting, naming, and saving Prime–SP runs in Maestro projects
• Navigating within and between runs
• Job control, job monitoring, and Prime–SP job options
• Using Prime–SP step panels, including their common layout and features
• Using the Prime menu bar and Prime toolbar

2.1 Prime Runs and Maestro Projects

A single execution of the Prime workflow using a particular set of choices of templates, paths, and settings is called a run.

Prime runs are stored within Maestro projects. Structures generated in completed Prime runs are added to the Maestro Project Table. If you are unfamiliar with Maestro projects and the Maestro Project Table, you may want to review the Project Table and Project Facility online help topics. For more information, see the Maestro User Manual.

Maestro always has a project open. If you begin to work without selecting or naming a project, Maestro creates a scratch (unnamed) project.

Prime always has a run open, which belongs to the current project (whether it is a named project or a scratch project). If you begin the Prime workflow without specifying an existing run, the default name for the current run is run1. Multiple runs can be performed within a project, but you must save the project in order to save the data in the runs.

Each project can be displayed in the Project Table panel. In Prime, Project Table entries are usually finished model structures with their properties. Prime stores data in the project as each step in a run is performed, but most of this data is not displayed in the Project Table, which can remain empty until a run is completed. For this reason you should save scratch projects in which you have done work, even if the Project Table is empty.

Two Maestro projects can be merged into one. If you merge two Maestro projects, each of<br />

which contains one or more <strong>Prime</strong> runs, all the runs are included in the merged project. If two<br />

runs have the same name, Maestro automatically makes the names unique by appending -mrg1<br />

to one of them.<br />
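The renaming rule for merged projects can be sketched as follows. This is an illustrative model only, not Maestro code: the function name merge_run_names is hypothetical, and both the choice to rename the run from the second project and the continuation to -mrg2, -mrg3 for repeated clashes are assumptions that the text above does not confirm.<br />

```python
def merge_run_names(first, second, suffix="-mrg"):
    """Return the run names of a merged project, keeping every name unique.

    Assumption: a clashing run from the second project is the one renamed,
    and the counter keeps increasing if the -mrg1 name is also taken.
    """
    merged = list(first)
    taken = set(first)
    for name in second:
        candidate = name
        counter = 1
        # Append -mrg1, -mrg2, ... until the name no longer clashes.
        while candidate in taken:
            candidate = f"{name}{suffix}{counter}"
            counter += 1
        merged.append(candidate)
        taken.add(candidate)
    return merged


if __name__ == "__main__":
    print(merge_run_names(["run1", "run2"], ["run1", "loop_test"]))
```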


Maestro projects containing multiple runs are useful for comparing the results of workflows in<br />

which different selections and options were chosen: for example, a different template selected<br />

for building, or a job performed with a different set of options. Creating these multiple runs<br />

may involve navigating to an earlier step in your current run, making different selections, and<br />

then navigating forward again in a new run. For a description of the features used in navigation<br />

(the Next button, the Back button, and the Guide), see Section 2.2 on page 7.<br />

If you return to an earlier step in your current run and make different choices as you proceed<br />

forward, you will be asked whether you want to overwrite your existing data. If you want to<br />

keep the data in your current run for comparison rather than overwriting it, you can start a new<br />

run. The commands for saving the old run, starting a new one, and switching between <strong>Prime</strong><br />

runs are found under the <strong>Prime</strong> File menu.<br />

The File menu commands for managing runs are:<br />

New<br />

Asks for a new run name (all run names must be unique), and starts a new run at the Input<br />

Sequence step.<br />

Save As<br />

Copies the current run under a new run name and switches to the new run. The original run is<br />

saved under the original name. If you open a new project and begin a workflow without explicitly<br />

giving the run a name, it will be given the default name of run1.<br />

Rename<br />

Asks for a new name for the current run. All run names must be unique.<br />

Delete<br />

Deletes the current run, on confirmation.<br />

Open<br />

Displays a list of options for switching to another run. The list includes the four most recently<br />

used runs. If there are more than four runs in the current project, click the More option to select<br />

a run by name.<br />
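The behavior of these commands can be summarized in a small model. The sketch below is not Schrödinger code: the class RunManager and its methods are hypothetical names, run data is represented as a plain dictionary, and Delete is omitted because the text does not say which run becomes current afterwards.<br />

```python
import copy


class RunManager:
    """Toy model of the New, Save As, Rename, and Open commands."""

    def __init__(self):
        self.runs = {"run1": {}}   # run name -> run data; run1 is the default name
        self.current = "run1"
        self.recent = ["run1"]     # most recently used run first

    def _require_unused(self, name):
        # All run names must be unique within a project.
        if name in self.runs:
            raise ValueError(f"run name {name!r} is already in use")

    def _switch_to(self, name):
        if name in self.recent:
            self.recent.remove(name)
        self.recent.insert(0, name)
        self.current = name

    def new(self, name):
        """Start an empty run (the workflow would begin at Input Sequence)."""
        self._require_unused(name)
        self.runs[name] = {}
        self._switch_to(name)

    def save_as(self, name):
        """Copy the current run under a new name and switch to the copy."""
        self._require_unused(name)
        self.runs[name] = copy.deepcopy(self.runs[self.current])
        self._switch_to(name)

    def rename(self, name):
        """Give the current run a new, unused name."""
        self._require_unused(name)
        self.runs[name] = self.runs.pop(self.current)
        self.recent.remove(self.current)
        self._switch_to(name)

    def open_menu(self):
        """Return the four most recent runs and whether More is needed."""
        return self.recent[:4], len(self.runs) > 4
```

Because save_as makes a deep copy, later changes to the new run leave the data stored under the original name untouched, which is the property the backtracking procedure below relies on.<br />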

To backtrack and try a different set of choices without overwriting your current workflow:<br />

1. Choose Save As from the <strong>Prime</strong> File menu.<br />

This saves the current run under the old name (run1 is the default for the first run in a<br />

new project) and creates a copy under the new name.<br />


You are now in a new workflow identical to the first except for its name.<br />

2. Navigate to the step where you want to make a different choice. If you want to go back<br />

more than one step, click the step name in the Guide.<br />

2.2 General Panel Layout<br />

Each step in <strong>Prime</strong>–SP has its own panel. The panels have many elements in common and<br />

share a standard layout. This section discusses these common elements, their location, and<br />

their function. An example of a <strong>Prime</strong>–SP panel is shown in Figure 2.1.<br />

Figure 2.1. The Find Homologs step of the Structure Prediction panel.<br />


In the upper part of the panel is the <strong>Prime</strong> menu bar, and below that is the <strong>Prime</strong> toolbar. Below<br />

the <strong>Prime</strong> toolbar is the <strong>Prime</strong> sequence viewer. You can resize the sequence viewer by dragging<br />

the small box in the right-hand corner of the horizontal line that separates the sequence<br />

viewer from the main panel. You can select a single residue in a sequence by clicking on it, and<br />

you can select multiple residues using shift-click to select a range and control-click to select or<br />

deselect individual residues.<br />
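The selection rules can be modeled in a few lines. This sketch is an assumption about how the three kinds of click combine, not a description of the actual implementation: the function click is hypothetical, and it assumes that a shift-click range starts at the most recently clicked residue and replaces the previous selection.<br />

```python
def click(selection, anchor, index, shift=False, ctrl=False):
    """Return (new_selection, new_anchor) after clicking residue `index`.

    selection: set of selected residue positions
    anchor:    position of the last plain or control click, or None
    """
    if shift and anchor is not None:
        # Shift-click: select the whole range between anchor and index.
        low, high = sorted((anchor, index))
        return set(range(low, high + 1)), anchor
    if ctrl:
        # Control-click: toggle a single residue, keep the rest.
        return selection ^ {index}, index
    # Plain click: select only this residue.
    return {index}, index


if __name__ == "__main__":
    selected, anchor = click(set(), None, 5)
    selected, anchor = click(selected, anchor, 8, shift=True)
    selected, anchor = click(selected, anchor, 6, ctrl=True)
    print(sorted(selected))
```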

In addition to displaying sequences for the query and for templates, in some steps the sequence<br />

viewer shows secondary structure predictions (SSPs). Right-click on an SSP for a menu of<br />

commands that includes Hide All, Delete, and Export. Right-clicking on a sequence also<br />

produces a shortcut menu of commands appropriate to the current step. You can export<br />

sequences using the Export Sequences command on the <strong>Prime</strong> File menu.<br />

The sequence viewer includes a “ruler”, which marks the residue position relative to the leftmost<br />

residue displayed in the sequence viewer. This residue position is displayed in parentheses<br />

in the tooltip for each residue. The positions are arbitrary, and can change when you<br />

display residues in the viewer. They are intended to help you keep track of the sequence as you<br />

scroll the sequence viewer.<br />

The sequence viewer can be configured to display a single line for the sequence that can be<br />

scrolled horizontally, or the sequence can be wrapped so that it scrolls vertically. The configuration<br />

is controlled by the Wrap Sequences option on the sequence viewer shortcut menu.<br />

If you want to save an image of the sequence viewer, choose Save Image from the sequence<br />

viewer shortcut menu. A file selector opens, in which you can select a format, from PNG,<br />

TIFF, and JPEG, and save the image.<br />

Below the sequence viewer are the step name and a brief description of the step, followed by a<br />

row of buttons. The first button on the left runs the main action of the step, which may involve<br />

launching a job. See Section 2.5 on page 14 for more information on running <strong>Prime</strong> jobs. An<br />

Options button is usually located to the right of the action buttons. Most buttons, including<br />

those in the <strong>Prime</strong> toolbar, the <strong>Prime</strong> Guide, and the Maestro toolbar, have tooltips that you can<br />

view by pausing the cursor over the button.<br />

The job status icon in the upper right corner of the panel turns green and spins when you<br />

launch a job. When the icon turns pink again, the job is finished. Click the icon to open the<br />

Monitor panel. For more information on job control, see Section 2.5 on page 14.<br />

The center of the panel contains the Step View, which shows data relevant to the step in table<br />

format. In the Build Structure step, the Step View contains output from the Build Structure<br />

job’s log file. You can view the complete text of a truncated table entry as a tooltip by pausing<br />

the cursor over the text. To resize a table column, drag a column border using the middle<br />

mouse button. You can sort the tables in the Step View by clicking the column heading, once<br />


for ascending order, twice for descending order. You can also export data from any table in<br />

CSV or HTML format from the table shortcut menu.<br />

Below the Step View are the Back and Next buttons, the primary way to navigate between<br />

steps. You can perform a simple workflow by clicking Next after completing each step.<br />

However, it is sometimes useful to go back to previous steps, make different choices, and<br />

proceed forward again using the Next button or the Guide, explained next.<br />

The Guide, below the Back and Next buttons, is a diagram of the <strong>Prime</strong> workflow as a series of<br />

steps, each step represented by a button. The diagram branches to show the steps of both paths,<br />

Comparative Modeling and Threading.<br />

To hide the Guide, deselect the Guide check box in the Step menu.<br />

The Guide has two purposes: to show your current location in the <strong>Prime</strong>–SP process, and to let<br />

you navigate back and review or change the input at previous steps in the process. The current<br />

step is highlighted in the Guide diagram. All previously accessed steps, forward or backward,<br />

are available. You can go to any of these steps by clicking them in the Guide.<br />

Clicking on an available step button in the Guide displays the panel for that step. If a step is not<br />

yet available, its button is dimmed. Available steps include those that you have already passed<br />

through in the current run, the present step, and (once step-specific criteria have been met) the<br />

next step beyond the current location. When the next step is available, clicking on it behaves<br />

the same as clicking the Next button.<br />
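The availability rule for Guide buttons can be written as a short function. The sketch below is illustrative only: available_steps is a hypothetical name, the STEPS list is abbreviated to steps named in this chapter and ignores the branch into two paths, and the step-specific criteria are reduced to a single boolean.<br />

```python
# Abbreviated, linear list of steps for illustration; the real Guide branches
# into the Comparative Modeling and Threading paths.
STEPS = ["Input Sequence", "Find Homologs", "Build Structure", "Refine Structure"]


def available_steps(visited, current, next_ready):
    """Return the steps whose Guide buttons would not be dimmed.

    visited:    steps already accessed in the current run
    current:    the step whose panel is displayed
    next_ready: True once the criteria for leaving `current` are met
    """
    available = set(visited) | {current}
    position = STEPS.index(current)
    if next_ready and position + 1 < len(STEPS):
        available.add(STEPS[position + 1])
    # Report the steps in workflow order.
    return [step for step in STEPS if step in available]


if __name__ == "__main__":
    print(available_steps({"Input Sequence"}, "Find Homologs", next_ready=True))
```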

Note: You cannot launch searches and jobs by clicking the steps in the Guide. To launch<br />

actions pertaining to a step, click the appropriate buttons within the step panel.<br />

The Close button appears below the Guide on the lower left of the panel. To the right is the<br />

Help button, which opens online help related to the current step.<br />

2.3 The <strong>Prime</strong> Menu Bar<br />

The <strong>Prime</strong> menu bar consists of four menus: File, Edit, Display, and Step. Some of the options<br />

in these menus are also available as toolbar buttons.<br />

2.3.1 The File Menu<br />

The commands on the File menu for working with <strong>Prime</strong> runs (New, Save As, Rename, Delete,<br />

and Open) are discussed in Section 2.1 on page 5. The menu also contains the commands<br />

Export Sequences and Job Options. See Section 2.5 on page 14 for more information on job<br />

options.<br />


2.3.2 The Edit Menu<br />

The Edit menu contains commands for editing sequences and secondary structure predictions.<br />

These editing tools are also available from the <strong>Prime</strong> toolbar, and are described with the<br />

toolbar in Section 2.4 on page 13.<br />

2.3.3 The Display Menu<br />

The Display menu includes the following commands for controlling the appearance of the<br />

sequence viewer and structures in the Maestro Workspace.<br />

Color Scheme<br />

This command opens a menu of coloring options. In <strong>Prime</strong>–SP, the coloring schemes act as<br />

modes; that is, only one color scheme can be used at any given time. Each step in <strong>Prime</strong> has a<br />

default color scheme. You can use this menu to choose a different one. When amino acid<br />

sequences are added to the sequence viewer, they are colored according to the scheme that is<br />

currently selected.<br />

When the structure associated with a sequence is displayed in the Workspace, it is generally<br />

colored using the same scheme as the sequence. However, protein structures in the Workspace<br />

are sometimes colored using other color schemes:<br />

• When a PDB structure is imported, it is displayed in the Workspace using an error reporting<br />

color scheme in which complete recognized residues are gray, incomplete residues<br />

are red, unknown or nonstandard residues are orange, residues with multiple location<br />

indicators are green, and recognized residues with some unrecognized atoms are blue.<br />

For more information about conversion of PDB structures, see the Maestro <strong>User</strong> <strong>Manual</strong>.<br />

• When the Build Structure step is complete, the color scheme applied to the structure in<br />

the Workspace is Atom PDB B Factor (Temperature Factor), in which blue atoms are those<br />

derived directly from the template, and red atoms have been predicted or modeled. To<br />

restore this color scheme, choose Atom PDB B Factor (Temperature Factor) from the<br />

Color all atoms by scheme Maestro toolbar button.<br />

The <strong>Prime</strong> color schemes are:<br />

• None: All sequences are colored standard gray.<br />

• Color by Sequence: The query sequence is colored standard gray. Each of the other<br />

sequences uses a different color.<br />


• Residue Type: Each amino acid sequence is colored according to the array of residue type<br />

colors.<br />

• Residue Property: Each amino acid sequence is colored according to the array of residue<br />

property colors (fewer colors than residue type).<br />

• Residue Matching: The query and templates are colored in a pairwise manner. Matching<br />

residues are colored according to their residue type, and nonmatching residues are colored<br />

standard gray. If more than one template is chosen, the query will have more colored<br />

residues because the effect is cumulative. If only the query sequence is shown, it is colored<br />

standard gray for all residues.<br />

• Residue Homology: Aligned query and template residues that are identical (highly conserved)<br />

are colored red. Aligned query and template residues that are similar according to<br />

the BLOSUM62 scoring matrix (conserved) are colored yellow. Gray is used for residues<br />

with no homology.<br />

• Secondary Structure: The query sequence is colored standard gray. All other amino acid<br />

sequences should have secondary structure assignments (SSAs), which are colored according to the SSA colors.<br />

• Template ID: Only available for the Comparative Modeling path. The query sequence is<br />

colored standard gray. The predicted structure is colored according to template ID for<br />

each region (using the cycle colors). Each template sequence is colored with its color, but<br />

only in the region that was used (the rest of the template is colored dark gray to indicate it<br />

is not used).<br />

• Proximity: At any step in which the sequence viewer is displaying sequences that have<br />

associated structure (all steps except Input Sequence), color by proximity mode is available.<br />

When you select this color scheme, the current sequence colors do not change<br />

immediately. Instead, when the mouse is over a sequence that has structure, the cursor<br />

changes to indicate that proximity can be calculated. You can then select one or more<br />

contiguous residues in the sequence, and the system will color that sequence based on the<br />

proximity to the selected residues. No other sequences will have modified colors (even if<br />

they have structure too and are in proximity of the selected sequence).<br />

Proximity is calculated based on the shortest distance between any two atoms in two residues,<br />

not including the bonded C and N atoms in the residues next to the selected<br />

residues.<br />

To change the proximity cutoff, choose Preferences from the Display menu and change<br />

the number in the Proximity cutoff text box.<br />
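The proximity test described for the Proximity color scheme can be sketched as a minimum atom-to-atom distance compared against the cutoff. This is a minimal sketch under stated assumptions, not the program's implementation: the function names and the data layout (residue number mapped to atom name mapped to coordinates) are hypothetical, the exclusion of bonded C and N atoms is one interpretation of the rule above, and treating a distance equal to the cutoff as "in proximity" is also an assumption.<br />

```python
import math


def residue_distance(atoms_a, atoms_b):
    """Shortest distance between any atom of one residue and any atom of another."""
    return min(math.dist(p, q) for p in atoms_a.values() for q in atoms_b.values())


def atoms_for_comparison(residues, index, selected):
    """Drop the backbone atom that is bonded to an adjacent selected residue."""
    atoms = dict(residues[index])
    if index + 1 in selected:
        atoms.pop("C", None)   # this C is bonded to N of the next residue
    if index - 1 in selected:
        atoms.pop("N", None)   # this N is bonded to C of the previous residue
    return atoms


def proximal_residues(residues, selected, cutoff=4.0):
    """Return the unselected residues within `cutoff` angstroms of the selection."""
    close = set()
    for index in residues:
        if index in selected:
            continue
        atoms = atoms_for_comparison(residues, index, selected)
        if not atoms:
            continue
        if any(residue_distance(atoms, residues[s]) <= cutoff for s in selected):
            close.add(index)
    return close


if __name__ == "__main__":
    residues = {
        2: {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (3.0, 0.0, 0.0)},
        3: {"N": (4.3, 0.0, 0.0), "CA": (8.0, 0.0, 0.0)},
        10: {"CA": (20.0, 0.0, 0.0)},
        11: {"CB": (3.0, 3.0, 0.0)},
    }
    print(proximal_residues(residues, selected={2}))
```

The default cutoff of 4.0 mirrors the default value of the Proximity cutoff preference.<br />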


Font Size<br />

The font size of letters and numbers in the sequence viewer can be changed using this option.<br />

The font size options are:<br />

• Small<br />

• Medium<br />

• Large<br />

• Huge<br />

View Structure<br />

The options for showing structures in the Maestro Workspace include:<br />

• All<br />

• None<br />

• Predicted Only<br />

View SSA<br />

This option controls whether secondary structure assignments are shown in the sequence<br />

viewer. The default is not to show secondary structure assignments.<br />

Legend<br />

Choose this menu option to open the Legend panel, which includes the assignment of each<br />

color and symbol used in each color scheme and sequence representation. Click the Colors tab<br />

and select a Color scheme from the list, or click the Symbols tab and select a type of Sequence<br />

data, such as Pfam/HMMER or SSP. This option is also available when you right-click on a<br />

sequence in the sequence viewer.<br />

Preferences<br />

Choose this Display menu option to open the Proximity dialog box:<br />

• Proximity cutoff: The value entered in this text box defines proximity for the Proximity<br />

color scheme. The default is 4.00 Å.<br />

2.3.4 The Step Menu<br />

The Step menu provides another way to navigate between steps in <strong>Prime</strong>. It includes the<br />

following options:<br />

Guide<br />

When this check box is selected, the <strong>Prime</strong> Guide is displayed (the default). Deselect the check<br />

box to hide the Guide.<br />

Input Sequence<br />

Click this option to go to the Input Sequence step.<br />

Find Homologs<br />

Click this option to go to the Find Homologs step.<br />

The menu contains one such option for each step in <strong>Prime</strong>–SP. A red diamond is displayed next to<br />

the option for the current step. Options for steps that are not available are dimmed.<br />

2.4 The <strong>Prime</strong> Toolbar<br />

The <strong>Prime</strong> toolbar is the row of buttons just under the <strong>Prime</strong> menu bar. The <strong>Prime</strong> toolbar<br />

provides convenient access to commonly used <strong>Prime</strong> tasks. Pause the mouse over a button to<br />

see its description. These task modes are also available as options in the Edit and Display<br />

menus. The toolbar buttons are shown below as they appear on the toolbar, with their names<br />

and functions.<br />

Select (Edit menu): Select a region of the sequence to see the corresponding residues<br />

highlighted in the 3D Workspace.<br />

Crop (Edit menu): The cropping tool, which can be used in the Build Structure step<br />

to select ends of the query sequence that are not to be used for building.<br />

Edit SSP (Edit menu): Edit secondary structure predictions. Highlight a region of<br />

the SSP, and then type either H for α-helix, E for β-strand, or - for loop.<br />

Set Template Regions (Edit menu): If more than one template was selected in the<br />

Find Homologs step, use this button in the Build Structure step to proceed to the<br />

mode that allows selection of template regions for building.<br />

Slide Freely (Edit menu): Use this button to edit the alignment between query and<br />

template. Selecting a residue and sliding to the right creates N-terminal gaps.<br />

Selecting a residue and sliding to the left creates C-terminal gaps.<br />

Slide As Block (Edit menu): Use this button to edit the alignment between query and<br />

template. Selecting a residue and sliding to the right creates N-terminal gaps.<br />

Selecting a residue and sliding to the left slides all residues as a block to the left.<br />


Add Anchors (Edit menu): Set alignment constraints by setting anchors at residue<br />

positions. To remove anchors, click on the same residue again.<br />

Lock Gaps (Edit menu): To prevent gaps from being collapsed during manual<br />

editing, lock the gaps using this button. Locked gaps are indicated with a - symbol.<br />

Unlock Gaps (Edit menu): To allow gaps to collapse during manual editing, unlock<br />

the gaps using this button. Unlocked gaps are indicated with a ~ symbol.<br />

Color Scheme (Display menu): Select a color scheme for coloring sequences in the<br />

sequence viewer. If there is a structure in the 3D Workspace associated with the<br />

sequence, it will also be colored.<br />

View Structure (Display menu): Use this button to display and undisplay structures<br />

in the Workspace.<br />

View SSA (Display menu): Use this button to turn the secondary structure assignment<br />

(SSA) in the sequence viewer on and off.<br />
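The Edit SSP tool amounts to overwriting a region of a prediction string with one of three codes. The following sketch shows that operation on a plain string. The function edit_ssp and the zero-based, inclusive indexing are assumptions made for illustration and do not correspond to any documented interface.<br />

```python
# The three codes accepted by the Edit SSP tool, as listed above.
SSP_CODES = {"H": "alpha-helix", "E": "beta-strand", "-": "loop"}


def edit_ssp(ssp, start, end, code):
    """Return a copy of `ssp` with positions start..end set to `code`.

    Positions are zero-based and inclusive (an assumption of this sketch).
    """
    if code not in SSP_CODES:
        raise ValueError(f"unsupported SSP code {code!r}; use H, E, or -")
    if not 0 <= start <= end < len(ssp):
        raise IndexError("region lies outside the prediction")
    return ssp[:start] + code * (end - start + 1) + ssp[end + 1:]


if __name__ == "__main__":
    print(edit_ssp("--HHHH----", 6, 8, "E"))
```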

2.5 <strong>Prime</strong> Job Options and Control<br />

Before launching a <strong>Prime</strong> calculation from a step panel, you may want to choose which<br />

host machine the job will run on and check or modify job settings. When a run is complete,<br />

you can incorporate the results into the Maestro Project Table and save them as a project.<br />

2.5.1 The Job Options Dialog Box<br />

The Structure Prediction–Job Options dialog box, shown in Figure 2.2, allows you to specify<br />

which machine each type of <strong>Prime</strong> job will run on, and under which user name. To open the<br />

dialog box, choose Job Options from the File menu on the <strong>Prime</strong> menu bar. The Job Options<br />

dialog box lists 10 types of jobs, from Find Family (run from the Find Homologs step) to Run<br />

Refinement (run from the Refine Structure step). Specify the host machine and the user name<br />

for each type of job. The default host is localhost. Make sure your schrodinger.hosts file<br />

contains a list of available machines to run on. See the Installation Guide for information on<br />

the schrodinger.hosts file.<br />

2.5.2 Setting Up and Launching Jobs<br />

Though you specify the machines used to run jobs in the Job Options dialog box, you select <strong>Prime</strong><br />

job settings and launch jobs from the <strong>Prime</strong> step panels. This can be as simple as clicking an<br />

action button, such as Search. In other steps, you select the type of job from a Task menu and<br />

then launch the job by clicking Run.<br />


Figure 2.2. The Structure Prediction–Job Options dialog box.<br />

You can modify settings for step actions or tasks using the Options button. You select the residues<br />

or secondary structure features on which a refinement task will operate from lists, tables,<br />

and the Atom Selection dialog box in the Refine Structure step panel. Until you make a selection,<br />

the Run button remains unavailable.<br />

For more information on setting up and launching jobs, see the discussions of specific steps in<br />

the chapters that follow.<br />

2.5.3 Monitoring Jobs<br />

After you launch a job, the job status icon in the upper right corner of the step panel turns green<br />

and spins.<br />

To monitor a job using the Monitor panel, click the job status icon. You can use this panel to<br />

monitor jobs from the current session or your previous sessions.<br />

To kill a job, select the job name and then click Kill. If the job still appears to be running, you<br />

may have to go to the Project directory and remove the job file.<br />

For more information about the Monitor panel and monitoring jobs, see Section 3.2 of the Job<br />

Control Guide and the online help.<br />


2.5.4 Adding Structures to the Project Table<br />

In the Build Structure step, predicted structures can be added to the Maestro Project Table as<br />

new entries. This is useful for comparing predicted structures produced from different runs,<br />

and for refining structures.<br />


Chapter 3: <strong>Prime</strong>–Structure Prediction: Initial Steps<br />

This chapter outlines the initial steps in the <strong>Prime</strong>–SP workflow. In the Input Sequence step, a<br />

query sequence is imported. In the Find Homologs step, potential templates are identified, and<br />

a path is chosen: either Comparative Modeling or Threading. Later chapters describe the steps<br />

in each path in which a model is built.<br />

Figure 3.1. The Input Sequence step, initial view.<br />


3.1 Input Sequence Step<br />

This step is used to read in an amino acid sequence from a file or from a protein structure in the<br />

Workspace.<br />

To read a sequence from a file:<br />

• Click From File and specify a file containing the desired sequence.<br />

Several file formats are supported, including FASTA, EMBL, GenBank, and PIR. For a<br />

complete list, see Appendix C. File formats are auto-detected by the program. Sequences<br />

must be in standard uppercase one-letter code.<br />

To read a sequence from the Workspace:<br />

• Click From Workspace to read sequences from proteins in the Workspace.<br />

If there are multiple chains with the same sequence, the sequence will be read in only<br />

once.<br />

After input, the Sequences table in the Step View area shows all available unique sequences<br />

with their corresponding chain lengths in the order in which they are read. By default, the first<br />

sequence read in is displayed in the sequence viewer in the upper section of the panel. If<br />

multiple sequences have been read in, only one can be brought forward to the next step in a<br />

given <strong>Prime</strong> run.<br />

To select a query from the Sequences table:<br />

• Click on a sequence to select it for this run.<br />

This brings the selected sequence into the sequence viewer. The sequences are named by<br />

extracting the name from either the PDB header or the FASTA file header. Non-unique<br />

sequences are ignored. Non-unique names are modified to make them unique.<br />

Note: Sequences longer than 1,000 residues generally include more than one domain. Divide<br />

these sequences into probable domains and treat each domain as an individual query.<br />
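Before importing a file, it can be useful to check the sequences against the requirements listed in this section. The sketch below is a stand-alone pre-check, not part of the program: the function names are hypothetical, only the FASTA format is handled, and "standard one-letter code" is assumed to mean the 20 standard amino acid letters.<br />

```python
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")   # assumed meaning of "standard"
MAX_SINGLE_DOMAIN_LENGTH = 1000                   # threshold from the note above


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_queries(records):
    """Return (unique_records, problems) for a list of (name, sequence) pairs."""
    unique, seen, problems = [], set(), []
    for name, sequence in records:
        bad = sorted(set(sequence) - STANDARD_RESIDUES)
        if bad:
            problems.append(f"{name}: unsupported characters {''.join(bad)}")
            continue
        if sequence in seen:
            continue   # an identical sequence was already read in
        seen.add(sequence)
        if len(sequence) > MAX_SINGLE_DOMAIN_LENGTH:
            problems.append(
                f"{name}: {len(sequence)} residues, consider splitting into domains"
            )
        unique.append((name, sequence))
    return unique, problems


if __name__ == "__main__":
    example = ">chainA\nMKT\nAYI\n>chainB\nMKTAYI\n>lower\nmkt\n"
    print(check_queries(read_fasta(example)))
```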

To continue:<br />

• When you are satisfied with the query sequence, click Next to go to the Find Homologs<br />

step.<br />

If an error message appears when you click Next, you cannot proceed until the problem is<br />

resolved. Error messages are listed in Appendix D. See Section D.1 on page 127 for help<br />

with Input Sequence error messages.<br />

3.2 Find Homologs Step<br />

This step is used to search for homologs of the query sequence. These homologs can then be<br />

used as templates to build a model structure.<br />

3.2.1 Importing Homologs<br />

To import a homolog not in the NCBI sequence database, click Import. This button is also used<br />

to import homologs that have been rotated into a common reference frame when multiple<br />

templates have been selected.<br />

Figure 3.2. The Input Sequence step, after input.<br />


Figure 3.3. The Find Homologs step, initial view.<br />

3.2.2 Searching for Homologs<br />

To launch a search for homologs:<br />

• Click Search (near the middle of the panel).<br />

Depending on what is selected in the Find Homologs–Options dialog box, BLAST or<br />

PSI-BLAST is used to search for templates. The default is to use BLAST to search the nonredundant<br />

PDB database that has been loaded with <strong>Prime</strong>. For search options, see<br />

Section 3.2.3.


To identify a family for the query (optional):<br />

• Click Find Family to run HMMER/Pfam.<br />


This helps identify the family that best matches the query sequence. This job should take<br />

only a few minutes.<br />

The query family name found appears in the Query family name text box with an E-value in<br />

parentheses. If the E-value of the family is too large, it is not meaningful to consider the family.<br />

The sequence viewer shows the match between the query and the consensus sequence of the<br />

family. If this sequence, labeled NAME_pfam, is not visible, check for a plus sign to the left of<br />

the query name: +NAME. Click the plus sign to show any hidden sequences. The match<br />

sequence displays information about which residues are conserved in the multiple sequence<br />

alignment used to generate the Hidden Markov Model (HMM). Capital letters indicate highly<br />

conserved residues, lowercase letters indicate a match to the HMM, + means the match is<br />

conservative, and a blank indicates that the residue does not match the HMM.<br />
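The symbols in the match sequence can be decoded position by position. The sketch below simply applies the four rules stated above to a string. The function name classify_match_line is hypothetical, and any symbol not covered by those rules is labeled "unknown" rather than guessed.<br />

```python
def classify_match_line(match_line):
    """Label each position of a Pfam/HMMER match line."""
    labels = []
    for symbol in match_line:
        if symbol == " ":
            labels.append("no match")
        elif symbol == "+":
            labels.append("conservative")
        elif symbol.isupper():
            labels.append("highly conserved")
        elif symbol.islower():
            labels.append("match")
        else:
            labels.append("unknown")
    return labels


if __name__ == "__main__":
    for symbol, label in zip("Gl+ ", classify_match_line("Gl+ ")):
        print(repr(symbol), label)
```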

3.2.3 Find Homologs Search Options<br />

To change search options:<br />

• Click Options to open the Find Homologs–Options dialog box.<br />

You can use this dialog box to specify a PSI-BLAST or BLAST search, or change the settings<br />

for either type of search.<br />

General options include:<br />

• Database<br />

• Similarity Matrix<br />

• Gap Costs<br />

• Expectation value<br />

• Word size<br />

• Filter query (off by default)<br />

PSI-BLAST options include:<br />

• Inclusion threshold<br />

• Number of iterations<br />

The default database selection is NCBI PDB (all). Other selections include:<br />

• NCBI PDB (non-redundant)—Does not include PDB files with identical sequences.<br />

• NCBI NR—Recommended for PSI-BLAST runs. With this selection, only the sequences<br />

with structures available in the PDB are displayed in the Homologs table.<br />


3.2.4 Search Results<br />


Figure 3.4. The Find Homologs–Options dialog box.<br />

The output from the homolog search appears in the table labeled Homologs: (N found, n<br />

selected). For each potential template, the row contains the ID, the name, score, expectation<br />

value, % identities, % positives, % gaps, and header information (if taken from a PDB file). All<br />

percentages are rounded to the nearest whole number, which may be zero. A description of the<br />

table columns is given in Table 3.1.<br />

You can view the complete text of each table cell as a tooltip by pausing the pointer over the<br />

cell. Columns can be sorted by clicking on the column heading. Click once to sort in<br />

descending order, and click twice to sort in ascending order. The column widths can be<br />

adjusted using the left mouse button to drag the column border.<br />

If no structural coordinates are available for a template, that row is dimmed and cannot be<br />

selected.<br />

If a chain ID is given for a homolog (e.g., Chain A in 1EMS_A), then that chain, not the entire<br />

protein, is the potential template. When aligning multiple templates with multiple chains, you<br />

may need to extract single chains using the getpdb utility, as described in Section D.2.3 of the<br />

Maestro <strong>User</strong> <strong>Manual</strong>.<br />

Once homologs are found, click on a row to view the structure of that template in the Maestro<br />

Workspace. Selecting a template also brings the sequence alignment of the template and the<br />

query sequence into the sequence viewer. Click on another template to deselect the first one and<br />

display the new selection in the Workspace.<br />

After you review the alignments, select the template that you want to build on by clicking on<br />

its row in the table. To select multiple templates, see Section 3.2.5.


Table 3.1. Homologs table data<br />

Column Header: Description<br />

ID: The PDB ID code, including chain name. If a chain name is specified, then the chain, not the entire protein, is the potential template.<br />

Name: Sequence name provided by BLAST.<br />

Score: BLAST bit score.<br />

Expect: BLAST expectation value.<br />

Identities (%): Percentage of residues that are identical between the sequences.<br />

Positives (%): Percentage of residues that are positive matches according to the similarity matrix selected in the Options dialog (default matrix is BLOSUM62).<br />

Gaps (%): Percentage of gaps in both query and homolog as returned by BLAST.<br />

Title: From the PDB TITLE field.<br />

Compound: From the PDB COMPND field.<br />

Source: From the PDB SOURCE field.<br />

Experiment: From the PDB EXPDTA field. This gives the experimental technique used to obtain the structure, e.g., X-RAY DIFFRACTION or NMR.<br />

Resolution: From the PDB REMARK 2 record, where applicable.<br />

Ligands/Cofactors: From the PDB HET records: hetID (three alphanumeric characters) and name for each HET group.<br />

Warning: A warning appears in this column for sequences that may be unusable for model building. See Section D.2.2 for a list of warnings.<br />

3.2.5 Multiple Templates and Structure Alignment<br />

To select more than one template for building, hold down the CTRL key and select another<br />

homolog in the table, or hold down the SHIFT key and select a set of homologs. If more than<br />

one template is selected, they will need to be structurally aligned, that is, rotated to bring them<br />

into a common reference frame, before they can be used in the next step.<br />

• If you have already created a file with structurally aligned multiple templates, import it<br />

using the Import button, as described in Section 3.2.1.<br />

• If you have not yet performed structural alignment on the selected templates, click the<br />

Align Structures button.<br />


Figure 3.5. The Find Homologs step with search results.<br />

If structure alignment fails or does not give the intended result, the problem may be one of the<br />

following:<br />

• The selected templates are structurally too dissimilar for a meaningful alignment.<br />

• One or more of the template structures includes multiple chains, and either<br />

• Two or more templates are chains from one .pdb file (the structure alignment facility<br />

will not align chains from the same file), or<br />

• The chains that were aligned were not the chains you intended to align.



If the problem involves a template with multiple chains, you can use the getpdb utility (see<br />

Section D.2.3 of the Maestro <strong>User</strong> <strong>Manual</strong>) to extract each chain into a separate .pdb file, then<br />

run structure alignment again.<br />

For information on running structure alignment from the command line, see Section 12.1.<br />

3.2.6 Continuing to the Next Step<br />

When you are satisfied with the template (or structurally aligned multiple templates) you have<br />

selected, follow the Comparative Modeling Path by proceeding to the Edit Alignment step.<br />

If no satisfactory templates can be identified by BLAST/PSI-BLAST and you have not<br />

imported a template, you can follow the Threading Path by proceeding to the Fold Recognition<br />

step.<br />

Note: The maximum length of a query in the Threading Path is 1,000 residues. It is recommended<br />

that longer sequences be divided into queries by probable domains, if this<br />

has not been done already, and that the homolog search be repeated for these shorter<br />

queries before entering the Threading Path.<br />

3.2.7 Saving Runs With Different Templates<br />

If you decide later to try a different template, you can avoid overwriting your existing workflow<br />

by starting a new one before you navigate back to Find Homologs to choose a different<br />

homolog. Choose Save As from the File menu. This creates a copy of the current run, allowing<br />

you to select a new template and build on it. You can then compare results from each run.<br />

3.2.8 Error Messages<br />

If there is an error, a message appears when you click Next to go to the next step. See<br />

Section D.2 on page 127 for a list of error messages for Find Homologs.<br />


Chapter 4<br />

Chapter 4: Comparative Modeling: Edit Alignment<br />

4.1 The Comparative Modeling Path<br />

In the initial steps of <strong>Prime</strong>–SP, you imported a query sequence (Input Sequence) and identified<br />

possible templates (Find Homologs). If a query-template match with a substantial<br />

percentage of identical residues has been selected, it is appropriate to continue the structure<br />

prediction process by following the Comparative Modeling path, a series of two <strong>Prime</strong>–SP<br />

steps used to build a model structure of the query sequence based on homology to one or more<br />

templates.<br />

The structure prediction process in the Comparative Modeling path includes the following<br />

steps:<br />

1. A satisfactory query-template sequence alignment is produced in Edit Alignment.<br />

2. The selected query-template alignment is used to build a model structure of the query in<br />

Build Structure.<br />

4.2 Overview of the Edit Alignment Step<br />

In order to build a model structure of the query, the Build Structure step requires a satisfactory<br />

alignment between templates and query. The alignment of templates to query generated in the<br />

Find Homologs step is based only on sequence. Therefore, there is room for improvement. The<br />

Edit Alignment step allows you to improve alignments using various tools before building a<br />

model in the Build Structure step.<br />

There are several ways to produce an alignment for use in Build Structure:<br />

• Accept the BLAST/PSI-BLAST alignment generated in the Find Homologs step and currently<br />

displayed in the sequence viewer.<br />

• Import an alignment.<br />

• Generate or import secondary structure predictions, then use the Align program to create<br />

a new alignment.<br />

• Edit any of these alignments manually.<br />


Figure 4.1. The Edit Alignment panel, initial view.<br />

If you want to search for a family that best matches the query sequence, but did not do so in the<br />

Find Homologs step, click Find Family to run HMMER/Pfam. For more information on this<br />

option, see Section 3.2.2 on page 20.<br />

When you are satisfied with the alignment, click the Next button to proceed to the Build Structure<br />

step.


4.3 Initial and Imported Alignments<br />


4.3.1 Accepting the BLAST/PSI-BLAST Alignment<br />

When you start the Edit Alignment step, the BLAST/PSI-BLAST alignment for the template<br />

you selected in Find Homologs is displayed in the sequence viewer. See Figure 4.1.<br />

If this is the template and the alignment you want to use, click Next and go to Build Structure.<br />

To accept a different template, using its existing alignment, click on the desired template and<br />

then click Next.<br />

4.3.2 Exporting an Alignment<br />

You may want to save the existing alignment for later use before making changes. Click the<br />

Export button to the right of the Alignments table to open the Export Alignments To File panel.<br />

To import the alignment again, see Section 4.3.3.<br />

4.3.3 Importing an Alignment<br />

<strong>Prime</strong> can also use an alignment generated in another program (or exported previously from<br />

<strong>Prime</strong>). The sequences in the alignment file must be exactly the same as those used by <strong>Prime</strong>,<br />

and so must the sequence and structure (PDB) names.<br />

Note: The template PDB ID must be uppercase in the alignment file.<br />
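If an alignment comes from another program, the template name may need to be normalized before import. The Python sketch below uppercases the four-character PDB ID in a name of the form shown earlier (for example, 1EMS_A) and leaves any chain suffix unchanged. The function name is hypothetical, and the sketch operates only on the name string; it does not read or write any of the alignment file formats described in Appendix C.

```python
def normalize_template_id(name):
    """Uppercase the PDB ID part of a template name such as '1ems_A'.

    Assumption: the name is a four-character PDB ID, optionally
    followed by '_' and a chain name, as in the example 1EMS_A.
    """
    pdb_id, separator, chain = name.partition("_")
    if len(pdb_id) != 4:
        raise ValueError("expected a four-character PDB ID: %r" % name)
    # Chain names are left as given, since their case may be significant.
    return pdb_id.upper() + separator + chain
```

For example, `normalize_template_id("1ems_A")` returns `"1EMS_A"`.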

To import an alignment, click the Import button to the right of the Alignments table. This opens<br />

the file selection window (Import Alignments From File). For details of the allowed formats, see<br />

Appendix C.<br />

4.4 Secondary Structure Predictions (SSPs)<br />

4.4.1 Generating SSPs<br />

A secondary structure prediction for the query is required to run the Align program. To run all<br />

available SSP programs, click Run SSP. One secondary structure prediction program (SSpro)<br />

is bundled with <strong>Prime</strong>. However, the optional third-party program PSIPRED is highly recommended<br />

for optimal results, especially for GPCRs. PSIPRED is not available on Windows. See<br />

Appendix A for more information on this program.<br />

When the SSP run is finished, the secondary structure predictions are displayed in the sequence<br />

viewer below the query sequence. If there is a problem running the structure prediction<br />

programs, an error message is displayed.<br />


Figure 4.2. The Edit Alignment step, after Run SSP.<br />

4.4.2 Exporting and Importing SSPs<br />

You may want to save an SSP for later use. Right-click on an SSP for a menu of options,<br />

including deleting the SSP, hiding all SSPs, or opening the Export SSP panel. By default, the<br />

SSP is saved in native Schrödinger format. If there is a problem exporting a file, an error<br />

message is displayed.<br />

To import a secondary structure prediction for the query, click Import SSP and select the SSP<br />

file, which may be in FASTA or Maestro format. If necessary, use the seqconvert utility to<br />

change the file format. (See Section 12.2 for more information on seqconvert.) If the file



contains bad data or the sequence doesn’t match the query sequence, an error message is<br />

displayed.<br />

4.4.3 Deleting or Editing an SSP<br />

If you believe an SSP is incorrect, you may delete it entirely or edit the incorrect portions. To<br />

delete an SSP, right-click on the SSP and choose Delete from the shortcut menu.<br />

To edit an SSP:<br />

1. Click the Edit SSP toolbar button:<br />

2. Highlight the part of the SSP you want to change.<br />

3. Type the character e, h, or -.<br />

E represents strand, H represents helix, and – represents loop.<br />

The highlighted region of the prediction is changed to your choice.<br />
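The effect of this edit on the prediction string can be sketched in Python as follows. This is only an illustration of the E/H/- encoding, assuming the SSP is held as a plain string with one character per residue; the function name and the zero-based, end-exclusive range are conventions of the sketch, not of Prime.

```python
def edit_ssp(ssp, start, end, state):
    """Set positions start..end-1 of an SSP string to one state.

    state is 'e', 'h', or '-' (case-insensitive), standing for
    strand, helix, and loop respectively.
    """
    state = state.upper()
    if state not in ("E", "H", "-"):
        raise ValueError("state must be E, H, or -")
    if not 0 <= start <= end <= len(ssp):
        raise ValueError("region lies outside the prediction")
    # Overwrite the highlighted region, leaving the rest unchanged.
    return ssp[:start] + state * (end - start) + ssp[end:]
```

For example, `edit_ssp("--HHHH--EE", 2, 6, "e")` returns `"--EEEE--EE"`.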

4.4.4 Resetting SSPs<br />

To retrieve an unmodified SSP after editing, click Run SSP. This immediately resets the SSP,<br />

but does not rerun the prediction programs.<br />

4.5 Generated and Modified Alignments<br />

4.5.1 Running the Align Program<br />

Before running an alignment, if you want to take advantage of the special capabilities for<br />

aligning GPCRs, select Align GPCRs. These capabilities include fingerprint matching, a<br />

customized GPCR sequence database, and identification of transmembrane helices.<br />

To take advantage of expert knowledge, one or more pairs of query/template residues that you<br />

know should be aligned can be constrained to remain so, by using alignment constraints.<br />

To constrain a pair of residues to remain aligned:<br />

1. Click the Add Anchors toolbar button.<br />


2. Select one of the residues in the pair that you want to remain aligned during the Align<br />

calculation.<br />

An anchor symbol is displayed in the ruler at the corresponding residue position.<br />

When you are satisfied with the secondary structure predictions for the query, click Align to<br />

start the alignment job.<br />

When Align completes the generation of the new alignment, you may accept it and continue to<br />

the next step or edit it manually as described below.<br />


Figure 4.3. The Edit Alignment step after alignment.


4.5.2 Align Program Technical Details<br />


The goal of the program launched from the Edit Alignment step, Single Template Alignment,<br />

is to generate an accurate alignment between proteins with medium to high sequence identity<br />

(20-90%). Features of the program include:<br />

• A position-specific substitution matrix (PSSM) for the query sequence, derived from<br />

PSI-BLAST, is used to match the template sequence.<br />

• To reduce the effect of inaccuracies in any single secondary structure prediction, an<br />

algorithm derives a composite secondary structure for the query sequence from multiple<br />

predictions. The composite SSP is aligned to the secondary structure assignment (SSA) of<br />

the template.<br />
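The manual does not describe how the composite secondary structure is derived. Purely to illustrate the idea of combining several predictions, the Python sketch below uses a per-residue majority vote, with ties falling back to loop. This is an assumed stand-in technique, not the algorithm used by the Align program, and the function name is hypothetical.

```python
from collections import Counter


def consensus_ssp(predictions):
    """Combine equal-length SSP strings by per-residue majority vote.

    Assumption: each prediction uses E (strand), H (helix), - (loop).
    Ties are resolved as loop; this is a choice made for the sketch.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    composite = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        winners = [state for state, n in counts.items() if n == best]
        composite.append(winners[0] if len(winners) == 1 else "-")
    return "".join(composite)
```

For example, `consensus_ssp(["HHHE-", "HHEE-", "HH-EE"])` returns `"HH-E-"`: the third position is a three-way tie and becomes loop.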

4.5.3 <strong>Manual</strong>ly Editing the Alignment<br />

You can edit any alignment, however obtained, using combinations of these <strong>Prime</strong> toolbar<br />

buttons: Slide Freely, Slide as Block, Add(/Remove) Anchors, Lock Gaps, and Unlock Gaps.<br />

Slide Freely (Edit menu): Edit the alignment between query and template. Selecting<br />

a residue and sliding to the right creates N-terminal gaps. Selecting a residue and<br />

sliding to the left creates C-terminal gaps.<br />

Slide As Block (Edit menu): Edit the alignment between query and template.<br />

Selecting a residue and sliding to the right slides all residues as a block to the right.<br />

Selecting a residue and sliding to the left slides all residues as a block to the left.<br />

Add Anchors (Edit menu): Set alignment constraints by setting anchors at residue<br />

positions. To remove anchors, click on the same residue again.<br />

Lock Gaps (Edit menu): To prevent gaps from being collapsed during manual<br />

editing, lock the gaps using this button. Locked gaps are indicated with a - symbol.<br />

Unlock Gaps (Edit menu): To allow gaps to collapse during manual editing, unlock<br />

the gaps using this button. Unlocked gaps are indicated with a ~ symbol.<br />

To assist in this task, you can view a preliminary model of the query in the Workspace. Select<br />

<strong>Manual</strong> Threading and click Update. The Workspace shows the template structure colored by<br />

query residue property. Each template residue is colored according to the property of the corresponding<br />

query residue in the alignment, revealing the location of charged, polar, and hydrophobic<br />

query residues. Use this option to make sure that insertions/gaps are not in the middle<br />

of regions of secondary structure such as alpha helices or beta sheets. It can also be used to<br />

check that hydrophobic residues point into hydrophobic regions and polar residues point<br />


towards solvent. As you make changes to the alignment, click Update again to see how each<br />

change has altered the mapping of the query’s residue properties onto the template structure.<br />

If there are obvious errors in the alignment (such as gaps in regions of secondary structure), a<br />

warning will appear in the Build Structure dialog box. <strong>Manual</strong> editing can be used to adjust the<br />

alignment before attempting to build the structure again.<br />

4.5.4 Updating the Step View Table<br />

After the Align job is complete, the Step View table is updated with the new alignment score,<br />

the percent identity, the percent similarity, and the percent gaps (all rounded to the nearest<br />

integer). This information can be updated after any change, including manual editing of the<br />

alignment, by clicking the Update button. The changed alignment is also reflected in the<br />

Maestro Workspace display of the structure.<br />

4.6 Error Messages<br />

When you click Next at the end of the Edit Alignment step, the Check Alignment warning may<br />

appear. The warning lists possible alignment problems that you may want to fix before<br />

continuing to the next step. Examples include gaps in secondary structure elements of the template<br />

and gaps in the query that are aligned to secondary structure elements of the template.<br />

If you do not want to make corrections to the alignment, click Continue to go to the next step.<br />

To make corrections based on the information in the error message, make a note of the gap<br />

locations given, click Cancel to close the message box while remaining in Edit Alignment, and<br />

change the alignment accordingly.<br />
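The kind of check behind this warning can be sketched in Python. The example below flags alignment columns where a query gap is aligned to a helix or strand residue of the template, or where a template gap is flanked on both sides by residues in the same helix or strand state. The function name, the input layout, and the exact rules are assumptions made for illustration; the actual Check Alignment logic is not documented here.

```python
def gaps_in_secondary_structure(query, template, template_ss, gap="-"):
    """Return (column, description) pairs for gaps that disrupt template SS.

    query, template: aligned sequences of equal length.
    template_ss: one E/H/- character per template residue (ungapped).
    Columns are zero-based alignment positions.
    """
    if len(query) != len(template):
        raise ValueError("aligned sequences must have equal length")
    if len(template_ss) != sum(1 for t in template if t != gap):
        raise ValueError("template_ss must cover every template residue")

    # Spread the per-residue states over alignment columns;
    # columns where the template has a gap carry no state (None).
    states = iter(template_ss)
    column_ss = [None if t == gap else next(states) for t in template]

    problems = []
    for col, (q, t) in enumerate(zip(query, template)):
        if q == gap and column_ss[col] in ("E", "H"):
            problems.append((col, "query gap in template " + column_ss[col]))
        elif t == gap:
            left = next((s for s in reversed(column_ss[:col]) if s), None)
            right = next((s for s in column_ss[col + 1:] if s), None)
            if left == right and left in ("E", "H"):
                problems.append((col, "template gap inside " + left))
    return problems
```

Gaps that fall in loop regions are not reported, which matches the advice to keep insertions and gaps out of helices and strands.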


Chapter 5<br />

Chapter 5: Comparative Modeling: Build Structure<br />

5.1 The Build Process<br />

The Build Structure step builds a model structure of the query sequence based on the templates<br />

selected in the previous step. Figure 5.1 shows the initial view of the Build Structure step.<br />

Figure 5.1. The Build Structure step, initial view.<br />


The Build process includes the following steps:<br />

1. Copying of the coordinates of backbone atoms for aligned regions and of side chains of<br />

conserved residues<br />

2. Building insertions and closing deletions in the alignment<br />

3. Building tails out to the last residue in the template<br />

The Build process may also include the following steps:<br />

• Side-chain prediction of non-conserved residues<br />

• Minimization of all atoms not derived directly from the template<br />

You can specify which optional steps to perform by clicking Options and making selections in<br />

the dialog box. See Section 5.2.3.<br />

5.2 Preparing to Build<br />

If you want to build a model using multiple templates or including ligands or cofactors, you<br />

must specify how these are to be treated before clicking the Build button to start the building<br />

process.<br />

5.2.1 Multiple Templates (Set Template Regions)<br />

If you intend to use multiple templates to build a model of the query, you will have already<br />

selected them and ensured that they are structurally aligned (i.e., rotated into a common coordinate<br />

frame; see Section 3.2.5). In the Build Structure step, before starting the structure build<br />

by clicking the Build button, you must specify which template to use for each region of the<br />

query. If you do not select regions from other templates, the topmost template in the table will<br />

be used for the entire query.<br />

To enter the mode that allows you to set the template regions, click the Set Template Regions<br />

toolbar button:<br />

Highlight the region of each template in the sequence viewer that you want to use. Click the<br />

Set Template Regions button again to end the selection. (You may need to scroll over and<br />

select again if the desired region is wider than the display window.)<br />

The Set Template Regions tool helps make sure that only one template residue is assigned to a<br />

query residue at one time. Regions of a template not being used to model the query are colored<br />

dark gray (regardless of coloring schemes). When you begin to set template regions, only the<br />


first template is in use. As you select regions in the second template, the tool automatically<br />

deselects the corresponding residues in the first template. If you select regions for use from a<br />

third template, the corresponding regions of the first and second templates are deselected, and<br />

so on.<br />
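The bookkeeping described above, in which each query residue is covered by exactly one template, can be sketched in Python as a list that records the owning template for every query position. The class and method names are hypothetical, and the zero-based, end-exclusive ranges are a convention of the sketch rather than of Prime.

```python
class TemplateRegions:
    """Track which template models each query position."""

    def __init__(self, query_length, n_templates):
        self.n_templates = n_templates
        # Initially the first (topmost) template covers the whole query.
        self.assignment = [0] * query_length

    def select(self, template, start, end):
        """Assign query positions start..end-1 to the given template.

        Reassigning a position implicitly deselects it in whichever
        template owned it before.
        """
        if not 0 <= template < self.n_templates:
            raise ValueError("unknown template index")
        if not 0 <= start <= end <= len(self.assignment):
            raise ValueError("region lies outside the query")
        for position in range(start, end):
            self.assignment[position] = template

    def regions(self, template):
        """Return the (start, end) ranges currently assigned to a template."""
        ranges, start = [], None
        for position, owner in enumerate(self.assignment):
            if owner == template and start is None:
                start = position
            elif owner != template and start is not None:
                ranges.append((start, position))
                start = None
        if start is not None:
            ranges.append((start, len(self.assignment)))
        return ranges
```

Because a position holds a single template index, selecting a region in a third template removes it from the first and second automatically.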

5.2.2 Including Ligands and Cofactors<br />

Any cofactors and ligands present in the template you have chosen (except HOH) are listed in<br />

the Include ligands and cofactors box. Click in the list to select one or more ligands and cofactors.<br />

Your selections are highlighted in yellow in the Workspace.<br />

5.2.3 Build Options<br />

Click Options to open the Build Structure–Options dialog box and specify which optional<br />

model building steps to perform:<br />

• Retain rotamers for conserved residues—retain side-chain rotamers for conserved residues.<br />

• Optimize side chains—Optimize the side chains.<br />

• Minimize residues not derived from templates—minimize residues that are not derived<br />

from the templates. Only available if Optimize side chains has been selected.<br />

By default, all three of the above options are selected.<br />

5.2.4 Omitting Structural Discontinuities<br />

You can disable the building of structural discontinuities arising from insertions, template<br />

junctions, and deletions. The discontinuities are capped with NMA and ACE, so the final structure<br />

is discontinuous. If you choose any of these options, you should check that your final<br />

structure is reasonable.<br />

Figure 5.2. The Build Structure–Options dialog box.<br />


If there is an insertion in the query (a template gap) that requires the building of a long loop,<br />

you can instead cut the query sequence and cap it with NMA and ACE, and not build the loop.<br />

To do this, select the Insertions (template gaps) option under Omit structural discontinuities and<br />

enter the maximum length of loops that will be built in the text box. This option can be<br />

useful where there are long insertions that are not in the region of interest.<br />

Likewise, if there are deletions in the query (a query gap), or if there is a junction between two<br />

templates, you can choose to cap the ends of the two pieces and not try to join them in building<br />

the structure. To do this, select Deletions (query gaps) or Template junctions under Omit structural<br />

discontinuities.<br />

When the structure is built with discontinuities, the results appear to be misaligned in the<br />

sequence viewer. To display the correct alignment, right-click in the sequence viewer and<br />

choose Align by Residue Number.<br />

5.3 Running the Build Structure Job<br />

When you click Build, the model-building program begins, and its log is displayed in real time<br />

in the Log file window (see Figure 5.3). It can also be viewed in the Monitor panel (click the job<br />

status icon in the upper right corner of any <strong>Prime</strong>–SP panel to open the Monitor panel). Finally,<br />

the build log is saved to a log file in the directory where <strong>Prime</strong> is running.<br />

First, the log lists the chosen templates. If you did not intend to choose more than one template,<br />

or if you neglected to align the templates, stop the job and restart it with the correct template.<br />

Jobs can be stopped, paused, and resumed from the Monitor panel. See Section 2.5 on page 14<br />

for more information on Job Control.<br />

The next part of the log file gives information about the build process for template transition<br />

regions, then for insertions, and then for side chains, followed by non-template regions. The<br />

last line of the log describes the diagnosis and success of model building.<br />

Once the model building calculations are complete, the model is displayed in the Workspace<br />

superimposed on the template, using the color scheme Atom PDB B Factor (Temperature<br />

Factor). Blue atoms are those derived directly from the template; red atoms have been<br />

predicted or modeled. To restore this color scheme, choose Atom PDB B Factor (Temperature<br />

Factor) from the Color all atoms by scheme toolbar button menu in the main window.<br />

Build Structure also creates an atom set named non_template_residues which can be found in<br />

the Sets tab of the Atom Selection dialog box. This set of inserted residues is generally chosen<br />

to undergo refinement.<br />


Figure 5.3. The Build Structure step after building.<br />

The sequence viewer contains the query sequence, the templates, and the sequence of the<br />

newly predicted structure in that order. To view only the model structure, select View Structure<br />

from the Display menu and then select Predicted Only. (The other options are All and None.)<br />

You can also run the Build Structure job from the command line. To do so, click Write to write<br />

the input file. The file is written to the current working directory, and is named according to the<br />

run and project. A dialog box informs you of the file name and the command to use, which is:<br />

$SCHRODINGER/bldstruct input-file<br />

For more information on this command, see Section 11.2 on page 88.<br />
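If you want to launch the job from a script rather than typing the command, a minimal Python sketch is shown below. It only assembles and runs the command given above, with no additional options. The function names and the example input file name are hypothetical; use the file name reported in the dialog box. The sketch assumes the SCHRODINGER environment variable is set, as on Linux.

```python
import os
import subprocess


def bldstruct_command(input_file, schrodinger=None):
    """Build the argument list for $SCHRODINGER/bldstruct input-file."""
    root = schrodinger or os.environ.get("SCHRODINGER")
    if not root:
        raise RuntimeError("SCHRODINGER is not set")
    return [os.path.join(root, "bldstruct"), input_file]


def run_bldstruct(input_file):
    """Run the build job and raise CalledProcessError if it fails."""
    return subprocess.run(bldstruct_command(input_file), check=True)


# Hypothetical usage; "myrun.inp" stands in for the written input file:
# run_bldstruct("myrun.inp")
```

Passing the command as a list avoids shell quoting problems if the input file path contains spaces.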


5.4 Saving and Exporting Built Structures<br />

When building is complete, the built structures can be added to the Maestro project by clicking<br />

Add to Project Table. Once you have added the structure to the Project Table, you can refine the<br />

structure by choosing Applications > <strong>Prime</strong> > Refinement in the main window. You can also<br />

export any Project Table entry, such as a built structure, to a file of another format. For information<br />

on exporting structures, see Section 3.2 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />

5.5 Technical Notes for the Build Structure Step<br />

The Build Structure step uses the following resources and methods:<br />

• The OPLS_2005 all-atom force field for energy scoring of proteins as well as for ligands<br />

and other non-amino-acid residues.<br />

• A Surface Generalized Born (SGB) continuum solvation model for treating solvation<br />

energies and effects.<br />

• Residue-specific side-chain rotamer and backbone dihedral libraries, derived from the<br />

non-redundant data sets extracted from the PDB, and representing values most commonly<br />

observed in protein crystal structures. The libraries allow for both the evaluation of existing<br />

rotamer/dihedral values and the systematic prediction of new ones.<br />

Both Build Structure and the step that follows, Refine Structure, can treat the 20 standard<br />

amino acids as well as modified residues and ligands.<br />

Model building uses the following procedure:<br />

1. Non-conserved residues are mutated to the desired identity. Side chains are added by<br />

finding the first rotamer in the library that does not produce a clash.<br />

2. Insertions, deletions, and template transitions are built. These are cases in which the<br />

backbone itself needs to be reconstructed, either due to gaps in the alignment or template<br />

transitions that produce gaps in the 3D structure. These gaps are closed by reconstruction<br />

of the affected region ab initio, using a backbone dihedral library. If no reasonable conformation<br />

can be generated that closes the loop structure, the region being reconstructed<br />

will be expanded until closure is possible.<br />

3. Gap reconstruction is done by finding any single loop conformation that closes the structure<br />

and is physically reasonable. If multiple conformations are found, the one that deviates<br />

from the template structure as little as possible is chosen. No attempt is made to<br />

perform an exhaustive optimization of the region.<br />


4. Side-chain prediction (optional). Side chains may be optimized. The two choices are<br />

either: (a) optimize all side chains from scratch, or (b) optimize only those that are not in<br />

the template.<br />

5. Minimization of non-template regions (optional). Any portions of the backbone that were<br />

not derived directly from one of the templates (and were thus reconstructed ab initio) are<br />

subject to a local minimization as described in Section 7.7 on page 62.<br />
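Two of the selection rules in this procedure can be sketched in Python: taking the first library rotamer that does not clash (step 1), and choosing, among several closing loop conformations, the one that deviates least from the template (step 3). The clash test is left as a caller-supplied function, and RMSD over matched atom coordinates is an assumed measure of deviation; the manual does not specify either. All names are hypothetical.

```python
import math


def first_non_clashing(rotamers, clashes):
    """Return the first rotamer for which clashes(rotamer) is False."""
    for rotamer in rotamers:
        if not clashes(rotamer):
            return rotamer
    return None  # no acceptable rotamer in the library


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and matched")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def closest_to_template(conformations, template_coords):
    """Pick the conformation with the smallest RMSD to the template."""
    return min(conformations, key=lambda c: rmsd(c, template_coords))
```

Neither function searches exhaustively: the first stops at the first acceptable rotamer, and the second only ranks the conformations it is given.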


Chapter 6<br />

Chapter 6: Fold Recognition<br />

When sequence searches in the Find Homologs step of <strong>Prime</strong>–SP fail to find templates, or<br />

query-template sequence identity is low, structure prediction may be continued using the<br />

Threading path. The Threading path involves use of Fold Recognition to find candidate “seed”<br />

templates, followed by a return to the Comparative Modeling path with a selection of<br />

templates. Fold recognition is not available on Windows.<br />

6.1 Overview of the Fold Recognition Step<br />

The Fold Recognition step is designed to find structural templates that could not be found by a<br />

sequence search. The use of secondary structure matching is important in finding remote<br />

(



6.2.2 Generating SSPs<br />


Figure 6.1. The Fold Recognition panel, initial view.<br />

To generate SSPs for the SSP profile, run all available SSP programs by clicking the Run SSP<br />

button. One secondary structure prediction program (SSpro) is bundled with <strong>Prime</strong>. However,<br />

an optional third-party program is recommended for optimal results. See Appendix A for more<br />

information on this program.


Figure 6.2. The Fold Recognition panel after Run SSP.<br />

When the secondary structure prediction jobs finish, the SSPs are displayed in the sequence<br />

viewer below (as children of) the query sequence. (If an error occurs when running a structure<br />

prediction program, an error message is printed to the log file displayed in the Monitor panel.)<br />

6.2.3 Editing or Deleting SSPs<br />

You can use your expert knowledge to improve the query’s SSP profile by deleting or editing<br />

SSPs you believe are incorrect.<br />

To delete an SSP, right-click on the SSP in the sequence viewer and choose Delete from the<br />

shortcut menu.<br />

To edit an SSP:<br />

1. Click the Edit SSP toolbar button:<br />

2. Highlight the part of the SSP you want to change.<br />

3. Type the character e, h, or -.<br />

E represents strand, H represents helix, and – represents loop.<br />

The highlighted region of the prediction is changed to your choice.<br />
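The effect of an edit can be pictured as overwriting a stretch of the prediction string with a single state character. The following is a minimal sketch of that idea, not Prime code: the function name `edit_ssp` and the 0-based, end-exclusive indexing are assumptions made for illustration.

```python
def edit_ssp(ssp: str, start: int, end: int, state: str) -> str:
    """Return a copy of ssp with positions start..end-1 set to state.

    Indexing is 0-based and end-exclusive (an assumption of this sketch).
    """
    state = state.upper()  # the manual accepts e, h, or - as typed input
    if state not in {"E", "H", "-"}:
        raise ValueError("state must be E (strand), H (helix), or - (loop)")
    if not 0 <= start <= end <= len(ssp):
        raise ValueError("region lies outside the prediction")
    return ssp[:start] + state * (end - start) + ssp[end:]


original = "--HHHHHH--EEEE--"
# Change the predicted strand (positions 10-13) to helix.
print(edit_ssp(original, 10, 14, "h"))  # --HHHHHH--HHHH--
```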

6.2.4 Resetting SSPs<br />

To retrieve an unmodified SSP after editing, click Run SSP. This immediately resets the SSP<br />

(but does not rerun the prediction programs).<br />

6.3 Searching for Templates<br />

When you are satisfied with the SSP profile for the query, click Search to run the Fold Recognition<br />

program. You may prefer to change the search options described in the following section<br />

before doing so.<br />

6.3.1 Search Program Options<br />

Display of Results<br />

Show Top N Results<br />

By default, the 100 top-ranked templates found are shown in the Templates table. This number<br />

can be changed in the Show Top N Results text box before clicking Search.<br />

Z-Scoring<br />

Searching can be done with or without Z-scoring. The Z-score shows how significant a match<br />

is relative to a random match. It is calculated by shuffling the query sequence. See Section 9.2<br />

on page 71 for more information on Z-score filtering.<br />
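The idea of a shuffle-based Z-score can be sketched as follows: score the real query, score many shuffled copies with the same composition, and express the real score in standard deviations above the shuffled mean. This is only an illustration under stated assumptions. The scoring function `identity_score`, the toy `TEMPLATE`, and the number of shuffles are hypothetical and are not the scoring used by the search program.

```python
import random
import statistics


def shuffle_z_score(query, score_fn, n_shuffles=100, seed=0):
    """Z-score of score_fn(query) relative to shuffled copies of query."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    real_score = score_fn(query)
    residues = list(query)
    background = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # same composition, random order
        background.append(score_fn("".join(residues)))
    mean = statistics.mean(background)
    spread = statistics.pstdev(background)
    if spread == 0:
        raise ValueError("shuffled scores do not vary; Z-score is undefined")
    return (real_score - mean) / spread


TEMPLATE = "ACDEFGHIKLMNPQRSTVWY"  # toy template sequence (hypothetical)


def identity_score(sequence):
    """Toy score: number of positions identical to TEMPLATE."""
    return sum(a == b for a, b in zip(sequence, TEMPLATE))


z = shuffle_z_score(TEMPLATE, identity_score)
print(f"Z-score of the unshuffled query: {z:.1f}")
```

Because every shuffled sequence must be scored as well, the cost grows with the number of shuffles, which is consistent with the longer run times reported for Z-scoring.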


Figure 6.3. The Fold Recognition–Options dialog box.<br />

You may want to turn on Z-scoring if the query protein is any of the following:<br />

• Less than 120 residues long<br />

• More than 90 percent alpha (helical)<br />

• More than 90 percent beta (strand)<br />

Click Options to open the Fold Recognition–Options dialog box and turn on Z-scoring.<br />

Note: Running the search program with Z-scoring on a query with 300 residues typically<br />

requires three to four hours. Without Z-scoring, running a search on a protein of<br />

similar length typically takes only 10-15 minutes.<br />
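The three criteria listed above for turning on Z-scoring can be written as a simple check. In this sketch the helical and strand fractions are estimated from a secondary structure string that uses the E, H, and - characters. That choice is an assumption: the manual does not say how the percentages should be determined, and the function name is hypothetical.

```python
def should_consider_z_scoring(ssp: str) -> bool:
    """Apply the manual's three criteria to a secondary structure string."""
    length = len(ssp)
    if length == 0:
        raise ValueError("empty secondary structure string")
    states = ssp.upper()
    helix_fraction = states.count("H") / length
    strand_fraction = states.count("E") / length
    return length < 120 or helix_fraction > 0.90 or strand_fraction > 0.90


print(should_consider_z_scoring("H" * 95 + "-" * 5))    # short query: True
print(should_consider_z_scoring("H" * 190 + "-" * 10))  # 95% helix: True
print(should_consider_z_scoring("H" * 100 + "E" * 100 + "-" * 100))  # False
```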

6.3.2 Search Program Technical Details<br />

The search program finds potential templates for model building by searching a database of<br />

structure folds (SCOP domains) generated from the PDB using secondary structure information.<br />

Profile-sequence matching and composite secondary structure information are the same as in<br />

the Align program used in Edit Alignment in the Comparative Modeling path. The search<br />

program used in the Threading path has two additional features:<br />

• Weighting is adjusted by degree of structural conservation inside a family of templates.<br />

Each residue in the template sequence is defined as conserved or variable according to<br />

Multiple Structure Alignment (MSTRA) of the template with its structural neighbors.<br />

When aligning conserved residues, higher weight is given for matching and higher penalty<br />

is given for opening gaps, and vice versa for variable residues.<br />

• Optional Z-scoring, as described above and discussed in more detail in Section 9.2 on<br />

page 71.<br />

6.3.3 Search Output<br />

When the search is complete, the Templates table displays candidate templates, ranked from<br />

the closest homolog to the least close. This preliminary ranking is based on a scoring function<br />

that combines sequence, secondary structure, and degree of structural conservation. If Z-scoring<br />

is on when Search is launched, rankings take Z-score into account as well.<br />

Figure 6.4. The Fold Recognition panel with Search results.<br />

While it is possible that the top-ranked template will yield good results, it is recommended that<br />

multiple templates be used in model building. By default, the top 25 templates are selected.<br />

You can select the desired templates using control-click and shift-click. To sort the templates<br />

by properties other than rank, click the appropriate column heading.<br />

Optionally, if you have not already searched for the family of the query, you can click Find<br />

Family to run HMMER/Pfam. The query family name found, with an E-value in parentheses,<br />

appears in the Query family name text box. The sequence viewer shows the match between the<br />

query and the consensus sequence of the family, as described in Section 3.2.2 on page 20. If a<br />

potentially meaningful (E-value under 1) family identification is found, it may help you select<br />

different or additional seed templates to bring forward to Build Backbone.<br />

To view a template colored by secondary structure, click the check box in the V (Visualization)<br />

column. The template appears in the sequence viewer, and its secondary structure can be<br />

compared to the SSP profile of the query.<br />

Note: The query-template “alignments” that can be visualized in this step are provided only<br />

as a guide in identifying seed templates. Gaps or other unfavorable features will not<br />

affect the quality of the models built using these templates.<br />

6.4 Evaluating Templates<br />

Before the seed templates can be used to build a model, they must be evaluated for their suitability<br />

for comparative modeling. The evaluation uses two scripts: you first run an alignment script<br />

for each template, then run an evaluation script on the resulting alignments.<br />

The first step is to export the SSPs in CASP format (or copy them from the working directory<br />

so that they have a .casp extension). It is recommended that you use both SSPs.<br />

For each template that you want to use, run the alignment script as follows:<br />

$SCHRODINGER/run -FROM psp CMAlign.pl target template seqFile sspFiles<br />

The arguments for this command are described in Table 6.1.<br />

Table 6.1. Arguments for the CMAlign.pl script.<br />

Argument | Description<br />

target | Name of the query sequence.<br />

template | Name of the template, as shown in the Search table of the Fold Recognition step, including any underscore characters.<br />

seqFile | Target sequence file.<br />

sspFiles | CASP-formatted secondary structure prediction files (.casp), blank-separated.<br />

The alignments are stored in the subdirectory CMAlign. Once you have run CMAlign.pl for<br />

each of the desired templates, you can run the evaluation script:<br />

cd CMAlign<br />

$SCHRODINGER/run -FROM psp checkAlign.pl target<br />

The script provides a recommendation for each template, and provides some information on<br />

the alignment.<br />
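When several templates are to be evaluated, the command lines can be generated rather than typed by hand. The sketch below only assembles and prints the two command forms given in this section. It does not run them, and the template names, file names, and function names are hypothetical placeholders.

```python
def cmalign_command(target, template, seq_file, ssp_files):
    """Argument list for one CMAlign.pl run, in the order given in Table 6.1."""
    return ["$SCHRODINGER/run", "-FROM", "psp", "CMAlign.pl",
            target, template, seq_file, *ssp_files]


def checkalign_command(target):
    """Argument list for the evaluation script, run from the CMAlign directory."""
    return ["$SCHRODINGER/run", "-FROM", "psp", "checkAlign.pl", target]


templates = ["1abc_A", "2xyz_B"]       # hypothetical template names
ssp_files = ["a.casp", "b.casp"]       # hypothetical exported SSP files

for template in templates:
    print(" ".join(cmalign_command("query", template, "query.seq", ssp_files)))
print("cd CMAlign")
print(" ".join(checkalign_command("query")))
```

The printed lines can be pasted into a shell or saved as a script. `$SCHRODINGER` is left unexpanded so that the shell resolves it.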

6.5 Building a Model<br />

Based on the recommendations of the script, you can use the Comparative Modeling path to<br />

build a model on any of the suitable templates. To do so, the template must be exported and<br />

read back into <strong>Prime</strong>, which you can do with the following procedure:<br />

1. In the Fold Recognition step, include the template in the Workspace (click the V column).<br />

2. Click the Create entry from Workspace main toolbar button to create a project entry.<br />

3. Export the project entry as a PDB file.<br />

4. In the Structure Prediction panel, create a new run and read in the query sequence again.<br />

5. Click Next.<br />

6. In the Find Homolog step, import the saved template PDB file.<br />

There is no need to run Search because a template was manually imported.<br />

7. Select the imported template, and follow the Comparative Modeling path (or click Edit<br />

Alignment in the Guide).<br />

8. In the Edit Alignment step, click Import SSP and select the .casp SSP files that you used<br />

for the analysis.<br />

9. Click Align to run the alignment.<br />

From this point, you can continue to the Build Structure step as normal.<br />

Chapter 7: <strong>Prime</strong>–Refinement<br />

<strong>Prime</strong>–Refinement is a module used to refine protein structures from the Maestro Workspace.<br />

The methods can be applied to protein structures from any source, including one built using the<br />

<strong>Prime</strong>–Structure Prediction workflow. It runs independently of <strong>Prime</strong>–SP from its own<br />

Maestro panel, the Refinement panel. The Refinement panel offers the following refinement<br />

protocols: loop refinement, side-chain prediction, minimization, and a single-point energy<br />

calculation at the current geometry of the model structure.<br />

In many cases, <strong>Prime</strong>–Refinement is used on protein structures from sources other than <strong>Prime</strong>,<br />

but it can also be used with structures added to a project from the Build Structure step in the<br />

<strong>Prime</strong>–SP module. A model structure built in the Build Structure step may require further<br />

refinement. If there are insertions or deletions in the alignment between query and template, it<br />

is recommended that you perform a loop refinement, since insertions and deletions are most<br />

frequently found in loop regions. As another example, if you have added a water molecule to<br />

the binding site of the protein, you can use <strong>Prime</strong>–Refinement on the composite entry.<br />

<strong>Prime</strong>–Refinement can also refine structures with covalently bound ligands. The ligands can be<br />

multiply connected, covalently bound to each other and to the protein. This capability allows<br />

the refinement of proteins with phosphorylated residues or attached sugars, for example.<br />

<strong>Prime</strong>–Refinement has an implicit membrane model, in which the membrane is modeled by a<br />

slab of low dielectric constant in which the protein is immersed. The width of the slab and the<br />

orientation of the protein with respect to the slab can be adjusted.<br />

7.1 Preparing Structures for Refinement<br />

<strong>Prime</strong>–Refinement has a great deal of flexibility in the types of structures it can handle, and can<br />

fix many structural problems. For standard residues, <strong>Prime</strong>–Refinement can fix formal charges<br />

and bond orders, and correct disparities between the sequence and the structure. For example,<br />

if a residue has the coordinates of ALA but is called SER, the 3HB will be ignored and the OG<br />

and HG added during refinement. The standard residues are the 20 canonical amino acids;<br />

ACE and NMA; HOH; CYX (disulfide); and the acid/base variants ASH/AS1 (ASP), GLH/<br />

GL1 (GLU), ARN (ARG), LYN (LYS), HIE/HIP/HID (HIS), CYT (CYS), SRO (SER), TYO<br />

(TYR).<br />
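Before a refinement, it can be useful to list the residues that fall outside this standard set, since those are the ones whose bond orders and formal charges need checking. The sketch below scans the fixed columns of PDB ATOM and HETATM records. The residue names come from the list above; the function names and the example records (including the phosphoserine code SEP) are illustrative assumptions, not part of Prime.

```python
CANONICAL = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
# Caps, water, disulfide cysteine, and the acid/base variants listed above.
OTHER_STANDARD = {
    "ACE", "NMA", "HOH", "CYX", "ASH", "AS1", "GLH", "GL1", "ARN", "LYN",
    "HIE", "HIP", "HID", "CYT", "SRO", "TYO",
}
STANDARD = CANONICAL | OTHER_STANDARD


def nonstandard_residues(pdb_lines):
    """Return (chain, residue number, residue name) for residues not in STANDARD."""
    found = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed PDB columns: residue name 18-20, chain 22, residue number 23-26.
        key = (line[21], int(line[22:26]), line[17:20].strip())
        if key[2] not in STANDARD and key not in found:
            found.append(key)
    return found


def pdb_atom_line(record, serial, atom, residue, chain, number):
    """Build a minimal fixed-column ATOM/HETATM line for the demonstration."""
    return (f"{record:<6}{serial:>5} {atom:<4} {residue:>3} {chain}{number:>4}"
            f"    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


example = [
    pdb_atom_line("ATOM", 1, " CA", "ALA", "A", 1),
    pdb_atom_line("ATOM", 2, " CA", "HIE", "A", 2),
    pdb_atom_line("HETATM", 3, " P", "SEP", "A", 3),
]
print(nonstandard_residues(example))  # [('A', 3, 'SEP')]
```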

<strong>Prime</strong>–Refinement has certain conditions that must be met by the input structures for a job to<br />

be run successfully. Some of these conditions only apply to covalently bound ligands.<br />

For all kinds of structures, the following condition must be met:<br />

• The structure must be an all-atom structure.<br />

This condition applies to the protein and any ligand, waters or cofactors present in the<br />

structure: all hydrogens must be present.<br />

You can add hydrogens to a structure by displaying the structure in the Workspace and<br />

double-clicking the Add hydrogens button in the toolbar.<br />

However, for structures containing nonstandard residues, you must correct bond orders<br />

and formal charges first.<br />

For structures with nonstandard residues, the following conditions must be met:<br />

• No residue can consist of disconnected pieces without formal bonds.<br />

This means, among other things, that metals should not have formal bonds to the rest of<br />

the structure, and should be treated as ionic. For example, in hemes the Fe must be a separate<br />

residue with a +2 formal charge and no bonds, and the “bonded” N atoms must have<br />

a –1 formal charge.<br />

• Bound residues must contain at least 3 atoms between bonds to other residues.<br />

This means that, for example, a dihedral angle cannot span more than 2 residues.<br />

• Bond orders and formal charges must be corrected.<br />

For nonstandard residues the supplied structure will be used, and you must check for correct<br />

bond orders and formal charges.<br />

When you rename residues, you should be aware that OPLS_2005 parameters are used for<br />

standard residues and nonstandard residues.<br />

Proteins can also be used as ligands, and the same conditions apply. No special treatment is<br />

needed for two distinct chains connected by a disulfide or similar side-chain linkage. If the<br />

“ligand” chain has nonstandard terminal groups or is connected to the main chain by its backbone,<br />

two steps are required to make sure that the “ligand” chain is treated correctly:<br />

1. Rename standard residues appearing in the ligand to nonstandard names.<br />

2. Rename the atoms in inter-residue C–N bonds between renamed (ligand) residues. For<br />

example, change “_C__” to “_C*_”, where the underscores represent spaces. Other<br />

atoms can keep their PDB names; only one of C or N needs to be changed.<br />
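If the structure is available as a PDB file, the atom renaming in step 2 can also be done by editing the atom name field, which occupies columns 13 to 16 of each ATOM or HETATM record. The sketch below shows the string operation only. The residue name LG1 and the function name are hypothetical, and the Build panel procedure described later in this section remains the documented route.

```python
def rename_atom(line, old_name, new_name, residue_names):
    """Replace a four-column PDB atom name in records of the given residues."""
    if len(old_name) != 4 or len(new_name) != 4:
        raise ValueError("PDB atom names occupy exactly four columns")
    is_atom = line.startswith(("ATOM", "HETATM"))
    if is_atom and line[17:20] in residue_names and line[12:16] == old_name:
        return line[:12] + new_name + line[16:]
    return line


# Record built piece by piece so that the fixed columns are explicit.
line = (
    "ATOM  "    # columns 1-6: record type
    + "    5"   # columns 7-11: atom serial number
    + " "       # column 12
    + " C  "    # columns 13-16: atom name
    + " "       # column 17: alternate location
    + "LG1"     # columns 18-20: residue name (hypothetical renamed residue)
    + " B"      # columns 21-22: blank and chain
    + "   3"    # columns 23-26: residue number
)
renamed = rename_atom(line, " C  ", " C* ", {"LG1"})
print(repr(renamed[12:16]))
```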

If your structure does not meet one or more of the conditions outlined above, you must fix it<br />

before you can successfully run a refinement. Some of the problems only become apparent<br />

when you have run a refinement job, so it can be a useful diagnostic procedure to run a <strong>Prime</strong><br />

energy calculation first. This job is quick, and produces warnings in the log file, which you can<br />

check in the Monitor panel.<br />

Many of the preparation tasks are performed automatically or interactively with the Protein<br />

Preparation Wizard. The Protein Preparation Wizard panel can be opened from the Workflows<br />

menu on the main toolbar. When the preparation has finished, you should check that all<br />

changes made are correct. You can fix any remaining errors with the procedures below. For<br />

more information on the Protein Preparation Wizard, see the Protein Preparation Guide.<br />

Procedures for fixing bond orders, formal charges, and atom names are given below. These<br />

procedures use the Build toolbar and the Build panel, which you can display by clicking Show/<br />

Hide the Build toolbar on the main toolbar or opening the panel from this button menu.<br />

To assign bond orders:<br />

1. Choose Assign Bond Orders from the Tools menu.<br />

2. Inspect the residues in the structure for any remaining bond orders that are incorrect.<br />

3. If there are still incorrect bond orders, click the Increment bond order or Decrement bond<br />

order button on the Build toolbar.<br />

4. Click the bonds in the Workspace structure that need correction.<br />

You can also right-click the bonds in the Workspace structure, and choose the correct<br />

order from the Order submenu of the shortcut menu.<br />

To view and change formal charges:<br />

1. From the Label atoms button menu on the main toolbar, choose Formal Charge.<br />

Charges are displayed for atoms that are charged.<br />

2. Click the Increment formal charge or Decrement formal charge button on the Build toolbar.<br />

3. Click on an incorrectly charged atom in the Workspace until its charge is correct.<br />

The formal charge is incremented or decremented by one unit for each click.<br />

To change PDB atom names:<br />

1. From the Label atoms button menu on the main toolbar, choose PDB Atom Name.<br />

2. In the Atom Properties tab of the Build panel, choose PDB Atom Name from the Property<br />

option menu.<br />

3. Ensure that Pick is selected in the Apply PDB atom name section, and that Atoms is chosen<br />

from the Pick option menu.<br />

4. Zoom in on one of the residues (middle+right mouse buttons or mouse wheel).<br />

5. For each atom that needs renaming, enter the desired PDB atom name in the PDB atom<br />

name text box, then click on the atom.<br />

6. Repeat Step 4 and Step 5 for each residue whose atoms need renaming.<br />

To change residue names:<br />

1. From the Label atoms button menu on the main toolbar, choose Residue information.<br />

2. In the Residue Properties tab of the Build panel, choose Residue Name from the Property<br />

option menu.<br />

3. Ensure that Pick is selected in the Apply residue PDB name section, and that Residues is<br />

chosen from the Pick option menu.<br />

4. For each residue that needs renaming, enter the desired PDB name in the Residue PDB<br />

name text box, then click on the residue.<br />

You might need to zoom in to select the residues (middle+right mouse buttons or mouse<br />

wheel).<br />

7.2 Using the Refinement Panel<br />

The four tasks that can be performed in the Refinement panel are single-point energy calculation,<br />

loop refinement, side-chain prediction, and minimization. These tasks are discussed in the<br />

following sections. The general procedure for running a structure refinement is given below.<br />

To run structure refinement tasks:<br />

1. Display the structure you want to refine in the Workspace.<br />

If the structure has alternate positions, choose the desired set of alternate positions.<br />

2. Choose Applications > <strong>Prime</strong> > Refinement from the main Maestro window.<br />

The Refinement panel is displayed.<br />

3. Select a task from the Task menu.<br />

The default task is Refine loops.<br />

4. If you want to use an implicit membrane model, select Use implicit membrane, then click<br />

Set Up Membrane to define the thickness and orientation of the membrane.<br />

5. Select the region for refinement, as appropriate.<br />

When you have selected a region for refinement, the Start and Write buttons become<br />

available.<br />

6. Click Start.<br />

The Start dialog box is displayed. If you want to run the job later, using refinestruct,<br />

you can click Write to create the input files.<br />

7. Make changes to the job settings, as appropriate.<br />

8. Click Start.<br />

The refinement job begins and the Monitor panel is displayed.<br />

The Start and Write buttons remain dimmed until you have specified some part of the structure<br />

for refinement. When you have made a selection, the Start and Write buttons become available.<br />

The Write button writes the input file for the job, which you can run at a later time.<br />

When selecting regions for refinement, you may want to view information about possible<br />

structural problems. This information is available in the Protein Reports panel, which can be<br />

opened from the Tools menu on the main menu bar. For more information on this panel, see<br />

Section 9.5.3 of the Maestro <strong>User</strong> <strong>Manual</strong>.<br />

The OPLS_2005 force field is used for all refinement tasks.<br />

Figure 7.1. The <strong>Prime</strong> Membrane Setup panel.<br />

7.3 Setting Up an Implicit Membrane<br />

The <strong>Prime</strong> implicit membrane model is intended for proteins embedded in a membrane. The<br />

implicit membrane is a low-dielectric slab-shaped region, which is treated in the same way as<br />

the high-dielectric implicit solvent region. Hydrophobic groups, which normally pay a solvation<br />

penalty for creating their hydrophobic pocket in the high dielectric region, do not have to<br />

pay that penalty while in the membrane slab. Conversely, hydrophilic groups lose any short-ranged<br />

solvation energy from the high dielectric region when moving into the low dielectric<br />

region.<br />

The implicit membrane model is intended for use with proteins that span the membrane. The<br />

slab is trimmed for efficiency, so a structure that is too small or too far away from the<br />

membrane is likely to cause too much trimming and might not produce useful results.<br />

The membrane is marked in the Workspace by a semitransparent cube with a white line that is<br />

perpendicular to the surfaces of the membrane (which are planar). The membrane surfaces are<br />

colored red, and the other faces of the cube (which are interior to the membrane) are colored<br />

green. You can control whether the sides of the cube are displayed with the Show cube sides<br />

option. The direction of the line shows the orientation of the membrane. You can adjust both the<br />

thickness and the orientation of the membrane.<br />

To set up an implicit membrane:<br />

1. Click Set Up Membrane.<br />

The Membrane Setup panel opens.


2. Click New Membrane (Auto-Place).<br />

A membrane is placed, marked in the Workspace as described above.<br />

3. Select Adjust membrane position.<br />

4. Use the middle mouse button to rotate the membrane and the right mouse button to translate<br />

the membrane.<br />

5. Adjust the thickness by entering a value in the text box or using the arrow buttons to<br />

adjust the thickness up or down by 0.1 Å at a time.<br />

6. Deselect Adjust membrane position.<br />

This is necessary to exit membrane adjustment and return to normal Workspace<br />

operations.<br />

If you want to add the membrane to the entries that are selected in the Project Table, click Save<br />

to Selected Entries. The membrane is stored as a set of coordinates for each end of the white<br />

line. The entries that you select should therefore contain proteins that occupy approximately<br />

the same region of space as the protein used to define the membrane. Otherwise, you will have<br />

to adjust the membrane location for these entries.<br />

If you already have a membrane defined for another project entry, you can load it for the<br />

current entry by selecting the entry that has the membrane and clicking Load from Selected<br />

Entry.<br />

If the protein has pockets that would be occupied by solvent (water), you can select these<br />

regions to exclude from the implicit membrane using the tools in the Region to exclude from<br />

membrane section. The solvent dielectric constant is then used for these excluded regions.<br />

Excluded regions are defined by a set of spheres that are placed on atoms adjacent to the<br />

region. The spheres can overlap, and should cover the entire excluded region. It does not matter<br />

that they overlap the protein, because the protein dielectric is not affected. To define a region,<br />

select Place exclusion sphere and pick atoms in the Workspace that are adjacent to the region.<br />

For each pick a sphere is placed on the atom. You can remove the sphere by picking the atom<br />

again. To adjust the sphere volume, use the Exclusion sphere radius slider. This slider only uses<br />

integer values, because the exact size of the sphere isn’t critical: the spheres only have to cover<br />

the excluded region.<br />
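The geometry described in this section can be summarized as a lookup of the dielectric constant at a point outside the protein: the solvent value inside any exclusion sphere, the membrane value inside the slab, and the solvent value elsewhere. The sketch below is a simplified picture under stated assumptions. It ignores the protein interior and the slab trimming, the slab is described by a center point, a normal vector, and a thickness, and the numerical dielectric values in the example are illustrative rather than Prime defaults.

```python
import math


def region_dielectric(point, center, normal, thickness,
                      membrane_eps, solvent_eps, exclusion_spheres=()):
    """Dielectric constant at point for a slab membrane with exclusion spheres."""
    # Excluded regions use the solvent dielectric, even inside the slab.
    for sphere_center, radius in exclusion_spheres:
        if math.dist(point, sphere_center) <= radius:
            return solvent_eps
    norm = math.sqrt(sum(n * n for n in normal))
    if norm == 0:
        raise ValueError("membrane normal must be a nonzero vector")
    # Signed distance from the mid-plane of the slab, measured along the normal.
    offset = sum((p - c) * n for p, c, n in zip(point, center, normal)) / norm
    if abs(offset) <= thickness / 2:
        return membrane_eps
    return solvent_eps


slab = dict(center=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0), thickness=30.0,
            membrane_eps=2.0, solvent_eps=80.0)  # illustrative values only

print(region_dielectric((5.0, 5.0, 10.0), **slab))   # inside the slab
print(region_dielectric((0.0, 0.0, 20.0), **slab))   # outside the slab
print(region_dielectric((1.0, 0.0, 0.0), **slab,
                        exclusion_spheres=[((0.0, 0.0, 0.0), 3.0)]))  # excluded
```

Overlapping spheres need no special handling in this picture, since a point only has to lie inside one of them to be treated as solvent.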

Figure 7.2. The Energy Analysis task in the Refinement panel.<br />

7.4 <strong>Prime</strong> Energy Analysis<br />

To perform a single-point molecular mechanics energy calculation, choose Energy Analysis<br />

from the Task option menu. The calculation uses the OPLS_2005 all-atom force field for<br />

protein residues as well as for ligands and cofactors. If desired, set up an implicit membrane<br />

model. Click Start, make job settings in the Start dialog box, then click Start to run the job.<br />

The output includes a breakdown of the energy into covalent, Coulombic, van der Waals and<br />

solvation energy contributions. Each of these is added as a Maestro property to the output<br />

structure file, named <strong>Prime</strong> Covalent, <strong>Prime</strong> Coulomb, <strong>Prime</strong> vdW, and <strong>Prime</strong> Solvation. The<br />

output also includes a breakdown of the energy on a per-residue basis.<br />

7.5 Refining Loops<br />

<strong>Prime</strong>–Refinement is capable of refining loop structures of various lengths, and provides algorithms<br />

for different loop lengths. In addition, loops whose structure affects other loops can be<br />

cooperatively refined in pairs.<br />

To refine one or more loops serially:<br />

1. Choose Refine Loops from the Task menu.


Figure 7.3. The Refine loops task in the Refinement panel.<br />

2. Click Load from Workspace.<br />

The loop features of the Workspace structure appear in the Loops table. For each loop, the<br />

table includes a Run check box, a feature name such as loop1, and the numbers of the first<br />

(Res1) and last (Res2) residues in the loop. When you click a row in the table to select a<br />

loop, the Workspace shows the loop residues highlighted in yellow. Note that loops of<br />

one or two residues are not listed, as they are unsuitable for the refinement protocol.<br />

3. Select a loop by clicking its name in the Feature column.<br />

The loop row is highlighted in yellow. The default settings for the loop are displayed.<br />

4. If necessary, shorten the loop region to be refined by changing the first and last residue<br />

numbers (Res1 and Res2).<br />

The time required to refine a loop scales approximately linearly with the length of the<br />

loop, and the sampling options recommended for loops longer than five residues are also<br />

more time-intensive.<br />

5. (Optional) Click Options to select a sampling method and make settings for this loop in<br />

the Structure Refinement Options dialog box.<br />

If the loop is longer than five residues, you should select a sampling method other than<br />

the default.<br />

See Section 7.8.3 for information on the loop refinement options.<br />

6. To designate this loop for refinement, select the check box in the Run column.<br />

The Start and Write buttons become available.<br />

7. Repeat Step 3 through Step 6 for each loop you want to refine.<br />

8. (Optional) Set up and enable an implicit membrane model.<br />

9. Click Start.<br />

The Start dialog box opens.<br />

10. Make any job settings, then click Start.<br />

The loop refinement job is started.<br />

When you choose several loops for refinement, they are refined in series. The first selected<br />

loop is refined, the best scoring structure for that loop is determined, that structure is used in<br />

refining the next loop, and so on. When all loops have been refined, one final structure is<br />

returned. If only one loop is refined, the four highest scoring structures are returned by default.<br />
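The control flow of serial refinement described above can be sketched as a loop in which the best structure from each loop becomes the input for the next. The sketch assumes a caller-supplied `refine_loop` function that returns candidate structures ranked best first. That function and the toy stand-in used here are hypothetical and say nothing about how Prime samples or scores loops.

```python
def refine_serially(structure, loops, refine_loop, n_single=4):
    """Refine loops one after another, carrying the best structure forward."""
    if not loops:
        raise ValueError("select at least one loop")
    if len(loops) == 1:
        # A single loop returns several top-ranked structures (four by default).
        return refine_loop(structure, loops[0])[:n_single]
    current = structure
    for loop in loops:
        current = refine_loop(current, loop)[0]  # keep only the best candidate
    return [current]  # one final structure after all loops are refined


def toy_refine_loop(structure, loop):
    """Stand-in for a real refinement: returns six labeled candidates, best first."""
    return [structure + (f"{loop}:candidate{i}",) for i in range(6)]


print(refine_serially(("start",), ["loop1", "loop2"], toy_refine_loop))
print(len(refine_serially(("start",), ["loop1"], toy_refine_loop)))
```

Because each loop is refined against the best result of the previous one, the order in which loops are processed can affect the final structure. Cooperative refinement, described next, addresses the case where two loops influence each other.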

If you want to refine two loops cooperatively, you should replace Step 3 through Step 7 above<br />

with the following steps:<br />

1. Click the Run column for the two loops you want to refine cooperatively.<br />

2. If necessary, shorten the loop region to be refined by changing the first and last residue<br />

numbers (Res1 and Res2).<br />

3. Click Options to open the Structure Refinement Options dialog box.<br />

4. Select Cooperative loop sampling in the Loop refinement section.<br />

5. Set any other options, then click OK.<br />

For technical details about loop refinement, see Section 7.9.1.<br />

When a loop refinement begins, a validation program checks the loop features of the structure.<br />

Apparent errors are reported in a warning dialog box. You may choose Run All Features, Run<br />

Only Valid Features, or Cancel. It is recommended that you make a note of the invalid features,<br />

click Cancel, and correct the structure if appropriate.<br />

Figure 7.4. The Predict side chains task in the Refinement panel.<br />

7.6 Predicting Side Chains<br />

To predict side-chain conformations:<br />

1. Select Predict side chains from the Task options menu.<br />

The panel displays atom selection options under the heading Residues for side chain<br />

refinement.<br />

2. Use the atom selection options to specify the residues for which you want to predict or<br />

refine side-chain conformations (See Chapter 5 of the Maestro <strong>User</strong> <strong>Manual</strong> for information<br />

on selecting atoms).<br />

If you imported a PDB structure that has residues with missing atoms, you can select<br />

those residues by clicking Select Residues with Missing Atoms. The selection is based on<br />

the information generated when the structure was imported, not on the current structure.<br />

The selection therefore includes any residues that were fixed since import.<br />

3. Click Options to change the Number of structures to return from the default, which is to<br />

return a single side-chain prediction.<br />

4. Click the Run button to start the side-chain prediction job.<br />

You can also use this process to add side chains to a backbone structure from Refine Backbone.<br />

For technical details about side-chain prediction, see Section 7.9.3.<br />

Figure 7.5. The Minimize task in the Refinement panel.<br />

7.7 Minimizing Structures<br />

The Minimize refinement task performs a truncated-Newton energy minimization, using the<br />

OPLS_2005 all-atom force field for proteins as well as for ligands and cofactors, and treating<br />

solvation energies and effects via the Surface Generalized Born (SGB) continuum solvation<br />

model.<br />

To minimize all or part of the structure selected:<br />

1. Select Minimize from the Task menu.<br />

Atom selection options are displayed under the heading Atoms for minimization.<br />

2. Click All to minimize all residues, or specify a set of atoms or residues.<br />

For more information on atom selection, see Chapter 5 of the Maestro User Manual.<br />

3. Click Start to start the minimization job.


Figure 7.6. The Structure Refinement Options dialog box.<br />

7.8 Refinement Panel Options<br />

The Structure Refinement Options dialog box offers three sets of options: General options,<br />

Dielectric options, and options for Loop Refinement. Click Options in the Refinement panel to<br />

view or change the options and settings.<br />

7.8.1 General Options<br />

The General section supplies options that apply to all types of refinement. The options include<br />

specification of a random-number seed for starting the side-chain prediction and use of crystal<br />

symmetry.<br />

Seed<br />

Specify the seed for the random-number generator used in side-chain predictions. The seed is<br />

used for side-chain prediction in both the loop refinement and the side-chain prediction tasks.<br />

The options are:<br />

• Random: use a random seed.<br />

• Constant: use the integer given in the text box as the seed. The default is 0.<br />
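
The practical difference between the two seed options is reproducibility. The following is a minimal sketch that uses Python's standard random module as a stand-in; it is not Prime's own generator, and the names make_rng and draw are hypothetical.

```python
import random

def make_rng(mode="constant", seed=0):
    """Return a random-number generator for the chosen seed option."""
    if mode == "random":
        # Random option: no fixed seed, so each run draws a different sequence.
        return random.Random()
    if mode == "constant":
        # Constant option: fixed integer seed (0 mirrors the panel default).
        return random.Random(seed)
    raise ValueError(f"unknown seed mode: {mode!r}")

def draw(mode, seed=0, n=3):
    """Draw n numbers from a freshly created generator."""
    rng = make_rng(mode, seed)
    return [rng.random() for _ in range(n)]
```

With the constant option, two calls such as draw("constant", 0) return the same sequence, which is what makes a side-chain prediction repeatable.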

Use crystal symmetry<br />

If crystal symmetry is known for this protein, apply periodic boundary conditions so that the<br />

crystal symmetry is satisfied.<br />

7.8.2 Dielectric Options<br />

The Dielectric section allows you to set the dielectric constants used in the calculations, in the<br />

following text boxes.<br />

Internal dielectric<br />

Set the dielectric constant used for the interior of the protein and any implicit membrane.<br />

External dielectric<br />

Set the dielectric constant for the (continuum) solvent.<br />

7.8.3 Loop Refinement Options<br />

The Loop Refinement section provides options for selecting the sampling method, job options,<br />

output options, and some general options.<br />

Prime can perform cooperative loop sampling as well as serial loop sampling. Both the size of<br />

the loop and the sampling options chosen affect the time needed for loop refinement. The<br />

options are:<br />

Cooperative loop sampling<br />

Select this option to refine two loops together. Each loop is refined in turn until the refinement<br />

has converged for both loops. You must have only two loops selected to use this option.<br />

Serial loop sampling<br />

Select this option to refine each loop independently. The option menu provides five levels of<br />

sampling accuracy: Default, Extended Low, Extended Medium, Extended High, Ultra Extended.<br />

The text below the option and menu displays the recommended loop length for the selected<br />

accuracy.<br />

Extended sampling samples loop conformations more thoroughly, using a series of loop<br />

constraint settings. Up to eight processors can be used simultaneously to perform loop refinement.<br />

Extended sampling can be run at four different sampling levels: Low, Medium, High, and<br />

Ultra. If you select Ultra Extended, you can only refine one loop at a time.<br />

Job<br />

These options allow you to set a limit on the number of processors over which to distribute the<br />

loop sampling subjobs. Each sampling method runs in several stages, and each stage can use a<br />

different number of processors. The maximum numbers of processors that the different<br />

sampling methods can use are:<br />

Default: 1<br />

Extended Low: 2<br />

Extended Medium: 4<br />

Extended High: 8<br />

Ultra Extended: 8*L (L is the loop length)<br />

• To make use of as many processors as are available, select Distribute subjobs over maximum<br />

available processors.<br />

• To set the maximum number of processors, select Distribute subjobs over and enter a<br />

value in the text box.<br />
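
The processor limits listed above can be expressed as a small lookup. This is only an illustrative sketch: the function names and the way the user limit is combined with the available processors are assumptions, not a description of how Prime allocates subjobs.

```python
# Maximum processors per sampling method, as listed in this section.
MAX_PROCESSORS = {
    "Default": 1,
    "Extended Low": 2,
    "Extended Medium": 4,
    "Extended High": 8,
}

def max_processors(method, loop_length=None):
    """Return the upper limit on processors for a sampling method."""
    if method == "Ultra Extended":
        if loop_length is None or loop_length < 1:
            raise ValueError("Ultra Extended needs a positive loop length")
        return 8 * loop_length  # 8*L, where L is the loop length
    return MAX_PROCESSORS[method]

def processors_to_use(method, available, loop_length=None, user_limit=None):
    """Assumed policy: never exceed the method limit, the user limit,
    or the number of processors actually available."""
    cap = max_processors(method, loop_length)
    if user_limit is not None:
        cap = min(cap, user_limit)
    return max(1, min(cap, available))
```

For example, an Extended Medium job on a 16-processor host would use at most 4 processors under these assumptions.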

Output for single loop feature<br />

These options determine how many structures are returned:<br />

• Maximum number of structures to return: Select this option to return more than one structure<br />

for each loop specified for refinement. The default when this option is selected is to<br />

return 4 structures for each loop.<br />

• Energy cutoff: Select this option to prevent higher-energy structures from being returned.<br />

Structures whose energy is higher than that of the lowest-energy structure by more than<br />

this amount are not returned. The default cutoff when this option is selected is 10 kcal/mol.<br />
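
The effect of the two output options can be sketched as a filter on a list of energies. The function name select_structures is hypothetical, and applying both options together in this order is an assumption; passing None disables an option.

```python
def select_structures(energies, max_structures=4, energy_cutoff=10.0):
    """Return the energies (kcal/mol) of the structures that would be kept.

    max_structures: limit on the number returned (None for no limit).
    energy_cutoff: window above the lowest energy (None for no cutoff).
    """
    ranked = sorted(energies)
    if not ranked:
        return []
    if energy_cutoff is not None:
        lowest = ranked[0]
        # Drop structures more than energy_cutoff above the lowest energy.
        ranked = [e for e in ranked if e - lowest <= energy_cutoff]
    if max_structures is not None:
        ranked = ranked[:max_structures]
    return ranked
```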

The following settings are applied to the loops you have selected for refinement.<br />

• Maximum CA movement: To limit the extent to which alpha carbon atoms can move from<br />

their original positions, select this option, and enter a limit in the text box. The default<br />

distance is 3.0 Å. The extended sampling procedure tests multiple values for Maximum<br />

CA movement, so this option is unavailable when extended sampling is selected.<br />

• Unfreeze side chains within: Select this option to allow side chains near the specified loop<br />

to move, and specify a distance in the text box. The default distance is 7.5 Å.<br />

• Minimum allowed vdW overlap: By default, this value is set to 0.70, meaning that the distance<br />

between atoms must be at least 70% of their ‘ideal’ van der Waals separation. The<br />

extended sampling procedure tests multiple values for the minimum overlap, and therefore<br />

this option is unavailable when extended sampling has been selected.<br />
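
The Minimum allowed vdW overlap criterion amounts to a simple distance test. The sketch below assumes that you already know the ideal van der Waals separation for a pair of atoms; the function name and the coordinate representation are illustrative only.

```python
import math

def passes_overlap_check(pos_a, pos_b, ideal_separation, min_overlap=0.70):
    """Return True if two atoms are at least min_overlap times their
    ideal van der Waals separation apart (distances in angstroms)."""
    distance = math.dist(pos_a, pos_b)
    return distance >= min_overlap * ideal_separation
```

With an ideal separation of 4.0 Å and the default value of 0.70, atoms closer than 2.8 Å would be rejected.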

7.9 Technical Details for Refinement<br />

7.9.1 Loop Refinement<br />

Loop refinement using the Default option proceeds as follows:<br />

1. The loop is reconstructed using the backbone dihedral library, by building up half from<br />

each direction. The resolution starts very coarse and becomes finer until it manages to<br />

produce a required number of physically realistic loop conformations (based on the<br />

length of the loop).<br />

2. The large number of loops generated in the first step are clustered, and representatives of<br />

each cluster are selected.<br />

3. These cluster representatives are then scored as follows:<br />

a. Side chains are re-added to the loop residues, as well as any residues within a specified<br />

cutoff, according to the side-chain optimization procedure.<br />

b. The loop and contact side chains are minimized.<br />

c. The energy is calculated.<br />

4. The best scoring loop structures are returned.<br />

Loop prediction with extended sampling is specifically designed to overcome sampling problems<br />

with long loops, in which the number of residues results in an unwieldy number of<br />

possible loop conformations.<br />

Extended sampling uses the following procedure:<br />

1. Two initial predictions are run on the loop. These are intended to coarsely sample conformational<br />

space and are carried out with two slightly different sets of prediction parameters.<br />

Each initial prediction returns up to four top-scoring conformations, resulting in up<br />

to eight structures. These eight structures are intended as initial points for finer sampling<br />

of what appear to be favorable regions (basins) of conformational space.<br />

2. Each of the eight structures generated in Step 1 is run through another stage of refinement,<br />

with a 4 Å restriction on Cα movement. This is intended to sample the basins more<br />

finely.<br />

3. The eight refined structures from Step 2 are combined with the original structures from<br />

Step 1, and this set of 16 structures is ranked by energy. The five top-scoring structures<br />

are then subjected to a final stage of refinement, with a 2 Å restriction on Cα movement.<br />

This is intended to perform fine-grained sampling of the best basins.<br />

This procedure defines the “Extended High” sampling procedure. The “Extended Medium”<br />

and “Extended Low” procedures carry four and two structures, respectively, into Step 2 and Step 3.<br />

In ultra-extended sampling, the three steps of extended sampling are followed by the steps<br />

below.<br />

4. The top four structures are selected, and two predictions run on each, with part of the loop<br />

frozen. Runs are done with “moving windows” with varying numbers of contiguous residues:<br />

all n residues, the two possible contiguous sets of n-1 residues, the three sets of n-2<br />

residues, and so on. The structures are sorted after each cycle of a given number of residues.<br />

5. Step 2 is repeated with the top eight structures cumulatively generated, then Step 3 is<br />

repeated with the top eight structures cumulatively generated.<br />
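
The "moving windows" of Step 4 can be enumerated as follows. This sketch only generates the contiguous residue sets in the order described (all n residues, then the two sets of n-1, then the three sets of n-2, and so on). Where the series stops is not stated above, so the min_length parameter is an assumption, as is the (start, length) representation.

```python
def moving_windows(n_residues, min_length=1):
    """Yield (start, length) for every contiguous window of loop residues,
    from the full loop down to windows of min_length residues.
    start is a 0-based offset from the first loop residue."""
    for length in range(n_residues, min_length - 1, -1):
        # A loop of n residues has n - length + 1 windows of this length.
        for start in range(n_residues - length + 1):
            yield (start, length)
```

For a three-residue loop, list(moving_windows(3)) gives one window of length 3, two of length 2, and three of length 1.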

7.9.2 Cooperative Loop Refinement<br />

Cooperative loop refinement proceeds as follows:<br />

1. A list of side chains that will be optimized is generated. By default this list includes any<br />

residue with an atom within 7.5 Å of either loop 1 or loop 2.<br />

2. Loop 1 is sampled, allowing side chains in the list from Step 1 to move, holding the backbone<br />

of loop 2 fixed at its original conformation. Eight loop conformations are generated.<br />

3. Loop 2 is sampled, allowing side chains in the list from Step 1 to move, holding the backbone<br />

of loop 1 fixed at its original conformation. Eight loop conformations are generated.<br />

This step runs concurrently with Step 2.<br />

4. Conformations of loop 1 from Step 2 and of loop 2 from Step 3 are merged into a set of<br />

structures.<br />

5. All side chains in the list determined in Step 1 are reoptimized, keeping only those structures<br />

whose energy is below the energy cutoff.<br />

6. All residues in both loops are then minimized, including the backbone, keeping only<br />

those structures whose energy is below the energy cutoff.<br />

7. The 4 best new structures are taken as input for resampling, starting from Step 2. The process<br />

is repeated until convergence is obtained.<br />

7.9.3 Side-Chain Prediction<br />

The structure refinement program uses the following procedure to re-predict conformations for<br />

the side chain set that you select.<br />

1. Side-chain rotamers are randomized for nonconserved residues (the default) or for all residues.<br />

2. Beginning with the first residue to be predicted, the side-chain rotamer library is used to<br />

find the rotamer with the lowest energy while keeping all other side chains fixed. Once<br />

the process is complete for the first residue, the next residue is treated, and so forth until<br />

all have been done once (a single pass has been completed).<br />

3. Once the pass is complete, Step 2 is repeated from the beginning, and then repeated again<br />

until the side-chain rotamers appear to be converged (no more changes are occurring).<br />

4. Minimization is run on all of the side-chain atoms (but not backbone atoms) of the residues<br />

being treated.<br />
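
Steps 1 to 3 above describe an iterative, one-residue-at-a-time search. The following is a minimal sketch of that idea, not Prime's implementation: the rotamer library is a plain dictionary, the energy function is supplied by the caller, the max_passes safeguard is an added assumption, and the final minimization of Step 4 is omitted.

```python
import random

def predict_side_chains(residues, rotamers, energy, seed=0, max_passes=50):
    """residues: list of residue ids.
    rotamers: dict mapping residue id to a list of candidate rotamers.
    energy: callable taking {residue id: rotamer} and returning a float."""
    rng = random.Random(seed)
    # Step 1: start from randomized rotamers.
    assignment = {res: rng.choice(rotamers[res]) for res in residues}
    for _ in range(max_passes):
        changed = False
        # Step 2: one pass, optimizing each residue with all others fixed.
        for res in residues:
            best = min(rotamers[res],
                       key=lambda rot: energy({**assignment, res: rot}))
            if best != assignment[res]:
                assignment[res] = best
                changed = True
        # Step 3: stop when a full pass produces no changes.
        if not changed:
            break
    return assignment
```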

7.10 Refinement with Nonstandard Residues<br />

If you want to refine a structure with nonstandard residues, you can make changes to the residues<br />

in the structures before refinement. Several nonstandard residues are supported in the<br />

Build panel: the neutralized forms of ASP, LYS, ARG, and GLU, which are named ASH, LYN,<br />

ARN, and GLH; and the tautomers of HIS (HID=HIS and HIE) and its protonated form, HIP.<br />

Ionization (deprotonation) of four other residues is supported: CYS, SER, THR, and TYR.<br />

The names of the ionized residues are CYT (thiolate), SRO (alkoxide), THO (alkoxide), and<br />

TYO (phenoxide). In addition, the disulfide-bridged CYS is supported as CYX. To change one<br />

of these residues, you can either change the name, or change the structure. For example, you<br />

can generate an ionized SER by either of the following methods:<br />

• Changing the residue name to SRO. The HG is deleted and a formal negative charge is<br />

added to the OG.<br />

• Deleting the HG (if present) and adding a formal charge to the OG. The residue is<br />

renamed to SRO.<br />
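
For scripting purposes, the residue names listed in this section can be collected in lookup tables. The dictionary and function names below are hypothetical; only the three-letter residue names come from the text above.

```python
# Neutralized forms of charged residues.
NEUTRALIZED = {"ASP": "ASH", "LYS": "LYN", "ARG": "ARN", "GLU": "GLH"}

# Ionized (deprotonated) forms.
IONIZED = {"CYS": "CYT", "SER": "SRO", "THR": "THO", "TYR": "TYO"}

# Histidine tautomers (HID is equivalent to HIS) and the protonated form.
HIS_FORMS = ("HID", "HIE", "HIP")

# Disulfide-bridged cysteine.
DISULFIDE_CYS = "CYX"

def nonstandard_forms(resname):
    """Return the nonstandard names available for a standard residue."""
    name = resname.upper()
    forms = []
    if name in NEUTRALIZED:
        forms.append(NEUTRALIZED[name])
    if name in IONIZED:
        forms.append(IONIZED[name])
    if name == "HIS":
        forms.extend(HIS_FORMS)
    if name == "CYS":
        forms.append(DISULFIDE_CYS)
    return forms
```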


7.11 Output of Refinement Jobs<br />

By default, the structures created in Prime–Refinement are appended to the Project Table for<br />

the project currently open. If you would prefer the output structure to replace the input structure,<br />

or would rather not incorporate the output structure into the project at all, select the<br />

appropriate option from the Incorporate option menu in the Start dialog box:<br />

• Append new entries as a new group<br />

• Append new entries individually<br />

• Replace existing entries<br />

• Do not incorporate<br />

In addition to returning the structures, the output of a refinement job includes several Maestro<br />

properties. The main properties are Prime Energy and Prime Energy Difference, which reports<br />

the difference in energy between the input and output structures. The energy difference is<br />

cumulative: if the property already exists, the difference is incremented in the current job. For<br />

loop refinement and side-chain prediction jobs, the lipophilic term is reported separately as<br />

Prime Lipo; the sum of the Prime energy and the lipophilic term is used to rank loop conformations.<br />
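
The cumulative behavior of the energy difference and the ranking of loop conformations can be sketched as follows. The dictionary keys reuse the property names given above, but the dictionary representation, the function names, and the sign convention (output minus input) are assumptions made for illustration.

```python
def update_energy_properties(properties, input_energy, output_energy):
    """Record the new energy and add this job's change to the running total."""
    delta = output_energy - input_energy  # assumed sign convention
    previous = properties.get("Prime Energy Difference", 0.0)
    properties["Prime Energy"] = output_energy
    properties["Prime Energy Difference"] = previous + delta
    return properties

def rank_loop_conformations(structures):
    """Sort structures by the sum of Prime Energy and Prime Lipo."""
    return sorted(structures,
                  key=lambda s: s["Prime Energy"] + s["Prime Lipo"])
```

Running two jobs in succession on the same entry therefore accumulates both energy changes in a single property value.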


Chapter 8: Maestro Protein Structure Alignment<br />

Chapter 8<br />

It is sometimes useful to be able to align proteins outside the usual Prime workflow. Maestro<br />

provides access to the structure alignment facilities of Prime so that you can perform such<br />

alignments. Two panels are available from the Tools menu: the Protein Structure Alignment<br />

panel, and the Align Binding Sites panel.<br />

8.1 The Protein Structure Alignment Panel<br />

To open the Protein Structure Alignment panel, choose Protein Structure Alignment from the<br />

Tools menu in the main window.<br />

In the Protein Structure Alignment panel, structure alignment is performed on Project Table<br />

entries that have been included in the Workspace. The reference structure, whose frame of<br />

reference is used for the alignment, must be the first included entry (the entry with the lowest<br />

row number). To make another protein the reference, move that protein above all the other<br />

included entries in the Project Table.<br />

By default, the alignment includes all residues. However, it is often useful to align a subset of<br />

residues that have greater similarity than the structures as a whole, or are more informative to<br />

compare. Because this program uses matching of secondary structure elements, the subset<br />

should have enough contiguous residues to include at least one helix or strand. Very small<br />

numbers of residues do not produce useful alignments. Selection of subsets is discussed further<br />

under Residues to align, below.<br />

When the alignment calculation is complete, the structures are placed in the same frame of<br />

reference in the Workspace, and the results of the calculation are listed in the text display area<br />

of the Protein Structure Alignment panel. The aligned residues are listed and an RMSD and<br />

Alignment Score are reported. The RMSD is calculated based only on those residues that are<br />

considered to have been successfully aligned, and therefore does not represent an overall<br />

comparison of the structures. It is computed from the C-alpha atoms of the aligned residues.<br />

An Alignment Score lower than 0.6–0.7 indicates a good alignment. Alignment Scores greater<br />

than about 0.7–0.8, or a failure of the structural alignment calculation, indicate that there is insufficient<br />

structural similarity for a meaningful alignment. For details on the algorithm, see the<br />

paper by Yang and Honig (J. Mol. Biol. 2000, 301, 665).<br />
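
The reported RMSD is a root-mean-square distance over the C-alpha atoms of the aligned residue pairs. As a minimal sketch, assuming the two structures are already in the same frame of reference and that you have the paired coordinates as lists of (x, y, z) tuples (the function name is illustrative):

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD in angstroms over paired C-alpha coordinates of aligned residues."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

Because only successfully aligned residues enter the sum, a low value does not by itself mean that the structures are similar overall.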

Figure 8.1. The Protein Structure Alignment panel.<br />

To run a default alignment job, include two or more protein structures in the Workspace, open<br />

the Protein Structure Alignment panel, and click Align to start the alignment calculation.<br />

The Protein Structure Alignment panel provides some tools for controlling which parts of the<br />

protein are aligned and how they are aligned. These tools are described below.<br />

Residues to align Section<br />

Alignment can be performed for all residues or for a subset of residues. You might need to try<br />

different subsets to obtain the most useful alignment. You can use the picking controls, Workspace<br />

selection, or the Atom Selection dialog box to specify subsets. Subsets should include at<br />

least one secondary structure element (helix or strand). Clicking the Select button opens the<br />

Atom Selection dialog box to the Residue tab. Here you can select a range of residue numbers,<br />

specify portions of the residue sequence, or click Secondary Structure in the list of options on<br />

the left to choose residues based on their location in Helix, Strand, or Loop structures. Subsets<br />

can be combined, intersected, or otherwise modified in the Atom Selection dialog box.


Alignment Section<br />

Gap penalty<br />

In the structure alignment calculation, this is the alignment score penalty for initiating a gap in<br />

the alignment. The default gap penalty is 2.00. The main purpose of this parameter is to<br />

prevent the generation of very poor initial alignments.<br />

Deletion penalty<br />

In the structure alignment calculation, this is the penalty for extending a gap. The default deletion<br />

penalty is 1.00.<br />

Secondary Structure Section<br />

Use scanning alignment<br />

This option is deselected by default. When it is selected, structure alignment is carried out in<br />

scanning rather than the default global mode. Generally, the default global mode is more<br />

useful. However, if you do not find any similarity between structures with global alignment,<br />

you can try scanning alignment.<br />

Global alignment involves running double dynamic programming on the full scoring matrix<br />

built from all secondary structural elements (SSEs) in both structures. In scanning alignment,<br />

double dynamic programming is run on successively smaller subregions of the full scoring<br />

matrix (window sizes of 20, 15, 10, and 5 SSEs) until a reasonable alignment is produced.<br />

If a reasonable alignment is found, it is used as a starting point for building a global<br />

alignment (involving all SSEs).<br />
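
The control flow of scanning alignment can be sketched as a loop over the window sizes given above. The two callables are hypothetical placeholders for the double dynamic programming step and for the test of whether an alignment is reasonable; neither is described in enough detail here to implement.

```python
def scanning_alignment(align_window, is_reasonable,
                       window_sizes=(20, 15, 10, 5)):
    """Try successively smaller windows until an alignment is acceptable.

    align_window: callable(size) returning an alignment result.
    is_reasonable: callable(result) returning True or False.
    Returns (size, result) for the first acceptable window, or None."""
    for size in window_sizes:
        result = align_window(size)
        if is_reasonable(result):
            # This result would seed the global alignment over all SSEs.
            return size, result
    return None
```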

Window length<br />

Number of successive SSEs that are to be used as a baseline for the global alignment. The<br />

default is 5. This option is significant only if Use scanning alignment has been selected.<br />

Minimum similarity<br />

This is the alignment score above which alignments are reported as unsuccessful. The default<br />

is 1.00, where alignment scores of 0.7 or less mean greater similarity, and alignment scores<br />

above 1.0 mean less similarity. When alignment scores are greater than 1.0, the alignment is<br />

unlikely to be meaningful.<br />

Structural similarity of proteins is measured by the Protein Structural Distance (PSD). This<br />

measure combines the similarity scores for the secondary structural elements obtained from<br />

dynamic programming and the RMSD of aligned residues for the optimal alignment. For<br />

details see Yang, A.; Honig, B. Journal of Molecular Biology 2000, 301, 665-678.<br />

Figure 8.2. The Align Binding Sites panel.<br />

Minimum length<br />

Minimum number of SSEs that are to be aligned in scanning alignment. The default is 2.<br />

Results Text Area<br />

Results from the alignment job are displayed in this text area.<br />

8.2 The Align Binding Sites Panel<br />

The Align Binding Sites panel allows you to align the binding sites (or other structural features)<br />

of members of a family of proteins. The Align Binding Sites job first runs a Protein Structure<br />

Alignment to obtain the global alignment and then automatically generates the list of atoms to<br />

use in a pairwise alignment of the C-alpha atoms from the residues selected.<br />

To open the Align Binding Sites panel, choose Tools > Align Binding Sites in the main window.


The Use proteins from option menu provides choices of the source of protein structures. If you<br />

choose Project Table, the entries that are selected in the Project Table are used. If you choose<br />

File, the Input file text box and Browse button become available, and you can browse for the<br />

input file.<br />

In the Residues for alignment section you select the residues that are to be used to align the<br />

proteins. There are two options, described below.<br />

• Automatically detect binding site residues—Align residues within a specified distance of<br />

the ligand. The distance is specified in the Align residues within N Å from the ligand text<br />

box. Two options are available for specifying the ligand:<br />

• Detect ligand automatically—Detect the ligand by selecting the molecule that is<br />

most ligand-like, as judged by its number of atoms.<br />

• Use molecule number—Specify the ligand by molecule number. You can enter a<br />

number in the text box, or select Pick ligand and pick a ligand atom in the Workspace.<br />

• Manually select residues—Specify the residues to align by using any of the following<br />

methods:<br />

• Enter the residue specification in the Align residues text box, in the format<br />

chain:number.<br />

• Click Select residues to open the Atom Selection dialog box and define the residues.<br />

• Select Pick a residue to include/exclude and pick residues in the Workspace. Picking<br />

an already included residue excludes it.<br />

The residues are listed in the Align residues text box, which you can edit to remove or add<br />

residues.<br />
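
If you prepare residue specifications in a script before pasting them into the Align residues text box, a small parser for the chain:number format can catch typing errors. This sketch handles a single specification only; the function name is hypothetical, and blank chain names and insertion codes are not handled.

```python
def parse_residue_spec(spec):
    """Split a 'chain:number' string into (chain, residue_number)."""
    chain, sep, number = spec.strip().partition(":")
    if not sep or not chain or not number:
        raise ValueError(f"expected chain:number, got {spec!r}")
    return chain, int(number)
```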

The alignment is performed on atom pairs. You can ignore atom pairs in the alignment process<br />

if they are further apart than a specified distance, by entering the distance in the Ignore pairs<br />

greater than N Å apart text box.<br />
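
The two distance criteria in this panel can be illustrated with plain coordinate lists. This is a sketch under stated assumptions: a residue counts as "within N Å of the ligand" if any of its atoms is within that distance of any ligand atom, and the function names and data layout are invented for the example.

```python
import math

def residues_near_ligand(residue_atoms, ligand_atoms, cutoff):
    """residue_atoms: dict mapping a residue id to a list of (x, y, z).
    Returns the ids of residues with any atom within cutoff of the ligand."""
    near = []
    for res_id, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= cutoff
               for a in atoms for l in ligand_atoms):
            near.append(res_id)
    return near

def keep_close_pairs(pairs, max_distance):
    """Drop atom pairs that are further apart than max_distance."""
    return [(p, q) for p, q in pairs if math.dist(p, q) <= max_distance]
```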

If you have proteins that are prealigned, you can select Structures are prealigned to suppress<br />

the global alignment of the protein structures, and align only by using the selected residues.<br />

When you have finished making settings, click Start to start the job.<br />

You can also run a job to align binding sites from the command line—see Section 12.4 on<br />

page 119 for more information.<br />

Chapter 9: Docking Covalently Bound Ligands<br />

Chapter 9<br />

While Glide is highly efficient at docking ligands that are bound to a receptor through<br />

hydrogen bonds or various nonbonded interactions, it is not able to dock ligands that are<br />

covalently bound to the receptor. Prime provides a covalent docking facility that allows you to<br />

select the attachment point on the receptor and the possible attachment points on the ligand, and<br />

run a Prime loop prediction to find poses for the ligand.<br />

9.1 Running Covalent Docking from Maestro<br />

Covalent docking calculations can be set up and run from the Covalent Docking panel. To open<br />

the Covalent Docking panel, choose Applications > Prime > Covalent Docking in the main<br />

window.<br />

The Covalent Docking panel has two main sections. Above these sections, you can specify the<br />

source of the ligands. In the Ligand reactive groups section, you can specify the kinds of<br />

groups that will react at the receptor attachment site. In the Receptor options section, you can<br />

specify the receptor bond that is broken and choose receptor residues to sample.<br />

9.1.1 Selecting the Ligands<br />

The ligands that you dock to the receptor can be taken from the Project Table or from a file.<br />

• To use the structures that are selected in the Project Table, choose Project Table from the<br />

Use ligands from option menu.<br />

• To read the ligands from a file, choose File from the Use ligands from option menu. You<br />

can then either enter the file name in the Input file text box, or click Browse and navigate<br />

to the file in the file selector that opens. The file must be a Maestro file, and can be compressed<br />

(.maegz, .mae.gz) or uncompressed (.mae).<br />
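
When you assemble ligand files in a script, you can check the file name against the accepted Maestro extensions before starting a job. The helper below is a hypothetical convenience, not part of the Schrödinger software, and it checks only the name, not the file contents.

```python
# Extensions named in this section for Maestro files.
MAESTRO_SUFFIXES = (".mae", ".maegz", ".mae.gz")

def is_maestro_file(filename):
    """Return True if the file name has a Maestro file extension."""
    return filename.endswith(MAESTRO_SUFFIXES)
```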

You should ensure that the structures are all-atom, 3D structures that are properly prepared, for<br />

example by using LigPrep. See the LigPrep User Manual for more information on ligand preparation.<br />

The poses that are returned include both the ligand and the receptor, so you should consider<br />

carefully how many ligands you want to dock.<br />

Figure 9.1. The Covalent Docking panel.<br />

9.1.2 Defining the Ligand Reactive Groups<br />

In the Ligand reactive groups section, you specify the reactive groups on the ligands, in which<br />

a bond is broken to form a bond with the receptor. Each group is specified by a SMARTS<br />

pattern and the index of the atom that leaves, with its attachments, when the ligand bond is<br />

broken. You can specify multiple reactive groups. For each ligand, all groups are tried for a<br />

match to the ligand, and each match is docked.<br />

To define each SMARTS pattern, you can enter the pattern in the SMARTS pattern text box, or<br />

you can select atoms on a typical ligand molecule in the Workspace, and click Get From Selection<br />

to place a pattern in the SMARTS pattern text box. You can then edit the SMARTS pattern<br />

in the text box if you want.


When you are satisfied with the SMARTS pattern, you must then designate an atom on the<br />

ligand that is removed, along with the atoms attached to it, when the ligand forms a bond with<br />

the receptor. The fragment with the smaller number of atoms is the part that is removed. The<br />

atom is specified by its index (position) in the SMARTS pattern; the first atom index is 1. This<br />

index must be entered into the Leaving atom text box.<br />
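
Because the leaving atom index counts from 1, it is easy to be off by one when working with matches in a scripting language that counts from 0. The sketch below assumes that a SMARTS match is available as a tuple of ligand atom numbers in pattern order; the function name and the example values are hypothetical.

```python
def leaving_atom(match, leaving_index):
    """Return the ligand atom that corresponds to the 1-based position
    leaving_index in the SMARTS pattern."""
    if not 1 <= leaving_index <= len(match):
        raise ValueError("leaving index is outside the SMARTS pattern")
    return match[leaving_index - 1]  # convert 1-based to 0-based
```

For a three-atom pattern matched to ligand atoms (12, 7, 30), a Leaving atom value of 2 designates ligand atom 7.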

Having defined a SMARTS pattern and the leaving group, click Add. The SMARTS pattern is<br />

added to the table below, and will be used when processing the ligands. You can repeat this<br />

process for as many SMARTS patterns as you want to use. If you want to remove a pattern<br />

from the table, select it and click Remove.<br />

9.1.3 Specifying the Receptor Bond to Break<br />

In the Receptor options section you specify the bond in the receptor that is broken to form a<br />

bond with the ligand. The receptor must be displayed in the Workspace, and is written to a file<br />

when you start the job.<br />

To define the bond that is broken, select Pick receptor bond to break, and pick the desired bond<br />

on the receptor in the Workspace. The Staying atom and Leaving atom text boxes are filled with<br />

the atom numbers of the atom in the bond that remains and the atom in the bond that is<br />

removed when the bond is broken. The bond is marked in the Workspace with yellow markers.<br />

If you pick the wrong bond, you can simply pick again, and the new bond replaces the old.<br />

You can also enter or edit the atom numbers in the Staying atom and Leaving atom text boxes.<br />

To ensure that you have the correct atom numbers, you should label the atoms. For more information<br />

on labeling atoms, see Section 6.3 of the Maestro User Manual.<br />

9.1.4 Specifying Receptor Residues to Sample<br />

In the Receptor options section you can also choose residues that are allowed to adjust to the<br />

formation of the covalent complex in the loop sampling procedure.<br />

The first choice is whether to sample the residue that is attached to the ligand. To do so, select<br />

Sample attachment residue.<br />

The next step is to pick other residues in the Workspace for sampling. To do so, select Pick<br />

additional residue to sample, and pick residues in the Workspace. You can pick any number of<br />

residues. As well as picking, you can select the desired residues in the Atom Selection dialog<br />

box, which you open by clicking Select. The residues that you select by either method are<br />

listed in the Additional residues text box.<br />

You can extend your selection to include residues that are within a given distance of the<br />

selected residues. To do so, enter a value in the Include residues within N Å of picked residues<br />

text box. These residues are also added to the list in the Additional residues text box.<br />

You can remove residues from the list to sample by deleting them from the Additional residues<br />

text box.<br />

9.1.5 Running the Job<br />

When you have finished making settings, click Start to open the Start dialog box. In this dialog<br />

box you can make job settings and start the job.<br />

The results are returned in a Maestro file. One pose is returned for each ligand attachment point<br />

found, and each pose includes both the ligand and the receptor.<br />

9.2 Running Covalent Docking from the Command Line<br />

It is also possible to run covalent docking jobs from the command line. To do so, you should<br />

first set up the input file in the Covalent Docking panel, as described in the previous section,<br />

and then click Write, to write the input file for the job. This button opens a file selector, in<br />

which you can browse to the location and choose the file name.<br />

The covalent docking job is run with the same pipelining workflow scripts as other workflows,<br />

such as the Virtual Screening Workflow (VSW) and Quantum-Polarized Ligand Docking<br />

(QPLD). Covalent docking does not have its own script, so you can use the script for VSW, and<br />

run the job as<br />

$SCHRODINGER/vsw [job-options] input-file<br />

The job options are the usual Job Control options that specify the host and other resources.<br />

These options are described in Section 2.3 of the Job Control Guide.<br />


Chapter 10: <strong>Prime</strong> MM-GBSA<br />


The <strong>Prime</strong> MM-GBSA panel can be used to calculate ligand binding energies and ligand strain<br />

energies for a set of ligands and a single receptor, using the MM-GBSA technology available<br />

with <strong>Prime</strong>.<br />

The ligands and the receptor must be properly prepared beforehand, for example, by using<br />

LigPrep and the Protein Preparation Wizard. The ligands must be pre-positioned with respect<br />

to the receptor, and the receptor must be prepared as for a <strong>Prime</strong> refinement calculation.<br />

To open the <strong>Prime</strong> MM-GBSA panel, choose MM-GBSA from the <strong>Prime</strong> submenu of the Applications<br />

menu.<br />

After preparing your structures, you can set up the calculation as follows:<br />

1. Specify the source of the structures.<br />

You can take structures from a Pose Viewer file (Glide output), or from separated ligand<br />

and protein structures. If you choose the latter option, you must ensure that the ligands<br />

and the protein are properly prepared and aligned.<br />

2. (optional) Choose calculation settings.<br />

If you want to evaluate ligand strain energies as well as ligand binding energies, select<br />

Calculate ligand strain energies. With this option, an extra calculation on the ligand is<br />

performed at its geometry in the complex, and this result is combined with the free ligand<br />

calculation to calculate the strain energy.<br />

If you want to use ligand partial charges evaluated by some other program, such as QSite,<br />

select Use ligand input partial charges. The input partial charges on the ligand will then<br />

be used instead of those assigned using the default force field.<br />

If you want to use the <strong>Prime</strong> implicit membrane model, select Use implicit membrane.<br />

The receptor you choose must have been already set up with the membrane, as described<br />

in Section 7.3 on page 56.<br />

3. (optional) Select a region within a certain distance of the ligand for which the protein<br />

structure will be relaxed in the calculation.<br />

The atoms in all residues within the specified distance of the first ligand processed are<br />

included in the flexible region. By default, all protein atoms are frozen, and only the<br />

ligand structure is relaxed. The larger the flexible region, the longer the calculation takes.<br />


Figure 10.1. The <strong>Prime</strong> MM-GBSA panel.<br />

4. Click Start to start the job, or click Write to write the input file and run the job from the<br />

command line.<br />

When you click Start, the Start dialog box opens, in which you can choose to incorporate<br />

the results into the Maestro project, set the job name, select the host, and set the number<br />

of CPUs and number of subjobs, if you are running the job on a multiprocessor host or<br />

submitting it to a queuing system. You can divide the ligand set between a specified number<br />

of subjobs, to achieve load balancing. The number of subjobs must be no smaller than<br />

the number of CPUs, and for optimal load balancing, should be several times the number<br />

of CPUs.<br />
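
The subjob rule above can be sketched in a few lines of Python. This is only an illustration: the function name plan_subjobs and the default of three subjobs per CPU are assumptions, and the manual does not say how Prime itself distributes ligands between subjobs.<br />

```python
def plan_subjobs(n_ligands, n_cpus, subjobs_per_cpu=3):
    """Return a list with the number of ligands assigned to each subjob."""
    if n_ligands < 1 or n_cpus < 1 or subjobs_per_cpu < 1:
        raise ValueError("all arguments must be positive integers")
    # Aim for several subjobs per CPU, capped at the ligand count,
    # but never go below the number of CPUs (the rule stated above).
    n_subjobs = max(n_cpus, min(n_ligands, n_cpus * subjobs_per_cpu))
    base, extra = divmod(n_ligands, n_subjobs)
    # The first `extra` subjobs take one additional ligand each.
    return [base + 1 if i < extra else base for i in range(n_subjobs)]


if __name__ == "__main__":
    sizes = plan_subjobs(n_ligands=100, n_cpus=4)
    print(len(sizes), "subjobs with sizes", sizes)
```

The resulting subjob count is the kind of value you would enter in the Start dialog box, or pass with the -n option on the command line.<br />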

You can also run <strong>Prime</strong> MM-GBSA from the command line on Unix, with the input file written<br />

from the panel. The syntax is as follows:<br />

$SCHRODINGER/prime_mmgbsa [options] input-file<br />

The input file must be a Maestro file with the receptor as the first entry followed by one or<br />

more ligands (i.e. pose viewer file format). The options are described in Table 10.1.


Table 10.1. Options for the prime_mmgbsa command<br />

Option Description<br />


-b jobname Use alternative job basename. Default is to use the input file basename.<br />

-cmae Use atomic partial charges from the input Maestro file in simulations<br />

-extdiel value Set the external dielectric to value.<br />

-h Print help message and exit<br />

-intdiel value Set the internal dielectric (including the membrane) to value.<br />

-keep Save all minimized ligand, complex, and protein structures (default is to save<br />

only minimized protein-ligand complexes)<br />

-lstrain Include an estimate of the ligand strain energy in the output<br />

-membrane Use the <strong>Prime</strong> implicit membrane model in protein and complex simulations<br />

-n subjobs Number of subjobs to run.<br />

-rflex file File listing protein residues to be treated flexibly. The residues are specified<br />

in the format chain:residue.<br />

-v Print version information and exit<br />
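
If you launch many jobs from a script, it can help to assemble the prime_mmgbsa command line programmatically. The following Python sketch only builds the argument list from the options in Table 10.1; the helper name mmgbsa_command and its parameter names are hypothetical, and the schrodinger argument is assumed to hold the installation path.<br />

```python
def mmgbsa_command(input_file, schrodinger="$SCHRODINGER", jobname=None,
                   subjobs=None, lstrain=False, membrane=False,
                   cmae=False, keep=False, rflex=None):
    """Build an argument list of the form: prime_mmgbsa [options] input-file."""
    cmd = [f"{schrodinger}/prime_mmgbsa"]
    if jobname:
        cmd += ["-b", jobname]          # alternative job basename
    if cmae:
        cmd.append("-cmae")             # use partial charges from the input file
    if keep:
        cmd.append("-keep")             # save all minimized structures
    if lstrain:
        cmd.append("-lstrain")          # include ligand strain estimate
    if membrane:
        cmd.append("-membrane")         # implicit membrane model
    if subjobs is not None:
        cmd += ["-n", str(subjobs)]     # number of subjobs
    if rflex:
        cmd += ["-rflex", rflex]        # file listing flexible residues
    cmd.append(input_file)              # the input file comes last
    return cmd


if __name__ == "__main__":
    print(" ".join(mmgbsa_command("poses_pv.mae", lstrain=True, subjobs=8)))
```

To run the job, pass the list to subprocess.run with schrodinger set to the real installation directory, because a list of arguments is not subject to shell variable expansion.<br />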

The output Maestro structure file includes a number of properties, which are described in<br />

Table 10.2.<br />

Table 10.2. Output properties from a <strong>Prime</strong> MM-GBSA calculation.<br />

Property Description<br />

<strong>Prime</strong> Coulomb Coulomb energy of the complex<br />

<strong>Prime</strong> Covalent Covalent binding energy of the complex<br />

<strong>Prime</strong> vdW Van der Waals energy of the complex<br />

<strong>Prime</strong> Solv SA Solvation surface area of the complex<br />

<strong>Prime</strong> Solv GB Solvation energy of the complex<br />

<strong>Prime</strong> Energy <strong>Prime</strong> energy of the complex<br />

<strong>Prime</strong> MMGBSA Complex Energy MMGBSA energy of the complex<br />

<strong>Prime</strong> MMGBSA Ligand Energy MMGBSA energy of the free ligand<br />

<strong>Prime</strong> MMGBSA Receptor Energy MMGBSA energy of the uncomplexed receptor<br />

<strong>Prime</strong> MMGBSA Ligand in Complex Energy MMGBSA energy of the ligand in the complex<br />

<strong>Prime</strong> MMGBSA Ligand Strain Energy MMGBSA ligand strain energy<br />

<strong>Prime</strong> MMGBSA DG bind MMGBSA free energy of binding<br />

<strong>Prime</strong> MMGBSA DG bind no Ligand Strain MMGBSA free energy of binding without including ligand strain<br />
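
The manual lists these properties without giving the formulas that relate them. The Python sketch below assumes the conventional MM-GBSA definitions: the binding energy is the complex energy minus the energies of the uncomplexed receptor and the free ligand, and the strain is the energy of the ligand at its geometry in the complex minus the energy of the free ligand. Both the formulas and the function name mmgbsa_summary are assumptions, not a statement of how Prime computes the values.<br />

```python
def mmgbsa_summary(e_complex, e_receptor, e_ligand_free, e_ligand_in_complex=None):
    """Combine component energies (same units throughout) into binding terms."""
    # Assumed convention: dG(bind) = E(complex) - E(receptor) - E(free ligand)
    dg_bind = e_complex - e_receptor - e_ligand_free
    result = {"dg_bind": dg_bind}
    if e_ligand_in_complex is not None:
        # Assumed convention: strain = E(ligand in complex) - E(free ligand)
        strain = e_ligand_in_complex - e_ligand_free
        result["ligand_strain"] = strain
        # Removing the strain term is the same as using the in-complex ligand energy.
        result["dg_bind_no_strain"] = dg_bind - strain
    return result


if __name__ == "__main__":
    print(mmgbsa_summary(-1050.0, -1000.0, -20.0, -15.0))
```

Under these assumptions a positive strain makes binding less favorable, so the value without ligand strain is the more negative of the two.<br />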



Chapter 11: Command Syntax<br />


This chapter summarizes the command syntax for various programs that are part of <strong>Prime</strong>. All<br />

the programs are executed by driver scripts, which run under the Schrödinger Job Control<br />

facility. For more information on this facility, see the Job Control Guide. The conventions used<br />

in describing the command syntax are outlined in the Document Conventions section at the<br />

front of this manual.<br />

Residue specifications are used with a number of the input file keywords for several of the<br />

scripts. These have the format C:N, where C is a one-character chain ID (‘_’ if the chain ID is<br />

blank) and N is the residue number and insertion code (if any). Atom specifications similarly<br />

have the format C:N:A, where A is the 4-character PDB atom name with spaces replaced by<br />

underscores, e.g. A:185:_SG_.<br />
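
The residue and atom specification formats can be illustrated with a small Python sketch. The function names are hypothetical, and the parser assumes that the insertion code, if present, is a single letter that directly follows the residue number.<br />

```python
import re

# chain character, colon, residue number, optional one-letter insertion code
_RESIDUE = re.compile(r"^(.):(-?\d+)([A-Za-z]?)$")


def format_residue(chain, number, insertion_code=""):
    """Return a C:N residue specification; a blank chain ID becomes '_'."""
    chain_id = chain if chain.strip() else "_"
    return f"{chain_id}:{number}{insertion_code}"


def parse_residue(spec):
    """Split a C:N specification into (chain, number, insertion_code)."""
    match = _RESIDUE.match(spec)
    if match is None:
        raise ValueError(f"not a residue specification: {spec!r}")
    chain, number, insertion_code = match.groups()
    return (" " if chain == "_" else chain), int(number), insertion_code


def format_atom(chain, number, pdb_atom_name, insertion_code=""):
    """Return a C:N:A atom specification from a 4-character PDB atom name."""
    if len(pdb_atom_name) != 4:
        raise ValueError("PDB atom names are exactly four characters wide")
    atom = pdb_atom_name.replace(" ", "_")  # spaces become underscores
    return f"{format_residue(chain, number, insertion_code)}:{atom}"


if __name__ == "__main__":
    print(format_atom("A", 185, " SG "))
    print(parse_residue("_:35"))
```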

Most of these scripts accept the standard Schrödinger Job Control command line options and<br />

other commonly used options, listed in Table 11.1. Exceptions are noted in the program<br />

descriptions.<br />

Table 11.1. Standard Schrödinger job control options.<br />

Option Description<br />

-HOST host Run the job on the specified host. The default is the local host.<br />

-LOCAL Run the job in the local directory rather than the temporary directory<br />

-TMPDIR tmpdir Run the job in the temporary directory tmpdir<br />

-WAIT Do not return control to the shell until the job finishes<br />

-INTERVAL n Update the output every n seconds<br />

-NICE Run the job at reduced priority<br />

-DEBUG Turn on debugging for job control as well as the program<br />

-HELP Show information on command-line usage<br />

11.1 blast—Run Blast Searches<br />

The blast script provides an interface to the Blast and PSIBlast programs from NCBI, while<br />

also supporting standard Schrödinger Job Control options. All options are passed on to the<br />

script via an input file.<br />


Syntax<br />

$SCHRODINGER/blast jobname [options]<br />

Input is read from the file jobname.inp. Output is written by default to jobname.out.<br />

Options<br />

The blast script accepts the options listed in Table 11.1.<br />

Input File<br />

Keywords control the operation of the blast script, and are listed in Table 11.2. Most<br />

keywords are simply a translation of options available from the blastall or blastpgp<br />

executables distributed by NCBI. For optional keywords, the Blast default values are used.<br />

Since these defaults can differ between versions of Blast, they are not listed here.<br />

Table 11.2. Keywords for the blast script<br />

Keyword Description<br />

QUERY_FILE filename Required. Name of the FASTA file containing the input/query<br />

sequence<br />

DATABASE {nr | pdb} Database to be used in the search. In a default installation the two<br />

allowed values are nr and pdb.<br />

ALIGNMENT_FORMAT value Alignment view options (Blast -m option). Allowed values are:<br />

0 pairwise<br />

1 query-anchored showing identities<br />

2 query-anchored no identities<br />

3 flat query-anchored, show identities<br />

4 flat query-anchored, no identities<br />

5 query-anchored no identities and blunt ends<br />

6 flat query-anchored, no identities and blunt ends<br />

7 XML Blast output<br />

8 tabular<br />

9 tabular with comment lines<br />

m2io Maestro format (default)<br />

EXPECT value Expectation value (E) (Blast -e option).<br />

FILTER {true | false} Filter query sequence: DUST with blastn, SEG with others (Blast<br />

-F option).<br />

GAP_OPEN_COST value Cost to open a gap (Blast -G option)<br />

GAP_EXTEND_COST value Cost to extend a gap (Blast -E option)<br />

MATRIX type Sequence similarity matrix to use for the alignments. Allowed values<br />

of type are BLOSUM45, BLOSUM62, BLOSUM80, PAM30, PAM70.<br />

MAX_ROUNDS n Maximum number of passes to use in multipass version (PSI-Blast<br />

-j option); only applies to runs using PROGNAME blastpgp.<br />

E_VALUE_THRESHOLD value e-value threshold for inclusion in multipass model (PSIBlast -h<br />

option).<br />

OUTPUT_FILE filename Name of the output file. Default: jobname.out.<br />

PROGNAME name Type of blast search to run. The allowed values are blastp,<br />

blastpgp, and bl2seq (for pairwise alignments). Default: blastp.<br />

TEMPLATE_FILE filename Name of the FASTA file that contains the template sequence<br />

(bl2seq -j option). Only applies to pairwise alignments using<br />

bl2seq.<br />

PSSM_ASCII_OUTFILE filename ASCII output file for PSI-BLAST matrix (PSIBlast -Q option).<br />

EXPAND_HITS {true | false} Expand redundant hits, based on information in the NCBI defline for a<br />

particular hit. Only available when using Maestro format.<br />

SHOW {all | pdb} Allows the user to restrict the types of hits that are included in the<br />

output. all includes all hits; pdb only includes hits from the PDB.<br />

Only available when using Maestro format.<br />

Return Value and Errors<br />

blast returns 0 if successful and a non-zero value in all other cases.<br />

Examples<br />

The following is an example input file; query.aa is the query sequence in FASTA format.<br />

QUERY_FILE query.aa<br />

PROGNAME blastp<br />

DATABASE pdb<br />

Environment Variables<br />

The following environment variables are supported by the blast script:<br />

PSP_BLASTDB directory containing the customized Blast databases<br />

PSP_BLAST_DIR directory containing a custom Blast installation<br />
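
A blast input file is a plain list of keyword-value lines, so it is straightforward to generate from a script. The Python sketch below is a minimal illustration; the function name blast_input is hypothetical, and it assumes one keyword per line with the value separated by a space, as in the example input file for this script.<br />

```python
ALLOWED_PROGNAMES = {"blastp", "blastpgp", "bl2seq"}


def blast_input(query_file, **keywords):
    """Return the text of a blast input file (keyword value, one per line)."""
    lines = [f"QUERY_FILE {query_file}"]   # QUERY_FILE is the required keyword
    for key, value in keywords.items():
        key = key.upper()
        if key == "PROGNAME" and value not in ALLOWED_PROGNAMES:
            raise ValueError(f"unsupported PROGNAME: {value}")
        if isinstance(value, bool):
            value = "true" if value else "false"   # booleans are lower case
        lines.append(f"{key} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write jobname.inp, then run: $SCHRODINGER/blast jobname
    with open("myblast.inp", "w") as handle:
        handle.write(blast_input("query.aa", progname="blastp", database="pdb"))
```

Because the script reports failure through a non-zero return value, a wrapper that launches it can test the exit status rather than parsing the output.<br />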


11.2 bldstruct—Build a Protein Model from an Alignment<br />

The bldstruct script can be used to build initial (unrefined) protein models from an alignment<br />

to one or more templates via the Protein Local Optimization Program (PLOP).<br />

bldstruct handles all the tasks associated with the building of homology models via PLOP<br />

(i.e. parsing and preparation of input files, interaction with PLOP, and handling of output structures,<br />

etc.). All jobs handled by bldstruct are run under Schrödinger Job Control.<br />

Multithreaded execution (with OpenMP) can be enabled for bldstruct jobs by setting the<br />

environment variable OMP_NUM_THREADS. It is recommended to set this environment variable<br />

in the schrodinger.hosts file for hosts on which multithreading is supported—see<br />

Section 6.1 of the Installation Guide.<br />

Syntax<br />

$SCHRODINGER/bldstruct input-file [input-file2 ...] [options]<br />

The input file can be specified with or without the .inp extension. If multiple input files are<br />

specified, the chains are combined to build a multimer. Each input file should correspond to a<br />

single chain of the multimer, and be accompanied by the usual associated PDB template structure<br />

and alignment files. The output is a single structure that contains all of the chains, refined<br />

together; the job and the output file are named after the first input file.<br />

Options<br />

The -DEBUG and -HOST options are recognized by bldstruct. See Section 2.3 of the Job<br />

Control Guide for more information.<br />

Command Input File<br />

The command input file is named jobname.inp. This input file contains keyword-value pairs,<br />

one pair per line separated by one or more spaces. The keywords are given in Table 11.3.<br />

Table 11.3. Keywords for the bldstruct script<br />

Keyword syntax Description<br />

BUILD_DELETIONS {true | false} Turn on or off closure of the chain breaks near deletions in the<br />

alignment. If not closed, they will remain as chain breaks in the<br />

output structure. Default: true.<br />

BUILD_TRANSITIONS {true | false} Turn on or off closure of the junctions between templates in multitemplate<br />

homology modelling jobs. If not closed, they will remain<br />

as chain breaks in the output structure. Default: true.<br />

COMPOSITE_ARRAY string A string of integers indicating from which template (or, more specifically,<br />

from which alignment) coordinates for a given residue<br />

should be obtained. Thus, this string of integers must be equal in<br />

length to the query sequence being used in the alignment files.<br />

COMPOSITE_ARRAY is required for multiple template jobs. It is not<br />

required for single template jobs, but if it is present, it is ignored.<br />

KEEP_ROTAMERS {true | false} Specifies which side chains to optimize. If true, the rotamers for<br />

conserved side chains are preserved, and only non-conserved side<br />

chains are predicted. If false, all side chains are predicted. This<br />

keyword is ignored if SIDE_OPT is false. Default: true.<br />

MAX_INSERTION_SIZE n Specifies the longest insertion in the alignment that will be built.<br />

Insertions longer than this will be omitted, and not appear in the<br />

output structure. Residue numbering will, however, reflect the full query<br />

sequence. Default: 1000.<br />

MINIMIZE {true | false} Turn on or off minimization of regions that were not directly copied<br />

from the template structure. These regions primarily include portions<br />

of the structure involved in closing gaps or building insertions,<br />

but also include any side chains optimized due to the<br />

SIDE_OPT keyword. Default: true.<br />

SIDE_OPT {true | false} Turn on or off a side chain optimization stage. Default: true.<br />

template_ALIGN_FILE filename Required. The name of the file containing the alignment between<br />

this template and the query. Details on the format are provided in<br />

the Examples section below.<br />

template_HETERO_i ligand-spec Required if a ligand is present. Specifies a ligand to include from<br />

this template. The format of the specification is<br />

AAA C:###<br />

where AAA is the ligand’s three letter code, C is its chain ID, ### is<br />

its residue number. Multiple ligands from the template are numbered<br />

using i as an index, starting from zero.<br />

TEMPLATE_NAME template Required. A label used to identify a given template in subsequent<br />

data fields. For example, if TEMPLATE_NAME is set to 1BPJ_A,<br />

other template-specific fields are given as 1BPJ_A_STRUCT_FILE,<br />

1BPJ_A_ALIGN_FILE, and so on. The chain to be used is specified<br />

by an underscore and letter suffix: for example, for 1BPJ_A, chain<br />

A will be used. The chain suffix must be included in the label.<br />

template_NUMBER n Required. Used to identify a given template in the<br />

COMPOSITE_ARRAY (see below). Numbering goes from 1 to 9.<br />

template_STRUCT_FILE filename Required. The PDB file to be used for this particular template.<br />

USE_SYMMETRY {true | false} Turn on or off use of the crystallographic symmetry of the template<br />

file when constructing the homology model. Default: false.<br />

Files<br />

In addition to the input files (structure and alignment) and the job files (command input file,<br />

jobname.inp, and log file, jobname.log), bldstruct produces the following output files:<br />

jobname-out.mae Built model in Maestro format.<br />

jobname-out.pdb Built model in PDB format.<br />

Return Value and Errors<br />

bldstruct returns a value of 0 on successful completion, 1 otherwise. All error messages are<br />

printed to the file jobname.log. There is no comprehensive listing of possible errors as these<br />

can arise in a variety of programs and scripts.<br />

Examples<br />

The following is a sample input file for building on two templates:<br />

QUERY_OFFSET 0<br />

1TSV_NUMBER 1<br />

1TSV_STRUCT_FILE pdb1TSV.ent<br />

1TSV_ALIGN_FILE 1TSV.align<br />

TEMPLATE_NAME 1BPJ_A<br />

1BPJ_A_NUMBER 2<br />

1BPJ_A_STRUCT_FILE pdb1BPJ_A.ent<br />

1BPJ_A_ALIGN_FILE 1BPJ_A.align<br />

1BPJ_A_HETERO_0 UMP :317<br />

1BPJ_A_HETERO_1 ZN :35<br />

TAILS 0<br />

SIDE_OPT true<br />

MINIMIZE true<br />

KEEP_ROTAMERS true<br />

MAX_INSERTION_SIZE 1000<br />

BUILD_DELETIONS true<br />

BUILD_TRANSITIONS true<br />

USE_SYMMETRY false<br />

COMPOSITE_ARRAY 111111111111111111111112222222222222222222222222222222222221111111<br />



The alignment file format is as follows. The first line corresponds to the query sequence (indicated<br />

by ProbeAA:) and the second line corresponds to the template sequence (indicated by<br />

Fold AA:). A period is used as a gap character, and an X is used for non-standard residues.<br />

Below is a sample alignment file.<br />

ProbeAA: .AAEEKTEFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESA...PAALKEGVSKDDAEALKKALEEAG<br />

AEVEVK<br />

Fold AA: AAQEEKTEFDVVLKSFGQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAG<br />

AEVELK<br />

The sequences are split for formatting reasons, and should be on a single line in the actual file.<br />
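
The alignment file format and the COMPOSITE_ARRAY keyword can be handled together in a short Python sketch. The function names are hypothetical. The sketch also assumes that the query length required for COMPOSITE_ARRAY is the number of non-gap characters on the ProbeAA line; the manual does not state this explicitly, so check it against a working job before relying on it.<br />

```python
def read_alignment(text):
    """Return (query, template) aligned strings from alignment-file text."""
    rows = {}
    for line in text.splitlines():
        for label in ("ProbeAA:", "Fold AA:"):
            if line.startswith(label):
                rows[label] = line[len(label):].strip()
    if len(rows) != 2:
        raise ValueError("expected one ProbeAA: line and one Fold AA: line")
    query, template = rows["ProbeAA:"], rows["Fold AA:"]
    if len(query) != len(template):
        raise ValueError("aligned sequences differ in length")
    return query, template


def composite_array(segments, query_length):
    """Build a COMPOSITE_ARRAY string from (template_number, count) pairs."""
    if any(not 1 <= number <= 9 for number, _ in segments):
        raise ValueError("template numbers run from 1 to 9")
    array = "".join(str(number) * count for number, count in segments)
    if len(array) != query_length:
        raise ValueError("array length must equal the query length")
    return array


if __name__ == "__main__":
    query, template = read_alignment("ProbeAA: .ACD.EF\nFold AA: GACDHE.\n")
    n_query = len(query.replace(".", ""))   # periods are gap characters
    print("COMPOSITE_ARRAY", composite_array([(1, 3), (2, 2)], n_query))
```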

11.3 fr—Fold Recognition<br />

fr finds template proteins that are similar to the query sequence at both sequence and structure<br />

levels.<br />

The fold recognition program used in <strong>Prime</strong> differs from standard sequence search programs<br />

such as BLAST by taking into account secondary structure matching and profile-sequence<br />

matching. As a result, fr can find structural homologs with low sequence identity to the query<br />

(



Syntax<br />

$SCHRODINGER/fr jobname [options]<br />

Options<br />

The -DEBUG and -HOST options are recognized by fr. See Section 2.3 of the Job Control<br />

Guide for more information.<br />

Command Input File<br />

The command input file is named jobname.inp. This input file contains keyword-value pairs,<br />

one pair per line separated by one or more spaces. The keywords are given in Table 11.4.<br />

Table 11.4. Keywords for the fr script.<br />

Keyword syntax Description<br />

QUERY_NAME name Name of the query sequence<br />

QUERY_FILE filename Query sequence file, in FASTA format<br />

ALIFORMAT value Select format of alignments file. Default is Maestro format. For “normal”<br />

format set value to dev.<br />

BGNRES n The first residue of the query that is used in search for homologs. Default: 1<br />

(the first residue).<br />

ENDRES n The last residue of the query that is used. Default: the last residue of the<br />

complete query<br />

PREDSS[_i] filename Secondary structure prediction file of the query sequence, in CASP format.<br />

One occurrence of this keyword is required for each file, and there must be<br />

at least one SSP. If only one prediction is used, the keyword is PREDSS; if<br />

more predictions are used, the keyword is suffixed with an index, starting<br />

from zero: PREDSS_0, PREDSS_1, and so on.<br />

Files<br />

• Input file—named jobname.inp. Contains keywords for running the program.<br />

• Query sequence file—File containing the complete sequence of the query, in FASTA format.<br />

Specified in the input file.<br />

• Secondary structure prediction files in CASP format for the query sequence. See below<br />

for an example of CASP format.<br />

• Output Ranking File—File named jobname.out containing a list of templates ranked<br />

according to the FR scoring function as described above.<br />


• Output alignment file—File named jobname.ali containing pairwise alignment between<br />

the query and a template. By default, the alignment is in Maestro format (m2io). In order<br />

to view alignments in normal format (i.e. aligned residues from query and template are<br />

placed on top of each other), add the keyword ALIFORMAT to the input file with value set<br />

to dev.<br />

• Log file—File named jobname.log containing progress of the program, including warnings<br />

and error messages.<br />

Return Value and Errors<br />

The script returns 0 for success and 1 for failure.<br />

In case of failure, check the log file for details on what might have caused the program to<br />

fail. To turn on debugging output use the -DEBUG command line option as follows:<br />

$SCHRODINGER/fr -DEBUG jobname<br />

Examples<br />

The following file demonstrates the use of three SSPs.<br />

QUERY_NAME HypProtein_furiosus_truncated<br />

QUERY_FILE HP.seq<br />

PREDSS_0 prime_foldrecog_run1_HP-ssp1.casp<br />

PREDSS_1 prime_foldrecog_run1_HP-ssp2.casp<br />

PREDSS_2 prime_foldrecog_run1_HP-ssp3.casp<br />

BGNRES 1<br />

ENDRES 197<br />
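
The rule for naming the PREDSS keywords (no index for a single prediction, indices from zero for several) is easy to get wrong by hand. The following Python sketch generates the text of an fr input file; the function name fr_input and its parameters are hypothetical.<br />

```python
def fr_input(query_name, query_file, ssp_files, bgnres=None, endres=None):
    """Return the text of an fr input file for the given SSP files."""
    if not ssp_files:
        raise ValueError("at least one secondary structure prediction is required")
    lines = [f"QUERY_NAME {query_name}", f"QUERY_FILE {query_file}"]
    if len(ssp_files) == 1:
        lines.append(f"PREDSS {ssp_files[0]}")      # single prediction: no index
    else:
        # several predictions: PREDSS_0, PREDSS_1, and so on
        lines += [f"PREDSS_{i} {name}" for i, name in enumerate(ssp_files)]
    if bgnres is not None:
        lines.append(f"BGNRES {bgnres}")
    if endres is not None:
        lines.append(f"ENDRES {endres}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fr_input("HypProtein", "HP.seq", ["ssp1.casp", "ssp2.casp"], 1, 197), end="")
```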

The following is an example of a CASP format file. The CASP format has a minimum of 2<br />

columns per line, with the first being the one-letter code for the amino acid and the second<br />

being the secondary structure type. An optional third column gives the confidence level of the<br />

prediction.<br />

PFRMAT SS<br />

TARGET 1HL6:A<br />

AUTHOR xxxx-xxxx-xxxx<br />

REMARK written by Schrodinger LLC Software<br />

METHOD not applicable<br />

MODEL 1<br />

M C 1.00<br />

A C 1.00<br />

D H 1.00<br />

V H 1.00<br />

...<br />


E C 1.00<br />

K C 1.00<br />

R C 1.00<br />

R C 1.00<br />

R C 1.00<br />

END<br />
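
A file in this layout can be read with a few lines of Python. The sketch below is an illustration only: the function name parse_casp_ss is hypothetical, and it assumes that the header consists of the record names shown in the example and that the data ends at the END record.<br />

```python
HEADER_RECORDS = {"PFRMAT", "TARGET", "AUTHOR", "REMARK", "METHOD", "MODEL"}


def parse_casp_ss(text):
    """Return a list of (residue, ss_type, confidence) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] in HEADER_RECORDS:
            continue                      # skip blank lines and header records
        if fields[0] == "END":
            break                         # end of the prediction
        if len(fields) < 2 or len(fields[0]) != 1:
            raise ValueError(f"unexpected line: {line!r}")
        # The third column (confidence) is optional.
        confidence = float(fields[2]) if len(fields) > 2 else None
        records.append((fields[0], fields[1], confidence))
    return records


if __name__ == "__main__":
    sample = "PFRMAT SS\nTARGET 1HL6:A\nMODEL 1\nM C 1.00\nD H 1.00\nEND\n"
    print(parse_casp_ss(sample))
```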

11.4 multirefine—Multi-Step Extended Sampling<br />

The multirefine script automates extended sampling protocols, which consist of multiple<br />

executions of refinestruct on a set of structures. Multirefine handles the submission of the<br />

subjobs and the compilation of the results. Currently available protocols include extended loop<br />

sampling, ultra-extended loop sampling, and cooperative loop pair sampling.<br />

Syntax<br />

$SCHRODINGER/multirefine input-file [options]<br />

where input-file is the name of the command input file, with or without a .inp extension, and<br />

serves also as the default job name. The input file must be either the first or last command-line<br />

argument, and everything else on the command line is passed as is to Job Control.<br />

Options<br />

The standard Job Control command line options listed in Table 11.1 are accepted. There are no<br />

options specific to this program.<br />

Command Input File<br />

Most of the keywords are the same as the general keywords and the keywords for protocol 1<br />

(loop and helix refinement) for refinestruct: see Section 11.5 on page 98 for details. The<br />

differences and new keywords are described below.<br />

Table 11.5. Keywords for the multirefine input file<br />

Keyword syntax Description<br />

HOST hostname Job host to which the subjobs will be submitted. The top-level driver script<br />

will run on the host specified by the -HOST command-line argument (or on<br />

the launch host if none is specified). If hostname is the same as that specified<br />

by -HOST or is localhost, the subjobs will run on the same host as the top-level<br />

driver (or be submitted to the same queuing system if the top-level host<br />

is a queue). Otherwise, the driver will submit subjobs using hostname as their<br />

command-line argument. Not available with refinestruct.<br />


LOOP_i_RES_j Specify beginning and end of loops. Same as for refinestruct. Required if<br />

SEG_LIST is not given. Depending on the protocol, 1, 2, or an arbitrary number<br />

of loops may be specified. For example, to specify the beginning and end<br />

of loop 0, from residue 10 to residue 20 in chain A, use the following:<br />

LOOP_0_RES_0 A:10<br />

LOOP_0_RES_1 A:20<br />

MAX_JOBS n Maximum number of subjobs that can run at one time. This can be set to a<br />

value less than the value of THREADS if sufficient resources are not available.<br />

Not available with refinestruct.<br />

PRIME_TYPE n Type of refinement to be carried out. Can be specified as either a number or<br />

text string. Different from refinestruct. Available choices are:<br />

0 or default: Equivalent to running a single loop refinement using<br />

refinestruct. Not particularly useful except for completeness and testing<br />

purposes. Does not recognize THREADS keyword.<br />

1 or extended: Used for Extended Sampling from the GUI. Does not recognize<br />

MIN_OVERLAP and MAX_CA_MOVEMENT keywords, as these are used<br />

internally by the protocol. Accepts multiple loops as input which will be<br />

run sequentially.<br />

2 or long_loop: No longer used.<br />

3 or loop_pair: Used for Cooperative Loop Sampling from the GUI.<br />

Exactly two loops must be specified.<br />

4 or long_loop_2: Used for Ultra-extended Sampling from the GUI. Does<br />

not recognize MIN_OVERLAP and MAX_CA_MOVEMENT keywords, as these<br />

are used internally by the protocol. Also does not recognize THREADS keyword,<br />

as the sampling is determined by the loop length. Only one loop can<br />

be specified.<br />

STRUCT_FILE Required. The input structure file in Maestro format. Same as for<br />

refinestruct.<br />

SEG_LIST filename Name of a file containing the list of loops to be refined. For the example given<br />

for the keyword LOOP_i_RES_j, the file would consist of the single line:<br />

loop A:10 A:20<br />

Not available with refinestruct.<br />

THREADS n Number of simultaneous jobs to be run during selected refinement stages.<br />

Determines the sampling level for some protocols. Note: the jobs specified by<br />

THREADS do not have to actually run in parallel—see MAX_JOBS. Not available<br />

with refinestruct.<br />
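
Loops can be given either as LOOP_i_RES_j keywords or in a SEG_LIST file, and the same residue pairs can produce both forms. In the Python sketch below the function names are hypothetical, and writing one "loop" line per loop in the SEG_LIST file is an assumption extrapolated from the single-loop example in Table 11.5.<br />

```python
def loop_keywords(loops):
    """Return LOOP_i_RES_j input lines for a list of (first, last) residues."""
    lines = []
    for i, (first, last) in enumerate(loops):
        lines.append(f"LOOP_{i}_RES_0 {first}")   # beginning of loop i
        lines.append(f"LOOP_{i}_RES_1 {last}")    # end of loop i
    return lines


def seg_list_text(loops):
    """Return SEG_LIST file text, assuming one 'loop' line per loop."""
    return "".join(f"loop {first} {last}\n" for first, last in loops)


if __name__ == "__main__":
    loops = [("A:10", "A:20")]
    print("\n".join(loop_keywords(loops)))
    print(seg_list_text(loops), end="")
```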


Files<br />

In addition to the jobname.log and jobname.err files, multirefine produces the following<br />

output files:<br />

jobname-out.mae Final structures, all in a single file.<br />

jobname.pdb PDB format file corresponding to lowest-energy structure.<br />

Other structures will have corresponding files jobname-2.pdb, and so on.<br />

jobname.restart Contains information that allows a job to be restarted if it<br />

fails.<br />

jobname-run-N.archive.zip Archives of subjob output files.<br />

Temporary files, which are normally deleted when no longer needed unless the debugging<br />

option is used, are named according to the following patterns:<br />

jobname-conf-N.* Files associated with temporary structures. All active structures in the<br />

ensemble have a unique serial number N.<br />

jobname-run-N.* Files associated with a given subjob. All subjobs have a unique serial<br />

number N.<br />

Return Value and Errors<br />

multirefine returns a value of 0 on successful completion, 1 otherwise. All scripts run by<br />

multirefine that return a value of 1 also print a message to the file jobname.err. This file is<br />

empty otherwise, as all non-fatal messages are printed to the file jobname.log. Output from<br />

Job Control goes to standard output, and any errors arising before the driver script has started<br />

likewise go to standard output or standard error.<br />

Restarting an Incomplete Job<br />

If a job fails, you can restart it by submitting it again from the same directory. The<br />

jobname.restart file and the archives of subjobs must be present in the directory, and the<br />

directory must be accessible from the host on which the driver job runs. Also, the job records<br />

for the failed jobs must not be purged from the job database, so that the restart mechanism has<br />

access to information on the failed subjobs. If the job records are missing, the subjobs are<br />

repeated.<br />

Failed jobs that were started from Maestro must be restarted from the command line. You can<br />

use the -PROJ and -DISP options to assign the job to the Maestro project and set the incorporation<br />

mode.<br />

Once the job finishes successfully, the subjob archives and the restart file are removed.<br />


It is not necessary to use -LOCAL or -SAVE to ensure that a job can be restarted, but there must<br />

be sufficient space in the working directory for the subjob archives.<br />

Examples<br />

The following is an example of an input file produced by Maestro for a Serial loop sampling<br />

(Extended Medium) job:<br />

STRUCT_FILE prime_refine.mae<br />

PRIME_TYPE 1<br />

LOOP_0_RES_0 _:35<br />

LOOP_0_RES_1 _:40<br />

THREADS 4<br />

MAX_JOBS 2<br />

NUM_OUTPUT_STRUCT 4<br />

RES_SPHERE 7.50<br />

USE_CRYSTAL_SYMMETRY false<br />

USE_RANDOM_SEED true<br />

SEED 0<br />

HOST localhost<br />

Setting PRIME_TYPE to 1 selects the extended sampling protocol; setting THREADS to 4 corresponds<br />

to the Medium sampling level. Setting MAX_JOBS to 2 ensures that no more than 2<br />

subjobs run at once (for example, on a 2-processor machine).<br />
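The input file above is a plain list of keyword-value pairs, so it can be generated from a script. The following Python sketch is illustrative only: the helper name format_input is hypothetical and is not part of Prime, and it assumes that one keyword and its value per line, separated by whitespace, is all the format requires.

```python
def format_input(settings):
    """Render (keyword, value) pairs as input-file text, one pair per line.

    Hypothetical helper, not part of Prime. Booleans are written as
    true/false, matching the Maestro-generated example above.
    """
    lines = []
    for keyword, value in settings:
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{keyword} {value}")
    return "\n".join(lines) + "\n"


# Settings copied from the Serial loop sampling (Extended Medium) example.
settings = [
    ("STRUCT_FILE", "prime_refine.mae"),
    ("PRIME_TYPE", 1),
    ("LOOP_0_RES_0", "_:35"),
    ("LOOP_0_RES_1", "_:40"),
    ("THREADS", 4),
    ("MAX_JOBS", 2),
    ("NUM_OUTPUT_STRUCT", 4),
    ("RES_SPHERE", "7.50"),
    ("USE_CRYSTAL_SYMMETRY", False),
    ("USE_RANDOM_SEED", True),
    ("SEED", 0),
    ("HOST", "localhost"),
]

text = format_input(settings)
print(text, end="")
```

The text could then be written to jobname.inp before the job is submitted.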

Environment Variables<br />

In addition to the standard Schrödinger environment variables, multirefine recognizes the<br />

following environment variables:<br />

PSP_RB_DEBUG, PSP_DEBUG If this environment variable is set, debugging will be turned on and the<br />

subdirectory containing intermediate files will be kept. Equivalent to using<br />

the command-line option -DEBUG.<br />

PSP_RB_LOCAL, PSP_LOCAL If this is set, all calculations will run in the local launch directory. Equivalent<br />

to using the command-line option -LOCAL.<br />

TFM_QUIET, TFM_VERBOSE These environment variables can be set to decrease or increase the job-related<br />

output in the log file.<br />


11.5 refinestruct—Refine a Protein Structure<br />

The refinestruct script runs protein structure refinement jobs (side chain prediction and<br />

energy minimization as well as loop and helix prediction) using the Protein Local Optimization<br />

Program (PLOP).<br />

refinestruct handles all the tasks associated with the refinement of protein structures via<br />

PLOP: parsing and preparation of input files, interaction with PLOP, and handling of output<br />

structures, etc. It also fixes the structural issues that are fixed by primefix.py (see<br />

Section 12.3 on page 118). All jobs handled by refinestruct are run under Schrödinger Job<br />

Control.<br />

Multithreaded execution (with OpenMP) can be enabled for refinestruct jobs by setting the<br />

environment variable OMP_NUM_THREADS. It is recommended to set this environment variable<br />

in the schrodinger.hosts file for hosts on which multithreading is supported; see<br />

Section 6.1 of the Installation Guide.<br />

Syntax<br />

$SCHRODINGER/refinestruct [options] jobname<br />

The input is read from the file jobname.inp and the structural output is written to the file<br />

jobname-out.mae. If the input structure file has alternate coordinates, the refinement is<br />

performed on the first set of alternate coordinates.<br />

Options<br />

The refinestruct script accepts the options listed in Table 11.1.<br />

Command Input File<br />

The command input file contains a list of keywords that control the operation of the<br />

refinestruct script, one per line. The keywords are listed in the tables below. Generally<br />

valid keywords are listed in Table 11.6 and protocol-specific keywords are listed in Table 11.7,<br />

under the protocol to which they apply.<br />

Table 11.6. General keywords for the refinestruct script<br />

Keyword syntax Description<br />

ECUTOFF value Energy cutoff for return of conformations. Returns all conformations<br />

within value kcal/mol of the lowest-energy structure. It can be used in<br />

conjunction with NUM_OUTPUT_STRUCT, to impose a maximum on the<br />

number of such conformations to return. As with<br />

NUM_OUTPUT_STRUCT, it only applies to single loop refinements.<br />


EXT_DIEL value Dielectric constant of the continuum solvation model. Default: 80.0.<br />

Used with SGB_MOD sgbnp.<br />

INT_DIEL value Dielectric constant used within the radius of an atom. Default: 1.0.<br />

NUM_OUTPUT_STRUCT n Number of conformations to return for single loop refinements. This has<br />

no effect on helix refinements or jobs composed of more than one refinement<br />

(e.g. 2 loops, 1 helix + 1 loop, etc.). All structures are returned in a<br />

single structure file in Maestro format. Default: 1.<br />

PAIR_CONSTRAINT_i atom1,atom2,value Specify a constraint on a pair of atoms. The format of the atom specifications<br />

is given at the beginning of this chapter. The value is the distance<br />

between the atoms in angstroms.<br />

PRIME_TYPE type Type of refinement to perform. The available options are:<br />

SIDE_PRED Predict Side Chains (prefix SC)<br />

LOOP_BLD Predict Loops and Helices<br />

REAL_MIN Minimize Energy (prefix MINI)<br />

SITE_OPT Active Site Minimization (including ligand)<br />

ENERGY Energy Calculation<br />

RESIDUE_i spec Specify a residue to refine. The numbering i begins at 0, and increases<br />

incrementally. The format of the residue specification spec is given at<br />

the beginning of this chapter.<br />

RETAIN_CORRECTIONS {yes | no} Retain corrections made when duplicate residue numbers are found. The<br />

duplications detected are between non-protein residues or between protein<br />

and non-protein residues. The non-protein residue is assigned a new,<br />

unique chain name temporarily. If you set this keyword to yes, the new<br />

chain name is written to the output file. Default: no.<br />

SEED n The integer to use as the seed for the random number generator if<br />

USE_RANDOM_SEED is false. If it is true, SEED is ignored. Default: -1,<br />

meaning that the program will generate a random seed rather than use<br />

the supplied seed.<br />

SELECT string Specify how the residues to refine are selected. Available<br />

options are:<br />

all Refine all residues<br />

pick Refine the specific residues, indicated by included<br />

RESIDUE_i parameters (see above).<br />

file Read list of residues to refine from a file. The file consists of a<br />

series of lines with a residue specification on each line.<br />


SGB_MOD {sgbnp | vdgbnp} Specify the implicit solvation model to use. The choices are:<br />

sgbnp: use the standard generalized Born model.<br />

vdgbnp: use the variable-dielectric generalized Born model, which<br />

uses different dielectric constants based on the polar or charged nature<br />

of the protein residue: 4 for Lys, 3 for Glu/Arg, 2 for Asp/His, 1 for all<br />

others. This method improves loop predictions.<br />

Default: vdgbnp.<br />

STRUCT_FILE filename Input structure file in Maestro format.<br />

USE_CRYSTAL_SYMMETRY {yes | no} Set to true if the input structure contains unit cell information and crystal<br />

symmetric atoms are to be included in calculation. Default: no.<br />

USE_RANDOM_SEED {yes | no} Indicates whether to use a random seed for the random number generator.<br />

This keyword has no effect on Minimization tasks. Default: no.<br />

USE_MAE_CHARGES {yes | no} Indicates whether to use atomic partial charges from the Maestro input<br />

file for untemplated residues (ligands, cofactors). Default: no.<br />

USE_MEMBRANE {yes | no} Indicates whether to use the implicit membrane model. The model must<br />

be set up from the Setup Membrane panel in Maestro. Default: no.<br />

USE_SGB {yes | no} Use the SGB implicit solvation model. If the model is not used, calculations<br />

are run in vacuum with a dielectric of 6.0. Default: no.<br />

Table 11.7. Protocol-specific keywords for the refinestruct script<br />

Keyword syntax Description<br />

LOOP_BLD<br />

CA_CONSTRAINT_i spec,[x,y,z]value Specify a constraint on the position of a specific alpha carbon atom. The<br />

format of the residue specification spec is given at the beginning of this<br />

chapter. The optional coordinates specify the target position; if omitted,<br />

the target position is the initial position. The value is the maximum distance<br />

that the C-alpha atom can move from the target position, in angstroms.<br />

Specifying a target position allows you to move a loop to a<br />

desired location.<br />

LOOP_i_RES_j spec Specify residue j for defining loop i. Loops are numbered sequentially,<br />

beginning at 0. Two occurrences of this keyword are required, with j=0<br />

(the beginning of the loop) and j=1 (the end of the loop). spec is a residue<br />

specification as defined above.


HELIX_BUILD spec1/spec2 Build the specified sequence as a helix. The format of the residue specifications<br />

spec1 and spec2 is given at the beginning of this chapter. You<br />

should include enough residues on either side of the helical region in the<br />

loop prediction to ensure that sufficient flexibility is available to build<br />

the loop.<br />

MAX_CA_MOVEMENT value Specify the maximum distance that any alpha carbon atom in the loop<br />

should be allowed to move during refinement. Omitting this parameter<br />

indicates that no restriction should be applied.<br />

RES_SPHERE value All side chains that lie within this distance of the loop will also be optimized<br />

during refinement. This allows these nearby side chains to “react”<br />

to the loop being predicted. Default: 0.0.<br />

MIN_OVERLAP value Fraction of van der Waals distance used to define a clash. Smaller values<br />

allow structures with worse potential atomic overlaps to be generated.<br />

Default: 0.70.<br />

SITE_OPT<br />

LIGAND string Residue specification for the ligand.<br />

NPASSES n Number of passes through side chain optimization of the protein residues<br />

followed by minimization of the protein residues (including the<br />

backbone). These two steps are repeated the specified number of times<br />

before proceeding to optimization of the protein and the ligand.<br />

Default: 2. Files written from Maestro have NPASSES set to 1.<br />

Files<br />

In addition to the input structure file, defined by the value of STRUCT_FILE, and the job files<br />

(the command input file jobname.inp, and the log file jobname.log), refinestruct produces<br />

the following output file:<br />

jobname-out.mae Final structures, all in a single file.<br />

Return Value<br />

refinestruct returns a value of 0 on successful completion, 17 for license errors, and 1<br />

otherwise. All error messages are printed to the file jobname.log. There is no comprehensive<br />

listing of possible errors as these can arise in a variety of programs and scripts.<br />
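A wrapper script might branch on the return values described above. The following Python sketch is a minimal illustration; the function name describe_refinestruct_exit and its message strings are hypothetical, and only the numeric codes (0, 17, and 1 otherwise) come from this manual.

```python
def describe_refinestruct_exit(code, jobname):
    """Map a refinestruct return value to a short message (hypothetical helper)."""
    if code == 0:
        return f"{jobname}: completed successfully"
    if code == 17:
        return f"{jobname}: license error, see {jobname}.log"
    # Any other value is treated as a general failure; details are in the log.
    return f"{jobname}: failed, see {jobname}.log"


for code in (0, 17, 1):
    print(describe_refinestruct_exit(code, "myjob"))
```

Because all error messages go to jobname.log, the messages point at that file rather than trying to classify the failure further.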


Environment Variables<br />

In addition to those used for all Schrödinger software, refinestruct uses the environment<br />

variable PSP_OPLS_VERSION to define the OPLS force field version. The default value is<br />

2005. It can be set to 2001 to use the earlier set of force field parameters for ligands and nonstandard<br />

residues. The OMP_NUM_THREADS environment variable is also supported for<br />

OpenMP multithreaded execution (see above).<br />

Examples<br />

The following is a sample minimization input file:<br />

STRUCT_FILE input-structure.mae<br />

PRIME_TYPE REAL_MIN<br />

SELECT pick<br />

RESIDUE_0 A:1<br />

RESIDUE_1 A:20<br />

RESIDUE_2 A:23<br />

RESIDUE_3 A:26<br />

RESIDUE_4 A:27<br />

RESIDUE_5 A:30<br />

RESIDUE_6 A:31<br />

RESIDUE_7 A:36<br />

RESIDUE_8 A:40<br />

RESIDUE_9 A:42<br />

RESIDUE_10 A:53<br />

USE_RANDOM_SEED false<br />

SEED 0<br />

The arrangement in columns is for ease of reading only—the spacing is not significant.<br />
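Because the RESIDUE_i keywords are numbered consecutively from 0, a long selection is easier to generate than to type. The following Python sketch shows one way to do this; the helper names residue_lines and minimization_input are hypothetical and are not part of Prime.

```python
def residue_lines(specs):
    """Number residue specifications consecutively, starting at RESIDUE_0."""
    return [f"RESIDUE_{i} {spec}" for i, spec in enumerate(specs)]


def minimization_input(struct_file, specs):
    """Build the text of a minimization input file for picked residues."""
    lines = [
        f"STRUCT_FILE {struct_file}",
        "PRIME_TYPE REAL_MIN",
        "SELECT pick",
    ]
    lines += residue_lines(specs)
    return "\n".join(lines) + "\n"


# The first three residues from the sample file above.
sample = minimization_input("input-structure.mae", ["A:1", "A:20", "A:23"])
print(sample, end="")
```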

11.6 ska—Protein Structure Alignment<br />

The SKA program carries out structural alignment of proteins by aligning protein backbones at<br />

different levels of detail.<br />

The ska script provides an interface to the SKA structural alignment program. As with other<br />

computational programs in Prime, its operation is controlled through simple input files. The<br />

allowed keywords and their values are discussed in more detail below. The ska script allows<br />

you to run the SKA program in three different modes under Job Control:<br />

• align mode: SKA aligns two or more protein structures to one another.<br />

• dbcreate mode: SKA creates databases for use with scan mode. You should split PDB<br />

files with multiple chains into single-chain files before adding them to the database, as<br />

SKA has difficulty aligning to multiple chains.<br />

• scan mode: SKA searches a database of protein structures and identifies potential structural<br />

analogues. By default a non-redundant database of protein chains from the PDB is<br />

searched.<br />

Syntax<br />

$SCHRODINGER/ska [options] jobname<br />

By default the input is read from the file jobname.inp and the output is written to<br />

jobname.out.<br />

In dbcreate mode, ska generates no output other than the explicitly defined database and a<br />

log file; see below for details.<br />

Options<br />

The only option accepted by ska is -DEBUG, which turns on debugging mode for both Job<br />

Control and the SKA program.<br />

Command Input File<br />

The input file consists of lines containing keyword-value pairs, one per line. The keywords are<br />

described in Table 11.8. The set of allowed and required keywords depends on the mode,<br />

which is determined by the MODE keyword.<br />

Return Value and Errors<br />

The script returns 0 for success and 1 for failure.<br />

In case of failure, check the log file for details on what might have caused the program to<br />

fail. To turn on debugging output use the -DEBUG command line option as follows:<br />

$SCHRODINGER/ska -DEBUG jobname<br />

Environment Variables<br />

The ska program handles all required environment variables internally and requires only that<br />

the environment variable SCHRODINGER be set. Should the environment variable TROLLTOP<br />

exist in the current environment it will be overwritten.<br />


Table 11.8. Keyword syntax for ska input file<br />

Keyword syntax Mode Description<br />

MODE mode all The SKA operating mode; allowed values are:<br />

*align*: structurally align two or more protein structures<br />

to one another. This is the default mode.<br />

*scan*: given a query structure, scan a database of protein<br />

structures and identify potential structural homologues.<br />

*dbcreate*: create a database that can be searched in<br />

scan mode from a list of PDB files.<br />

QUERY_FILE filename [name] *align*, *scan* Required. The input query structure. In *align* mode,<br />

the structure against which other proteins are aligned. In<br />

*scan* mode, the structure used for scanning the database.<br />

The optional name (*align* mode only) sets the<br />

name of this structure to a value other than the filename<br />

(the default). The name is displayed in the output file<br />

(see Examples, below). name cannot contain the file separator<br />

character (‘/’ on UNIX).<br />

TEMPLATE filename [name] *align* Required. PDB structures that are to be aligned to the<br />

query structure, set by QUERY_FILE. The TEMPLATE<br />

keyword can appear multiple times in order to be able to<br />

carry out multiple structure alignments (aligning more<br />

than one template structure to the query structure). The<br />

optional name sets the name of this entry to a value other<br />

than the filename (the default). The name is displayed in<br />

the output file (see Examples, below). name cannot contain<br />

the file separator character (‘/’ on UNIX).<br />

DATABASE filename *scan* The absolute path to the database to scan with the query<br />

structure. The database can be generated in dbcreate<br />

mode. Default: database supplied with <strong>Prime</strong>.<br />

DBNAME filename *dbcreate* The name of the file in which to store the database created<br />

by ska. filename must be the file name without any path<br />

specification.<br />

FILELIST filename *dbcreate* A file containing a list of absolute filenames for structures<br />

(in PDB format) that are to be included in a database<br />

of templates that can be searched with ska in scan<br />

mode. Absolute filenames are required to avoid having<br />

to copy large numbers of PDB files to the temporary<br />

directory in which the job is run.<br />

GAP_OPEN value *align*, *scan* Gap initiation penalty.<br />

GAP_DEL value *align* Deletion penalty.<br />


OUTPUT_FILE filename *align*, *scan* File in which to store the structural alignment. Default is<br />

jobname.out.<br />

ALISTRUCS_OUTFILE filename *align* File in which to store the 3D coordinates of the aligned<br />

structures.<br />

PSD_THRESHOLD value *align*, *scan* PSD threshold for inclusion in the MSA. Overridden by<br />

RECKLESS.<br />

SSE_MINSIM value *align*, *scan* Minimum Secondary Structural Element (SSE) similarity<br />

for the alignment.<br />

SSE_MINLEN value *align*, *scan* Minimum SSE length for the alignment.<br />

SSE_WINLEN value *align*, *scan* Align only SSE windows of length value.<br />

SWITCH [true | false] *align*, *scan* Carry out global alignment first and then align subwindows<br />

if no similarity was found.<br />

ORDER [seed | psd] *align* psd: order the alignment by PSD to the seed structure.<br />

CLEAN *align* Exclude structures with a PSD < PSD_THRESHOLD from<br />

the alignment.<br />

RECKLESS [true | false] *align*, *scan* Force the alignment of structures even if no similarity is<br />

found.<br />

Examples<br />

The following minimal input file (assumed to be called align.inp) structurally aligns two<br />

protein structures to one another:<br />

QUERY_FILE query.pdb<br />

TEMPLATE template.pdb<br />

where query.pdb is the name of the PDB file containing the query structure (the structure to<br />

which we want to align) and template.pdb contains the template structure (the structure we<br />

want to align to the query).<br />

When run using $SCHRODINGER/ska align it should produce the following output:<br />

• align.out—a file containing the structural alignment<br />

• align.log—a log file that contains log information and diagnostics on the run<br />


Note that the actual 3D coordinates of the superposed structures are not written out by default.<br />

You must use ALISTRUCS_OUTFILE to write out the structures.<br />

The following input file (align.inp):<br />

QUERY_FILE query.pdb<br />

TEMPLATE template.pdb<br />

ALISTRUCS_OUTFILE superposition.pdb<br />

saves the 3D coordinates of the aligned structures in the file superposition.pdb.<br />

The following is a sample input file for database creation in dbcreate mode:<br />

MODE dbcreate<br />

FILELIST filelist.dat<br />

DBNAME mydatabase.dat<br />

The file filelist.dat contains the file names for the templates, including the path, as<br />

follows:<br />

/scr/templates/template-1.pdb<br />

/scr/templates/template-2.pdb<br />

/scr/templates/template-3.pdb<br />

/scr/templates/template-4.pdb<br />
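The two files needed for dbcreate mode can be prepared together. The following Python sketch is illustrative only: the function dbcreate_files is hypothetical, and it simply enforces the two rules stated in Table 11.8, namely that FILELIST entries are absolute filenames and that DBNAME carries no path.

```python
import os


def dbcreate_files(pdb_paths, dbname, filelist_name="filelist.dat"):
    """Return (filelist_text, input_text) for a ska dbcreate job.

    Hypothetical helper, not part of Prime.
    """
    # DBNAME must be a bare file name without any path specification.
    if os.path.basename(dbname) != dbname:
        raise ValueError("DBNAME must be a file name without any path")
    # FILELIST entries must be absolute filenames.
    absolute = [os.path.abspath(path) for path in pdb_paths]
    filelist_text = "\n".join(absolute) + "\n"
    input_text = (
        "MODE dbcreate\n"
        f"FILELIST {filelist_name}\n"
        f"DBNAME {dbname}\n"
    )
    return filelist_text, input_text


filelist_text, input_text = dbcreate_files(
    ["template-1.pdb", "template-2.pdb"], "mydatabase.dat"
)
print(input_text, end="")
```

Relative names such as template-1.pdb are resolved against the current working directory, so the script should be run from the directory that holds the PDB files, or be given full paths.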

A minimal input file for a scan is as follows:<br />

MODE scan<br />

QUERY_FILE query.pdb<br />

Running ska with this input file would scan the database supplied with <strong>Prime</strong> for structural<br />

analogs. The output from the structural search contains pairwise alignments between the query<br />

structure and every structure in the database (assuming the two structures are sufficiently<br />

similar). The results can be analyzed in a more convenient way with the graphical SKA Results<br />

Viewer utility, $SCHRODINGER/utilities/SkaResultsViewer.<br />

The use of the optional name argument is illustrated next. For the following input file:<br />

QUERY_FILE query.pdb 1ctf<br />

TEMPLATE 1dd3_A.pdb<br />

the output file is as follows:<br />

.........+.........+.........+.........+.........+.........+.....<br />

1ctf 1 ceeeeeecc<br />

1dd3_A 1 cchhhhhhhhhhhchhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhccceeeeeecc<br />

1ctf 1 ---------------------------------------------------------EFDVILKAA<br />

1dd3_A 1 MTIDEIIEAIEKLTVSELAELVKKLEDKFGVTAAAPVAVAAAPVAGAAAGAAQEEKTEFDVVLKSF<br />


..........+.........+.........+.........+.........+.........+.<br />

1ctf 62 ccchhhhhhhhhhhhccchhhhhhhhhhc c eeeeeccchhhhhhhhhhhhhhcceeeee<br />

1dd3_A 67 ccchhhhhhhhhhhhcchhhhhhhhhhhcccccccceeccchhhhhhhhhhhhhhcceeeee<br />

1ctf 62 GANKVAVIKAVRGATGLGLKEAKDLVESA-P--AALKEGVSKDDAEALKKALEEAGAEVEVK<br />

1dd3_A 67 GQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAGAEVELK<br />

RMSD: 0.763733<br />

PSD: 0.024524<br />

Without the name argument for the QUERY_FILE keyword, the output would be<br />

.........+.........+.........+.........+.........+.........+.....<br />

query 1 ceeeeeecc<br />

1dd3_A 1 cchhhhhhhhhhhchhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhccceeeeeecc<br />

query 1 ---------------------------------------------------------EFDVILKAA<br />

1dd3_A 1 MTIDEIIEAIEKLTVSELAELVKKLEDKFGVTAAAPVAVAAAPVAGAAAGAAQEEKTEFDVVLKSF<br />

..........+.........+.........+.........+.........+.........+.<br />

query 62 ccchhhhhhhhhhhhccchhhhhhhhhhc c eeeeeccchhhhhhhhhhhhhhcceeeee<br />

1dd3_A 67 ccchhhhhhhhhhhhcchhhhhhhhhhhcccccccceeccchhhhhhhhhhhhhhcceeeee<br />

query 62 GANKVAVIKAVRGATGLGLKEAKDLVESA-P--AALKEGVSKDDAEALKKALEEAGAEVEVK<br />

1dd3_A 67 GQNKIQVIKVVREITGLGLKEAKDLVEKAGSPDAVIKSGVSKEEAEEIKKKLEEAGAEVELK<br />

RMSD: 0.763733<br />

PSD: 0.024524<br />
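The RMSD and PSD values at the end of the alignment output can be picked out with a short script. The following Python sketch assumes that each score appears on its own line in the form shown above (label, colon, number); the function name parse_scores is hypothetical and is not part of Prime.

```python
import re

# Matches lines such as "RMSD: 0.763733" or "PSD: 0.024524".
SCORE_LINE = re.compile(r"^\s*(RMSD|PSD):\s*(\S+)\s*$")


def parse_scores(output_text):
    """Collect the RMSD and PSD values found in ska alignment output."""
    scores = {}
    for line in output_text.splitlines():
        match = SCORE_LINE.match(line)
        if not match:
            continue
        try:
            scores[match.group(1)] = float(match.group(2))
        except ValueError:
            # Skip a score line whose value is not a number.
            continue
    return scores


sample_output = "RMSD: 0.763733\nPSD: 0.024524\n"
scores = parse_scores(sample_output)
print(scores)
```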

11.7 ssp—Secondary Structure Prediction<br />

The ssp script provides an interface to the secondary structure prediction programs SSPro,<br />

which is bundled with <strong>Prime</strong>, as well as PSIPRED, which is available from a third party.<br />

Syntax<br />

$SCHRODINGER/ssp [options] jobname<br />

By default the input is read from the file jobname.inp and the output is written to<br />

jobname.out.<br />

Options<br />

The standard Job Control command line options listed in Table 11.1 are accepted. There are no<br />

options specific to this program.<br />


Command Input File<br />

The command input file consists of lines containing keyword-value pairs, one per line. The<br />

keywords are described in Table 11.9.<br />

Table 11.9. Keywords for the ssp input file.<br />

Keyword syntax Description<br />

QUERY_FILE filename Required. Name of the FASTA file containing the query sequence.<br />

METHOD program Required. Name of the secondary structure prediction program to run.<br />

Allowed values are sspro, psipred, profking, and all. The value all runs<br />

all available programs. If the input file contains multiple METHOD keywords,<br />

only the last one is used.<br />

OUTPUT_FILE filename Optional. Name of the output file. Default is jobname.out.<br />

Return Value and Errors<br />

ssp returns 0 if successful and a non-zero value in all other cases.<br />

Environment Variables<br />

The environment variables that can be used to control the secondary structure prediction<br />

programs are listed in Table 11.10.<br />

Table 11.10. Environment variables for secondary structure prediction programs.<br />

Environment Variable Description<br />

PSP_SSPRO_DB BLAST database to be searched when assembling the multiple sequence<br />

alignment used in the SSPro prediction<br />

PSP_PSIPRED_DB BLAST database to be searched when assembling the sequence profile<br />

used in the PSIPRED prediction<br />

Examples<br />

The following minimal input file, called sspro.inp, generates an SSPro secondary structure<br />

prediction for the sequence stored in file query.fasta:<br />

QUERY_FILE query.fasta<br />

METHOD sspro<br />

The prediction produced by the back-end is written to the file sspro.out.<br />


The following input file, assumed to be called all-methods.inp, instructs the ssp script to<br />

produce secondary structure predictions for all available methods: SSPro and, depending on<br />

the local installation, PSIPRED.<br />

QUERY_FILE query.fasta<br />

METHOD all<br />

11.8 sta—Single Template Alignment<br />

The Single Template Alignment (STA) program in Prime is designed for protein sequences<br />

with medium to high sequence identity (>25%). STA differs from standard sequence alignment<br />

programs such as BLAST by taking into account secondary structure matching as well as<br />

profile-sequence matching. As a result, STA often generates a better alignment than BLAST in the<br />

regions where sequence conservation is relatively weak. STA can also take in constraints, such<br />

as user-selected residue pairs to be aligned together and/or residues in either query or template<br />

that are not to be aligned (gaps).<br />

STA can use templates directly from the PDB, as selected by running the Find Homologs step,<br />

and can also use templates from the <strong>Prime</strong> fold library, as selected in the Fold Recognition<br />

step. For the latter, you can export the template from the Fold Recognition table, and save it as<br />

a PDB file.<br />

Syntax<br />

$SCHRODINGER/sta [options] jobname<br />

The input is read from the file jobname.inp and the output is written to jobname.out.<br />

Options<br />

The sta script accepts the -DEBUG and -HOST options, as described in Table 11.1.<br />

Command Input File<br />

The input file consists of lines containing keyword-value pairs, one per line. The keywords are<br />

described in Table 11.11.<br />

Files<br />

• Input file—named jobname.inp. Contains keywords for running the program.<br />

• Query sequence file—File containing the complete sequence of the query, in FASTA format.<br />

Specified in the input file.<br />

• Secondary structure prediction files in CASP format for the query sequence. See page 93<br />

for an example of CASP format.<br />


Table 11.11. Keywords for the sta program.<br />

Keyword syntax Description<br />

QUERY_NAME name Name of the query sequence.<br />

QUERY_FILE filename File name of the query sequence in FASTA format.<br />

FORMAT string Alignment file format. Default is Maestro. Set to dev for plain text format—see<br />

page 111 for an example.<br />

BGNRES n The first residue of the query that is used in alignment. Default: 1.<br />

ENDRES n The last residue of the query sequence that is used. Default: the last residue<br />

of the original query sequence.<br />

PREDSS[_i] filename Secondary structure prediction file of the query sequence, in CASP format.<br />

One occurrence of this keyword is required for each file, and there<br />

must be at least one SSP file specified. If only one prediction is used, the<br />

keyword is PREDSS; if more predictions are used, the keyword is suffixed<br />

with an index i, starting from zero: PREDSS_0, PREDSS_1, and so<br />

on. For an example of CASP format, see page 93.<br />

TEMPLATE_PDB filename Filename of the template structure, in PDB format.<br />

MODE string Optional. Needed in the input file only when there are user-specified<br />

constraints. In that case, the value is set to user.<br />

PAIR pair-list Optional. User-specified residue pairing, used only when MODE is set to<br />

user. All the residue pairs can be added after the keyword on one or<br />

multiple lines. For example, the following lines pair residues 12, 20, and 60<br />

from the query sequence with residues 15, 24, and 65 from the template.<br />

MODE user<br />

PAIR 12 15 20 24<br />

PAIR 60 65<br />

GAP gap-list Optional. User-specified gaps, used only when MODE is set to user. Can<br />

be used with or without PAIR. Gaps are specified by the residues in<br />

either query (P) or template (T) that are aligned with them. For example,<br />

the following opens gaps against query residues 10 and 11 and template residues<br />

34 and 35.<br />

MODE user<br />

GAP P10 P11 T34 T35<br />

• Template PDB file—Template structure file, in PDB format. The sequence used for the<br />

template is taken from SEQRES lines in the PDB file. If SEQRES is not found in the PDB<br />

file, the template sequence is then taken from ATOM records.<br />

• Output alignment file: File named jobname.out containing the pairwise alignment between<br />

the query and the template. By default, the alignment is in Maestro format. To generate<br />

plain text format, include FORMAT dev in the command input file. In plain text format,<br />

aligned residues from the query and the template are placed on top of each other; see<br />

below for an example.<br />

• Log file: File named jobname.log containing the progress of the sta program, including<br />

warnings and error messages.<br />

Return Value and Errors<br />

sta returns 0 for success and non-zero for failure. In case of failure, check the log file for<br />

details, and search for ERROR. To further analyze the failure, use the -DEBUG option when you<br />

run sta.<br />
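As an illustration of the check described above, the sketch below collects the lines of a log that contain ERROR when the return value is non-zero. The function name is hypothetical, and the exact layout of messages in jobname.log is an assumption.

```python
def sta_error_report(returncode, log_text):
    """Return the log lines containing ERROR when sta reported failure.

    returncode: the value returned by sta (0 means success).
    log_text: the contents of jobname.log as a string.
    """
    if returncode == 0:
        return []
    return [line.strip() for line in log_text.splitlines() if "ERROR" in line]
```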

Examples<br />

Below is an example input file, where query_seq.fasta stores the query sequence in<br />

FASTA format.<br />

QUERY_NAME query<br />

QUERY_FILE query_seq.fasta<br />

PREDSS_0 query-ssp1.casp<br />

PREDSS_1 query-ssp2.casp<br />

PREDSS_3 query-ssp3.casp<br />

BGNRES 1<br />

ENDRES 149<br />

TEMPLATE_PDB pdb1vpe.ent<br />

Below is an example of the plain text format for output alignment files. This format is generated<br />

when you include ALIFORMAT dev in the command input file.<br />

probe length=70======tmplt_T0281_d1dq3a2 length=79 TotalScore= 42 SEQ= -4 SSTR= 102<br />

SEQID= 0.061538 nali= 65<br />

probe bgnRes=1 endRes=70 template bgnRes=3 endRes=75<br />

ProbeAA: ..MWMPPRPEEVARKLRRLGFVERMAKGGHRLYTHPDGRIVVVPFH...SGELP....KG<br />

ProbeSS: LLLLLLHHHHHHHHHHLLLEEEEELLLEEEEELLLLLEEEeLLL LLLLL HH<br />

Fold AA: GNFGLPLNFNAFKEWASEYGVEFKTNGSQTIAIIND...ERISLGQWHTRNRVSKAVLVK<br />

Fold SS: LLLEELLLHHHHHHHHHLLLLEEEEELLEEEEEELL EEEELLLHHHHLLEEHHHHHH<br />

ProbeAA: TFKRILRDAGLTEEEFHNL....<br />

ProbeSS: HHHHHHHHhLLLHHHHHLL<br />

Fold AA: MLRKLYEATK.DEEVKRMLHLIE<br />

Fold SS: HHHHHHHHHL LHHHHHHHHHHL<br />

The file contains header information at the top, then in the alignment section, the query<br />

sequence (ProbeAA), query composite secondary structure (ProbeSS), template sequence<br />

(Fold AA) and template secondary structure (Fold SS) are given. Gaps are denoted by<br />

periods in the sequences and by spaces in the secondary structures.<br />
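The plain text format can be post-processed with a few lines of code. The sketch below joins the ProbeAA and Fold AA rows across alignment blocks so that gaps (periods) can be counted. It assumes each row starts with the label shown in the example, and it ignores the secondary structure rows, whose gaps are spaces and would be lost by stripping whitespace. The function name is hypothetical.

```python
def read_alignment_rows(text):
    """Concatenate the ProbeAA and Fold AA rows of a plain-text alignment."""
    rows = {"ProbeAA:": [], "Fold AA:": []}
    for line in text.splitlines():
        for label, parts in rows.items():
            if line.startswith(label):
                # Keep only the sequence that follows the row label.
                parts.append(line[len(label):].strip())
    query = "".join(rows["ProbeAA:"])
    template = "".join(rows["Fold AA:"])
    return query, template
```

The number of gap positions in either sequence is then `sequence.count(".")`.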


11.9 tfm—Tertiary Folding Module<br />

The tfm program generates one or more complete backbone conformations starting from a<br />

primary sequence and secondary structure, along with optional domain boundaries and sheet<br />

topologies. In addition, tfm can accept several different types of constraint including multiple<br />

coordinate templates, multiple types of distance constraints, and dihedral angle constraints. A<br />

large number of run-time parameters allow the code to be used in a variety of ways, from minimizing<br />

a homology model to full ab initio folding.<br />

Syntax<br />

The tfm program has hundreds of user-specified options contained in over a dozen input files.<br />

A detailed description of all the options is beyond the scope of this document. The refine<br />

command can be used as a wrapper to prepare default input files. If suitable input has been<br />

prepared, there is also a top-level wrapper for tfm that can be used to run the program directly<br />

under Schrödinger Job Control. It can be invoked from the command line via:<br />

$SCHRODINGER/tfm input-file [options]<br />

where input-file is typically the job name, with or without a .inp extension. The input file<br />

must be either the first or last command-line argument, and everything else on the command<br />

line is passed as is to Job Control.<br />

Options<br />

All options are specified in the input file, and there are no command-line options, other than<br />

those recognized by Job Control.<br />

Command Input File<br />

There are only three keywords recognized in the command input file, listed in Table 11.12.<br />

Table 11.12. Keywords for tfm program.<br />

Keyword Description<br />

INPUT_FILE filename The input file passed to the executable.<br />

OUTPUT_FILE filename The output produced by the executable.<br />

STRUCT_NAME name Optional. Prefix for files in the local directory that will be automatically<br />

treated as input, such as template or constraint files.<br />


Files<br />


In addition to the descriptive output in the file specified by OUTPUT_FILE and the error<br />

messages in name.err, tfm produces the following files, where the first part of the file name<br />

is tfm by default but can be changed by other input options, and N is a serial number.<br />

tfm.N.pdb Output structures, if applicable.<br />

tfm.N.spair Possible strand pair assignments for unpaired strands, if automatic strand<br />

pairing is being used.<br />

Return Value and Errors<br />

tfm returns a value of 0 on successful completion, 1 otherwise. All error messages are printed<br />

to the file name.err, where name is either the STRUCT_NAME value or the SEQNAME parameter<br />

in the file specified by INPUT_FILE. This file is empty otherwise, and all output from the<br />

program itself is written to the file specified by OUTPUT_FILE. Output from Job Control goes<br />

to stdout, and any errors arising before the driver script has started will likewise go to stdout or<br />

stderr.<br />

Examples<br />

Sample input for tfm can be generated by running refine with the -LOCAL and -DEBUG<br />

options. The command<br />

refine sample<br />

creates a directory run_sample with the input file sample.1.in (among other files determined<br />

by the refinement type). An input file sample_tfm.inp containing the following lines can be<br />

used to run the job from the command line<br />

INPUT_FILE sample.1.in<br />

OUTPUT_FILE sample.1.out<br />

with the command<br />

tfm sample_tfm<br />
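The command input file shown above can also be written by a short script. This is a sketch only: the function name is hypothetical, and it assumes that a single space between keyword and value is accepted, as in the example.

```python
def tfm_command_input(input_file, output_file, struct_name=None):
    """Return the text of a tfm command input file.

    Uses the three keywords of Table 11.12; STRUCT_NAME is optional.
    """
    lines = [f"INPUT_FILE {input_file}", f"OUTPUT_FILE {output_file}"]
    if struct_name is not None:
        lines.append(f"STRUCT_NAME {struct_name}")
    return "\n".join(lines) + "\n"
```

Writing the result of `tfm_command_input("sample.1.in", "sample.1.out")` to sample_tfm.inp reproduces the example file.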

Environment Variables<br />

In addition to the standard environment variables tfm recognizes the following environment<br />

variables:<br />

TFM_QUIET<br />

TFM_VERBOSE<br />

These environment variables can be set to decrease or increase the job-related<br />

output in the log file. The value to which they are set is irrelevant.<br />


Chapter 12: Command-Line Utilities<br />

12.1 Multiple Template Alignment: structalign<br />


If multiple templates are selected in the Find Homologs step, they must be rotated into the<br />

same coordinate reference frame before they can be used. The Align Structures button is the<br />

usual means of performing structural alignment of templates, but the structalign utility can<br />

be run from the command line.<br />

The input files must be in PDB format and the templates in them should consist of single<br />

chains. To use the output files, each must be imported into Find Homologs and selected as one<br />

of the templates, replacing the original, unaligned version.<br />

Syntax:<br />

$SCHRODINGER/utilities/structalign [options] input-files<br />

The input files must be in PDB format. The output file name is generated by adding the prefix<br />

rot- to the input file name.<br />

Options:<br />

-h Print usage message.<br />

-force Force alignment even if structures are not sufficiently similar.<br />

To use structalign from the command line:<br />

1. Obtain the PDB structure file of each template. This can be done easily by clicking on the<br />

sequence name of each template of interest to display it in the Workspace.<br />

2. For each structure, choose Export from the Project menu of the Maestro main window to<br />

save the coordinates of the structure to a file. Select PDB format.<br />

You should now have the following files:<br />

template1.pdb template2.pdb template3.pdb<br />

3. At the command line, enter:<br />

$SCHRODINGER/utilities/structalign template1.pdb template2.pdb template3.pdb<br />


This command runs the structalign utility and returns one aligned file for each<br />

template:<br />

rot-template1.pdb<br />

rot-template2.pdb<br />

rot-template3.pdb<br />

Note: If you encounter a problem involving templates with multiple chains, use the<br />

getpdb utility (see Section D.2.3 of the Maestro <strong>User</strong> <strong>Manual</strong>) to extract each<br />

chain into a separate .pdb file, then run structalign again.<br />

4. Import each rot-template.pdb file using the Import button.<br />

Once the files are imported and the aligned templates appear in the Homologs table, select<br />

them all using shift-click or control-click. (Do not select the original structures.) Click Next to<br />

continue to the Edit Alignment step.<br />
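When scripting the procedure above, the names of the aligned files can be predicted from the input names. The sketch below applies the rot- prefix to the file name; placing the output in the same directory as the input is an assumption, and the function name is hypothetical.

```python
from pathlib import PurePosixPath


def aligned_name(input_file):
    """Return the expected structalign output name for one input file."""
    path = PurePosixPath(input_file)
    # Assumption: the prefix is added to the file name, not to the directory.
    return str(path.with_name("rot-" + path.name))
```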

12.2 Format Conversion: seqconvert<br />

This utility converts sequences between any permitted input file format and output file format.<br />

This may be useful if autoconversion does not proceed as expected.<br />

Syntax:<br />

$SCHRODINGER/utilities/seqconvert [options] input-format input_filename output-format output_filename<br />

One use of the seqconvert utility is to convert SSPs produced in <strong>Prime</strong> to FASTA format with<br />

the following command:<br />

$SCHRODINGER/utilities/seqconvert -inative infile.seq -ofasta outfile.fasta<br />

Options are described in Table 12.1.<br />

Table 12.1. Keywords for the seqconvert utility<br />

Option Description<br />

-start Read from this sequence in the input file<br />

-end Read to (and including) this sequence in the input file<br />

-h Display usage message<br />

-total Read at most this number of sequences<br />


-ali Read an alignment in the input file<br />


-pdb_use_atoms Use ATOM (not SEQRES) records with PDB files<br />

Input File Formats:<br />

-iany Autodetect input file format<br />

-inative Input file in native (Schrödinger) format<br />

-ifasta Input file in FASTA format<br />

-ipdb Input file in PDB format<br />

-inbrf_pir Input file in NBRF-PIR format<br />

-itext_plain Input file in TEXT-PLAIN format<br />

-igcg Input file in GCG format<br />

-iswissprot Input file in SWISS-PROT format<br />

-igenbank Input file in GENBANK format<br />

-incbi Input file in NCBI format<br />

-iphylip Input file in PHYLIP format<br />

-inexus_paup Input file in NEXUS-PAUP format<br />

-iselex Input file in SELEX format<br />

-istaden Input file in STADEN format<br />

-ipfam_stockholm Input file in PFAM-STOCKHOLM format<br />

-icasp Input file in CASP format<br />

Output File Formats:<br />

-onative Output file in native (Schrödinger) format<br />

-ofasta Output file in FASTA format<br />

-onbrf_pir Output file in NBRF-PIR format<br />

-otext_plain Output file in TEXT-PLAIN format<br />

-ogcg Output file in GCG format<br />

-oswissprot Output file in SWISS-PROT format<br />

-ocasp Output file in CASP format<br />

-ogenbank Output file in GENBANK format<br />
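Because the set of output formats is smaller than the set of input formats, a wrapper script may want to validate a conversion before calling the utility. The following sketch builds the argument list from the format options listed in Table 12.1; the function name is hypothetical and the executable path is left to the caller.

```python
INPUT_FORMATS = {
    "any", "native", "fasta", "pdb", "nbrf_pir", "text_plain", "gcg",
    "swissprot", "genbank", "ncbi", "phylip", "nexus_paup", "selex",
    "staden", "pfam_stockholm", "casp",
}
OUTPUT_FORMATS = {
    "native", "fasta", "nbrf_pir", "text_plain", "gcg", "swissprot",
    "casp", "genbank",
}


def seqconvert_args(in_format, infile, out_format, outfile):
    """Return the seqconvert arguments for one conversion.

    Raises ValueError if a format is not listed in Table 12.1.
    """
    if in_format not in INPUT_FORMATS:
        raise ValueError(f"unsupported input format: {in_format}")
    if out_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {out_format}")
    return [f"-i{in_format}", infile, f"-o{out_format}", outfile]
```

For example, `seqconvert_args("native", "infile.seq", "fasta", "outfile.fasta")` gives the arguments of the SSP conversion command shown earlier in this section.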


12.3 Structure Preparation: primefix.py<br />

This Python script prepares structures for use with <strong>Prime</strong>, which requires certain conventions<br />

for residue naming. You should not normally need to use this script, because its functions are<br />

performed when you run a <strong>Prime</strong> refinement job. The script performs the following actions:<br />

• Assigns a unique, single residue name, residue number, and chain name to all ligand residues,<br />

and ensures that the new name does not conflict with amino acid residue names.<br />

• Left-justifies 3-character residue names, and ensures that 1- and 2-character residue<br />

names are treated appropriately.<br />

• Converts PDB names to upper case.<br />

• Deletes all bonds to metals and assigns formal charges to the metal and previously<br />

attached atoms. FeS clusters are fixed with the appropriate bonds and formal charges.<br />

• Sets water molecule residue names to "HOH " and PDB atom names to " O ", "1H "<br />

and "2H ". This action is performed for residues named TIP, TIP3, TIP4, SPC, HOH, and<br />

DOD.<br />

• Renames terminal caps: "NME " to "NMA ", pdbname " C " of NMA to " CA ", and<br />

pdbname " CA " of ACE to " CH3".<br />

• Places atoms from the same molecule together in the structure output.<br />

<strong>Prime</strong> can fail if a residue is not in order of connectivity. If this is the case in your structure, it<br />

should be exported from Maestro as a PDB file, then reimported and exported in Maestro<br />

format before running this script.<br />
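To make two of the naming rules listed above concrete, the sketch below applies the water renaming and the left-justification of 3-character residue names to a single residue name. It is an illustration only, not the code of primefix.py: the function name is hypothetical, the 4-character field width is inferred from the quoted names such as "HOH ", and names of other lengths are returned unchanged because the manual does not spell out their treatment.

```python
WATER_RESIDUE_NAMES = {"TIP", "TIP3", "TIP4", "SPC", "HOH", "DOD"}


def normalized_residue_name(name):
    """Apply the water and left-justification rules to one residue name."""
    stripped = name.strip()
    if stripped in WATER_RESIDUE_NAMES:
        return "HOH "
    if len(stripped) == 3:
        # Left-justify in a 4-character field (assumed width).
        return stripped.ljust(4)
    return name
```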

Syntax:<br />

$SCHRODINGER/run primefix.py infile outfile<br />

infile Input file (Maestro format) containing the structure<br />

outfile Output structure file (Maestro format) containing the prepared structure<br />


12.4 Structure Alignment: align_binding_sites<br />

This script performs a pairwise superposition of multiple structures using the C-alpha atoms of<br />

selected residues. The reference structure for the alignment is the first structure in the input<br />

file. The C-alpha atoms used for the alignment come from residues within the specified cutoff<br />

distance from the ligand in the reference structure. Alternatively, a list of residues may be<br />

given for the alignment, which must correspond to residues in the reference structure. C-alpha<br />

atoms in the aligned structures that are farther than the specified distance from any C-alpha atom<br />

in the reference are not considered in the alignment.<br />

$SCHRODINGER/utilities/align_binding_sites [options] input-file<br />

The options are given in Table 12.2. The Job Control options described in Table 2.1 and<br />

Table 2.2 of the Job Control Guide are supported, with the exception of -INTERVAL.<br />

Table 12.2. Options for the align_binding_sites command.<br />

Option Description<br />

-v[ersion] Show the program version and exit.<br />

-h[elp] Show usage message and exit.<br />

-l[igmol] molnum Molecule number of the ligand. Default is to automatically detect the<br />

ligand.<br />

-c[utoff] value Binding site cutoff distance from the ligand in angstroms. Residues<br />

within this distance of the ligand are used in the alignment. Default: 5.<br />

-d[ist] value Atom pairs separated by more than this distance are not considered in the alignment.<br />

Default: 5.<br />

-p[realigned] Structures are prealigned. Do not run global protein structure alignment<br />

(SKA).<br />

-r[es] list Residues for the alignment. This option overrides the use of -cutoff.<br />

Format for a residue is chain:resnum. If there is no chain name, use an<br />

underscore, e.g. _:999 . At least three residues are required. The<br />

comma-separated list should not have spaces.<br />

-j[obname] jobname Jobname to use. The default is based on the input file name.<br />
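The residue list format for the -res option is easy to get wrong by hand. The sketch below builds the list from (chain, residue number) pairs following the rules in Table 12.2: chain:resnum items, an underscore for a missing chain name, commas without spaces, and at least three residues. The function name is hypothetical.

```python
def format_res_list(residues):
    """Return a -res value from (chain, resnum) pairs.

    A chain of None, '' or ' ' is written as an underscore.
    """
    items = []
    for chain, resnum in residues:
        chain_name = chain.strip() if chain else ""
        items.append(f"{chain_name or '_'}:{resnum}")
    if len(items) < 3:
        raise ValueError("at least three residues are required")
    return ",".join(items)
```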


12.5 Database Update Utilities<br />

<strong>Prime</strong> uses two databases, the PDB database and the BLAST database. As these databases are<br />

continually being updated, the versions distributed with the software will not have the latest<br />

contributions. Utilities have been provided for updating these databases from the web sites.<br />

12.5.1 update_BLASTDB<br />

This utility updates the BLAST database from a copy on the Schrödinger web site that is<br />

synchronized nightly with the BLAST web site. The preformatted database files are downloaded.<br />

Syntax:<br />

$SCHRODINGER/utilities/update_BLASTDB<br />

The BLAST database is installed in the first location found in the following list:<br />

• $PSP_BLASTDB<br />

• $SCHRODINGER_THIRDPARTY/database/blast<br />

• $SCHRODINGER/thirdparty/database/blast<br />

These environment variables are described in Appendix B. You must have permission to write<br />

to the destination in order to run this utility.<br />
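The search order above can be reproduced in a script, for example to check write permission before running the utility. The sketch below reads "first location found" as the first candidate that exists as a directory, which is an assumption; the function names are hypothetical. The same pattern applies to rsync_pdb with SCHRODINGER_PDB and the database/pdb subdirectory.

```python
import os


def blast_db_candidates(env):
    """Return the candidate BLAST database locations in search order."""
    candidates = []
    if env.get("PSP_BLASTDB"):
        candidates.append(env["PSP_BLASTDB"])
    if env.get("SCHRODINGER_THIRDPARTY"):
        candidates.append(env["SCHRODINGER_THIRDPARTY"] + "/database/blast")
    if env.get("SCHRODINGER"):
        candidates.append(env["SCHRODINGER"] + "/thirdparty/database/blast")
    return candidates


def first_found(candidates, exists=os.path.isdir):
    """Return the first candidate for which exists() is true, else None."""
    for path in candidates:
        if exists(path):
            return path
    return None
```

Passing `os.environ` as `env` evaluates the list for the current shell environment.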

12.5.2 rsync_pdb<br />

This utility creates or updates a local mirror of the PDB.<br />

Syntax:<br />

$SCHRODINGER/utilities/rsync_pdb<br />

The PDB is installed in the first location found in the following list:<br />

• $SCHRODINGER_PDB<br />

• $SCHRODINGER_THIRDPARTY/database/pdb<br />

• $SCHRODINGER/thirdparty/database/pdb<br />

These environment variables are described in Appendix B. You must have permission to write<br />

to the destination in order to run this utility.<br />

This utility uses an rsync server on the Schrödinger web site that enables the files to be downloaded.<br />

The files are written in the new remediated format. The Schrödinger site is updated<br />

nightly from the PDB.<br />


Appendix A: Third-Party Programs<br />

A.1 Location of Third-Party Programs<br />


As well as internal and bundled software and data, <strong>Prime</strong> uses certain external third-party<br />

programs and databases for sequence and homolog searches, protein family searches, and<br />

secondary structure prediction. These are described under the headings in this topic.<br />

A.1.1 BLAST/PSI-BLAST, Pfam, and the PDB<br />

See the home pages for these programs and servers for more information:<br />

• BLAST/PSI-BLAST<br />

http://www.ncbi.nlm.nih.gov/blast<br />

The BLAST version distributed with <strong>Prime</strong> is version 2.2.16. To download the BLAST<br />

database, you should use the update_BLASTDB utility. This utility uses rsync to download<br />

or update the database, in the correct format for use with <strong>Prime</strong>.<br />

• HMMER/Pfam<br />

http://pfam.sanger.ac.uk<br />

This site has links to mirror sites that may be more convenient for downloads. You can<br />

download the Pfam database from ftp.schrodinger.com/support/hidden/prime/Pfam_fs.gz.<br />

• PDB<br />

http://www.rcsb.org/pdb/<br />

Note that the PDB format has changed, as of August 1, 2007. You should use the<br />

rsync_pdb utility to update your local copy of the PDB for use with <strong>Prime</strong>. <strong>Prime</strong> uses<br />

the new format, and automatically converts files in the old format.<br />

A.1.2 Secondary Structure Prediction<br />

Two distinct third-party secondary structure prediction programs can be used to increase the<br />

accuracy of, and get better results from, <strong>Prime</strong>. One of these programs (SSpro) is already<br />

bundled with <strong>Prime</strong>. The other (PSIPRED) is highly recommended, and may be manually<br />

installed by you, at your election, and subject to applicable license terms, if any. At the third-party<br />

web site whose address is given below, you may download the program and then install<br />

it. For instructions on how to install third-party programs, see Section 3.7 of the Installation<br />

Guide (Linux) or Section 4.2 of the Installation Guide (Windows), or the Third Party Programs<br />

page of our web site.<br />

The PSIPRED home page is located at:<br />

http://bioinf.cs.ucl.ac.uk/psipred/<br />

and contains information about the program, referrals to terms of use and/or license terms (in<br />

the README file), and a link to download the program. The link to download PSIPRED is<br />

contained in the section of the page labeled “Overview of prediction methods.”<br />

Note: PSIPRED is not available on Windows.<br />

A.2 Third-Party Program Use Agreement<br />

By using the links above or the third-party programs, you acknowledge and agree to the<br />

following:<br />

PSIPRED is a third party program (“Third Party Program”) and may therefore be subject to<br />

third party license agreements and fees. We (i.e., Schrödinger, LLC and its affiliates) have no<br />

responsibility or liability, directly or indirectly, for Third Party Programs or for any third party<br />

Web sites (“Linked Sites”) or for any damage or loss alleged to be caused by or in connection<br />

with use of or reliance thereon. Any warranties that we make regarding our own products and<br />

services do not apply to the Third Party Programs or Linked Sites, or to the interaction<br />

between, or interoperability of, our products and services and the Third Party Programs. Referrals<br />

and links to Third Party Programs and Linked Sites do not constitute an endorsement of<br />

such Third Party Programs or Linked Sites.<br />


Appendix B: Environment Variables<br />


The only environment variable that must be set before you can run <strong>Prime</strong> is SCHRODINGER.<br />

The following list is presented for your convenience in case you prefer a different location for<br />

any <strong>Prime</strong>-related directories.<br />

General<br />

SCHRODINGER_THIRDPARTY: Directory containing third-party software and databases, which<br />

can be installed with the installers supplied by Schrödinger. The databases are installed in the<br />

database subdirectory; the software in the bin subdirectory. Each third-party product has its<br />

own subdirectory under these subdirectories. The default location if this environment variable<br />

is not set is $SCHRODINGER/thirdparty.<br />

PDB<br />

SCHRODINGER_PDB: Full path to a user-supplied copy of the PDB database. The default for<br />

this directory is $SCHRODINGER/thirdparty/database/pdb if only SCHRODINGER is set;<br />

if SCHRODINGER_THIRDPARTY is set, the default is $SCHRODINGER_THIRDPARTY/database/pdb.<br />

The copy of the database must retain the directory hierarchy used by the PDB<br />

and must include the following subdirectories to be usable with <strong>Prime</strong>:<br />

data/structures/all/pdb<br />

data/structures/divided/pdb<br />

data/structures/obsolete/pdb<br />

On Windows the data/structures/all subdirectory is absent (and is not required).<br />
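A user-supplied copy of the PDB can be checked against this requirement before SCHRODINGER_PDB is pointed at it. The sketch below reports which of the required subdirectories are missing; the function name and the `isdir` parameter (injected so the check can be exercised without touching the file system) are hypothetical.

```python
import os

REQUIRED_PDB_SUBDIRS = (
    "data/structures/all/pdb",
    "data/structures/divided/pdb",
    "data/structures/obsolete/pdb",
)


def missing_pdb_subdirs(root, on_windows=False, isdir=os.path.isdir):
    """Return the required subdirectories that are absent under root."""
    missing = []
    for sub in REQUIRED_PDB_SUBDIRS:
        if on_windows and sub.startswith("data/structures/all"):
            # Not required on Windows.
            continue
        if not isdir(os.path.join(root, *sub.split("/"))):
            missing.append(sub)
    return missing
```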

Blast<br />

PSP_BLAST_DIR: Location of Blast executables. Default is<br />

$SCHRODINGER_THIRDPARTY/bin/platform/blast/bin, with the default for SCHRODINGER_THIRDPARTY noted above.<br />

PSP_BLAST_DATA: Location of Blast matrices. Default is<br />

$SCHRODINGER_THIRDPARTY/bin/platform/blast/data, with the default for SCHRODINGER_THIRDPARTY noted above.<br />

PSP_BLASTDB: Location of Blast databases. The BLAST databases distributed with <strong>Prime</strong> are<br />

formatted with version 2.2.16. If your BLAST database does not conform to this format, you<br />

might experience problems with <strong>Prime</strong>. Default is<br />

$SCHRODINGER_THIRDPARTY/database/blast, with the default for SCHRODINGER_THIRDPARTY noted above.<br />


HMMER/Pfam<br />

PSP_HMMER_DIR: Location of HMMER distribution (the executables are assumed to be in the<br />

binaries subdirectory). Default is $SCHRODINGER_THIRDPARTY/bin/platform/hmmer,<br />

with the default for SCHRODINGER_THIRDPARTY noted above.<br />

PSP_HMMERDB: Location of Pfam database. Default is<br />

$SCHRODINGER_THIRDPARTY/database/pfam, with the default for SCHRODINGER_THIRDPARTY noted above.<br />

PSIPRED<br />

PSP_PSIPRED_DIR: Location of PSIPRED installation (top level directory containing the bin<br />

and data directories). Default is $SCHRODINGER_THIRDPARTY/bin/platform/psipred,<br />

with the default for SCHRODINGER_THIRDPARTY noted above.<br />

PSP_PSIPRED_DB: Sequence database to use for assembling the PSI-BLAST profile. Allowed<br />

values are nr and pdb. Default: pdb.<br />

SSPRO<br />

PSP_SSPRO_DB: Sequence database to use for assembling the SSPRO profile. Allowed values<br />

are nr and pdb. Default: pdb.<br />


Appendix C: File Formats<br />

C.1 Input File Formats<br />

The input file formats are:<br />

• native (Schrödinger) format<br />

• FASTA format<br />

• PDB format<br />


Sequences are read from the PDB SEQRES records, if they exist. Otherwise they are read<br />

from the ATOM records. If SEQRES records do not exist and residues are missing from<br />

the ATOM records, the query sequence will of course contain gaps.<br />

• NBRF-PIR format<br />

• TEXT-PLAIN format<br />

• GCG format<br />

• SWISS-PROT format<br />

• GENBANK format<br />

• NCBI format<br />

• PHYLIP format<br />

• NEXUS-PAUP format<br />

• SELEX format<br />

• STADEN format<br />

• PFAM-STOCKHOLM format<br />

• CASP format<br />

C.2 Output File Formats<br />

The output file formats are:<br />

• native (Schrödinger) format<br />

• FASTA format<br />

• NBRF-PIR format<br />

• TEXT-PLAIN format<br />

• GCG format<br />

• SWISS-PROT format<br />

• CASP format<br />

• GENBANK format<br />


Appendix D: Error Messages and Warnings<br />

D.1 Input Sequence<br />

Invalid File<br />

This message appears if the system fails to extract a valid sequence from the file.<br />

• Check that the sequence is in one of the supported sequence file formats.<br />


• If FASTA format is used, make sure there are no extra spaces between the “>” and the<br />

sequence name, and that there are no extra spaces at the end of the line.<br />
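The two FASTA problems mentioned above can be detected before the file is loaded. The following sketch reports header lines that have whitespace directly after the ">" or at the end of the line. Restricting the trailing-space check to header lines is an assumption, and the function name is hypothetical.

```python
def fasta_header_problems(text):
    """Return (line_number, problem) tuples for suspicious FASTA headers."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.startswith(">"):
            continue
        if line[1:2].isspace():
            problems.append((number, "space after '>'"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
    return problems
```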

No Sequences Found<br />

This message appears if the system fails to extract any sequences from the Workspace.<br />

• Check the Project Table to ensure the correct entries are included in the Workspace.<br />

• Use Open Project in the Maestro Window to open a project.<br />

Invalid Sequence Characters<br />

This message appears if the sequence contains non-standard one-letter codes for amino acids.<br />

• Check that the sequence uses the one-letter code and is all uppercase.<br />

• Edit or remove any non-standard characters in your sequence using an editor, and then<br />

restart the job.<br />

D.2 Find Homologs<br />

D.2.1 Error Messages<br />

Multiple Templates Not Aligned<br />

• If you select multiple templates and proceed to Edit Alignment, you are reminded that the<br />

templates must be structurally aligned to one another.<br />

If one of the following messages appears when you click Next to go to the Edit Alignment step,<br />

you will not be able to proceed along the Comparative Modeling path until the problem is<br />

resolved:<br />


No Homologs Found<br />

• You may change your search options and continue searching, or proceed to Fold Recognition.<br />

Invalid File<br />

• Check to ensure the homolog file exists in either of the following locations:<br />

$SCHRODINGER/thirdparty/database/pdb/data/structures/divided/pdb/*<br />

$SCHRODINGER_THIRDPARTY/database/pdb/data/structures/divided/pdb/*<br />

and is in the right format (UNIX compressed, pdb*.ent.gz).<br />

D.2.2 Warnings in Homologs Table<br />

These messages appear in the Warning column of the Homologs table for sequences that are<br />

likely to be unusable for model building:<br />

Many missing residues<br />

• Too many consecutive missing residues to guarantee model building will succeed.<br />

!UNUSABLE: Alternate residue type<br />

• The residue type was undetermined, causing the file to include the residue twice but with<br />

different identity (e.g. ALA20 and VAL20). This is not supported by <strong>Prime</strong>.<br />

!UNUSABLE: Bad SEQRES<br />

• General problems with the SEQRES.<br />

!UNUSABLE: CA only<br />

• The file contains only C-alpha coordinates.<br />

!UNUSABLE: Molecule molname should be labeled as HETATM<br />

• A non-protein molecule has been listed as ATOM when it should be listed as HETATM.<br />

ATOM records are reserved for proteins only.<br />

!UNUSABLE: SEQRES/ATOM mismatch<br />

• The sequence listed in the SEQRES records does not match the actual sequence in the<br />

ATOM records.<br />

!UNUSABLE: SEQRES is missing residues<br />

• There are residues in the ATOM records that are not listed in the SEQRES records. PDB<br />

files require that the ATOM records be a subset of the SEQRES records.<br />


!UNUSABLE: SEQRES numbering is wrong<br />


• The number of residues in a particular chain does not agree with the number of residues<br />

shown in columns 14 to 17 of SEQRES.<br />
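A template file can be checked for this problem independently of Prime. The sketch below compares, for each chain, the residue count declared in columns 14 to 17 of the SEQRES records with the number of residue names actually listed in those records. It assumes the standard fixed-column PDB layout (chain identifier in column 12, residue names from column 20) and interprets "number of residues in a particular chain" as the residues listed in SEQRES; the function name is hypothetical.

```python
from collections import defaultdict


def seqres_count_mismatches(pdb_text):
    """Return {chain: (declared, listed)} for chains whose counts disagree."""
    declared = {}
    listed = defaultdict(int)
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES"):
            continue
        chain = line[11]                    # column 12
        declared[chain] = int(line[13:17])  # columns 14 to 17
        listed[chain] += len(line[19:70].split())  # residue names
    return {
        chain: (declared[chain], listed[chain])
        for chain in declared
        if declared[chain] != listed[chain]
    }
```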

D.3 Edit Alignment<br />

Check Alignment<br />

When you click Next at the end of the Edit Alignment step, the Check Alignment warning may<br />

appear. The warning lists possible alignment problems that you may want to fix before<br />

continuing to the next step. For example, there may be gaps in secondary structure elements of<br />

the template, or gaps in the query that are aligned to secondary structure elements of the<br />

template.<br />

If you do not want to make corrections to the alignment, click Continue to go to the next step.<br />

To make corrections based on the information in the error message, make a note of the gap<br />

locations given, click Cancel to close the message box while remaining in Edit Alignment, and<br />

change the alignment accordingly.<br />

D.4 Build Structure<br />

Error messages and warnings for this step generally appear in the log of a running job. The log can<br />

be viewed in the job Monitor panel until the job is completed.<br />

D.5 Refinement<br />

Error messages for Refinement generally appear in the log of a running job. The following is<br />

an example of a warning in the log of a loop refinement.<br />

Invalid Features<br />

When a loop refinement begins, a validation program checks the loop features of the structure.<br />

Apparent errors are reported in a warning dialog box. You may choose to Run All Features,<br />

Run Only Valid Features, or Cancel. It is recommended that you make a note of the invalid<br />

features, click Cancel, and correct the structure if appropriate.<br />


Getting Help<br />

Schrödinger software is distributed with documentation in PDF format. If the documentation is<br />

not installed in $SCHRODINGER/docs on a computer that you have access to, you should install<br />

it or ask your system administrator to install it.<br />

For help installing and setting up licenses for Schrödinger software and installing documentation,<br />

see the Installation Guide. For information on running jobs, see the Job Control Guide.<br />

Maestro has automatic, context-sensitive help (Auto-Help and Balloon Help, or tooltips), and<br />

an online help system. To get help, follow the steps below.<br />

• Check the Auto-Help text box, which is located at the foot of the main window. If help is<br />

available for the task you are performing, it is automatically displayed there. Auto-Help<br />

contains a single line of information. For more detailed information, use the online help.<br />

• If you want information about a GUI element, such as a button or option, there may be<br />

Balloon Help for the item. Pause the cursor over the element. If the Balloon Help does<br />

not appear, check that Show Balloon Help is selected in the Maestro menu of the main<br />

window. If there is Balloon Help for the element, it appears within a few seconds.<br />

• For information about a panel or the tab that is displayed in a panel, click the Help button<br />

in the panel, or press F1. The help topic is displayed in your browser.<br />

• For other information in the online help, open the default help topic by choosing Online<br />

Help from the Help menu on the main menu bar or by pressing CTRL+H. This topic is displayed<br />

in your browser. You can navigate to topics in the navigation bar.<br />

The Help menu also provides access to the manuals (including a full text search), the FAQ<br />

pages, the New Features pages, and several other topics.<br />

If you do not find the information you need in the Maestro help system, check the following:<br />

• Maestro <strong>User</strong> <strong>Manual</strong>, for detailed information on using Maestro<br />

• Maestro Command Reference <strong>Manual</strong>, for information on Maestro commands<br />

• Maestro Overview, for an overview of the main features of Maestro<br />

• Maestro Tutorial, for a tutorial introduction to basic Maestro features<br />

• <strong>Prime</strong> Quick Start Guide, for a tutorial introduction to <strong>Prime</strong><br />

• <strong>Prime</strong> Frequently Asked Questions pages, at<br />

https://www.schrodinger.com/<strong>Prime</strong>_FAQ.html<br />

• Known Issues pages, available on the Support Center.<br />


The manuals are also available in PDF format from the Schrödinger Support Center. Local<br />

copies of the FAQs and Known Issues pages can be viewed by opening the file<br />

Suite_2009_Index.html, which is in the docs directory of the software installation, and<br />

following the links to the relevant index pages.<br />
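The check for a local documentation index can be sketched as a small shell function. This is only an illustration: the helper name find_docs_index is hypothetical and not part of the Schrödinger distribution, and the sketch assumes that the docs directory is $SCHRODINGER/docs, as stated earlier in this chapter.

```shell
# Sketch: print the path of the local documentation index if it is installed.
# find_docs_index is a hypothetical helper name used for illustration only.
find_docs_index() {
    # The installation directory must be known before anything can be located.
    if [ -z "${SCHRODINGER:-}" ]; then
        echo "SCHRODINGER is not set" >&2
        return 1
    fi
    index="$SCHRODINGER/docs/Suite_2009_Index.html"
    if [ -f "$index" ]; then
        # Print the path so that it can be opened in a browser.
        echo "$index"
    else
        echo "Documentation index not found: $index" >&2
        return 1
    fi
}
```

If the function reports that the index is missing, install the documentation or ask your system administrator to install it.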

Information on available scripts can be found on the Script Center. Information on available<br />

software updates can be obtained by choosing Check for Updates from the Maestro menu.<br />

If you have questions that are not answered by any of the above sources, contact Schrödinger<br />

using the information below.<br />

E-mail: help@schrodinger.com<br />

USPS: Schrödinger, 101 SW Main Street, Suite 1300, Portland, OR 97204<br />

Phone: (503) 299-1150<br />

Fax: (503) 299-4532<br />

WWW: http://www.schrodinger.com<br />

FTP: ftp://ftp.schrodinger.com<br />

Generally, e-mail correspondence is best because you can send machine output, if necessary.<br />

When sending e-mail messages, please include the following information:<br />

• All relevant user input and machine output<br />

• <strong>Prime</strong> purchaser (company, research institution, or individual)<br />

• Primary <strong>Prime</strong> user<br />

• Computer platform type<br />

• Operating system with version number<br />

• <strong>Prime</strong> version number<br />

• Maestro version number<br />

• mmshare version number<br />

On UNIX you can obtain the machine and system information listed above by entering the<br />

following command at a shell prompt:<br />

$SCHRODINGER/utilities/postmortem<br />

This command generates a file named username-host-schrodinger.tar.gz, which you<br />

should send to help@schrodinger.com. If you have a job that failed, enter the following<br />

command:<br />

$SCHRODINGER/utilities/postmortem jobid<br />

where jobid is the job ID of the failed job, which you can find in the Monitor panel. This<br />

command archives job information as well as the machine and system information, and<br />

includes input and output files (but not structure files). If you have sensitive data in the job<br />

launch directory, you should move those files to another location first. The archive is named<br />

jobid-archive.tar.gz, and should be sent to help@schrodinger.com instead.<br />
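The two forms of the postmortem command can be wrapped in a small shell function, sketched below. The function name run_postmortem and its error messages are assumptions made for illustration. Only the path $SCHRODINGER/utilities/postmortem and the optional jobid argument come from the text above.

```shell
# Sketch: run postmortem with or without a job ID.
# run_postmortem is a hypothetical wrapper name, not a Schrödinger utility.
run_postmortem() {
    # Refuse to run if the installation directory is not configured.
    if [ -z "${SCHRODINGER:-}" ]; then
        echo "SCHRODINGER is not set" >&2
        return 1
    fi
    tool="$SCHRODINGER/utilities/postmortem"
    if [ ! -x "$tool" ]; then
        echo "postmortem not found at $tool" >&2
        return 1
    fi
    # With a job ID, job information is archived as well; without one,
    # only machine and system information is collected.
    if [ -n "${1:-}" ]; then
        "$tool" "$1"
    else
        "$tool"
    fi
}
```

Calling run_postmortem with no argument corresponds to the first command above, and run_postmortem jobid corresponds to the second. Move any sensitive files out of the job launch directory before using the second form.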


If Maestro fails, an error report that contains the relevant information is written to the current<br />

working directory. The report is named maestro_error.txt, and should be sent to<br />

help@schrodinger.com. A message giving the location of this file is written to the terminal<br />

window.<br />

More information on the postmortem command can be found in Appendix A of the Job<br />

Control Guide.<br />

On Windows, machine and system information is stored on your desktop in the file<br />

schrodinger_machid.txt. If you have installed software versions for more than one<br />

release, there will be multiple copies of this file, named schrodinger_machid-N.txt,<br />

where N is a number. In this case you should check that you send the correct version of the file<br />

(which will usually be the latest version).<br />

If Maestro fails to start, send e-mail to help@schrodinger.com describing the circumstances,<br />

and attach the file maestro_error.txt. If Maestro fails after startup, attach this file and the<br />

file maestro.EXE.dmp. These files can be found in the following directory:<br />

%USERPROFILE%\Local Settings\Application Data\Schrodinger\appcrash<br />


Glossary<br />


alignment—The optimal matching of residue positions between sequences, typically a query<br />

sequence and one or more template sequences.<br />

anchor—A constraint on alignment set at a given residue position. Alignment changes must<br />

preserve the query-template pairing at that residue until the anchor is removed.<br />

ASL—Atom Specification Language.<br />

cofactor—The complement of an enzyme reaction, usually a metal ion or metal complex.<br />

Comparative Modeling—Protein structure modeling based on a query-template match with a<br />

substantial percentage of identical residues (usually 50% or greater sequence identity).<br />

composite template—A type of template used in the Threading path, produced from the core<br />

(invariable) and variable regions of a family of structurally similar proteins.<br />

constraints—Tools to keep regions of a sequence (alignment constraints) or structure (during<br />

minimization) in a particular configuration.<br />

deletions—The residues missing from a query sequence that are present in a template<br />

sequence.<br />

Fold Recognition—The use of secondary structure matching and profiles generated from<br />

multiple sequence/structure alignments to find templates when sequence methods are unsuccessful.<br />

gaps—The spaces in an alignment resulting from insertions and deletions.<br />

HETATOMs—The atoms of residues, including amino acids, that are not one of the standard<br />

20 amino acids. In PDB files, these atoms appear in HETATM records.<br />

homolog—A sequence/structure related to the query sequence; i.e., a sequence with many of<br />

the same residues in the same patterns as the query sequence. Usually these sequences are<br />

derived from the same family and may have similar function.<br />

insertions—The extra residues found in a query sequence that are not found in a template<br />

sequence.<br />

ligand—A small molecule that binds to a protein; e.g., an antigen, hormone, neurotransmitter,<br />

substrate, inhibitor, and so on.<br />


loop—A region of undefined secondary structure.<br />

project—A collection of related data, such as structures with their associated properties. In<br />

<strong>Prime</strong> a project comprises one or more runs (executions of the <strong>Prime</strong> workflow). The project<br />

may include data that does not appear in the project table.<br />

Project Table—The Maestro panel associated with a project, featuring a table with rows of<br />

entries and columns of properties. In <strong>Prime</strong>, project table entries are usually model structures.<br />

query sequence—A sequence of unknown structure or fold.<br />

Ranking Score—The score used to rank composite templates derived from different seed<br />

templates. Generated by the Global Scoring Function in the Threading Path.<br />

refinement—An improvement of a model structure through energy-based optimization of<br />

selected regions.<br />

run—A single execution of the <strong>Prime</strong> workflow using a particular set of choices (of templates,<br />

of paths, and of settings). Each run belongs to a project. Runs cannot be saved without saving<br />

the project to which they belong.<br />

SSA—Secondary structure assignment.<br />

SSP—Secondary structure prediction.<br />

template sequence—A sequence of known structure and fold used as a basis for building a<br />

model of the query.<br />

Threading—A structure prediction process in which Fold Recognition is used to define<br />

templates, then backbone models are built via alignment to composite templates and refined.<br />

May be used when query-template sequence identity is low.<br />

Workspace—The open area in the center of the Maestro window in which structures are<br />

displayed.<br />

Z-Score—Measures the compatibility of the query sequence with the model structure, relative<br />

to the compatibility of randomly shuffled sequences of the same composition.<br />
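As an illustration only, a score of this kind is commonly written in the standard statistical form below. The symbols are assumptions introduced here, and this glossary does not state the exact expression or sign convention that Prime uses.

\[
Z = \frac{S_{\mathrm{query}} - \mu_{\mathrm{shuffled}}}{\sigma_{\mathrm{shuffled}}}
\]

Here \(S_{\mathrm{query}}\) is the compatibility score of the query sequence with the model structure, and \(\mu_{\mathrm{shuffled}}\) and \(\sigma_{\mathrm{shuffled}}\) are the mean and standard deviation of the scores obtained for the randomly shuffled sequences.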



Symbols<br />

+ (plus sign), in Pfam sequence ........................ 21<br />

- (hyphen)<br />

for locked gap ....................................... 14, 33<br />

for loop ........................................... 13, 31, 46<br />

~ (tilde), for unlocked gap........................... 14, 33<br />

A<br />

Add Anchors ............................................... 14, 33<br />

Align program................................................... 47<br />

aligned regions .................................................. 36<br />

alignment......................................................... 135<br />

editing................................................... 13, 33<br />

alignment score ................................................. 34<br />

interpretation of values ............................... 71<br />

alpha carbons<br />

aligning structures by ............................... 119<br />

constraining specific ................................. 100<br />

maximum movement .......................... 65, 101<br />

anchors ............................................................ 135<br />

alignment constraints............................ 14, 33<br />

removing............................................... 14, 33<br />

setting ................................................... 14, 33<br />

ASL................................................................. 135<br />

atom names, correcting ..................................... 54<br />

ATOM records ................................................ 125<br />

B<br />

Balloon Help ....................................................... 8<br />

binding sites, aligning ....................................... 74<br />

bit score............................................................. 23<br />

BLAST<br />

database format......................................... 123<br />

expectation value ........................................ 23<br />

percentage gaps .......................................... 23<br />

updating database ..................................... 120<br />

BLAST/PSI-BLAST....................................... 121<br />

blocks, sliding ............................................. 13, 33<br />

BLOSUM62 similarity matrix .......................... 23<br />

bond orders, assigning ...................................... 53<br />

bonds, to metals ................................................ 52<br />

Build Structure step........................................... 35<br />

C<br />

chain lengths ..................................................... 18<br />

Index<br />

chain name ........................................................ 23<br />

chains, multiple......................................... 18, 102<br />

charged residues................................................ 33<br />

code, sequence .................................................. 18<br />

cofactors.............................................. 36, 37, 135<br />

Color by Sequence ............................................ 10<br />

Color Scheme.............................................. 10, 14<br />

<strong>Prime</strong> Display Menu................................... 10<br />

<strong>Prime</strong> toolbar .............................................. 10<br />

color schemes.............................................. 10–11<br />

for Set Template Regions ........................... 36<br />

columns<br />

sorting............................................... 8, 22, 48<br />

V (Visualization) ........................................ 49<br />

Comparative Modeling ............................... 1, 135<br />

Comparative Modeling Path ................... 9, 17, 25<br />

composite SSP .................................................. 33<br />

composite templates........................................ 135<br />

Compound (PDB COMPND) ........................... 23<br />

conserved residues ...................................... 36, 47<br />

constraints ....................................................... 135<br />

alpha carbon.............................................. 100<br />

atom pair..................................................... 99<br />

continuum solvation model............................... 40<br />

conventions, document...................................... ix<br />

cooperative refinement of loops........................ 60<br />

coordinate copying............................................ 36<br />

Covalent Docking panel.................................... 78<br />

covalently bound ligands .................................. 51<br />

crystal symmetry............................................... 64<br />

C-terminal gaps........................................... 13, 33<br />

D<br />

Delete<br />

<strong>Prime</strong> runs..................................................... 6<br />

SSP ......................................................... 8, 31<br />

deletions .............................................. 36, 51, 135<br />

directory<br />

installation .................................................... 3<br />

Maestro working........................................... 3<br />

Display menu (<strong>Prime</strong>)................................. 10, 13<br />

domains....................................................... 18, 25<br />

E<br />

E (for beta strand) ................................. 13, 31, 46<br />

Edit menu (<strong>Prime</strong>)............................................. 13<br />


Edit SSP ............................................................ 13<br />

editing alignment .............................................. 33<br />

editing sequences .............................................. 10<br />

EMBL ............................................................... 18<br />

Energy Analysis task......................................... 58<br />

energy minimization ......................................... 62<br />

entries, Project Table......................................... 16<br />

environment variables ..................................... 123<br />

SCHRODINGER .............................................. 3<br />

SCHRODINGER_PDB...................................... 3<br />

error messages................................... 18, 127, 129<br />

Expect column .................................................. 23<br />

expectation value............................................... 23<br />

Export Sequences............................................ 8, 9<br />

Export SSP.............................................. 8, 30, 43<br />

extracting sequence names................................ 18<br />

F<br />

family ................................................................ 21<br />

FASTA<br />

file format ............................................. 18, 30<br />

file header ................................................... 18<br />

file formats ........................................................ 18<br />

File menu ............................................................ 6<br />

Job Options................................................. 14<br />

Find Homologs step .................................... 17, 18<br />

Fold Recognition....................................... 43, 135<br />

folds................................................................... 47<br />

Font Size, in Sequence Viewer ......................... 12<br />

force field<br />

Build Structure............................................ 40<br />

energy minimization................................... 62<br />

nonstandard residues .................................. 52<br />

single-point energy ..................................... 58<br />

formal charges, correcting ................................ 53<br />

G<br />

gaps ................................................................. 135<br />

C-terminal............................................. 13, 33<br />

locking .................................................. 14, 33<br />

N-terminal............................................. 13, 33<br />

percentage................................................... 23<br />

in query....................................................... 23<br />

in secondary structure......................... 34, 129<br />

in template .................................................. 23<br />

unlocking .............................................. 14, 33<br />


GENBANK ....................................................... 18<br />

getpdb utility.................................................. 22<br />

GPCRs<br />

aligning....................................................... 31<br />

SSPs for ...................................................... 29<br />

Guide......................................................... 8, 9, 13<br />

H<br />

H (for alpha helix)................................. 13, 31, 46<br />

header<br />

FASTA file.................................................. 18<br />

PDB ............................................................ 18<br />

helix, building ................................................. 101<br />

HETATMs ....................................................... 135<br />

HETATOMs .................................................... 135<br />

hidden sequences .............................................. 21<br />

Hide All (SSPs)................................................... 8<br />

HMMER/Pfam.................................... 21, 28, 121<br />

homologs......................................................... 135<br />

importing .................................................... 19<br />

Homologs table ................................................. 23<br />

host machine ..................................................... 14<br />

hydrogens, adding............................................. 52<br />

hydrophobic residues ........................................ 33<br />

hyphen symbol<br />

for locked gap ....................................... 14, 33<br />

for loop ........................................... 13, 31, 46<br />

I<br />

ID, PDB chain................................................... 23<br />

Identities column............................................... 23<br />

Import SSP.................................................. 30, 43<br />

importing structurally aligned templates .......... 19<br />

Include ligands and cofactors window.............. 37<br />

Input Sequence step .......................................... 17<br />

insertions....................................... 33, 36, 51, 135<br />

J<br />

Job Monitor panel ....................................... 15, 38<br />

job options......................................................... 14<br />

Job Options dialog box ................................. 9, 14<br />

job status icon ............................................... 8, 15<br />

jobs<br />

killing.......................................................... 15<br />

monitoring .................................................. 15<br />

stopping ...................................................... 38


K<br />

killing jobs ........................................................ 15<br />

L<br />

leaving group<br />

ligand .......................................................... 78<br />

receptor....................................................... 79<br />

libraries<br />

backbone dihedral....................................... 40<br />

side-chain rotamer ...................................... 40<br />

ligands................................................. 36, 37, 135<br />

covalently bound......................................... 51<br />

ligands, covalently bound ........................... 52, 77<br />

lipophilic term................................................... 69<br />

locking gaps ................................................ 14, 33<br />

log file ............................................................... 38<br />

loop refinement ................................................. 51<br />

cooperative.................................................. 60<br />

invalid features check ................................. 60<br />

options ........................................................ 64<br />

order of ....................................................... 60<br />

settings........................................................ 59<br />

time required............................................... 59<br />

loops................................................................ 136<br />

M<br />

Maestro, starting ................................................. 3<br />

<strong>Manual</strong> Threading............................................. 33<br />

maximum length ............................................... 25<br />

maximum sequence length................................ 18<br />

menu bar (<strong>Prime</strong>) ............................................ 8, 9<br />

menus<br />

Display (<strong>Prime</strong>) .................................... 10, 13<br />

Edit (<strong>Prime</strong>) ................................................ 13<br />

File (<strong>Prime</strong>)............................................. 6, 14<br />

merging projects.................................................. 5<br />

metals, bonds to ................................................ 52<br />

minimization ............................................... 36, 62<br />

missing residues.............................................. 125<br />

Monitor panel................................................ 8, 38<br />

monitoring jobs ................................................. 15<br />

mouse functions ............................................ 8, 22<br />

multimers, building ........................................... 88<br />

multiple chains<br />

building as multimer................................... 88<br />

with same sequence .................................... 18<br />


with SKA.................................................. 102<br />

multiple sequences............................................ 18<br />

multiple templates............................................. 36<br />

exporting................................................... 115<br />

N<br />

NAME_pfam sequence ..................................... 21<br />

names<br />

chains.......................................................... 23<br />

extracting sequence..................................... 18<br />

non-unique.................................................. 18<br />

sequences.............................................. 18, 23<br />

native Schrödinger format................................. 30<br />

New command .................................................... 6<br />

non-conserved residues..................................... 36<br />

None color scheme............................................ 10<br />

N-terminal gaps........................................... 13, 33<br />

O<br />

Open command ................................................... 6<br />

OPLS_2005 force field ................... 40, 52, 58, 62<br />

optimizing side chains....................................... 37<br />

options<br />

job........................................................... 9, 14<br />

in panel layout .............................................. 8<br />

P<br />

panel layout......................................................... 7<br />

panels<br />

Job Options................................................. 14<br />

Monitor......................................................... 8<br />

path<br />

Comparative Modeling ................. 1, 9, 17, 25<br />

Threading.................................................... 17<br />

PDB........................................................... 47, 121<br />

ATOM records .......................................... 125<br />

chain ID ...................................................... 23<br />

header ......................................................... 18<br />

ID code ....................................................... 23<br />

SEQRES ................................................... 125<br />

updating database ..................................... 120<br />

PDB atom names, correcting for receptor-ligand<br />

complex.......................................................... 54<br />

Pfam .................................................... 21, 28, 121<br />

PIR .................................................................... 18<br />

plus sign (+), in Pfam sequence ........................ 21<br />


polar residues .................................................... 33<br />

position-specific substitutional matrix (PSSM) 33<br />

Positives column ............................................... 23<br />

Preferences........................................................ 12<br />

command .................................................... 11<br />

<strong>Prime</strong> Guide .................................................. 8, 13<br />

<strong>Prime</strong> menu bar............................................... 8, 9<br />

<strong>Prime</strong> runs ........................................................... 5<br />

<strong>Prime</strong> Sequence Viewer ...................................... 8<br />

<strong>Prime</strong> toolbar................................................. 8, 13<br />

<strong>Prime</strong> workflow................................................... 9<br />

<strong>Prime</strong>-Refinement ......................................... 1, 51<br />

<strong>Prime</strong>–Structure Prediction....................... 1, 7, 17<br />

processors, maximum number .......................... 65<br />

product installation.......................................... 131<br />

programs, third-party ...................................... 121<br />

Project Table ............................................. 16, 136<br />

projects............................................................ 136<br />

proximity........................................................... 11<br />

Proximity color scheme .................................... 11<br />

proximity cutoff .......................................... 11, 12<br />

PSIPRED ........................................................ 121<br />

Q<br />

query sequence................................................ 136<br />

query, maximum length .............................. 18, 25<br />

R<br />

random seed, for side-chain prediction............. 64<br />

Ranking Score................................................. 136<br />

read sequence.................................................... 18<br />

records<br />

ATOM....................................................... 125<br />

SEQRES ................................................... 125<br />

Refine Structure ................................................ 16<br />

Refinement .......................................................... 1<br />

refinement ....................................................... 136<br />

cooperative.................................................. 60<br />

protocols ..................................................... 51<br />

Refinement panel .............................................. 51<br />

regions<br />

aligned ........................................................ 36<br />

of multiple templates .................................. 36<br />

of templates, setting.................................... 36<br />

Rename command............................................... 6<br />

Res1................................................................... 59<br />


Res2................................................................... 59<br />

Residue Matching color scheme ....................... 11<br />

residue names, correcting for receptor-ligand<br />

complex.......................................................... 54<br />

Residue Property color scheme......................... 11<br />

Residue Type color scheme .............................. 11<br />

residues<br />

charged........................................................ 33<br />

conserved.............................................. 36, 47<br />

hydrophobic................................................ 33<br />

incomplete .................................................. 61<br />

missing...................................................... 125<br />

non-conserved............................................. 36<br />

nonstandard................................................. 52<br />

polar............................................................ 33<br />

standard, list of ........................................... 51<br />

variable ....................................................... 47<br />

rotamers............................................................. 37<br />

ruler..................................................................... 8<br />

Run SSP ................................................ 31, 44, 46<br />

runs.......................................................... 5, 6, 136<br />

S<br />

sampling accuracy, loop refinement.................. 64<br />

Save As ......................................................... 6, 25<br />

saving <strong>Prime</strong> runs................................................ 6<br />

Schrödinger contact information..................... 132<br />

schrodinger.hosts file............................ 14<br />

SCOP domains.................................................. 47<br />

score<br />

alignment .................................................... 34<br />

BLAST bit .................................................. 23<br />

Z.................................................................. 46<br />

scratch projects.................................................... 5<br />

Search program<br />

in Find Homologs....................................... 20<br />

in Fold Recognition .................................... 47<br />

secondary structure assignment .................. 14, 33<br />

Secondary Structure color scheme.................... 11<br />

secondary structure prediction programs........ 121<br />

See also SSPs<br />

Select toolbar command.................................... 13<br />

seqconvert utility.......................... 30, 43, 116<br />

SEQRES records............................................. 125<br />

sequence identity............................................... 23<br />

sequence positives............................................. 23<br />

Sequence Viewer (<strong>Prime</strong>).............................. 8, 21


sequences<br />

coloring....................................................... 14<br />

command menu for....................................... 8<br />

cropping...................................................... 18<br />

editing......................................................... 10<br />

exporting....................................................... 9<br />

maximum length................................... 18, 25<br />

multiple....................................................... 18<br />

names.................................................... 18, 23<br />

non-unique.................................................. 18<br />

selecting...................................................... 18<br />

standard code.............................................. 18<br />

unusable...................................................... 23<br />

Set Template Regions........................................ 13<br />

color scheme............................................... 36<br />

Set Template Regions toolbar button ................ 36<br />

side chains......................................................... 36<br />

optimizing................................................... 37<br />

predicting.................................................... 36<br />

rotamers ...................................................... 37<br />

unfreezing in loop refinement..................... 66<br />

similarity, BLOSUM62 matrix ......................... 23<br />

Slide As Block ............................................ 13, 33<br />

Slide Freely ................................................. 13, 33<br />

solvation............................................................ 40<br />

sorting tables, by column heading ................ 8, 22<br />

SSA ................................................................. 136<br />

SSP profile, in Fold Recognition ...................... 43<br />

SSpro................................................... 29, 44, 121<br />

SSPs ................................................................ 136<br />

command menu for....................................... 8<br />

composite.................................................... 33<br />

deleting ................................................... 8, 31<br />

editing......................................................... 13<br />

exporting........................................... 8, 30, 43<br />

in Fold Recognition .................................... 43<br />

hiding all....................................................... 8<br />

importing .............................................. 30, 43<br />

retrieving unmodified ........................... 31, 46<br />

in Sequence Viewer ...................................... 8<br />

standard residues............................................... 51<br />

Step View ............................................................ 8<br />

steps<br />

action ............................................................ 8<br />

available and unavailable.............................. 9<br />

Find Homologs ..................................... 17, 18<br />

Input Sequence ........................................... 17<br />

name ............................................................. 8<br />

panel layout .................................................. 7<br />

structalign utility.................................... 115<br />

structurally aligned templates ........................... 23<br />

structure refinement, loop ................................. 51<br />

structures<br />

coloring....................................................... 14<br />

exporting..................................................... 115<br />

Surface Generalized Born (SGB)...................... 40<br />

T<br />

tables ................................................................... 8<br />

cells............................................................. 22<br />

sorting........................................................... 8<br />

in Step View.................................................. 8<br />

of templates ................................................ 47<br />

Template ID color scheme ................................ 11<br />

template sequence ........................................... 136<br />

templates<br />

multiple....................................................... 36<br />

regions ........................................................ 36<br />

structurally aligned ..................................... 23<br />

Templates table ................................................. 47<br />

third-party programs ......................... 44, 121, 122<br />

Threading ........................................................ 136<br />

Threading path ............................................ 17, 25<br />

tilde symbol, for unlocked gap.................... 14, 33<br />

Title, PDB header.............................................. 23<br />

toolbar, <strong>Prime</strong>................................................ 8, 13<br />

tooltips .......................................................... 8, 22<br />

for buttons................................................... 13<br />

for table entries............................................. 8<br />

truncated-Newton.............................................. 62<br />

U<br />

unlocking gaps ............................................ 14, 33<br />

unusable sequence............................................. 23<br />

Update (button) ................................................. 33<br />

utilities<br />

getpdb...................................................... 22<br />

seqconvert .............................. 30, 43, 116<br />

structalign........................................ 115<br />

V<br />

V (Visualization) column.................................. 49<br />

View SSA.......................................................... 14<br />

View Structure............................................. 12, 14<br />

W<br />

warnings...................................... 18, 23, 127–129<br />

workflow ....................................................... 9, 17<br />

Workspace........................................... 10, 14, 136<br />

Z<br />

Z-Score...................................................... 46, 136


120 West 45th Street, 29th Floor<br />

New York, NY 10036<br />

Zeppelinstraße 13<br />

81669 München, Germany<br />

101 SW Main Street, Suite 1300<br />

Portland, OR 97204<br />

Dynamostraße 13<br />

68165 Mannheim, Germany<br />

8910 University Center Lane, Suite 270<br />

San Diego, CA 92122<br />

Quatro House, Frimley Road<br />

Camberley GU16 7ER, United Kingdom<br />

SCHRÖDINGER ®
