
Caché High Availability Guide - InterSystems Documentation


Table of Contents

Introduction
1 Write Image Journaling and Recovery
  1.1 Write Image Journaling
    1.1.1 Image Journal
    1.1.2 Two-Phase Write Protocol
  1.2 Recovery
    1.2.1 Recovery Procedure
  1.3 Error Conditions
    1.3.1 If Recovery Cannot Complete (UNIX and OpenVMS)
    1.3.2 Sample Recovery Errors
    1.3.3 Write Daemon Panic Condition
    1.3.4 Write Daemon Errors and System Crash
    1.3.5 Freeze Writes on Error
    1.3.6 Responding to a Freeze
  1.4 Limitations of Write Image Journaling
2 Backup and Restore
  2.1 Backup Integrity and Recoverability
  2.2 Importance of Journals
  2.3 Backup Methods
    2.3.1 External Backup
    2.3.2 Online Backup
  2.4 Configuring Caché Backup Settings
    2.4.1 Define Database Backup List
    2.4.2 Configure Backup Tasks
    2.4.3 Schedule Backup Tasks
  2.5 Managing Caché Online Backups
    2.5.1 Run Backup Tasks
    2.5.2 View Backup Status
    2.5.3 View Backup History
    2.5.4 Error Handling for Backups
    2.5.5 Backing Up Selected Globals and Routines
  2.6 Restoring from a Backup
    2.6.1 Using the Backup History to Recreate the Database
    2.6.2 Suspending Database Access During a Restore
    2.6.3 Restoring Database Properties
    2.6.4 Performing a Restore
    2.6.5 Error Handling for Restore
  2.7 Caché Backup Utilities
    2.7.1 Estimating Size of Backups
    2.7.2 Caché ^BACKUP Routine
  2.8 UNIX Backup and Restore
    2.8.1 Using UNIX Backup Utilities
    2.8.2 cbackup Utility
  2.9 OpenVMS Backup and Restore
    2.9.1 Efficiency
    2.9.2 Concurrent Operation
    2.9.3 History Log
    2.9.4 Using the OpenVMS BACKUP Utility
    2.9.5 Using CBACKUP.COM
    2.9.6 Restore on OpenVMS
  2.10 Sample Backup Scripts
    2.10.1 External UNIX Backup Script
    2.10.2 Concurrent Caché Backup for UNIX Script
    2.10.3 Non-Concurrent External Backup
3 Journaling
  3.1 Journaling Overview
    3.1.1 Differences Between Journaling and Write Image Journaling
    3.1.2 Protecting Database Integrity
    3.1.3 Automatic Journaling of Transactions
    3.1.4 Rolling Back Incomplete Transactions
    3.1.5 Using Temporary Globals and CACHETEMP
  3.2 Configuring Journaling
    3.2.1 Configure Journal Settings
    3.2.2 Journaling Best Practices
  3.3 Journaling Operation Tasks
    3.3.1 Start Journaling
    3.3.2 Stop Journaling
    3.3.3 Switch Journal Files
    3.3.4 View Journal Files
    3.3.5 Purge Journal Files
    3.3.6 Restore Journal Files
  3.4 Journaling Utilities
    3.4.1 Perform Journaling Tasks Using ^JOURNAL
    3.4.2 Start Journaling Using ^JRNSTART
    3.4.3 Stop Journaling Using ^JRNSTOP
    3.4.4 Restore Globals From Journal Files Using ^JRNRESTO
    3.4.5 Filter Journal Records Using ^ZJRNFILT
    3.4.6 Switch Journal Files Using ^JRNSWTCH
    3.4.7 Display Journal Records Using ^JRNDUMP
    3.4.8 Update Journal Settings Using ^JRNOPTS
    3.4.9 Recover from Startup Errors Using ^STURECOV
    3.4.10 Convert Journal Files Using ^JCONVERT and ^JREAD
    3.4.11 Set Journal Markers Using ^JRNMARK
    3.4.12 Manipulate Journal Files Using ^JRNUTIL
    3.4.13 Manage Journaling at the Process Level Using %NOJRN
  3.5 Special Considerations for Journaling
    3.5.1 Journal Management Global
    3.5.2 Performance
    3.5.3 Journal I/O Errors
4 Shadow Journaling
  4.1 Shadowing Overview
  4.2 Configuring Shadowing
    4.2.1 Configuring the Source Database Server
    4.2.2 Configuring the Destination Shadow
    4.2.3 Journaling on the Destination Shadow
  4.3 Using Shadowing
    4.3.1 Shadow Administration Tasks
    4.3.2 Shadow Operations Tasks
  4.4 Using the Shadow Destination for Disaster Recovery
5 System Failover Strategies
  5.1 No Failover
  5.2 Cold Failover
  5.3 Warm Failover
  5.4 Hot Failover
6 Cluster Management
  6.1 Overview of Clusters
    6.1.1 Cluster Master
    6.1.2 Cluster Master as Lock Server
  6.2 Configuring a Caché Cluster
  6.3 Managing Cluster Databases
    6.3.1 Creating Caché Database Files
    6.3.2 Mounting Databases
    6.3.3 Deleting a Cluster-Mounted Database
  6.4 Caché Startup
  6.5 Write Image Journaling and Clusters
  6.6 Cluster Backup
  6.7 System Design Issues for Clusters
    6.7.1 Determining Database File Availability
  6.8 Cluster Application Development Strategies
    6.8.1 Block Level Contention
  6.9 Caché ObjectScript Language Features
    6.9.1 Remote Caché ObjectScript Locks
  6.10 DCP and UDP Networking
7 Cluster Journaling
  7.1 Journaling on Clusters
    7.1.1 Cluster Journal Log
    7.1.2 Cluster Journal Sequence Numbers
  7.2 Cluster Failover
    7.2.1 Cluster Recovery
    7.2.2 Cluster Restore
    7.2.3 Failover Error Conditions
  7.3 Cluster Shadowing
    7.3.1 Configuring a Cluster Shadow
    7.3.2 Cluster Shadowing Limitations
  7.4 Tools and Utilities
  7.5 Cluster Journal Restore
    7.5.1 Perform a Cluster Journal Restore
    7.5.2 Generate a Common Journal File
    7.5.3 Perform a Cluster Journal Restore after a Backup Restore
    7.5.4 Perform a Cluster Journal Restore Based on Caché Backups
  7.6 Journal Dump Utility
  7.7 Startup Recovery Routine
  7.8 Setting Journal Markers on a Clustered System
  7.9 Cluster Journal Information Global
  7.10 Shadow Information Global and Utilities
8 Caché Clusters on Tru64 UNIX
  8.1 Tru64 UNIX Caché Cluster Overview
  8.2 TruCluster File System Architecture
    8.2.1 Caché and CDSLs
    8.2.2 Remastering AdvFS Domains
  8.3 Planning a Tru64 Caché Cluster Installation
  8.4 Tuning a Tru64 Caché Cluster Member
9 Caché and Windows Clusters
  9.1 Single Failover Cluster
    9.1.1 Setting Up a Failover Cluster
  9.2 Example Procedures
    9.2.1 Create a Cluster Group
    9.2.2 Create an IP Address Resource
    9.2.3 Create a Physical Disk Resource
    9.2.4 Install Caché
    9.2.5 Create a Caché Cluster Resource
  9.3 Multiple Failover Cluster
    9.3.1 Setting Up a Multiple Failover Cluster
10 ECP Failover
  10.1 ECP Recovery
  10.2 ECP and Clusters
    10.2.1 Application Server Fails
    10.2.2 Data Server Fails
    10.2.3 Network Is Interrupted
    10.2.4 Cluster as an ECP Database Server


List of Figures

Shadowing Overview
Relationships of Shadow States and Permissible Actions
Cold Failover Configuration
Warm Failover Configuration
Hot Failover Configuration
Cluster Shadowing Overview
Example of Tru64 Cluster Configuration
Single Failover Cluster
Failover Cluster with Node Failure
IP Address Advanced Properties
IP Address Parameter Properties
Physical Disk Dependency Properties
Cluster Resource General Properties
Cluster Resource Dependencies Properties
Cluster Resource Advanced Properties
Cluster Resource Parameter Properties
Multiple Failover Cluster
Multiple Failover Cluster with Node Failure


List of Tables

Conditions Affecting Write Daemon Errors
Write Daemon Error Conditions
Backup Task Descriptions
Restore Actions
Values of backup_type
UNIX Backup Utilities and Commands
Journal Data Record Fields Displayed by ^JRNDUMP
Journal File Command Type Codes
Functions Available in ^JRNUTIL


Introduction

As organizations rely more and more on computer applications, it is vital to safeguard the contents of databases. This guide explains the many mechanisms that Caché provides to maintain a highly available and reliable system. It describes strategies for recovering quickly from system failures while maintaining the integrity of your data.

Caché write image journaling technology protects against internal integrity failures due to system crashes. Caché backup and journaling systems provide rapid recovery from physical integrity failures. Logical database integrity is ensured through transaction processing, locking, and automatic rollback.

In addition, other mechanisms are available to maintain high availability, including shadow journaling and various recommended failover strategies involving Caché ECP (Enterprise Cache Protocol) and clustering. The networking capabilities of Caché can be customized to allow cluster failover.

The following topics are addressed:

• Write Image Journaling and Recovery
• Backup and Restore
• Journaling
• Shadow Journaling
• System Failover Strategies
• Cluster Management
• Cluster Journaling
• Caché Clusters on Tru64 UNIX
• Caché and Windows Clusters
• ECP Failover


1 Write Image Journaling and Recovery

Caché uses write image journaling to maintain the internal integrity of your Caché database. It is the foundation of the database recovery process.

This chapter discusses the following topics:

• Write Image Journaling
• Recovery
• Error Conditions
• Limitations

1.1 Write Image Journaling

Caché safeguards database updates by using a two-phase technique, write image journaling, in which updates are first written from memory to a transitional journal, CACHE.WIJ, and then to the database. If the system crashes during the second phase, the updates can be reapplied upon recovery. The following topics are covered in greater detail:

• Image Journal
• Two-Phase Write Protocol


1.1.1 Image Journal

The Write daemon is activated at Caché startup and creates an image journal. The Write daemon records database updates here before writing them to the Caché database.

By default, the write image journal (WIJ) is named CACHE.WIJ and resides in the system manager directory, usually CacheSys/Mgr, where CacheSys is the installation directory.

To specify a different location for this file, use the System Management Portal:

1. Navigate to the [Home] > [Configuration] > [Journal Settings] page.
2. Enter the new location of the image journal file in the Write image journal directory box and click Save. The name must identify an existing directory on the system and may be up to 63 characters long. If you edit this setting, restart Caché to apply the change.

Important: InterSystems recommends locating the write image journal (WIJ) file on a separate disk from the database disks (those that contain the CACHE.DAT files) to reduce risk and increase performance.

On some Linux and UNIX platforms, using a raw partition may improve performance. A raw partition is a UNIX character mode special file type that allows raw access to a contiguous portion of a physical disk. To place the image journal in a raw partition:

1. Calculate the size of the partition by adding the amount of database cache, the amount of routine buffer space, plus 10 megabytes. The result is the number of bytes you need to assign to the raw partition.
2. Create a raw partition of that size. See your UNIX system documentation for details.
3. Follow the procedure above from the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal to specify the raw partition name for the Write image journal directory setting.

CAUTION: The WIJ file should never be put on a networked disk.

1.1.2 Two-Phase Write Protocol

Caché maintains application data in databases whose structure enables fast, efficient searches and updates. A database update occurs when a Set, Kill, ZSave, or ZRemove command is issued. Generally, when an application updates data, Caché must modify a number of blocks in the database structure to reflect the change.
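As section 1.1 outlines, every update goes first to the transitional journal (the WIJ) and only then to the database blocks. The following Python sketch illustrates that two-phase idea in miniature. It is not the real Caché implementation or the actual CACHE.WIJ format; the file layout, the `wij_path`, `db`, and `changes` names, and the use of JSON are all illustrative assumptions.

```python
import json
import os

def two_phase_update(wij_path, db, changes):
    """Toy two-phase write: journal all block changes, flag the journal
    complete, apply the changes to the database, then clear the journal.
    A crash during phase 2 leaves a complete journal from which the
    update can be reapplied; a crash during phase 1 leaves the database
    untouched and consistent."""
    # Phase 1: record every change in the write image journal first.
    record = {"complete": False, "changes": changes}
    with open(wij_path, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())  # make the journal durable before flagging it
    record["complete"] = True  # flag: all updates are in the journal
    with open(wij_path, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())

    # Phase 2: apply the journaled changes to the database blocks.
    for block, value in changes.items():
        db[block] = value

    # Mark the journal empty once every block has reached the database.
    with open(wij_path, "w") as f:
        json.dump({"complete": False, "changes": {}}, f)
    return db
```

Because the journal is flagged complete only after every change is recorded, an interrupted update is either fully journaled (and can be redone) or fully absent, which is the ordering guarantee described in the next section.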


Due to the sequential nature of disk access, any sudden, unexpected interruption of disk or computer operation can halt the update of multiple database blocks after the first block has been written but before the last block has been updated. This incomplete update leads to an inconsistent database structure. The consequences can be as severe as a database that is totally unusable, with all data irretrievable by normal means.

The Caché write image journaling technology uses a two-phase process of writing to the database to protect against such events, as follows:

• In the first phase, Caché records the changes needed to complete the update in the write image journal. Once it enters all updates to the write image journal, it sets a flag in the file and the second phase begins.
• In the second phase, the Write daemon writes the changes recorded in the write image journal to the database on disk. When this second phase completes, the Write daemon sets a flag in the write image journal to indicate it is empty.

When Caché starts, it automatically checks the write image journal and runs a recovery procedure if it detects that an abnormal shutdown occurred. When the procedure completes successfully, the internal integrity of the database is restored.

Caché write image journaling guarantees the order of updates. The Write daemon records all database modifications in the image journal. For example, assume that modifications A, B, and C normally occur in that order, but that only B is split over multiple blocks. All three modifications are in the image journal, and are written to the database, so either all three are in the database following a failure, or none of them are.

1.2 Recovery

When Caché starts, it automatically checks the write image journal and runs a recovery procedure if it detects that an abnormal shutdown occurred. Recovery is necessary if a system crash or other major system malfunction occurs at either of the following points in the two-phase write protocol process:

• Before the Write daemon has completed writing the update to the write image journal. In this case, recovery discards the incomplete entry and updates are lost. However, the databases are in a consistent and usable state, and the transaction journal file, if it is being used, can be applied to restore any updates that may have been lost because they had not yet been written to the database. See the Journaling chapter for more information.


• After the update to the write image journal is complete but before the database is updated. In this case, the recovery procedure applies the updates from the write image journal file to the database to restore internal database integrity.

1.2.1 Recovery Procedure

If the write image journal is marked as “complete,” the Write daemon completed writing modified disk blocks to the image journal but had not completed writing the blocks back to their respective databases. This indicates that restoration is needed. The recovery program, cwdimj, does the following:

• Informs the system manager in the recovery log file.
• Performs dataset recovery.
• Continues and completes restoration.

1.2.1.1 Recovery Log File

The recovery procedure records its progress in the cconsole.log file in the Caché system manager directory. This file contains a record of output from all recoveries run in the %SYS namespace. To view the file, open it with a text viewer or editor. You can also view its contents from the [Home] > [System Logs] > [View Console Log] page of the System Management Portal.

1.2.1.2 Dataset Recovery

The recovery procedure allows you to confirm the recovery on a dataset-by-dataset basis. Normally, you specify all datasets. After each dataset prompt, type either:

• Y — to restore that dataset
• N — to reject restoration of that dataset

You can also specify a new location for the dataset if the path to it has been lost but you can still access the dataset. Once a dataset has been recovered, it is removed from the list of datasets requiring recovery and is not recovered during subsequent runs of the cwdimj program, should any be necessary. Typically, all recovery is performed in a single run of the cwdimj program.
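The recovery pass described above — replay the journal only when it is flagged complete, confirm each dataset, and drop recovered datasets from the pending list — can be sketched as follows. This is a toy model, not the real cwdimj program; the JSON journal layout and the `confirm` callback standing in for the operator prompt are illustrative assumptions.

```python
import json

def recover(wij_path, databases, confirm=lambda name: "Y"):
    """Toy startup recovery. If the journal is marked complete, phase 2
    was interrupted: reapply the journaled blocks to each dataset the
    operator confirms, removing each recovered dataset from the pending
    list so a later run does not recover it again."""
    with open(wij_path) as f:
        wij = json.load(f)
    if not wij.get("complete"):
        # Phase 1 never finished: discard the incomplete entry. The
        # databases are consistent as-is; lost updates, if journaled,
        # would come back from the transaction journal instead.
        return []
    pending = list(wij["changes"])  # datasets still requiring recovery
    recovered = []
    for name in list(pending):
        if confirm(name).upper() != "Y":  # operator rejected this dataset
            continue
        databases[name].update(wij["changes"][name])
        pending.remove(name)  # not re-recovered on subsequent runs
        recovered.append(name)
    return recovered
```

With the default `confirm`, every dataset is restored, mirroring the unattended case in which Caché takes the default action at each prompt.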


1.2.1.3 Completes Restoration

If no operator is present during the recovery procedure, Caché takes default actions in response to prompts: it restores all directories and automatically marks the write image journal as deleted. However, if a problem occurs during recovery, the cwdimj program aborts and the system is not started. Any datasets that were not successfully recovered are still marked as requiring recovery in the write image journal. See the Error Conditions section for more information.

When the recovery procedure is complete, the recovery program asks whether it should mark the contents of the write image journal as “deleted.” If recovery has successfully written all blocks, answer “Yes.” However, if an error occurred during writing, or if you chose not to write the blocks, answer “No”; otherwise, you will most likely cause database degradation. Caché cannot run until either the contents of this file have been deleted or the file has been removed or renamed.

When recovery completes normally, the write image journal is marked as deleted, and startup continues. If the Write daemon cannot create the write image journal, it halts all database modifications. The halt continues until the Write daemon can create the image journal, or until you shut down the system.

Once the Write daemon is able to create the image journal, it sends the following message to the console log:

Database updates have resumed

1.3 Error Conditions

If an error occurs that could cause database degradation, the Write daemon’s action depends on the condition under which the error occurs.

Conditions Affecting Write Daemon Errors

Condition: Database freezes on error.
Write daemon action: The Write daemon freezes the system and logs to the operator’s console an error message of the type shown in the Freeze Writes on Error section.


Condition: Error trapping is enabled with the command SET $ZT="^%ET".
Write daemon action: Error trapping halts the process where the error occurred. One of the error conditions listed in the Write Daemon Error Conditions table is stored in the ^ERTRAP global in the Caché database, unless there is a file-full condition in that database. In that case, the halt occurs with no indication as to why.

Condition: The error occurred as a result of a Caché ObjectScript command entered in programmer mode.
Write daemon action: One of the errors listed in the Write Daemon Error Conditions table appears on your screen.

Condition: A serious disk write error occurred in a Caché database file.
Write daemon action: The Write daemon freezes the system and displays the following message: “SERIOUS DISK WRITE ERROR - WILL RETRY”. If it cannot recover, it displays a message of the type shown in the Freeze Writes on Error section. If it is able to recover, database updates resume.

Condition: A serious disk read or write error occurred in the write image file.
Write daemon action: The Write daemon freezes the system while it attempts to recover, and displays one of the following messages: “SERIOUS DISK ERROR WRITING IMAGE FILE - WILL RETRY” or “SERIOUS DISK ERROR READING IMAGE FILE - WILL RETRY”. If it cannot recover, it displays a message of the type shown in the Freeze Writes on Error section. If it is able to recover, database updates resume.

1.3.1 If Recovery Cannot Complete (UNIX and OpenVMS)

If recovery cannot complete, Caché prompts you to choose between the following two options:

• Abort startup, fix the problem that prevented recovery, and try again. This option is preferable if you have time for it.
• Delete or rename the write image journal file and continue startup. Caché will run with one or more databases suffering degradation caused when an update in progress did not complete when the system crashed or while recovery took place.
If you delete the write image journal, you must restore those databases from backups or use repair utilities to fix them.


1.3.2 Sample Recovery Errors

1.3.2.1 Error Opening CACHE.DAT

If recovery cannot open a cache.dat or cache.ext file that needs to be restored, you see this message during the write phase:

Can't open file: /usr/cache/cache.dat
Its blocks weren't written

Recovery continues trying to write blocks to all other directories to be restored. If this happens, do the following:

1. Do not delete the write image journal.
2. Try to correct the problem with the Caché database on which the error occurred.
3. Restart and let recovery try again.

Directories that were restored the first time are not listed as having blocks to be written during this second recovery attempt.

1.3.2.2 Error Writing to Caché Block

If recovery starts to write to a Caché database file but cannot write a particular block number, you see this message:

Error writing block number xxxx

If this error occurs four times in a single restoration, the restoration aborts, and you see this message:

Error writing block number xxxx
Do you want to delete the Write Image File (Y/N)? Y =>

Enter N to retain the write image journal. Recovery attempts to continue. If it still does not succeed and you receive this message again, call InterSystems Technical Support. If you must continue immediately, you can delete or rename the write image journal. If you delete it, you lose all changes recorded in it.

1.3.2.3 Error Reading Write Image Journal

If an error occurs when recovery attempts to read the write image journal file, you see this message:


Do you want to write them now (Y/N)? Y => Yes
*** WRITING ABORTED***
Can't read Cache Write Image File
Do you want to delete the Write Image File (Y/N)? Y =>

1.3.3 Write Daemon Panic Condition

If the global buffer pool is full of blocks that need to be written to databases, the Write daemon may enter a state where it cannot write to its write image journal. Before this happens, it notifies you on the operator’s console and in the cconsole.log file. It then prints a message for each block written to the database that was not written first to the write image journal. This technique allows you to track the cause of any subsequent database degradation.

If the condition clears because global buffers have been freed, the Write daemon informs you on the operator’s console that the panic condition has ended. If the panic condition does not end, the system may hang. If so, running cstop automatically calls cforce, in which case you most likely have database degradation.

To avoid this situation, allocate more database cache from the System Management Portal. If a panic condition message appears on the operator’s console, try adding 1 MB to the cache.

1.3.4 Write Daemon Errors and System Crash

Caché does not allow database modifications in the event of a Write daemon error. Because all updates to every database on the system are suspended when a Write daemon error occurs while accessing any of the databases, you avoid database degradation.

If the system freezes, you must stop Caché and restart the system.

Under rare circumstances, database degradation can occur that cannot be rectified by write image journaling.
Run Integrity on the global identified in the error message that the Write daemon logged when the freeze occurred.

1.3.5 Freeze Writes on Error

When the Write daemon encounters an error while writing a block, it freezes all processes performing database updates and logs an error message to the operator’s console log, cconsole.log, for as long as the freeze continues. It sends the error messages first at thirty-second, one-, two-, and four-minute intervals, and then at regular eight-minute intervals.

If the cause of the freeze is an offline or write-protected disk, an operator can fix the problem and processing can continue. Otherwise, to recover from a freeze, you need to run:

ccontrol force


and then:

ccontrol start

When the system freezes due to an error, the Write daemon generates an operator console error message that reports the exact error that caused the system to freeze, as well as the name of the cache.dat file and the global or routine that was involved in causing the error. The following is an example of an error message that would occur when accessing a global:

*** CACHE: AN ERROR OCCURRED WHILE UPDATING A CACHE.DAT FILE THAT COULD CAUSE
DATABASE DEGRADATION. TO PREVENT DEGRADATION ALL USER PROCESSES PERFORMING
DATABASE UPDATES HAVE BEEN SUSPENDED AND THE WRITE DAEMON WILL NOT RUN.
ERROR: FILE: DUA0:[SYSM]
GLOBAL: ^UTILITY

If the error occurs while accessing a routine, the last part of the error message reads:

ROUTINE: TESTING

The following table describes the errors that can occur during a database update and provides some possible solutions. Not every occurrence of these errors freezes the system; the system freezes only when the error occurs during a critical part of a database update.

Write Daemon Error Conditions

Meaning: A block could not be allocated when information was added to a database because no blocks were available.
Solution: Determine whether there is expansion room in the Caché database. If not, increase the maximum size. Otherwise, determine whether there is enough physical space on the disk.

Meaning: During an attempt to access a block in a file, the request to the operating system failed. This failure may have occurred because the disk is offline or because the actual size of the file is less than the expected size.
Solution: Check that the disk is online. If it is, run Integrity on the global where the error occurred.

Meaning: A database integrity problem has been encountered.
Solution: Run Integrity on the global where the error occurred.


Meaning: When Caché must extend the current file to allocate a block for the pointer level of the physical database structure, it uses the user’s process string stack to formulate the file name. If the string stack is full, this error occurs.

Meaning: Caché tried to find available user process memory for creating structures it needed to expand a file to allocate a pointer block, but none was available.

Meaning: Extreme database degradation occurred while updating pointer blocks.

Solution: Stop and then restart Caché. If the problem still exists, contact the InterSystems Worldwide Response Center (WRC).

Once the problem is corrected, database updates are re-enabled.

1.3.6 Responding to a Freeze

If a freeze occurs, follow the procedure below:

1. Check the operator console to see the directory, the global or routine, and the process in which the error occurred.
2. Fix any causes of the error that you can correct easily. For example, put a disk online.
3. If updates do not resume, stop Caché.
4. Restart Caché.
5. Fix any causes of the error you could not correct earlier. For example, if the error indicated that no space was available, you would need to provide more physical disk space, add a volume set, or increase the maximum size of the affected Caché database.
6. Run Integrity on the global or routine directory in the database where the error occurred to verify that no degradation occurred.

Some error conditions indicate that database degradation may exist. If degradation exists, try the ^REPAIR utility or contact the WRC.
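As an illustration of step 1, the fields of the freeze message can be extracted mechanically from cconsole.log. The following is a hypothetical Python helper; the message layout is taken from the example in the Freeze Writes on Error section and may differ across platforms and releases.

```python
import re

# Hypothetical helper: pull the affected file and global/routine out of a
# Write-daemon freeze message of the form shown in the example above.
FREEZE_RE = re.compile(
    r"ERROR:\s+.*?FILE:\s*(?P<file>\S+).*?"
    r"(?:GLOBAL:\s*(?P<glob>\^\S+)|ROUTINE:\s*(?P<routine>\S+))",
    re.DOTALL,
)

def parse_freeze_message(text):
    """Return the file and the global or routine named in a freeze message."""
    m = FREEZE_RE.search(text)
    if not m:
        return None
    return {"file": m.group("file"),
            "global": m.group("glob"),
            "routine": m.group("routine")}

sample = """*** CACHE: AN ERROR OCCURRED WHILE UPDATING A CACHE.DAT FILE ...
ERROR:  FILE: DUA0:[SYSM]
GLOBAL: ^UTILITY"""
print(parse_freeze_message(sample))
# {'file': 'DUA0:[SYSM]', 'global': '^UTILITY', 'routine': None}
```

A monitoring script could apply this to new cconsole.log entries to page an operator with the affected database and global, rather than requiring someone to read the console.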


Certain error conditions can cause degradation that write image journaling cannot repair; see the Limitations section.

1.4 Limitations of Write Image Journaling

While the two-phase write protocol safeguards structural database integrity, it does not prevent data loss. If a system failure occurs before an update has been completely written to the write image journal, Caché does not have all the information it needs to perform a complete update to disk. Hence, that data is lost.

In addition, write image journaling cannot eliminate internal database degradation in the following cases:

• A hardware malfunction on the drive that contains the temporary write image journal prevents Caché from reading this file. Note that the Write daemon freezes if the malfunction occurs during an attempt to read or write this temporary file while Caché is operating. In most cases this means that a malfunction of this disk results only in data loss, not database degradation.

• A drive malfunctions and its contents are irretrievably lost or permanently unalterable. You must restore the backup of this database for the directories using the malfunctioning drive. However, write image journaling can still restore directories on other disks.

• A single process disappears (for example, due to a segmentation fault) while within the global module. Such a situation could occur if:
  - On Windows NT, the Task Manager is used to halt a single process.
  - On OpenVMS or UNIX, the terminal for that process is disconnected.
  - On OpenVMS, a STOP/ID is issued. See the $ZUTIL(69,24) entry in the Caché ObjectScript Reference for further details.

• In an obscure situation, drive A contains pointer blocks to drive B, a Kill command deletes those pointers, and drive A becomes inoperable after the Garbage Collector begins its work but before the pointer block is rewritten. In this situation, write image journaling could fail.
This condition usually follows another failure that would prevent this situation from being a problem. Furthermore, this situation is also likely to be one in which drive A has malfunctioned to such an extent that you would need to restore the database for that drive anyway.


If you believe that one of these situations has occurred, please contact the WRC.


2 Backup and Restore

This chapter outlines the factors in developing a solid plan for backing up your Caché system. It discusses techniques for ensuring the integrity and recoverability of your backups, as well as suggested backup methodologies. Later sections of the chapter contain details about the procedures used to perform these tasks, either through the System Management Portal or by using Caché and third-party utilities. It discusses the following topics:

• Backup Integrity and Recoverability
• Importance of Journals
• Backup Methods
• Configuring Caché Backup Settings
• Managing Caché Online Backups
• Restoring from a Backup
• Caché Backup Utilities
• UNIX Backup and Restore
• OpenVMS Backup and Restore
• Sample Backup Scripts

Backup strategies can differ depending on your operating system, preferred backup utilities, disk configurations, and backup devices. If you require further information to help you develop a backup strategy tailored for your environment, or to review your current backup practices, please contact the InterSystems Worldwide Response Center.


2.1 Backup Integrity and Recoverability

Regardless of the backup methods you use, it is critical to restore backups on a regular basis to ensure that your backup strategy is a workable means of disaster recovery. The best practice is to restore every backup of the production environment to an alternate server, and then check the physical structure of the restored databases. This provides the following backup validation functions:

• Validates the recoverability of the backup media.
• Validates the global-level integrity of the databases in the backup.
• Provides a warm copy of the backup, substantially reducing the time required to restore the backup in the event of a disaster; only journal files then need to be restored.
• Establishes a last known good backup.

See the Check Database Integrity section of the “Managing Caché” chapter of the Caché System Administration Guide for details.

The backup methods described in this document preserve the physical structure of the database; therefore, a clean integrity check of the restored copy implies that the integrity of the production database was sound at the time of the backup. The converse, however, is not true: an integrity error detected on the restored copy of a database does not necessarily imply that there are integrity problems on the production database. There could, for example, be errors in the backup media. If you discover an integrity error in the restored database, immediately run an integrity check on the production database to verify the integrity of the production system.

To further validate that the application is working correctly on the restored database, you can also perform application-level checks. To perform these checks, you may need to restore journal files to restore transactional integrity.
See the Importance of Journals section for more information.

Once you restore the backup and establish that it is a viable source of recovery, it is best to preserve that restored copy until you establish the next good backup. Therefore, the server on which you are validating the backup should ideally have twice the storage space required by production: space to store the last known good backup as well as the backup currently being validated.
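The promotion rule for the last known good backup can be sketched as follows. This is a hypothetical Python illustration, not an InterSystems API; `integrity_ok` stands in for the restore-and-check procedure on the alternate server.

```python
# Sketch of the backup-validation cycle described above: a candidate backup
# becomes the last known good only after it restores cleanly.

class BackupValidator:
    def __init__(self):
        self.last_known_good = None   # the warm copy kept until the next good backup

    def validate(self, backup_name, integrity_ok):
        """Promote `backup_name` to last known good only on a clean check."""
        if integrity_ok:
            self.last_known_good = backup_name
            return "promoted"
        # A failed validation must not disturb the previous good copy
        # (or the journal files retained alongside it).
        return "rejected"

v = BackupValidator()
v.validate("full_monday", integrity_ok=True)
v.validate("full_tuesday", integrity_ok=False)   # e.g., bad backup media
print(v.last_known_good)   # full_monday remains the recovery source
```

The point of the rule is that a failed validation never overwrites the warm copy, which is why the validation server needs room for two generations of backups.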


Depending on your needs, the performance requirements of the storage device used for restoring backups may be less stringent, perhaps allowing for a less expensive storage solution. In this way, the last known good backup is always available for use in a disaster, even if validation of the current backup fails. Retain all journal files corresponding to the last known good backup until you identify a new backup as the last known good backup. To protect the enterprise from a disaster that could destroy the physical plant, regularly ship backup media to a secure off-site location.

Backups can be performed during transaction processing. In this case, the resulting backup file contains partial transactions. When restoring from a backup, you first restore the backup file, then restore from the journal file to complete the partial transactions in the backup file.

2.2 Importance of Journals

A backup of a Caché database alone is not enough to provide a viable restore of production data. In the event of a disaster that requires restoring from backup, you will always apply journal files to the restored copy of the database. Applying journal files restores all journaled updates from the time of the backup up to the time of the disaster. Applying journals is also necessary to restore the transactional integrity of your database by rolling back uncommitted transactions (the databases may have contained partial transactions at the time of the backup).

It is critical to ensure that journal files are available to be restored in the event of a disaster. Take the following steps to prevent compromising the journal files in the event of a disaster that requires you to restore databases:

• Verify that all databases for which durability and recoverability are required are set to be journaled.
• A journal file must not be purged unless it was closed prior to the last known good backup, as determined by the backup validation procedure discussed previously.
Set the number of days to keep journal files appropriately.

• Define an alternate journal directory.
• The primary and alternate journal directories should reside on disk devices that are separate from the storage of the databases, separate from the storage of the write image journal (WIJ), and separate from each other (the primary and alternate journal directories should reside on different devices). For practical reasons, these different devices may simply be different logical unit numbers (LUNs) on the same storage area network (SAN), but the general rule is: the more separation, the better. As far as possible, the system should be


configured so that the journals are isolated from any failure that may compromise the databases or the WIJ, because if the database or WIJ is compromised, restoring from backup and journals may be required.

• Hardware redundancy such as mirroring can be used to help protect the journals. Long-distance replication can also provide a real-time off-site copy of the journal files. The off-site copy of the journals allows recovery from a disaster in which the physical plant is destroyed (in conjunction with the off-site copy of the backup media).

• Enable the “Freeze System on Journal Error” option. If a journaling failure occurs in which journaling can no longer write to either the primary or the alternate journal device, the system can be configured to freeze. The alternative is to allow the system to continue with journaling disabled, which, among other things, compromises the ability to reliably restore from backup and journals.

Important: It is critical to periodically test the entire disaster recovery procedure from start to finish. This includes backup restore, journal restore, and running simulated user activity on the restored environment.

2.3 Backup Methods

The two main methods of backing up Caché data are the external backup and the Caché online backup. Each of these methods has variations on how to implement it; your backup strategy can contain multiple types of backups performed at different times and with different frequency. This section describes the details and variations of the following types of backups:

• External Backup
• Online Backup

2.3.1 External Backup

This strategy is used in conjunction with technology that provides the ability to quickly create a functional “snapshot” of a logical disk volume.
Such technologies exist at various levels, such as simple disk mirrors, volume shadowing at the operating system level, or more modern snapshot technologies provided at the SAN level.

The external backup approach is especially attractive for enterprises that have a very large amount of data, where the output of a Caché online backup would be so large as to be


unwieldy. The approach is to freeze writes to all database files for the duration required to create a snapshot, then create a snapshot of the disk using the technology of choice. When the snapshot has been created, the system is thawed to allow writes to continue to the database while the snapshot image is copied to the backup media.

While the ability to do this has always existed in Caché, this release provides the Backup.General class with class methods to simplify and enhance this technique. Using the new class methods on a nonclustered instance of Caché, only physical writes to the database are paused during the creation of the snapshot, while user processes are allowed to continue performing updates in memory. This allows for a zero-downtime external backup on nonclustered systems. You should be careful to ensure that this mechanism is used with a disk technology that can create the snapshot within several minutes. If writes are paused for an extended period of time, user processes could hang due to a shortage of free global buffers. On a clustered configuration of Caché, this method pauses user processes for the duration of the freeze.

In addition to pausing writes as described above, the freeze method also handles switching journal files and writing a backup marker to the journal. The class methods that perform the database freeze and thaw operations are Backup.General.ExternalFreeze() and Backup.General.ExternalThaw(), respectively. See the Backup.General class documentation in the Caché Class Reference for details on the use of these methods.

Caché maintains a bitmap list of database blocks modified since the last backup; you can use this list to keep track of database updates during backup stages. The DBSIZE utility, which inspects these bitmaps, helps you calculate the size of a backup.
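The freeze, snapshot, thaw sequence might be orchestrated from a script along the following lines. This is a hedged Python sketch: the Backup.General.ExternalFreeze()/ExternalThaw() method names come from the text above, but the csession invocation, instance name, and snapshot command are placeholders you would replace with your site's actual tooling. The sketch builds and prints the command sequence rather than executing anything.

```python
# Dry-run sketch of an external backup using the freeze/thaw class methods
# described above. Nothing is executed; the script only assembles the
# command sequence so the ordering is explicit. "csession CACHE ..." and
# "snap_volume.sh" are placeholders for your instance and snapshot tooling.

def external_backup_plan(instance="CACHE", volume="/cachedb"):
    freeze = f'csession {instance} -U%SYS "##Class(Backup.General).ExternalFreeze()"'
    snapshot = f"snap_volume.sh {volume}"          # site-specific snapshot command
    thaw = f'csession {instance} -U%SYS "##Class(Backup.General).ExternalThaw()"'
    # Thaw must follow the snapshot: writes stay paused only while the
    # snapshot is being created, not while it is copied to backup media.
    return [freeze, snapshot, thaw]

for step in external_backup_plan():
    print(step)
```

Keeping the snapshot step between freeze and thaw, and copying the snapshot to backup media only after the thaw, is what bounds the pause to the snapshot-creation time discussed above.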
Caché BACKUP uses a multipass scan of these lists to back up only the blocks modified since the last pass. Subsequent passes usually operate on a reduced list of modified blocks; generally, three passes are sufficient to complete a backup. During the final pass, Caché suspends all processes to prevent updates and allow the backup to completely copy the last short list of modified blocks. This method is used in most of the backup strategies that follow.

The following sections discuss the types of external backups and the advantages and disadvantages of each:

• Concurrent External Backup
• Non-concurrent External Backup
• Cold Backup
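The multipass scan can be modeled in a few lines. This is a toy Python simulation of the algorithm, not the actual BACKUP utility: each pass copies the blocks modified since the previous pass, and on the final pass updates are suspended so the remaining short list drains to zero.

```python
import random

# Toy simulation of the multipass backup scan described above. `db` maps
# block number -> contents, and `modified` is the set of blocks changed
# since the last backup (the bitmap list).

def multipass_backup(db, modified, passes=3, writer=None):
    backup = {}
    for p in range(passes):
        final = (p == passes - 1)
        to_copy, modified = modified, set()
        for block in to_copy:
            backup[block] = db[block]          # copy blocks changed since last pass
            if not final and writer:
                modified |= writer(db)         # concurrent updates during the pass
        # On the final pass, updates are suspended (`writer` is not called),
        # so the remaining list drains and the backup ends consistent.
    return backup

random.seed(1)
db = {b: f"v{b}" for b in range(100)}

def writer(d):
    b = random.randrange(100)                  # a user process touches one block
    d[b] += "'"
    return {b}

print(multipass_backup(db, set(db), writer=writer) == db)   # True
```

Each pass re-copies only what changed during the previous one, which is why the suspension in the final pass is brief: by then the modified list is short.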


2.3.1.1 Concurrent External Backup

A concurrent external backup, or “dirty backup,” is the most common strategy used by large-scale production facilities that have large databases, have limited time to complete a backup, and require uninterrupted processing 24 hours a day. The utility used to perform the backup depends on site preference and the operating system. You may choose a native operating system utility, such as the tar utility on UNIX, or a third-party utility such as Veritas or ARCserve.

• Advantages — Production is never paused (except possibly briefly during the incremental backup).
• Disadvantages — Multiple files need to be restored (the cache.dat database files and the incremental backup files), which makes the restore process take longer.

Procedure Outline:

Concurrent external backups are performed by doing the following:

1. Clear the list of data blocks modified since the last backup.
2. Copy the cache.dat database files.
3. Perform a Caché incremental backup, which copies any blocks that changed while the cache.dat files were being copied; this results in a very brief suspension of user processes in some configurations.

2.3.1.2 Non-concurrent External Backup

A non-concurrent external backup, or “paused backup,” is the second most common strategy used by large-scale production facilities that have large databases and limited time to complete a backup, but can tolerate a brief operational pause. This strategy is often used in conjunction with advanced disk technologies, such as disk mirroring. The approach is to safely pause Caché long enough to separate a mirror copy of the data and then quickly allow Caché to continue processing.
The mirror is backed up and then later rejoined to the production disk(s).

• Advantages — An incremental pass is not necessary in the restore process.
• Disadvantages — Unless mirroring or a similar technology is used, you must pause the system for a considerable amount of time.

Procedure Outline:

Caché non-concurrent external backups are performed through the following steps:


1. Quiesce (pause) the database; this prevents any writes from occurring or locks from being taken out.
2. Separate the disk mirror from production (if advanced disk technologies are used), or make a copy of the cache.dat files.
3. Resume Caché.
4. If a mirror was split off, back up the mirror copy of the database, and rejoin the mirror to production.

2.3.1.3 Cold Backup

The cold backup strategy is generally used in the rare case when the operation tolerates downtime. Often, smaller installations that do not have strict 24/7 access requirements use this strategy. Sometimes it is used only when performing a complete system backup as part of a maintenance effort, such as repairing faulty hardware. In this situation, stop Caché during the backup period and restart it when the backup completes.

• Advantages — Very simple procedure (stop Caché and copy the cache.dat files).
• Disadvantages — Caché must be stopped; of all the backup options, this method involves the longest downtime.

Procedure Outline:

1. Stop Caché using the ccontrol command or through the Caché Cube.
2. Perform the backup.
3. Restart Caché using the ccontrol command or through the Caché Cube.

2.3.2 Online Backup

Caché implements a proprietary backup mechanism designed to cause very minimal or, in most cases, no downtime for users of the production system. The online backup captures only blocks that are in use by the database. The output goes to a sequential file (or directly to tape, though this is not recommended). The backup file is then copied to the backup media along with any other external files, such as the .cpf file, the CSP files, and external files used by the application.

The Caché backup uses a multipass scan to back up database blocks.
It is expected that each pass has a reduced list of modified blocks and that generally three passes are sufficient to complete a backup. During the entire final pass, and for a brief moment during each prior


pass, the system pauses writes to the database. If the backup list contains only new-format databases (8-KB block size), only physical writes to the database are paused, while user processes are allowed to continue performing updates in memory. If the backup list contains any old-format (2-KB block size) databases, or if it is a clustered Caché environment, then all user activity is paused for these multiple brief periods.

The concurrent Caché online backup strategy is used when the backup must have the least impact on Caché processes. This strategy is used across all sizes of production facilities. In the case where 8-KB databases are used in a nonclustered environment, it is possible to back up the database without pausing user processes. The backup procedure incorporates multiple passes to copy the data, where each consecutive pass copies any data blocks that changed during the previous pass. During the last pass, writes to the disk are paused while writes to the buffers are still allowed; thus users are not impacted (provided there are sufficient global buffers). In a clustered environment (or when some 2-KB databases are backed up), user processes are paused briefly during the final pass of the backup.

There are three different types of concurrent online backups, which can be combined to manage a trade-off between the size of the backup output and the time needed to recover from the backup:

Full Backup

Writes an image of all in-use blocks to the backup media.

• Advantages — Provides the basis of your database restoration; a requirement for cumulative and incremental backups.
• Disadvantages — Time-consuming operation.

Incremental Backup

Writes all blocks that have been modified since the last backup of any type.
Must be used in conjunction with a previous full backup and (optionally) subsequent cumulative or incremental backups.

• Advantages — Quickest backup; creates the smallest backup files.
• Disadvantages — You may end up having to restore multiple incremental backups, slowing down the restore process.

Cumulative Backup

Writes all blocks that have been modified since the last full backup. Must be used in conjunction with a previous full backup.


• Advantages — Quicker than a full backup; quicker to restore than multiple incremental backups.
• Disadvantages — More time-consuming than incremental backups.

Caché online backup writes all database blocks to a single file (or set of tapes) in an interleaved fashion. When an extremely large amount of data is backed up using online backup, restores can become cumbersome; consider this when planning your backup strategy. The restore validation process discussed above helps resolve limitations in this area by providing an online, restored copy of the databases.

When using incremental or cumulative backup, use the same backup validation method explained earlier in this document: after each incremental or cumulative backup is performed, it can be immediately restored to the alternate server. For example, a strategy of weekly full backups and daily incremental backups can work well because each daily backup contains only the blocks modified that day. Each day, restore that incremental to the alternate server and check integrity.

As discussed previously, avoid overwriting the warm copy of the last known good backup when restoring the backup currently being validated. The same concept applies when restoring an incremental to the existing restored database. After the backup is established as the last known good backup, and before applying the next day's incremental or cumulative backup to it, save a copy so that the last known good backup is always online and ready for use in case the subsequent incremental restore fails. If a restored backup fails an integrity check, it must be discarded and cannot be used as a target of a subsequent incremental restore.

When restoring a system from a Caché backup, first restore the most recent full backup, followed by the most recent cumulative backup, and then all incremental backups taken since the cumulative backup.

2.4 Configuring Caché Backup Settings

You can configure the Caché database backup settings from the [Home] > [Configuration] > [Database Backup Settings] and the [Home] > [Configuration] > [Task Manager Settings] > [Task Schedule] pages of the System Management Portal.

From the System Management Portal you can perform the following configuration tasks:

• Define Database Backup List


• Configure Backup Tasks
• Schedule Backup Tasks

2.4.1 Define Database Backup List

Caché maintains a database list that specifies the databases to be backed up. You can display this list by opening the [Home] > [Configuration] > [Database Backup Settings] > [Backup Database List] page of the System Management Portal.

Use the arrow buttons to move the databases you do not want to back up to the Available list and the databases you do want to back up to the Selected list. Click Save.

This database list is ignored by the FullAllDatabases backup task, which performs a backup of all Caché databases.

When you add a new database to your system, Caché automatically adds it to the database list. If you do not need to include the new database in your backup plan, be sure to remove it from the Backup Database List.

You can also maintain the backup database list using the Backup.General.AddDatabaseToList() and Backup.General.RemoveDatabaseFromList() methods.

In addition to pausing writes as described above, the freeze method also handles switching journal files and writing a backup marker to the journal. The class methods that perform the database freeze and thaw operations are Backup.General.ExternalFreeze() and Backup.General.ExternalThaw(), respectively. See the Backup.General class description in the Caché Class Reference for details on using these methods.

2.4.2 Configure Backup Tasks

Caché provides four different types of backup tasks; each is listed as an item on the Database Backup Settings menu. The four backup tasks are:

• Configure Full Backup of All Databases
• Configure Full Backup of the Database List
• Configure Incremental Backup of the Database List
• Configure Cumulative Backup of the Database List
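When an external (non-Caché) backup tool performs the copy, the Backup.General.ExternalFreeze() and Backup.General.ExternalThaw() methods mentioned above are typically invoked from a script that brackets the copy step. The sketch below is a minimal illustration only, not InterSystems' documented procedure: the cache executable path, manager directory, and command invocation syntax are all assumptions to be adapted to a real instance.

```python
import subprocess

# Assumed paths for a hypothetical UNIX instance -- adjust as needed.
CACHE_BIN = "/usr/cachesys/bin/cache"
MGR_DIR = "/usr/cachesys/mgr"

def class_method_command(method):
    """Build an (assumed) terminal command line that runs a
    Backup.General class method in the %SYS namespace."""
    return [CACHE_BIN, "-s", MGR_DIR, "-U", "%SYS",
            "##Class(Backup.General).%s()" % method]

def external_backup(copy_files):
    """Freeze database writes, run the external copy, and always
    thaw, even if the copy step fails."""
    subprocess.check_call(class_method_command("ExternalFreeze"))
    try:
        copy_files()  # e.g. trigger a filesystem snapshot or file copy
    finally:
        subprocess.check_call(class_method_command("ExternalThaw"))
```

The essential point, per the text above, is that the freeze also switches journal files and writes a backup marker, so the thaw must run even on failure (hence the finally block).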


These are predefined backup tasks that an operator can run on demand from the [Home] > [Backup] page of the portal. You can also schedule combinations of these backup tasks using the Task Manager. See the Schedule Backup Tasks section later in this chapter for details.

The process for configuring each of these tasks is the same. The Name, Description, and Type fields are read-only and reflect the menu choice, as described in the following list.

Backup Task Descriptions

• FullAllDatabases — Full backup of all databases, whether or not they are in the Backup Database List. Type: Full.
• FullDBList — Full backup of the Caché databases listed in the Backup Database List. Type: Full.
• IncrementalDBList — Incremental backup of changes made to the data since the last backup, whether full or cumulative. Backup is performed on the databases currently listed in the Backup Database List. Type: Incremental.
• CumuIncrDBList — Cumulative and incremental backup of all changes made to the data since the last full backup. Backup is performed on the databases currently listed in the Backup Database List. Type: Cumulative.

You can send backup output to a directory on disk or to magnetic tape. Select one of the two options:

1. To back up to a directory on disk, specify the file pathname in the Device field. Click Browse to select a directory.
2. To back up to magnetic tape, select the Save to Tape check box, and specify a Tape Number from the list of available tape device numbers.

See the Identifying Devices section of the Caché I/O Device Guide for detailed information regarding tape numbers.

The Define Database Backup List section describes how to maintain the Backup Database List.

2.4.2.1 Backup File Names

By default, backup files are stored in CacheSys\Mgr\Backup. The backup log files are stored in the same directory. Backup files have the suffix .cbk. Backup log files have the suffix .log.


Backup files and backup log files use the same naming conventions:

• The name of the backup task, followed by an underscore character (_)
• The date of the backup, in yyyymmdd format, followed by an underscore character (_)
• An incremental number, nnn, for that task on that day
• The .log or .cbk suffix

The number nnn is a sequence number incremented for each run of that backup task on that date. Caché creates a log file for every backup attempt, whether successful, failed, or aborted. Caché creates a backup file only upon successful backup, but its increment number matches the corresponding log file increment number.

For example, suppose you perform three FullDBList backup operations on June 4, 2006: the first successful, the second aborted, the third successful. This generates three .log files, numbered 001, 002, and 003, but only two .cbk files, numbered 001 and 003.

The backup files:

FullDBList_20060604_001.cbk
FullDBList_20060604_003.cbk

The matching log files:

FullDBList_20060604_001.log
FullDBList_20060604_002.log
FullDBList_20060604_003.log

2.4.3 Schedule Backup Tasks

Ideally, you should set up a schedule for running backups. Backups are best run at a time when there are the fewest active users on the system.

In addition to the four backup tasks supplied with Caché, you can create additional definitions of these four backup tasks. For example, you could create two full backup tasks, one to save the backup to a disk file and the other to save the backup to a tape. Or, to alternate backups between two disk drives, you could create a backup task for each drive.

Use the Caché Task Manager to schedule these backup tasks:

1. Navigate to the [Home] > [Configuration] > [Task Manager Settings] > [Task Schedule] page of the System Management Portal.
2. Click Schedule New Task.
3. Specify the Name, Description, Backup Type, and output location.
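The file-naming rules from the Backup File Names section above can be sketched as a small helper. The function and names are illustrative only, not part of Caché; the example reproduces the June 4, 2006 scenario from that section.

```python
def backup_file_names(task, date, outcomes):
    """Return (log_files, backup_files) for a day's backup attempts.

    Every attempt produces a numbered .log file; only successful
    attempts also produce a .cbk file with the same sequence number.
    """
    logs, backups = [], []
    for n, succeeded in enumerate(outcomes, start=1):
        stem = "%s_%s_%03d" % (task, date, n)
        logs.append(stem + ".log")
        if succeeded:
            backups.append(stem + ".cbk")
    return logs, backups

# Three FullDBList runs on June 4, 2006: success, abort, success.
logs, backups = backup_file_names("FullDBList", "20060604",
                                  [True, False, True])
print(backups)  # ['FullDBList_20060604_001.cbk', 'FullDBList_20060604_003.cbk']
```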


You can delete any task you add by clicking Delete on its row on the Task Schedule page.

2.5 Managing Caché Online Backups

You can run Caché database backup tasks and view backup history from the [Home] > [Backup] page of the System Management Portal. If you schedule additional backup tasks using the Task Manager, you can manage those from the [Home] > [Task Manager] page of the System Management Portal.

From the System Management Portal you can perform the following backup tasks:

• Run Backup Tasks
• View Backup Status
• View Backup History

When you add a new database to your system, you must perform a full backup; you cannot perform an incremental backup, or restore a database, until a full backup exists. After installing Caché, it is recommended that you perform a FullAllDatabases backup to establish a complete backup for subsequent use by the other backup tasks.

2.5.1 Run Backup Tasks

There are four types of backup tasks you can run from the System Management Portal, each having its own menu item:

• Run Full Backup of All Databases
• Run Full Backup of the Backup Database List
• Run Incremental Backup of the Backup Database List
• Run Cumulative Backup of the Backup Database List

You must have performed a full backup on a database before performing an incremental or cumulative backup on that database.

Read the Run Backup Task box to verify that the settings are correct. If the backup options are correct, click OK to start the backup.


While running a backup from the [Home] > [Backup] > [Run Backup] page, you can view the status of the running backup by clicking the text next to Backup started. See Monitor Backup Status for details.

Performing Multivolume Backups

A backup, particularly a full backup, may require multiple tape volumes or multiple disk files. Currently, there is no way to perform a multivolume backup using the System Management Portal; if you require a multivolume backup, use the ^BACKUP utility. If a disk-full condition occurs, Caché prompts you for the name of another disk file on another disk.

In the event of an error during backup, you cannot restart the backup on a second or subsequent volume. You must restart the backup from the beginning.

2.5.2 View Backup Status

Click View on the running backup process to monitor the progress of the backup operation. The same information is recorded in the log file for that backup operation, which you can later view from the View Backup History page.

When Caché begins a backup, it updates the Time and Status columns of the listing. The Time column records the date and time that the backup was initiated, not when it completed. The Status column is updated to Running.

Upon completion of a backup, Caché again updates the Status column to indicate the final status of the backup. Completed indicates the backup completed successfully. Failed indicates the backup could not be performed or was aborted by the operator. One cause of backup failure is trying to perform a backup on a dismounted database.

2.5.3 View Backup History

Every backup operation creates a separate backup log file. The logs follow the naming convention described in Backup File Names.

From the portal you can view a list of system backup logs from completed backup tasks:

1. Navigate to the [Home] > [Backup] page of the System Management Portal.
2. Click View Backup History in the right-hand column to display the [Home] > [Backup] > [View Backup History] page.
3. To view the contents of a particular file, click View in the right-hand column of the appropriate row. You can view the contents of a backup log file and search for a string within that file.


2.5.4 Error Handling for Backups

In the event of an error during backup, the backup utility allows you to retry the device on which the error occurred. Alternatively, the backup can be aborted.

On stand-alone (nonclustered) systems, if you abort a backup of any type, the next backup must be a full backup. This full backup on a stand-alone system does not block access to the database.

If a backup encounters any I/O errors, the backup aborts and logs a system error in the cconsole.log file, viewable with the Backup utility, the SYSLOG character-based utility, or any text file viewer. The log file allows you to see quickly where the problem occurred so that you can fix it.

2.5.5 Backing Up Selected Globals and Routines

Sometimes you may want to back up only some of the globals or routines in a database. In these cases, use the Caché global and routine backup utilities. You benefit from using them in these cases:

• Restoring selectively — If you have backed up your globals using the Export option from the [Home] > [Globals] page of the System Management Portal, you can use the Import option to restore only the globals you require.

• Restoring a database after extensive repairs — When your Caché database suffers degradation, it does not use space as efficiently as it could; some unused blocks are not marked as available, and pointers become overly indirect. If you used the Export option to back up your globals before you had the problem, you can recreate the database and then load in the globals using the Import option.

• You can use the Export and Import options from the [Home] > [Routines] page of the System Management Portal to back up and restore individual routines.

Use the Caché Export utility when you want to back up source code (.MAC, .INC, and/or .INT extensions), or both source and object code (.OBJ extension). Use the Caché Import utility to restore these files. Since you can select from several formats for the routine Export file, you can use the Caché Export utility to export routines to other InterSystems database systems, including DSM, DTM, ISM, and MSM.

Routines and globals are backed up into standard-format files, referred to as RSA (routine save) and GSA (global save) files.


2.6 Restoring from a Backup

If any problem arises that renders your data inaccessible or unusable, you can recreate that data by restoring the affected database(s) from backup and applying the changes recorded in the journal.

CAUTION: If you backed up with a UNIX or OpenVMS backup utility, use the same utility to restore.

To perform a restore, use the following strategy:

1. Identify which Caché databases require restoration.
2. Restore the last full backup of those Caché databases.
3. If you have done cumulative incremental backups since the full backup, restore the last one.
4. Restore all incremental backups done since the last full backup in the order in which the backups were performed, or restore the last cumulative incremental backup, whichever was more recent.
5. Apply the changes in the journal file for the directories restored, or for selected directories and globals you specify.
6. Perform a full backup of the restored system.

2.6.1 Using the Backup History to Recreate the Database

The Backup utility maintains a backup history. The Restore utility prompts you for the backup(s) to restore according to their order in the backup history.

Note: On Caché platforms that support access to the same database from multiple computers, you should always back up a given directory from the same computer, so that its complete backup history is available if you need to restore the directory.

When you select one of the three restore options on the BACKUP main menu, the utility asks you to enter the name of the device holding the first backup to be restored. The first time you enter a restore option, the default is the device the last full backup was sent to, if there was one.
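The ordering rules described above (the last full backup first, then the most recent cumulative incremental taken after it, then every incremental taken after that, oldest first) can be expressed as a small selection routine. This is an illustrative sketch of the logic only, not a Caché utility.

```python
def restore_sequence(history):
    """Given a backup history of (timestamp, kind) tuples, with kind
    in {"full", "cumulative", "incremental"}, return the backups to
    restore in order: last full, then the most recent cumulative
    taken after it (if any), then all later incrementals."""
    history = sorted(history)
    # Index of the last full backup; restore always starts here.
    last_full = max(i for i, (_, k) in enumerate(history) if k == "full")
    sequence = [history[last_full]]
    cumulatives = [i for i, (_, k) in enumerate(history)
                   if k == "cumulative" and i > last_full]
    start = last_full
    if cumulatives:
        start = cumulatives[-1]          # most recent cumulative
        sequence.append(history[start])
    # Every incremental taken after the chosen starting point.
    sequence += [b for i, b in enumerate(history)
                 if i > start and b[1] == "incremental"]
    return sequence

history = [(1, "full"), (2, "incremental"), (3, "cumulative"),
           (4, "incremental"), (5, "incremental")]
print(restore_sequence(history))
```

With the sample history, the incremental at time 2 is skipped because the cumulative at time 3 already contains its changes.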


Caché helps you restore backups in logical order. After restoring the last full backup, the utility uses the backups in the backup history to suggest the next logical backup for you to restore. It cycles through all of the backups in this way. Having already prompted you with the last full backup, it prompts you to restore subsequent backups in the following order:

1. It prompts you for the most recent cumulative incremental backup after the last full backup, if one exists.
2. After restoring the most recent cumulative incremental backup, if there was one, it prompts you to restore all incremental backups since the last cumulative incremental backup (or, if none exists, since the last full backup). It does so in order from the first to the most recent.

You can override the suggested backups in the restore process. Remember, however, that an incremental or cumulative incremental backup does not represent a complete copy of your disk. You can restore an incremental backup only after restoring a full backup.

2.6.2 Suspending Database Access During a Restore

In most cases, the database you are restoring is not fully independent of the other databases on the system. For this reason, it is recommended that all user activity be suspended during the restore. Even if you are the only user on your system, you should still restrict login access if any users can log in remotely to your system.

You can, however, restore a database with users active on other databases. All databases being restored are dismounted during the restore. Therefore, if you do not suspend database access, users who try to access the databases being restored receive errors.

2.6.3 Restoring Database Properties

If the characteristics of a directory have changed by the time you do a restore, the restore utility handles these situations. It creates Caché databases as necessary and modifies their characteristics as appropriate, to return them to the state they were in at the time the backup was completed.


2.6.4 Performing a Restore

You must choose whether to restore all directories or selected directories: see To Restore All Directories to restore all directories, or Restoring Selected or Renamed Directories to restore selected or renamed directories. You can also choose whether to suspend processes and whether to restore the journal.

The following list summarizes each action that the Caché BACKUP utility RESTORE function performs, and shows the Restore option with which it is associated. If both Restore options perform an action, it is marked Both.

Restore Actions

• Verifies that you want to restore ALL directories. — Restore All
• Asks if you want to stop all other Caché processes from running during the restore. Normally you would say Yes. — Both
• Asks for the name of the file that holds the backup you wish to restore. If your backup is on more than one volume, that information is in the volume header. After restoring the first volume, the utility prompts you for the next. — Both
• Displays the header of the volume. The header contains the following information determined during the backup: the date of the backup on this volume, the date of the previous backup, and the date of the last FULL backup. — Both
• Asks you to verify that this is the backup you wish to restore. — Both
• Lets you select directories to restore, and optionally rename them. — Restore Selected
• Lists the directories on the volume that will be restored. — Both
• Allows you to specify another input device that contains the next backup to restore. — Both
• Lets you suspend all other Caché processes during the restore. — Both
• Allows you to pick one of three ways to apply the global changes logged in the journal file. You can also choose not to apply these changes. — Both

2.6.4.1 To Restore All Directories

The following procedure restores all directories:


1.

USER>DO ^%CD
Namespace: %SYS
You're in namespace %SYS
Default directory is c:\cachesys\mgr\
%SYS>DO ^BACKUP

2. Select Restore ALL from the Backup utility options. This option restores all directories that are on the backup medium.

%SYS>Do ^BACKUP

1) Backup
2) Restore ALL
3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

3. Confirm that you want to restore all directories:

Proceed with restoring ALL directories Yes=>

• If you press Enter, the restore process proceeds for all databases.
• If you answer No, you can choose which type of restore to perform.

Restore: 1. All directories
         2. Selected and/or renamed directories
         3. Exit the restore program
1 =>

4. Indicate whether you want to suspend Caché processes while restoring takes place. InterSystems recommends suspending processes.

Do you want to set switch 10 so that other processes will be
prevented from running during the restore? Yes =>

5. Specify the first file from which to restore. You can press Enter to accept the default file, which is the last full backup.

Specify input file for volume 1 of backup 1
(Type STOP to exit)
Device: c:\cachesys\mgr\backup\FullAllDatabases_20060323_001.cbk =>

6. Check that the description of the backup is correct and verify that this is the file you want to restore.


This backup volume was created by:
Cache for Windows (Intel) 5.1
The volume label contains:
Volume number     1
Volume backup     MAR 23 2006 09:52AM Full
Previous backup   MAR 22 2006 11:00AM Incremental
Last FULL backup  MAR 16 2006 11:00AM
Description
Full backup of ALL databases, whether or not they are in
the backup database list.
Buffer Count 0
Is this the backup you want to start restoring? Yes =>

7. The utility tells you which directories it will restore, and the restore proceeds.

The following directories will be restored:
c:\cachesys\mgr\
c:\cachesys\mgr\cacheaudit\
c:\cachesys\mgr\samples\
c:\cachesys\mgr\test\
c:\cachesys\mgr\user\

***Restoring c:\cachesys\mgr\ at 10:46:01
146045 blocks restored in 241.3 seconds for this pass, 146045 total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:01
53 blocks restored in 0.0 seconds for this pass, 53 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:01
914 blocks restored in 0.6 seconds for this pass, 914 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
53 blocks restored in 0.0 seconds for this pass, 53 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:02
124 blocks restored in 0.1 seconds for this pass, 124 total restored.
***Restoring c:\cachesys\mgr\ at 10:50:02
5 blocks restored in 0.0 seconds for this pass, 146050 total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 54 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 915 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 54 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 125 total restored.
***Restoring c:\cachesys\mgr\ at 10:50:02
3 blocks restored in 0.0 seconds for this pass, 146053 total restored.
***Restoring c:\cachesys\mgr\cacheaudit\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 55 total restored.
***Restoring c:\cachesys\mgr\samples\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 916 total restored.
***Restoring c:\cachesys\mgr\test\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 55 total restored.
***Restoring c:\cachesys\mgr\user\ at 10:50:02
1 blocks restored in 0.0 seconds for this pass, 126 total restored.


8. Specify the input file for the next incremental backup to restore, or enter STOP if there are no more input files to restore.

Specify input file for volume 1 of backup following MAR 23 2006 09:52AM
(Type STOP to exit)
Device: stop

9. Indicate whether you want to restore other backups. When you answer Yes, the procedure repeats from step 3. When you respond No, Caché mounts the databases you have restored.

Do you have any more backups to restore? Yes => No
Mounting c:\cachesys\mgr\
c:\cachesys\mgr\ ... (Mounted)
Mounting c:\cachesys\mgr\cacheaudit\
c:\cachesys\mgr\cacheaudit\ ... (Mounted)
Mounting c:\cachesys\mgr\samples\
c:\cachesys\mgr\samples\ ... (Mounted)
Mounting c:\cachesys\mgr\test\
c:\cachesys\mgr\test\ ... (Mounted)
Mounting c:\cachesys\mgr\user\
c:\cachesys\mgr\user\ ... (Mounted)

10. Specify which journal entries you want to apply to the restored databases, and the name of the journal file you are restoring. Normally, you select Option 1 and apply only those changes that affect the directories you have just restored. While the journal is being restored, replication is disabled until all journal files have been restored.

Restoring a directory restores the globals in it only up to the
date of the backup. If you have been journaling, you can apply
journal entries to restore any changes that have been made in the
globals since the backup was made.
What journal entries do you wish to apply?
1. All entries for the directories that you restored
2. All entries for all directories
3. Selected directories and globals
4. No entries
Apply: 1 =>

11. Restore from the journal files begins.


We know something about where journaling was at the time of the backup:
0: offset 172940 in c:\cachesys\mgr\journal\20060323.002
Use current journal filter (ZJRNFILT)? No
Use journal marker filter (MARKER^ZJRNFILT)? No
Updates will not be replicated
The earliest journal entry since the backup was made is at
offset 172940 in c:\cachesys\mgr\journal\20060323.002
Do you want to start from that location? Yes => Yes
Final file to process (name in YYYYMMDD.NNN format): [?] =>
Prompt for name of the next file to process? No => No
Provide or confirm the following configuration settings:
Journal File Prefix: =>
Files to dejournal will be looked for in:
c:\cachesys\mgr\journal\
c:\journal\altdir\
in addition to any directories you are going to specify below, UNLESS
you enter a minus sign ('-' without quotes) at the prompt below,
in which case ONLY directories given subsequently will be searched
Directory to search:
Here is a list of directories in the order they will be searched for files:
c:\cachesys\mgr\journal\
c:\journal\altdir\
The journal restore includes the current journal file.
You cannot do that unless you stop journaling or switch
journaling to another file.
Do you want to switch journaling? Yes => Yes
Journaling switched to c:\cachesys\mgr\journal\20060323.004
You may disable journaling the updates for faster restore; on the other hand,
you may not want to do so if a database to restore is being shadowed.
Do you want to disable journaling the updates? Yes => yes
Updates will NOT be journaled
c:\cachesys\mgr\journal\20060323.002
61.32% 65.03% 68.44% 72.21% 75.86% 79.26% 82.73% 86.08% 89.56%
92.99% 96.07% 98.87% 100.00%
***Journal file finished at 11:03:31
c:\cachesys\mgr\journal\20060323.003
16.17% 17.10% 17.90% 18.90% 20.05% 21.33% 22.58% 23.81% 25.15%
26.32% 27.65% 28.85% 30.08% 31.37% 32.59% 33.98% 35.16% 36.25%
37.32% 38.41% 39.55% 40.72% 41.81% 42.83% 43.85% 44.89% 46.00%
47.15% 48.24% 49.28% 50.32% 51.41% 52.54% 53.71% 54.76% 55.80%
56.85% 57.97% 59.10% 60.16% 61.17% 62.19% 63.24% 64.32% 65.18%
66.02% 66.87% 67.71% 68.52% 69.34% 70.14% 70.96% 71.76% 72.60%
73.58% 74.51% 75.43% 76.35% 77.26% 78.17% 79.07% 79.69% 80.31%
80.93% 81.56% 82.20% 82.83% 83.47% 84.27% 87.00% 88.57% 91.65%
93.03% 96.09% 97.44% 99.04% 100.00%
***Journal file finished at 11:03:32
Journal reads completed. Applying changes to databases...
14.29% 28.57% 42.86% 57.14% 71.43% 85.71% 100.00%
[journal operation completed]
Replication Enabled

1) Backup
2) Restore ALL


3) Restore Selected or Renamed Directories
4) Edit/Display List of Directories for Backups
Option?

2.6.4.2 Restoring Selected or Renamed Directories

The Restore Selected or Renamed Directories option lets you select which directories to restore from the backup medium. It also allows you to restore a database to a different directory name.

The following example shows how to restore selected or renamed directories. It uses UNIX-style directory names.

1.

%SYS>DO ^BACKUP

2. Select Restore Selected or Renamed Directories from the Backup menu.

3. Indicate whether you want to suspend Caché processes while restoring takes place. InterSystems recommends suspending processes.

Do you want to set switch 10 so that other Cache processes
will be prevented from running during the restore? Yes =>

4. Specify the first file from which to restore. You can press Enter to accept the default file, which is the last full backup.

Specify input file for volume 1 of backup 1
(Type STOP to exit)
Device: c:\cachesys\mgr\backup\IncrementalDBList_20060323_001.cbk =>

5. Check that the description of the backup is correct and verify that this is the file you want to restore.

This backup volume was created by:
Cache for Windows (Intel) 5.1
The volume label contains:
Volume number     1
Volume backup     MAR 23 2006 11:03AM Full
Previous backup   MAR 23 2006 09:52AM Full
Last FULL backup  MAR 23 2006 09:52AM
Description
Incremental backup of all databases that are in the backup
database list.
Buffer Count 0
Is this the backup you want to start restoring? Yes =>

6. As the utility prompts you with directory names, specify which databases you want to restore, and in which directories you want to restore them:


For each database included in the backup file, you can:
-- press RETURN to restore it to its original directory;
-- type X, then press RETURN to skip it and not restore it at all;
-- type a different directory name. It will be restored to the directory
   you specify. (If you specify a directory that already contains a
   database, the data it contains will be lost.)
c:\cachesys\mgr\ =>
c:\cachesys\mgr\cacheaudit\ =>
c:\cachesys\mgr\test\ =>
c:\cachesys\mgr\user\ =>
Do you want to change this list of directories? No =>

7. After responding to each directory prompt, you see the prompt: "Do you want to change this list of directories? No =>". Answer Yes if you want to edit your choices, or press Enter to confirm them.

8. Continue the restore from step 8 in the procedure for restoring all directories, as specified earlier in this chapter.

2.6.5 Error Handling for Restore

If an error occurs while you are restoring, you are given these options:

• Retry the device
• Skip that block or set of blocks and continue with the restore
• Abort the restore of that directory but otherwise continue with the restore
• Abort the restore

2.7 Caché Backup Utilities

• Estimating Size of Backups — ^DBSIZE

2.7.1 Estimating Size of Backups

Immediately before performing any backup, estimate its size using the ^DBSIZE utility, which estimates the disk space needed for the backup.

The ^DBSIZE utility provides an estimate of the size of the output created by a Caché backup. It is only an estimate, since there is no way of knowing how many blocks will be modified


once the backup has been started. You can obtain a more accurate estimate by preventing global updates while running ^DBSIZE, and then doing your backup before allowing global updates to resume.

You can estimate the size of backups in two ways:

• Run ^DBSIZE interactively
• Call ^DBSIZE from a routine

Note: A database must be in the list of selected databases to be backed up before you can evaluate it with DBSIZE.

2.7.1.1 Run DBSIZE Interactively

The following procedure describes the steps necessary to run ^DBSIZE interactively:

1. Do ^DBSIZE

2. Caché displays the DBSIZE main menu:

   Incremental Backup Size Estimator

   What kind of backup:
   1. Full backup of all in-use blocks
   2. Incremental since last backup
   3. Cumulative incremental since last full backup
   4. Exit the backup program
   1 =>

3. Select the type of backup for which you want an estimate: full, incremental, or cumulative incremental.

4. At the "Suspend Updates? Yes=>" prompt, either press Enter to suspend updates so that you get a more accurate estimate, or enter No to continue updates.

5. Examine the results that are displayed.

   First, DBSIZE shows you how many Caché blocks you need to do the type of backup you selected, for:

   • Each directory in the backup list
   • All directories in the backup list


   Suspend Updates? Yes=> n

   Directory                        In-Use Blocks
   c:\cachesys\mgr\                           983
   c:\cachesys\mgr\cachelib\                 5320
   c:\cachesys\mgr\docbook\                  6137
   c:\cachesys\mgr\samples\                   687
   c:\cachesys\mgr\user\                       45
                                         --------
   Total Number of Database Blocks:         13172

   For a disk file:
   Total size including overhead:
   52960 512-byte blocks = 27115520 bytes

   For Magnetic Media:
   Total Number of 16KB Blocks including overhead of backup
   volume and pass labels: 1655

Next, DBSIZE provides information about backup to a disk file (for Windows 95/98, Windows NT, and UNIX) or an RMS file (for OpenVMS). If the directories to be backed up include any long strings, you see separate lines for standard and long block sizes.

   For a disk file:
   Total size including overhead:
   52960 512-byte blocks = 27115520 bytes

   For an RMS file:
   Total Number of 512 Blocks including overhead of backup
   volume and pass labels: 24064
   Pre Allocation quantity is: 32016

Finally, DBSIZE provides information about the amount of space used if the backup is made to magnetic tape.

   For Magnetic Media:
   Total Number of 16KB Blocks including overhead of backup
   volume and pass labels: 1655

2.7.1.2 Use the DBSIZE Function

You can also call ^DBSIZE from a routine. To do so, use the following function:

   $$INT^DBSIZE(backup_type)

Important: The values of backup_type differ from the option values when running ^DBSIZE interactively.
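The disk-file size reported by DBSIZE is plain arithmetic: the block count (including overhead) multiplied by the 512-byte block size. A quick check of the figures shown above:

```shell
#!/bin/sh
# Check the disk-file size line from the DBSIZE output above:
# 52960 blocks of 512 bytes each.
blocks=52960
bytes=$((blocks * 512))
echo "$bytes"   # prints 27115520
```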


Values of backup_type

   backup_type   Description
   1             Incremental backup
   2             Full backup
   3             Cumulative incremental backup

For example, to run a full backup:

   %SYS>w $$INT^DBSIZE(2)
   13178^5

The returned value is two numbers separated by a caret (^). In this example, the returned value 13178 is the total estimated size of the backup, in blocks; the returned value 5 indicates the number of databases to be backed up.

   %SYS>w $$INT^DBSIZE(1)
   996^5
   %SYS>w $$INT^DBSIZE(3)
   996^5
   %SYS>
   %SYS>w $$INT^DBSIZE(1)
   95^3^950272^16^1013760^1980^1980
   %SYS>w $$INT^DBSIZE(2)
   2620^3^22390784^377^22302720^43560^43560
   %SYS>w $$INT^DBSIZE(3)
   95^3^950272^16^1013760^1980^1980
   %SYS>

2.7.2 Caché ^BACKUP Routine

The Caché ^BACKUP utility allows you to back up Caché databases or to restore an already created backup. If a list of databases has not been created, all databases are included in the backup. If a list is created, that list applies to all aspects of the backup system, including calls to LISTDIRS^DBACK and CLRINC^DBACK for scripted backups.

Note: When editing the database list, use the database name, not the directory name. This is consistent with the way the backup configuration works in the System Management Portal.
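Scripted callers usually need to split the caret-delimited result of $$INT^DBSIZE. A minimal shell sketch, assuming the result string has already been captured from a Caché session (the literal value below is copied from the example output above):

```shell
#!/bin/sh
# Split a caret-delimited $$INT^DBSIZE result into its fields.
# Assumption: the value was captured from a Caché session beforehand;
# the literal here is taken from the example in the text.
result="13178^5"
blocks=$(echo "$result" | cut -d'^' -f1)
dbcount=$(echo "$result" | cut -d'^' -f2)
echo "estimated blocks: $blocks, databases: $dbcount"
# prints: estimated blocks: 13178, databases: 5
```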


   %SYS>Do ^BACKUP

   1) Backup
   2) Restore ALL
   3) Restore Selected or Renamed Directories
   4) Edit/Display List of Directories for Backups

2.8 UNIX Backup and Restore

You should perform a UNIX-level backup of your system periodically. You can perform a UNIX-level backup either from the UNIX prompt or using the UNIX/Caché conterminous backup facility, cbackup, discussed below.

2.8.1 Using UNIX Backup Utilities

Use a UNIX backup utility in place of Caché Backup in the following situations:

• To restore sequential files or other software that is not part of Caché.
• To move Caché data between systems when you cannot transfer databases directly.

The following table lists UNIX utilities you may find useful for these types of backups.


UNIX Backup Utilities and Commands

Utility/Command              Function

cp and mv                    Copies or moves the given file or files to a different file,
(copy and move)              directory, or file system. Example: move the manager's
                             directory (and the binary files in /usr/bin) to another
                             directory, using the following command:
                             #mv /usr/cache/* /usr/cache.old/*

tar (tape archiver)          Standard UNIX command to copy files and directories,
                             extract files from tape, and list the files on a tape.
                             Produces more portable output than cpio or dump.

cpio (not available          In conjunction with the find command, performs backups
on all BSD systems)          similar to the tar command.

dump                         Performs complete system backups.

bru                          A third-party backup and restore utility. Before using it
                             on your system, verify that it works properly and meets
                             your requirements.

2.8.2 cbackup Utility

A UNIX utility called cbackup allows you to back up your Caché databases using a system-level backup while automatically updating your Caché internal incremental backup. Using cbackup allows you to use the backup tools available on your operating system to back up your Caché databases and synchronize the bitmap information with the Caché internal incremental backup facility. This updates your Caché backup history so that you can use cbackup for full backups and the Caché Backup utility for incremental backups.

On Caché for UNIX, you must initiate all restores after backup interactively through the BACKUP utility.

Note: The cbackup utility is called from the shell environment. The backup utility halts Caché while it performs the backup and then restarts it.

In order to set up conterminous journaling, you must first set up the call_os_backup file to choose your UNIX backup utility. The call_os_backup file is automatically installed in your system manager directory.
The example below uses tar as the backup utility.


   :
   # InterSystems Corporation
   #
   # File: call_os_backup
   #
   # This is a template for the user-specific backup procedure.
   # It should backup CACHE.DAT files from a directory list contained in the
   # cback_dir_list file, which is generated by the cbackup script that calls
   # this one.
   #
   # Do not forget to include extension files if any!
   #
   # Variable backstatus should be set to 1 for success
   #                                      0 for failure
   #
   # For example:
   #
   # sed -e "s/\\/$//" cback_dir_list > cback_tar_list
   # if tar cvfF /dev/rmt0 cback_tar_list
   # then backstatus=1
   # else backstatus=0
   # fi

To use tar as your UNIX backup utility, remove the comment marks from the last five lines above. If you want to use a different utility, use the form presented above to set up the backup utility.

   echo "This message is from the call_os_backup script, which should contain a"
   echo "user-defined backup procedure to backup CACHE.DAT files according to a"
   echo "list of directories stored in cback_dir_list."
   echo ""
   echo " !! Do not forget to include extension files if any !!"
   echo ""
   echo " Variable backstatus should be set to 1 for success"
   echo "                                      0 for failure"
   #backstatus=1
   #exit $backstatus

2.8.2.1 Performing Both Caché and System-Level Backups Using cbackup

1. Be sure Caché is running and that you are in the manager's directory.

2. Enter the Backup Menu to ensure that you have selected all the directories you want to back up.

3. Halt out of Caché to return to the UNIX shell.

4. Type cbackup at the UNIX shell prompt. You will be prompted for confirmation of the directories to be backed up. The cbackup script automatically shuts down Caché and activates the call_os_backup script, which performs the backup.
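For illustration, here is a sketch of the template's tar branch uncommented and made runnable against a scratch directory. Two substitutions are assumptions, flagged in comments: an archive file stands in for the tape device /dev/rmt0, and the list-from-file option is spelled -T as in GNU tar (the template's cvfF form is the older System V spelling).

```shell
#!/bin/sh
# Sketch of call_os_backup with the tar lines uncommented.
# Assumptions: an archive file replaces the tape device /dev/rmt0, and
# GNU tar's -T option replaces the System V "F" list-from-file flag.
work=$(mktemp -d)
mkdir -p "$work/db1" "$work/db2"
echo demo > "$work/db1/CACHE.DAT"
echo demo > "$work/db2/CACHE.DAT"
# cback_dir_list is normally generated by the cbackup script; fabricated here.
printf '%s/\n' "$work/db1" "$work/db2" > cback_dir_list

backup_dev="$work/backup.tar"                 # stand-in for /dev/rmt0
sed -e 's:/$::' cback_dir_list > cback_tar_list
if tar cf "$backup_dev" -T cback_tar_list
then backstatus=1
else backstatus=0
fi
echo "backstatus=$backstatus"
```

As in the template, backstatus carries the success flag back to the cbackup script that invoked this file.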


2.9 OpenVMS Backup and Restore

The full backup option of the Caché Backup utility is the recommended way to do full backups on OpenVMS, but you can also use the OpenVMS BACKUP utility.

2.9.1 Efficiency

Caché Full Backup backs up only in a directory that contains a Caché database. If your database is only partly full (60% or less), a Caché Full Backup will be faster than the OpenVMS Backup. If it is almost full, however, the OpenVMS Backup will be faster by about one-third. For more information, see Estimating Size of Backups.

2.9.2 Concurrent Operation

You can perform a Caché Full Backup while your database is active. To perform an OpenVMS BACKUP, you must shut down Caché. The OpenVMS backup copies entire CACHE.DAT and CACHE.EXT files. For more information, see Using the OpenVMS BACKUP Utility.

2.9.3 History Log

If you are doing an OpenVMS Backup, you must update the history log manually. Make an entry in the BACKUP Management Backup History table to record the date, time, and type of your backup.

If you are using the Caché Backup utility, Caché makes this history log entry automatically. Caché uses the Backup History table to prompt you for which databases to restore during backup restoration.

2.9.4 Using the OpenVMS BACKUP Utility

CACHE.DAT and CACHE.EXT files created with either the Caché Database utility or the character-based MSU utility are RMS files. Thus, they can be backed up and copied using OpenVMS utilities, such as BACKUP.

You can defragment your Caché databases by using the OpenVMS BACKUP utility monthly to back up and restore your CACHE.DAT and CACHE.EXT files.

You can use the OpenVMS BACKUP utility to perform your weekly full backup as part of your strategy to ensure the physical integrity of your database.
OpenVMS BACKUP provides greater redundancy and error checking than Caché backup. As a result, it is more proficient at recovering from tape errors.

The disadvantage of OpenVMS BACKUP is that, normally, Caché must be shut down to run the OpenVMS BACKUP. However, you can overcome this disadvantage by doing an OpenVMS BACKUP while Caché is running and following it with an incremental backup. See Using CBACKUP.COM.

2.9.4.1 Running OpenVMS BACKUP for a Caché Database

If you are using the OpenVMS BACKUP utility by itself, rather than in conjunction with Caché incremental backup as in the CBACKUP.COM example file, use this procedure:

1. Have users log off the system and set OpenVMS interactive logins to zero.

2. Stop Caché using the ccontrol stop command procedure.

3. Use OpenVMS BACKUP to back up the system.

4. If you are doing the backup to defragment your files, use the Restore option of the utility.

5. Start Caché using the ccontrol start command procedure and resume operation.

6. Enable OpenVMS logins.

7. Record this full backup in the Management Information option of the Caché BACKUP utility.

2.9.5 Using CBACKUP.COM

InterSystems supplies a command procedure, CBACKUP.COM, which provides a model of how to use entry points in Caché backup routines to perform a backup while Caché is running. This command procedure gives you examples of various combinations of OpenVMS and Caché backups.

CBACKUP.COM is loaded into the CACHESYS directory during Caché installation from the file CBACKUP_PROTO.COM on the distribution tape.

CBACKUP.COM checks that the process which is executing it meets one of the following three criteria:

• Has the system manager's UIC
• Is authorized to hold SYSPRV
• Is authorized to hold CMKRNL


This privilege is required because CBACKUP.COM uses the /IGNORE=INTERLOCK qualifier in the OpenVMS BACKUP command. If the process does not meet one of the criteria, an error message is printed and CBACKUP.COM terminates.

CBACKUP.COM carries out these actions:

1. Performs the OpenVMS backup.

2. Records the date, time, and a brief description of the OpenVMS full backup in the Caché Backup History. This information is used later when you request a restore.

3. Runs a Caché incremental backup.

InterSystems recommends that you examine this procedure in detail, modify it as necessary, and use it if you wish to use OpenVMS BACKUP or any entry points to Caché Backup.

2.9.6 Restore on OpenVMS

If you backed up with the OpenVMS backup utility, use the same utility to restore.

On Caché for OpenVMS you can initiate restores either interactively or from a batch program that you write to enable unattended restores.

Your batch program can implement unattended restores by calling two entry points into the utility. Both entry points are functions, which return the status of the call.

• EXTALL^DBREST restores all directories present on the backup device.
• EXTSELCT^DBREST restores selected files from the backup device.

2.10 Sample Backup Scripts

2.10.1 External UNIX Backup Script

Caché makes it easy to integrate with such utilities. The following is an example of a UNIX procedure:

1. Clear the list of database blocks modified since the last backup. This synchronization point later allows you to identify all database blocks modified during the backup. Call the application program interface (API), CLRINC^DBACK("QUIET"), in the backup script; this completes instantly.

2. Using your preferred backup utility, copy the CACHE.DAT files, which may be in use.

3. Perform an incremental backup of the blocks modified by users during the backup. This should be an output sequential file. Since it is likely to be a small set of blocks, this step should complete very quickly. Call the API, BACKUP^DBACK(), in the backup script.

   Important: The journal file should be switched at this time.

4. Copy the incremental file to the backup save-set, using your preferred UNIX command.

The following is an abbreviated example of the cbackup script:

   ../bin/cuxs -s . -U "" -B
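Put together, the four steps above might be skeletonized as follows. The Caché entry-point calls are stubbed with echo, since how they are issued (for example, by piping commands into a terminal session) varies by installation; the paths and archive names are hypothetical.

```shell
#!/bin/sh
# Skeleton of the four-step external UNIX backup procedure above.
# cache_api is a STUB; a real script would drive a Caché session instead.
# Paths and archive names are hypothetical.
cache_api() { echo "would call: $*"; }

cache_api 'CLRINC^DBACK("QUIET")'     # 1. clear the modified-block list
echo 'would run: tar cf /backup/full.tar /usr/cachesys/mgr'   # 2. copy CACHE.DAT files
cache_api 'BACKUP^DBACK()'            # 3. incremental of blocks modified meanwhile
cache_api 'INT^JRNSWTCH'              #    switch the journal, per the Important note
echo 'would run: tar rf /backup/full.tar incremental.out'     # 4. append incremental
```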


processing. You back up the mirror and then later rejoin it to the production disk(s) and catch up. Caché makes it easy to integrate in this configuration:

1. Switch the journal file using the System Management Portal, JRNSWTCH, or the API INT^JRNSWTCH.

2. Quiesce Caché processing in a safe way, to ensure the structural integrity of globals. Make sure that application processes pause and system processes are allowed to complete their work by setting a software switch that inhibits activity and waiting for all processes to quiesce. The API, ENQ13^DBACKA, completes both operations within a few seconds. Set the variables CLUBACKUP, ALRDY13, and NOFORCE to 0 before calling ENQ13.

3. Separate the disk mirror from production.

4. Resume Caché processing by calling the API DEQ13^DBACKA, which clears the software switch that inhibits processing.

5. Back up the mirror copy.

6. Rejoin the mirror disk(s) with production.
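The mirror-split sequence above can be sketched the same way. Again, everything here is a placeholder: cache_api stands in for a real Caché session, and the mirror split/rejoin commands depend entirely on your volume manager.

```shell
#!/bin/sh
# Skeleton of the disk-mirror backup sequence above. All commands are
# placeholders: cache_api stands in for a real Caché session, and the
# split/rejoin steps depend on your volume manager.
cache_api() { echo "would call: $*"; }

cache_api 'INT^JRNSWTCH'                           # 1. switch the journal file
# 2. quiesce: per the text, set CLUBACKUP, ALRDY13 and NOFORCE to 0 first
cache_api 'SET (CLUBACKUP,ALRDY13,NOFORCE)=0  DO ENQ13^DBACKA'
echo 'would run: <volume-manager split command>'   # 3. separate the mirror
cache_api 'DO DEQ13^DBACKA'                        # 4. resume processing
echo 'would run: <backup of mirror copy>'          # 5. back up the mirror
echo 'would run: <volume-manager rejoin command>'  # 6. rejoin the mirror
```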


3 Journaling

Global journaling preserves changes in the database since the last backup. While a backup is the cornerstone of physical recovery, it is not the complete answer. Restoring the database from a backup does not recover changes made since that backup; typically, the backup was taken a number of hours before the point at which physical integrity was lost. What happens to all the database changes that occurred since then? The answer lies with journaling.

This chapter discusses the following topics:

• Journaling Overview
• Configuring Journaling
• Journaling Operation Tasks
• Journaling Utilities
• Special Considerations for Journaling

3.1 Journaling Overview

Each instance of Caché keeps a journal. The journal is a set of files that keeps a time-sequenced log of changes that have been made to the databases since the last backup. The process is redundant and logical and does not use the Caché Write daemon. Caché transaction processing works with journaling to maintain the logical integrity of data.


When Caché starts, it reapplies all journal entries since the last Write daemon pass. Since user processes update the journal concurrently, rather than through the Write daemon, this approach provides added assurance that updates prior to a crash are preserved.

This release of Caché enhances the configuration and management of journaling to provide a safer and more consistent approach to supporting highly available systems. The journaling state is now a property of the database, not individual globals. A database can have only one of two global journaling states: Yes or No.

The journal contains global update operations (Set and Kill operations, for example) for globals in transactions regardless of the setting of the databases in which the affected globals reside, as well as all update operations for globals in databases whose Global Journal State is Yes. This greatly improves reliability of the system; it avoids inconsistencies (after crash recovery) due to updates to globals that may or may not be journaled, and that may or may not be involved in transactions.

Journaling global operations in databases mounted on a cluster depends on the database setting. The local Caché instance does not journal transaction operations to globals on remote nodes. In a network configuration, journaling is the responsibility of the node on which the global actually resides, not the one that requests the Set or Kill. Thus, if node B performs a Set at the request of node A, the journal entry appears in the journal on node B, not node A.

Backups and journaling are daily operations that allow you to recreate your database. If any problem arises that renders your database inaccessible or unusable, you can restore the backups and apply the changes in the journal to recreate your database. This method of recovering from a loss of physical integrity is known as "roll forward" recovery.
The journal is also used for rolling back incomplete transactions.

The default Global Journal State for a new database is Yes. New Caché instances have the journaling property set to Yes for the CACHEAUDIT, CACHESYS, and USER databases. The CACHELIB, CACHETEMP, DOCBOOK, and SAMPLES databases have the property set to No. Operations to globals in CACHETEMP are never journaled. Map temporary globals to the Caché temporary database, CACHETEMP.

The following topics provide greater detail of how journaling works:

• Differences Between Journaling and Write Image Journaling
• Protecting Database Integrity
• Automatic Journaling of Transactions
• Rolling Back Incomplete Transactions
• Using Temporary Globals and CACHETEMP


3.1.1 Differences Between Journaling and Write Image Journaling

In this chapter, "the journal" refers to the journal file; "journaling" refers to the writing of global update operations to the journal file.

Do not confuse the Caché journal described in this chapter with write image journaling, which is described in the "Write Image Journaling and Recovery" chapter of this guide. Journaling provides a complete record of all database changes, as long as you have journaling enabled for the database. In the event of database loss or degradation, you restore the contents of the journal file to the database.

Write image journaling provides a copy of any database modifications that are not actually written to the database when a system crash occurs. In such a case, Caché automatically writes the contents of the write image journal to the database when it restarts.

3.1.2 Protecting Database Integrity

The Caché recovery process is designed to provide maximal protection:

• It uses the "roll forward" approach. If a system crash occurs, the recovery mechanism completes the updates that were in progress. By contrast, other systems employ a "roll back" approach, undoing updates to recover. While both approaches protect internal integrity, the roll forward approach used by Caché does so with reduced data loss.

• It protects the sequence of updates: if an update is present in the database following recovery, all preceding updates are also present. Other systems which do not correctly preserve update sequence may yield a database that is internally consistent but logically invalid.

• It protects the incremental backup file structures, as well as the database. You can run a valid incremental backup following recovery from a crash.

3.1.3 Automatic Journaling of Transactions

In a Caché application, you can define a unit of work, called a transaction.
Caché transaction processing uses the journal to store transactions. Caché journals any global update that is part of a transaction, regardless of the global journal state setting for the database in which the affected global resides.

You use commands to:

• Indicate the beginning of a transaction.


• Commit the transaction, if the transaction completes normally.

• Roll back the transaction, if an error is encountered during the transaction.

Caché supports some SQL transaction processing commands. See the "Transaction Processing" chapter of Using Caché ObjectScript for details on these commands.

3.1.4 Rolling Back Incomplete Transactions

If a transaction does not complete, it is rolled back using the journal entries. When incomplete transactions are rolled back, Caché returns the globals involved to their pre-transaction values. As part of updating the database, Caché rolls back incomplete transactions by applying the changes in the journal, that is, by performing a journal restore. This happens:

• During recovery, which occurs as part of Caché startup after a system crash.

• When you halt your process while transactions are in progress.

• When you use the Terminate option to terminate a process from the [Home] > [Process Details] page of the System Management Portal. If you terminate a process initiated by the Job command, the system automatically rolls back any incomplete transactions in it. If you terminate a user process, the system sends a message to the user asking whether it should commit or roll back incomplete transactions.

You can write rollback code into your applications. The application itself may detect a problem and request a rollback. Often this is done from an error-handling routine following an application-level error.

See the Managing Transactions Within Applications section of the "Transaction Processing" chapter of Using Caché ObjectScript for more information.

3.1.5 Using Temporary Globals and CACHETEMP

Nothing mapped to the CACHETEMP database is ever journaled.

Since the globals in a namespace may be mapped to different databases, some may be journaled and some may not be.
It is the journal property for the database to which the global is mapped that determines whether Caché journals the global operation. The difference between CACHETEMP and a database with the journal property set to No is that nothing in CACHETEMP, not even transactional updates, is journaled.


If you need to exclude new z/Z* globals from journaling, map the globals to a database with the journal property set to No. To always exclude z/Z* globals from journaling, you must map them in every namespace to the CACHETEMP database.

Caché does not journal temporary globals. Some of the globals designated by Caché as temporary and mapped to CACHETEMP are:

• ^CacheTemp*
• ^ROUTINE
• ^mtemp
• ^mtemp0
• ^%cspSession

You can view a list of globals mapped to CACHETEMP from the System Management Portal:

1. From the [Home] page, click Globals under the Data Management menu.

2. On the left side of the page, click Databases and then click CACHETEMP.

3. In the search bar, select the System check box and then click Go to display the globals mapped to the CACHETEMP database.

3.2 Configuring Journaling

Caché starts with journaling enabled for the following databases: CACHESYS, CACHELIB, and USER. You can enable or disable journaling on each database from the [Home] > [Configuration] > [Local Databases] page of the System Management Portal. Click Edit on the row corresponding to the database and click Yes or No in the Global Journal State box.

The default setting of the journal state for new databases is Yes. When you first mount a database from an earlier release of Caché, the value is set to Yes, regardless of the previous setting for new globals and regardless of the previous settings of individual globals within that database.

You can change the global journal setting for a database on a running system. If you do this, Caché warns you of the potential consequences and audits the change if auditing is enabled.

The journal file name is in current date format (yyyymmdd.nnn). The suffix nnn starts at 001 and increases incrementally. When the journal file fills, Caché automatically switches to a new one. The new file has the same directory name, but a different numeric suffix.
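The rollover rule just described (date-stamped name, incrementing suffix, reset on a date change) can be sketched as a small helper. Note that next_journal_name is purely illustrative, not an InterSystems utility:

```shell
#!/bin/sh
# Illustrative helper for the yyyymmdd.nnn journal naming rule described
# above; NOT an InterSystems utility.
next_journal_name() {
    current=$1                    # e.g. 20060827.001
    today=$2                      # e.g. 20060827
    date_part=${current%.*}
    seq_part=${current#*.}
    if [ "$date_part" = "$today" ]; then
        # same day: increment the numeric suffix
        n=$(echo "$seq_part" | sed 's/^0*//')
        printf '%s.%03d\n' "$today" $((n + 1))
    else
        # date changed while the file was filling: restart at 001
        printf '%s.001\n' "$today"
    fi
}
next_journal_name 20060827.001 20060827   # prints 20060827.002
next_journal_name 20060827.002 20060828   # prints 20060828.001
```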
If you upgrade from a previous release of Caché and used a journal file prefix, Caché continues to use that prefix.

For example, if the journal file 20060827.001 fills, Caché starts a new one called 20060827.002. If the date changes while the journal file is filling, the new journal file is named 20060828.001.

The following sections describe configuration in greater detail:

• Configure Journal Settings
• Journaling Best Practices

3.2.1 Configure Journal Settings

To configure Caché journaling, navigate to the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal.

You can edit the following settings:

• Journal directory — The name of a directory in which to store the journal file. The name may be up to 63 characters long. InterSystems recommends that the journal directory be located in a different partition from your databases.

• Alternate journal directory — The name of an alternate directory journaling switches to if the current directory disk is full or becomes unavailable. The same characteristics apply as for the journal directory.

• Start new journal file every — Enter the number of megabytes for the maximum size of the journal file, after which the journal file switches. The default size is 1024 MB.

Important: InterSystems recommends isolating journal files from the databases by updating both the current and alternate journal file locations to separate disk partitions before any activity takes place on the system.

You can also update these first three settings using the ^JRNOPTS routine or by selecting option 6, Edit Journal Properties, from the ^JOURNAL routine menu.
See the Update Journal Settings Using ^JRNOPTS section for details.

• When to purge journal files — You can choose either of two options:

  - After this many days — Enter the number of days after which to purge (valid values: 1-100).

  - After this many successive successful backups — Enter the number of consecutive successful backups after which to purge (valid values: 1-10).


  This includes any type of backup, whether a Caché backup or an external backup that calls the $$BACKUP^DBACK("","E") function after successful completion.

  Note: No journal file containing currently open transactions is purged, even if it meets the criteria of this setting.

• Freeze on error — Controls the behavior when an error occurs in writing to the journal. The default is No. See Journal I/O Errors for a detailed explanation of this setting.

You are not required to restart Caché after changing any of these settings, but any change causes a new journal file to begin.

There is an additional advanced configuration setting affecting journaling, which you can maintain from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal. Choose Transactions in the Category list.

• KillRollbackLimit — The number of Kill operations that can be rolled back from a journal. Set to a value from 1000 (the default) to 65,535 Kill operations.

3.2.2 Journaling Best Practices

The following are some important points to consider when configuring journaling:

• Journal all globals to ensure close to zero data loss in the event of a crash; Caché updates the journals much more frequently than the physical database.

• Always know exactly what you are journaling, and always journal what you cannot lose.

• Understand all of your globals; make a distinction between what is truly temporary (and therefore should be mapped to CACHETEMP if possible) and what just goes away after a while (which should be journaled, as it would be needed as part of a restore).

• Place journal files on a separate disk from the database (CACHE.DAT) files. InterSystems recommends isolating journal files from the database to lessen the risk of their being corrupted if there is a crash and the database is corrupted.

• Journal files never contain database degradation; they can, therefore, function as a useful form of secondary backup.
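Returning to the purge settings described earlier, the age-based criterion ("after this many days") can be approximated with find(1). This is only an illustration against a scratch directory: unlike Caché's own purge task, it does not check for open transactions.

```shell
#!/bin/sh
# Illustration of the "purge after this many days" criterion with find(1).
# Unlike Caché's purge task, this does NOT check for open transactions,
# and the directory here is a scratch one.
jdir=$(mktemp -d)
touch -t 200601010000 "$jdir/20060101.001"   # old journal file
touch "$jdir/20060828.001"                   # recent journal file
days=7
purgeable=$(find "$jdir" -name '*.[0-9][0-9][0-9]' -mtime +"$days")
echo "$purgeable"                            # lists only the old file
```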


3.3 Journaling Operation Tasks

Once journaling is configured, there are several tasks you can perform:

• Start journaling
• Stop journaling
• Switch journal files
• View journal files
• Purge journal files
• Restore journal files

3.3.1 Start Journaling

If journaling is stopped, you can start it using the ^JRNSTART routine or by selecting option 1, Begin Journaling, from the ^JOURNAL routine menu. See the Start Journaling Using ^JRNSTART section for details.

Note: You cannot start journaling from the System Management Portal.

3.3.2 Stop Journaling

When you stop journaling, transaction processing ceases. If a transaction is in progress when you stop journaling, the complete transaction may not be entered in the journal. To avoid this problem, it is best to make sure all users are off the system before stopping journaling.

Transactions are not affected in any adverse way by switching journal files. Rollback correctly handles transactions spanning multiple journal files created by journal switching; so, if possible, it is better to switch journal files than to stop journaling.

You can stop journaling using the ^JRNSTOP routine or by selecting option 2, Stop Journaling, from the ^JOURNAL routine menu. See the Stop Journaling Using ^JRNSTOP section for details.

Note: You cannot stop journaling from the System Management Portal.


3.3.3 Switch Journal Files

Caché automatically switches the journal file in the following situations:

• After a successful backup of a Caché database
• When the current journal file grows to the maximum file size allowed (configurable on the Journal Settings page)
• When the journal directory becomes unavailable and you specified an alternate directory
• After updating settings in the [Home] > [Configuration] > [Journal Settings] page of the System Management Portal

Switching the journal file is preferable to stopping and starting journaling: you do not miss journaling any global activity that occurs after journaling is stopped but before it is restarted.

To manually switch journal files:

1. Navigate to the [Home] > [Journals] page of the System Management Portal.
2. Click Switch Journal above the list of database journal files.
3. Confirm the journal switch by clicking OK.

You can also switch journal files using the ^JRNSWTCH routine or by selecting option 4, Switch Journal File, from the ^JOURNAL routine menu. See the Switch Journal Files Using ^JRNSWTCH section for details.

3.3.4 View Journal Files

You can view the journal files from the [Home] > [Journals] page of the System Management Portal.

1. Click Journals under the Operations column of the [Home] page. Use the Filter box to shorten the list if necessary.

2. To view the journal file information, click View in the row of the appropriate journal file. Use the Match box with the Search button to help find a particular entry. (Text in the Search box is case-sensitive.)

3. To view a journal file entry, click the Offset of the appropriate node in the list to view a dialog box containing journal record details.


You can also use the ^JRNDUMP utility to display the entire journal and the SELECT^JRNDUMP entry point to display selected entries. See the Display Journal Records Using ^JRNDUMP section for details.

3.3.5 Purge Journal Files

You can schedule a task to run regularly that purges obsolete journal files. A new Caché instance contains a pre-scheduled Purge Journal task that runs after the daily Switch Journal task, which runs at midnight. The purge process deletes journal files based on the When to purge journal files setting on the [Home] > [Configuration] > [Journal Settings] page.

Note: No journal file containing currently open transactions is purged, even if it meets the criteria of the purge setting.

3.3.6 Restore Journal Files

After a system crash or disk hardware failure, recreate your database by restoring your backup copies. If you have been journaling and your journal file is still accessible, you can further restore your databases by applying the changes since the last backup, which have been tracked in your journal.

To restore the journal files:

1. First confirm that all users have exited Caché.

2. Stop journaling if it is enabled.

3. Restore the latest backup of your database. See the “Backup and Restore” chapter of this guide for more information.

4. Run the journal restore utility. See the Restore Globals From Journal Files Using ^JRNRESTO section for details.

Note: You cannot run the journal restore process from the System Management Portal.


3.4 Journaling Utilities

Caché provides several utilities to perform journaling tasks. The ^JOURNAL utility provides menu choices to run some common journaling utilities, which you can also run independently. There are also several other journaling utilities. Run these utilities from the system manager's directory.

The following sections describe the journaling utilities in detail:

• Perform Journaling Tasks Using ^JOURNAL
• Start Journaling Using ^JRNSTART
• Stop Journaling Using ^JRNSTOP
• Restore Globals From Journal Files Using ^JRNRESTO
• Filter Journal Records Using ^ZJRNFILT
• Switch Journal Files Using ^JRNSWTCH
• Display Journal Records Using ^JRNDUMP
• Update Journal Settings Using ^JRNOPTS
• Recover from Startup Errors Using ^STURECOV
• Convert Journal Files Using ^JCONVERT
• Set Journal Markers Using ^JRNMARK
• Manipulate Journal Files Using ^JRNUTIL
• Manage Journaling at the Process Level Using %NOJRN

3.4.1 Perform Journaling Tasks Using ^JOURNAL

This example shows the menu available by invoking the ^JOURNAL routine:

%SYS>Do ^JOURNAL
1. Begin Journaling (^JRNSTART)
2. Stop Journaling (^JRNSTOP)
3. Restore Globals From Journal (^JRNRESTO)
4. Switch Journal File (^JRNSWTCH)
5. Display Journal File (^JRNDUMP)
6. Edit Journal Properties (^JRNOPTS)
7. Exit This Utility

Option?


Enter the appropriate menu number option to start that particular routine. Enter 7 to exit the utility.

3.4.2 Start Journaling Using ^JRNSTART

To start journaling, run ^JRNSTART or enter 1 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running ^JRNSTART directly:

%SYS>Do ^JRNSTART

Example of starting journaling from the ^JOURNAL menu:

1. Begin Journaling (^JRNSTART)
2. Stop Journaling (^JRNSTOP)
3. Restore Globals From Journal (^JRNRESTO)
4. Switch Journal File (^JRNSWTCH)
5. Display Journal File (^JRNDUMP)
6. Edit Journal Properties (^JRNOPTS)
7. Exit This Utility

Option? 1

If journaling is running when you select this option, you see a message similar to the following:

Option? 1
Already journaling to c:\cachesys\mgr\journal\20060126.001

3.4.3 Stop Journaling Using ^JRNSTOP

To stop journaling, run ^JRNSTOP or enter 2 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running ^JRNSTOP directly:

%SYS>Do ^JRNSTOP
Stop journaling now? Yes => Yes

Example of stopping journaling from the ^JOURNAL menu:

1. Begin Journaling (^JRNSTART)
2. Stop Journaling (^JRNSTOP)
3. Restore Globals From Journal (^JRNRESTO)
4. Switch Journal File (^JRNSWTCH)
5. Display Journal File (^JRNDUMP)
6. Exit This Utility

Option? 2
Stop journaling now? Yes => Yes


If journaling is not running when you select this option, you see a message similar to the following:

Option? 2
Not journaling now.

3.4.4 Restore Globals From Journal Files Using ^JRNRESTO

Journal restore respects the current settings of the database. Caché stores nothing in the journal about the journal state of the database when it writes the journal record. The state of the database at the time of restore determines what action is taken. This means that changes to databases whose journal state is Yes are durable, but changes to other databases may not be. Caché assures physical consistency, but not necessarily application consistency, if transactions involve databases whose journal state is No.

The Caché ^JRNRESTO routine only restores databases whose journal state is Yes at the time of the journal restore. It checks the database journal state the first time it encounters each database and records the journal state. The restore process skips journal records for databases whose journal state is No.

If no databases are marked as being journaled, the routine asks if you wish to terminate the restore. You can change the database journal state to Yes on specific databases and restart ^JRNRESTO.

To restore the journal files:

1. Run the routine from the system manager's namespace:

%SYS>Do ^JRNRESTO

This utility uses the contents of journal files
to bring globals up to date from a backup.
Replication is not enabled.
Restore the Journal? Yes =>

2. Press Enter to select the default, Yes, to confirm that you want to restore the journal.

3. If you have existing journal filters, specify whether you want to use them:

Use current journal filter (ZJRNFILT)?
Use journal marker filter (MARKER^ZJRNFILT)?

See the Filter Journal Records Using ^ZJRNFILT section for details.

4. Specify whether you want to restore all journaled globals:

Process all journaled globals in all directories? Yes


• Enter Yes at the prompt if you want to apply all global changes to the database.

• Enter No if you want to restore only selected globals. Then, at the “Global^” prompts, enter the specific globals you want to restore.

5. Specify whether or not to clear the journal file:

• If you do not use transaction processing, enter Yes.

• If you use transaction processing, you can clear the journal file only if there are no active Caché processes that may be in the middle of a transaction.

Restoring from Multiple Journal Files

If Caché switched to multiple journal files since the restored backup, you must restore the journal files in order from the oldest to the most recent.

For example, if you have three journal files to restore, 20060325.001, 20060325.002, and 20060326.001, you must restore them in the following order:

1. 20060325.001
2. 20060325.002
3. 20060326.001

Rolling Back Incomplete Transactions

Restoring the journal also rolls back incomplete transactions. Ensure that users have completed all transactions so that the restore does not attempt to roll back active processes.

To ensure that transactions are all complete before you restore your backup and clear the journal file, InterSystems strongly recommends the following:

• If you need to roll back transactions for your own process, the process must halt or use the TROLLBACK command.

• If you need to roll back transactions system-wide, shut down Caché and restart it to ensure that no users are on the system.

3.4.5 Filter Journal Records Using ^ZJRNFILT

InterSystems provides a journal filter mechanism to manipulate the journal file. The journal filter program is a user-written routine called ^ZJRNFILT, whose format is shown below. It is called by the Caché journal restore program, ^JRNRESTO, and ensures that only selected records are restored.

Create the ^ZJRNFILT routine using the following format:


ZJRNFILT(pid,dir,glo,type,restmode,addr,time)

Argument   Type     Description
pid        input    Process ID of the process in the journal record (in hex)
dir        input    Directory in the journal record
glo        input    Global in the journal record
type       input    Command type in the journal record (S for Set, K for Kill)
restmode   output   0 — do not restore record; 1 — restore record
addr       output   Address of the journal record
time       output   Time stamp of the record. This is the time the journal buffer is created, not when the Set or Kill operation occurs, so it represents the earliest this particular operation could have happened.

^ZJRNFILT Considerations

Consider the following when using ^ZJRNFILT:

1. If it is the startup routine (^STU) calling ^JRNRESTO, it does not call the filter routine under any circumstances.

2. Journal restore only calls the journal filter (^ZJRNFILT) if it exists. If it does exist, the restore procedure prompts you to confirm that the filter should be used in the restore process.

3. If you answer in the affirmative about using the journal filter, then for every record in the journal file that the restore would normally apply, a call is made to the journal filter ^ZJRNFILT with the above parameters for confirmation to restore the current record.

4. You can use any kind of logic in your ^ZJRNFILT program to determine whether or not the record should be restored. You return confirmation through the output argument restmode (0 — do not restore, 1 — restore).

5. Upon completion of the journal restore process, you are prompted to confirm whether or not the ^ZJRNFILT routine should be renamed or deleted. If you choose to rename the filter, it is renamed to XJRNFILT and the original version ^ZJRNFILT is deleted.


6. The canonical form of the directory name is passed to the routine ^ZJRNFILT in the argument dir (above). If ^ZJRNFILT needs to perform checks against the directory name, you should canonize the directory name you are using. In other words, if "dirnam" is the variable in ^ZJRNFILT that contains the directory name you wish to check against, canonize it by doing:

 s dirnam=$zu(12,dirnam)

7. The process ID of the process in the journal record is passed to ^ZJRNFILT in hex.

8. The entire global reference is passed to ^ZJRNFILT, and you can use appropriate logic to perform checks.

9. The restore process aborts with an appropriate error message if any errors occur in the ^ZJRNFILT routine.

^ZJRNFILT Examples

Two globals, ^ABC and ^XYZ, are journaled. While journaling is turned on, the following code is executed, and the journal file records the Set and Kill operations for these globals:

 For I=1:1:500 Set ^ABC(I)=""
 For I=1:1:500 Set ^XYZ(I)=""
 For I=1:1:100 Kill ^ABC(I)

1. To restore all records for ^ABC only, the ^ZJRNFILT routine looks like this:

ZJRNFILT(pid,dir,glo,type,restmode,addr,time) ; Filter
 ;
 Set restmode=1                        ; Return 1 for restore
 If glo["XYZ" Set restmode=0           ; except when it is ^XYZ
 Quit
 ;

2. To restore all records except the Kill on ^ABC, the ^ZJRNFILT routine looks like this:

ZJRNFILT(pid,dir,glo,type,restmode,addr,time) ; Filter
 ;
 Set restmode=1                        ; Return 1 for restore
 If glo["^ABC",type="K" Set restmode=0 ; except when it is a kill on ^ABC
 Quit
 ;

3.4.6 Switch Journal Files Using ^JRNSWTCH

To switch the journal file, run ^JRNSWTCH or enter 4 at the Option prompt of the ^JOURNAL menu, as shown in the following example:


1. Begin Journaling (^JRNSTART)
2. Stop Journaling (^JRNSTOP)
3. Restore Globals From Journal (^JRNRESTO)
4. Switch Journal File (^JRNSWTCH)
5. Display Journal File (^JRNDUMP)
6. Exit This Utility

Option? 4
Switching from: c:\cachesys\mgr\journal\20060413.008
To: c:\cachesys\mgr\journal\20060413.009

The utility displays the names of the old and new journal files.

3.4.7 Display Journal Records Using ^JRNDUMP

To display the records in the journal file, enter 5 at the Option prompt of the ^JOURNAL menu or run ^JRNDUMP as shown in the following example:

1. %SYS>DO ^JRNDUMP

Journal        Directory & prefix
20060324.002   c:\cachesys\mgr\journal\
20060326.001   c:\cachesys\mgr\journal\
20060327.001   c:\cachesys\mgr\journal\
20060327.002   c:\cachesys\mgr\journal\
20060327.003   c:\cachesys\mgr\journal\
20060328.001   c:\cachesys\mgr\journal\
20060328.002   c:\cachesys\mgr\journal\
20060329.001   c:\cachesys\mgr\journal\
20060329.002   c:\cachesys\mgr\journal\
20060329.003   c:\cachesys\mgr\journal\
20060330.001   c:\cachesys\mgr\journal\
> 20060330.002 c:\cachesys\mgr\journal\

2. The routine displays a list of journal files. A greater-than sign (>) appears to the left of the chosen file, followed by a prompt:

Pg(D)n,Pg(U)p,(N)ext,(P)rev,(G)oto,(E)xamine,(Q)uit =>

Use these options to navigate to the journal file you wish to locate:

• Enter D or U to page through the list of journal files.
• Enter N or P to move the > to the desired journal file.
• Enter G to enter an alternate file name whose contents to display.
• Enter E to display the contents of the chosen journal file.
• Enter Q or press Enter to quit the routine.

3. After entering G or E, the utility displays the journal file name and begins listing the contents of the file by offset address. For example:


Journal: c:\cachesys\mgr\journal\20060330.002

Address  Proc ID  Op  Directory         Global & Value
===============================================================================
131088   2980     S   c:\cachesys\mgr\  SYS("shdwcli","doctest","remend") = 1+
131156   2980     S   c:\cachesys\mgr\  SYS("shdwcli","doctest","end") = 1013+
131220   2980     S   c:\cachesys\mgr\  SYS("shdwcli","doctest","jrnend") = 1+
...

4. At the bottom of the current listing page is information about the journal file and another prompt:

Last record: 573004; Max size: 1073741824
(N)ext,(P)rev,(G)oto,(F)ind,(E)xamine,(Q)uit =>

Use these options to navigate to the journal record you wish to display:

• Enter N or P to display the next or previous page of addresses.
• Enter G to bring the list to a particular address.
• Enter F to search for a particular string within the journal file.
• Enter E to enter the address and display the contents of a chosen journal record.
• Enter Q or press Enter to return to the list of journal files.

5. After entering E or G, enter an address at the prompt. The E option displays the contents of the journal record at or near the address you entered; the G option displays the page of journal records starting at that location.

For either option, the utility locates the record that is closest to the offset address you specify; it does not need to be a valid address of a journal record. Also, you may enter 0 (zero) or press Enter to go to the beginning of the journal file, or enter -1 to go to the end of the journal file.

6. You may browse through a display of the journal records using N or P to display the next or previous journal record contents, respectively. When you are finished displaying records, enter Q at the prompt to return to the list of journal records.

There are different types of journal records:

• The journal header is 8192 bytes long. It appears once at the start of every journal file. The ^JRNDUMP utility does not display the journal header record.

• Journal data records.


The following is a sample journal file data record as displayed by ^JRNDUMP. The example shows how a Set command is recorded. The new value is recorded, but not the old value, because the Set occurred outside a transaction:

Journal: c:\cachesys\mgr\journal\20060119.004
Address: 233028
Type: Set
In transaction: No
Process ID: 4836
Remote system ID: 0
Time stamp: 60284,53240
Collation sequence: 5
Prev address: 232984
Next address: 0
Global: ^["^^c:\cachesys\mgr\"]ABC
New Value: 2

(N)ext,(P)rev,(Q)uit =>

In a transaction, the old value is also recorded, to allow transaction rollback, as seen in this second example:

Journal: c:\cachesys\mgr\journal\20060119.004
Address: 233088
Type: Set
In transaction: Yes
Process ID: 5444
Remote system ID: 0
Time stamp: 60284,53240
Collation sequence: 5
Prev address: 233072
Next address: 233136
Global: ^["^^c:\cachesys\mgr\"]ABC
New Value: 5
Old Value: 2

The following table describes each field in the journal data record.

Journal Data Record Fields Displayed by ^JRNDUMP

Field            Description
Address          Location of this record in number of bytes from the beginning of the file. This is the only field where you enter a value to select a record.
Type             The type of command recorded in this journal record entry. See the Journal File Command Type Codes table for possible types.
In transaction   Whether or not the update occurred in a transaction.
Process ID       Process ID number for the process issuing the command.


Field                Description
Remote system ID     Remote system ID number (0 if a local process).
Time stamp           Time the process began, in $HOROLOG format.
Collation sequence   Collation sequence of the global being updated.
Prev address         Location of the previous record (0 indicates this is the first record).
Next address         Location of the next record (0 indicates this is the last record).
Cluster sequence #   Sequencing for globals in cluster-mounted databases. During cluster failover, journal entries from different nodes are updated in order of this cluster time sequencing.
Global               Extended reference of the global being updated.
New Value            For a Set operation, the value assigned to the global.
Old Value            For a Set or Kill operation in a transaction, the value that was in the global before the operation.

The following table shows both the number and the letter code for various transaction types.

Journal File Command Type Codes

Type          Number   Letter
BeginTrans    4        BT
CommitTrans   5        CT
Set           6        S
Kill          7        K
NodeKill      8        k
Desc          9
ZKill         10       k
NSet          11       S
NKill         12       K
NZKill        13       k
JrnMark       14       M
BitSet        15       b
NetReq                 N
JOURNAL-END   -1       N


3.4.7.1 Select Journal Records to Dump

The function SELECT^JRNDUMP lets you display any or all of the records in the journal file. Caché dumps selected records from the journal file, starting from the beginning of the file, based on the arguments passed to the function.

The syntax to use the SELECT entry point of the ^JRNDUMP utility is as follows:

SELECT^JRNDUMP(%jfile,%pid,%dir,%glo,%gloall,%command,%remsysid)

Argument    Description
%jfile      Journal file name. Default is the current journal file.
%pid        Process ID in the journal record. Default is any process.
%dir        Directory in the journal record. Default is any directory.
%glo        Global reference in the journal record. Default is any global.
%gloall     Indicator of whether to list entries related to all global nodes containing the name represented by %glo: 0 — exact match of the global reference with the name specified in %glo; 1 — partial match, that is, all records with a global reference that contains the name specified in %glo. Default is 0.
%command    Type of command. Default is any command. Use either the letter or the numeric codes described in the Journal File Command Type Codes table in the previous section.
%remsysid   Remote system ID of the journal record. Default is any system. If %pid is specified, then %remsysid defaults to the local system (0); otherwise, it defaults to any system, the same as if it were specified as 0. That is, you cannot select journal entries only from the local system.

You may pass the null string for any of the other arguments, in which case the routine uses the defaults.

SELECT^JRNDUMP Examples

The following examples show different ways to select specific journal records.

You can use this entry point to send the output of the ^JRNDUMP routine to a device other than the terminal. For example, this sends the output to a file called JRNDUMP.OUT:


%SYS>Do SELECT^JRNDUMP("JOURNAL:POGH120020507.009","543437716","","","","","")
Device: SYS$LOGIN:JRNDUMP.OUT
Parameters: "RW"=>

To select all records in the journal file that contain the global reference ^ABC:

DO SELECT^JRNDUMP("20050327.001","","","^ABC",1,"")

To select only records that have an exact match to the global reference ^ABC:

DO SELECT^JRNDUMP("20050327.001","","","^ABC",0,"")

Records that are not an exact match, such as ^ABC(1) or ^ABC(100), are not selected.

To select only records that exist for the process with PID 1203:

DO SELECT^JRNDUMP("20050327.001","1203","","","","")

Note: On OpenVMS, you must specify the PID in uppercase.

To select only records for Kill operations of global ^ABC:

DO SELECT^JRNDUMP("20050327.001","","","^ABC","","K")

The following is an example of a journal marker record created by an incremental backup:

Journal: c:\cachesys\mgr\journal\20060330.002
Address: 182240
Type: JrnMark
Marker ID: -1
Marker text: MAR 30 2006;11:00AM;Incremental
Marker seq number: 1
Prev marker address: 0
Time stamp: 60354,37684
Prev address: 182136
Next address: 182332

(N)ext,(P)rev,(Q)uit =>

3.4.8 Update Journal Settings Using ^JRNOPTS

As an alternative to using the Journal Settings page of the System Management Portal, you can update the basic journal configuration settings using the ^JRNOPTS routine. To change a setting, type the new value at the prompt and press Enter. For example:


%SYS>Do ^JRNOPTS
Current journal directory: =>
Alternate journal directory: =>
Max journal file size in MB (range: [1, 4087]): => 1024

If you change any of the settings, the journal file switches. If you do not change any settings, you see the following message:

*** No change to settings, journal files will not be switched

3.4.9 Recover from Startup Errors Using ^STURECOV

If the journal or transaction restore process encounters errors during the Caché startup procedure, the procedure logs the errors in the console log (cconsole.log) and starts the system in single-user mode.

Caché provides a utility, ^STURECOV, to help you recover from the errors and start Caché in multiuser mode. The routine has several options which you can use to retry the failed operation and bring the system up, or to ignore the errors and bring the system up. The journal restore phase tries to do as much work as possible before it aborts. If a database triggers more than three errors, the restore aborts the recovery of that database and leaves the database dismounted. During transaction rollback, the first error in a database causes the rollback process to skip that database from then on. The process does not fully replay transactions that reference that database; it stores them for rollback during the recovery process.

When Caché encounters a problem during the dejournaling phase of startup, it generates a series of console log messages similar to the following:

08/10-11:19:47:024 ( 2240) System Initialized.
08/10-11:19:47:054 ( 2256) Write daemon started.
08/10-11:19:48:316 ( 1836) Performing Journal Recovery
08/10-11:19:49:417 ( 1836) Error in JRNRESTB: restore+49^JRNRESTB
c:\cachesys\mgr\journal\20060810.004 addr=977220
^["^^c:\cachesys\mgr\jo1666\"]test(4,3,28)
08/10-11:19:49:427 ( 1836) Error in JRNRESTB: restore+49^JRNRESTB
c:\cachesys\mgr\journal\20060810.004 addr=977268
^["^^c:\cachesys\mgr\test\"]test(4,3,27)
08/10-11:19:49:437 ( 1836) Error in JRNRESTB: restore+49^JRNRESTB
c:\cachesys\mgr\journal\20060810.004 addr=977316
^["^^c:\cachesys\mgr\test\"]test(4,3,26)
08/10-11:19:49:447 ( 1836) Error in JRNRESTB: restore+42^JRNRESTB
c:\cachesys\mgr\journal\20060810.004 addr=977748
^["^^c:\cachesys\mgr\test\"]test(4,2,70)
08/10-11:19:50:459 ( 1836) Too many errors restoring to c:\cachesys\mgr\test\.
Dismounting and skipping subsequent records
08/10-11:19:50:539 ( 1836) 4 errors during journal restore,
see console.log file for details.
Startup aborted, entering single user mode.


If the errors are from transaction rollback, the output looks similar to this:

08/11-08:55:08:732 ( 428) System Initialized.
08/11-08:55:08:752 ( 1512) Write daemon started.
08/11-08:55:10:444 ( 2224) Performing Journal Recovery
08/11-08:55:11:165 ( 2224) Performing Transaction Rollback
08/11-08:55:11:736 ( 2224) Max Journal Size: 1073741824
08/11-08:55:11:746 ( 2224) START: c:\cachesys\mgr\journal\20060811.011
08/11-08:55:12:487 ( 2224) Journaling selected globals to
c:\cachesys\mgr\journal\20060811.011 started.
08/11-08:55:12:487 ( 2224) Rolling back transactions ...
08/11-08:55:12:798 ( 2224) Error in %ROLLBACK: set+2^%ROLLBACK
c:\cachesys\mgr\journal\20060811.010 addr=984744
^["^^c:\cachesys\mgr\test\"]test(4,1,80)
08/11-08:55:12:798 ( 2224) Rollback of transaction for process id #2148
aborted at offset 984744 in c:\cachesys\mgr\journal\20060811.010.
08/11-08:55:13:809 ( 2224) c:\cachesys\mgr\test\ dismounted -
Subsequent records will not be restored
08/11-08:55:13:809 ( 2224) Rollback of transaction for process id #924
aborted at offset 983464 in c:\cachesys\mgr\journal\20060811.010.
08/11-08:55:14:089 ( 2224) STOP: c:\cachesys\mgr\journal\20060811.011
08/11-08:55:14:180 ( 2224) 1 errors during journal rollback,
see console.log file for details.
Startup aborted, entering single user mode.

Both output listings end with the same instructions:

Enter Cache' with
c:\cachesys\bin\cache -sc:\cachesys\mgr -B
and D ^STURECOV for help recovering from the errors.

When Caché cannot start properly, it starts in single-user mode. While in this mode, execute the special commands indicated in these instructions to enter Caché. For example, for a Windows installation, enter the following:

c:\cachesys\bin\>cache -sc:\cachesys\mgr -B

UNIX-based and OpenVMS systems have a slightly different syntax.

This runs the Caché executable from the Caché installation bin directory (cachesys\bin, by default), indicating the pathname of the system manager's directory (c:\cachesys\mgr) with the -s argument and inhibiting all logins except one emergency login with the -B argument. You are now in the manager's namespace and can run the startup recovery routine, ^STURECOV:

Do ^STURECOV

The ^STURECOV journal recovery menu appears as follows:


Journal recovery options
--------------------------------------------------------------
1) Display the list of errors from startup
2) Run the journal restore again
3) Bring up the system in multi-user mode (includes journal restore)
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
8) Reset system so journal is not restored at startup
9) Display instructions on how to shut down the system
10) Display Journaling Menu (^JOURNAL)
--------------------------------------------------------------
H) Display Help
E) Exit this utility
--------------------------------------------------------------
Enter choice (1-10) or [Q]uit/[H]elp?

Only UNIX-based or OpenVMS systems contain option 9 on the menu.

Before starting the system in multiuser mode, correct the errors that prevented the journal restore or transaction rollback from completing. You have several options regarding what to do:

• Option 1 — The journal restore and transaction rollback procedure tries to save the list of errors in the ^%SYS() global. This is not always possible, depending on what is wrong with the system. If this information is available, this option displays the errors.

• Option 2 — This option performs the same journal restore and transaction rollback which were performed when the system was started. The amount of data is small, so it should not be necessary to try to restart from where the error occurred.

• Option 3 — When you are satisfied that the system is ready for use, use this option to complete the startup procedure and bring the system up as if startup had completed normally.

• Option 4 — This option lets you dismount a database (^DISMOUNT utility). Generally, use this option if you want to let users back on a system but you want to prevent them from accessing a database which still has problems.

• Option 5 — This option lets you mount a database (^MOUNT utility).

• Option 6 — This option lets you edit the database structure (^REPAIR utility).

• Option 7 — This option lets you validate the database structure (^INTEGRIT utility).

• Option 8 — This updates the system so that it does not attempt journal restore or transaction rollback at startup. This applies only to the next time the startup process is run. Use this in situations where you cannot get journal recovery to complete and you need to allow users back on the system. Consider dismounting the databases which have not been


recovered. This operation is not reversible. You can perform journal restore manually using the ^JRNRESTO utility.

• Option 9 — It is not possible to shut down the system from this utility, but this option displays instructions on how to shut the system down from the UNIX or OpenVMS command line.

• Option 10 — This option brings up the journaling menu, which allows you to browse and restore journal files. There are options which start and stop journaling, but these are not generally of interest when resolving problems with journaling at startup.

Take whatever corrective action is necessary to resolve the problem. This may involve using the ^DATABASE routine to extend the maximum size of the database, or it may require freeing space on the file system or using the ^INTEGRIT and ^REPAIR utilities to find and correct database degradation. As you do this work, you can use Option 2 of the ^STURECOV utility to retry the journal replay/transaction rollback as many times as necessary. You can display any errors you encounter, including those from when the system started, using Option 1. When you have corrected all the problems and run Option 2 without any errors, use Option 3 to bring the system up in multiuser mode.

If you find that you cannot resolve the problem, but you still want to bring the system up, use Option 8 to clear the information in the Caché write image journal (.wij file) that triggers journal restore and transaction rollback at startup. The option also logs the current information in the console log. Once this completes, use Option 3 to start the system. Use this facility with care, as it is not reversible.

If Caché was unable to store the errors during startup in the ^%SYS() global for ^STURECOV to display, you may get an initial message before the menu that looks like this:

There is no record of any errors during the prior startup
This could be because there was a problem writing the data
Do you want to continue ? No => yes
Enter error type (? for list) [^] => ?
Supported error types are:
JRN - Journal and transaction rollback
Enter error type (? for list) [^] => JRN

Journaling errors are one type of error that this utility tries to handle, and they are the scope of this chapter. Other error types are discussed in the appropriate sections of the documentation.


CAUTION: Only use the ^STURECOV utility when the system is in single-user mode following an error during startup. Using it while the system is in any other state (for example, up and running normally) can cause serious damage to your data, as it restores journal information if you ask it to, and this information may not be the most current data. The ^STURECOV utility warns you, but it lets you force it to run.

3.4.10 Convert Journal Files Using ^JCONVERT and ^JREAD

The ^JCONVERT routine is a utility that reads journal files and converts them to a common format so that the ^JREAD utility can read them.

The directory reference determines where ^JREAD sets the global on the target system. If the directory reference is not included, all sets are made to the current directory. If the directory reference is included, sets are made to the same directory as on the source system unless translated by a user-supplied ^%ZJREAD program. If the target system is on a different operating system, or the databases reside in different directories, the ^%ZJREAD program must be used to translate the directory reference.

The following is an explanation of how ^JREAD operates:

;During import of the records, the routine ^%ZJREAD will be called for
;each journal transaction if it exists. This is a user written routine
;which will allow the user to examine and filter the journal
;transactions. The following variables can be looked at and manipulated
;in ^%ZJREAD:
;
; type - Transaction type
; gref - Global reference
; value - Global value
; %ZJREAD - 1:Apply transaction, 0:Don't apply transaction
;
;If the user decides to not apply the transaction, they can set the
;variable %ZJREAD to 0 to skip over it. Likewise they can modify the
;other variables (such as the directory specification in gref) to
;change how the transaction is applied.

The following is a sample run of ^JCONVERT:


%SYS>d ^JCONVERT

Journal Conversion Utility [ Cache Format --> Common Format ]

You must choose the export format of the converted file.
There are two common export formats to choose from. If you are moving your
data to another Cache or ISM system, or have binary data you should choose
Variable mode. Otherwise Stream mode will be used.

Use Variable mode?
Is the target platform VMS?

Globals in the journal file are stored with a specific directory reference
appended to the global reference. You can choose either to include
the directory reference in the converted file, or exclude it. Note that
if you include it, you can always filter it out or change it later during
the JREAD procedure. The directory reference determines where ^JREAD sets
the global on the target system. If the directory reference is not included,
all sets are made to the current directory. If the directory reference is
included, sets will be made to the same directory as on the source system
unless translated by a ^%ZJREAD program you supply. If the target system
is on a different operating system or the databases reside in different
directories on the target system, the ^%ZJREAD program must be used to
translate the directory reference.

Include the directory reference?
Enter common journal file name:
Enter common journal file name: COMMON.JRN
Common journal file: COMMON.JRN
Variable mode: Yes
Directory reference: Yes
Process all journaled globals in all directories? No
Directory to restore [? for help]: C:\CONFIGS\CACHE51U\MGR\USER\
c:\configs\cache51u\mgr\user\
Process all globals in c:\configs\cache51u\mgr\user\? No => Yes
Directory to restore [? for help]:
Processing globals from the following datasets:
1. c:\configs\cache51u\mgr\user\ All Globals
Specifications correct? Yes => Yes
Specify range of files to process (names in YYYYMMDD.NNN format)
from: [?] => 20050526.001
through: [?] =>
Prompt for name of the next file to process? No => No
Provide or confirm the following configuration settings:
Journal File Prefix: =>
Files to dejournal will be looked for in:
c:\configs\cache51u\mgr\journal\
in addition to any directories you are going to specify below, UNLESS
you enter a minus sign ('-' without quotes) at the prompt below,
in which case ONLY directories given subsequently will be searched
Directory to search:
Here is a list of directories in the order they will be searched for files:


c:\configs\cache51u\mgr\journal\
c:\configs\cache51u\mgr\journal\20050526.001
100.00%
***Journal file finished at 12:26:23

3.4.11 Set Journal Markers Using ^JRNMARK

To set a journal marker in a journal file, use the following routine:

SET rc=$$ADD^JRNMARK(id,text)

• id: Marker ID (for example, -1 for backup)
• text: Marker text of any string up to 256 characters (for example, "timestamp" for backup)
• rc: Journal location of the marker (journal offset and journal file name, delimited by a comma) or, if the operation failed, a negative error code followed by a comma and a message describing the error. Note that a journal offset must be a positive number.

3.4.12 Manipulate Journal Files Using ^JRNUTIL

InterSystems provides several functions in the ^JRNUTIL routine. You can use these functions for writing site-specific routines to manipulate journal records and files.

The following list shows the functions available in the routine:
• Switch to a different journal file: $$JRNSWCH^JRNUTIL(jrnfile,mode)
• Open a journal file: $$OPENJRN^JRNUTIL(jrnfile)
• Close a journal file: $$CLOSEJRN^JRNUTIL(jrnfile)
• Use a journal file: $$USEJRN^JRNUTIL(jrnfile)
• Read a record from a journal file into a local array: $$GETREC^JRNUTIL(addr,jrnode)


• Delete a record from a journal file: $$DELREC^JRNUTIL(addr)
• Delete a journal file: $$DELFILE^JRNUTIL(jrnfile)

The arguments used in the utility are:
• addr: Address of the journal record.
• jrnfile: Name of the journal file.
• newdir: New journal file directory.
• jrnode: Local variable passed by reference to return journal record information.

3.4.13 Manage Journaling at the Process Level Using ^%NOJRN

If journaling is enabled system-wide, you can stop journaling for Set and Kill operations on globals within a particular process by issuing a call to the ^%NOJRN utility from within an application or from programmer mode as follows:

%SYS>DO DISABLE^%NOJRN

Journaling remains disabled until one of the following events occurs:
• The process halts.
• The process issues the following call to reactivate journaling:

%SYS>DO ENABLE^%NOJRN

3.5 Special Considerations for Journaling

Keep in mind the following characteristics of Caché journaling when developing your applications:
• Journal Management Global
• Performance
• Journal I/O Errors
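The per-process control described in section 3.4.13 pairs the DISABLE and ENABLE entry points around work you deliberately do not want journaled. The following is a minimal sketch; the routine name BATCHLOAD and the global ^Staging are hypothetical, and only the current process's updates are affected:

```objectscript
BATCHLOAD ; sketch: suppress journaling for this process during a bulk load
 New i
 Do DISABLE^%NOJRN                   ; stop journaling Set/Kill for this process only
 For i=1:1:100000 Set ^Staging(i)=i  ; these updates are not journaled
 Do ENABLE^%NOJRN                    ; restore normal journaling for this process
 Quit
```

Remember that updates made while journaling is disabled cannot be recovered from the journal, so use this only for data you can recreate.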


3.5.1 Journal Management Global

The global ^%SYS is used to store information about the journal file. For example:
• ^%SYS("JOURNAL","ALTDIR") stores the name of the alternate journal directory.
• ^%SYS("JOURNAL","CURDIR") stores the name of the current journal directory.
• ^%SYS("JOURNAL","CURRENT") stores the journal status and the journal file name.

You can view this information from the [Home] > [Globals] page of the System Management Portal.

3.5.2 Performance

While journaling is crucial to ensuring the integrity of your database, it consumes disk space and can slow performance, depending on the number of global updates being journaled. Journaling affects performance because updates result in double processing: each change is recorded in both the database and the journal file. Caché uses a flat-file journaling scheme to minimize the adverse effect on performance.

3.5.3 Journal I/O Errors

This section contains a brief guide to recovering from journal I/O errors, including when Caché cannot open the latest journal file.

Upon an I/O error, the journal daemon retries the failed operation periodically until it succeeds; typically there is a one-second interval between consecutive retries. What happens next depends on whether the FreezeOnError option is set to true or false.

If FreezeOnError is false, as soon as the error occurs, journaling is first retried and then disabled, while Caché continues running. As soon as the error occurs, Caché sends a console message to alert the system administrator, who can fix the problem and then run ^JRNSWTCH at the console to restart journaling when all is ready. If journaling becomes disabled, the system administrator must also refresh the shadow and take a backup as soon as possible. Running without journaling is a calculated risk, as it means the activity that occurs during this period cannot be restored.


Freeze System on Journal I/O Error Setting is False

If the option is off (the default), Caché disables journaling, to prevent the system from hanging, after it has retried the failed I/O operation at least 16 times without success and one of the following conditions is met:
• The journal buffers are all filled up.
• It has retried for 10 minutes.

Once journaling is disabled, global Set and Kill operations are not journaled. Therefore, back up the databases as soon as the cause of the journal I/O error is addressed. As part of the Caché online backup procedure, the ^JRNSWTCH routine switches journaling to a new file and re-enables journaling.

Note: Caché does not re-enable journaling automatically, even if it succeeds with the failed I/O and switches journaling to a new file.

Freeze System on Journal I/O Error Setting is True

If the option to freeze the system on journal I/O errors is Yes, the journal daemon freezes journaling immediately upon an I/O error. Thus, global Set and Kill operations that are meant to be journaled hang. This prevents the loss of journal data at the expense of system availability. The journal daemon unfreezes journaling after it succeeds with the failed I/O operation.

If FreezeOnError is true, as soon as the error occurs, all global activities that are normally journaled are blocked, which causes other jobs to block. The typical outcome when FreezeOnError is true is that Caché goes into a hanging state until the journaling problem is resolved, and then resumes running. While Caché is hanging, the administrator can take corrective measures, such as freeing up space on a disk that is full or switching the journal to a new disk that is working or has free space. This option has the advantage that once the problem is fixed and Caché resumes running, no journal information has been lost. It has the disadvantage that the system is less available while the problem is being solved.

Warning and error messages are posted to cconsole.log periodically while the journal daemon is retrying the failed I/O operation.

What happens if journaling is disabled due to I/O errors?

When journaling is disabled, database updates are no longer journaled. As a result, the journal is no longer a reliable source from which to recover databases if a crash occurs. Since shadowing relies on journaling of the main databases, it also becomes unreliable once updates to the main databases are not journaled. For the same reason, transaction rollback and ECP networking also fail.
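The freeze-on-error behavior described above is controlled by an instance configuration setting. As an illustration only (the parameter name follows this chapter's usage, but the exact spelling and file layout may vary by Caché version, so verify against your own instance), it corresponds to an entry in the instance's cache.cpf configuration file such as:

```
[Journal]
FreezeOnError=1
```

With a value of 1, journaling (and therefore the system) freezes on a journal I/O error; with 0, the default, journaling is disabled after the retry limits and Caché continues running.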


What should you do after journaling is disabled?

Perform the following steps:
1. Resolve the problem that disabled journaling.
2. Switch the journal file to re-enable journaling.
3. Back up the databases on the main server (the backup automatically re-enables journaling if you have not done so).
4. Restore the backup to the shadow(s), if any.
5. Restart the shadow from the new journal file created since the backup.
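Step 2 is the journal switch mentioned earlier in this chapter; run from the %SYS namespace, it is a single call (shown as a sketch, with the utility's output omitted):

```objectscript
%SYS>DO ^JRNSWTCH
```

Alternatively, the online backup taken in step 3 performs this switch for you, since the backup procedure switches journaling to a new file and re-enables it.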


4 Shadow Journaling

Shadow journaling, or shadowing, enables a secondary computer to maintain a “shadow” version of selected databases as they are updated on a primary machine. By continually transferring journal information from the primary machine to the secondary machine, shadowing enables recovery to a database which is within only a few transactions of the source database. This process is sometimes referred to as replication.

By itself, shadowing is not sufficient to ensure successful failover, but it is a very simple and inexpensive approach to maintaining a disaster recovery system. Often, this approach is also used to update a report server, where ad hoc reporting tasks can operate on current data without affecting production.

This chapter discusses the following topics:
• Shadowing Overview
• Configuring Shadowing
• Using Shadowing
• Using the Shadow Destination for Disaster Recovery

4.1 Shadowing Overview

Shadow journaling monitors database activity on a primary system, the source, and causes the same activity to occur on a secondary system, the destination. It does this through a shadow client service running on the destination that continually requests journal file details from a shadow service running on the source. The shadow service responds by sending the details


of the actual Set, Kill, and $Bit journal record entries to the destination shadow over a TCP connection. The source and destination servers can be of different hardware, operating system, or endian byte order.

[Figure: Shadowing Overview]

Starting with Caché release 5.1, there is no longer an option to choose the method of journal transmission. All shadowing uses the fast mode, apply-changes method to transmit information. Fast transmission mode allows more efficient performance because it sends the compacted journal file block by block. Fast mode requires the data to be written to the journal file, which may introduce a delay of a few seconds. The shadow establishes a TCP connection to the server and receives the journal file. As the journal file downloads, another shadow process applies the journal entries to the local destination copy of the database.

Upon connecting to the data source server, the destination shadow sends the server the name of the journal file and the starting block number. When the server reaches the ending block number, even if it has changed since the download began, the transmission ends. Subsequently, the shadow checks the source to see if it has the latest records. If it does, the shadow waits, checking for new records periodically. If it does not have the latest records, the shadow downloads them and updates the databases as described.


The fast transmission mode applies all transactions to the local (shadow) database, allowing you to maintain a shadow copy of the databases. Caché purges the destination shadow copy of a source journal file once it is dejournaled, as long as it does not contain any transactions open on the shadow.

4.2 Configuring Shadowing

This section explains how to configure and set up shadowing in Caché. It describes the following procedures:
• Configuring the Source Database Server
• Configuring the Destination Shadow
• Journaling on the Destination Shadow

Important: A shadow service cannot run on a system with a single-server license.

4.2.1 Configuring the Source Database Server

To enable shadowing on a source database server, first ensure that the source system can make a TCP connection to the destination system. Next, use the System Management Portal from the Caché instance running on the source system to enable the shadow service, restrict connections, and enable global journaling for the databases you are shadowing. You also need to synchronize the source and destination databases before shadowing begins. These procedures are described in the following topics:
• Enable the Shadowing Service
• Enable Journaling
• Synchronize Databases

For information on methods and queries available for interfacing with the data source of a shadow apart from the System Management Portal, see the SYS.Shadowing.DataSource class documentation in the Caché Class Reference.


Enable the Shadowing Service

To use shadowing, you must enable the shadowing service using the Security Management portion of the System Management Portal. You may also restrict shadowing access by entering the IP addresses of allowed connections:

1. Navigate to the [Home] > [Security Management] > [Services] page of the System Management Portal.
2. Click %Service_Shadow in the list of service names to edit the shadow service properties.
3. Select the Service enabled check box. Before clicking Save, you may want to first restrict which IP addresses can connect to this database source. If so, perform the next step, and then click Save.
4. In the Allowed Incoming Connections box, any previously entered server addresses are displayed in the IP Address list. Click Add to add an IP address. Repeat this step until you have entered all permissible addresses.

You may delete any of these addresses individually by clicking Delete in the appropriate row, or click Delete All to remove all addresses, thereby allowing connections from any address.

Enable Journaling

Verify that you are journaling each database that you wish to shadow:

1. Navigate to the [Home] > [Configuration] > [Local Databases] page of the System Management Portal and view the Journal column for each database you wish to shadow.
2. To change the journal state from No to Yes, click Edit in the row of the appropriate database to edit the database properties.
3. In the Global Journal State list, click Yes and then click Save.

Note: By default, the CACHELIB, DOCBOOK, and SAMPLES databases are not journaled and, as a result, are not shadowed; CACHETEMP is never journaled. You can allow shadowing of a database by changing its Global Journal State value to Yes as described previously.

Synchronize Databases

Before you start shadowing, synchronize the databases on the shadow destination with the source databases. Use an external backup on the source data server and restore the databases on the destination shadow. See the “Backup and Restore” chapter of the Caché High Availability Guide for more information.


4.2.2 Configuring the Destination Shadow

To configure shadowing on a destination shadow server, first ensure that the destination system can make a TCP connection to the source system. Next, use the System Management Portal from the Caché instance running on the destination system to configure the destination shadow properties and start shadowing. These procedures are described in the following topics:
• Define the Shadow
• Map Databases
• Start Shadowing

For information on methods and queries available for interfacing with the shadow destination apart from the System Management Portal, see the SYS.Shadowing.Shadow class documentation in the Caché Class Reference.

Define the Shadow

Navigate to the [Home] > [Configuration] page of the System Management Portal and click Shadow Server Settings under the Connectivity column to display the [Home] > [Configuration] > [Shadow Server Settings] page. Perform the following steps to define the shadow properties:

1. Click Add New Shadow to define a shadow on this destination server.
If you have previously defined a shadow source and wish to update its information, click Edit in the row of the source settings you wish to update.
2. Enter an identifying name for this shadow in the Name of the shadow box. The system uses this name to distinguish between shadow instances that may be running on the same system.
Note: Do not use the tilde (~) character in the shadow name; it is used in internal shadow processing.
3. Enter the TCP/IP address or host name (DNS) of the source database server you are shadowing in the DNS name or IP address of the source box.
4. Enter the port number of the Caché instance of the source database server you are shadowing in the Port number of the source box.
5. After entering the location information for the source instance, click Select Source Event to choose where to begin shadowing. A page displays the available source events from the source journal file directory. You must select a source event before you can add any


database mappings or start shadowing. See the “Select a Source Event” section for details.
6. Click Advanced to enter the following optional fields:
• Journal file directory — Enter the full name, including the path, of the journal file directory on the destination shadow system. Click Browse for help in finding the proper directory. The pre-filled default for a new shadow is a subdirectory named shadow in the manager’s directory. For example: C:\Program Files\CacheSys\Mgr\shadow.
• Filter routine — Enter the name (omit the leading ^) of an optional filter routine the shadow uses to filter journal records before dejournaling them on the shadow. The routine should be in the %SYS namespace. See the “Create a Filter Routine” section for details.
• Maximum error messages — Enter the number of shadowing errors, from 0 to 200, which Caché should retain.
7. Click Save.

Map Databases

After you successfully save the configuration settings, add database mappings from the source to the shadow:

1. Next to Database mapping for this shadow, click Add to associate the database on the source system with the directory on the destination system using the Add Shadow Mapping dialog box.
2. In the Source database directory box, enter the physical pathname of the source database file (the CACHE.DAT file). Enter the pathname of its corresponding destination shadow database file in the Shadow database directory box, and then click Save.
3. Verify any pre-filled mappings and click Delete next to any invalid or unwanted mappings. Shadowing requires at least one database mapping to start.
4. Click Close to return to the [Home] > [Configuration] > [Shadow Server Settings] page.

If the source database server is part of a cluster, the configuration settings for the destination shadow differ slightly. For information on shadowing a clustered system, see the Cluster Shadowing section of the “Cluster Journaling” chapter of this guide.


Start Shadowing

The shadow definition you added now appears in the list of shadows on the [Home] > [Configuration] > [Shadow Server Settings] page.

1. Before starting the shadowing process, verify that you have synchronized the databases you are shadowing on the source and destination, selected the appropriate point to begin shadowing, and mapped the source databases to the corresponding destination databases.
2. Click Start in the row for the shadow name you want to start.

Important: The journal reader and database update processes on the shadow destination communicate via shared memory allocated from the generic memory heap. Set the generic memory heap as large as possible for optimal performance of shadow dejournaling. The minimum requirement is five pages for shadowing. The shadow destination fails to start if there is insufficient generic memory heap allocated.
Change the setting of GenericHeapSize from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal.

4.2.2.1 Select a Source Event

While configuring a destination shadow, you must select a source event from the journal files on the data source server where shadowing of the journaled databases should begin.

Click Select Source Event to display a list of journal events on the source database that are valid starting points for shadowing. From this list, click the time to specify at which source event shadowing starts. Choose the starting point after which you synchronized the databases on the source and destination.

For example, Caché automatically switches the journal file after a successful backup. Before starting the shadowing process, synchronize the databases by restoring the successful backup file from the source on the destination shadow databases. On the shadow, click Select Source Event from the configuration page to see events listed similar to those in the following display.


For this example, to start shadowing at the point when the source backup ended successfully (the point of database synchronization), click the Time (2006-03-12 10:15:24) of the Event displaying "end of backup".

4.2.2.2 Create a Filter Routine

If you indicate a filter routine, the shadow dejournaling process runs it from the %SYS namespace. Your filter routine should take the following format:

MyShadowFilter(pid,dir,glo,type,addr,time)

The input values of the filter routine are described in the following table.


• pid: process ID of the record. If the record has a nontrivial remote system ID, pid contains two fields delimited by a comma (,): the first field is the process ID and the second is the remote system ID.
• dir: source (not shadow) database directory
• glo: global reference in the form of global(subscripts), without the leading ^
• type: type of the record; valid values are "S" (SET), "s" (BITSET), "K" (KILL), and "k" (ZKILL)
• addr: offset of the record in the journal file
• time: timestamp of the record

In the filter routine logic, return 0 for the dejournaling process to skip the record; otherwise the shadow dejournals the record.

CAUTION: Perform the New command on any local variable in the filter routine to avoid accidentally overwriting the variables used in the shadow routine.

The following sample filter routine skips all journal records for the global ^X during the dejournaling process and logs each record that is dejournaled:

MyShadowFilter(pid,dir,glo,type,addr,time) ;shadow filter routine
 If $Extract($QSubscript(glo,0))="X" Quit 0 ;skip X* globals
 Do MSG^%UTIL(pid_","_dir_","_glo_","_type_","_addr_","_time,1,0) ;log
 Quit 1

You can specify the filter routine in two ways:
• From the [Home] > [Configuration] > [Shadow Server Settings] page, when you choose to Add a New Server or Edit an existing shadow, enter the name in the Filter routine box in the Advanced settings.
• Set the global node ^SYS("shdwcli",shdwname,"filter") to the name of the filter routine (without the leading ^), where shdwname is the name of the shadow. For example:

Set ^SYS("shdwcli","MyShadow","filter")="MyShadowFilter"
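A filter can also select on the other input arguments, such as the record type or the source directory. The following sketch is hypothetical (the routine name MyDirFilter and the directory path are invented for illustration); note the New on local variables, per the CAUTION about overwriting shadow variables:

```objectscript
MyDirFilter(pid,dir,glo,type,addr,time) ;sketch: dejournal one source directory only
 New ok                         ;New locals to avoid clobbering shadow variables
 Set ok=1
 If dir'="c:\configs\cache51u\mgr\user\" Set ok=0  ;skip records from other directories
 If type="k" Set ok=0           ;also skip ZKILL records, for illustration
 Quit ok                        ;0 = skip the record, nonzero = dejournal it
```

As with the sample above, returning 0 tells the dejournaling process to skip the record; any other return value lets it through.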


4.2.3 Journaling on the Destination Shadow

InterSystems recommends you journal all databases that are the destination of shadowing. This results in the destination shadow maintaining a journal of applied updates, which provides an additional level of redundancy. Be careful not to place these journals in the same directory as the journal files coming over from the source database server.

To mitigate the increased demand on the storage capacity of the shadow, Caché purges the destination shadow copy of a source journal file once it is dejournaled, as long as it does not contain any transactions open on the shadow.

If you decide not to journal the destination shadow databases, you must also disable journaling on the CACHESYS database. Caché stores the journal address and journal file name of the journal record last processed by shadowing in the ^SYS global in the CACHESYS database. This serves as a checkpoint from which shadowing resumes if shadowing fails.

CAUTION: On the shadow destination, if you journal the CACHESYS database but not the destination shadow databases, there is the possibility that, if the shadow crashes and is restarted, the checkpoint in CACHESYS could be recovered to a point in time which is later in the journal stream than the last record committed to the shadow databases.

4.3 Using Shadowing

This release of Caché provides an enhanced interface to shadow processing. A shadow can be in one of three states, and depending on the state, you can perform different actions on the shadow. The following sections describe each state and action, including the interrelationships among them.

Shadow States

A shadow can be in one of these states at any given time:
• Stopped — When a shadow is stopped, you can modify its properties. This is the initial state of a newly created shadow.
• Processing — When a shadow is running, it applies database updates and you cannot modify its properties.
• Suspended — When a shadow is suspended, it does not apply database updates and you cannot modify its properties.


Shadow Actions

There are four allowable actions you can perform on a shadow, depending on its current state and your user privileges:
• Start — Starts a stopped shadow from the starting point specified using Select Source Event.
• Stop — Stops a processing or suspended shadow. When a shadow is stopped, Caché rolls back its open transactions and purges the shadow copies of the journal files.
If you stop a shadow within an hour of starting it, Caché preserves the starting point and, unless it is changed, uses it when you next start the shadow. That is, the shadow starts from the same location as the previous start. If you stop a shadow an hour or more after starting it, Caché deletes the starting point, therefore requiring you to re-specify the starting point before restarting the shadow.
• Suspend — Suspends a processing shadow. Contrary to stopping a shadow, when you suspend a shadow, Caché maintains its open transactions and journal files.
• Resume — Resumes a suspended shadow from where it left off. When a fatal error occurs, a shadow aborts, entering the suspended state. On Caché startup, a shadow that was not in the stopped state in the previous Caché session resumes automatically.

The following diagram shows the permissible actions on a shadow in each state. It indicates the shadow states with circles and shows the actions you perform on these states with arrows.

[Figure: Relationships of Shadow States and Permissible Actions]

There are two places in the System Management Portal where you can perform tasks on a defined shadow:


Shadow Journaling

• From the Configuration menu under System Administration tasks, choose Shadow Server Settings in the Connectivity column. See Shadow Administration Tasks for a description of the procedures.

• From the Operations menu on the [Home] page, choose Shadow Servers and then click This System as Shadow Server. See Shadow Operation Tasks for a description of the procedures.

4.3.1 Shadow Administration Tasks

The [Home] > [Configuration] > [Shadow Server Settings] page lists each defined shadow with its name, status, source name and port, start point, filter, and choices for performing actions on the shadow configuration. Click the following options to perform the indicated task:

• Edit — Allows updates to fields entered when you added a new shadow. See Configuring the Destination Shadow for descriptions of these settings.

• Start — Starts shadow processing; option available if the shadow is stopped.

• Stop — Stops shadow processing; option available if the shadow is processing.

• Delete — Deletes the entire shadow definition; you must stop the shadow before deleting the shadow definition.

4.3.2 Shadow Operations Tasks

You can monitor the shadowing operation status from both the source and destination servers of the shadow.

From the System Management Portal, navigate to the [Home] > [Shadow Servers] page. On this page you choose whether you are monitoring the shadowing process from the shadow side or the data-source side.

Manage the Destination Shadow

To monitor and manage the shadow process from the destination shadow:

1. Click This System as Shadow Server to display a list of source servers for this shadow machine.

   The [Home] > [Shadow Servers] > [Shadows] page lists each defined shadow with the Name, Status, Check point, Errors, Open Transactions, Latency, and choices for performing the operations on the shadow described in the following steps.

2. Click Details to view the specifics about this shadowing configuration.


3. Click Resume to resume shadow processing; option available if you have previously suspended shadow processing.

4. Click Suspend to suspend shadow processing; option available if the shadow is processing.

5. Click Errors to view a list of errors occurring on the destination shadow.

Monitor the Data Source

To monitor the shadow process from the data source:

1. Under the Data Source column of the [Home] > [Shadow Servers] page, you have two choices of information to display.

2. Click This System as Data Source to display a list of shadows defined for this data source.

   The [Home] > [Shadow Servers] > [Data Source] page lists each defined shadow with the Port, Shadow IP, Journal file, PID, Latency, and Shadowing Rate information.

3. Click Error Log to display the [Home] > [Shadow Servers] > [Data Source Errors] page, which lists errors reported on this data source.

Purging Shadow Journal Files

Caché purges the destination shadow copy of a source journal file once it is dejournaled, as long as the copy does not contain any transactions open on the shadow.

4.4 Using the Shadow Destination for Disaster Recovery

The makers of operating systems and computer hardware provide several appealing failover strategies for high availability. Caché is compatible with these strategies (see the "System Failover Strategies" chapter in this guide for more information). In addition, Caché provides capabilities to support disaster recovery strategies. Caché shadowing provides low-cost logical data replication over heterogeneous network configurations. A Caché shadow server can apply journals from several dissimilar platforms on a small-scale server over any TCP network.
Since only logical updates are conveyed to the destination shadow, the risk of propagating any structural problem is eliminated.

Consider the following limitations before deciding whether Caché shadowing best suits your disaster recovery strategy.


• The shadow server applies production journals asynchronously so as not to affect performance on the production server. This results in possible latency in the data applied to the shadow server, although it is generally at most seconds behind. Consequently, if you want to use the shadow server databases, they might be slightly out of date. This latency could increase if the shadow server's connection with the production server is lost for any sustained period. Caché provides mechanisms to monitor the state and progress of the shadow server to help you determine the risk of using the shadow server databases during disaster recovery.

• Open transactions may remain. Stopping shadowing implicitly rolls back incomplete transactions. However, once transactions are rolled back, shadowing cannot resume from its last position.

Enabling the shadow server to replace the production server is not automatic. The following procedure highlights how you might recover to the shadow server. If your database system functions as an application server, install identical applications on your shadow system to speed recovery.

To use your shadow system as a master database:

1. Follow the procedure for stopping shadowing on the shadow server.

2. Stop Caché and change the IP address and fully qualified domain name (FQDN) of the shadow system so that it exactly matches the original database system.

3. Restart Caché.
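The limitations above mean that the decision to promote a shadow hinges on how far it lags the source and whether open transactions would be rolled back. The following Python sketch is illustrative only — the 30-second threshold is an arbitrary example, and the inputs stand in for the latency and open-transaction figures shown on the shadow monitoring pages; it is not InterSystems tooling:

```python
# Illustrative pre-failover check, not InterSystems code.
# latency_seconds and open_transactions stand in for the values shown
# on the shadow monitoring pages; max_latency is an arbitrary example.

def safe_to_promote(latency_seconds, open_transactions, max_latency=30):
    """Rough go/no-go decision for using a shadow in disaster recovery.

    latency_seconds   -- how far the shadow lags the production journals
    open_transactions -- transactions still open on the shadow (these
                         are rolled back when shadowing is stopped)
    """
    if latency_seconds > max_latency:
        return False, "shadow too far behind; recent updates would be lost"
    if open_transactions > 0:
        return False, "open transactions would be rolled back; review first"
    return True, "ok to stop shadowing and promote"

ok, reason = safe_to_promote(latency_seconds=4, open_transactions=0)
```

The point of the sketch is simply that both conditions from the bullet list above should gate step 1 of the promotion procedure.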


5 System Failover Strategies

Caché fits into all common high-availability configurations supplied by operating system providers including Microsoft, IBM, HP, and EMC. Caché provides easy-to-use, often automatic, mechanisms that integrate easily with the operating system to provide high availability.

There are four general approaches to system failover. In order of increasing availability they are:

• No Failover
• Cold Failover
• Warm Failover
• Hot Failover

Each strategy has varying recovery time, expense, and user impact, as outlined in the following table.

Approach        Recovery Time   Expense               User Impact
No Failover     Unpredictable   No cost to low cost   High
Cold Failover   Minutes         Moderate              Moderate
Warm Failover   Seconds         Moderate to high      Low
Hot Failover    Immediate       Moderate to high      None

There are variations on these strategies; for example, many large enterprise clients have implemented hot failover and also use cold failover for disaster recovery.
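The table above is essentially a trade-off lookup between required recovery time and cost. As a toy illustration — plain Python, with the table's qualitative ratings hard-coded; the selection helper is an assumption for the example, not InterSystems tooling — a planner might encode it like this:

```python
# The failover-approach table from this section, encoded for lookup.
# The ratings are the qualitative values given in the guide; the helper
# below is an illustrative sketch, not InterSystems tooling.

APPROACHES = [
    # (name, recovery time, expense, user impact)
    ("No Failover",   "Unpredictable", "No cost to low cost", "High"),
    ("Cold Failover", "Minutes",       "Moderate",            "Moderate"),
    ("Warm Failover", "Seconds",       "Moderate to high",    "Low"),
    ("Hot Failover",  "Immediate",     "Moderate to high",    "None"),
]

def approaches_meeting(max_recovery):
    """List approaches at least as fast to recover as the given tier.

    max_recovery is one of the table's qualitative recovery-time values;
    the tiers are ordered from least to most available.
    """
    order = ["Unpredictable", "Minutes", "Seconds", "Immediate"]
    tier = order.index(max_recovery)
    return [name for name, recovery, _, _ in APPROACHES
            if order.index(recovery) >= tier]
```

For example, a requirement of seconds-level recovery leaves only the warm and hot failover approaches, which is why cost (the third column) usually decides between them.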


It is important to differentiate between failover and disaster recovery. Failover is a methodology to resume system availability in an acceptable period of time, while disaster recovery is a methodology to resume system availability when all failover strategies have failed.

If you require further information to help you develop a failover and backup strategy tailored for your environment, or to review your current practices, please contact the InterSystems Worldwide Response Center (WRC).

5.1 No Failover

With no failover in place, your Caché database integrity is still protected from production system failure. Structural database integrity is maintained by Caché write image journal (WIJ) technology. Logical integrity is maintained through global journaling and transaction processing. While WIJ, global journaling, and transaction processing are optional, InterSystems highly recommends using them.

If a production system failure occurs, such as a hardware failure, the database and application are generally unaffected. Disk degradation, of course, is an exception. Disk redundancy and good backup procedures are vital to mitigate problems arising from disk failure.

With no failover strategy in place, system failures can result in significant downtime, depending on the cause and your ability to isolate and resolve it. If a CPU has failed, you replace it and restart, while application users wait for the system to become available. For many applications that are not business-critical, this risk may be acceptable.
Customers that adopt this approach share the following common traits:

• Clear and detailed operational recovery procedures
• Well-trained, responsive staff
• Ability to replace hardware quickly
• Disk redundancy (RAID and/or disk mirroring)
• Enabled global journaling and WIJ
• 24x7 maintenance contracts with all vendors
• Expectations from application users who tolerate moderate downtime
• Management acceptance of the risk of an extended outage


Some clients cannot afford to purchase adequate redundancy to achieve higher availability. With these clients in mind, InterSystems strives to make Caché 100% reliable.

5.2 Cold Failover

A common and often inexpensive approach to recovery after failure is to maintain a standby system that assumes the production workload in the event of a production system failure. A typical configuration has two identical computers with shared access to a disk subsystem. After a failure, the standby system takes over the applications formerly running on the failed system. Microsoft Windows Clusters, HP MC/ServiceGuard, Tru64 UNIX TruClusters, OpenVMS Clusters, and IBM HACMP provide a common approach for implementing cold failover. In these technologies, the standby system senses a heartbeat from the production system on a frequent and regular basis. If the heartbeat consistently stops for a period of time, the standby system automatically assumes the IP address and the disk formerly associated with the failed system. The standby can then run any applications (Caché, for example) that were on the failed system. In this scenario, when the standby system takes over the application, it executes a preconfigured start script to bring the databases online. Users can then reconnect to the databases that are now running on the standby server. Again, WIJ, global journaling, and transaction processing are used to maintain structural and data integrity.

Customers generally configure the failover server to mirror the main server, with identical CPU and memory capacity, so that it can sustain production workloads for an extended period of time. The following diagram depicts a common configuration:


Cold Failover Configuration

State of PROD         FUNCTIONAL     OUT OF SERVICE
IP address of PROD    191.10.25.1    N/A
IP address of STDBY   191.10.25.50   191.10.25.1

Note: Shadow journaling, where the production journal file is continuously applied to a standby database, includes inherent latency and is therefore not recommended as an approach to high availability. Any use of a shadow system for availability or disaster recovery needs should take these latency issues into consideration.

5.3 Warm Failover

The warm failover approach exploits a standby system that is immediately available to accept user connections after a production system failure. This type of failover requires the concurrent access to disk files provided, for example, by OpenVMS clusters and Tru64 UNIX TruClusters.


In this type of failover, two or more servers, each running an instance of Caché and each with access to all disks, concurrently provide access to all data. If one machine fails, users can immediately reconnect to the cluster of servers.

A simple example is a group of OpenVMS servers with cluster-mounted disks. Each server has an instance of Caché running. If one server fails, the users can reconnect to another server and begin working again.

Warm Failover Configuration

Warm Failover

State                  A           B           C
Normal                 300 users   300 users   300 users
B fails                300 users   0 users     300 users
B users log on again   450 users   0 users     450 users

The 600 users on A and C are unaware of B's failure, but the 300 users that were on the failed server are affected.
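The user counts in the table above can be reproduced with a few lines of code. This Python sketch is illustrative only — the even-split reconnection policy is an assumption for the example, since in practice each displaced user simply logs on to any surviving server:

```python
# Illustrative sketch of the warm-failover user redistribution shown in
# the table above. The even split across surviving servers is an assumed
# policy for this example; real users reconnect to any live server.

def redistribute(users, failed):
    """Move the failed server's users onto the surviving servers."""
    displaced = users[failed]
    survivors = [s for s in users if s != failed]
    result = dict(users)
    result[failed] = 0
    share = displaced // len(survivors)
    for s in survivors:
        result[s] += share
    return result

users = {"A": 300, "B": 300, "C": 300}
after = redistribute(users, "B")   # {"A": 450, "B": 0, "C": 450}
```

With 300 users on each of A, B, and C, a failure of B yields 450 users on A and 450 on C, matching the last row of the table.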


5.4 Hot Failover

The hot failover approach can be complicated and expensive, but comes closest to ensuring 100% uptime. It requires the same degree of failover as a cold or warm failover, but also requires that the state of a running user process be preserved to allow the process to resume on a failover server. One approach, for example, uses a three-tier configuration of clients and servers.

Hot Failover Configuration

Thousands of users on terminal browsers connect through TCP sockets to a bank of application servers. Each application server has a backup server ready to start automatically in case of a server failure. In turn, the application servers are each connected to a bank of data servers, each with its own backup server.


If a data server fails, any application server waiting for a response automatically resubmits its request to a different data server while the backup server is started. Similarly, any user terminal that sends a request to an application server that fails automatically reissues its request to an alternate application server.


6 Cluster Management

This chapter contains information about cluster management in Caché. It discusses the following topics:

• Overview of Clusters
• Configuring a Caché Cluster
• Managing Cluster Databases
• Caché Startup
• Write Image Journaling and Clusters
• Cluster Backup
• System Design Issues for Clusters
• Cluster Application Development Strategies
• Caché ObjectScript Language Features
• DCP and UDP Networking

Caché clusters can be configured on both the OpenVMS and Tru64 UNIX platforms. This chapter contains information about cluster management in general for both platforms and some specifics for OpenVMS. For more detailed information on other cluster-related topics, see:

• Cluster Journaling
• Caché Clusters on Tru64 UNIX
• Caché and Windows Clusters


• ECP Failover

6.1 Overview of Clusters

Caché systems may be configured as a cluster. Cluster configurations provide special benefits to their users:

• Users can invisibly share disk storage and printers, or maintain private access to these resources.
• Users can share queues.
• Cluster software can be configured to search for the least used resource, maximizing usage of resources while simultaneously increasing throughput.
• In a cluster environment, each computer executes its own copy of software.

A Caché cluster is identified by its PIJ directory. Nodes that specify the same PIJ directory are all part of a cluster. A cluster session begins when the first cluster node starts and ends when the last cluster node shuts down.

The networking capabilities of Caché can be customized to allow cluster failover: if one computer in the cluster goes down, the remaining members continue to function without database degradation. Databases can be shared between cluster members. The computers in a cluster can be connected in the following ways:

• Special-purpose hardware, such as Memory Channels and Gigabit Ethernet, for high-speed communication
• SCSI bus based clusters
• Ethernet cables, for lower cost
• A combination of the above

The functionality provided is the same, regardless of which connection mechanisms are used.

System specifications for a cluster configuration:

• Maximum number of cluster nodes in a cluster: 14
• Maximum number of cluster-mounted databases: approximately 512


Important: Caché has specific builds for clustered systems. Verify that you obtain and install the proper build type.

6.1.1 Cluster Master

The first node running Caché that joins the cluster by attempting to mount a database in cluster mode becomes the cluster master. The cluster master performs the following functions:

• Acts as a lock server for all cluster-mounted databases.
• Coordinates write image journaling cluster-wide.
• Manages cluster failover.

If the cluster master fails or shuts down, the next node that joined the cluster becomes the cluster master and assumes the roles described.

A node joins a cluster when it starts its ENQ daemon system process. This process is activated the first time a node attempts to cluster-mount a database. The RECOVERY daemon, which manages cluster failover, is created on a node along with the ENQ daemon the first time a database is mounted in cluster mode. The ENQDMN and RECOVERY system processes are created only on systems that join a cluster.

The ENQ daemon uses a cluster-wide .PIJ file, which you must specify from the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal. Each node in the cluster must specify the same location for this file. The file is used to support cluster failover and recovery procedures, and by the write image journaling technology.

6.1.2 Cluster Master as Lock Server

The cluster master acts as a lock server by managing access to the cluster-mounted CACHE.DAT files. Applications that run in a cluster must have mechanisms to coordinate access from multiple cluster nodes to cluster-mounted databases. Caché accomplishes this at two levels:

• Block-level Locks — Caché manages block-level access to shared databases on disk for Caché applications running in a cluster environment. It prevents one node from reading or modifying a block from a disk that is simultaneously being changed in the memory of another node. Multiple nodes can read the same block, but only one can update it at a time.

  Caché manages these simultaneous access requests at the block level with the Distributed Lock Manager (DLM), using the ENQ daemon process called ENQDMN.


• Caché ObjectScript Level Locks — While each cluster member can directly access clustered databases, no member can independently process Caché ObjectScript Lock commands for clustered databases. The cluster master acts as a lock server by coordinating all Caché ObjectScript Lock requests so that the logical integrity of the cluster-mounted database is maintained.

  These requests are communicated to the cluster master via a network connection using a Caché server. Thus, ECP, or the older DCP, must be running on each computer that is participating in the cluster. Even if an application issues a Lock command on a global using the extended bracket syntax of [dir_name,dirset_name], or via a namespace mapped to a cluster-mounted database, the command is processed by the cluster master.

  If multiple global updates must be coordinated, you must use the Lock command when updating globals in cluster-mounted databases. Caché journaling technology uses Lock information to coordinate updates to these databases so that journal restores work correctly in the event of cluster failover or recovery after a cluster crash.

6.2 Configuring a Caché Cluster

If you are running in a Caché cluster, you must set up a network configuration. It is easiest to set up your network configuration prior to creating your system configuration. Set up a Caché server configuration that links all the computers in your Caché cluster using ECP.

By default, the Caché instance acts as a single system until you set specific cluster-related setting values. On each cluster node, perform the following tasks:

1. Navigate to the [Home] > [Configuration] > [Advanced Settings] page of the System Management Portal.

2. Select Clusters from the Category list.

3. Enter a value for CommAddr, the IP address to advertise in the PIJ to the other cluster members.

4. Set JoinCluster to true.

5. Enter the PIJDirectory location; this must be the same on each cluster node and is required when JoinCluster is set to true. The directory must exist.

If your network configuration contains multiple network devices, you must make sure that each cluster node is identified in every other cluster node.
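Because every node must advertise a CommAddr and agree on a single PIJDirectory (nodes with different PIJ directories would form separate clusters), a consistency check over the planned per-node settings can catch mistakes before startup. The Python sketch below is illustrative only — the dictionary layout is an assumption for the example, not a Caché configuration format:

```python
# Illustrative consistency check for the cluster settings described in
# steps 1-5 above. The node/setting layout is an assumption for this
# example, not a Caché configuration format.

def check_cluster_settings(nodes):
    """nodes maps node name -> {CommAddr, JoinCluster, PIJDirectory}."""
    problems = []
    pij_dirs = set()
    for name, cfg in nodes.items():
        if not cfg.get("JoinCluster"):
            problems.append(f"{name}: JoinCluster is not true")
            continue
        if not cfg.get("CommAddr"):
            problems.append(f"{name}: no CommAddr to advertise in the PIJ")
        if not cfg.get("PIJDirectory"):
            problems.append(f"{name}: PIJDirectory is required")
        else:
            pij_dirs.add(cfg["PIJDirectory"])
    if len(pij_dirs) > 1:
        # Nodes with different PIJ directories would form separate clusters.
        problems.append(f"PIJDirectory differs across nodes: {sorted(pij_dirs)}")
    return problems

nodes = {
    "NODE1": {"CommAddr": "191.10.25.1",  "JoinCluster": True,
              "PIJDirectory": "/cluster/pij"},
    "NODE2": {"CommAddr": "191.10.25.50", "JoinCluster": True,
              "PIJDirectory": "/cluster/pij"},
}
```

The single-PIJ-directory rule is the one most worth automating: it is what actually defines cluster membership, as the overview section notes.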


6.3 Managing Cluster Databases

The following sections provide information on database management within a cluster-networked system. Most examples are for OpenVMS Caché shared-all clusters.

6.3.1 Creating Caché Database Files

In a cluster, all CACHE.DAT database files must be created on disks that are cluster-accessible at the system level. Type the name of the OpenVMS directory where you wish to create a new CACHE.DAT file.

If you are using a directory on the current disk, just type the directory name. If you are using a directory on another disk, or a disk connected to another cluster node, type the device name and the directory name as follows:

device_name:[DIRNAME]

On an OpenVMS cluster, the device portion contains a controller name, regardless of whether the CACHE.DAT directories are cluster mounted.

For disks that are physically connected to one node in the cluster, that name is the same as the (system communications services) SCS node name of the computer serving the disk, and the parts are separated by a dollar sign.

For example: DKA100, if physically served by node TEST, is known as TEST$DKA100:. Caché expands DKA100: to TEST$DKA100:.

If the disk is served by an independent controller array, it has a number and is both preceded and separated by dollar signs.

For example: DKA100: on cluster controller 1 is $1$DKA100:.

6.3.2 Mounting Databases

When you create a database, you assign one of two intrinsic modes: automatic or explicit. Automatic is the default and is the same as for nonclustered systems; databases are mounted as they are referenced. If a database is marked for explicit mounting, a reference to a database that is not mounted results in an error. Such databases must be mounted explicitly via the mount command, or at system startup via the database mount list, which is part of the network dataset configuration.


In a Caché cluster, all Caché databases must be on a disk that is mounted as cluster-accessible to all members of the Caché cluster, even if they are privately mounted at the Caché level. Include the following in your database mount list:

• All databases to be cluster-mounted by any cluster node.
• Any database namespace that contains implicitly mapped globals.

Caché mounts newly created databases privately. You can remount private databases as cluster databases using the mount utility. From the [Home] > [Databases] page of the System Management Portal, click Dismount and then Mount.

Cluster databases should be mounted at startup as clustered:

1. Navigate to the [Home] > [Configuration] > [Local Databases] page of the System Management Portal.

2. Click Edit in the appropriate database row.

3. Select the Mount Required at Startup check box.

6.3.3 Deleting a Cluster-Mounted Database

A cluster-mounted database, a CACHE.DAT file, cannot be deleted. If you attempt to delete it, the following message is displayed:

## ERROR while Deleting. Cannot delete a cluster-mounted database

The database must be dismounted or privately mounted before you can delete it.

6.4 Caché Startup

Once the network configuration is determined, the Caché startup procedure does the following:

1. Performs network initialization operations, including activation of the network daemons.

2. Mounts databases configured with the Mount Required at Startup check box selected. Caché displays information about each database it mounts. For example:

   Directory                     Mode
   VMS1$DKA0:[SYSM.V6D1-9206A]   Pvt
   VMS$DKA0:[DIR2]               Clu


If mount error conditions occur, they are reported to the terminal and to the cconsole.log. If the ENQ daemon fails to start, see the cconsole.log.

The first node to activate its ENQ daemon by cluster-mounting a Caché database becomes the cluster master for each cluster member. Normally, you include all cluster-mounted databases in the Database Mount List and they are mounted at startup.

Startup pauses with a message if you attempt to join a cluster during cluster failover.

6.5 Write Image Journaling and Clusters

Caché write image journaling allows remaining cluster members to continue to function without database degradation or data loss if one cluster member goes down.

In the cluster environment, the Write daemon on the first node to cluster-mount a database becomes the master Write daemon for the cluster; it creates the cluster-wide journal file, named CACHE.PIJ. In addition, each node, including the master, has its own image journal file called CACHE.PIJxxx.

In a cluster environment, writes throughout the entire cluster freeze until the cause of the freeze is fixed.

For more information, see the "Write Image Journaling and Recovery" chapter of the Caché High Availability Guide.

6.6 Cluster Backup

For privately mounted databases in a cluster, backups and journaling are the daily operations that allow you to recreate your database. In the event of a system failure that renders your database inaccessible, you can restore the backups and apply the changes in the journal to recreate it.

CAUTION: Always run backups of cluster-mounted databases from the same machine in a cluster so that the backup history, which is stored in a global in the manager's database, is complete.


If you are doing a full backup of a database that is mounted in cluster mode from multiple computers, always perform the backup from the same computer. This maintains an accurate backup history for the database.

The BACKUP utility permits you to back up and restore databases that are shared by multiple CPUs in a cluster environment.

Note: For cluster-mounted databases, InterSystems recommends another backup strategy, such as volume shadowing. Concurrent backup also works with clusters.

All databases must be mounted before you can back them up. The backup utility mounts any databases needed for the backup. It first tries to mount them privately; if that action fails, it mounts them for clustered access. If a private mount fails and the system is not part of the cluster, or if the cluster mount fails, then you cannot back up the database. You receive an error message, and can choose whether to continue or stop.

When backing up cluster-mounted databases, BACKUP must wait for all activity in the cluster to cease before it continues. For this reason, clustered systems may be suspended slightly longer during the various passes than when you back up a single node.

The DBSIZE utility gives you the option of suspending the system while it makes its calculation. It also lets you suspend the cluster if any of the databases in the backup list is cluster-mounted when the calculation takes place.

The incremental backup software uses a Lock to prevent multiple backups from occurring at the same time. This method does not work across a cluster. You must ensure that only one backup at a time runs throughout an entire cluster whose members share the same database. The DBSIZE utility uses the same internal structures as the BACKUP utility, and DBSIZE tests the lock used by BACKUP. However, the same restriction applies: do not run DBSIZE on one cluster member while another cluster member is running a backup. Otherwise, the backup will not be intact, and database degradation may result when you restore from that backup.

6.7 System Design Issues for Clusters

Be aware of the following design issues when configuring your Caché cluster system. You may need to adjust your system parameters for your clustered system. Reference the appropriate platform appendix of the Caché Installation Guide for recommended calculations for system parameters.


6.7.1 Determining Database File Availability

In order to mount the database files properly so they function most efficiently in the cluster, determine which CACHE.DAT files need to be available to all users in the cluster. Mount these in cluster mode from within Caché. All wij, pij, and journal files must be on cluster-mounted disks.

Also determine which CACHE.DAT files are needed by users on only one cluster node. Mount these privately, or specify that they are automatic so that the system mounts them on reference.

6.8 Cluster Application Development Strategies

The key to performance in a cluster environment is to minimize disk contention among nodes for blocks in cluster-mounted directories.

6.8.1 Block Level Contention

If a directory is cluster-mounted, all computers can access data in it with a simple reference. More than one computer can access a given block in the database at a time to read its data. However, if a computer wants to update a block, all other computers must first relinquish the block. If another computer wants to access that block prior to the completion of the Write daemon cycle, the computer that did the update must first write the changed block to disk (in such a way that the action can be reversed if that computer goes down). The other computers can again read that block until one of them wants to modify it.

If there is a great deal of modification done to a database from all cluster members, a significant amount of time-consuming I/O processing occurs to make sure each member sees the most recent copy of a block.

You can use various strategies to minimize the amount of disk I/O when a particular database is modified by multiple computers.

Mount Directories Privately

If a database is not used frequently by other nodes, mount the database privately from the node that uses it most frequently. When other nodes need to access it, they can use a remote network reference.


Cluster ManagementUse Local Storage of CountersContention for disk blocks is most common in the case of updating counters. To minimizethis problem, code your applications so that groups of counters (for example, 10) are allocatedper remote request to a local private directory. Thereafter, whenever a local process needs anew counter index number, it first checks the private directory to see if one of the ten isavailable. If not, it then goes to allocate a new set of ten counters from the clustered directory.You can use $INCREMENT to update counters and retrieve the value in a single operation.Note:This is also a good strategy for nonclustered networked systems.In addition to reducing contention when accessing counters, this technique also enhancesaccess of records that use those counters. Since a system obtains contiguous counters, blocksplitting combined with the <strong>Caché</strong> collating sequence work causes records created by differentnodes to be located in different areas of the database. 
Therefore, processes on different nodes do their SETs and KILLs into different blocks and are no longer in contention, thus reducing disk I/O.

6.9 Caché ObjectScript Language Features

The following sections provide information about Caché ObjectScript language features with implications for cluster-mounted database systems.

6.9.1 Remote Caché ObjectScript Locks

The following information details remote locks under Caché ObjectScript, with respect to a cluster environment.

6.9.1.1 Remote Lock Handling

Information about remote locks is stored in two places:

• In the Lock Table on the system requesting the lock (the client)
• In the Lock Table on the system to which the lock request is directed (the server)

The server for items in a cluster-mounted database is always the Lock Server (Cluster Master). When a process on the client system needs to acquire a remote lock, it first checks to see if an entry is already present in the client lock table, indicating that another process on that same computer already has a lock on the remote item.


If a lock already exists for the desired global, the process queues up for that lock, just as it would for a local lock. No network transmissions are required.

If the needed remote lock is not present in the client's lock table, the client process creates an entry in the local lock table and sends a network request for it.

If the reference resolves to a careted lock in a cluster-mounted database, the lock request is automatically sent to the Cluster Master.

Once the client process receives the lock acknowledgment from the remote computer, an entry identifying the process making the lock is present in its own (client) lock table, and an entry identifying the remote computer (but not the process) that made the lock exists in the server's lock table.

If any of the network requests fail, the client process must remove all the locks from the local lock table. When locking multiple items at one time, it must also send network unlock requests for any network locks it actually acquired.

When a process has completed its update, it issues an unlock command. If it is an incremental unlock, it is handled in the local lock table. If it is the last incremental unlock, or if it is not an incremental unlock, an unlock request is sent to the server.

Note: If another process on the local machine has queued for the lock, rather than releasing the lock on the server, Caché may grant it to the waiting process. This is called lock conversion.

6.9.1.2 Remote Lock Commands by Extended Reference

All extended references used in remote lock commands should use the same directory specification. This includes consistency between uppercase and lowercase. For example, "VMS2$SYS" is not equivalent to "vms2$sys".

If you use logicals, all processes and applications must use the same logical name, not just resolve to the same physical directory name.
In addition, logicals must be defined the same way on all cluster members, as well as by all processes running on each member. System managers and application developers need to work together to maintain consistency.

This limitation is consistent with the ANSI standard regarding Caché ObjectScript locks and remote reference syntax.

Note: In a cluster, references to remote globals on cluster-mounted databases can be made as simple references. However, certain techniques you may want to use to minimize disk contention require the use of extended references.
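As a sketch of the consistency rule above, the following hedged example acquires and releases a lock by extended reference; the directory specification "VMS2$SYS" and the global ^ORD are illustrative, and the directory string must be spelled identically (including case) everywhere it is used:

```
 ; Hedged sketch: lock a global by extended reference, with a timeout.
 ; "VMS2$SYS" and ^ORD are illustrative names.
 LOCK +^["VMS2$SYS"]ORD(1):10  ; 10-second timeout sets $TEST
 IF '$TEST WRITE "lock request timed out",! QUIT
 ; ... update the record under the lock ...
 LOCK -^["VMS2$SYS"]ORD(1)
```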


6.10 DCP and UDP Networking

If you are configuring a cluster with the legacy DCP technology, InterSystems recommends that you always specify the local address in the network definition as the IP address which the other cluster members use to talk to the local machine.

You can determine the default system IP address using the following commands:

 s hostname=$ZU(54,0)                  ;get our host name
 s ipaddr=$P($ZU(54,13,hostname),",")  ;lookup 1st IP address

If UDP networking is being used for the cluster network traffic, a node cannot use the IP address 0.0.0.0 when it joins the cluster. If the network definition for a system specifies 0.0.0.0 as the local IP address, Caché attempts to determine the real IP address for this system when it tries to join a cluster.

If this system has only one IP address, this is not an issue. If this system has multiple IP addresses, Caché picks the first one returned by gethostbyname() to use in the cluster. If this is the incorrect IP address, a cluster crash is declared. You can avoid this problem by always specifying the local address in the network definition.


7 Cluster Journaling

This chapter contains information about journaling on ECP-based shared-disk clustered systems in Caché. It discusses the following topics:

• Journaling on Clusters
• Cluster Failover
• Cluster Shadowing
• Tools and Utilities

For related information, see the following chapters in this guide:

• Journaling
• Shadow Journaling
• Cluster Management
• Backup and Restore

7.1 Journaling on Clusters

Journaling is necessary for cluster failover to bring the databases up to date and to use transaction processing. Each node in a cluster maintains its own journal files, which must be accessible to all other nodes in the cluster to ensure recoverability. A cluster session ID (CSI) is the time when the session begins, that is, the cluster start time; it is stored in the header of every journal file on a clustered system.


In addition to the information journaled on a nonclustered system, the following specifics apply to clustered systems:

• Updates to clustered databases are always journaled (usually on the master node only), except for scratch globals. On a cluster database, even globals whose database journaling attribute is No are journaled, regardless of whether they are updated outside or within a transaction.

• Database updates via the $Increment function are journaled on the master node, as well as on the local node if it is not the master.

• Other updates are journaled locally if so configured.

The journal files on clustered systems are organized using the following:

• Cluster Journal Log
• Cluster Journal Sequence Numbers

The default location of the journal files is the manager's directory of the Caché instance. InterSystems recommends isolating journal files from the database files (CACHE.DAT files) by changing the journal file location to a separate disk before any activity takes place on the system.

CAUTION: Do not stop journaling in a cluster environment, although it is possible to do so with the ^JOURNAL routine. If you do, the recovery procedure is vulnerable until the next backup.

7.1.1 Cluster Journal Log

Journal files used by members of a cluster are logged in a file, CACHEJRN.LOG, located in the cluster pre-image journal (PIJ) directory. It contains a list of journal files maintained by nodes while they are part of the cluster.
Journal files maintained by a node when it is not part of the cluster may not appear in the cluster journal log.

Here is an example of part of a log:

0,_$1$DRA1:[TEST.50.MGR.JOURNAL]20030913.004
1,_$1$DKA0:[TEST.50.MGR.JOURNAL]20030916.002
0,_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.001
0,_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.002
1,_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.002
1,_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.003

The first value in each comma-delimited row is the cluster system number (CSN) of the node to which the journal file, given in the second field, belongs. The log is useful for locating journal files of all the members of a cluster, especially the members that have left the cluster. The CSN of a node may change when it restarts.

When a node joins the cluster, its current journal file is added to the journal log. Processes that start journaling or switch journal files also add entries. The log is used in a cluster journal restore, by shadowing, and by the journal dump utility.

7.1.2 Cluster Journal Sequence Numbers

InterSystems recommends locking globals in cluster-mounted databases if you require a record of the sequence of updates. The cluster journal sequence number is the tool Caché uses to record the time-sequencing of updates in the journal files of the cluster nodes. If cluster failover occurs, the journals of all the nodes can be applied in the proper order. Updates may not be dejournaled in the same order as they originally occurred, but they are valid with respect to the synchronization guaranteed by the Lock command and the $Increment function.

To restore clustered databases properly from the journal files, the updates must be applied in the order they occurred. The cluster journal sequence number, which is part of every journal entry of a database update, and the cluster session ID, which is part of the journal header, provide a way to sequence transactions from the journal files of all the cluster members. For each cluster session, the sequence number starts at 1 and can go as high as 18446744073709551615 (that is, 2**64-1).

The master node of a cluster maintains a master copy of the sequence number, which is incremented during database mounting and with each use of Lock and $Increment. The master value of the sequence propagates to one or all cluster nodes, depending on the type of operation.

The sequence number is used by cluster journal restore and by shadowing, which is a special form of journal restore.
Both utilities operate on the assumption that the sequence number increases monotonically during a cluster session.

At the end of a backup, the cluster journal sequence number on all cluster nodes is incremented to be higher than the previous master cluster journal sequence number. Then a journal marker bearing the new cluster journal sequence number is placed in the current local journal file. In a sense, the journal marker serves as a barrier of cluster journal sequence numbers, separating the journaled database updates that are covered in the backup from those that are not. Following the restore of the backup, cluster journal restore can start from the cluster journal sequence number of the journal marker and move forward.

You can also set your own journal markers using the ^JRNMARK utility. See Setting Journal Markers on a Clustered System for details.
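The ordering property can be illustrated with a small sketch that merges per-node journal entries into cluster-wide order by using the sequence number as a global subscript. All names here are illustrative; real restores use the ^JRNRESTO utility described later in this chapter.

```
MergeDemo ; Hedged illustration of ordering updates by cluster journal
 ; sequence number; entries() and ^tmpSeq are made-up names.
 NEW node,seq
 KILL ^tmpSeq
 ; per-node lists: entries(node,sequence number)=update description
 SET entries(1,5)="node 1: Set ^A=1"
 SET entries(1,9)="node 1: Kill ^B"
 SET entries(2,7)="node 2: Set ^A=2"
 SET node="" FOR  SET node=$ORDER(entries(node)) QUIT:node=""  DO
 . SET seq="" FOR  SET seq=$ORDER(entries(node,seq)) QUIT:seq=""  DO
 . . SET ^tmpSeq(seq)=entries(node,seq)
 ; $ORDER over ^tmpSeq now visits the updates in order 5, 7, 9
 QUIT
```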


7.2 Cluster Failover

The Caché cluster failover process protects the integrity of data on other cluster nodes when one cluster member fails. It allows the remaining cluster members to continue to function.

The following conditions must be met for cluster failover to work successfully:

• All directories containing CACHE.DAT files must be accessible to all surviving nodes.
• Journaling must be enabled at all times while Caché is running.
• Networking must be properly configured.

If a cluster member fails, the cluster master executes cluster failover. If the master is the failing node, the cluster member that least recently joined the cluster becomes the new master and executes failover. Cluster failover consists of two phases.

In the first phase, the cluster master does the following:

• Checks the cluster PIJ and the write image journal files (CACHE.WIJ) on each node to determine what recovery is needed from these files.
• Executes recovery from the WIJ files to all databases that had been mounted in cluster mode.

If an error occurs during this phase, the cluster crashes and further Cluster Recovery must take place.

In the second phase, the cluster master does the following:

• Mounts databases in private mode, as required, to restore the journals of all cluster members.
• Attempts to mount the databases in cluster mode if it cannot mount them in private mode.
• Restores any Caché journal entries after the current index kept in the CACHE.WIJ file for each cluster member's journal. For details, see Cluster Restore.
• Rolls back incomplete transactions in the failed node's journal file.
• Reforms the lock table if it is the new cluster master; otherwise, it discards the locks of the failing node.

During failover, the journals from all cluster members are applied to the database and any incomplete transactions are rolled back.
If cluster failover completes successfully, there is no database degradation or data loss from the surviving nodes. There is only minimal data loss from the failing node (typically less than the last second), which is not visible to other cluster members.

If failover is unsuccessful, the cluster crashes and you must shut down all cluster nodes before restarting the cluster. See the Failover Error Conditions section for more details.

7.2.1 Cluster Recovery

Recovery occurs when Caché stops on any cluster member. The procedure changes depending on how Caché stops. During successful failover, the recovery procedure is fairly straightforward and automatic. If, however, a clustered system crashes, the recovery is more complex.

Following a cluster crash, a clustered Caché node cannot be restarted until all of the nodes in the cluster have been stopped. If a cluster member attempts Caché startup, or tries to join the cluster by cluster-mounting a disk, before all other cluster members have been stopped, the following message is displayed:

ENQ daemon failed to start because cluster is crashed.

Once all members are stopped, start each node. The first node that starts runs the Caché recovery procedure if it detects that there was an abnormal system shutdown. This node becomes the new cluster master. Cluster members that are not the cluster master are frequently referred to as slave nodes.

The Recovery daemon (RCVRYDMN) performs recovery on the surviving or new master node, based on whether the crashed node was the master or a slave. To enable recovery, the node's databases and WIJ files must be accessible cluster-wide. The master is responsible for managing the recovery cluster-wide, based on the WIJ file on each node.

When a slave crashes, the Recovery daemon on the master node does the following:

1. It uses the journal information provided by the WIJ files to apply journal files on all cluster nodes (including the one that crashed) from a common starting point, as with all cluster journal restores.

2. It rolls back all incomplete transactions on the crashed system.
Again, the rollbacks are journaled, this time on the host system of the Recovery daemon. For this reason, if you restore journal files, it is safer to do a cluster journal restore than a stand-alone journal restore, as the rollback of an incomplete transaction in one node's journal may be journaled on another node.

When the (former) master crashes, the Recovery daemon does the following:

1. As in step 1 above, it applies the journal files on all the cluster nodes.


2. In between the two steps above, it adjusts the cluster journal sequence number on its host system, which is the new master, so that it is higher than that of the last journaled entry on the crashed system, which was the old master. This guarantees the monotonically increasing property of the cluster journal sequence in cluster-wide journal files.

3. As above, all incomplete transactions are rolled back on the crashed system.

If the last remaining node of a cluster crashes, restarting the first node of the cluster involves cluster journal recovery, which includes rolling back any transactions that were open (uncommitted) at the time of the crash.

7.2.2 Cluster Restore

Typically, journal files are applied after backups are restored to bring databases up to date or up to the point of a crash. If nodes have not left or joined a cluster since the last backup, you can restore the journal files starting from the marker corresponding to the backup. If one or more nodes have joined the cluster since the last backup, the restore is more complex.

A node joins a cluster either when it restarts with a proper configuration or when it cluster-mounts a database after startup (as long as you properly set up other parameters, such as the PIJ directory, at startup). To make journal restore easier in the latter case, switch the journal file on the node as soon as it joins the cluster.

For each node that has joined the cluster since the last backup of the cluster:

1. Restore the latest backup of the node.

2. If the backup occurred before the node joined the cluster, restore the private journal files from where the backup ends, up to the point when the node joined the cluster. (You can make this easier by switching the journal file when the node joins the cluster.)

3. Restore the latest cluster backup.

4. Restore the cluster journal files starting from where the backup ends.

See Cluster Journal Restore for detailed information about running the utility.

This procedure works well for restoring databases that were privately mounted on nodes before they joined the cluster and then cluster-mounted after that node joined the cluster. It is based on the following assumptions:

• A cluster backup covers all cluster-mounted databases, and a system-only backup covers private databases that, by definition, are not accessible to other systems of the cluster.

• The nodes did not leave and rejoin the cluster since the last cluster backup.


In more complicated scenarios, these assumptions may not be true. The first assumption becomes false if, say, rather than centralizing backups of all cluster-mounted databases on one node, you configure each node to back up selected cluster-mounted databases along with its private databases.

In this case, you may have to take a decentralized approach by restoring one database at a time. For each database, the restore procedure is essentially the same:

1. Restore the latest backup that covers the database.

2. Restore the private journal files up to the point when the node joined the cluster, if it postdates the backup.

3. Restore the cluster journal files from that point forward.

CAUTION: Even if the database has always been privately mounted on the same node, it is safer to restore the cluster journal files than to apply only the journal files of that node. If the node crashed or was shut down while it was part of the cluster, open transactions on the node would have been rolled back by, and journaled on, a surviving node of the cluster. Restoring the cluster journal files ensures that you do not miss such rollbacks in the journal files of other nodes.

InterSystems does not recommend or support the scenario where a node joins and leaves the cluster multiple times.

7.2.3 Failover Error Conditions

When a cluster member fails, the other cluster members notice a short pause while failover occurs. In rare situations, some processes on surviving cluster nodes may receive errors.
You can trap these errors with the $ZTRAP error-trapping mechanism.

Cluster failover does not work if one of the following is true:

• One or more cluster members go down during the failover process.
• There is a disk drive failure and the first failover phase encounters an error.
• One of the surviving cluster members does not have a Recovery daemon.

If failover is unsuccessful, the cluster crashes and the following message appears at the operator's console:

****** Caché : CLUSTER CRASH - ALL Caché SYSTEMS ARE SUSPENDED ******
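A process that must survive such transient failover errors can install an error trap before updating clustered globals. In the following hedged sketch, $ZTRAP and $ZERROR are standard Caché ObjectScript, while the routine labels and the global ^ORD are assumed names:

```
Update ; Hedged sketch: trap errors (e.g. during failover) with $ZTRAP
 NEW $ZTRAP SET $ZTRAP="UpdErr"
 SET ^ORD($INCREMENT(^ORD))="new order" ; ^ORD is illustrative
 QUIT
UpdErr ; control transfers here on any error raised in Update
 WRITE "update failed: ",$ZERROR,!
 QUIT
```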


The other cluster members freeze when Caché processes reach a Set or Kill command. Examine the failover log, which is contained in the console log (normally cconsole.log in the manager's directory) of the cluster master, to see the error messages generated during the failover process.

If a cluster member that failed attempts startup, or a node tries to join the cluster by cluster-mounting a database while the cluster is in failover, the following message is displayed:

The cluster appears to be attempting to recover from the failure of one or more members at this time. Waiting 45 seconds for failover to complete...

A period (.) appears every five seconds until the active recovery phase completes. The mount or startup then proceeds.

If the cluster crashes during this time, the following message is displayed:

ENQ daemon failed to start because cluster is crashed.

See the Cluster Recovery section for an explanation of what happens when a cluster crashes.

7.3 Cluster Shadowing

The use of journaling in a clustered system also makes it possible to shadow a Caché cluster. In a cluster, each node manages its own journal files, which contain data involving private or cluster-mounted databases. The shadow mirrors the changes to the databases (assuming all changes are journaled) on a Caché system that is connected to the cluster via TCP. The following diagram gives an overview of the cluster shadowing process:


[Figure: Cluster Shadowing Overview]

The destination shadow connects to the specified Caché SuperServer on the cluster, requesting a list of journal files at or after the specified start location (the combination of cluster start time and cluster journal sequence number), one starting file for each cluster member.

For each node (with a unique CSN) returned from the source cluster, the shadow starts a copier process that copies journal files, starting with the file returned, from the server to the shadow. Each copier acts as a semi-independent shadow itself, similar to a nonclustered block-mode shadow.

Once all copiers are up and running, the cluster shadow starts a dejournaling process that applies journal entries from the copied journal files to the databases on the shadow side, respecting the cluster journal sequence numbers of each journal record. The cluster shadow maintains a list of current live members (including port numbers and IP addresses) of the cluster, which it receives from the source cluster.


The following sections describe what information is necessary and the procedures involved in setting up a cluster shadow, as well as the limitations to the completeness and timeliness of the shadow databases:

• Configuring a Cluster Shadow
• Cluster Shadowing Limitations

Note: The shadow does not have to be a clustered system. The word "cluster" in cluster shadowing refers to the source database server, not the shadow.

7.3.1 Configuring a Cluster Shadow

You must provide several types of information to properly configure a cluster shadow. An overview of the required data items is divided into the following categories:

Establishing a Connection

Although a cluster is identified by the PIJ directory that all its member nodes share, the uniqueness of the identifier does not go beyond the physical cluster that hosts the Caché cluster. The shadow needs a way to make a TCP connection to the source cluster; therefore, on the shadow you must specify the IP address or host name of one member of the Caché cluster and the port number of the SuperServer running on that member. Also provide the shadow with a unique identity to distinguish it from other shadows, if any, in the same Caché instance.

Identifying the Starting Location

Configure the shadow to identify the journal starting location for dejournaling: a cluster start time (CSI) and, optionally, a cluster journal sequence number. If you do not specify a cluster journal sequence number, dejournaling starts at the beginning of the cluster session.

Copying Journal Files

Similar to noncluster shadowing, specify a directory to put the journal files copied over from the cluster. However, a single directory is not adequate; journal files from different members of the cluster must be kept separate. The directory you specify serves as the parent of the directories for the shadow copies of the journal files.
In fact, the shadow creates directories on the fly to keep up with the dynamic nature of the cluster components.

At run time, for each journal directory on the server cluster, the shadow sets up a distinct subdirectory under the user-specified parent directory and copies journal files from a journal directory on the server to its corresponding directory on the shadow; this is called redirection of journal files. The subdirectories are named by sequential numbers, starting with 1. You cannot override a redirection by specifying a different directory on the shadow for a journal directory on the server.

Redirecting Dejournaled Transactions

As with nonclustered shadowing, specify database mappings, that is, redirections of dejournaled Set and Kill transactions.

There are two ways to provide the information to set up a cluster destination shadow:

• Using the System Management Portal
• Using Caché Routines

7.3.1.1 Using the System Management Portal

You can configure a shadow server using the System Management Portal. Perform the following steps:

1. From the [Home] > [Configuration] > [Shadow Server Settings] page of the System Management Portal, follow the procedure described in the Configuring the Destination Shadow section of the "Shadow Journaling" chapter of this guide. Use the following specifics particular to cluster shadowing:

   a. Database Server — Enter the IP address or host name (DNS) of one member of the source cluster to which the shadow will connect.

   b. Database Server Port # — Enter the port number of the source specified in the previous step.

2. After entering the location information for the source instance, click Select Source Event to choose where to begin shadowing. A page displays the available cluster events from the cluster journal file directory.

3. Click Advanced to enter the following optional field:

   • Journal file directory — Enter the full name, including the path, of the journal file directory on the destination shadow system, which serves as the parent directory of shadow journal file subdirectories, created automatically by the shadow for each journal directory on the source cluster. Click Browse for help in finding the proper directory.

4. After you successfully save the configuration settings, add database mapping from the cluster to the shadow.


5. Next to Database mapping for this shadow, click Add to associate the database on the source system with the directory on the destination system using the Add Shadow Mapping dialog box.

6. In the Source database directory box, enter the physical pathname of the source database file (the CACHE.DAT file). Enter the pathname of its corresponding destination shadow database file in the Shadow database directory box, and then click Save.

7. Verify any pre-filled mappings and click Delete next to any invalid or unwanted mappings. Shadowing requires at least one database mapping to start.

8. Start shadowing.

7.3.1.2 Using Caché Routines

You can also use the shadowing routines provided by Caché to configure the cluster shadow. Each of the examples in this section uses the technique of setting a return code as the result of executing the routine, after which you check the return code for an error or for 1, indicating success.

To initially configure the cluster shadow:

Set rc=$$ConfigCluShdw^SHDWX(shadow_id,server,jrndir,begloc)

Where:

• shadow_id — A string that uniquely identifies the shadow.

• server — The SuperServer port number and IP address or host name of one cluster node, delimited by a comma.

• jrndir — The parent directory of shadow journal file subdirectories, where journal files fetched from the source cluster are stored on the destination shadow, one subdirectory for each journal directory on the source cluster.

• begloc — The beginning location, consisting of a cluster session ID (cluster startup time, in YYYYMMDD HH:MM:SS format) and a cluster sequence number, delimited by a comma.

You can run the routine again to change the values of server, jrndir, or begloc.
If you specify a new value for jrndir, only subsequent journal files fetched from the source are stored in the new location; journal files already in the old location remain there.

You can also direct the shadow to store journal files fetched from the journal directory remdir on the source in a local repository locdir, instead of the default subdirectory of jrndir. Again, this change affects only journal files to be fetched, not the journal files that have been or are being fetched.

Set rc=$$ConfigJrndir^SHDWX(shadow_id,locdir,remdir)

The last mandatory piece of information about a shadow, Set and Kill transaction redirection, can be given as follows:

Set rc=$$ConfigDbmap^SHDWX(shadow_id,locdir,remdir)

This specifies that Set and Kill transactions from the source directory remdir be redirected to the shadow directory locdir. Unlike journal files, there is no default redirection for a source database; if it is not explicitly redirected, Set and Kill transactions from that database are ignored by the dejournaling process of the shadow.

Finally, to start and stop shadowing:

Set rc=$$START1^SHDWCLI("test")
Set rc=$$STOP1^SHDWCLI("test")

See Shadow Information Global and Utilities for more information.

7.3.2 Cluster Shadowing Limitations

There are a few limitations in the cluster shadowing process.

Database updates that are journaled outside a cluster are not shadowed. Here are two examples:

• After a cluster shuts down, if a former member of the cluster starts up as a stand-alone system and issues updates to some (formerly clustered) databases, the updates do not appear on the shadow.

• After a formerly stand-alone system joins a cluster, the new updates made to its private databases appear on the shadow (if they are defined in the database mapping), but none of the updates made before the system joined the cluster appear. For this reason, joining a cluster on the fly (by cluster-mounting a database) should be planned carefully in coordination with any shadow of the cluster.

In cluster shadowing, there is latency that affects the dejournaler. Journal files on the destination shadow side are not necessarily as up to date as what has been journaled on the source cluster. The shadow applies production journals asynchronously so as not to affect performance on the production server.
This results in possible latency in data applied to the shadow.


Only one Caché cluster can be the target of a cluster shadow at any time, although there can be multiple shadows on one machine. There is no guarantee regarding the interactions between the multiple shadows; it is the user's responsibility to ensure that they are mutually exclusive.

Note: Exclude Caché databases from RTVScan when using Symantec Antivirus software to avoid the condition of the cluster shadow hanging on Windows XP. See the Release Notes for Symantec AntiVirus Corporate Edition 8.1.1 for detailed information.

7.4 Tools and Utilities

The following tools and utilities are helpful in cluster journaling processes:

• Cluster Journal Restore — ^JRNRESTO
• Journal Dump Utility — ^JRNDUMP
• Startup Recovery Routine — ^STURECOV
• Setting Journal Markers on a Clustered System — ^JRNMARK
• Cluster Journal Information Global — ^%SYS("JRNINFO")
• Shadow Information Global and Utilities — ^SYS("shdwcli")

7.5 Cluster Journal Restore

The cluster journal restore procedure allows you to start or end a restore using the journal markers placed in the journal files by a Caché backup. You can run a cluster journal restore either as part of a backup restore or as a stand-alone procedure.

Caché includes an entry point to the journal restore interface for performing specific cluster journal restore operations. From the %SYS namespace, run the following:

Do CLUMENU^JRNRESTO

This invokes a menu that includes the following options:

1. Perform a cluster journal restore.

2. Generate a common journal file from specific journal files.


3. Perform a cluster journal restore after a backup restore.
4. Perform a cluster journal restore based on Caché backups.

7.5.1 Perform a Cluster Journal Restore

The first option of the cluster journal restore menu allows you to run the general journal restore on a clustered or nonclustered system. It is the equivalent of running ^JRNRESTO and answering Yes to the Cluster journal restore? prompt.

%SYS>Do ^JRNRESTO
This utility uses the contents of journal files
to bring globals up to date from a backup.
Replication is not enabled.
Restore the Journal? Yes => Yes
Cluster journal restore? Yes

You are asked to describe the databases to be restored from the journal and the starting and ending points of the restore. The starting and ending points can be based on a backup, on a set of journal markers, at a cluster start, or any arbitrary point in the journal.

The interface prompts for directory information, including any redirection specifics, and whether all databases and globals are to be processed. For example:

Directory: _$1$DKB300:[TEST.CLU.5X]
Redirect to Directory: _$1$DKB300:[TEST.CLU.5X] => _$1$DKB300:[TEST.CLU.5X]
--> _$1$DKB300:[TEST.CLU.5X]
Restore all globals in _$1$DKB300:[TEST.CLU.5X]? Yes => Yes
Directory:

For each directory you enter, you are asked if you want to redirect. Enter the name of the directory to which to restore the dejournaled globals. If this is the same directory, enter the period (.) character or press Enter.

Also specify for each directory whether you want to restore all journaled globals. Enter Yes or press Enter to apply all global changes to the database and continue with the next directory. Otherwise, enter No to restore only selected globals.

At the Global^ prompt, enter the names of the specific globals you want to restore from the journal. You may select patterns of globals by using the asterisk (*) to match any number of characters and the question mark (?) to match any single character. Enter ?L to list the currently selected list of globals.


When you have entered all your selected globals, press Enter at the Global^ prompt and enter the next directory. When you have entered all directories, press Enter at the Directory prompt. Your restore specifications are displayed as shown in this example:

Restoring globals from the following clustered datasets:
    1. _$1$DKB300:[TEST.CLU.5X]   All Globals
Specifications for Journal Restore Correct? Yes => Yes
Updates will not be replicated

Verify the information you entered before continuing with the cluster journal restore. Answer Yes or press Enter if the settings are correct; answer No to repeat the process of entering directories and globals.

Once you verify the directory and global specifications, the Main Settings menu of the cluster journal restore setup process is displayed with the current default settings, as shown in the following example:

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DKB400:[TEST.5X]CACHEJRN.LOG
   with NO redirections of journal files
2. To START restore at the beginning of cluster session
3. To STOP restore at sequence #319 of cluster session
   134388,_$1$DRA2:[TEST.5Y.JOURNAL]20030320.005
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue):

From this menu you may choose to modify any of the default values of the five settings by entering its menu item number:

1. Change the source of the restore.
2. Change the starting point of the restore.
3. Change the ending point of the restore.
4. Toggle the switching journal file setting.
5. Toggle the disable journaling setting.

After each modification, the Main Settings menu is displayed again, and you are asked to verify the information you entered before the restore begins. The following is an example of how the menu may look after several changes:
Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DKB400:[TEST.5Y.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DKB400:[TEST.5X.JOURNAL] -> _$1$DRA2:[TEST.5X.JOURNAL]
   _$1$DKB400:[TEST.5Y.JOURNAL] -> _$1$DRA2:[TEST.5Y.JOURNAL]
   _$1$DKB400:[TEST.5Z.JOURNAL] -> _$1$DRA2:[TEST.5Z.JOURNAL]
2. To START restore at the journal marker located at
   138316,_$1$DKB400:[TEST.5X.JOURNAL]20030401.001
   -> _$1$DRA2:[TEST.5X.JOURNAL]20030401.001
3. To STOP restore at the journal marker located at
   133232,_$1$DKB400:[TEST.5X.JOURNAL]20030401.003
   -> _$1$DRA2:[TEST.5X.JOURNAL]20030401.003
4. NOT to SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue):
Start journal restore?

Press Enter to accept the settings and continue. If you are using the journal log of the current cluster, you are informed that the restore will stop at the currently marked journal location and asked if you want to start the restore.

Select an item to modify ('Q' to quit or ENTER to accept and continue):
To stop restore at currently marked journal location
offset 134168 of _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
Start journal restore?

Enter Yes to begin the cluster journal restore. Once the restore finishes, your system is ready for activity.

Enter No to go back to the main menu, where you can continue to make changes to the cluster journal restore setup, or enter Q to abort the cluster journal restore. After aborting the cluster journal restore, you can run a private journal restore or abort the restore process entirely.

Select an item to modify ('Q' to quit or ENTER to accept and continue): Q
Run private journal restore instead? No
[Journal restore aborted]
Replication Enabled

7.5.1.1 Change the Source of the Restore

The first item on the Main Settings menu contains the information required to find the journal files for all the cluster members. The information has two elements:

• Cluster journal log — a list of journal files and their original full paths.
• Redirection of journal files — necessary only if the system where you are running the restore is not part of the cluster associated with the journal log.


By default, the restore uses the cluster journal log associated with the current clustered system. If you are running the restore on a nonclustered system, you are prompted for a cluster journal log before the main menu is displayed.

Choose this option to restore the journal files on a different Caché cluster from the one that owns the journal files. You can either:

• Identify the cluster journal log used by the original cluster.
• Create a cluster journal log that specifies where to locate the journal files.

Note: The option to redirect the journal files is available only if the specified cluster journal log is not that of the current cluster.

Identify the Cluster Journal Log

The Journal File Information menu displays the cluster journal log file to be used in the cluster journal restore. If the journal files on the original cluster are not accessible to the current cluster, copy them to a location accessible to the current cluster and specify how to locate them by entering redirect information.

Enter I to identify the journal log used by the original cluster.

Select an item to modify ('Q' to quit or ENTER to accept and continue): 1
Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50]CACHEJRN.LOG
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change): I
*** WARNING ***
If you specify a cluster journal log different from current one, you
may need to reenter info on journal redirection, restore range, etc.
Enter the name of the cluster journal log (ENTER if no change)
=> cachejrn.txt
Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are

You must redirect journal files if the journal files being restored are not in their original locations, as specified in the cluster journal log. To redirect the journal files listed in the cluster journal log, provide the original and current locations when prompted. You may give a full or partial directory name as an original location. All original locations with leading
characters that match the partial name are replaced with the new location. An example of redirecting files follows:

Selection (ENTER if no change): R
Journal directories in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   _$1$DRA1:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL]
Enter the original and current locations of journal files (? for help)
Journal files originally from: _$1$DRA1:
are currently located in: _$1$DRA2:
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
Journal files originally from:
Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change):

This example shows the choice of an alternative cluster journal log, CACHEJRN.TXT, which contains a list of journal files originally located on _$1$DRA1:. These files are redirected to be retrieved from their new location, _$1$DRA2:, during the restore.

When you have finished entering the redirection information, press Enter to return to the Main Settings menu.

Journal redirection assumes a one-to-one or many-to-one relationship between source and target directory locations. That is, journal files from one or multiple original directories may be located in one new location, but not in multiple new locations. To restore from journal files that are in multiple new locations, create a cluster journal log that specifies where to locate the journal files.

Create a Cluster Journal Log

If the journal files on the original cluster are not accessible to the current cluster, create a cluster journal log that specifies the locations of the journal files. The files in the specified locations must all be part of the cluster. Copy them to a location accessible to the current cluster and specify how to locate them by entering redirect information.


Selection (ENTER if no change): C
*** WARNING ***
If you specify a cluster journal log different from current one, you
may need to reenter info on journal redirection, restore range, etc.
Enter the name of the cluster journal log to create (ENTER if none) => cachejrn.txt
How many cluster members were involved? (Q to quit) => 3
For each cluster member, enter the location(s) and name prefix (if any) of the
journal files to restore --
Cluster member #0
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.50.MGR.JOURNAL]
Directory:
Cluster member #1
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.5A.MGR.JOURNAL]
Directory:
Cluster member #2
Journal File Name Prefix:
Directory: _$1$DRA1:[TEST.5B.MGR.JOURNAL]
Directory:

This example shows the creation of a cluster journal log, CACHEJRN.TXT, for a cluster with three members whose journal files were originally located on _$1$DRA1:.

The next menu contains the additional option to redirect the journal files in the cluster journal log you created:

Cluster Journal Restore - Setup - Journal File Info
[I]dentify an existing cluster journal log to use for the restore
   Current: _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
[R]edirect journal files in _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
- OR -
[C]reate a cluster journal log by specifying where journal files are
Selection (ENTER if no change):

Enter R to redirect the files as described in Identify the Cluster Journal Log. When finished entering redirect information, press Enter to return to the Main Settings menu.

7.5.1.2 Change the Starting Point of the Restore

The second and third items on the Main Settings menu specify the range of the restore: where in the journal files to begin restoring and where to stop. The starting point information contains the starting journal file and sequence number for each cluster member. The default for where to begin is determined in the following order:

• If a cluster journal restore was performed after any backup restore, restore the journal from the end of the last journal restore.
• If a backup restore was performed on the current system, restore the journal from the end of the last restored backup.
• If the current system is associated with the cluster journal log being used, restore the journal from the beginning of the current cluster session.
• Otherwise, restore the journal from the beginning of the cluster journal log.


Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
   -> _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue): 2
Cluster Journal Restore - Setup - Where to Start Restore
1. At the beginning of a cluster session
2. At a specific journal marker
3. Following the restore of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK (*)
   i.e., at the journal marker located at
   134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
Selection (ENTER if no change): 1
To start journal restore at the beginning of cluster session ...
1. 20030904 09:47:01
2. 20031002 13:19:12
3. 20031002 13:26:40
4. 20031002 13:29:10
5. 20031002 13:51:31
6. 20031002 13:58:57
7. 20031002 14:29:42
8. 20031002 14:33:55
9. 20031002 14:35:48
=> 5
Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the beginning of cluster session
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue): 2
Cluster Journal Restore - Setup - Where to Start Restore
1. At the beginning of a cluster session (*)
2. At a specific journal marker
3. Following the restore of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
Selection (ENTER if no change): 2
To start restore at a journal marker location (in original form)
journal file: _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
offset: 134120
You have chosen to start journal restore at
134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
the journal location by the end of backup _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK

The submenu varies slightly based on the current settings. For example, if no backup restore was performed, the submenu for specifying the beginning of the restore does not list option 3 to restore from the end of the last backup. In a submenu, the option that is currently chosen is marked with an asterisk (*).

7.5.1.3 Change the Ending Point of the Restore

By default, the restore ends at either the current journal location, if the current system is associated with the selected cluster journal log, or the end of the journal log. The submenu for option 3 is similar to that for option 2:

Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
   -> _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
3. To STOP restore at the end of the cluster journal log
4. To SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions
Select an item to modify ('Q' to quit or ENTER to accept and continue): 3
Cluster Journal Restore - Setup - Where to Stop Restore
1. At the end of a cluster session
2. At the end of _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
3. At a specific journal marker

This is the menu you would see if the journal log is the one for the current cluster:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 3
Cluster Journal Restore - Setup - Where to Stop Restore
1. At the end of a cluster session
2. At current journal location (*)
3. At a specific journal marker

The submenu varies slightly based on the current settings. For example, depending on whether or not the journal log is the one for the current cluster, option 2 in the menu for specifying the end of the restore is either the current journal location or the end of the journal log. In a submenu, the option that is currently chosen is marked with an asterisk (*).


7.5.1.4 Toggle the Switching Journal File Setting

The fourth menu item specifies whether to switch the journal file before the restore. If you select this item number, the value is toggled between To SWITCH and NOT to SWITCH the journal file; the menu is displayed again with the new setting:

Select an item to modify ('Q' to quit or ENTER to accept and continue): 4
Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
   -> _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
3. To STOP restore at the end of the cluster journal log
4. NOT to SWITCH journal file before journal restore
5. To DISABLE journaling the dejournaled transactions

The default is to switch the journal file before the restore. This provides a clean start so that updates that occur after the restore are in new journal files.

7.5.1.5 Toggle the Disable Journaling Setting

The fifth menu item specifies whether to disable journaling of the dejournaled transactions during the restore. If you select this item, the value is toggled between DISABLE and NOT to DISABLE journaling the dejournaled transactions; the menu is redisplayed with the new setting.

Select an item to modify ('Q' to quit or ENTER to accept and continue): 5
Cluster Journal Restore - Setup - Main Settings
1. To LOCATE journal files using cluster journal log
   _$1$DRA1:[TEST.50.MGR]CACHEJRN.TXT
   with redirections of journal files
   _$1$DRA1:[TEST.50.MGR.JOURNAL] -> _$1$DRA2:[TEST.50.MGR.JOURNAL]
   _$1$DRA1:[TEST.5A.MGR.JOURNAL] -> _$1$DRA2:[TEST.5A.MGR.JOURNAL]
   _$1$DRA1:[TEST.5B.MGR.JOURNAL] -> _$1$DRA2:[TEST.5B.MGR.JOURNAL]
2. To START restore at the end of last restored backup
   _$1$DRA1:[TEST.50.MGR]CLUFULL.BCK
   134120,_$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
   -> _$1$DRA1:[TEST.50.MGR.JOURNAL]20031002.008
3. To STOP restore at the end of the cluster journal log
4. NOT to SWITCH journal file before journal restore
5. NOT to DISABLE journaling the dejournaled transactions

For better performance, the default setting is to disable journaling the dejournaled transactions. However, if you are running a cluster shadow, you may want to choose not to disable journaling.


Note: If you choose not to disable journaling, the dejournaled transactions are journaled only if they otherwise meet the normal criteria for being journaled.

7.5.2 Generate a Common Journal File

The user interface for this option is similar to the first, with additional questions about the contents and format of the output file. However, instead of restoring the journal files, this option produces a common-format journal file that can be read by the ^%JREAD utility on a Caché system that does not support cluster journal restores, or on another platform such as DSM.

^JCONVERT provides the same functionality if you answer Yes to the Cluster Journal Convert? question.

The second option produces a single common-format output file from the cluster journal files. It calls the ^JCONVERT utility, which takes a journal file from a single system and writes it out in a common format to be read by the %JREAD routine. This is useful for restoring journal files across versions of Caché where the journal files are not compatible (for example, as part of an “almost rolling” upgrade) or as part of failing back to an earlier release. You can also use this option to write the journal file in a format that can be loaded into another platform such as DSM.

Cluster Journal Restore Menu
--------------------------------------------------------------
1) Cluster journal restore
2) Generate common journal file from specific journal files
3) Cluster journal restore after backup restore
4) Cluster journal restore corresponding to Caché backups
--------------------------------------------------------------
H) Display Help
E) Exit this utility
--------------------------------------------------------------
Enter choice (1-4) or [E]xit/[H]elp? 2

7.5.3 Perform a Cluster Journal Restore after a Backup Restore

Option three restores the journal files after a Caché backup has been restored. This is similar to the restore performed by the incremental backup restore routine, ^DBREST, after a cluster backup restore; it provides a way to run that journal restore independently of restoring a backup (to restart the journal restore, for example). One difference between this option and restoring using ^DBREST is that this option does not start with the list of databases contained in the backup; you must enter the database list.


The routine offers to include all currently cluster-mounted databases in the restore, but if it is being run after restoring a backup, the databases restored by the backup are then privately mounted unless you change the mount state. (The restore mounts them privately and leaves them privately mounted when it is finished.) It starts with the markers recorded in the journal files by the backup and ends with the end of the journal data.

7.5.4 Perform a Cluster Journal Restore Based on Caché Backups

The fourth menu option restores the journal files using journal markers that were added by a Caché backup to specify the starting point and, optionally, the end point. It is similar to option three except that it uses backups that have been performed, rather than backups that have been restored, to designate where to start. Functionally they are the same; both options use a marker placed into the journal file by a Caché backup as the starting point. The difference is in the list of choices of where to start.

7.6 Journal Dump Utility

On a Caché clustered system, the ^JRNDUMP routine displays the cluster session ID (cluster startup time) of a journal file instead of the word JRNSTART. The ^JRNDUMP routine displays a list of records in a journal file, showing the cluster session ID along with the journal file sizes.

The utility lists journal files maintained by the local system as well as journal files maintained by other systems of the Caché cluster, in the order of cluster startup time (cluster session ID) and the first and last cluster journal sequence numbers of the journal files. Journal files created by ^JRNSTART are marked with an asterisk (*). Journal files that are no longer available (purged, for example) are marked with D (for deleted). Journal file names are displayed with indentations that correspond to their CSN: no indentation for journal files from system 0, one space for system 1, two spaces for system 2, and so on.

Sample output from the cluster version of ^JRNDUMP follows; the level-1 display on a clustered system is quite different from the nonclustered one:

FirstSeq LastSeq Journal Files
Session 20030820 11:02:43
0        0       D /bench/test/cache/50a/mgr/journal/20030820.003
0        0          /bench/test/cache/50b/mgr/journal/20030820.004
Session 20030822 10:55:46
3        3          /bench/test/cache/50b/mgr/journal/20030822.001
(N)ext,(P)rev,(G)oto,(E)xamine,(Q)uit =>


Besides a list of journal files from every cluster node (even the dead ones), the display shows cluster session IDs and the first and last cluster journal sequence numbers of each journal file. A cluster session ID (the date-time string following Session) is the time the first node of the cluster starts. A cluster session ends when the last node of the cluster shuts down. Files from different nodes are shown with different indentation: no indentation for the node with CSN 0, one space for the node with CSN 1, and so on. The CSN of a node uniquely identifies the node within the cluster at a given time. The files labeled D have most likely been deleted from their host systems.

The previous version of ^JRNDUMP for clusters is available as OLD^JRNDUMP, if you prefer that output.

7.7 Startup Recovery Routine

The following is the help display of the startup recovery routine, ^STURECOV:


%SYS>Do ^STURECOV
Logins are not disabled. This routine is designed to run
when Cache' is in single user mode due to a problem running
the STU startup routine.
Do you want to continue ? No => yes
Warning: Misuse of this utility can harm your system
There is no record of any errors during the prior startup
This could be because there was a problem writing the data
to disk or because the system failed to start for some other
reason.
Do you want to continue ? No => yes
Enter error type (? for list) [^] => ?
Supported error types are:
JRN    - Journal restore and transaction rollback
CLUJRN - Cluster journal restore and transaction rollback
Enter error type (? for list) [^] => CLUJRN
Cluster journal recovery options
--------------------------------------------------------------
1) Display the list of errors from startup
2) Run the journal restore again
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
--------------------------------------------------------------
H) Display Help
E) Exit this utility
--------------------------------------------------------------
Enter choice (1-8) or [E]xit/[H]elp? H
--------------------------------------------------------------
Before running ^STURECOV you should have corrected the
errors that prevented the journal restore or transaction rollback
from completing. Here you have several options regarding what
to do next. This entry point exists for compatibility with prior
versions.
Option 1: The journal restore and transaction rollback procedure
tries to save the list of errors in ^%SYS(). This is not always
possible depending on what is wrong with the system. If this
information is available, this option displays the errors.
Option 2: This option performs the same journal restore and
transaction rollback which was performed when the system was
started. The amount of data is small so it should not be
necessary to try and restart from where the error occurred.
Option 3 is not enabled for cluster recovery
Option 4: This lets you dismount a database. Generally this
would be used if you want to let users back on a system but
you want to prevent them from accessing a database which still
has problems (^DISMOUNT utility).
Option 5: This lets you mount a database (^MOUNT utility).
Option 6: This lets you edit the database structure (^REPAIR utility).
Option 7: This lets you validate the database structure (^INTEGRIT utility).
Option 8 is not enabled for cluster recovery. Shut the system
down using the bypass option with ccontrol stop and then start it
with ccontrol start. During startup answer YES when asked if you
want to continue after it displays the message related to errors
during recovery.
Press ENTER to continue
Cluster journal recovery options
--------------------------------------------------------------
1) Display the list of errors from startup
2) Run the journal restore again
4) Dismount a database
5) Mount a database
6) Database Repair Utility
7) Check Database Integrity
--------------------------------------------------------------
H) Display Help
E) Exit this utility
--------------------------------------------------------------
Enter choice (1-8) or [E]xit/[H]elp?

7.8 Setting Journal Markers on a Clustered System

To set a journal marker effective cluster-wide, use the following routine:

$$CLUSET^JRNMARK(id,text,swset)

Where:

• id: Marker ID (for example, -1 for backup).
• text: Marker text (for example, “timestamp” for backup).
• swset: 1 if the switch that inhibits database reads and writes (switch 10) has been set cluster-wide (and locally) by the caller; the caller is responsible for clearing it afterwards. 0 if the switch has not been set; the routine takes care of setting and clearing the switch properly.

Note that switch 10 must be set locally and cluster-wide to ensure the integrity of the journal marker. If successful, the routine returns the location of the marker (the offset of the marker in the journal file and the journal file name) delimited by a comma. Otherwise, it returns an error code.
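As a sketch (not from the original documentation), a caller that has not set switch 10 itself might invoke the routine and split the returned location as follows. The marker ID -1 and timestamp text follow the backup example above; treating any comma-containing return value as success is an assumption based on the comma-delimited "offset,file" return format just described:

```objectscript
 ; Sketch: set a cluster-wide journal marker and examine the result.
 ; swset=0 lets the routine manage switch 10 itself.
 New rc,offset,jrnfile
 Set rc=$$CLUSET^JRNMARK(-1,$ZDATETIME($HOROLOG,3),0)
 If rc["," {
     ; success: result is "offset,journal-file-name"
     Set offset=$Piece(rc,",",1)
     Set jrnfile=$Piece(rc,",",2)
     Write "Marker written at offset ",offset," of ",jrnfile,!
 }
 Else {
     Write "Error setting marker: ",rc,!
 }
```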


7.9 Cluster Journal Information Global

The cluster journal information table is maintained for the duration of a cluster session; this allows you to modify or delete the cluster journal log (presumably after deleting the journal files) between two cluster sessions, as the update algorithm assumes that you do not alter the cluster journal log during a cluster session.

The ^%SYS("JRNINFO") global has three subcomponents:

• The jrninfo table is indexed by journal file names, with the value of the top node being the number of entries in the cluster journal log and the value of each subnode being a comma-delimited list of the attributes of that journal file: CSN, line number of the journal file in the cluster journal log, CSI, and first and last sequence numbers.
• The jrninfor (r for reverse) table is a list of journal files, with CSN as the primary key and the line number of the journal file in the cluster journal log as the secondary key.
• The seqinfo table contains the following subscripts: CSI, first and last sequence numbers, CSN, and line number of the journal file in the cluster journal log.

Here is a sample of ^%SYS("JRNINFO") contents:
^%SYS("JRNINFO",1032803946,"jrninfo")=16
^%SYS("JRNINFO",1032803946,"jrninfo","_$1$DKA0:[TEST.50.MGR.JOURNAL]20030916.002")=1,2,1031949277,160,160
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030913.004")=0,1,1031949277,3,3
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.001")=0,3,1031949277,292,292
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.002")=0,4,1032188507,3,417
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.003")=0,7,1032188507,3,422
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.004")=0,8,1032197355,3,4
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.005")=0,9,1032197355,3,7
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.006")=0,10,1032197355,3,10
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.007")=0,11,1032197355,3,17
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.008")=0,12,1032197355,3,17
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030918.001")=0,13,1032197355,3,27
   "_$1$DRA1:[TEST.50.MGR.JOURNAL]20030923.001")=0,15,1032803946,3,133
   "_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.002")=1,5,1032188507,3,3
   "_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.003")=1,6,1032188507,131,131
   "_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030923.001")=1,14,1032197355,39,39
   "_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030923.002")=1,16,1032803946,3,3
^%SYS("JRNINFO",1032803946,"jrninfor",0,1)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030913.004
   3)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.001
   4)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.002
   7)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.003
   8)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.004
   9)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.005
   10)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.006
   11)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.007
   12)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030916.008
   13)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030918.001
   15)=_$1$DRA1:[TEST.50.MGR.JOURNAL]20030923.001
^%SYS("JRNINFO",1032803946,"jrninfor",1,2)=_$1$DKA0:[TEST.50.MGR.JOURNAL]20030916.002
   5)=_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.002
   6)=_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030916.003
   14)=_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030923.001
   16)=_$1$DRA1:[TEST.5A.MGR.JOURNAL]20030923.002
^%SYS("JRNINFO",1032803946,"seqinfo",1031949277,3,3,0,1)=
^%SYS("JRNINFO",1032803946,"seqinfo",1031949277,160,160,1,2)=
^%SYS("JRNINFO",1032803946,"seqinfo",1031949277,292,292,0,3)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032188507,3,3,1,5)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032188507,3,417,0,4)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032188507,3,422,0,7)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032188507,131,131,1,6)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,3,4,0,8)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,3,7,0,9)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,3,10,0,10)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,3,17,0,11)=
   12)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,3,27,0,13)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032197355,39,39,1,14)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032803946,3,3,1,16)=
^%SYS("JRNINFO",1032803946,"seqinfo",1032803946,3,133,0,15)=

Cluster Journal Information Global


7.10 Shadow Information Global and Utilities

The global node ^SYS("shdwcli") is where shadow client information is maintained. Most of the values are available through the utilities ShowState^SHDWX, ShowError^SHDWX, and ShowWhere^SHDWX.

Running ShowState^SHDWX displays most of the data contained in the global:

%SYS>d ShowState^SHDWX("clutest",1)
Shadow ID       PrimaryServerIP  Port   R  S  Err
------------------------------------------------------------------------
clutest         rodan            42009  0  1  1
\_ clutest~0    192.9.202.5      42009  0  1
\_ clutest~1    192.9.202.5      42009  0  1
\_ clutest~2    rodan            42009  0  1
Redirection of Global Sets and Kills:
   ^^_$1$DKB300:[TEST.CLU.5X] -> ^^_$1$DKA0:[TEST.CLU.5X]
Redirection of Master Journal Files:
   Base directory for auto-redirection: _$1$DKA0:[TEST.5X.SHADOW]
   _$1$DRA2:[TEST.5X.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.1]
   _$1$DRA2:[TEST.5Y.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.2]
   _$1$DRA2:[TEST.5Z.JOURNAL] -> _$1$DKA0:[TEST.5X.SHADOW.3]
Primary Server Cluster ID: _$1$DKB400:[TEST.5X]CACHE.PIJ
Primary Server Candidates (for failover):
   192.9.202.5 42009
   rodan 42009
   192.9.202.5 42019
   192.9.202.5 42029
When to purge a shadow journal file: after it's dejournaled

The output displayed from the ^SYS("shdwcli") global has the following components:

• Shadow ID — the ID of a copier shadow is partially inherited from the parent shadow. The clu subnode of a copier contains the ID of the parent, and the sys subnode of the parent contains a list of the IDs of the copiers.
• PrimaryServerIP and Port — for a copier, these specify the system from which it gets journal files; for the dejournaler, the system from which it gets journal information (JRNINFO server). The values are stored in the ip and port0 subnodes.
• R — has the value 1 if the shadow is running; from the stat subnode.
• S — has the value 1 if the shadow is requested to stop (due to latency, it is possible that both R and S have the value 1 if the shadow has yet to check for the stop request); from the stop subnode.
• Err — number of errors encountered. See details through ShowError^SHDWX, which displays the information from the err subnode.


• Redirection of Global Sets and Kills — referred to as database mapping in the System Management Portal; from the dbmap subnode of the cluster shadow.
• Redirection of Master Journal Files — discussed in the Using Caché Routines section; stored in the jrndir subnode of the cluster shadow. The value of the jrndir subnode is the number of journal directories that have been automatically redirected (in the preceding example output, the next new journal directory is redirected to a subdirectory [.4]). (jrndir,0) is the base shadow directory, and everything else indicates a redirection of journal directories, with the server journal directory being the key and the shadow journal directory being the value.
• Primary Server Cluster ID — used to prevent the shadow from following a node to a different cluster; from the DBServerClusterID subnode.
• Primary Server Candidates (for failover) — the list of current live members of the cluster. If one member dies, a shadow (either the dejournaler or a copier) that gets information from the member tries other members on the list until it succeeds. A new member is added to the list as soon as the shadow knows of its presence; from the servers subnode.
• When to purge a shadow journal file — works in the same way as purging of local journal files. The age threshold is set by the lifespan subnode of the cluster shadow. Unlike purging of local journal files, however, if the value of lifespan is 0, the shadow journal files are purged as soon as they have been completely dejournaled. The purged journal files are listed in the jrndel subnode of the copiers.

The chkpnt subnode stores a list of checkpoints. A checkpoint is a snapshot of the work queue of the dejournaler — the current progress of dejournaling. The value of the chkpnt subnode indicates which checkpoint to use when the dejournaler resumes. This is the checkpoint displayed by ShowWhere^SHDWX. Updating the value of the chkpnt subnode only after the corresponding checkpoint has been completely updated avoids a partial checkpoint in the case of a system failover in the middle of an update (in that case, the dejournaler uses the previous checkpoint).

The copiers keep the names of the copied (or being copied) journal files in the jrnfil subnode. This makes it possible to change the redirection of journal files, by allowing the dejournaler to find the shadow journal files in the old directory while the copiers copy new journal files to the new location. Once a shadow journal file is purged, it is moved from the jrnfil list to the jrndel list.

Here is a sample of the ^SYS("shdwcli") contents for the nodes of the cluster shadow, clutest, and two of its copier shadows:


^SYS("shdwcli","clutest")=0
^SYS("shdwcli","clutest","DBServerClusterID")=_$1$DKB400:[TEST.5X]CACHE.PIJ
   "at")=0
   "chkpnt")=212
^SYS("shdwcli","clutest","chkpnt",1)=1,1012488866,-128
^SYS("shdwcli","clutest","chkpnt",1,1012488866,0,1)=0,,,,
^SYS("shdwcli","clutest","chkpnt",2)=2,1012488866,-128
^SYS("shdwcli","clutest","chkpnt",2,1012488866,-128,2)=-128,_$1$DKA0:[TEST.5X.SHADOW.1]20020131.001,0,,0
^SYS("shdwcli","clutest","chkpnt",3)=6,1012488866,5
^SYS("shdwcli","clutest","chkpnt",3,1012488866,11,6)=5,_$1$DKA0:[TEST.5X.SHADOW.1]20020131.001,-132252,,0
^SYS("shdwcli","clutest","chkpnt",4)=35,1012488866,85
^SYS("shdwcli","clutest","chkpnt",4,1012488866,95,35)=85,_$1$DKA0:[TEST.5X.SHADOW.1]20020131.001,-136984,,0
^SYS("shdwcli","clutest","chkpnt",5)=594,1012488866,807
^SYS("shdwcli","clutest","chkpnt",5,1012488866,808,594)=808,_$1$DKA0:[TEST.5X.SHADOW.1]20020131.001,262480,1,0
...
^SYS("shdwcli","clutest","chkpnt",212)=24559,1021493730,5
^SYS("shdwcli","clutest","chkpnt",212,1021493730,37,24559)=5,_$1$DKA0:[TEST.5X.SHADOW.1]20020515.001,-132260,,0
^SYS("shdwcli","clutest","cmd")=
^SYS("shdwcli","clutest","dbmap","^^_$1$DKB300:[TEST.CLU.5X]")=^^_$1$DKA0:[TEST.CLU.5X]
^SYS("shdwcli","clutest","end")=0
   "err")=1
^SYS("shdwcli","clutest","err",1)=20020519 14:16:34 568328925 Query+8^SHDWX;-12;reading ans from |TCP|42009 timed out,Remote server is not responding
^SYS("shdwcli","clutest","err",1,"begin")=20020519 14:09:09
   "count")=5
^SYS("shdwcli","clutest","errmax")=10
   "intv")=10
   "ip")=rodan
   "jrndir")=3
^SYS("shdwcli","clutest","jrndir",0)=_$1$DKA0:[TEST.5X.SHADOW]
   "_$1$DRA2:[TEST.5X.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.1]
   "_$1$DRA2:[TEST.5Y.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.2]
   "_$1$DRA2:[TEST.5Z.JOURNAL]")=_$1$DKA0:[TEST.5X.SHADOW.3]
^SYS("shdwcli","clutest","jrntran")=0
   "lifespan")=0
   "locdir")=
   "locshd")=
   "pid")=568328919
   "port")=
   "port0")=42009
   "remjrn")=
^SYS("shdwcli","clutest","servers","42009,192.9.202.5")=
   "42009,rodan")=
   "42019,192.9.202.5")=
   "42029,192.9.202.5")=
^SYS("shdwcli","clutest","stat")=0
   "stop")=1
^SYS("shdwcli","clutest","sys",0)=
   1)=
   2)=
^SYS("shdwcli","clutest","tcp")=|TCP|42009
   "tpskip")=1
   "type")=21
^SYS("shdwcli","clutest~0")=0
^SYS("shdwcli","clutest~0","at")=0
   "clu")=clutest
   "cmd")=
   "end")=132260
   "err")=0


   "intv")=10
   "ip")=192.9.202.5
^SYS("shdwcli","clutest~0","jrndel","_$1$DKA0:[TEST.5X.SHADOW.1]20020131.001")=
   "_$1$DKA0:[TEST.5X.SHADOW.1]20020131.002")=
   "_$1$DKA0:[TEST.5X.SHADOW.2]20020510.010")=
^SYS("shdwcli","clutest~0","jrnfil")=36
^SYS("shdwcli","clutest~0","jrnfil",35)=_$1$DRA2:[TEST.5X.JOURNAL]20020513.006
^SYS("shdwcli","clutest~0","jrnfil",35,"shdw")=_$1$DKA0:[TEST.5X.SHADOW.1]20020513.006
^SYS("shdwcli","clutest~0","jrnfil",36)=_$1$DRA2:[TEST.5X.JOURNAL]20020515.001
^SYS("shdwcli","clutest~0","jrnfil",36,"shdw")=_$1$DKA0:[TEST.5X.SHADOW.1]20020515.001
^SYS("shdwcli","clutest~0","jrntran")=0
   "locdir")=
   "locshd")=_$1$DKA0:[TEST.5X.SHADOW.1]20020515.001
   "pause")=0
   "pid")=568328925
   "port")=42009
   "port0")=42009
   "remend")=132260
   "remjrn")=
   "stat")=0
   "stop")=1
   "tcp")=|TCP|42009
   "tpskip")=1
   "type")=12
^SYS("shdwcli","clutest~1")=0
^SYS("shdwcli","clutest~1","at")=0
   "clu")=clutest
   "cmd")=
   "end")=132248
   "err")=0
   "intv")=10
   "ip")=192.9.202.5
^SYS("shdwcli","clutest~1","jrndel","_$1$DKA0:[TEST.5X.SHADOW.1]20020510.003")=
   "_$1$DKA0:[TEST.5X.SHADOW.2]20020131.001")=
   "_$1$DKA0:[TEST.5X.SHADOW.2]20020510.008")=
^SYS("shdwcli","clutest~1","jrnfil")=18
^SYS("shdwcli","clutest~1","jrnfil",17)=_$1$DRA2:[TEST.5X.JOURNAL]20020510.011
^SYS("shdwcli","clutest~1","jrnfil",17,"shdw")=_$1$DKA0:[TEST.5X.SHADOW.1]20020510.011
^SYS("shdwcli","clutest~1","jrnfil",18)=_$1$DRA2:[TEST.5Y.JOURNAL]20020510.011
^SYS("shdwcli","clutest~1","jrnfil",18,"shdw")=_$1$DKA0:[TEST.5X.SHADOW.2]20020510.011
^SYS("shdwcli","clutest~1","jrntran")=0
   "locdir")=
   "locshd")=_$1$DKA0:[TEST.5X.SHADOW.2]20020510.011
   "pid")=568328925
   "port")=42009
   "port0")=42009
   "remend")=132248
   "remjrn")=
   "stat")=0
   "stop")=1
   "tcp")=|TCP|42009
   "tpskip")=1
   "type")=12
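The checkpoint discipline described earlier in this section (write the new checkpoint completely, and only then advance the chkpnt pointer) can be sketched in Python as follows. This is an illustrative model, not Caché code; the class and method names are invented for the sketch.

```python
class CheckpointStore:
    """Illustrative model of the dejournaler's chkpnt discipline.

    A checkpoint body is written in full first; only afterwards is the
    published pointer advanced. A crash between the two steps leaves the
    previous complete checkpoint in effect. Names here are invented for
    this sketch and are not Cache/ObjectScript APIs.
    """

    def __init__(self):
        self.checkpoints = {}  # checkpoint id -> snapshot of the work queue
        self.current = 0       # analogous to the value of the chkpnt subnode

    def save(self, snapshot):
        new_id = self.current + 1
        self.checkpoints[new_id] = dict(snapshot)  # step 1: write the body
        self.current = new_id                      # step 2: publish the pointer

    def resume(self):
        # After a failure, resume from the last fully published checkpoint.
        return self.checkpoints.get(self.current, {})
```

If a failure occurs after step 1 but before step 2, resume() still returns the previous complete checkpoint, which mirrors the behavior the text describes.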


8  Caché Clusters on Tru64 UNIX

Alpha Tru64 UNIX version 5 introduced cluster-mounted file systems and a UNIX distributed lock manager. Thus, two or more servers with their own memory, CPU, and I/O channels can simultaneously access data stored on a common set of file systems.

There are three methods available for running Caché on one of these clusters. The first method is to simply run your Caché instances on single servers as stand-alone installations. While this is the simplest case, it fails to utilize the performance and high availability advantages of the cluster. The second method is to utilize Tru64 UNIX Cluster Available Application (CAA) functionality. This is described in the System Failover Strategies chapter as a cold failover.

The third method, the topic of this chapter, is to utilize two or more of the servers in a Tru64 UNIX cluster as a unit in a warm failover or a hot failover configuration. With this version of Caché, you may run Caché across a Tru64 UNIX cluster where processes running on any member of the cluster have controlled simultaneous access to the same Caché database files. With this configuration you gain performance and high availability benefits. This functionality is the same as for Caché on OpenVMS clusters.

The following topics are discussed:

• Tru64 UNIX Caché Cluster Overview
• TruCluster File System Architecture
• Planning a Tru64 Caché Cluster Installation
• Tuning a Tru64 Caché Cluster Member


8.1 Tru64 UNIX Caché Cluster Overview

TruCluster technology allows multiple machines running Tru64 Version 5.0 or later to work together in a scalable, highly available, clustered configuration. All cluster members must be connected to each other via a cluster interconnect.

The cluster interconnect carries all communication between members of the cluster. Any time a cluster member needs access to a resource that is served by another cluster member, the data flows over the cluster interconnect. Physically, the interconnect can be Ethernet, Gigabit Ethernet, or Compaq's proprietary Memory Channel interconnect.

The cluster interconnect runs an IP stack, so every cluster member has an IP address for its cluster interconnect in addition to its regular address(es) on the outside LAN. The IP address of the cluster interconnect is generally chosen from a non-routed network, such as the 10.0.0.0 network, as it is only used by members of the cluster and is not visible to the outside world.

General information about a cluster's current configuration can be obtained with the command clu_get_info.

[Figure: Example of Tru64 Cluster Configuration]
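Since the interconnect address is normally taken from a non-routed range such as 10.0.0.0, a configuration check might verify that the chosen address is private. A minimal Python sketch (the helper name is illustrative, not a Tru64 or Caché tool; Python's standard ipaddress module does the real work):

```python
import ipaddress

def is_suitable_interconnect(addr: str) -> bool:
    """Return True if addr lies in a private, non-routed range
    (e.g. 10.0.0.0/8), as recommended for the cluster interconnect.
    Illustrative helper only."""
    return ipaddress.ip_address(addr).is_private
```

For example, an address on the 10.0.0.0 network passes the check, while a globally routable address does not.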


8.2 TruCluster File System Architecture

One of the defining features of the TruCluster file system architecture is the concept of a shared root. All cluster members share the same root, /usr, and /var file systems. These correspond to the cluster_root, cluster_usr, and cluster_var partitions, respectively, and are stored on the cluster root disk or disks. Since the root of the directory structure is shared, all cluster members see an identical view of the directory structure, and all members of the cluster can access anything mounted locally on any cluster member. Of course, if a disk is local to one member and not on the shared SCSI bus, that member becomes a single point of failure for that disk.

The most obvious implication is that all cluster members share the same operating system files. Most administrative tasks, such as user management, take effect cluster-wide. While simplifying system administration, this makes the cluster root drive a very critical resource, and you should consider making it accessible from more than one shared bus to avoid it becoming a point of failure.

This shared-everything approach raises some questions about how system-specific configuration files and the boot process are handled. The files are mapped from their standard location in the file system to the member-specific root partition via context-dependent symbolic links (CDSLs). CDSLs allow each member to have its own version of a particular file.

A CDSL is a special type of link that is created with the mkcdsl command. An example view of a CDSL:

clunode1.kinich.com> ls -l sysconfigtab
lrwxrwxrwx   1 root   system   57 Sep  6  2006 sysconfigtab@ ->
   ../cluster/members/{memb}/boot_partition/etc/sysconfigtab*

The unique part of this link is the {memb} section, which corresponds to the member number of the node. Thus a process running on member1 accesses its own copy of /etc/sysconfigtab, and a process running on member2 accesses its own separate copy of /etc/sysconfigtab. CDSLs are available on all versions of Tru64 5.1B regardless of whether the Tru64 cluster software is installed. Each cluster member is assigned a permanent member id. A system that is not a cluster member is assigned the number 0.

A view of the beginning portion of the CDSL:

clunode1: /etc> ls -l ../cluster/members
total 24
lrwxr-xr-x   1 root   system      6 Sep  6 10:45 member@ -> {memb}/
drwxr-xr-x   9 root   system   8192 Sep  6 10:45 member0/
drwxr-xr-x   9 root   system   8192 Mar 16  2006 member1/
drwxr-xr-x   9 root   system   8192 Sep  6 10:45 member2/
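The {memb} placeholder shown in the link targets above is expanded by Tru64 at reference time. A one-line Python model of that substitution (illustrative only; the real expansion happens inside the operating system's pathname resolution):

```python
def resolve_cdsl(target: str, member_id: int) -> str:
    """Model Tru64's CDSL expansion: {memb} becomes 'member<N>' for the
    local member id. Sketch only, not the actual OS mechanism."""
    return target.replace("{memb}", "member%d" % member_id)
```

So the same link target resolves to a different physical file on each cluster member, which is exactly what gives each member its own sysconfigtab.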


When a CDSL is referenced, Tru64 replaces the {memb} part with "membern", where n is the member number of the local machine. Thus for the first member, sysconfigtab refers to the file:

/etc/cluster/members/member1/boot_partition/etc/sysconfigtab

The clu_get_info command displays the list of current cluster members and their member ids.

8.2.1 Caché and CDSLs

The Caché registry, which stores all information about Caché instances on that machine, is stored in /usr/local/etc/cachesys. When Caché is installed in a TruCluster environment, it makes this directory a CDSL to a member-specific area. Therefore, when Caché is installed on one cluster member, the ccontrol list command on the other cluster members does not display that installation.

This provides maximum flexibility, in that the current version of Caché can be given the same configuration name on each cluster member. It also prevents one cluster member from accidentally starting or stopping an instance of Caché that is running on another cluster member. If desired, the installations from other cluster members can be added to the local registry file using the create function of the ccontrol command. The syntax is:

clunode1: /> ccontrol create $cfgname directory=$tgtdir versionid=$ver

where $cfgname is the configuration or instance name, $tgtdir is the directory where that instance is installed, and $ver is the version string. It is possible to simply delete the CDSL (using the rmcdsl command) and use a common registry file for the Caché cluster. However, the upgrade procedure for Caché may convert this back to a CDSL, placing the existing registry file in the member-specific area of the upgraded system.

CDSLs can be fragile — it is easy to delete them or break them. Broken CDSLs can cause trouble with remastering AdvFS domains. The clu_check_config command tests the validity of registered CDSLs and displays any problems. Generally, removing and recreating a broken CDSL may be all that is necessary to resolve problems with lost files.

8.2.2 Remastering AdvFS Domains

In a Tru64 cluster there is a single view of the file system, but this is layered on top of AdvFS file systems. Even though each cluster member may have direct access to the disk devices that make up the domain, one cluster member is elected to be the server for that domain and all other cluster members are clients. Caché opens database files using direct I/O on Tru64


cluster members. If there is a direct path from a machine to the disk drive, the database, journal, and WIJ (write image journal) I/O all use that path.

Tru64 also supports direct I/O when there is not a direct connection to the disk. In this case, Tru64 redirects the I/O requests over the cluster interconnect to the server for that drive. This is obviously not the optimal configuration from a performance perspective. Even though direct I/O allows for shared I/O to a file from multiple cluster members, Tru64 restricts file expansion (and all metadata operations) to the server for the fileset. This is generally not a consideration for databases that expand rarely (compared to the I/O rate), but it is a concern for journal files, which are constantly expanding.

Caché requires a separate AdvFS domain for each set of journal files that could potentially be part of a Caché instance on separate cluster members. AdvFS domains can be remastered on a running system, although this tends to fail if the domain is under heavy load at the time.

The status of an AdvFS domain can be determined with the cfsmgr command. The command with no arguments displays summary information for each file system; with a file system argument, it displays information for that file system only.
For example:

clunode1: /# cfsmgr

Domain or filesystem name = cluster_root#root
Mounted On = /
Server Name = clunode2
Server Status : OK

Domain or filesystem name = root1_domain#root
Mounted On = /cluster/members/member1/boot_partition
Server Name = clunode1
Server Status : OK

Domain or filesystem name = root2_domain#root
Mounted On = /cluster/members/member2/boot_partition
Server Name = clunode2
Server Status : OK

Domain or filesystem name = cluster_var#var
Mounted On = /var
Server Name = clunode2
Server Status : OK

Domain or filesystem name = test_domain#test
Mounted On = /test
Server Name = clunode2
Server Status : OK

Domain or filesystem name = cluster_usr#usr
Mounted On = /usr
Server Name = clunode1
Server Status : OK

clunode1: /#
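A monitoring script might reduce output like the above to a map from mount point to serving member. The following Python sketch parses the field labels exactly as they appear in the sample; it is illustrative and makes no claim about cfsmgr output beyond what is shown here.

```python
def parse_cfsmgr(text: str) -> dict:
    """Parse cfsmgr summary output into {mount point: serving member}.
    Field labels are taken from the sample output above (sketch only)."""
    served, current = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Domain or filesystem name"):
            current = {"domain": line.split("=", 1)[1].strip()}
        elif line.startswith("Mounted On"):
            current["mount"] = line.split("=", 1)[1].strip()
        elif line.startswith("Server Name") and "mount" in current:
            served[current["mount"]] = line.split("=", 1)[1].strip()
    return served
```

Applied to the sample above, such a map would show, for instance, that /test is served by clunode2.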


To change the machine that is serving a domain, use the -r -a SERVER= option. You can specify either the filesystem or the domain to be remastered; however, keep in mind that either way the entire domain is remastered — this affects all filesets/file systems in that domain. In the following example, the server for /test (and the domain it is part of) is transferred from clunode2 to clunode1:

clunode1: /# cfsmgr /test
Domain or filesystem name = /test
Server Name = clunode2
Server Status : OK

clunode1: /# cfsmgr -r -a SERVER=clunode1 /test

clunode1: /# cfsmgr /test
Domain or filesystem name = /test
Server Name = clunode1
Server Status : OK

clunode1: /#

When a cluster member shuts down (or fails), any domains/filesets it was serving are taken over by one of the surviving members that has a direct connection to the disk drives involved. If no member has such a connection, the domain/fileset becomes unavailable. Caché cluster failover requires that the surviving cluster members have access to all of the databases from the failed system (not just the cluster-mounted ones), the WIJ file, and the journal files. This is why Caché, and its components, should not be installed on a disk that is only connected to a single node. When a failed cluster member starts back up, it does not automatically take over the filesets it used to be serving. This must be done manually.

After planning your configuration, you should add the necessary cfsmgr commands to the startup scripts for each cluster member so that, when it boots, it becomes the server for the domains that contain the Caché journal files for that node.

8.3 Planning a Tru64 Caché Cluster Installation

Please keep the following points in mind when planning your Tru64 UNIX cluster:

• Caché clusters require Tru64 UNIX Version 5.1B with the latest 5.1B aggregate patch kit installed.
• HP recommends using Memory Channel hardware for the cluster interconnect, although Gigabit Ethernet is also supported. Slower Ethernet is not recommended for cluster interconnect hardware.


• Each cluster member should have a direct path to the disk subsystem. A disk subsystem may be directly connected to only some or one of the cluster members, but this is neither a highly available nor a high-performance configuration.
• UNIX file systems (UFS) are read-only across Tru64 clusters. Caché must be installed on Advanced File Systems (AdvFS) in a Tru64 cluster.
• You must create at least as many AdvFS domains as there are Tru64 cluster members running Caché. To store Caché journal files, each Tru64 cluster member running Caché needs a file system that it can serve locally.

  Note: It is not acceptable to have one domain and multiple filesets for Caché journal files.

  A fileset contains all files and control scripts that make up a product. Filesets are the smallest software object manageable by SysMan Software Manager commands. Filesets must be grouped into one or more products and can be included in one or more different bundles.

• The best practice is to build the Tru64 UNIX cluster before installing Caché.
• Do not use context-dependent symbolic links (CDSLs) for the Caché installation directory. Install Caché into a separate directory on each cluster member. Using the same names for these directories makes administration easier.
• Separate AdvFS domains or filesets are required for the Caché journal files of each member, and the default location of journal files is a subdirectory of the Caché installation directory. Therefore, if you want to use the default location for journal files, install Caché for each cluster member in separate AdvFS domains. Caché installations that always run on the same cluster member may be installed in the same domain.

8.4 Tuning a Tru64 Caché Cluster Member

The tuning parameters for a Tru64 Caché cluster system are the same as those for an OpenVMS cluster system. InterSystems recommends setting values for two attributes of the dlm subsystem: rhash_size and dlm_locks_cfg. These correspond to the RESHASHTBL and LOCKIDTBL parameters under OpenVMS; their calculations are similar.

They are configured as other Tru64 kernel parameters and are put in the /etc/sysconfigtab file under the dlm: stanza. For example:


dlm:
    rhash_size = 400000
    dlm_locks_cfg = 100000

The system configuration file contains separate stanzas for kernel subsystems. Each stanza contains a list of attributes and the values those attributes hold. Only those subsystems/attributes to be changed from the default need to be listed.

This file, sysconfigtab, is not shared between cluster members; therefore, each cluster member must be restarted for the changes to take effect. The settings should be the same for each cluster member.
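For illustration, the stanza layout above can be generated mechanically. The following Python sketch renders a subsystem stanza in that format. Note that on a real Tru64 system /etc/sysconfigtab is normally maintained with the system's own configuration tools rather than ad hoc scripts, so this is only a model of the file format:

```python
def format_stanza(subsystem: str, attrs: dict) -> str:
    """Render a kernel-subsystem stanza in the sysconfigtab layout shown
    above: 'subsystem:' followed by indented 'attribute = value' lines.
    Illustrative model of the file format only."""
    lines = [f"{subsystem}:"]
    lines.extend(f"    {name} = {value}" for name, value in attrs.items())
    return "\n".join(lines)
```

Rendering the dlm attributes recommended in this section would reproduce the example stanza shown above.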


9  Caché and Windows Clusters

Microsoft Windows operating systems do not support shared-all clusters:

• They do not offer a shared resource cluster model.
• They do not allow simultaneous access to shared drives: you cannot lock, read, or write to a cluster.
• If a drive fails, the operating system does not swap in a backup drive.
• Windows NT and Windows 2000 Advanced Server allow only two nodes to be clustered, and only provide failover of your drives from one node to another. Some versions of Windows Server 2003, though, allow up to eight nodes.

Two of the Caché-supported Microsoft platforms, Windows 2000 Advanced Server and Windows Server 2003, however, do allow you to cluster computers that share the same storage. You must have a RAID or SCSI disk drive system to do so. See the Microsoft Knowledge Base Article (278007), "Available Features in Windows Server 2003 Clusters," for more information on changes for Windows Server 2003.

You can run two basic cluster setups under Windows operating systems in Caché:

• Single Failover Cluster
• Multiple Failover Cluster

Note: You must have a multiserver Caché license for Windows clustering. For a multiple failover cluster, you must also have a separate license for each Caché instance, or an Enterprise license.

The Example Procedures sections show the procedure details used in both setups.


These cluster setups are described in the sections that follow. For suggestions on other ways to run a large enterprise system, contact the InterSystems Worldwide Response Center (WRC).

9.1 Single Failover Cluster

The following shows a single failover cluster:

[Figure: Single Failover Cluster]

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on one node. During normal operations, the following conditions are true:

• Disk S is online on CLUNODE-1, and CLUNODE-2 has no database disks online.
• The instance CacheA runs on CLUNODE-1; CLUNODE-2 is idle.

In this setup, if CLUNODE-1 fails, your system looks like this:


[Figure: Failover Cluster with Node Failure]

• Disk S is online on CLUNODE-2; CLUNODE-1 has no database disks online.
• The instance CacheA runs on CLUNODE-2; CLUNODE-1 is down.

See the Setting Up a Failover Cluster section for a detailed example.

9.1.1 Setting Up a Failover Cluster

This section gives an example of the steps necessary to set up a failover cluster on Windows Server 2003. The steps are performed on the following:

• Tasks on CLUNODE-1
• Tasks on CLUNODE-2
• Tasks on Both Nodes

Tasks on CLUNODE-1

Perform the following steps on the first cluster node, CLUNODE-1:

1. Open the Cluster Administrator from the Windows Administrative Tools submenu. Verify that all drives that contain Caché files and databases are shared drives and that they are all online on CLUNODE-1.
2. Create a Cluster Group called CacheA Group.
3. Create an IP Address Resource for CacheA Group called CacheA_IP.


Caché and Windows Clusters

You can also create a Network Name resource type if you want applications to connect to Caché by a DNS name as well as an IP address.

4. Create a Physical Disk Resource for the shared disk containing CacheA called Disk S:.

5. Install Caché on CLUNODE-1, naming the instance CacheA and installing it in a new folder, CacheA, on Disk S.

6. Define the instance CacheA and map the database files for the instance on drives that are online on the same cluster as the Caché instance during normal operations. Do the same for all journal files.

7. Create a Caché Cluster Resource for CacheA Group called CacheA_controller.

8. Bring CacheA_controller online on CLUNODE-1 using the Cluster Administrator.

9. Move CacheA Group from CLUNODE-1 to CLUNODE-2 by right-clicking CacheA Group under the Groups branch and then clicking Move Group. You do not need to stop Caché; this is the way you fail over.

Dependencies

After you create the resources on CLUNODE-1, the following dependencies exist:

1. The IP address, CacheA_IP, has no dependencies.
2. The physical disk resource, Disk S, depends on CacheA_IP.
3. The Caché cluster resource, CacheA_controller, depends on Disk S.

Tasks on CLUNODE-2

Perform the following steps on the second cluster node, CLUNODE-2:

1. Install Caché on CLUNODE-2, naming the instance CacheA and installing it in the CacheA folder on Disk S. You are installing Caché on top of itself only to get the Caché entry into the registry.

2. Bring CacheA_controller online on CLUNODE-2 using the Cluster Administrator.

Tasks on Both Nodes

Verify the failover setup:

1. Move the CacheA Group cluster group from CLUNODE-2 back to CLUNODE-1.

2. When you again have CacheA Group running on CLUNODE-1, run ipconfig on both CLUNODE-1 and CLUNODE-2 to check that each is properly advertising the alias IP addresses.
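For reference, the resource-creation and failover steps above can also be driven from a command prompt with the cluster.exe utility included with Windows Server 2003. The following is a hypothetical sketch, not a tested script: the group, resource, and node names come from this example, and exact option behavior should be verified against the cluster.exe documentation for your Windows version.

```shell
:: Hypothetical cluster.exe sketch of the GUI steps above (Windows Server 2003).
:: Names (CacheA Group, CacheA_IP, Disk S:, CLUNODE-2) come from this example.
cluster group "CacheA Group" /create

:: Step 3: alias IP address resource used by clients to reach CacheA
cluster resource "CacheA_IP" /create /group:"CacheA Group" /type:"IP Address"

:: Step 4: shared disk holding the CacheA installation, dependent on the IP
cluster resource "Disk S:" /create /group:"CacheA Group" /type:"Physical Disk"
cluster resource "Disk S:" /adddep:"CacheA_IP"

:: Step 9: fail the group over to the second node without stopping Caché
cluster group "CacheA Group" /moveto:CLUNODE-2
```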


You now have a failover cluster running Caché.

Important: Do not start and stop Caché from the Caché Cube. Instead, using the Cluster Administrator, take the CacheA_controller offline to shut down Caché, and bring the CacheA_controller online to start Caché.

9.2 Example Procedures

The following are instructions for performing common procedures in the cluster-building process. They apply to the single failover example, but you can adapt them to the additional steps in the multiple failover setup by replacing the CacheA names with the appropriate CacheB names.

9.2.1 Create a Cluster Group

To create a cluster group, do the following:

1. From the Cluster Administrator, right-click Groups, point to New, and click Group.

2. In the New Group dialog box, enter the group name (in this example, CacheA Group) in the Name box and click Next.

3. List and arrange the preferred owners, for example, CLUNODE-1 and CLUNODE-2, as shown in the following graphic for CacheA Group, and click Finish.


9.2.2 Create an IP Address Resource

To create an IP address resource, do the following:

1. From the Cluster Administrator, right-click the group name (CacheA Group), point to New, and click Resource.

2. Enter CacheA_IP as the name, select IP Address as the Resource type, and click Next.

3. Assign the alias IP address your users use to connect to the instance (CacheA). Put this resource in the corresponding cluster group (CacheA Group). This is not the cluster or node IP, but a new and unique IP address specific to the instance (CacheA).

For this example, the value of CacheA_IP is 192.9.205.68.


Once finished, the CacheA_IP resource has the following properties:

IP Address Advanced Properties

IP Address Parameter Properties

CacheA_IP has no dependencies.
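If you prefer to script this resource instead of using the GUI, the alias address can be set as a private property of the IP Address resource. This is a hypothetical cluster.exe sketch; the subnet mask shown is illustrative, and the private-property names should be checked against the IP Address resource type on your cluster.

```shell
:: Assumed private-property names for an IP Address resource; the mask is illustrative
cluster resource "CacheA_IP" /create /group:"CacheA Group" /type:"IP Address"
cluster resource "CacheA_IP" /priv Address=192.9.205.68 SubnetMask=255.255.255.0
cluster resource "CacheA_IP" /online
```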


9.2.3 Create a Physical Disk Resource

To create a physical disk resource for the shared disk containing CacheA, do the following:

1. From the Cluster Administrator, right-click the group name (CacheA Group), point to New, and click Resource.

2. Enter Disk S: as the name, select Physical Disk as the Resource type, and click Next.

3. Verify, and update as necessary, the following settings for the Dependencies properties:

• Click Modify to enter a dependency.
• Enter CacheA_IP in the Name and IP Address in the Resource Type, as shown in the following figure:

Physical Disk Dependency Properties

9.2.4 Install Caché

Follow the procedure described in the “Installing Caché on Microsoft Windows” chapter of the Caché Installation Guide to install Caché on the Windows cluster node.

Each time you install an instance on a new node that is part of a Windows cluster, you must change the default automatic startup setting. Navigate to the [Home] > [Configuration] > [Memory and Startup] page of the System Management Portal and clear the Start Caché on System Boot check box to prevent automatic startup; this allows the cluster manager to start Caché.

Following the installation, you can remove the shortcuts from the Windows Startup folder (C:\Documents and Settings\All Users\Start Menu\Programs\Startup) that start the Caché Cube on Windows login. The shortcut has the name you give the instance when you install (CACHE, for example).

The recommended best practice is to manage the cluster remotely from the Cube on a workstation connecting to the cluster IP address. If you choose to use the Cube locally from the desktop of one of the cluster nodes, be aware that certain configuration changes require a Caché restart; if you restart Caché outside the context of the Cluster Administrator, the cluster declares the group failed and attempts failover.

9.2.5 Create a Caché Cluster Resource

On Windows Server 2003 and later, Caché provides dynamic link library (DLL) files that contain the information necessary to add a new resource type, ISCCres2003, to the Cluster Administrator. You must add this resource type before continuing with this section. See the Add the ISCCres2003 Resource Type section for details.

To add a Caché cluster resource, perform the following steps:

1. From the Cluster Administrator, right-click the group name, CacheA Group, point to New, and click Resource.

2. Select ISCCres2003 as the Resource type and click Next.

3. Enter the resource name, CacheA_controller in this example, and the Caché instance name you entered at installation, CacheA in this example.

4. Update as necessary the following settings for the Dependencies properties:

• Click Modify to enter a dependency.
• Enter Disk S: in the Name and Physical Disk in the Resource Type, and click OK.

5. Verify, and update as necessary, the following settings for the controller Advanced properties:

• Clear the Affect the group check box.
• Leave the default, 3, in the Threshold box.
• Click Use value from resource type for both poll intervals.


Once finished, the CacheA_controller cluster resource has the following properties:

Cluster Resource General Properties

Cluster Resource Dependencies Properties


Cluster Resource Advanced Properties

Cluster Resource Parameter Properties
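Once the ISCCres2003 resource type has been registered (see Add the ISCCres2003 Resource Type), the controller resource can also be sketched from the command line. This is a hypothetical transcript: the "Instance" private-property name is an assumption, so verify the actual parameter name against the Parameter properties dialog shown above.

```shell
:: Hypothetical sketch: create the controller resource and its disk dependency
cluster resource "CacheA_controller" /create /group:"CacheA Group" /type:"ISCCres2003"
cluster resource "CacheA_controller" /adddep:"Disk S:"
:: "Instance" is an assumed property name for the Caché instance parameter
cluster resource "CacheA_controller" /priv Instance=CacheA
```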


9.2.5.1 Add the ISCCres2003 Resource Type

The Caché installation procedure places two DLL files, ISCCres2003.dll and ISCCres2003Ex.dll, in the \bin directory. To make the ISCCres2003 resource type available in the Cluster Administrator, copy the two DLL files to the Windows cluster directory (\CLUSTER) on each cluster node and run the following commands from this same directory:

cluster resourcetype "ISCCres2003" /create /dllname:"ISCCres2003.dll" /type:"ISCCres2003"
regsvr32 /s ISCCres2003Ex.dll
cluster /RegAdminExt:"ISCCres2003Ex.dll"

9.3 Multiple Failover Cluster

You may also set up a cluster with multiple failover nodes using the same procedures described in the previous sections for the single failover cluster. The following shows a failover cluster on multiple nodes:

Multiple Failover Cluster

CLUNODE-1 and CLUNODE-2 are clustered together, and Caché is running on both nodes. During normal operations, the following conditions are true:

• Disk S is online on CLUNODE-1, and Disk T is online on CLUNODE-2.
• The CacheA instance runs on CLUNODE-1; the CacheB instance runs on CLUNODE-2.


• Instances CacheA and CacheB cannot directly access each other’s cache.dat files; they can directly access only their own mounted cache.dat files.

With this type of setup, if CLUNODE-2 fails, your system looks like this:

Multiple Failover Cluster with Node Failure

Both CacheA and CacheB run on CLUNODE-1. Once you repair or replace CLUNODE-2, you can move your CacheB instance back to CLUNODE-2. If CLUNODE-1 were to fail, both CacheA and CacheB would run on CLUNODE-2.

See the Setting Up a Multiple Failover Cluster section for a detailed example.

9.3.1 Setting Up a Multiple Failover Cluster

This section gives a simple example of the steps necessary to set up a cluster with more than one failover node. The steps are performed on the following:

• Tasks on CLUNODE-1
• Tasks on CLUNODE-2
• Tasks on Both Nodes

Tasks on CLUNODE-1

Perform the following steps on the first cluster node, CLUNODE-1:

1. Open the Cluster Administrator from the Windows Administrative Tools submenu.


Verify that all drives that contain Caché files and databases are shared drives and that they are all online on CLUNODE-1.

2. Create a Cluster Group called CacheA Group.

3. Create an IP Address Resource for CacheA Group called CacheA_IP.

4. Create a Physical Disk Resource for the shared disk containing CacheA called Disk S:.

5. Create a Cluster Group called CacheB Group.

6. Create an IP Address Resource for CacheB Group called CacheB_IP.

7. Create a Physical Disk Resource for the shared disk containing CacheB called Disk T:. Add a dependency on the IP address resource, CacheB_IP, for Disk T:. This drive is normally online on CLUNODE-2.

8. Install Caché on CLUNODE-1, naming the instance CacheA and installing it in a new folder, CacheA, on Disk S.

9. Install Caché on CLUNODE-1 again, this time naming the instance CacheB and installing it in a new folder, CacheB, on Disk T.

10. Update the instances CacheA and CacheB:

• Assign unique port numbers for the Caché SuperServer, the Web server, and Telnet.
• Map database files for each instance on drives that are online on the same cluster as the Caché instance during normal operations. Do the same for all journal files.

11. Create a Caché Cluster Resource for CacheA Group called CacheA_controller.

12. Create a Caché Cluster Resource for CacheB Group called CacheB_controller.

13. Move CacheA Group from CLUNODE-1 to CLUNODE-2 by right-clicking CacheA Group under the Groups branch and then clicking Move Group. You do not need to stop Caché; this is the way you fail over.

14. Move CacheB Group from CLUNODE-1 to CLUNODE-2.

Tasks on CLUNODE-2

Perform the following steps on the second cluster node, CLUNODE-2:

1. Install Caché on CLUNODE-2, naming the instance CacheA and installing it in the CacheA folder on Disk S. You are installing Caché on top of itself only to get the Caché entry into the registry.


2. Install Caché on CLUNODE-2 again, this time naming the instance CacheB and installing it in the CacheB folder on Disk T. You are installing Caché on top of itself only to get the Caché entry into the registry.

Tasks on Both Nodes

After verifying that both instances of Caché are running properly, do the following:

1. Move the CacheA Group cluster group from CLUNODE-2 back to CLUNODE-1.

2. When you again have CacheA Group running on CLUNODE-1 and CacheB Group on CLUNODE-2, run ipconfig on both CLUNODE-1 and CLUNODE-2 to check that each is properly advertising the alias IP addresses.

Do not use actual node IP addresses during this setup. Use the cluster IPs; clustering advertises them to the network and routes all connections properly.

When failover occurs, a cluster can advertise multiple cluster IPs; you never have to manually redirect connections. CLUNODE-1, if running CacheA and CacheB, advertises both cluster IPs. When you fail CacheB back over to its usual node (CLUNODE-2), the cluster once again automatically directs connections to CLUNODE-2.

You now have a multiple failover cluster running Caché.

Multiple Failover Cluster

Important: Do not start and stop Caché from the Caché Cube. Instead, using the Cluster Administrator, take the CacheA_controller or CacheB_controller offline to shut down Caché, and bring the CacheA_controller or CacheB_controller online to start Caché.
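The Important note above can also be followed from a command prompt: taking a controller resource offline or bringing it online is equivalent to the Cluster Administrator actions. A hypothetical sketch using the resource names from this example:

```shell
:: Shut down the CacheB instance under cluster control
cluster resource "CacheB_controller" /offline
:: Start it again
cluster resource "CacheB_controller" /online
:: List cluster resources and their current states to confirm
cluster resource
```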


10 ECP Failover

One of the most powerful and unique features of Caché is the ability to efficiently distribute data and application logic among a number of server systems. The underlying technology behind this feature is the Enterprise Cache Protocol (ECP), a distributed data caching architecture that manages the distribution of data and locks among a heterogeneous network of server systems. ECP is an important part of an application failover strategy for high-availability systems.

This chapter describes how the architecture works to maintain high availability:

• ECP Recovery
• ECP and Clusters

For more detailed information about ECP, see the Caché Distributed Data Management Guide.

10.1 ECP Recovery

The simplest case of ECP recovery is a temporary network interruption that is long enough to be noticed, but short enough that the underlying TCP connection stays active during the outage. During the outage, the application server (or client) notices that the connection is nonresponsive and blocks new network requests for that connection. Once the connection resumes, processes that were blocked are able to send their pending requests. If the underlying TCP connection is reset, the data server waits a configurable timeout (one minute by default) for a reconnection. If the client does not succeed in reconnecting during that interval, all the work done by the previous connection is rolled back and the connection request is converted into a request for a brand new connection.


A more complex case is one where the network outage is severe enough to reset the underlying TCP connection, but both the client and the data server stay up throughout the outage, and the client reconnects within the data server’s reconnection window.

On reconnection, the main action that must be performed is to flush (or, eventually, revalidate) the client’s cache of downloaded blocks and the client’s cache of downloaded routines. In addition, the client keeps a queue of locks to remove and transactions to roll back once the connection is reestablished. By keeping this queue, there is never a problem with allowing a process to halt right away whenever it wants to, whether or not the servers on which it has pending transactions and locks are currently available. Connection recovery is careful to complete any pending Set and Kill operations that had been queued for the data server before the network outage was detected, before it completes the delayed release of locks.

Finally, there is the case where the data server shuts down, either gracefully or as a result of a crash. In this case, recovery involves several more steps on the data server, some of which involve the data server journal file in very important ways.

The result of these several steps is that:

• The data server’s view of the current active transactions from each application server has been restored from the data server’s journal file.

• The data server’s view of the current active Lock operations from each application server has been restored, by having the application server upload those locks to the data server.

• The application server and the data server both agree on exactly which requests from the application server can be ignored (because it is certain they completed before the crash) and which ones should be replayed. Hence, the last recovery step is simply to let the pending network requests complete, but only those network requests that are safe to replay.

• Finally, the application server delivers to the data server any pending unlock or rollback indications that it saved from jobs that halted while the data server was restarting. All guarantees are maintained, even in the face of sudden and unanticipated data server crashes, as long as the integrity of the databases (including the WIJ file and the journal files) is maintained.

During the recovery of an ECP-configured system, Caché guarantees a number of recoverable semantics, which are described in detail in the ECP Recovery Guarantees section of the “ECP Recovery Guarantees and Limitations” appendix of the Caché Distributed Data Management Guide. There are limitations to these guarantees, which are described in detail in the ECP Recovery Limitations section of the same appendix.


10.2 ECP and Clusters

Adding cluster nodes that utilize ECP is simple and straightforward. On each cluster node, configure the system as an ECP data server by enabling the ECP service from the [Home] > [Security Management] > [Services] page. Click %Service_ECP and select the Service enabled check box. This is the only configuration setting required to use this node as an ECP data server.

If you add a new member to the cluster, Caché does not need to change the network configuration on every running member. Only lock and increment requests are delivered over the internal cluster ECP connection. There are no data blocks to be sent from the master to the other cluster members, as there would be with ECP without clusters. When cluster failover happens, the cluster member asks the new master to do ECP recovery.

A cluster member creates ECP connections to the other cluster members so it can access the privately mounted databases on the other members. If you configure a node to be an ECP application server, it is not used as the cluster connection; it works as a regular ECP connection. Though this connection can also be used for accessing cluster-mounted databases, cluster failover does not recover the connection.

If this is the first member to join the cluster, it is the master. Each node that joins the cluster does the following:

• Retrieves connection information (IP address and port) for each cluster member from the PIJ file.

• Validates the connection to each existing member to ensure cluster failover success.

• Allocates a system number (an index into the netnode array) and sets up a null system name for that system number.

• Initializes the netnode structure with an ECP connection using the IP address and port from the master entry in the PIJ file, and puts the connections in the Not Connected state.

Caché declares any ECP connection from a failing cluster member Disabled and releases all resources for that connection, including the locks in the lock table owned by the failed system.

The following sections outline what happens under the described conditions on a clustered system using the ECP architecture:

• Application Server Fails
• Data Server Fails


• Network Is Interrupted
• Cluster as an ECP Database Server

10.2.1 Application Server Fails

If the data server becomes aware that the application server has halted, crashed, disconnected, or otherwise declared the connection dead, the data server declares the connection dead.

If a data server declares a connection dead, it rolls back any open transactions for that application server and releases any locks held for that application server. It is then available for a new connection from that application server.

During application server recovery, the application server retransmits any requests it had previously sent for which it had not yet received a response, and, in the case of a data server crash, it also retransmits any locks it owned on the data server. Following this phase, the users on the application server resume operations without any noticeable effect other than the pause: no data is lost or rolled back, and no application server user processes get errors. However, a few processes that were waiting for a $Increment or a $Bit function that sets or clears a bit and returns the former value may receive errors and have their open transactions rolled back.

10.2.2 Data Server Fails

If a data server crashes while application server connections are open, the data server does the following at restart, prior to allowing general system usage:

• Attempts to reestablish a connection with the application servers that were active.

• Allows them to reestablish locks they had on the data server.

• Reprocesses earlier requests for which the answers were never received by the application servers.

• Declares the connection dead and rolls back open transactions for any application server not heard from during the startup phase.

Following a data server crash, an application server waits for 20 minutes (a configurable time limit) before declaring the connection dead. If during that time the data server restarts, the application server goes through a recovery phase and resumes operation.


10.2.3 Network Is Interrupted

If either the application server or the data server detects a network outage, the data server waits up to one minute (a configurable time limit) for the network to start working before declaring the connection dead. When the application server is not receiving responses to requests and cannot determine whether there is a network outage or a problem with the data server, it waits for up to 20 minutes (configurable), which gives the system manager time to discover that the data server has crashed and restart it.

Once either the application server or the data server has declared a connection dead, there is no longer any ability to recover from a failure. In that case, the data server rolls back any open transactions for that application server and releases its application locks. The application server is expected to issue errors to any application processes that are still waiting for data. New attempts by the application server to use the network result in an attempt to create a new connection, so processing can resume when the problem is ultimately resolved. If the connection cannot be made, the application server process that made the request receives an error.

10.2.4 Cluster as an ECP Database Server

In the Caché-supported genuine cluster environments, namely OpenVMS and Tru64 UNIX, using your cluster as an ECP database server has the following limitations:

• You must run ECP to particular members of a cluster, not to the cluster as a whole. Do not use the IP address of the cluster. If a cluster master fails, you must reconnect to the new master IP address.

• In this type of cluster, ECP is used as a lock transport mechanism, not to transfer data.

10.2.4.1 Master Fails

Caché does the following on a cluster member after its cluster master fails:

• Sets the state of any ECP connection from the failed master to Disabled and releases all resources for that connection, including all the locks in the lock table owned by the failed system.

• Sets the state of the cluster ECP connection to Trouble.

• Locates the IP address and port of the new master through the PIJ file.

• Creates a connection to the new master and performs the recovery procedure.


10.2.4.2 New Cluster Master

Caché does the following on a new cluster master after the previous cluster master fails:

• Sets the state of any ECP connections from the failed master to Disabled and releases all resources for those connections, including all locks in the lock table owned by the failed system.

• Stops all ECP daemons for the cluster ECP connection.

• Converts the remote locks granted to the old master into local locks.

• Removes all the pending cancel-lock entries to the old master in the lock table. Caché does not remove the pending unlock entries, because the process that issued the unlock request eventually removes them.

• Waits for all cluster members to upload their locks.

• Reissues the pending lock entries in the lock table by waking up the requesting processes.

• Sets the state of the cluster ECP connection to Disabled.
