Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005
CHAPTER 5 ■ ORACLE PROCESSES 173 PMON: The Process Monitor This process is responsible for cleaning up after abnormally terminated connections. For example, if your dedicated server “fails” or is killed for some reason, PMON is the process responsible for fixing (recovering or undoing work) and releasing your resources. PMON will initiate the rollback of uncommitted work, release locks, and free SGA resources allocated to the failed process. In addition to cleaning up after aborted connections, PMON is responsible for monitoring the other Oracle background processes and restarting them if necessary (and if possible). If a shared server or a dispatcher fails (crashes), PMON will step in and restart another one (after cleaning up for the failed process). PMON will watch all of the Oracle processes and either restart them or terminate the instance as appropriate. For example, it is appropriate to fail the instance in the event the database log writer process, LGWR, fails. This is a serious error, and the safest path of action is to terminate the instance immediately and let normal recovery fix the data. (Note that this is a rare occurrence and should be reported to Oracle Support immediately.) The other thing PMON does for the instance is to register it with the Oracle TNS listener. When an instance starts up, the PMON process polls the well-known port address, unless directed otherwise, to see whether or not a listener is up and running. The well-known/default port used by Oracle is 1521. Now, what happens if the listener is started on some different port? In this case, the mechanism is the same, except that the listener address needs to be explicitly specified by the LOCAL_LISTENER parameter setting. If the listener is running when the database instance is started, PMON communicates with the listener and passes to it relevant parameters, such as the service name and load metrics of the instance. If the listener was not started, PMON will periodically attempt to contact it to register itself. SMON: The System Monitor SMON is the process that gets to do all of the “system-level” jobs. Whereas PMON was interested in individual processes, SMON takes a system-level perspective of things and is a sort of “garbage collector” for the database. Some of the jobs it does include the following: • Cleans up temporary space: With the advent of “true” temporary tablespaces, the chore of cleaning up temporary space has lessened, but it has not gone away. For example, when building an index, the extents allocated for the index during the creation are marked as TEMPORARY. If the CREATE INDEX session is aborted for some reason, SMON is responsible for cleaning them up. Other operations create temporary extents that SMON would be responsible for as well. • Coalesces free space: If you are using dictionary-managed tablespaces, SMON is responsible for taking extents that are free in a tablespace and contiguous with respect to each other and coalescing them into one larger free extent. This occurs only on dictionarymanaged tablespaces with a default storage clause that has pctincrease set to a nonzero value. • Recovers transactions active against unavailable files: This is similar to its role during database startup. Here, SMON recovers failed transactions that were skipped during instance/crash recovery due to a file(s) not being available to recover. For example, the file may have been on a disk that was unavailable or not mounted. When the file does become available, SMON will recover it.
174 CHAPTER 5 ■ ORACLE PROCESSES • Performs instance recovery of a failed node in RAC: In an Oracle RAC configuration, when a database instance in the cluster fails (e.g., the machine the instance was executing on fails), some other node in the cluster will open that failed instance’s redo log files and perform a recovery of all data for that failed instance. • Cleans up OBJ$: OBJ$ is a low-level data dictionary table that contains an entry for almost every object (table, index, trigger, view, and so on) in the database. Many times, there are entries in here that represent deleted objects, or objects that represent “not there” objects, used in Oracle’s dependency mechanism. SMON is the process that removes these rows that are no longer needed. • Shrinks rollback segments: SMON will perform the automatic shrinking of a rollback segment to its optimal size, if it is set. • “Offlines” rollback segments: It is possible for the DBA to offline, or make unavailable, a rollback segment that has active transactions. It may be possible that active transactions are using this offlined rollback segment. In this case, the rollback is not really offlined; it is marked as “pending offline.” In the background, SMON will periodically try to truly take it offline, until it succeeds. That should give you a flavor of what SMON does. It does many other things, such as flush the monitoring statistics that show up in the DBA_TAB_MONITORING view, the flush of the SCN to timestamp mapping information found in the SMON_SCN_TIME table, and so on. The SMON process can accumulate quite a lot of CPU over time, and this should be considered normal. SMON periodically wakes up (or is woken up by the other background processes) to perform these housekeeping chores. RECO: Distributed Database Recovery RECO has a very focused job: it recovers transactions that are left in a prepared state because of a crash or loss of connection during a two-phase commit (2PC). A 2PC is a distributed protocol that allows for a modification that affects many disparate databases to be committed atomically. It attempts to close the window for distributed failure as much as possible before committing. In a 2PC between N databases, one of the databases—typically (but not always) the one the client logged into initially—will be the coordinator. This one site will ask the other N-1 sites if they are ready to commit. In effect, this one site will go to the N-1 sites and ask them to be prepared to commit. Each of the N-1 sites reports back its “prepared state” as YES or NO. If any one of the sites votes NO, the entire transaction is rolled back. If all sites vote YES, then the site coordinator broadcasts a message to make the commit permanent on each of the N-1 sites. If after some site votes YES it is prepared to commit, but before it gets the directive from the coordinator to actually commit the network fails or some other error occurs, the transaction becomes an in-doubt distributed transaction. The 2PC tries to limit the window of time in which this can occur, but cannot remove it. If we have a failure right then and there, the transaction will become the responsibility of RECO. RECO will try to contact the coordinator of the transaction to discover its outcome. Until it does that, the transaction will remain in its uncommitted state. When the transaction coordinator can be reached again, RECO will either commit the transaction or roll it back.
- Page 168 and 169: CHAPTER 4 ■ MEMORY STRUCTURES 123
- Page 170 and 171: CHAPTER 4 ■ MEMORY STRUCTURES 125
- Page 172 and 173: CHAPTER 4 ■ MEMORY STRUCTURES 127
- Page 174 and 175: CHAPTER 4 ■ MEMORY STRUCTURES 129
- Page 176 and 177: CHAPTER 4 ■ MEMORY STRUCTURES 131
- Page 178 and 179: CHAPTER 4 ■ MEMORY STRUCTURES 133
- Page 180 and 181: CHAPTER 4 ■ MEMORY STRUCTURES 135
- Page 182 and 183: CHAPTER 4 ■ MEMORY STRUCTURES 137
- Page 184 and 185: CHAPTER 4 ■ MEMORY STRUCTURES 139
- Page 186 and 187: CHAPTER 4 ■ MEMORY STRUCTURES 141
- Page 188 and 189: CHAPTER 4 ■ MEMORY STRUCTURES 143
- Page 190 and 191: CHAPTER 4 ■ MEMORY STRUCTURES 145
- Page 192 and 193: CHAPTER 4 ■ MEMORY STRUCTURES 147
- Page 194 and 195: CHAPTER 4 ■ MEMORY STRUCTURES 149
- Page 196 and 197: CHAPTER 4 ■ MEMORY STRUCTURES 151
- Page 198 and 199: CHAPTER 4 ■ MEMORY STRUCTURES 153
- Page 200 and 201: CHAPTER 5 ■ ■ ■ Oracle Proces
- Page 202 and 203: CHAPTER 5 ■ ORACLE PROCESSES 157
- Page 204 and 205: CHAPTER 5 ■ ORACLE PROCESSES 159
- Page 206 and 207: CHAPTER 5 ■ ORACLE PROCESSES 161
- Page 208 and 209: CHAPTER 5 ■ ORACLE PROCESSES 163
- Page 210 and 211: CHAPTER 5 ■ ORACLE PROCESSES 165
- Page 212 and 213: CHAPTER 5 ■ ORACLE PROCESSES 167
- Page 214 and 215: CHAPTER 5 ■ ORACLE PROCESSES 169
- Page 216 and 217: CHAPTER 5 ■ ORACLE PROCESSES 171
- Page 220 and 221: CHAPTER 5 ■ ORACLE PROCESSES 175
- Page 222 and 223: CHAPTER 5 ■ ORACLE PROCESSES 177
- Page 224 and 225: CHAPTER 5 ■ ORACLE PROCESSES 179
- Page 226 and 227: CHAPTER 5 ■ ORACLE PROCESSES 181
- Page 228 and 229: CHAPTER 6 ■ ■ ■ Locking and L
- Page 230 and 231: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 232 and 233: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 234 and 235: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 236 and 237: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 238 and 239: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 240 and 241: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 242 and 243: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 244 and 245: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 246 and 247: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 248 and 249: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 250 and 251: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 252 and 253: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 254 and 255: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 256 and 257: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 258 and 259: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 260 and 261: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 262 and 263: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 264 and 265: CHAPTER 6 ■ LOCKING AND LATCHING
- Page 266 and 267: CHAPTER 6 ■ LOCKING AND LATCHING
174<br />
CHAPTER 5 ■ ORACLE PROCESSES<br />
• Performs instance recovery of a failed node in RAC: In an <strong>Oracle</strong> RAC configuration,<br />
when a database instance in the cluster fails (e.g., the machine the instance was executing<br />
on fails), some other node in the cluster will open that failed instance’s redo log files<br />
<strong>and</strong> perform a recovery of all data for that failed instance.<br />
• Cleans up OBJ$: OBJ$ is a low-level data dictionary table that contains an entry for<br />
almost every object (table, index, trigger, view, <strong>and</strong> so on) in the database. Many times,<br />
there are entries in here that represent deleted objects, or objects that represent “not<br />
there” objects, used in <strong>Oracle</strong>’s dependency mechanism. SMON is the process that<br />
removes these rows that are no longer needed.<br />
• Shrinks rollback segments: SMON will perform the automatic shrinking of a rollback<br />
segment to its optimal size, if it is set.<br />
• “Offlines” rollback segments: It is possible for the DBA to offline, or make unavailable,<br />
a rollback segment that has active transactions. It may be possible that active transactions<br />
are using this offlined rollback segment. In this case, the rollback is not really<br />
offlined; it is marked as “pending offline.” In the background, SMON will periodically try<br />
to truly take it offline, until it succeeds.<br />
That should give you a flavor of what SMON does. It does many other things, such as flush<br />
the monitoring statistics that show up in the DBA_TAB_MONITORING view, the flush of the SCN<br />
to timestamp mapping information found in the SMON_SCN_TIME table, <strong>and</strong> so on. The SMON<br />
process can accumulate quite a lot of CPU over time, <strong>and</strong> this should be considered normal.<br />
SMON periodically wakes up (or is woken up by the other background processes) to perform<br />
these housekeeping chores.<br />
RECO: Distributed <strong>Database</strong> Recovery<br />
RECO has a very focused job: it recovers transactions that are left in a prepared state because<br />
of a crash or loss of connection during a two-phase commit (2PC). A 2PC is a distributed protocol<br />
that allows for a modification that affects many disparate databases to be committed<br />
atomically. It attempts to close the window for distributed failure as much as possible before<br />
committing. In a 2PC between N databases, one of the databases—typically (but not always)<br />
the one the client logged into initially—will be the coordinator. This one site will ask the other<br />
N-1 sites if they are ready to commit. In effect, this one site will go to the N-1 sites <strong>and</strong> ask<br />
them to be prepared to commit. Each of the N-1 sites reports back its “prepared state” as YES<br />
or NO. If any one of the sites votes NO, the entire transaction is rolled back. If all sites vote YES,<br />
then the site coordinator broadcasts a message to make the commit permanent on each of the<br />
N-1 sites.<br />
If after some site votes YES it is prepared to commit, but before it gets the directive from<br />
the coordinator to actually commit the network fails or some other error occurs, the transaction<br />
becomes an in-doubt distributed transaction. The 2PC tries to limit the window of time<br />
in which this can occur, but cannot remove it. If we have a failure right then <strong>and</strong> there, the<br />
transaction will become the responsibility of RECO. RECO will try to contact the coordinator of<br />
the transaction to discover its outcome. Until it does that, the transaction will remain in its<br />
uncommitted state. When the transaction coordinator can be reached again, RECO will either<br />
commit the transaction or roll it back.