Recovery Enhancements In Oracle9i
Minimal I/O Recovery
The speed of crash recovery depends on the rate at which data is read from the redo logs and on the time it takes to apply the changes to the datafiles for blocks that were dirty in the buffer cache at the time of instance failure. Processing redo log data for blocks that were not dirty in the buffer cache is unnecessary, so Oracle9i no longer does it.
To avoid unnecessary processing, the recovery phase is split into two passes. The first pass reads the logs sequentially from the last checkpoint position to identify which data blocks contain unsaved changes and need to be recovered. This information is stored in the PGA and used when performing the second pass. The second pass reads only the blocks that the first pass identified as requiring recovery. These blocks are processed to complete recovery of the datafiles.
Since reading the logfiles is significantly quicker than processing the blocks, reading the logs twice and skipping the blocks that do not need recovery takes less time than processing every block, which results in quicker recovery.
Fast-Start Time-Based Recovery
Rather than waiting for specific events such as log switches to trigger checkpoints, Oracle9i can be instructed to use the fast-start checkpointing architecture, which allows the DBWn processes to periodically write dirty buffers to disk and incrementally advance the checkpoint position. This results in a reduced Mean Time To Recovery (MTTR) and a reduction in the I/O spikes associated with log switches.
The FAST_START_MTTR_TARGET initialization parameter is used to specify the number of seconds crash recovery should take. Oracle uses this target time to configure the FAST_START_IO_TARGET and LOG_CHECKPOINT_INTERVAL parameters to reduce crash recovery time to a level as close to the target time as possible. The FAST_START_IO_TARGET, LOG_CHECKPOINT_INTERVAL and LOG_CHECKPOINT_TIMEOUT parameters should not be set as they may interfere with the process.
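As a quick way to check whether any of the older checkpoint parameters have been set explicitly, a query along the following lines can be run from SQL*Plus. This is only a sketch: it assumes the session has access to the V$PARAMETER view, and the list of parameter names is simply the set discussed in this article.

```sql
-- Sketch only: list the checkpoint-related parameters discussed here.
-- ISDEFAULT = 'FALSE' indicates the parameter has been set explicitly.
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('fast_start_mttr_target',
                'fast_start_io_target',
                'log_checkpoint_interval',
                'log_checkpoint_timeout');
```

Any row other than FAST_START_MTTR_TARGET showing ISDEFAULT as FALSE is a candidate for review.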
The maximum value for FAST_START_MTTR_TARGET is 3600 (1 hour), with values exceeding this being rounded down. There is no minimum value, but very low values may not be achievable because the target number of dirty buffers has a lower limit of 1000. The time taken to mount the database must be added to this.
If the value is set too low, the effective MTTR target will be the best MTTR target the system can achieve. If the value is set too high, the effective MTTR is estimated based on the whole buffer cache being dirty. The ESTIMATED_MTTR column in the V$INSTANCE_RECOVERY view can be used to view the effective MTTR. If the parameter setting, shown by the TARGET_MTTR column, is consistently different from the effective MTTR, it should be adjusted, since this means it is set to an unrealistic value.
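The comparison described above can be made with a simple query against the view. The sketch below assumes a SQL*Plus session with access to V$INSTANCE_RECOVERY and uses only the two columns mentioned in the text.

```sql
-- Compare the MTTR setting in effect with the current estimate (both in seconds).
SELECT target_mttr,
       estimated_mttr
FROM   v$instance_recovery;
```

Running this at intervals under a typical workload gives a better picture than a single reading, since the estimate varies with the number of dirty buffers.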
Remember that the extra checkpointing required to reduce the crash recovery time may compromise system performance. A balance must be reached between general system performance and crash recovery time. Set FAST_START_MTTR_TARGET to zero to disable fast-start checkpointing.
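A minimal sketch of setting and disabling the target is shown below. The value of 300 seconds is purely illustrative and is not a recommendation. No SCOPE clause is given, so the change follows the default scope for the instance.

```sql
-- Illustrative value only: ask Oracle to aim for a 5 minute crash recovery.
ALTER SYSTEM SET FAST_START_MTTR_TARGET = 300;

-- Disable fast-start checkpointing.
ALTER SYSTEM SET FAST_START_MTTR_TARGET = 0;
```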
The FAST_START_IO_TARGET initialization parameter is used to specify the maximum number of dirty blocks in the buffer cache. Its use has been deprecated in favour of FAST_START_MTTR_TARGET. In addition, the DB_BLOCK_MAX_DIRTY_TARGET parameter has been removed.
Flash Freeze
Flash Freeze is a diagnostic tool that captures a diagnostic snapshot of the entire system the first time a failure occurs. The database can be restarted on another node in the cluster, or the frozen instance can be deregistered from a Real Application Cluster, leaving the frozen instance available for offline diagnostics. The gathered data can then be used to find the problem that caused the failure and prevent future failures.
Media Recovery Enhancements
In Oracle8i, media recovery would fail if a corrupt redo log block was encountered, requiring point-in-time recovery to be restarted to an SCN preceding the corrupt block. Oracle9i has a number of enhancements including:
- Failed recoveries leave the database in a consistent state that can be opened as read-only or with resetlogs.
- The DBA can instruct recovery to mark blocks as corrupt and ignore the resultant inconsistency so recovery can continue beyond the corruption.
- The DBA can invoke a Trial Recovery to check how widespread the inconsistencies caused by skipping blocks will be when recovery is complete. This is done by adding the TEST keyword to the end of the recovery command.
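The last two options can be sketched as SQL*Plus commands as follows. The tablespace name `users` and the corruption count are assumptions made for illustration, and the exact form of the command depends on the type of recovery being performed.

```sql
-- Trial recovery: apply redo in memory only to see how far recovery can get.
RECOVER DATABASE TEST;

-- Trial recovery of a single tablespace (hypothetical tablespace name).
RECOVER TABLESPACE users TEST;

-- Real recovery that tolerates one corrupt block, marking it as corrupt
-- so recovery can continue past it.
RECOVER DATABASE ALLOW 1 CORRUPTION;
```

Running the trial form first shows the likely extent of the damage before committing to a recovery that skips corrupt blocks.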
Hope this helps. Regards Tim...