Real Application Clusters
Oracle9i introduces Real Application Clusters (RAC) which has evolved from Oracle Parallel Server (OPS). The concepts, configuration and maintenance of RACs are covered in extreme detail in the Oracle9i manuals and well beyond the scope of this article. Here I will be focusing on those points relevant to the "Oracle9i Database: New Features For Administrators" OCP exam.
A real application cluster consists of a cluster of several servers (nodes) with a set of shared disks. Each node has a uniquely named instance which points to a set of common files. Since the common files are managed by multiple nodes they must be stored within raw partitions. The servers are connected to each other by an interconnect that allows them to run the Cluster Group Services (CGS), the Global Cache Service, and the Global Enqueue Service. Each node can contain several processors in either an SMP or NUMA configuration.
The CGS uses a node monitor, part of the vendor-provided Cluster Management Software (CMS), to monitor the health of the processes in the cluster and to control the membership of instances in Real Application Clusters. Which instances run on which nodes is determined by a node-to-instance mapping, stored in text files on UNIX or in the registry on Windows.
This type of configuration scales well, since new nodes can be added to the cluster to increase capacity. Along with performance comes higher availability, since the failure of a single node does not affect the other nodes in the cluster.
If two nodes require the same block for query or update, the block must be transferred from the cache of one node to the other. In Oracle7 this involved the block being written to disk so it could be queried by the second node. This process of disk writes to pass cached blocks was called 'pinging' and significantly reduced the performance of the cluster. Oracle8i introduced Consistent Read Cache Fusion, which allowed cached blocks to be passed between nodes over the interconnect to satisfy queries. If a block modified by node 1 was needed for modification on node 2, it was still necessary to write the block to disk for the transfer to take place.
In Oracle9i Cache Fusion allows both unmodified and modified blocks to be passed directly over the interconnect without the need for pinging. Since data transfer over an interconnect is significantly faster than disk writes the performance and scalability of the cluster is significantly improved.
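The amount of Cache Fusion traffic crossing the interconnect can be gauged from the global cache statistics. As an illustrative query (statistic names vary slightly between releases, so the LIKE pattern may need adjusting), something like the following can be run against V$SYSSTAT:

```sql
-- Approximate check of Cache Fusion block traffic on the current instance.
-- Statistic names are release-dependent; adjust the pattern if necessary.
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'global cache%blocks received';
```

A high count of current blocks received indicates write/write block transfers that pinging would previously have forced through disk.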
Information about shared resources (blocks) is stored in the Global Resource Directory, which is maintained by the Global Cache and Global Enqueue Services using messages. These messages ensure that the current block image can be located and that copies of the blocks are retained by the relevant instances for recovery purposes. They also include sequence information to identify the order of changes made to a block since it was read from disk.
Resources may be held in one of three modes:
- Null - The resource is not held in any mode.
- Shared - This mode is required if the content of the block is to be read to satisfy a query. Multiple instances can hold a resource in this mode simultaneously.
- Exclusive - This mode is required if the instance needs to modify the contents of the block. All other instances must hold the resource with a mode of Null.
Resources are also categorised with one of two roles:
- Local - Resources held with this role can be altered without reference to the Global Cache Service or other instances. This instance is capable of serving the changed block to other instances and writing it to disk. The changes made to the block only exist in the memory of the current instance.
- Global - Resources held with this role cannot be altered without reference to the Global Cache Service or other instances. Several instances can have differing copies of the same dirty block covered by this role. This is possible because an instance can serve a dirty block to several instances without associated disk writes. The Global Cache Service must coordinate further modifications to the block to prevent any discrepancies. This may involve the resource being released by another instance and recopied to the current instance.
Global copies of dirty blocks are called Past Images (PIs). These must be stored until they, or later copies of the same block, have been written to disk during a checkpoint. At that point the Global Cache Service informs the instance that the PI is no longer needed. When an instance writes or discards a block covered by a global resource, it places a Block Written Record (BWR) in its redo log buffer. This prevents redo generated prior to that point being applied during recovery, which makes recovery more efficient. Since the BWR is not essential for recovery, the log buffer is not flushed to disk when it is written.
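Past images can be seen in the buffer cache views. As an illustrative example (assuming the standard GV$BH view, whose STATUS column reports values such as 'xcur', 'scur', 'cr' and 'pi'), the following counts buffer states across the cluster:

```sql
-- Count buffer states on each instance; a status of 'pi' is a past image.
SELECT inst_id, status, COUNT(*) AS buffers
FROM   gv$bh
GROUP  BY inst_id, status
ORDER  BY inst_id, status;
```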
Shared Server Parameter Files
RACs can be configured to share a single Server Parameter File (SPFILE). This negates the need to maintain multiple parameter files. Each node in the cluster has an init.ora file that contains the same single-line reference to the shared SPFILE. This SPFILE is stored on a shared raw partition so all nodes can access it. In a two node cluster the nodes may be configured as follows.
Node 1
  ORACLE_SID     : TSH1
  ORACLE_HOME    : /local-node1/app/oracle/product/901
  PFILE Location : ORACLE_HOME/dbs/initTSH1.ora
  PFILE Content  : SPFILE=/shared-disk1/spfile

Node 2
  ORACLE_SID     : TSH2
  ORACLE_HOME    : /local-node2/app/oracle/product/901
  PFILE Location : ORACLE_HOME/dbs/initTSH2.ora
  PFILE Content  : SPFILE=/shared-disk1/spfile
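Assuming the paths shown above, the shared SPFILE itself would typically be created once from a PFILE containing the merged parameters for all instances. The source PFILE name here is illustrative only:

```sql
-- Run once from either node; the target path is the shared raw partition.
-- initTSH.ora is a hypothetical merged PFILE containing all instances' parameters.
CREATE SPFILE='/shared-disk1/spfile'
FROM PFILE='/local-node1/app/oracle/product/901/dbs/initTSH.ora';
```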
Since the SPFILE is shared between several instances, any node-specific parameters are prefixed with the SID.
TSH1.instance_name = TSH1
TSH2.instance_name = TSH2
TSH1.db_cache_size = 16000000
TSH2.db_cache_size = 32000000
etc...
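The instance-specific settings held in the SPFILE can be checked via the V$SPPARAMETER view, which includes a SID column. For example:

```sql
-- Shows which SID each stored value applies to ('*' means all instances).
SELECT sid, name, value
FROM   v$spparameter
WHERE  name = 'db_cache_size';
```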
Node-specific parameters can be set using the SID clause of the ALTER SYSTEM command.

ALTER SYSTEM SET OPEN_CURSORS=500 SID='TSH1' SCOPE=SPFILE;
A parameter value can be reset to its default, here for all instances at once.

ALTER SYSTEM RESET OPEN_CURSORS SID='*' SCOPE=SPFILE;
Hope this helps. Regards Tim...