In 10g RAC the Cluster Ready Services (CRS) software is installed in its own $ORACLE_HOME. For the sake of argument let’s call this $CRS_HOME. In this directory there are a number of subdirectories including:
- $CRS_HOME/crs/init
- $CRS_HOME/css/init
- $CRS_HOME/evm/init
When the CRS daemons are running, these directories contain an assortment of files with names like:
- myserver.mydomain.com.pid
- .lock-myserver.mydomain.com
- myserver.mydomain.com.lck
When CRS is shut down cleanly, these files are managed such that CRS will start up again without manual intervention. When there is a power failure on one or more nodes, the files aren’t cleaned up. The effect of this is that the CRS daemons won’t start properly until you manually clean up the mess.
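As a rough illustration of that manual clean-up, here is a minimal Python sketch that looks in the three init directories for leftover files matching the name patterns listed above. It is not an Oracle-supplied procedure. The function names, the use of a CRS_HOME environment variable and the use of the machine's fully qualified hostname are all assumptions of mine, and I'm also assuming that removing the leftover files (with the CRS daemons stopped) is what the clean-up amounts to. Check that against your own platform and version before deleting anything.

```python
import os
import socket
from pathlib import Path

# Subdirectories of $CRS_HOME mentioned in the post.
INIT_SUBDIRS = ("crs/init", "css/init", "evm/init")


def candidate_names(hostname):
    """File names following the patterns seen in the init directories."""
    return (f"{hostname}.pid", f".lock-{hostname}", f"{hostname}.lck")


def find_leftover_files(crs_home, hostname):
    """Return the leftover pid/lock files that actually exist on disk."""
    found = []
    for subdir in INIT_SUBDIRS:
        directory = Path(crs_home) / subdir
        for name in candidate_names(hostname):
            path = directory / name
            if path.is_file():
                found.append(path)
    return found


def remove_files(paths, dry_run=True):
    """Report each file, and delete it only when dry_run is False."""
    for path in paths:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink()
            print(f"removed {path}")


if __name__ == "__main__":
    # Assumption: CRS_HOME is set in the environment and the files are
    # named after the fully qualified hostname. Adjust both as needed.
    crs_home = os.environ.get("CRS_HOME")
    if crs_home:
        leftovers = find_leftover_files(crs_home, socket.getfqdn())
        remove_files(leftovers, dry_run=True)
```

Run as is, the script only reports what it would remove. Nothing is deleted unless you change the call to `dry_run=False`, and that should only be done while the CRS daemons are stopped.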
RAC is a high availability solution, but it is crippled by a power failure. Is that a bug or a feature?
Note: I’m talking about the way CRS (10.1.0.3.0) works on Tru64. I’d be interested to know if it’s the same for CRS on other platforms. Also, I believe the startup and shutdown of CRS changed in 10.1.0.4.0, but that’s not released for Tru64 yet, and a recent message on an HP forum suggests that Oracle will skip this patch for Tru64 and wait for 10.1.0.5.0.
Fun, fun, fun…
Cheers
Tim…