Link: Right Availability in RAC environment Playing with Oracleclusterware infrastructure components
How to do this inside the Oracle clusterware environment.
We need the following items operation:
- Type for resources "db_12c.type"
- Action Script [Start Stop Check Cleanup]
- Single database
- Oracle clusterware environment
First define a resource type which can be used as central point for a cluster resource.
We used the Oracle user to create the type db_12c.type:
This resource type needs an action script which will be used by the cluster agent to manage the status and failovee. For this we used a demo script from the clusterware location GRID_HOME/crs/demo/act_db.pl. We edit and change this for our purpose which do the validation of the start - stop - check and cleanup actions. Important is the the attribute items can be used by the action script as host variables. In this case we used three of those attributes for the database type "db_12c.type:
- USR_DB_NAME
- USR_DB_SPFILE
- USR_ORA_HOME
With these attributes the actionscript has all the information to start - stop - check and cleanup the managed database. The action script will be placed on the $GRID_HOME/crs/public location as act_db.pl on all the nodes.
See at the end the action script that is used.
With this predefined type and action script we can register a resource in which a single database can be managed by the Oracle clusterware.Create a single database with or without DBCA and place the passsword file and the spfile file on ASM. Keep in mind to deregister the new create database from the Oracle clusterware.Think about the audit-file location to be present on all nodes. Yes the spfile and password file not. These are centralized en will be managed by the action script.
Use the following command to register a resource with the predefined type and action action script.
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -t -------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.ASMNET1LSNR_ASM.lsnr ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE ONLINE ONLINE rac12c3 STABLE ora.DATA.dg ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE OFFLINE OFFLINE rac12c3 STABLE ora.FRA.dg ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE OFFLINE OFFLINE rac12c3 STABLE ora.LISTENER.lsnr ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE ONLINE ONLINE rac12c3 STABLE ora.net1.network ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE ONLINE ONLINE rac12c3 STABLE ora.ons ONLINE ONLINE rac12c1 STABLE ONLINE ONLINE rac12c2 STABLE ONLINE ONLINE rac12c3 STABLE -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE rac12c1 STABLE ora.LISTENER_SCAN2.lsnr 1 ONLINE ONLINE rac12c3 STABLE ora.LISTENER_SCAN3.lsnr 1 ONLINE ONLINE rac12c2 STABLE ora.MGMTLSNR 1 ONLINE ONLINE rac12c1 169.254.80.229 172.1 6.100.51,STABLE ora.asm 1 ONLINE ONLINE rac12c1 Started,STABLE 2 ONLINE ONLINE rac12c2 Started,STABLE ora.cvu 1 ONLINE ONLINE rac12c2 STABLE ora.mgmtdb 1 ONLINE ONLINE rac12c1 Open,STABLE ora.oc4j 1 ONLINE ONLINE rac12c2 STABLE ora.rac12c1.vip 1 ONLINE ONLINE rac12c1 STABLE ora.rac12c2.vip 1 ONLINE ONLINE rac12c2 STABLE ora.rac12c3.vip 1 ONLINE ONLINE rac12c3 STABLE ora.scan1.vip 1 ONLINE ONLINE rac12c1 STABLE ora.scan2.vip 1 ONLINE ONLINE rac12c3 STABLE ora.scan3.vip 1 ONLINE ONLINE rac12c2 STABLE --------------------------------------------------------------------------------
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl add type db_12c.type -basetype cluster_resource -attr "ATTRIBUTE=CHECK_INTERVAL,TYPE=INT,DEFAULT_VALUE=15, ATTRIBUTE=RESTART_ATTEMPTS,TYPE=INT,DEFAULT_VALUE=2, ATTRIBUTE=FAILURE_THRESHOLD,TYPE=INT,DEFAULT_VALUE=2, ATTRIBUTE=FAILURE_INTERVAL,TYPE=INT,DEFAULT_VALUE=3600, ATTRIBUTE=UPTIME_THRESHOLD,TYPE=STRING,DEFAULT_VALUE=1h, ATTRIBUTE=HOSTING_MEMBERS, TYPE=STRING, DEFAULT_VALUE=rac12c1 rac12c2 rac12c3, ATTRIBUTE=ACTION_SCRIPT,TYPE=STRING,DEFAULT_VALUE=/u01/app/12.1.0/grid/crs/public/act_db.pl, ATTRIBUTE=DESCRIPTION,TYPE=STRING,DEFAULT_VALUE=Oracle Database Res., ATTRIBUTE=PLACEMENT,TYPE=STRING,DEFAULT_VALUE=balanced, ATTRIBUTE=START_DEPENDENCIES,TYPE=STRING,DEFAULT_VALUE='hard(uniform:global:ora.DATA.dg,uniform:global:ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(global:ora.DATA.dg,global:ora.FRA.dg)dispersion(type:db_12c.type )', ATTRIBUTE=STOP_DEPENDENCIES,TYPE=STRING,DEFAULT_VALUE='hard(global:intermediate:ora.asm,global:shutdown:ora.DATA.dg,global:shutdown:ora.FRA.dg)', ATTRIBUTE=ADAPT_LOCAL_LSNR,TYPE=STRING,DEFAULT_VALUE=y, ATTRIBUTE=USR_ORA_HOME, TYPE=STRING, DEFAULT_VALUE=/u01/app/oracle/product/12.1.0/dbhome_1, ATTRIBUTE=USR_DB_NAME, TYPE=STRING, DEFAULT_VALUE=empty, ATTRIBUTE=USR_DB_SPFILE, TYPE=STRING, DEFAULT_VALUE=empty "
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl add resource joord.inst -type db_12c.type
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t -------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 OFFLINE OFFLINE STABLE --------------------------------------------------------------------------------
Modify attributes formthe database instance specific attributes for the joord.inst cluster resource
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl modify resource joord.inst -attr "USR_DB_NAME=joord"
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl modify resource joord.inst -attr "USR_DB_SPFILE=+DATA/JOORD/PARAMETERFILE/spfile.310.929188933"
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl stat resource joord.inst -f NAME=joord.inst TYPE=db_12c.type STATE=OFFLINE TARGET=OFFLINE ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r-- ACTIONS= ACTION_SCRIPT=/u01/app/12.1.0/grid/crs/public/act_db.pl ACTION_TIMEOUT=60 ACTIVE_PLACEMENT=0 ADAPT_LOCAL_LSNR=y AGENT_FILENAME=%CRS_HOME%/bin/scriptagent AUTO_START=restore CARDINALITY=1 CARDINALITY_ID=0 CHECK_INTERVAL=15 CHECK_TIMEOUT=0 CLEAN_TIMEOUT=60 DEGREE=1 DELETE_TIMEOUT=60 DESCRIPTION=Oracle Database Res. ENABLED=1 FAILOVER_DELAY=0 FAILURE_INTERVAL=3600 FAILURE_THRESHOLD=2 HOSTING_MEMBERS=rac12c1 rac12c2 rac12c3 ID=joord.inst INSTANCE_COUNT=1 INSTANCE_FAILOVER=1 INTERMEDIATE_TIMEOUT=0 LOAD=1 LOGGING_LEVEL=1 MODIFY_TIMEOUT=60 OFFLINE_CHECK_INTERVAL=0 PLACEMENT=balanced RELOCATE_BY_DEPENDENCY=1 RESTART_ATTEMPTS=2 SCRIPT_TIMEOUT=60 SERVER_CATEGORY= SERVER_POOLS= START_CONCURRENCY=0 START_DEPENDENCIES=hard(uniform:global:ora.DATA.dg,uniform:global:ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(global:ora.DATA.dg,global:ora.FRA.dg)dispersion(type:db_12c.type ) START_TIMEOUT=0 STOP_CONCURRENCY=0 STOP_DEPENDENCIES=hard(global:intermediate:ora.asm,global:shutdown:ora.DATA.dg,global:shutdown:ora.FRA.dg) STOP_TIMEOUT=0 UPTIME_THRESHOLD=1h USER_WORKLOAD=no USE_STICKINESS=0 USR_DB_NAME=joord USR_DB_SPFILE=+DATA/JOORD/PARAMETERFILE/spfile.310.929188933 USR_ORA_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
For a new single database on the cluster environment we re-use the type "db_12c.type" and add the two attributes USR_DB_NAME and USR_DB_SPFILE for this database.
- $GRID_HOME/bin/crsctl add resource <<new>>.inst -type db_12c.type
- $GRID_HOME/bin/crsctl modify resource <<new>>.inst -attr "USR_DB_NAME=<<new>>"
- $GRID_HOME/bin/crsctl modify resource <<new>>.inst -attr "USR_DB_SPFILE=+DATA/<<file_location>>/<<spfile_name>>"
Remove the initSID.ora from the $ORACLE_HOME/dbs directory on all the nodes.
Lets test the resource by starting it.
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " NAME = ora.asm " -t -init
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.asm 1 ONLINE ONLINE rac12c1 STABLE --------------------------------------------------------------------------------
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " NAME = ora.asm " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.asm 1 ONLINE ONLINE rac12c1 Started,STABLE 2 ONLINE ONLINE rac12c2 Started,STABLE -------------------------------------------------------------------------------- [oracle@rac12c1 joord]$ $GRID_HOME/bin/srvctl status asm
ASM is running on rac12c1,rac12c2 [oracle@rac12c1 joord]$ $GRID_HOME/bin/olsnodes -n
rac12c1 1 rac12c2 2 rac12c3 3
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl start resource joord.inst
CRS-2672: Attempting to start 'joord.inst' on 'rac12c1' CRS-2676: Start of 'joord.inst' on 'rac12c1' succeeded
relocate the cluster resource joord.inst to the rac12c2 node.
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl relocate resource joord.inst -n rac12c2
CRS-2673: Attempting to stop 'joord.inst' on 'rac12c1' CRS-2677: Stop of 'joord.inst' on 'rac12c1' succeeded CRS-2672: Attempting to start 'joord.inst' on 'rac12c2' CRS-2676: Start of 'joord.inst' on 'rac12c2' succeeded
$GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE ONLINE rac12c2 STABLE --------------------------------------------------------------------------------
Kill the PMON process of the joord database
[oracle@rac12c1 joord]$ ssh rac12c2 ps -ef|grep pmon
grid 4872 1 0 08:36 ? 00:00:00 asm_pmon_+ASM2 oracle 19779 1 0 09:01 ? 00:00:00 ora_pmon_joord
[oracle@rac12c1 joord]$ ssh rac12c2 kill -9 19779 [oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE OFFLINE rac12c2 STARTING -------------------------------------------------------------------------------- [oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE ONLINE rac12c2 STABLE --------------------------------------------------------------------------------
Reboot node rac12c2 now and see what will happens with the cluster resource joord.inst
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE ONLINE rac12c2 STOPPING --------------------------------------------------------------------------------
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE OFFLINE STABLE --------------------------------------------------------------------------------
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE OFFLINE rac12c1 STARTING --------------------------------------------------------------------------------
[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
-------------------------------------------------------------------------------- Name Target State Server State details -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- joord.inst 1 ONLINE ONLINE rac12c1 STABLE --------------------------------------------------------------------------------
Every thing work as designed. Some thing are important to get it operation:
- Use a method to register the database to its local_listener parameter [DYNAMIC value].
- Use STATIC value for the remote_listerner parameter
- Manually startup and stop the database with SQLPLUS will cause errors
- To remove the pfile/spfile entries on the system we get a nice warning that this database should be managed by a other method then with SQLPLUS
- Use in the action script "system ("set > /file_location/crs_attribute.log");"
With this we can see the OS variable of the source attributes
Play and have fun with this configuration
Action Script "cat act_db.pl"
#!/usr/bin/perl #system ("set > /tmp/crs_log.log"); $ORACLE_HOME = "$ENV{_USR_ORA_HOME}"; $ORACLE_SID = "$ENV{_USR_DB_NAME}"; $ORACLE_SPFILE = "$ENV{_USR_DB_SPFILE}"; $HOST_LSNR = "`hostname -i`"; $command = $ARGV[0]; my $addr = qx(`hostname -i`); # Database start stop check # Start database if ($command eq "start" ) { my $filename = ".init${ORACLE_SID}.crs.ora"; open(FH, '+>', $filename) or die $!; print FH "SPFILE='${ORACLE_SPFILE}'"; close(FH); system (" export ORACLE_SID=$ORACLE_SID export ORACLE_HOME=$ORACLE_HOME export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH # export TNS_ADMIN=$ORACLE_HOME/network/admin # optionally set TNS_ADMIN here export HOST_LSNR=`hostname` export ORACLE_SPFILE=$ORACLE_SPFILE $ORACLE_HOME/bin/sqlplus /nolog << EOF connect / as sysdba startup pfile='.init${ORACLE_SID}.crs.ora' force; ALTER SYSTEM SET LOCAL_LISTENER='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=${HOST_LSNR})(PORT=1521))))' SCOPE=MEMORY; quit EOF " ); $MYRET = check(); exit $MYRET; } # Stop database if ($command eq "stop" ) { system (" ORACLE_SID=$ORACLE_SID export ORACLE_SID ORACLE_HOME=$ORACLE_HOME export ORACLE_HOME LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH export LD_LIBRARY_PATH # export TNS_ADMIN=$ORACLE_HOME/network/admin # optionally set TNS_ADMIN here $ORACLE_HOME/bin/sqlplus /nolog << EOF connect / as sysdba shutdown immediate quit EOF " ); $MYRET = check(); if ($MYRET eq 1) { exit 0; } else { exit 1; } } # Check database if ($command eq "check" ) { $MYRET = check(); exit $MYRET; } sub check { my($check_proc,$process) = @_; $process = "ora_pmon_$ORACLE_SID"; $check_proc = qx(ps -ae -o args | grep -w ora_pmon_$ORACLE_SID| grep -v grep); chomp($check_proc); if ($process eq $check_proc) { $RET=0; } else { $RET=1; } return $RET; } # Clean database if ($command eq "clean") { my $kill_proc = qx(ps -aef | grep -w ora_pmon_$ORACLE_SID | grep -v grep | awk '{print \$2}'| xargs kill -9 $1); exit 0; }
It's perfect, I've tried it and it's going very well.
ReplyDeleteThe only question I have is how do we register the database with the listener scan.
This comment has been removed by the author.
DeleteUsing parameter REMOTE_LISTENE. This specifies a network name that resolves to an address or address list of Oracle Net remote listeners (that is, listeners that are not running on the same system as this instance). The address or address list is specified in the TNSNAMES.ORA file or other address repository as configured for your system. See http://shorturl.at/vFM48
Delete