Sunday, December 4, 2016

Single Database High Available with Oracle Clusterware

It is interested thought to make a single (Multitenant) database High Available inside  the  Oracle Cluster environment. In most cases we make Application High available by making a resource  without predefined Cluster Resource Types. Instead every time create a full stack of settings we have now created a predefined type in which we have set the predefined attributes for the cluster resource to handle on. We have created a type which can be used multiple times by different resources with the same fingerprint.




How to do this inside the Oracle clusterware environment.
We need the following items operation:

  • Type for resources "db_12c.type"
  • Action Script [Start Stop Check Cleanup]
  • Single database
  • Oracle clusterware environment 


First define a resource type which can be used  as central point for a cluster resource.
We used the Oracle user to create the type db_12c.type:


This resource type needs an action script which will  be used by the cluster agent to manage the status and failovee. For this we used a demo script from the clusterware location GRID_HOME/crs/demo/act_db.pl. We edit and change this for our purpose which do the validation of the start - stop - check and cleanup actions. Important is the the attribute items can be used by the action script as host variables. In this case we used three of those attributes for the database type "db_12c.type:

  1. USR_DB_NAME
  2. USR_DB_SPFILE
  3. USR_ORA_HOME

With these attributes the actionscript has all the information to start - stop - check and cleanup the managed database. The action script will be placed on the $GRID_HOME/crs/public location as act_db.pl on all the nodes.

See at the end the action script that is used.

With this predefined type and action script we can register a resource in  which a single database can be managed by the Oracle clusterware.Create a single database with or without DBCA and place the passsword file and the spfile file on ASM. Keep in mind to deregister the new create database from the Oracle clusterware.Think about the audit-file location to be present on all nodes. Yes the spfile and password file not. These are centralized en will be managed by the action script.

Use the following command to register a resource with the predefined type and action action script.

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               ONLINE  ONLINE       rac12c3                  STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               OFFLINE OFFLINE      rac12c3                  STABLE
ora.FRA.dg
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               OFFLINE OFFLINE      rac12c3                  STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               ONLINE  ONLINE       rac12c3                  STABLE
ora.net1.network
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               ONLINE  ONLINE       rac12c3                  STABLE
ora.ons
               ONLINE  ONLINE       rac12c1                  STABLE
               ONLINE  ONLINE       rac12c2                  STABLE
               ONLINE  ONLINE       rac12c3                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac12c1                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac12c3                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac12c2                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac12c1                  169.254.80.229 172.1
                                                             6.100.51,STABLE
ora.asm
      1        ONLINE  ONLINE       rac12c1                  Started,STABLE
      2        ONLINE  ONLINE       rac12c2                  Started,STABLE
ora.cvu
      1        ONLINE  ONLINE       rac12c2                  STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac12c1                  Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       rac12c2                  STABLE
ora.rac12c1.vip
      1        ONLINE  ONLINE       rac12c1                  STABLE
ora.rac12c2.vip
      1        ONLINE  ONLINE       rac12c2                  STABLE
ora.rac12c3.vip
      1        ONLINE  ONLINE       rac12c3                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac12c1                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac12c3                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac12c2                  STABLE
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl add type db_12c.type -basetype cluster_resource -attr "ATTRIBUTE=CHECK_INTERVAL,TYPE=INT,DEFAULT_VALUE=15,
 ATTRIBUTE=RESTART_ATTEMPTS,TYPE=INT,DEFAULT_VALUE=2,
 ATTRIBUTE=FAILURE_THRESHOLD,TYPE=INT,DEFAULT_VALUE=2,
 ATTRIBUTE=FAILURE_INTERVAL,TYPE=INT,DEFAULT_VALUE=3600,
 ATTRIBUTE=UPTIME_THRESHOLD,TYPE=STRING,DEFAULT_VALUE=1h,
 ATTRIBUTE=HOSTING_MEMBERS, TYPE=STRING, DEFAULT_VALUE=rac12c1 rac12c2 rac12c3,
 ATTRIBUTE=ACTION_SCRIPT,TYPE=STRING,DEFAULT_VALUE=/u01/app/12.1.0/grid/crs/public/act_db.pl,
 ATTRIBUTE=DESCRIPTION,TYPE=STRING,DEFAULT_VALUE=Oracle Database Res.,
 ATTRIBUTE=PLACEMENT,TYPE=STRING,DEFAULT_VALUE=balanced,
 ATTRIBUTE=START_DEPENDENCIES,TYPE=STRING,DEFAULT_VALUE='hard(uniform:global:ora.DATA.dg,uniform:global:ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(global:ora.DATA.dg,global:ora.FRA.dg)dispersion(type:db_12c.type )',
 ATTRIBUTE=STOP_DEPENDENCIES,TYPE=STRING,DEFAULT_VALUE='hard(global:intermediate:ora.asm,global:shutdown:ora.DATA.dg,global:shutdown:ora.FRA.dg)',
 ATTRIBUTE=ADAPT_LOCAL_LSNR,TYPE=STRING,DEFAULT_VALUE=y,
 ATTRIBUTE=USR_ORA_HOME, TYPE=STRING, DEFAULT_VALUE=/u01/app/oracle/product/12.1.0/dbhome_1,
 ATTRIBUTE=USR_DB_NAME, TYPE=STRING, DEFAULT_VALUE=empty,
 ATTRIBUTE=USR_DB_SPFILE, TYPE=STRING, DEFAULT_VALUE=empty "

[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl add resource joord.inst -type db_12c.type
[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        OFFLINE OFFLINE                               STABLE
--------------------------------------------------------------------------------

Modify attributes formthe database instance specific attributes for the joord.inst cluster resource



[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl modify resource joord.inst -attr "USR_DB_NAME=joord"

[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl modify resource joord.inst -attr "USR_DB_SPFILE=+DATA/JOORD/PARAMETERFILE/spfile.310.929188933"


[oracle@rac12c1 joord] $GRID_HOME/bin/crsctl stat resource joord.inst -f

NAME=joord.inst
TYPE=db_12c.type
STATE=OFFLINE
TARGET=OFFLINE
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTIONS=
ACTION_SCRIPT=/u01/app/12.1.0/grid/crs/public/act_db.pl
ACTION_TIMEOUT=60
ACTIVE_PLACEMENT=0
ADAPT_LOCAL_LSNR=y
AGENT_FILENAME=%CRS_HOME%/bin/scriptagent
AUTO_START=restore
CARDINALITY=1
CARDINALITY_ID=0
CHECK_INTERVAL=15
CHECK_TIMEOUT=0
CLEAN_TIMEOUT=60
DEGREE=1
DELETE_TIMEOUT=60
DESCRIPTION=Oracle Database Res.
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=3600
FAILURE_THRESHOLD=2
HOSTING_MEMBERS=rac12c1 rac12c2 rac12c3
ID=joord.inst
INSTANCE_COUNT=1
INSTANCE_FAILOVER=1
INTERMEDIATE_TIMEOUT=0
LOAD=1
LOGGING_LEVEL=1
MODIFY_TIMEOUT=60
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
RELOCATE_BY_DEPENDENCY=1
RESTART_ATTEMPTS=2
SCRIPT_TIMEOUT=60
SERVER_CATEGORY=
SERVER_POOLS=
START_CONCURRENCY=0
START_DEPENDENCIES=hard(uniform:global:ora.DATA.dg,uniform:global:ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(global:ora.DATA.dg,global:ora.FRA.dg)dispersion(type:db_12c.type )
START_TIMEOUT=0
STOP_CONCURRENCY=0
STOP_DEPENDENCIES=hard(global:intermediate:ora.asm,global:shutdown:ora.DATA.dg,global:shutdown:ora.FRA.dg)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1h
USER_WORKLOAD=no
USE_STICKINESS=0
USR_DB_NAME=joord
USR_DB_SPFILE=+DATA/JOORD/PARAMETERFILE/spfile.310.929188933
USR_ORA_HOME=/u01/app/oracle/product/12.1.0/dbhome_1

For a new single database on the cluster environment we re-use the type "db_12c.type" and add the two attributes USR_DB_NAME and USR_DB_SPFILE for this database.

  1. $GRID_HOME/bin/crsctl add resource <<new>>.inst -type db_12c.type 
  2. $GRID_HOME/bin/crsctl modify resource <<new>>.inst -attr "USR_DB_NAME=<<new>>"
  3. $GRID_HOME/bin/crsctl modify resource <<new>>.inst -attr "USR_DB_SPFILE=+DATA/<<file_location>>/<<spfile_name>>"

Remove the initSID.ora from the $ORACLE_HOME/dbs directory on all the nodes.

Lets test the resource by starting it.


[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res   -w " NAME = ora.asm " -t -init

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac12c1                  STABLE
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res   -w " NAME = ora.asm " -t

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac12c1                  Started,STABLE
      2        ONLINE  ONLINE       rac12c2                  Started,STABLE
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/srvctl status asm

ASM is running on rac12c1,rac12c2

[oracle@rac12c1 joord]$ $GRID_HOME/bin/olsnodes -n

rac12c1 1
rac12c2 2
rac12c3 3


[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl start resource joord.inst

CRS-2672: Attempting to start 'joord.inst' on 'rac12c1'
CRS-2676: Start of 'joord.inst' on 'rac12c1' succeeded

relocate the cluster resource joord.inst to the rac12c2 node.

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl relocate resource joord.inst -n rac12c2

CRS-2673: Attempting to stop 'joord.inst' on 'rac12c1'
CRS-2677: Stop of 'joord.inst' on 'rac12c1' succeeded
CRS-2672: Attempting to start 'joord.inst' on 'rac12c2'
CRS-2676: Start of 'joord.inst' on 'rac12c2' succeeded


$GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  ONLINE       rac12c2                  STABLE
--------------------------------------------------------------------------------

Kill the PMON process of the joord database


[oracle@rac12c1 joord]$ ssh rac12c2 ps -ef|grep pmon

grid      4872     1  0 08:36 ?        00:00:00 asm_pmon_+ASM2
oracle   19779     1  0 09:01 ?        00:00:00 ora_pmon_joord

[oracle@rac12c1 joord]$ ssh rac12c2 kill -9 19779

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  OFFLINE      rac12c2                  STARTING
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  ONLINE       rac12c2                  STABLE
--------------------------------------------------------------------------------

Reboot node rac12c2 now and see what will happens with the cluster resource joord.inst


[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  ONLINE       rac12c2                  STOPPING
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  OFFLINE      rac12c1                  STARTING
--------------------------------------------------------------------------------

[oracle@rac12c1 joord]$ $GRID_HOME/bin/crsctl stat res -w " TYPE = db_12c.type " -t

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
joord.inst
      1        ONLINE  ONLINE       rac12c1                  STABLE
--------------------------------------------------------------------------------

Every thing work as designed. Some thing are important to get it operation:
  • Use a method to register the database to its local_listener parameter [DYNAMIC value].
  • Use STATIC value for the remote_listerner parameter
  • Manually startup and stop the database with SQLPLUS will cause errors
    • To remove the pfile/spfile entries on the system we get a nice warning that this database should be managed by a other method then with SQLPLUS
  • Use in the action script "system ("set > /file_location/crs_attribute.log");"
    With this we can see the OS variable of the source attributes 
Play and have fun with this configuration

Action Script "cat act_db.pl"

#!/usr/bin/perl
#system ("set > /tmp/crs_log.log");
$ORACLE_HOME = "$ENV{_USR_ORA_HOME}";
$ORACLE_SID = "$ENV{_USR_DB_NAME}";
$ORACLE_SPFILE = "$ENV{_USR_DB_SPFILE}";
$HOST_LSNR = "`hostname -i`";
$command = $ARGV[0];
my $addr = qx(`hostname -i`);
# Database start stop check
# Start database
if ($command eq "start" ) {
        my $filename = ".init${ORACLE_SID}.crs.ora";
        open(FH, '+>', $filename) or die $!;
        print FH "SPFILE='${ORACLE_SPFILE}'";
        close(FH);
        system ("
        export ORACLE_SID=$ORACLE_SID
        export ORACLE_HOME=$ORACLE_HOME
        export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
#       export TNS_ADMIN=$ORACLE_HOME/network/admin  # optionally set TNS_ADMIN here
        export HOST_LSNR=`hostname`
        export ORACLE_SPFILE=$ORACLE_SPFILE
        $ORACLE_HOME/bin/sqlplus /nolog << EOF
        connect / as sysdba
        startup pfile='.init${ORACLE_SID}.crs.ora' force;
        ALTER SYSTEM SET LOCAL_LISTENER='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=${HOST_LSNR})(PORT=1521))))'
         SCOPE=MEMORY;
        quit
        EOF " );
        $MYRET = check();
        exit $MYRET;
        }
# Stop database
if ($command eq "stop" ) {
        system ("
        ORACLE_SID=$ORACLE_SID
        export ORACLE_SID
        ORACLE_HOME=$ORACLE_HOME
        export ORACLE_HOME
        LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
        export LD_LIBRARY_PATH
#       export TNS_ADMIN=$ORACLE_HOME/network/admin  # optionally set TNS_ADMIN here
        $ORACLE_HOME/bin/sqlplus /nolog << EOF
        connect / as sysdba
        shutdown immediate
        quit
        EOF " );
        $MYRET = check();
        if ($MYRET eq 1) {
            exit 0;
            }
        else {
            exit 1;
        }
        }
# Check database
if ($command eq "check" ) {
        $MYRET = check();
        exit $MYRET;
}
sub check {
        my($check_proc,$process) = @_;
        $process = "ora_pmon_$ORACLE_SID";
        $check_proc = qx(ps -ae -o args | grep -w ora_pmon_$ORACLE_SID| grep -v grep);
        chomp($check_proc);
        if ($process eq $check_proc) {
                $RET=0;
        } else {
                $RET=1;
        }
        return $RET;
}
# Clean database
if ($command eq "clean") {
        my $kill_proc = qx(ps -aef | grep -w ora_pmon_$ORACLE_SID | grep -v grep | awk '{print \$2}'| xargs kill -9 $1);
        exit 0;
}