Tuesday, January 31, 2012

Instance Caging - Managing Multiple Database Instances on a Single Server

Instance caging is another small but useful feature of Oracle Database 11g Release 2. Thanks to it the database resource manager is able, to limit the number of CPUs that can be used by a given instance simultaneously. This feature allows Oracle DBAs to easily manage Oracle Instance CPU consumption.

There are two typical approaches to instance caging for a server:

  • Over-provisioning:
    In this approach, the sum of the CPU limits for each instance exceeds the actual number of CPUs on the system.
  • Partitioning:In this approach, the sum of allocate CPUs for each instance is equal to the number of CPUs on the server
Most of the servers with multiple database are pollout conform the "over-provisioning" method.


The default value of the CPU_COUNT = 0 # default setting results in Dynamic CPUs Reconfiguration. In this senario the Oracle Database continuously monitors the number of CPUs reported by the operating system and uses the current count. The database dynamically detects any change in the number of available CPUs and reallocates internal resources. On most platforms, the database automatically adjusts the value of the CPU_COUNT initialization parameter to the number of available CPUs.

If CPU_COUNT is set to a value other than 0, then Oracle Database will use this count rather than the actual number of CPUs, thus disabling dynamic CPU reconfiguration.

REMARK; Setting CPU_COUNT to a value greater than the current number of CPUs results in an error. However, if CPU_COUNT is set to a value greater than the current number of CPUs in the initialization parameter file, then CPU_COUNT is capped to the current number of CPUs.




Enabling Instance Caging

To enable instance caging, do the following for each instance on the server:

  1. Set the cpu_count initialization parameter.
    This is a dynamic parameter, and can be set with the following statement:
    ALTER SYSTEM SET CPU_COUNT = 4;
    
  2. Enable the Resource Manager by assigning a resource plan, and ensure that the resource plan has CPU directives, using the mgmt_p1 through mgmt_p8 parameters.

When Resource Manager is enabled, setting CPU_COUNT limits the CPU utilization to approximately CPU_COUNT processors.

Instance caging works by throttling database instance processes that require more CPUs than the instance is entitled to. Wait event resmgr:cpu quantum" may appear in the AWR reports. If throttling becomes significant then increase CPU_COUNT or move this or other databases from the node.

Instance Caging validation

How do you validate that caging is correctly setup
select value from v$parameter where name ='cpu_count' 
and (isdefault='FALSE' or ismodified != 'FALSE');
Query return the current value if modified else nothing.
select name from v$rsrc_plan where cpu_managed='ON' and is_top_plan='TRUE';
If a row returned, then the plan is active.

Playing and monitoring

A database server with only four physical cores. for the four databases we have assigned CPU_COUNT = 3. The maximum percentage of CPU that a single database instance can consume at any point in time is its own limit divided by the sum of limits for all active database instances.

In this example is this 25% [3/(3+3+3+3)=3/12=1/4] => 25 percent.

If only two instances are active and CPU bound then a third instance, which is not caged, will be able to consume 50 percent [4/(3+3+4)=4/10] => 40 persent.

To burn all the available CPU resources, Started four sessions executing the following and use vmstat to validate

PLSQL block
DECLARE
  n NUMBER;
BEGIN
  WHILE (TRUE)
  LOOP
    n:= dbms_random.random();
  END LOOP;
END;
/

VMSTAT
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu-----
 r  b  swpd     free   buff    cache  si   so   bi    bo    in   cs   us sy id wa
 5  0     0  6037164  71568  5476620   0    0   66   198  1033  823  100  0  0  0
 5  0     0  6037164  71572  5476616   0    0    0   158  1021  801  100  0  0  0
 5  0     0  6037164  71572  5476616   0    0    0    48  1016  750  100  0  0  0

Start in another session the following PL/SQL block

PLSQL block
DECLARE
  l_sql VARCHAR2(100) := 'ALTER SYSTEM SET cpu_count = ';
BEGIN
  EXECUTE IMMEDIATE l_sql || '4';
  dbms_lock.sleep(10);
  EXECUTE IMMEDIATE l_sql || '3';
  dbms_lock.sleep(10);
  EXECUTE IMMEDIATE l_sql || '2';
  dbms_lock.sleep(10);
  EXECUTE IMMEDIATE l_sql || '1';
  dbms_lock.sleep(10);
  EXECUTE IMMEDIATE l_sql || '0';
END; 
/

VMSTAT
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu-----
 r  b  swpd     free   buff    cache  si   so   bi    bo    in   cs   us sy id wa
 5  0      0 3695176  80856  7780352   0    0    0    96  1019  783  100  0  0  0
 5  0      0 3695176  80856  7780352   0    0    0   170  1023  795  100  0  0  0
 5  0      0 3695184  80856  7780352   0    0    0   156  1028  795  100  0  0  0
 5  0      0 3695184  80860  7780348   0    0    0   120  1021  795  100  0  0  0
 5  0      0 3694928  80860  7780348   0    0    2   168  1029  810  100  0  0  0
 3  0      0 3694928  80860  7780348   0    0    0   214  1029  939   77  0 23  0
 3  0      0 3694928  80864  7780344   0    0    0   118  1020  961   75  0 25  0
 3  0      0 3694928  80864  7780344   0    0    2   152  1026  961   75  0 25  0
 3  0      0 3694596  80868  7780340   0    0    2   158  1027  975   75  0 25  0
 3  0      0 3694612  80868  7780340   0    0    0   142  1031  979   75  0 25  0
 3  0      0 3694612  80868  7780340   0    0    2   164  1024  963   75  0 25  0
 3  0      0 3694616  80872  7780336   0    0    0   358  1079  961   52  0 49  0
 2  0      0 3694616  80872  7780336   0    0    0   120  1021  909   50  0 50  0
 2  0      0 3697312  80872  7780336   0    0    0   162  1025  952   50  0 50  0
 2  0      0 3694744  80876  7780332   0    0    0   142  1027  948   50  0 50  0
 1  0      0 3694744  80876  7780332   0    0    0   120  1021  954   40  0 60  0
 1  0      0 3694748  80876  7780332   0    0    0   234  1034  953   26  0 74  0
 1  0      0 3694748  80876  7780332   0    0    0   134  1021  921   26  0 74  0
 1  0      0 3696484  80876  7780332   0    0    0   120  1020  954   26  0 74  0
 1  0      0 3696476  80880  7780328   0    0    2   196  1035  996   26  0 74  0
 5  0      0 3696476  80880  7780328   0    0    0   112  1020  643   96  0  4  0
 6  0      0 3696484  80880  7780328   0    0    0   216  1040  778  100  0  0  0
 5  0      0 3696484  80884  7780324   0    0    0   160  1021  763  100  0  0  0
 5  0      0 3696484  80884  7780324   0    0    0   112  1020  775  100  0  0  0
 5  0      0 3696468  80888  7780320   0    0    0   156  1026  785  100  0  0  0

Overprovisioning, stacking - packing as many database instance as possible onto a singel database server, will reduce the idle CPU and wast CPU, but instance caging will ensure that in worse case possible an instance can still acquire a predictable amount of CPU resource.