There are two typical approaches to instance caging for a server:
- Over-provisioning:
In this approach, the sum of the CPU limits for each instance exceeds the actual number of CPUs on the system. - Partitioning:In this approach, the sum of allocate CPUs for each instance is equal to the number of CPUs on the server
The default value of the CPU_COUNT = 0 # default setting results in Dynamic CPUs Reconfiguration. In this senario the Oracle Database continuously monitors the number of CPUs reported by the operating system and uses the current count. The database dynamically detects any change in the number of available CPUs and reallocates internal resources. On most platforms, the database automatically adjusts the value of the CPU_COUNT initialization parameter to the number of available CPUs.
If CPU_COUNT is set to a value other than 0, then Oracle Database will use this count rather than the actual number of CPUs, thus disabling dynamic CPU reconfiguration.
REMARK; Setting CPU_COUNT to a value greater than the current number of CPUs results in an error. However, if CPU_COUNT is set to a value greater than the current number of CPUs in the initialization parameter file, then CPU_COUNT is capped to the current number of CPUs.
Enabling Instance Caging
To enable instance caging, do the following for each instance on the server:- Set the cpu_count initialization parameter.
This is a dynamic parameter, and can be set with the following statement:
ALTER SYSTEM SET CPU_COUNT = 4;
- Enable the Resource Manager by assigning a resource plan, and ensure that the resource plan has CPU directives, using the mgmt_p1 through mgmt_p8 parameters.
When Resource Manager is enabled, setting CPU_COUNT limits the CPU utilization to approximately CPU_COUNT processors.
Instance caging works by throttling database instance processes that require more CPUs than the instance is entitled to. Wait event resmgr:cpu quantum" may appear in the AWR reports. If throttling becomes significant then increase CPU_COUNT or move this or other databases from the node.
Instance Caging validation
How do you validate that caging is correctly setupselect value from v$parameter where name ='cpu_count' and (isdefault='FALSE' or ismodified != 'FALSE');Query return the current value if modified else nothing.
select name from v$rsrc_plan where cpu_managed='ON' and is_top_plan='TRUE';If a row returned, then the plan is active.
Playing and monitoring
A database server with only four physical cores. for the four databases we have assigned CPU_COUNT = 3. The maximum percentage of CPU that a single database instance can consume at any point in time is its own limit divided by the sum of limits for all active database instances.In this example is this 25% [3/(3+3+3+3)=3/12=1/4] => 25 percent.
If only two instances are active and CPU bound then a third instance, which is not caged, will be able to consume 50 percent [4/(3+3+4)=4/10] => 40 persent.
To burn all the available CPU resources, Started four sessions executing the following and use vmstat to validate
PLSQL block
DECLARE n NUMBER; BEGIN WHILE (TRUE) LOOP n:= dbms_random.random(); END LOOP; END; /
VMSTAT
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa 5 0 0 6037164 71568 5476620 0 0 66 198 1033 823 100 0 0 0 5 0 0 6037164 71572 5476616 0 0 0 158 1021 801 100 0 0 0 5 0 0 6037164 71572 5476616 0 0 0 48 1016 750 100 0 0 0
Start in another session the following PL/SQL block
PLSQL block
DECLARE l_sql VARCHAR2(100) := 'ALTER SYSTEM SET cpu_count = '; BEGIN EXECUTE IMMEDIATE l_sql || '4'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '3'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '2'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '1'; dbms_lock.sleep(10); EXECUTE IMMEDIATE l_sql || '0'; END; /
VMSTAT
procs ------------memory------------ --swap-- ----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa 5 0 0 3695176 80856 7780352 0 0 0 96 1019 783 100 0 0 0 5 0 0 3695176 80856 7780352 0 0 0 170 1023 795 100 0 0 0 5 0 0 3695184 80856 7780352 0 0 0 156 1028 795 100 0 0 0 5 0 0 3695184 80860 7780348 0 0 0 120 1021 795 100 0 0 0 5 0 0 3694928 80860 7780348 0 0 2 168 1029 810 100 0 0 0 3 0 0 3694928 80860 7780348 0 0 0 214 1029 939 77 0 23 0 3 0 0 3694928 80864 7780344 0 0 0 118 1020 961 75 0 25 0 3 0 0 3694928 80864 7780344 0 0 2 152 1026 961 75 0 25 0 3 0 0 3694596 80868 7780340 0 0 2 158 1027 975 75 0 25 0 3 0 0 3694612 80868 7780340 0 0 0 142 1031 979 75 0 25 0 3 0 0 3694612 80868 7780340 0 0 2 164 1024 963 75 0 25 0 3 0 0 3694616 80872 7780336 0 0 0 358 1079 961 52 0 49 0 2 0 0 3694616 80872 7780336 0 0 0 120 1021 909 50 0 50 0 2 0 0 3697312 80872 7780336 0 0 0 162 1025 952 50 0 50 0 2 0 0 3694744 80876 7780332 0 0 0 142 1027 948 50 0 50 0 1 0 0 3694744 80876 7780332 0 0 0 120 1021 954 40 0 60 0 1 0 0 3694748 80876 7780332 0 0 0 234 1034 953 26 0 74 0 1 0 0 3694748 80876 7780332 0 0 0 134 1021 921 26 0 74 0 1 0 0 3696484 80876 7780332 0 0 0 120 1020 954 26 0 74 0 1 0 0 3696476 80880 7780328 0 0 2 196 1035 996 26 0 74 0 5 0 0 3696476 80880 7780328 0 0 0 112 1020 643 96 0 4 0 6 0 0 3696484 80880 7780328 0 0 0 216 1040 778 100 0 0 0 5 0 0 3696484 80884 7780324 0 0 0 160 1021 763 100 0 0 0 5 0 0 3696484 80884 7780324 0 0 0 112 1020 775 100 0 0 0 5 0 0 3696468 80888 7780320 0 0 0 156 1026 785 100 0 0 0
Overprovisioning, stacking - packing as many database instance as possible onto a singel database server, will reduce the idle CPU and wast CPU, but instance caging will ensure that in worse case possible an instance can still acquire a predictable amount of CPU resource.
Thanks a lot very much for the high quality and results-oriented help. I won’t think twice to endorse your blog post to anybody who wants and needs support about this area.
ReplyDeleteBest CRM System