RAC and STATSPACK
Oracle RAC Cluster Tips by Burleson Consulting
This is an excerpt from the book
Oracle Grid & Real Application Clusters.
The following is a STATSPACK
report for the troubled instance. The sections covered will be those
that involve RAC statistics. The first section deals with the top
five timed events:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
global cache cr request                               820         154    72.50
CPU time                                                           54    25.34
global cache null to x                                478           1      .52
control file sequential read                          600           1      .52
control file parallel write                           141           1      .28
-------------------------------------------------------------
Look for events whose share of the % Total Ela Time column is
greater than or near the value for CPU time. CPU time should be the
predominant event, as it denotes processing time. If CPU time is not
the predominant event, the events that exceed CPU time's share of the
total elapsed time need to be investigated. In the above report
section, global cache cr request events dominate the report. This
indicates that transfer times from the other instances in the cluster
to this instance are excessive. The excessive transfer times could be
due to network problems or buffer cache sizing issues. After making
the network changes and adding an index, the STATSPACK wait report for
instance one looks like:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                           99    64.87
global cache null to x                              1,655          28    18.43
enqueue                                                46           8     5.12
global cache busy                                     104           7     4.73
DFS lock handle                                        38           2     1.64
The number one wait is now CPU time, followed by global cache null
to x, which indicates that the major wait has shifted from
cache-transfer-based to I/O-based, as global cache null to x
indicates a read from disk.
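The rule of thumb used above (investigate any event whose share of elapsed time exceeds that of CPU time) can be sketched as a small check. This is only an illustration: the function name and data layout are hypothetical, the figures are copied by hand from the two Top 5 listings, and the strict "greater than" test is a simplification of the "greater than or near" wording.

```python
# (event name, % of total elapsed time), copied from the two Top 5 reports.
before = [
    ("global cache cr request", 72.50),
    ("CPU time", 25.34),
    ("global cache null to x", 0.52),
    ("control file sequential read", 0.52),
    ("control file parallel write", 0.28),
]
after = [
    ("CPU time", 64.87),
    ("global cache null to x", 18.43),
    ("enqueue", 5.12),
    ("global cache busy", 4.73),
    ("DFS lock handle", 1.64),
]


def events_exceeding_cpu(events):
    """Return names of events whose elapsed-time share exceeds CPU time's share.

    Hypothetical helper: assumes exactly one entry is named "CPU time".
    """
    cpu_pct = next(pct for name, pct in events if name == "CPU time")
    return [name for name, pct in events
            if name != "CPU time" and pct > cpu_pct]


print("before:", events_exceeding_cpu(before))
print("after: ", events_exceeding_cpu(after))
```

On the first report only global cache cr request is flagged; on the second report nothing is flagged, which matches the reading given in the text.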
The next report in the STATSPACK
listing shows the workload characteristics for the instance for
which the report was generated:
Cluster Statistics for DB: MIKE  Instance: mike2  Snaps: 25 -26

Global Cache Service - Workload Characteristics
-----------------------------------------------
Ave global cache get time (ms):                        3.1
Ave global cache convert time (ms):                    3.2
Ave build time for CR block (ms):                      0.2
Ave flush time for CR block (ms):                      0.0
Ave send time for CR block (ms):                       1.0
Ave time to process CR block request (ms):             1.3
Ave receive time for CR block (ms):                   17.2
Ave pin time for current block (ms):                   0.2
Ave flush time for current block (ms):                 0.0
Ave send time for current block (ms):                  0.9
Ave time to process current block request (ms):        1.1
Ave receive time for current block (ms):               3.1
Global cache hit ratio:                                1.7
Ratio of current block defers:                         0.0
% of messages sent for buffer gets:                    1.4
% of remote buffer gets:                               1.1
Ratio of I/O for coherence:                            8.7
Ratio of local vs remote work:                         0.6
Ratio of fusion vs physical writes:                    1.0
In the above report, the statistics should be examined in relation
to the other instances in the cluster. The possible causes of any
statistics that are not in line with the other cluster instances
should be investigated. By making the network changes and index
changes described earlier, the workload was increased by a factor of
greater than three, and the response time was still less than in the
original STATSPACK report. The following is the same section from the
STATSPACK report taken after the network changes. Almost all
statistics show an increase:
Cluster Statistics for DB: MIKE  Instance: mike2  Snaps: 105 -106

Global Cache Service - Workload Characteristics
-----------------------------------------------
Ave global cache get time (ms):                        8.2
Ave global cache convert time (ms):                   16.5
Ave build time for CR block (ms):                      1.5
Ave flush time for CR block (ms):                      6.0
Ave send time for CR block (ms):                       0.9
Ave time to process CR block request (ms):             8.5
Ave receive time for CR block (ms):                   18.3
Ave pin time for current block (ms):                  13.7
Ave flush time for current block (ms):                 3.9
Ave send time for current block (ms):                  0.8
Ave time to process current block request (ms):       18.4
Ave receive time for current block (ms):              17.4
Global cache hit ratio:                                2.5
Ratio of current block defers:                         0.2
% of messages sent for buffer gets:                    2.2
% of remote buffer gets:                               1.6
Ratio of I/O for coherence:                            2.8
Ratio of local vs remote work:                         0.5
Ratio of fusion vs physical writes:                    0.0
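To make the before-and-after comparison concrete, the following sketch computes how much a few of the average times grew between the two snapshots. It is a minimal illustration, not part of STATSPACK: the dictionary keys and the function name are hypothetical, only a subset of the statistics is included, and the values are copied by hand from the two listings for instance mike2.

```python
# Average times in ms from the two workload reports
# (snaps 25-26 first, snaps 105-106 second).
before = {
    "global cache get time": 3.1,
    "global cache convert time": 3.2,
    "flush time for CR block": 0.0,
    "time to process CR block request": 1.3,
    "receive time for CR block": 17.2,
    "time to process current block request": 1.1,
    "receive time for current block": 3.1,
}
after = {
    "global cache get time": 8.2,
    "global cache convert time": 16.5,
    "flush time for CR block": 6.0,
    "time to process CR block request": 8.5,
    "receive time for CR block": 18.3,
    "time to process current block request": 18.4,
    "receive time for current block": 17.4,
}


def growth_factors(old, new):
    """Return new/old for each statistic, rounded to one decimal place.

    Statistics that were 0.0 in the old snapshot are skipped, because a
    ratio against zero is undefined.
    """
    return {name: round(new[name] / old[name], 1)
            for name in old if old[name] > 0}


factors = growth_factors(before, after)
for name, factor in sorted(factors.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {factor}x")
```

Sorting by growth factor puts the statistics that moved the most at the top, which is a convenient starting point when comparing one instance's figures against those of the other instances in the cluster.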