Getting Dump Details from dmstool
Oracle Application Server Tips by Burleson Consulting
By using the -dump option of dmstool, we can
collect all metrics from an Oracle9iAS instance. Most
Oracle9iAS administrators use a small shell script like the one
below and schedule it to run every hour:
#!/bin/ksh
PATH=$PATH:/home/oracle/oraportal904/bin
export PATH
dmstool -dump >> dumparch.lst
You can use the -dump option to store
dmstool performance metrics for later analysis (Listing 10.9).
However, note that the -dump option does not display the metrics in
an easy-to-summarize format, so a short program is required to
gather the information and load it into a metadata table.
/DMS-Internal/Measurement [type=n/a]
createNoun.count: 136 ops
createSensor.count: 591 ops
destroyNoun.count: 4 ops
destroySensor.count: 25 ops
lastTreeNodeID.value: 0
sampleMetric.count: 5531850 ops
sensorWeight.value: 5
treeNodes.maxValue: 1635.0
treeNodes.value: 1635
/JDBC/OracleConnectionCacheImpl [type=JDBC_ConnectionSource]
CacheFreeSize.count: 18 ops
CacheFreeSize.maxValue: 5.0 connections
CacheFreeSize.minValue: 0.0 connections
CacheFreeSize.value: 2 connections
CacheGetConnection.active: 0 threads
CacheGetConnection.avg: 0.42857142857142855 msecs
CacheGetConnection.completed: 7 ops
CacheGetConnection.maxTime: 1 msecs
CacheGetConnection.minTime: 0 msecs
CacheGetConnection.time: 3 msecs
CacheHit.count: 7 ops
CacheMiss.count: 2 ops
CacheSize.count: 3 ops
CacheSize.maxValue: 5.0 connections
CacheSize.minValue: 1.0 connections
CacheSize.value: 1 connections
Listing 10.9: Detailed dmstool dump output
While this listing may be cumbersome, it is
a trivial matter to write a program to parse and summarize this
output, storing the metrics inside special iasdb tables. For
details on this technique, see the preceding section on Forms
Servers performance analysis.
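As a rough illustration of such a parsing program, the sketch below
flattens -dump output into comma-separated rows (noun path, metric,
value, unit) that a loader could then place in a table. It assumes
that each metric sits on one line in the form "name: value unit", as
in Listing 10.9. The function name parse_dms_dump is made up for this
example, and the layout of the target iasdb table is not covered here.

```sh
#!/bin/sh
# Hypothetical helper: flatten dmstool -dump output into CSV rows.
# Assumes one metric per line, as shown in Listing 10.9.
parse_dms_dump() {
    awk '
        # Noun header, e.g. /DMS-Internal/Measurement [type=n/a]
        $1 ~ /^\// { noun = $1; next }
        # Metric line, e.g. CacheSize.maxValue: 5.0 connections
        $1 ~ /:$/ && NF >= 2 {
            metric = substr($1, 1, length($1) - 1)   # drop the trailing colon
            unit = (NF >= 3) ? $3 : ""               # some metrics carry no unit
            printf "%s,%s,%s,%s\n", noun, metric, $2, unit
        }
    ' "$@"
}

# Demo on three lines taken from Listing 10.9
parse_dms_dump <<'EOF'
/DMS-Internal/Measurement [type=n/a]
createNoun.count: 136 ops
treeNodes.value: 1635
EOF
```

To process the hourly archive instead of the demo input, the call
would be along the lines of parse_dms_dump dumparch.lst > dumparch.csv,
where dumparch.csv is an arbitrary output name.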
Next let's look at using dmstool to gather
load balancing performance information on your Oracle HTTP servers.
Using dmstool to monitor and load balance Oracle HTTP Servers
You can use the dmstool command with the
-table ohs_server option to gather detailed information about the
performance of all of the components of the OHS server. Table
10.1 shows the most important OHS server metrics. Note that
the usecs metric represents microseconds (millionths of a second).
Metric | Description | Unit
handle.maxTime | Maximum time spent in module handler | usecs
handle.minTime | Minimum time spent in module handler | usecs
handle.avg | Average time spent in module handler | usecs
handle.active | Child servers currently in the handle processing phase | threads
handle.time | Total time spent in module handler | usecs
handle.completed | Number of times the handle processing phase has completed | ops
request.maxTime | Maximum time required to service an HTTP request | usecs
request.minTime | Minimum time required to service an HTTP request | usecs
request.avg | Average time required to service an HTTP request | usecs
request.active | Child servers currently in the request processing phase | threads
request.time | Total time required to service HTTP requests | usecs
request.completed | Number of HTTP requests completed | ops
connection.maxTime | Maximum time spent servicing any HTTP connection | usecs
connection.minTime | Minimum time spent servicing any HTTP connection | usecs
connection.avg | Average time spent servicing HTTP connections | usecs
connection.active | Number of connections currently open | threads
connection.time | Total time spent servicing HTTP connections | usecs
Table 10.1: Metrics from dmstool ohs_server command
Most Oracle9iAS administrators automate this
collection task by placing the dmstool command inside a shell script
and directing the output to a flat file for later analysis.
#!/bin/ksh
PATH=$PATH:/home/oracle/oraportal904/bin
export PATH
dmstool -table ohs_server >> ohs.lst
Here is a small sample of the output from
this script (Listing 10.10). The output is voluminous
because it takes a snapshot of the values every ten seconds, and
it reports both the number of operations (ops) and
timing information for all OHS components.
Sun Jul 13 21:01:45 MDT 2003
----------
ohs_server
----------
busyChildren.value: 16
childFinish.count: 24703 ops
childStart.count: 24748 ops
connection.active: 24 threads
connection.avg: 116999118 usecs
connection.completed: 58559 ops
connection.maxTime: 120275397680 usecs
connection.minTime: 1437 usecs
connection.time: 6851351400020 usecs
error.count: 138 ops
get.count: 150940 ops
handle.active: 1 threads
handle.avg: 8620 usecs
handle.completed: 247278 ops
handle.maxTime: 32791802 usecs
handle.minTime: 2 usecs
handle.time: 2131602896 usecs
internalRedirect.count: 7650 ops
lastConfigChange.value: 1057965990
numChildren.value: 44
numMods.value: 0
post.count: 2 ops
readyChildren.value: 27
request.active: 1 threads
request.avg: 15321 usecs
request.completed: 150942 ops
request.maxTime: 32792567 usecs
request.minTime: 533 usecs
request.time: 2312728152 usecs
responseSize.value: 1622607150
Host: appsvr
Name: Apache
Parent: /
Process: Apache:2534:6004
Listing 10.10: dmstool ohs_server output
The most useful part of the ohs_server
listing is the detail on OHS child server processes. The values
for the child servers are specified in the httpd.conf file by the
MaxSpareServers and MinSpareServers parameters, and OHS will create
and destroy child server processes based upon the volume of incoming
requests. It is important to know the total number of OHS child
servers and how many of them are processing HTTP
requests.
Referring to the child server lines of Listing
10.10, we see that numChildren.value is 44, indicating that there
are 44 OHS child servers. Of these, busyChildren.value is 16 and
readyChildren.value is 27, so 16 child servers are busy and 27 are
ready to accept work, which together account for 43 of the 44
children. We also see that the childStart.count is
24,748, showing the number of invocations of OHS child processes
since startup time. The most important of these metrics is
request.avg, which shows that the average time to service an HTTP
request is 15,321 microseconds, or about 15 milliseconds, with
connection.active showing 24 open connections. Taken together, these
metrics give us a good idea of the volume of transactions
experienced on each OHS server.
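Because the usecs unit is easy to misread, here is a small sketch
that recomputes an average from the raw counters and prints it in
both microseconds and milliseconds. It assumes that each .avg metric
is simply .time divided by .completed, which is consistent with the
figures in Listing 10.10. The name dms_avg is hypothetical. awk does
the arithmetic because counters such as connection.time may be larger
than some shells' integer arithmetic can hold.

```sh
#!/bin/sh
# Hypothetical helper: mean time per operation in usecs and msecs.
# $1 = <phase>.time in usecs, $2 = <phase>.completed in ops
dms_avg() {
    awk -v t="$1" -v n="$2" 'BEGIN {
        if (n == 0) { print "n/a"; exit }     # nothing completed yet
        avg = t / n
        # %d truncates to whole usecs; 1000 usecs = 1 msec
        printf "%d usecs (%.1f msecs)\n", avg, avg / 1000
    }'
}

# request.time and request.completed from Listing 10.10
dms_avg 2312728152 150942
# handle.time and handle.completed from Listing 10.10
dms_avg 2131602896 247278
```

The first call reproduces the request.avg value of 15321 usecs shown
in the listing, and the second reproduces handle.avg at 8620 usecs.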
Remember, when the demand on the OHS server
exceeds the number of child servers defined in the httpd.conf
parameter file, OHS will spawn more child processes, but it is a
good idea to determine the peak load for each OHS server and perform
load balancing from the Web Cache to ensure that no single OHS
server becomes overloaded.
Now that we see the concept, let's expand
on it and write a short script to filter through the
voluminous OHS server statistics and extract information on active
requests and the status of the OHS child processes.
extract_ohs_time_series.ksh
#!/bin/ksh
PATH=$PATH:/home/oracle/oraportal904/bin
export PATH
# Capture the ohs_server metrics, overwriting any earlier capture
dmstool -table ohs_server > ohs.lst
# Split out the metrics of interest, one file per metric
grep connection.active ohs.lst > con_active.lst
grep request.active ohs.lst > req_active.lst
grep busyChildren.value ohs.lst > busy_child.lst
grep readyChildren.value ohs.lst > readyChild.lst
grep numChildren.value ohs.lst > det.lst
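Separate per-metric files lose track of which values belong to the
same snapshot. As one possible alternative, the sketch below emits a
single comma-separated row per snapshot holding the snapshot time,
connection.active and request.avg, which is convenient for charting.
It assumes that ohs.lst holds a series of snapshots and that each one
starts with a date line, as in Listing 10.10. The function name
ohs_time_series is made up for this example.

```sh
#!/bin/sh
# Hypothetical helper: one CSV row per snapshot
# (snapshot time, connection.active, request.avg in usecs).
# Assumes every snapshot starts with a date line, as in Listing 10.10.
ohs_time_series() {
    awk '
        function emit() {
            if (ts != "") printf "%s,%s,%s\n", ts, con, avg
        }
        # A date line such as "Sun Jul 13 21:01:45 MDT 2003" opens a snapshot
        /^(Sun|Mon|Tue|Wed|Thu|Fri|Sat) / { emit(); ts = $0; con = ""; avg = ""; next }
        $1 == "connection.active:" { con = $2 }
        $1 == "request.avg:"       { avg = $2 }
        END { emit() }   # write the final snapshot
    ' "$@"
}

# Demo on a few lines taken from Listing 10.10
ohs_time_series <<'EOF'
Sun Jul 13 21:01:45 MDT 2003
connection.active: 24 threads
request.avg: 15321 usecs
EOF
```

Against a real capture the call would be ohs_time_series ohs.lst,
with the output redirected to a file of your choosing.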
After extracting and plotting the data in
these files (the MS-Excel chart wizard works well), you should
carefully monitor the volume of transactions (connection.active) and
the average response time (request.avg) to determine the threshold
where performance drops (Figure 10.10). In this example, we
see that the server becomes overwhelmed (usually due to a RAM
shortage and the resulting paging), and that performance declines
sharply after 50 active connections. Once this threshold is
identified, you can add enough OHS servers to ensure that no
server exceeds this threshold.
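A chart is the clearest way to see the knee in the curve, but a
first estimate can also be scripted. The sketch below assumes the
samples have already been reduced to comma-separated lines of the
form time,connection.active,request.avg and reports the lowest
connection count at which the average response time went over a
limit you choose. The function name first_slow_level, the limit of
100000 usecs and the sample rows are all invented for illustration.

```sh
#!/bin/sh
# Hypothetical helper: lowest connection.active value whose request.avg
# exceeds a chosen limit (in usecs). Input lines: time,connections,avg_usecs
first_slow_level() {
    limit=$1
    shift
    awk -F, -v limit="$limit" '
        # Keep the smallest connection count seen on a slow sample
        $3 + 0 > limit + 0 && (min == "" || $2 + 0 < min) { min = $2 + 0 }
        END { if (min != "") print min; else print "none" }
    ' "$@"
}

# Invented sample rows for illustration only, not measurements
first_slow_level 100000 <<'EOF'
21:01:45,24,15321
21:01:55,48,21000
21:02:05,55,140000
21:02:15,62,310000
EOF
```

With these made-up rows the function reports 55, the first level at
which the response time crossed the limit. A single slow sample can
be noise, so treat the result as a starting point for the chart
rather than a replacement for it.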
Figure 10.10: Transaction levels and OHS server average response time
Note: This type of chart is critical to OHS
load balancing. As we recall from previous chapters, the Web
Cache performs automatic load balancing between the active OHS
servers. However, the Oracle9iAS administrator can keep a
"pool" of servers in standby mode with Web Cache and OHS installed
on them. Depending on need, you can add them into the
Oracle9iAS architecture as an OHS server or a Web Cache server.
Again, most Oracle9iAS administrators will
collect this information on a scheduled basis and write programs to
gather summary information to store in iasdb extension tables.
This builds the framework for time-series analysis of this important
performance data. Next let's look at using the dmstool command
to show statistics for active requests.
This is an excerpt from "Oracle 10g Application Server
Administration Handbook" by Don Burleson and John Garmany.