Upgrading an Entire Server
Oracle Application Server Tips by Burleson Consulting
On mission-critical databases where speed is
a primary concern, adding processors may not be the best
solution. Oracle tuning professionals will sometimes recommend
upgrading to a faster server architecture instead. For example, many
of the new 64-bit processors can handle Oracle9iAS transactions up to
an order of magnitude faster than their 32-bit predecessors. In the
IBM AIX environment, the IBM SP2 processors are 32-bit. IBM's next
generation of processors uses 64-bit technology, and these systems
can process information far faster than their 32-bit ancestors.
When making recommendations for upgrades of
entire servers, many Oracle9iAS tuning professionals use the analogy
of the performance of a 16-bit PC compared to the performance of a
32-bit PC. In general, moving to a faster CPU architecture can greatly
improve the speed of Oracle applications, and many vendors such as
IBM will allow you to actually load your production system onto one
of the new processors for speed benchmarks prior to purchasing the
new servers.
Adding Additional CPU Processors
Most symmetric multiprocessor (SMP)
architectures for Oracle9iAS servers are expandable, and additional
processors can be added at any time. Once added, the processor
architecture will immediately make the new CPUs available to the
Oracle database.
The problem with adding processors is the
high cost, which can often exceed the cost of a whole new server.
Adding processors to an existing server
can commonly cost over $100,000, and most managers require a
detailed cost-benefit analysis when making the decision to buy more
CPUs. Essentially, the cost-benefit analysis compares the lost
productivity of the end users (due to the response time latency)
with the additional costs of the processors.
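The comparison described above can be sketched as simple arithmetic. In this minimal Python sketch, every input (user count, seconds lost per user per day, hourly labor cost, working days per year) is a hypothetical placeholder rather than data from the text; only the $100,000 processor cost echoes the figure mentioned above. It also assumes, optimistically, that the upgrade removes all of the lost time.

```python
def annual_latency_cost(users, seconds_lost_per_user_per_day,
                        hourly_cost, working_days=250):
    """Estimate the yearly cost of end-user time lost to slow response."""
    hours_lost = users * seconds_lost_per_user_per_day / 3600 * working_days
    return hours_lost * hourly_cost


def payback_years(upgrade_cost, annual_savings):
    """Years needed for recovered productivity to cover the upgrade cost."""
    if annual_savings <= 0:
        return float("inf")
    return upgrade_cost / annual_savings


# Hypothetical inputs: 200 users, each losing 180 seconds a day, at $40/hour.
lost = annual_latency_cost(users=200, seconds_lost_per_user_per_day=180,
                           hourly_cost=40.0)
print(f"Estimated annual cost of latency: ${lost:,.0f}")
print(f"Payback for a $100,000 upgrade: {payback_years(100_000, lost):.2f} years")
```

With these made-up inputs the lost time costs about as much per year as the processors, so the purchase would pay for itself in roughly one year; real inputs would come from measured response times and actual staffing costs.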
Another problem with justifying additional
processors is the sporadic nature of CPU overloads. Oracle9iAS
servers often experience "transient" overloads, and there will be
times when the processors are heavily burdened and other times when
the processors are not at full utilization. Before recommending a
processor upgrade, most Oracle9iAS administrators will perform a
load-balancing analysis to ensure that any batch-oriented tasks are
presented to the server at non-peak hours.
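One way to picture such a load-balancing analysis is to average CPU utilization by hour of day and see which hours are consistently busy. The following Python sketch is only an illustration: the sample data, the (hour, busy percentage) layout, and the 80 percent threshold are all assumptions, not values from the text.

```python
from collections import defaultdict


def average_busy_by_hour(samples):
    """samples: iterable of (hour_of_day, cpu_busy_pct); returns {hour: mean}."""
    totals = defaultdict(lambda: [0.0, 0])
    for hour, busy_pct in samples:
        totals[hour][0] += busy_pct
        totals[hour][1] += 1
    return {hour: total / count
            for hour, (total, count) in sorted(totals.items())}


def peak_hours(samples, threshold_pct=80.0):
    """Hours whose average CPU utilization meets the (assumed) threshold."""
    averages = average_busy_by_hour(samples)
    return [hour for hour, avg in averages.items() if avg >= threshold_pct]


# Hypothetical samples: (hour of day, percent CPU busy).
samples = [(2, 15), (2, 25), (10, 85), (10, 95), (14, 70), (14, 60), (23, 10)]
print(peak_hours(samples))  # prints [10]
```

Batch-oriented tasks would then be scheduled outside the hours this returns, and a processor upgrade considered only if the peaks remain after rescheduling.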
Next, let's look at some of the tools that
we can use to monitor server usage.
Overview of the vmstat Utility
The vmstat utility is the most common UNIX
monitoring utility, and it is found in the majority of UNIX dialects
(Note that vmstat is called osview on the IRIX dialect of UNIX). The
vmstat utility displays various server values over a given time
interval. The vmstat utility is invoked from the UNIX prompt, and it
has several numeric parameters. The first numeric argument to vmstat
represents the time interval (expressed in seconds) between server
samples. The second argument specifies the number of samples to be
reported. In the example that follows, vmstat is executed to take
five samples at 2-second intervals:
root> vmstat 2 5
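The same invocation can be driven from a script when the samples need to be captured rather than read on screen. This is a minimal Python sketch, assuming a vmstat binary is on the PATH and accepts the interval and count arguments described above; the function names are illustrative, not part of any Oracle or UNIX tool.

```python
import subprocess


def vmstat_command(interval_seconds, sample_count):
    """Build the argument list: interval first, then number of samples."""
    if interval_seconds < 1 or sample_count < 1:
        raise ValueError("interval and sample count must be positive")
    return ["vmstat", str(interval_seconds), str(sample_count)]


def run_vmstat(interval_seconds, sample_count):
    """Run vmstat and return its output lines.

    Blocks until vmstat has printed all of the requested samples.
    """
    result = subprocess.run(vmstat_command(interval_seconds, sample_count),
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Example (not run here): five samples at 2-second intervals.
# for line in run_vmstat(2, 5):
#     print(line)
```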
Almost all UNIX servers have some version of
vmstat. Before we look at the details for this powerful utility,
let's explore the differences that you are likely to see.
Dialect Differences in vmstat
Because each hardware vendor writes their
own vmstat utility, there are significant differences in vmstat
output. The vmstat output is different depending on the dialect of
UNIX, but each dialect contains the important server metrics.
Because vendors have written their own
versions of the vmstat utility, it can be useful to consult the
online UNIX documentation to see the display differences. In UNIX,
you can see your documentation by invoking the man pages. The term
man is short for manual, and you can see the documentation for your
particular implementation of vmstat by entering man vmstat from your
UNIX prompt.
Below is a sample of vmstat output for the
four most popular dialects of UNIX.
vmstat for Solaris
In the Sun Solaris operating environment,
the output from vmstat will appear like this:
root> vmstat 2 5
 procs    memory         page         disk       faults      cpu
 r b w    swap   free re  mf pi po ... s6 -- --  in  sy cs us sy  id
 0 0 0 2949744 988800  0   4  0  0 ...  0  0  0 148 200 41  0  0  99
 0 0 0 2874808 938960 27 247  0  1 ...  0  0  0 196 434 64  1  2  98
 0 0 0 2874808 938960  0   0  0  0 ...  0  0  0 134  55 32  0  0 100
 0 0 0 2874808 938960  0   0  0  0 ...  0  0  0 143 114 39  0  0 100
 0 0 0 2874808 938960  0   0  0  0 ...  0  0  0 151  86 38  0  0 100
vmstat for Linux
In the Linux operating environment, the
output from vmstat will appear like this:
root> vmstat 2 5
 procs          memory            swap     io     system    cpu
 r b w  swpd   free   buff cache  si ...  bi  bo   in  cs us sy  id
 1 0 0   140  90372 726988 26228   0 ...   0   0   14   7  0  0   4
 0 0 0   140  90372 726988 26228   0 ...   0   2  103  11  0  0 100
 0 0 0   140  90372 726988 26228   0 ...   0   5  106  10  0  0 100
 0 0 0   140  90372 726988 26228   0 ...   0   0  101  11  0  0 100
 0 0 0   140  90372 726988 26228   0 ...   0   0  102  11  0  0 100
vmstat for AIX
In the IBM AIX operating environment, the
output from vmstat will appear like this:
root> vmstat 2 5
 kthr    memory          page                faults          cpu
 ----- ---------- ------------------- --------------- -----------
  r  b    avm fre re pi po  fr  sr cy   in    sy   cs us sy id wa
  7  5 220214 141  0  0  0  42  53  0 1724 12381 2206 19 46 28  7
  9  5 220933 195  0  0  1 216 290  0 1952 46118 2712 27 55 13  5
 13  5 220646 452  0  0  1  33  54  0 2130 86185 3014 30 59  8  3
  6  5 220228 672  0  0  0   0   0  0 1929 25068 2485 25 49 16 10
vmstat for HP/UX
In the Hewlett Packard HP/UX operating
environment, the output from vmstat will appear like this:
root> vmstat 2 5
 r b w   avm   free re at pi po ...   in   sy  cs us sy  id
 1 0 0 70635 472855 10  5  2  0 ... 2024 2859 398  4  1  96
 1 0 0 74985 472819  9  0  1  0 ... 1864 1820 322  0  0 100
 0 0 0 83056 472819  2  0  0  0 ... 1846 1684 302  0  0 100
 0 0 0 81390 472819  0  0  0  0 ... 1847 1571 288  0  0 100
 0 0 0 78788 472819  0  0  0  0 ... 1852 1608 291  0  0 100
Now that we have seen the different display
options for each dialect of vmstat, let's take a look at the data
items in vmstat and understand the common values that we can capture
in STATSPACK tables.
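Because the column layout changes from dialect to dialect, a script that collects these values is safer reading columns by name than by position. The following Python sketch is one possible approach under stated assumptions: it treats the line beginning with "r b" as the column header, accepts only all-numeric rows with the same number of fields, and takes the CPU group to start at the last "us" column (the faults group has its own "sy" column, so the name alone is ambiguous). The function names are hypothetical, and the test row is the first AIX sample shown above.

```python
def parse_vmstat(text):
    """Return a list of samples; each sample is a list of (column, value) pairs."""
    header = None
    samples = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens[:2] == ["r", "b"]:
            header = tokens  # column-name line, present in every dialect shown
        elif header and len(tokens) == len(header) and all(
                t.isdigit() for t in tokens):
            samples.append(list(zip(header, map(int, tokens))))
    return samples


def cpu_metrics(sample):
    """Return the trailing CPU columns (us, sy, id and, on AIX, wa) as a dict."""
    names = [name for name, _ in sample]
    # Search from the right so the faults "sy" column is never picked up.
    start = len(names) - 1 - names[::-1].index("us")
    return dict(sample[start:])


aix_text = """\
 kthr    memory          page                faults          cpu
 ----- ---------- ------------------- --------------- -----------
  r  b    avm fre re pi po  fr  sr cy   in    sy   cs us sy id wa
  7  5 220214 141  0  0  0  42  53  0 1724 12381 2206 19 46 28  7
"""
first = parse_vmstat(aix_text)[0]
print(first[0])            # ('r', 7)
print(cpu_metrics(first))  # {'us': 19, 'sy': 46, 'id': 28, 'wa': 7}
```

Keeping each sample as ordered pairs rather than a plain dictionary preserves both "sy" columns, which a dictionary keyed by name would silently merge.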
What to look for in vmstat output
As you can see, each dialect of vmstat
reports different information about the current status of the
server. Despite these dialect differences, there are only a small
number of metrics that are important for server monitoring. These
metrics include:
- r (runqueue) - The runqueue value shows the number of tasks
  executing and waiting for CPU resources. When this number exceeds
  the number of CPUs on the server, a CPU bottleneck exists, and some
  tasks are waiting for execution.
- pi (page in) - A page-in operation occurs when the server is
  experiencing a shortage of RAM. While all virtual memory servers
  will page out to the swap disk, page-in operations show that the
  server has exceeded the available RAM. Any nonzero value for pi
  indicates excessive activity as memory contents are read back in
  from the swap disk.
- us (user CPU) - This is the percentage of CPU that is servicing
  user tasks.
- sy (system CPU) - This is the percentage of CPU being used to
  service system tasks.
- id (idle) - This is the percentage of CPU that is idle.
- wa (wait, IBM AIX only) - This shows the percentage of CPU that is
  waiting on external operations such as disk I/O.
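The two rules of thumb in this list (a run queue larger than the number of CPUs, and any nonzero page-in value) translate directly into a small check. This Python sketch is illustrative only: the function name and message wording are invented here, and the CPU count of 4 in the example is a hypothetical value, since the text does not say how many CPUs the sample servers had.

```python
def vmstat_warnings(runqueue, page_in, cpu_count):
    """Apply the run-queue and page-in rules of thumb to one vmstat sample."""
    warnings = []
    if runqueue > cpu_count:
        warnings.append(
            f"CPU bottleneck: run queue {runqueue} exceeds {cpu_count} CPUs")
    if page_in > 0:
        warnings.append(f"RAM shortage: page-in value is {page_in}")
    return warnings


# First AIX sample row above (r=7, pi=0) on an assumed 4-CPU server.
for message in vmstat_warnings(runqueue=7, page_in=0, cpu_count=4):
    print(message)  # CPU bottleneck: run queue 7 exceeds 4 CPUs
```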
Note that all of the CPU metrics are
expressed as percentages. Hence, the CPU values (us + sy + id
+ wa) should sum to 100, give or take a point of rounding. Now that
we have a high-level understanding of the important vmstat data,
let's look into some methods for using vmstat to identify server
problems.
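The percentage rule doubles as a sanity check on captured samples. A minimal Python sketch follows; the one-point tolerance is an assumption made here to allow for rounding, and the wa argument defaults to zero for dialects that do not report it.

```python
def cpu_percentages_consistent(us, sy, idle, wa=0, tolerance=1):
    """True when the CPU columns add up to 100 within the given tolerance."""
    return abs((us + sy + idle + wa) - 100) <= tolerance


print(cpu_percentages_consistent(19, 46, 28, 7))  # AIX row 1: True
print(cpu_percentages_consistent(1, 2, 98))       # Solaris row 2 (sums to 101): True
print(cpu_percentages_consistent(0, 0, 4))        # Linux row 1: False
```

A row that fails the check, such as the first Linux sample above, is worth excluding before the values are averaged or stored.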
This is an excerpt from "Oracle
10g Application Server Administration Handbook" by Don Burleson
and John Garmany.