|
|
Oracle Predictive Modeling
Oracle Tips by Burleson Consulting |
"Predictions are difficult,
especially about the future".
Yogi Berra
One feature that makes Oracle the most powerful database ever created
is his ability to create predictive models to assist the DBA.
These powerful predictive models allow Oracle to forecast the
benefit of changes to the database:
-
SGA Advisors
- Oracle
has predictive advisories for the
data buffer cache, the shared pool and almost every region
within the Oracle RAM heap. In Oracle 10g automated memory
management (AMM), Oracle acts upon the predictions and morphs
the SGA automatically.
-
Tuning advisors -
Oracle's automatic diagnostic database advisor (ADDM) uses
artificial intelligence to recommend new materialized views and
indexes. Oracle also has a
SQLTuning advisor and a
SQLAccess advisor.
These advisors gather real-world empirical execution information
from Oracle and recommends actions based upon the execution
samples.
Is it possible for Oracle to
predict a future problem and prevent it? The adage "Those who
forget the past are condemned to repeat it" is relevant here, and
it's easy to develop a predictive model that will analyze historical patterns and
"proactively" change your database BEFORE the problem occurs.
This
approach is used in the
Ion analyzer, where historical data from STATSPACK or AWR is analyzed to identify the signatures (repeating patterns).
You can then
interface with dbms_scheduler to change the database just-in-time to
prevent the future outage. This revolutionary techniques is
the basis of his new book "Oracle
Tuning: The Definitive Reference".
Oracle is listening, and it appears that Oracle will soon leverage upon their wonderful
time-series performance repository, the AWR.
In an OracleWorld 2003
presentation titled Oracle Database 10g: The Self-Managing
Database by Sushil Kumar of Oracle Corporation, Kumar states that
the Automatic Maintenance Tasks (AMT) Oracle10g feature will
automatically detect and re-build sub-optimal indexes.
In a superb paper titled METRIC BASELINES: DETECTING AND EXPLAINING
PERFORMANCE EVENTS IN EM 10GR2, Presented at the
RMOUG 2005
Training Days in Denver, John Beresniewicz of Oracle Corporation
gives us a great preview into the new predictive modeling tools in
Oracle 10g release 2 (10.2). Beresniewicz told me that the use
of "baselines to capture and adapt thresholds to expected
time-dependent workload variations" is a core feature of the
next release of Oracle:
The
metric baselines introduced in Enterprise Manager 10gR2
statistically characterize specific system metrics over time periods
matched to system usage patterns. These statistical profiles of
expected metric behavior are used to implement adaptive alert
thresholds that can signal administrators when statistically unusual
metric events occur.
Assuming that systems are normally stable and performance problems
rare, it is reasonable to expect that actual performance events will
be highly correlated with observed unusual values in some metric or
other. Thus it is hoped that baseline-driven adaptive thresholds
will both reduce configuration overhead for administrators and more
reliably signal real problems than fixed alert thresholds.
The idea is simple. Because
Automated Workload Repository (AWR, or STATSPACK in 8i and ) keeps
a historical performance record, Oracle now knows when a scheduled
task is going to cause a resource shortage, before it happens!
Using the AWR we can analyze statistically valid trends and
repeating patterns. Many databases have batch job schedules,
implemented via crontabs, dbms_job or dbms_scheduler,
and this valuable data can be used to:
-
Automatically morph the Oracle
instance to anticipate a daily of weekly batch job. The most
common example of this is a database that runs in OLTP mode
during the day (first_rows) and switches to decision support at
night (all_rows).
-
Predict the reduction in
logical I/O and physical I/O from reorganizing a table or index
-
Use linear regression to
predict a future time when Oracle will exhaust server resources
RAM, CPU)
Beresniewicz also shows the value
of signature analysis where you average metric values by
day-of-the-week and hour-of-the-day, and use the valid trends to
create a predictive model for proactive DBA changes:
The
baseline period can be divided into sub-intervals of time over which
statistical aggregates are computed. These subintervals, or time
partitions are intended to allow baselines to capture and adapt
thresholds to expected time-dependent workload variations.
EM
10gR2 allows for simple time partitioning that can capture common
daily or weekly usage/workload cycles. The daily options are:
By
hour of day: aggregate each hour separately, strong variation across
hours
By
day and night: aggregate the hours of 7am-7pm as day and 7pm-7am as
night
By
all hours: aggregate all hours together, no strong daily cycle
The
weekly time partitioning options are:
By
day of week: aggregate days separately, strong variation across days
By
weekday and weekend: aggregate Mon-Fri together and Sat-Sun together
By
all days: aggregate all days together, no strong weekly cycle
Time
partitioning is fully specified in EM 10gR2 by selecting both daily
and weekly partitioning options.
For
example, consider an operational system serving many online users
working similar daily schedules and running batch
jobs in the
evening.
In this exciting paper we see that
Oracle is planning to adopt the same kind of predictive model that I
have espoused for many years. The basic idea is that historical
data will provide valid predictive information that can be used by
Oracle to fix a resource issue BEFORE it cripples your performance.
Practical Application of predictive modeling
Predictive modeling is widely used in the
credit card industry:
- Predictive models exist that will
accurately determine your age, gender and race only from examining your
credit card purchase history.
- Using predictive modeling we can even
"guess" your income range, number of dependant and even your weight.
- Credit card companies can calculate the odds you eat at McDonald's today,
considering you ate at McDonald's once every X day.
- The credit card
companies use "cohorts", statistical groupings of related purchasers. They
give these cohorts cute names like "dinks (double income, no kids):
-
Credit card fraud prevention software is so sophisticated that if you order
products a person in your group never ordered, your card will get
automatically locked.
- Credit card use is never private and it can
be used by the police as credit card purchase history is used to prove DUI
(you took a large tab at a bar) indirectly.
|
If you like Oracle tuning, see the book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
scripts.
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |