Oracle Training Oracle Support Development Oracle Apps

 E-mail Us
 Oracle Articles
New Oracle Articles

 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog

 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Oracle Support

 SQL Tuning

 Oracle UNIX
 Oracle Linux
 Remote s
 Remote plans
 Application Server

 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S


 Consulting Staff
 Consulting Prices
 Help Wanted!


 Oracle Posters
 Oracle Books

 Oracle Scripts

Don Burleson Blog 








Important 2007 update:  The rules for identification of candidates for index rebuilding are changing.  Please see my updated notes on index rebuilding.

Predict if a table or index will benefit from reorganization

February 17, 2005 (Revised March 29, 2005)

Reliably predicting when to reorganize your tables and indexes is an important decision, and one that you can be the first to automate.  This has become the “Holy Grail” for many Oracle professionals, with numerous articles and debates about predicting the benefit from a reorg.  Some purists say that it's better to design the table/indexes never to fragment in the first place, but the reality of the Oracle DBA is that high attrition, vendor packages and the inability to control the developers makes this a very important real-world problem.

It is clear that Oracle will soon leverage upon their wonderful time-series performance repository, the AWR, and create predictive models:

  • In an OracleWorld 2003 presentation titled “Oracle Database 10g: The Self-Managing Database” by Sushil Kumar of Oracle Corporation, Kumar states that the new Automatic Maintenance Tasks (AMT) Oracle10g feature will "automatically detect and re-build sub-optimal indexes.“

  • In a paper titled “Metric Baselines: detecting and explaining performance events in em 10gr2” (Presented at the RMOUG 2005 Training Days), John Beresniewicz of Oracle Corporation notes that the use of "baselines to capture and adapt thresholds to expected time-dependent workload variations" is a core feature of the next release of Oracle.

The Objective:

The goal is to create a reliable predictive model that will suggest tables and indexes which will measurably benefit from reorganization, predict the reduction in I/O (logical I/O - consistent gets and physical I/O - physical reads) after the reorganization, and suggest changes that will prevent a reoccurrence of the fragmentation (i.e. new pctfree, new blocksize, etc.):

  • Oracle tables/indexes can be reorganized, sometimes resulting in faster SQL execution speeds

  • AWR (STATSPACK) has a history of how the tables were accessed by historical SQL (dba_hist_sql_plan, stats$sql_plan, etc.), including the I/O and CPU costs associated with each step of SQL execution.

  • We can see the current internal structure of every table/index (e.g. chain_cnt, clustering_factor, etc.)

  • Most databases have repeating patterns of behavior and historical SQL is usually representative of future activity.  The model should be able to see if the table/index history has simply by examining the historical access patterns.

The steps to solving this problem are simple, but the process logic is complex:

1 - Gather table/index structure information

- There are many examples of routines that can perform a alter index xxx validate structure command and store the results in a table for analysis.

2 - Analyze historical patterns of usage

- We can alter the plan9i.sql script to access the stats$sql_plan table and see exactly how table/indexes are accessed by our SQL, and how often too:

3 - Add the decision rules

- Design an expert-system algorithm that will interrogate your data from steps 1 and 2 and create the list of tables/indexes, predicted I/O reductions, and suggested structural changes.

4 - Test and Refine your rules

- Iterate through step 3, successively refining your rules, based on actual results.

Developing your rule-set

But how can we find these rules?  Do the rules have to make-sense?  No, they don't!  Scientists commonly create models (The Oracle data mining tools) that scan historical data and identify statistically significant correlations (within 2 standard deviations of the mean value).  For example, the popular MMPI test is a set of 500 true/false questions that accesses personality with remarkable validity, and it's results are accepted in all U.S. courts.  Their test-base consists of hundreds of thousands of subjects,. with a pre-diagnosed mental disorder (see DSM IV). By comparing their responses to seemingly innocuous questions (e.g. "I read the editorials in the newspaper every day") a proven predictive model was created (Federal courts have affirmed the MMPI as a scientifically valid) and accepted procedure for personality assessment. 

For example, the subjects preference to take showers vs. baths is an extremely reliable measure of self-esteem.  Do we know why?  No.  Do we care?  Not really. All that is proven is that this correlation is a statistically reliable predictor of feelings of self-worth.  We see the exact same scientific principle applied to Oracle data mining (ODM) tools.  For example, we might find-out that people with red hair buy a disproportionate amount of skin care products.  Knowing "why" is not important.  What's important is knowing that the data supports the assertion.  Also useful is the book "Unobtrusive Measures", which shows creative techniques for finding "hidden" significant metrics.

In sum, rules don't have to be proven true to be statistically reliable, and exceptions do not make the rule invalid.  For example, if two out of every 1,000 read-haired people don't buy skin care products, we still have a model with a 99% predictive quality.

Some possible Rules for your AI engine:

To get you started, here is a list of possible conditions that may contribute to table/index fragmentation.  Remember, you don't need to pre-justify your rule set.  The only thing that counts is that your rules make accurate predictions.

  • Index fast full scans will run faster after index reorganization whenever the “density” of the index entries becomes greater.  In other words, it takes less time to read 100,000 entries from a 100 block index than reading the entries from a 500 block index.
  • Multi-block Index range scans will run faster when the data blocks are arranged in index-key order and when the data blocks have a high number of row entries (as evidenced by clustering_factor in dba_indexes).
  • Large-table full-table scans will run faster after reorganization when the table has excessive chained or relocated rows, or low block density after massive DML (updates and deletes).

Proving the Benefit:

·        Full-table scans

- The predictive model will have a rate, expressed as a function of the number of rows and the amount of I/O (both logical and physical) that are required to access the rows via a “db file scattered read” (Usually an index fast-full scan, a full-table scan, etc.). 

                                       CONSISTENT GETS     PHYSICAL READS

   DATE         TABLE       BLOCKS     PER 1,000 ROWS      PER 1,000 ROWS                          

   6/20/05      CUSTOMER    100,000         56                   72  
   -- TABLE REORGED ON 6/20/05  *********************************** --
   6/30/05      CUSTOMER     99,000         20                   11  
   7/30/05      CUSTOMER     99,000         35                   32  
   8/30/05      CUSTOMER     99,000         49                   55  

The report output should show the forecasted reduction in I/O after the table was reorganized.

·        Multi-block Index range scans

- The predictive model should be able to see the size of each index range scan (the number of rows returned), the number of blocks that were required to fetch the rows, and the number of logical and physical I/O.

·        Index Rebuild

- The predictive model should capture the “speed” of index fast-full scans, multi-block index range scans, and show the degradation over time.

  • Sparsity of the index blocks (using dba_hist_sqlstat and dba_hist_sql_plan)

  • Rate of time (expressed in blocks/index row values) required to perform an index fast-full scan

Finding the Data:

The Oracle 10g AWR (and STATSPACK in previous releases) provides a time-series database.  This time-series performance details can be used to create accurate predictions of the real benefit of reorganizing tables and indexes.  Specifically, the AWR can measure, between any two snapshot values:

1 - The change in the speed of SQL execution steps over-time.


In this prototype below, we see the change in the speed of full-table scans:

                   OBJECT                             AVG.            NBR OF

DATE               OPERATION:                         TIME            BLOCKS

6/20/05 10:00 AM   TABLE ACCESSS FULL (CUSTOMER)      12.03 secs      100,000
6/21/05 11:00 AM   TABLE ACCESSS FULL (CUSTOMER)      14.03 secs      100,000
6/22/05 12:00 PM   TABLE ACCESSS FULL (CUSTOMER)      16.03 secs      100,000
6/23/05 01:00 PM   TABLE ACCESSS FULL (CUSTOMER)      18.03 secs      100,000
6/20/05 10:00 AM   INDEX FFS (CUSTOMER_FULL_IDX)     242.03 secs      100,000
6/21/05 11:00 AM   INDEX FFS (CUSTOMER_FULL_IDX)     311.03 secs      100,000
6/22/05 12:00 PM   INDEX FFS (CUSTOMER_FULL_IDX)     350.03 secs      100,000
6/23/05 01:00 PM   INDEX FFS (CUSTOMER_FULL_IDX)     400.03 secs      100,000

2 – The state of the object at a historical snapshot time:

The AWR contains important information about the data distribution and density of the data blocks within each Oracle segment:

            Table Segment Report:

DATE               SEGEMENT       BLOCKS    NUM_ROWS        LENGTH

6/20/05 10:00 AM   CUST_TABLE     100,000   500,000,000       77   

Sample Index Report:

                                            Average   Average                     Average            

Index              Average  Average          Leaf     Leaf Row     Average      Leaf Block            

Name    Date       Height   Blocks           Rows     Length      Leaf Blocks     Length            

CUST_IDX  6/19/05      1      236             12        234           1            7,996 

The Method:

The solution does not have to generate the syntax for the reorg, just suggest table and index names, predict the improvement, and suggest changes to the parameter setting/table structure to prevent a subsequent reorg:

  • Table reorgs can be done with dbms_redefinition, CTAS, or “alter table move” syntax.
  • Index reorgs are performed by rebuilding the index with the “alter index xxx rebuild” command.
  • Boolean rules and process control flow can be implemented in PL/SQL stored procedures.

The Challenge:

Since Oracle is working on this same problem, it will be interesting to see if a scientist DBA can develop a model with a higher rate of success.  This is not a trivial problem and a statistically valid DBA predictive model has eluded many of the world’s top DBA’s.  However, I believe that the information is there, buried deep inside the STATSPACK and AWR tables.  We just need a savvy DBA who can measure historical performance and determine those characteristics that recommend tables and indexes that will show measurable benefits from reorganization.

I will also promote all working solutions that I receive to show how Oracle10g can be used to solve complex DBA problems.  Predictive forecasting is there now, and it’s the very core of the future of Oracle database administration.

Test your DBA skills and try this important DBA challenge!  Good Luck!


Oracle Training at Sea
oracle dba poster

Follow us on Twitter 
Oracle performance tuning software 
Oracle Linux poster


Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational