Hypercharge Oracle with Intelligent Indexes

Expert Oracle Tips by Donald K. Burleson

March 21, 2011

 

Overview of intelligent indexing

  • It is a myth that indexes must be persistent at all times.  In reality, it is more efficient for some indexes to be created and destroyed as-needed by queries.

  • The factors that affect index persistence are the size of the table and the frequency with which the SQL invokes the index.

  • For example, an index that is only used once per month may be better created immediately before use.

  • Un-used indexes are expensive to maintain:  DML is slower, and this overhead can be significant on busy systems.

Dynamic systems commonly have end-of-period jobs: end-of-day (EOD), end-of-week (EOW), end-of-month (EOM), end-of-quarter (EOQ) and end-of-year (EOY).

  • It is not uncommon to have large batch jobs that run during these periods and use an index only during the period, and not at any other time.

  • In high-DML databases with long periods between index usage, it is most effective to build the index, run the job, and destroy the index, rather than constantly maintaining the index.

  • This concept is especially true for bitmap and Con*text indexes, which have a huge cost of DML maintenance.  Many bitmap and Con*text indexes only exist during end-of-period processing (when the database is read-only).

In order to be effective at just-in-time indexes, you must be able to weigh the options between the costs and benefits:

  • The cost is the sum of the additional work to keep the index current during DML operations.  You can estimate this conservatively as 1 disk write per DML.  If an un-used index experiences 12,000,000 DML statements, then we can estimate a cost of 12 million disk I/Os on the system.

  • There is also the additional cost of locking when an in-use index is updated.
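As a concrete sketch of the build-use-destroy cycle (the sales table and index names here are hypothetical):

```sql
-- Build the bitmap index immediately before the end-of-month batch run
create bitmap index sales_region_bmx
   on sales(region);

-- ... run the EOM reports that benefit from sales_region_bmx ...

-- Destroy the index so it adds no DML overhead during the month
drop index sales_region_bmx;
```

During the month, the millions of DML statements against the table pay no index-maintenance penalty because the index simply does not exist.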

 


 

Oracle has a wealth of features that allow the DBA and developer to fetch data from an Oracle database with a minimum of trips to the database.

Ever since direct access storage (DASD) was introduced a half-century ago, computer scientists have recognized that disks have addresses, and that indexes can be built on any disk file.

The indexed sequential access method (ISAM) worked by creating a new sorted file containing the symbolic key and the disk address.  As the first large-scale commercial databases were introduced (IMS, IDMS), commercial databases started offering choices of indexing methods.  These early databases allowed the DBA to create tree indexes and to control the size of the index nodes.

While the concept of an index is straightforward, the implementation of an intelligent Oracle indexing strategy can be extremely complex.  Oracle Database 11g offers many types of indexing but the usage of indexes depends upon the shifting SQL workloads.

The vast majority of OLTP shops experience vastly different workloads for daytime transaction processing, evening batch jobs and end-of-period reporting, and it is these workloads that drive index usage.

Oracle does not report on wasted indexing resources, and there are many perils to over-indexing and under-indexing for any given SQL workload.  Just as a missing index causes unnecessary large-table full-table scans, maintaining an un-used index wastes a huge amount of computing resources.

In many cases, indexes that are only used periodically will be expensive and inefficient to maintain while they are not being used, and it is far more cost effective to create and destroy indexes as-needed, based on the needs of each specific SQL workload. 

Just as manufacturing has implemented just-in-time delivery of raw materials, an intelligent DBA strategy allows specific indexes, optimized instance parameter settings, and customized optimizer statistics to be delivered immediately before a shift in the workload.

Using data from real-world production databases, this presentation presents a proven approach to intelligent indexing:

  1.  Understand the costs and benefits of indexes:  The DBA needs to understand the relative merits and costs of sorted hash clusters, index-organized tables, b-tree indexes, bitmap indexes and multi-column indexes, and learn how to match the appropriate index type to a specific SQL workload.

  2.  Intelligent index enhancement:  The Oracle DBA needs to know how to add human intelligence to tip off the SQL optimizer.  This includes optimizer parameters such as optimizer_index_caching, intelligently creating column histograms and deploying Oracle Database 11g extended optimizer statistics, tools a smart DBA can use to improve the efficiency of the SQL optimizer.

  3.  Identify repeating workloads:  The DBA will use SQL execution plan feedback from AWR and STATSPACK to identify distinct repeating workloads.  These workloads are then named (e.g. day_mode, night_mode, eow_mode, eom_mode), and a profile of index usage is gathered from historical data.

  4.  Profile index usage for each workload:  Next, we determine the optimal indexing strategy for each workload, counting the number of times per minute that each index is invoked and plotting usage patterns for multi-column indexes.

  5.  Intelligently schedule instance customization:  Based on the usage for each index, we will identify duplicate indexes, un-used indexes and those indexes that are unique to each workload.  We can then use the dbms_scheduler utility to customize the Oracle instance to meet the demands of each distinct workload. 
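Step 5 can be automated with dbms_scheduler; the job below is only a sketch, and the index, table, and job names are illustrative:

```sql
begin
   dbms_scheduler.create_job(
      job_name        => 'build_eom_indexes',
      job_type        => 'PLSQL_BLOCK',
      job_action      => q'[begin
                              execute immediate
                              'create bitmap index sales_region_bmx on sales(region)';
                            end;]',
      start_date      => systimestamp,
      repeat_interval => 'FREQ=MONTHLY;BYMONTHDAY=-1;BYHOUR=18',
      enabled         => true,
      comments        => 'Build EOM-only bitmap index before month-end batch');
end;
/
```

A companion job would drop the index once the EOM workload completes, returning the instance to its day-mode configuration.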

This presentation also reveals secret scripts that track which indexes are the most popular over time and how often each index is invoked by SQL.  We also explore how to track which columns of a multi-column index are referenced by your SQL.  In sum, this presentation is indispensable for any DBA who must manage a mission-critical corporate database with limited computing resources.

Oracle & Indexing

As Oracle came to dominate the data processing world, it began to offer a wealth of indexing strategies.  We need to start by distinguishing between 'logical' and 'physical' indexing structures.

Note:  Oracle indexes can cause 'extra' I/O operations, and the choice to use indexes depends upon your choice of optimizer_mode.  Not all database transactions benefit from index usage, and sometimes you want your SQL optimizer to optimize for fast throughput instead of fast response time:

  •  all_rows : The all_rows optimizer_mode is designed to maximize throughput.

  •  first_rows_n : The first_rows_n optimizer_mode is designed to minimize response time.
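These goals can be switched when the workload changes; for example:

```sql
-- Batch window: optimize for total throughput
alter session set optimizer_mode = all_rows;

-- OLTP window: optimize to return the first 10 rows quickly
alter session set optimizer_mode = first_rows_10;
```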

The logical indexing strategies are deployed at the analytical level and do not involve any knowledge of the underlying physical database structures:

 Logical Indexes:

  •  Function-based Indexes

  •  Multi-column indexes

  •  Virtual column indexes

  •  STAR transformation indexing  

Oracle Physical Indexes

Oracle is the world's most flexible and robust database because it offers almost every conceivable algorithm for speeding up access to data:

 Physical Index structures:

  •  B-tree indexes

  • Bitmap Indexes

  • Bitmap join indexes

  • Hash key indexes (sorted hash clusters)

  • Hash partitioned indexes

  • Index-organized tables (IOT)

  • Con*Text Indexes  

While b-tree indexes are great for simple queries, they are not very good for the following situations:

  • Low-cardinality columns:  Columns with only a tiny number of distinct values do not have the selectivity required to benefit from standard b-tree index structures.

  • No support for SQL functions:  A standard b-tree index cannot service a query that applies a built-in function to the indexed column; that requires a function-based index.
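For example (the table and column names here are hypothetical), a query that wraps an indexed column in a function can only use an index built on that same expression:

```sql
-- A predicate such as "where upper(last_name) = 'SMITH'" cannot use
-- a plain b-tree index on last_name, but it can use this
-- function-based index:
create index cust_upper_name_idx
   on customer (upper(last_name));
```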

Bitmap Indexes

Bitmap indexes are great for low cardinality columns but they have a huge overhead for DML, and most shops drop and rebuild bitmaps during off hours processing. 

Hence, bitmap indexes are not for active OLTP databases; they are used primarily for read-only DSS or data warehouse systems. Bitmaps, while not very useful by themselves, are great when combined together.

For example, assume there is a motor vehicle database with numerous low-cardinality columns such as car_color, car_make, car_model and car_year. Each column contains fewer than 100 distinct values by itself, and a b-tree index would be fairly useless in a database of 20 million vehicles. However, combining these indexes together in a query can provide response times far faster than the traditional method of reading each of the 20 million rows in the base table. For example, assume we wanted to find old blue Toyota Corollas manufactured in 1981:

select
    license_plate_nbr
from
    vehicle
where
    color = 'blue'
and
    make = 'toyota'
and
    year = 1981;

Oracle uses a specialized optimizer method called a bitmapped index merge to service this query. In a bitmapped index merge, each Row-ID, or RID, list is built independently by using the bitmaps, and a special merge routine is used in order to compare the RID lists and find the intersecting values. Using this methodology, Oracle can provide sub-second response time when working against multiple low-cardinality columns:
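The bitmap merge presumes a separate bitmap index on each of the low-cardinality columns, something like this sketch (index names are illustrative):

```sql
create bitmap index vehicle_color_bmx on vehicle(color);
create bitmap index vehicle_make_bmx  on vehicle(make);
create bitmap index vehicle_year_bmx  on vehicle(year);
```

Oracle builds a RID list from each bitmap and intersects them, so only the handful of matching rows are ever read from the 20-million-row table.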

[Figure: Oracle bitmap index merge]

Does Size Matter?

To meet the requirements of all operating systems from a mainframe to a Mac, Oracle has chosen to base the index node size upon the DBA's choice of database block size.  An index built in a db_32k_cache_size will contain more keys per index node than the same index created in a db_2k_cache_size.

While the vast majority of indexes would reside in your database's default db_cache_size, there are exceptions and some large OLTP databases have shown great performance improvements when moved into a blocksize that matches the I/O characteristics of the index.

For example, during a nested loop join, one index received multi-block reads (index range scans) while the other index is doing index unique probes.  In these cases, moving the index that experienced multi-block reads into a large blocksize allows more keys to be read in a single data buffer get.  
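A sketch of the mechanics (the sizes, file path, and index name are illustrative); the 32k buffer cache must exist before the tablespace can be used:

```sql
-- One-time setup: a buffer cache and tablespace for 32k blocks
alter system set db_32k_cache_size = 128m;

create tablespace idx_32k_ts
   blocksize 32k
   datafile '/u01/oradata/idx_32k_01.dbf' size 1g;

-- Move the range-scanned index into the large blocksize
alter index order_date_idx rebuild tablespace idx_32k_ts;
```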

Workload-driven index usage

 It's backwards, but at implementation time the DBA may not be aware of the SQL workload that will run against their tables, and this lack of a-priori knowledge of the SQL can lead to the phenomenon of 'missing indexes', especially function-based indexes.

Hence, all DBAs should adopt a workload-based approach, examining their workloads for I/O signatures and implementing indexing strategies that match each workload.  Most production systems have at least two workload modes, a 'day mode' characterized by small OLTP SQL optimized for fast response time, and a 'night mode' characterized by large batch jobs that need to be optimized for fast throughput.

                Day Mode                 Night Mode                EOM Mode
Data Warehouse  read-only                batch updates             summary reports
OLTP            small transactions       batch reports             summary reports

 An iterative process of intelligent SQL workload tuning involves these steps:

  1. Analyze the workload:  Use monitoring tools to identify repeating processing signatures
  2. Codify and 'name' the workload signatures (day mode, night mode, EOW mode, EOM mode)
  3. Identify optimal settings for each workload (init.ora parameters, SGA regions, CBO statistics, custom indexes)
  4. Schedule just-in-time morphing of Oracle to accommodate the changing workload

End of Period Workloads

In addition to the normal day mode and night mode workloads, there are many systems that have special end-of-week, end-of-month, and end-of-quarter processing that require an entirely new SGA and database configuration.

  •  PGA adjustments:  You can over-ride the governor that limits any single job to 5% of the PGA, and set very large values for sort_area_size and hash_area_size to allow a single-threaded batch job to consume all of the PGA, if necessary.
  • Optimizer parameters:  Certain parameters like optimizer_mode, optimizer_index_caching can change the propensity of Oracle to use indexes.
  • Statistics:  Customized EOM CBO statistics can be collected, saved and imported when you switch to EOM mode.  This is especially important for cases where specialized 'weekend-only' indexes are used and optimizer histograms are required.
  • SGA Adjustments:  When you have a batch window in which only a few concurrent jobs run simultaneously, you can adjust the 'default governors' in Oracle to allow the batch jobs to consume all available computing resources.
  • B-Tree Indexes:  Some indexes are only used for EOW processing, and it's smart to build them on Friday night and nuke the index on Sunday night.
  • Bitmap Indexes:  Many EOM reports are read-only, and building temporary bitmap indexes can greatly improve the throughput for jobs that may run more than 10 hours.
  • Con*Text Indexes: Many EOY reports need to 'look inside' CLOB columns, and a Friday night build of a word index on a CLOB is perfect for read-only weekend reports.  The index is destroyed on Sunday night.  
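For instance, the PGA over-ride for a single-threaded batch session might look like this (the sizes are arbitrary illustrations):

```sql
-- Abandon automatic PGA management for this session only
alter session set workarea_size_policy = manual;

-- Allow very large in-memory sorts and hash joins
alter session set sort_area_size = 1073741824;
alter session set hash_area_size = 1073741824;
```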

Proactive workload analysis and predictive modeling

Statistically proven predictive analytic procedures come with the Oracle Data Mining (ODM) package, but DBAs are only now learning that these same principles can be applied to their own production SQL workloads.  It is possible to fix problems before they affect your users!  All you need is a way to interrogate your historical data from STATSPACK or AWR and identify repeating 'signatures', statistically significant repeating patterns of usage that can be used to predict future events!

'Those who forget the past are condemned to repeat it' - George Santayana

'The farther that you can look into the past, the farther you can look into the future' - Winston Churchill

'Predictions are difficult, especially about the future' - Yogi Berra

If you closely study your past workloads, it's easy to develop a proactive approach that intelligently anticipates new repeating workloads and changes the SGA to accommodate them before they impact the end-users.  Oracle offers a similar one-size-fits-all approach called Automatic Memory Management (AMM), but AMM is reactive and waits until a problem is detected before fixing it.

 SQL workload  

Starting in Oracle 10g, Oracle finally recognized that it is the SQL workload that determines the optimal index placement.

  • Intelligent index management
  • Identifying missing indexes
  • Identifying un-used indexes
  • Interrogating index usage within a workload
  • Building custom Indexes for workloads
  • Using function-based indexes
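Un-used indexes can be identified with Oracle's built-in index monitoring; a minimal sketch (the index name is hypothetical):

```sql
-- Flag the index for usage monitoring
alter index cust_last_name_idx monitoring usage;

-- Later, check whether any SQL has touched it
select index_name, used, start_monitoring
  from v$object_usage
 where index_name = 'CUST_LAST_NAME_IDX';
```

Note that v$object_usage shows only indexes owned by the connected schema, and 'used' is a simple YES/NO flag, so AWR-based invocation counts remain more informative.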

Interrogating index usage within a workload  

index_usage_hr.sql

col c1 heading 'Begin|Interval|time'  format a20
col c2 heading 'Search Columns'       format 999
col c3 heading 'Invocation|Count'     format 99,999,999
break on c1 skip 2

accept idxname char prompt 'Enter Index Name: '
ttitle 'Invocation Counts for index|&idxname'

select
    to_char(sn.begin_interval_time,'yy-mm-dd hh24')  c1,
    p.search_columns                                 c2,
    count(*)                                         c3
from
    dba_hist_snapshot  sn,
    dba_hist_sql_plan   p,
    dba_hist_sqlstat   st
where
    st.sql_id = p.sql_id
and
    sn.snap_id = st.snap_id
and
    p.object_name = '&idxname'
group by
    to_char(sn.begin_interval_time,'yy-mm-dd hh24'),
    p.search_columns;

This will produce an output like this, showing a summary count of the index specified during the snapshot interval. This can be compared to the number of times that a table was invoked from SQL.  Here is a sample of the output from this script:

Interval                         Invocation  
time                 Search Columns       Count
-------------------- -------------- -----------
04-10-21 15                       1           3
04-10-10 16                       0           1
04-10-10 19                       1           1
04-10-11 02                       0           2
04-10-11 04                       2           1
04-10-11 06                       3           1
04-10-11 11                       0           1
04-10-11 12                       0           2
04-10-11 13                       2           1
04-10-11 15                       0           3
04-10-11 17                       0          14

Using function-based indexes

 Here we see a Function-Based Index (FBI) on a large VARCHAR column, based on the credit card number that it contains.  The database will create this index, but even with a hint, the optimizer will not use it, and it is expensive to maintain.

create index card_idx on dumbtbl
   (regexp_substr(lots_data,'(([0-9]{4})([[:space:]])){3}[0-9]{4}'));

Table row re-sequencing and index usage

The original relational model of Codd and Date stated that the physical ordering of rows on disk should not matter, but in the real world, row placement matters a lot, especially for tables whose rows are always accessed in the same key order. The dba_indexes view has a clustering_factor column that tells the optimizer how 'synchronized' an index is with its table; values range from the number of table blocks (fully synchronized) up to the number of rows (not synchronized).

On the other extreme, when clustering_factor approaches num_rows (not synchronized), range scan operations may require hundreds more I/Os to complete a transaction.

Many DBAs will search the workload and seek tables that are frequently accessed by the same index, using an index range scan.      
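One way to spot candidates is to compare clustering_factor against the table's blocks and num_rows; a query sketch (replace the table name with your own):

```sql
select
   i.index_name,
   i.clustering_factor,
   t.blocks,
   t.num_rows
from
   dba_indexes i,
   dba_tables  t
where
   i.table_name  = t.table_name
and
   i.table_owner = t.owner
and
   t.table_name  = 'CUSTOMER';
```

A clustering_factor near t.blocks means the rows already sit in index order; a value near t.num_rows means each index entry may touch a different table block.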

Here is an unclustered index situation.  This transaction displays all customers in a city:

  •  Only 10 rows per data block

  •  Clustering factor is 'poor' for the primary index

  •  Lots of 'unnecessary' disk reads to fetch all customers for a city

[Figure: Unclustered index situation]

Below is a clustered index situation: 400 rows fit onto a data block.

As we see, the decision to reorganize tables into the same order of an index depends on several factors:

  •  The percentage of total accesses in which the index is used to reach the table rows.  Remember, a table can only be re-sequenced according to one index, and it should be the most popular index.

  • Whether the index is accessed via an index full scan, index fast full scan, or index range scan.  Remember, some indexes are only used for 'unique' index probes, and these will see no benefit from reorganization.

  • The database block size (db_nn_cache_size)

  • The average row length (dba_tables.avg_row_len)

Typical cases might include:

  •  Order Items:  An order processing system displays all items for an order 1,000 times per hour, with an average of 40 consistent gets per execution.  System-wide, the item_idx is used 95% of the time to show the items within an order.  Because there are about 70 items per order, and the item rows are 80 bytes each, reorganizing the item table by item_idx will reduce I/O by a factor of at least 40x.

  • Student Classes:  Assume a university database (2k blocksize) that displays all classes for a student 10 times per second.  The class table uses the student_semester_idx 97% of the time with index range scans, and a student takes an average of 4 classes per semester.

What factors should we consider when re-organizing this table?   

Oracle provides several methods for allowing the DBA to re-sequence table rows into the same order as the most popular index:

  • Sorted hash cluster tables

  • CTAS with order by

  • Index-Organized Tables    
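The simplest of the three is CTAS with an order by; a sketch (names are illustrative, and note that a CTAS copy does not carry over indexes, constraints, or grants):

```sql
-- Rebuild the table with rows physically sorted by city
create table customer_sorted as
   select * from customer order by city;

-- After validating and re-creating indexes, swap the tables
rename customer to customer_old;
rename customer_sorted to customer;
```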

Oracle MOSC Note 223117.1 has some great advice for tuning down 'db file sequential read' waits by reorganizing tables in row-order:

  •  If Index Range scans are involved, more blocks than necessary could be being visited if the index is unselective: by forcing or enabling the use of a more selective index, we can access the same table data by visiting fewer index blocks (and doing fewer physical I/Os).

  • If the index being used has a large Clustering Factor, then more table data blocks have to be visited in order to get the rows in each Index block: by rebuilding the table with its rows sorted by the particular index columns we can reduce the Clustering Factor and hence the number of table data blocks that we have to visit for each index block.    

Conclusion    

Even though Oracle has introduced many automated features for locating missing indexes, no artificial tool will ever be able to replace the intuition of a human expert. Those who forget the past are condemned to repeat it, and it is the job of every database administrator to analyze their historical performance data to identify workload-based indexing opportunities. By applying an intelligent approach to Oracle indexing, you can ensure that the indexes match the data access patterns and prevent wasting computing resources by maintaining unnecessary indexes.

 

 