Because standard Oracle cannot index into a large column, there
“like” queries cause full-table scans, and Oracle must examine every
row in the table, even when the result set is very small. These
unnecessary full-table scans are a problem:
-
Large-table full-table scans increase the load
on the disk I/O sub-system
-
Small table full table scans (in the data
buffer) cause high consistent gets and drive-up CPU consumption
One obscure trick for indexing queries with a leading wildcard
character (like '%SON') is to create a REVERSE index and them
programmatically reverse the SQL like clause to read "like 'NOS%'",
effectively indexing on the other side of the text, clumsy, yet
effective. For details, see
Oracle SQL "like clause"
and index access.
Oracle*Text Indexes
The Oracle*Text utility (formally called Oracle ConText and Oracle
Intermedia) allows us to parse through a large text column and index
on the words within the column.
Unlike ordinary b-tree or bitmap indexes, Oracle context, ctxcat and
ctxrule indexes can be set not to update as content is changed. Since most
standard Oracle databases will use the ctxcat index with standard
relational tables, you must decide on a refresh interval.
Oracle provides the SYNC operator for this. The default is
SY^NC=MANUAL and you must manually synchronize the index with
CTX_DDL.SYNC_INDEX.
SYNC (MANUAL | EVERY "interval-string"
| ON COMMIT)]
Hence, Oracle Text indexes are only useful for removing full-table
scans when the tables are largely read-only and/or the end-users
don’t mind not having 100% search recall:
Oracle Text works with traditional data columns and also with
XML, MS-Word docs and Adobe PDF files that are stored within Oracle.
Oracle Text has several index types:
WHERE CATSEARCH(text_column, 'ipod')> 0;
-
CONTEXT Indexes - The CONTEXT index type
is used to index large amounts of text such as Word, PDF, XML,
HTML or plain text documents.
WHERE CONTAINS(test_column, 'ipod', 1) > 0
These types of indexes allow you to replace the old-fashioned SQL
“LIKE” syntax with “CONTAINS” or “CATSEARCH” SQL syntax:
When we execute the query with the new index we see that the
full-table scan is replaced with a index scan, greatly reducing
execution speed and improving hardware stress:
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIGTAB'
3 2 DOMAIN INDEX OF 'TEXT-COLUMN_IDX'
Index re-synchronization
Because rebuilding an Oracle Text index (context, ctxcat, ctxrule)
requires a full-table scan and lots of internal parsing, it is not
practical to use triggers for instantaneous index updates.
Updating Oracle Text indexes is easy and they can be schedules using
dbms_job or the Oracle 10g dbms_scheduler utility
package: Oracle text provides a CTX_DDL package with the
sync_index and optimize_index procedures:
SQL> EXEC
CTX_DDL.SYNC_INDEX('text_column_idx');
SQL> EXEC CTX_DDL.OPTIMIZE_INDEX('text_column_idx','FULL');