Most Oracle professionals are aware of the
logical ways to join tables, such as equi-joins, outer joins and
semi-joins; however, Oracle also has internal ways of joining
tables, including physical join methods that are critical to
understanding SQL tuning.
Built into Oracle is a vast array of
internal algorithms, including several types of nested loops and
merge joins as well as six types of hash joins.
There are also many data access methods,
such as a variety of internal sorts and table access methods.
Oracle nested loops and hash join internals
When we examine an execution plan for a SQL statement, we may
see the physical join implementations with names like nested
loops, sort merge and hash join.
-
Hash joins
- In a hash join, the Oracle database does a full-scan of the driving
table, builds a RAM hash table, and then probes for matching rows in
the other table. For certain types of SQL, the hash join will
execute faster than a nested loop join, but the
hash join uses more RAM
resources.
-
Nested loops join -
The
nested loops table join
is one of the original table join plans and it remains the most
common. In a
nested loops join, we
have two tables a driving table and a secondary table.
The rows are usually accessed from a driving table
index range scan, and the
driving table result set is then nested within a probe of the second
table, normally using an index range scan method.
The propensity of the SQL optimizer to invoke a hash join is
heavily controlled by the setting for the hash_area_size
and pga_aggregate_target Oracle parameters. The larger
the value for hash_area_size, the more hash joins the
optimizer will invoke. In some releases of Oracle, the
hash_area_size defaults to double the value of the
sort_area_size parameter, but it highly dependent upon
parameter settings and the Oracle release level.
When tuning SQL, we must always remember that it is possible for
the optimizer to fail to choose the best table join method. This
is especially true for cases where a hash join is wrongly chosen
over nested loops. This is frequently the case when we have
sub-optimal schema statistics, especially column histograms,
which can lead to cases where the optimizer makes an incorrect
guess about the cardinality of a result set and wrongly invokes
a join that requires a full-table scan rather than choosing
nested loops.
This is frequently the
case when we have sub-optimal schema statistics (especially column
histograms) can lead to cases where the optimizer makes an incorrect
guess about the cardinality of a result set and wrongly invokes a join
that requires a full-table scan, not choosing nested loops.
For example, consider
a zillion rows sales table
with this query:
select
sales_stuff
from
sales
natural join
stores
where
state = 'idaho';
If we assume that
Idaho has less than 1% of the rows in the sales table, we would want to
force the optimizer to invoke a nested loops join, so that we may use an index for the join.
This can be accomplished in several ways:
- Analyze a column histogram on the state column
-
This will address the root cause of the issue since the optimizer
will now recognize that Idaho is a low cardinality query.
- Use a hint - Hints are a last resort because of
unintended side-effects, but you can use the use_nl_with_index
hint to force the query to choose a nested loops join method.
- Invoke dynamic sampling - You can use the
opt_estimate
hint or the dynamic_sampling hint to force the nested loops
join:
Here are
equivalent ways to force a nested loops join:
--********************************************
--
Nested loops hint
--
*******************************************
select /*+
use_nl_with_index */
sales_stuff
from
sales
natural join
stores
where
state = 'idaho';
--********************************************
-- Dynamic
sampling
-- *******************************************
select /*+ dynamic_sampling(sales,10)
sales_stuff
from
sales
natural join
stores
where
state = 'idaho';
--********************************************
--
opt_estimate hint
--
*******************************************
select /*+
opt_estimate(table,sales,scale_rows=.001)
sales_stuff
from
sales
natural join
stores
where
state = 'idaho';
Beware of the
use_nl hint!
Using the use_nl hint can force a query to use a nested loops join,
but the use of join hints can create a problem, especially with nested
loops indexes which reply on indexes.
If the index were dropped, you
would get a horrific join plan.
To alleviate this issue, you can use the use_nl_with_index hint.
This hint will only direct a nested loops join if the suitable index is
present.
Forcing nested loop joins
Vadim next tries forcing nested loop joins with
the use_nl hint and discovers that the execution speed gets worse, not
faster:
Those query block names allow
directing the hints to the query blocks where they are supposed to be
applied. In our case, the joint method can be hinted as
/*+ use_nl */
which produces the following rowsource level execution statistics:

Execution plan for comparing the
contents of two tables with nested loop joins
From the statistics output we see
significant increase in buffer gets which is not offset by any noticeable
improvement in the execution time. It is fair to conclude that indexed
nested loops didn't meet our performance expectations in this case.
|
|
Get the Complete
Oracle SQL Tuning Information
The landmark book
"Advanced Oracle
SQL Tuning The Definitive Reference" is
filled with valuable information on Oracle SQL Tuning.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|