AUTOMATIC INDEXING This Oracle Database 19c new feature is
available on certain Oracle Database platforms. Check the Oracle Database
Licensing Guide for more information. INDEXING CHALLENGES Indexes are useful
for most types of workload and particularly critical for high performance in
online transaction processing (OLTP) and operational data store (ODS)
environments. However, identifying the best indexes is not straightforward and
can require a considerable amount of manual labor. This is largely because
index management is a continuous process. Applications are rarely static; they
change in response to changing business requirements. For this reason, there
will be changes in data volumes, of course, but also less obvious changes in
data value distributions and the way data is organized, processed and accessed.
Traditionally, DBAs have been responsible for monitoring performance and
deciding when and where to add, change or remove indexes in a tactical and
often ad-hoc manner. This ad-hoc approach to index maintenance is prone to
error because it is almost impossible to quantify the effect of any change,
whether positive or negative. This may lead to a database that has many more indexes
than necessary, where indexes have been gradually added over time and there is
a reluctance to remove any of them for fear of negative consequences. This will
lead to an increase in the system resources required to maintain indexes when data
is modified and processed. In addition, over-indexed environments often suffer
from less stable SQL execution plans because the sheer number of indexes makes the
optimizer's choice of index access path more and more finely balanced.
AUTOMATIC INDEXING Automatic indexing addresses these issues. It is not a
simple advisor, but instead it is an expert system that implements indexes
based on what a performance engineer skilled in index tuning would do. The
Oracle Database analyzes the application workload and identifies the queries
that will benefit from additional indexes. In other words, it identifies
candidate indexes and validates them before implementation, and the entire
process is fully automatic. Here is a summary of the workflow:

Figure 1: Automatic indexing workflow
The steps can be summarized as follows:

Capture
· Periodically capture SQL statements from the application workload in a SQL tuning set. This tuning set is called the automatic SQL tuning set (ASTS).

Identify
· Identify candidate indexes that may benefit the application workload.
· Create unusable and invisible index candidates (this is a data dictionary metadata change only).

Verify
· The optimizer verifies which index candidates will be used by the captured SQL statements.
· Materialize the successful candidates and measure their effect on the performance of the captured SQL statements.
· The indexes remain invisible to the application workload and all verification is done outside the application workflow.

Decide
· Indexes that are found to offer a significant performance improvement are made visible to the application workload.
· Indexes that are found to provide insufficient performance benefits remain invisible to the application workload.
· Automatic indexing offers a mixed solution where necessary. If there is an overall benefit in using a particular auto-index, then it will be marked visible. If individual queries suffer from a performance regression, SQL plan baselines are used to prevent them from using regressed plans.

Monitor
· SQL performance and index usage is continuously monitored. Indexes that have not been used for a configurable period of time will be dropped.
Configuring and Monitoring Automatic Indexing
Automatic indexing requires little to no manual intervention, but a package called DBMS_AUTO_INDEX is provided for
changing a small number of defaults. The feature can be enabled as follows:
SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'IMPLEMENT')

And disabled:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'OFF')

The number of days unused auto-indexes are retained is controlled as follows:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_RETENTION_FOR_AUTO', '373')
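If you prefer that candidate indexes are identified and verified but never made visible automatically, the feature also supports a report-only mode (a sketch; 'REPORT ONLY' is the documented third value of AUTO_INDEX_MODE):

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'REPORT ONLY')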
It is possible to specify which schemas are subject to auto indexing:

SQL> -- Exclude SH and HR
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'SH', FALSE)
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', FALSE)
SQL> -- Remove HR from exclusion list
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', NULL)
SQL> -- Remove all schemas from exclusion list
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', NULL, TRUE)
SQL> -- Include SH for auto indexing but exclude HR
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'SH', TRUE)
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', FALSE)

All parameter settings
(and schemas that have been included and excluded) can be seen as follows:
SQL> select * from dba_auto_index_config;

A report on auto index activity can be generated. For
example:

SQL> set linesize 300 trims on pagesize 1000 long 100000
SQL> column report format a120
SQL> SELECT dbms_auto_index.report_activity(sysdate-30,null,'text','all','all') report FROM dual;

Here is an extract from an example report:

REPORT
-------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start              : 29-DEC-2018 05:09:37
 Activity end                : 28-JAN-2019 05:09:37
 Executions completed        : 743
 Executions interrupted      : 0
 Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates                              : 16
 Indexes created (visible / invisible)         : 2 (2 / 0)
 Space used (visible / invisible)              : 268.44 MB (268.44 MB / 0 B)
 Indexes dropped                               : 0
 SQL statements verified                       : 6
 SQL statements improved (improvement factor)  : 3 (493.3x)
 SQL plan baselines created                    : 0
 Overall improvement factor                    : 247.4x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
 Unused indexes   : 0
 Space used       : 0 B
 Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
----------------------------------------------------------------------
| Owner | Table | Index                | Key   | Type   | Properties |
----------------------------------------------------------------------
| AUTOI | FACT1 | SYS_AI_0rn9u2kmxxbs7 | F1    | B-TREE | NONE       |
| AUTOI | FACT1 | SYS_AI_151bnmf3xsxaw | F2,F3 | B-TREE | NONE       |
----------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
 Parsing Schema Name : AUTOI
 SQL ID              : 34xymh9usuxzz
 SQL Text            : select /* TESTQ3 */ sum(f4) from fact1 where f1 = 10
 Improvement Factor  : 2268.9x

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------
 Plan Hash Value : 4087791341
-----------------------------------------------------------------------
| Id | Operation          | Name  | Rows | Bytes | Cost  | Time     |
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT   |       |      |       | 51280 |          |
|  1 |  SORT AGGREGATE    |       |    1 |     8 |       |          |
|  2 |  TABLE ACCESS FULL | FACT1 |   98 |   784 | 51280 | 00:00:03 |
-----------------------------------------------------------------------

- With Auto Indexes
-----------------------------
 Plan Hash Value : 911518680
--------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows | Bytes | Cost | Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |     8 |  102 | 00:00:01 |
|   1 |  SORT AGGREGATE                      |                      |    1 |     8 |      |          |
|   2 |  TABLE ACCESS BY INDEX ROWID BATCHED | FACT1                |   98 |   784 |  102 | 00:00:01 |
| * 3 |   INDEX RANGE SCAN                   | SYS_AI_0rn9u2kmxxbs7 |   98 |       |    3 | 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("F1"=10)

ADAPTIVE QUERY OPTIMIZATION
Adaptive Query Optimization is a set of capabilities that enable the optimizer
to make run-time adjustments to execution plans and discover additional
information that can lead to better statistics. This new approach is extremely
helpful when existing statistics are not sufficient to generate an optimal
plan. There are two distinct aspects to Adaptive Query Optimization: adaptive
plans, which focus on improving the execution of a query, and adaptive
statistics, which use additional information to improve query execution plans.
Figure 2: The components that make up the new Adaptive Query Optimization
functionality Adaptive Plans An adaptive plan will be chosen by the Optimizer
if certain conditions are met; for example, when a query includes joins and
complex predicates that make it difficult to estimate cardinality accurately. Adaptive
plans enable the optimizer to defer the plan decision for a statement until
execution time. The optimizer instruments its chosen plan (the default plan),
with statistics collectors so that at runtime, it can detect if cardinality
estimates differ greatly from the actual number of rows seen by the operations
in the plan. If there is a significant difference, then the plan or a portion
of it will be automatically adapted to avoid suboptimal performance. From
Oracle Database 12.2 onwards, adaptive plans are enabled and disabled using the
database parameter optimizer_adaptive_plans. Use of the default value of TRUE
(enabled) is recommended. ADAPTIVE JOIN METHODS The optimizer is able to adapt
join methods on the fly by predetermining multiple sub-plans for portions of
the plan. For example, in Figure 3, the optimizer’s default plan choice for
joining the orders and products tables is a nested loops join via an index
access on the products table. An alternative sub-plan has also been determined
that allows the optimizer to switch the join type to a hash join. In the
alternative plan, the products table will be accessed via a full table scan. During the initial
execution, the statistics collector gathers information about the execution and
buffers a portion of rows coming into the sub-plan. The Optimizer determines
what statistics are to be collected, and how the plan should be resolved for
different values of the statistics. It computes an “inflection point” which is
the value of the statistic where the two plan choices are equally good. For
instance, if the nested loops join is optimal when the scan of an orders table
produces fewer than 10 rows, and the hash join is optimal when the scan of
orders produces more than 10 rows, then the inflection point for these two
plans is 10. The optimizer computes this value, and configures a buffering
statistics collector to buffer and count up to 10 rows. If at least 10 rows are
produced by the scan, then the join method is resolved to hash join; otherwise
it is resolved to nested loops join. In Figure 3, the statistics collector is
monitoring and buffering rows coming from the full table scan of orders. Based
on the information seen in the statistics collector, the optimizer will make
the decision about which sub-plan to use. In this case, the hash join is chosen
since the number of rows coming from the orders table is larger than the
optimizer initially estimated. Figure 3: Adaptive execution plan for the join
between ORDERS and PRODUCTS. Default plan on the left, chosen plan on the
right. The optimizer can switch from a nested loops join to a hash join and
vice versa. However, if the initial join method chosen is a sort merge join, no
adaptation will take place. By default, the explain plan command will show only
the initial or default plan chosen by the optimizer, whereas the
DBMS_XPLAN.DISPLAY_CURSOR function displays the plan actually used by the
query. Figure 4: Explain plan and DBMS_XPLAN.DISPLAY_CURSOR plan output for the
scenario represented in Figure 3. To see all of the operations in an adaptive plan, including the
positions of the statistics collectors, the additional format parameter
‘adaptive’ must be specified in the DBMS_XPLAN functions. In this mode, an
additional notation “-” appears in the Id column of the plan, indicating the
operations in the plan that were not used (inactive).
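For example (a sketch; the SQL ID is a placeholder to be substituted), the full adaptive plan for a given cursor can be displayed as follows:

SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
         sql_id          => '&sql_id',   -- placeholder: SQL ID of interest
         cursor_child_no => 0,
         format          => 'TYPICAL +ADAPTIVE'));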
Figure 5: Complete adaptive plan displayed using ‘ADAPTIVE’ format parameter in DBMS_XPLAN.DISPLAY_CURSOR

SQL Monitor visualizes all operations if “Full” is
selected in the “Plan” drop down box. The inactive parts of the plan are grayed
out (see Figure 6). If the “Plan Note” icon is clicked, a pop-up box will be
displayed confirming that the plan is an adaptive plan. Figure 6: SQL Monitor
showing an adaptive plan
ADAPTIVE PARALLEL DISTRIBUTION METHODS
When a SQL statement is executed in parallel, certain operations, such as sorts, aggregations, and joins, require
data to be redistributed among the parallel server processes executing the
statement. The distribution method chosen by the optimizer depends on the
operation, the number of parallel server processes involved, and the number of
rows expected. If the optimizer inaccurately estimates the number of rows, then
the distribution method chosen could be suboptimal and could result in some
parallel server processes being underutilized. With the new adaptive
distribution method, HYBRID HASH, the optimizer can defer its distribution
method decision until execution, when it will have more information on the
number of rows involved. A statistics collector is inserted before the
operation and if the actual number of rows buffered is less than the threshold
the distribution method will switch from HASH to BROADCAST. If, however, the
number of rows buffered reaches the threshold, then the distribution method will
be HASH. The threshold is defined as twice the degree of parallelism (for example, with a degree of parallelism of 8, up to 16 rows are buffered before the decision is made). Figure 7 shows
an example of a SQL Monitor execution plan for a join between EMP and DEPT that
is executed in parallel. One set of parallel server processes (producers or
pink icons) scan the two tables and send the rows to another set of parallel
server processes (consumers or blue icons) that actually do the join. The
optimizer has decided to use the HYBRID HASH distribution method. The first
table accessed in this join is the DEPT table. The rows coming out of the DEPT
table are buffered in the statistics collector, on line 6 of the plan, until
the threshold is exceeded or the final row is fetched. At that point the
optimizer will make its decision on a distribution method. Figure 7: SQL
Monitor execution plan for hash join between EMP & DEPT that uses adaptive
distribution method

The easiest way to find out which distribution method was chosen at runtime is to look at the OTHER
column in SQL Monitor. This column shows a binocular icon in the lines with PX
SEND HYBRID HASH row source. When you click the icon, you can see the
distribution method used at runtime. Figure 8: Hybrid hash distribution method
For the adaptive distribution methods there are three possible values reported
in this dialog box: 6 = BROADCAST, 5 = ROUNDROBIN, and 16 = HASH distribution.
ADAPTIVE BITMAP INDEX
PRUNING When the optimizer generates a star transformation plan, it must choose
the right combination of bitmap indexes to reduce the relevant set of ROWIDs as
efficiently as possible. If there are many indexes, some of them might not
reduce the ROWID set very substantially but will nevertheless introduce
significant processing cost during query execution. Adaptive plans are
therefore used to prune out indexes that are not significantly filtering down
the number of matched rows. DBMS_XPLAN.DISPLAY_CURSOR will reveal adaptive
bitmap pruning in a SQL execution plan with the adaptive keyword in a similar
manner to the example shown in Figure 4. For example, consider the following
SQL execution plan showing the bitmap index CAR_MODEL_IDX being pruned: Figure
9: Example of adaptive bitmap index pruning. ADAPTIVE STATISTICS The quality of
the execution plans determined by the optimizer depends on the quality of the
statistics available. However, some query predicates become too complex to rely
on base table statistics alone and the optimizer can now augment these
statistics with adaptive statistics. The use of adaptive statistics can extend
SQL statement parse times, which can become an overhead in systems that have
high hard parse rates. In addition, SQL execution plans are likely to be more
dynamic and less consistent. This feature is disabled by default from Oracle
Database 12c Release 2 onwards. Systems that have low hard parse rates and
complex queries (particularly ad-hoc queries) may benefit from this feature by
setting the database initialization parameter optimizer_adaptive_statistics to TRUE.
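If that trade-off is acceptable, the parameter can be changed in the usual way (a sketch; scope and privileges depend on your environment):

ALTER SYSTEM SET optimizer_adaptive_statistics = TRUE SCOPE = BOTH;
-- or, to experiment in a single session only:
ALTER SESSION SET optimizer_adaptive_statistics = TRUE;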
DYNAMIC STATISTICS
During the compilation of a SQL statement, the
optimizer decides if the available statistics are sufficient to generate a good
execution plan or if it should consider using dynamic sampling. Dynamic
sampling is used to compensate for missing or insufficient statistics that
would otherwise lead to a very bad plan. For the case where one or more of the
tables in the query does not have statistics, dynamic sampling is used by the
optimizer to gather basic statistics on these tables before optimizing the
statement. The statistics gathered in this case are not as high a quality (due
to sampling) or as complete as the statistics gathered using the DBMS_STATS
package. Beginning with Oracle Database 12c Release 1, dynamic sampling was
enhanced to become dynamic statistics. Dynamic statistics allow the optimizer
to augment existing statistics to get more accurate cardinality estimates for
not only single table accesses but also joins and group-by predicates. Also,
from Oracle Database 12c Release 1, a new level 11 has been introduced for the
initialization parameter OPTIMIZER_DYNAMIC_SAMPLING. Level 11 enables the
optimizer to automatically decide to use dynamic statistics for any SQL
statement, even if all basic table statistics exist. The optimizer bases its
decision to use dynamic statistics on the complexity of the predicates used,
the existing base statistics, and the total execution time expected for the SQL
statement. For example, dynamic statistics will kick in for situations where
the optimizer previously would have used a guess, such as queries with LIKE
predicates and wildcards.
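As an illustration (a sketch, not a recommendation), the level can be raised for an individual session so that the optimizer decides for itself when to use dynamic statistics:

ALTER SESSION SET optimizer_dynamic_sampling = 11;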
The default dynamic sampling level is 2, so it’s likely that when set to level
11, dynamic sampling will kick in much more often and will increase hard parse
times. AUTOMATIC RE-OPTIMIZATION During the first execution of a SQL statement,
an execution plan is generated as usual. During optimization, certain types of
estimates that are known to be of low quality (for example, estimates for
tables which lack statistics or tables with complex predicates) are noted, and
monitoring is enabled for the cursor that is produced. If feedback monitoring
is enabled for a cursor by the system, cardinality estimates in the plan are
compared to the actual cardinalities seen during execution. If estimates are found
to differ significantly from the actual cardinalities, then the optimizer looks
for a replacement plan on the next execution. The optimizer will use the
information gathered during the previous execution to help determine an
alternative plan. The optimizer can re-optimize a query several times, each
time learning more and further improving the plan. Oracle Database 19c supports
multiple forms of re-optimization. Statistics Feedback Statistics feedback
(formerly known as cardinality feedback) is one form of re-optimization that
automatically improves plans for repeated queries that have cardinality
misestimates. During the first execution of a SQL statement, the optimizer
generates an execution plan and decides if it should enable statistics feedback
monitoring for the cursor. Statistics feedback is enabled in the following
cases: tables with no statistics, multiple conjunctive or disjunctive filter
predicates on a table, and predicates containing complex operators for which
the optimizer cannot accurately compute cardinality estimates. At the end of
the execution, the optimizer compares its original cardinality estimates to the
actual cardinalities observed during execution and, if estimates differ
significantly from actual cardinalities, it stores the correct estimates for
subsequent use. It will also create a SQL plan directive so other SQL
statements can benefit from the information learnt during this initial
execution. If the query executes again, then the optimizer uses the corrected
cardinality estimates instead of its original estimates to determine the
execution plan. If the initial estimates are found to be accurate, no additional
steps are taken. After the first execution, the optimizer disables monitoring
for statistics feedback. Figure 10 shows an example of a SQL statement that
benefits from statistics feedback. On the first execution of this two-table
join, the optimizer underestimates the cardinality by 8X due to multiple,
correlated, single-column predicates on the customers table. Figure 10: Initial
execution of a SQL statement that benefits from automatic re-optimization
statistics feedback Where estimates vary greatly from the actual number of rows
returned, the cursor is marked IS_REOPTIMIZABLE and will not be used again. The
IS_REOPTIMIZABLE attribute indicates that this SQL statement should be hard
parsed on the next execution so the optimizer can use the execution statistics
recorded on the initial execution to determine a better execution plan.
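The re-optimization flag can be observed in V$SQL (a sketch; the SQL text filter is a placeholder):

SELECT sql_id, child_number, is_reoptimizable
FROM   v$sql
WHERE  sql_text LIKE 'select%customers%';  -- placeholder filter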
Figure 11: Cursor marked IS_REOPTIMIZABLE after initial execution statistics vary greatly from original
cardinality estimates

A SQL plan directive is also created to ensure that the
next time any SQL statement that uses similar predicates on the customers table
is executed, the optimizer will be aware of the correlation among these
columns. On the second execution the optimizer uses the statistics from the
initial execution to determine a new plan that has a different join order. The
use of statistics feedback in the generation of the execution plan is indicated in
the note section under the execution plan. Figure 12: New plan generated using
execution statistics from initial execution

The new plan is not marked IS_REOPTIMIZABLE, so it will be
used for all subsequent executions of this SQL statement. Figure 13: New plan
generated using execution statistics from initial execution

Performance Feedback
Another form of
re-optimization is Performance Feedback, which helps to improve the degree of
parallelism chosen for repeated SQL statements when Automatic Degree of
Parallelism (Auto DOP) is enabled with parallel_degree_policy = ADAPTIVE. When
Auto DOP is enabled in adaptive mode, during the first execution of a SQL
statement, the optimizer determines if the statement should execute in parallel
and if so what parallel degree should be used. The parallel degree is chosen
based on the estimated performance of the statement. Additional performance
monitoring is also enabled for the initial execution of any SQL statement the
optimizer decides to execute in parallel. At the end of the initial execution,
the parallel degree chosen by the optimizer is compared to the parallel degree
computed based on the actual performance statistics (e.g. CPU time) gathered
during the initial execution of the statement. If the two values vary
significantly then the statement is marked for re-optimization and the initial
execution performance statistics are stored as feedback to help compute a more
appropriate degree of parallelism for subsequent executions. If performance
feedback is used for a SQL statement then it is reported in the note section
under the plan as shown in Figure 14. Figure 14: Execution plan for a SQL
statement that was found to run better serially by performance feedback

SQL plan directives
SQL plan
directives are automatically created based on information learnt via automatic
re-optimization. A SQL plan directive is additional information that the
optimizer uses to generate a more optimal execution plan. For example, when
joining two tables that have a data skew in their join columns, a SQL plan
directive can direct the optimizer to use dynamic statistics to obtain a more
accurate join cardinality estimate. SQL plan directives are created on query
expressions rather than at a statement or object level to ensure they can be
applied to multiple SQL statements. It is also possible to have multiple SQL
plan directives used for a SQL statement. The number of SQL plan directives
used for a SQL statement is shown in the note section under the execution plan
(Figure 15). Figure 15: The number of SQL plan directives used for a statement
is shown in the note section under the plan

The database automatically
maintains SQL plan directives and stores them in the SYSAUX tablespace. Any SQL
plan directive that is not used after 53 weeks will be automatically purged.
SQL plan directives can also be manually managed (altered or deleted) using the
package DBMS_SPD but it is not possible to manually create a SQL plan
directive. SQL plan directives can be monitored using the views DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS (see Figure 16).
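For example (a sketch; the schema filter is illustrative), directives and the objects they relate to can be listed by joining the two views:

SELECT d.directive_id, d.type, d.state, d.reason,
       o.owner, o.object_name, o.subobject_name
FROM   dba_sql_plan_directives d
       JOIN dba_sql_plan_dir_objects o ON o.directive_id = d.directive_id
WHERE  o.owner = 'SH'
ORDER  BY d.directive_id;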
Figure 16: Monitoring SQL plan directives automatically created based on information learnt via re-optimization

There are two types of SQL plan directive rows:
DYNAMIC_SAMPLING and DYNAMIC_SAMPLING_RESULT. The dynamic sampling type tells
the optimizer that when it sees this particular query expression (for example,
filter predicates on country_id, cust_city, and cust_state_province being used
together) it should use dynamic sampling to address the cardinality
misestimate. The dynamic
sampling result type is present from Oracle Database 12c Release 2 onwards and
signifies that results from dynamic sampling queries are stored in the SQL plan
directive repository (instead of the Server Result Cache as used by Oracle
Database 12c Release 1). Figure 17: Dynamic sampling results stored in the SQL
plan directive repository in Oracle Database 12c Release 2 onwards. From Oracle
Database 12c Release 2 onwards, SQL plan directives are created but, by default, they are
not used at parse time to modify SQL execution plans. This prioritizes plan
stability over maximizing query execution performance. The optimizer's use of SQL
plan directives is controlled by the database initialization parameter
optimizer_adaptive_statistics, which has the default value of FALSE. This
setting is recommended for most systems. In addition to finding optimal SQL
execution plans, SQL plan directives can be used by the database to determine
if extended statistics2, specifically column groups, are missing and would
resolve the cardinality misestimates. After a SQL plan directive is used, the
optimizer decides if the cardinality misestimate could be resolved with a
column group. If so, the database can automatically create that column group
the next time statistics are gathered on the appropriate table. This step is
“always on” in Oracle Database 12c Release 1, but from Oracle Database 12c
Release 2, it is controlled by the DBMS_STATS preference AUTO_STAT_EXTENSIONS.
The recommended (and default) value is OFF. This prioritizes plan stability
over maximizing query performance. Automatic column group creation can be
enabled using the following step:

EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_STAT_EXTENSIONS', 'ON')

Extended statistics
will be used in place of the SQL plan directive when possible (equality
predicates, group bys etc.). If the SQL plan directive is no longer necessary
it will be automatically purged after 53 weeks. 2 More information on extended
statistics can be found in the paper “Understanding Optimizer Statistics with
Oracle Database 19c”. See Reference 1.

OPTIMIZER STATISTICS
Optimizer statistics are a collection of data
that describe the database and the objects in it. The optimizer uses these
statistics to choose the best execution plan for each SQL statement. Being able
to gather the appropriate statistics in a timely manner is critical to
maintaining acceptable performance on any Oracle system. With each new release,
Oracle strives to provide the necessary statistics automatically. A summary is
presented here, but full details can be found in Reference 1, Understanding
Optimizer Statistics with Oracle Database 19c. High-frequency Automatic
Statistics Gathering This Oracle Database 19c new feature is available on
certain Oracle Database platforms. Check the Oracle Database Licensing Guide
for more information. Stale statistics can mislead the optimizer and in some
cases can result in a sub-optimal plan. To address this, automatic optimizer
statistics collection was introduced in Oracle 11g. The automated maintenance
task infrastructure schedules statistics gathering to occur in maintenance
windows. By default, one window is scheduled for each day of the week and
statistics collection runs in all predefined maintenance windows. For systems
with rapidly changing datasets, data can drastically change in a short period
of time, rapidly rendering statistics stale and making daily statistics gathering
insufficient. New to Oracle Database 19c, high-frequency automatic statistics
collection is introduced as a complement to the existing auto task. It is
decoupled from the maintenance window. By configuring the DBMS_STATS
preferences, the DBA can customize the frequency and maximum run time according
to their requirement. Note that the existing automatic statistics collection
running in the predefined maintenance window will not be affected, and
the high-frequency automatic statistics task will not start during the maintenance window.
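The high-frequency task is controlled through DBMS_STATS global preferences, for example (a sketch; the run-time and interval values are illustrative):

-- Enable the high-frequency statistics gathering task
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_STATUS', 'ON')
-- Limit each run to 600 seconds
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_MAX_RUN_TIME', '600')
-- Attempt to start a new run every 900 seconds
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_TASK_INTERVAL', '900')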
Histograms
Histograms tell the optimizer about the distribution of data within a column. By default, the optimizer assumes a uniform distribution of
rows across the distinct values in a column and will calculate the cardinality
for a query with an equality predicate by dividing the total number of rows in
the table by the number of distinct values in the column used in the equality
predicate (for example, 1,000,000 rows and 100 distinct values gives an estimate of 10,000 rows). The presence of a histogram changes the formula used by the
optimizer to determine the cardinality estimate, and allows it to generate a more
accurate estimate. Online Statistics Gathering When an index is created, Oracle
automatically gathers optimizer statistics as part of the index creation by
piggybacking the statistics gather on the full data scan and sort necessary for
the index creation (this has been available since Oracle Database 9i). The same
technique is applied for direct path operations such as create table as select
(CTAS) and insert as select (IAS) operations into empty tables. Piggybacking
the statistics gathering as part of the data loading operation means no
additional full data scan is required to have statistics available immediately
after the data is loaded. The additional time spent on gathering statistics is
small compared to a separate statistics collection process, and it guarantees that accurate statistics are readily available from the get-go.
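For example (a sketch; the table names are illustrative), statistics are already present when a CTAS completes:

CREATE TABLE sales_copy AS SELECT * FROM sales;

-- Statistics were gathered during the load; no separate DBMS_STATS call is needed
SELECT num_rows, blocks, last_analyzed
FROM   user_tab_statistics
WHERE  table_name = 'SALES_COPY';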
Real-time Statistics
This Oracle Database
19c new feature is available on certain Oracle Database platforms. Check the
Oracle Database Licensing Guide for more information. Online statistics
gathering for create table as select (CTAS) and direct path insert was first
introduced in Oracle Database 12c Release 1. This allows statistics to be
accurate and available immediately, but prior to Oracle Database 19c, this
capability was not applicable to conventional data manipulation language
(DML) operations which dominate the way data is manipulated in (for example)
traditional OLTP systems. Real-time statistics extend the online statistics
gathering techniques to conventional insert, update and merge DML operations.
In order to minimize the performance overhead of generating these statistics,
only the most essential optimizer statistics are gathered during DML
operations. The collection of the remaining stats (such as the number of
distinct values) is deferred to the automatic statistics gathering job,
high-frequency stats gathering and/or the manual invocation of the DBMS_STATS
API. See Reference 2, Best Practices for Gathering Optimizer Statistics with
Oracle Database 19c, for more information.
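Statistics produced this way can be recognized in the dictionary (a sketch; I am assuming STATS_ON_CONVENTIONAL_DML is the NOTES value used for DML-driven statistics, as documented):

SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  notes = 'STATS_ON_CONVENTIONAL_DML';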
Incremental Statistics
Gathering statistics on partitioned tables consists of gathering statistics at both the
table level (global statistics) and at the (sub)partition level. If data in (sub)partitions
is changed in any way, or if partitions are added or removed, then the
global-level statistics must be updated to reflect the changes so that there is
correspondence between the partition-level and global-level statistics. For
large partitioned tables, it can be very costly to scan the whole table to
reconstruct accurate global-level statistics. To address this, incremental statistics were introduced in Oracle Database 11g, whereby synopses are created for each partition in the table. These data structures can be used to derive global-level statistics, including non-aggregatable statistics such as column cardinality, without scanning the entire table.
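Incremental maintenance is enabled per table with a DBMS_STATS preference (a sketch; the schema and table names are illustrative):

-- Create synopses for SALES so global statistics can be derived incrementally
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE')

-- Subsequent gathers aggregate partition synopses instead of rescanning the whole table
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES')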
Incremental Statistics and Staleness
A DBMS_STATS preference called INCREMENTAL_STALENESS allows you to control when partition statistics
will be considered stale and not good enough to generate global level
statistics. By default, INCREMENTAL_STALENESS is set to NULL, which means
partition level statistics are considered stale as soon as a single row changes
(the same behavior as in Oracle Database 11g). Alternatively, it can be set to
USE_STALE_PERCENT or USE_LOCKED_STATS. USE_STALE_PERCENT means the partition
level statistics will be used as long as the percentage of rows changed (in the
respective partition or subpartition) is less than the value of the preference
STALE_PERCENT (10% by default). USE_LOCKED_STATS means if statistics on a
partition are locked, they will be used to generate global level statistics
regardless of how many rows have changed in that partition since statistics
were last gathered.
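For example (a sketch; names are illustrative), both relaxations can be combined for a given table:

EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL_STALENESS', 'USE_STALE_PERCENT, USE_LOCKED_STATS')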
Incremental Statistics and Partition Exchange Loads
One of the benefits of partitioning is the ability to load data quickly and easily,
with minimal impact on the business users, by using the exchange partition
command. The exchange partition command allows the data in a non-partitioned
table to be swapped into a specified partition in the partitioned table. The
command does not physically move data; instead it updates the data dictionary
to exchange a pointer from the partition to the table and vice versa. The
Oracle Database allows you to create the necessary statistics (synopses) on
load tables before they are exchanged into the partitioned table. This means
that global partition statistics can be maintained incrementally the moment
that the load table is exchanged with the relevant table partition.
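The load table is prepared by setting its INCREMENTAL_LEVEL preference to TABLE so that a synopsis is created when its statistics are gathered (a sketch; the table and partition names are illustrative):

-- Build a table-level synopsis on the load table before the exchange
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES_LOAD', 'INCREMENTAL_LEVEL', 'TABLE')
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES_LOAD')

-- Exchange the load table into the target partition
ALTER TABLE sales EXCHANGE PARTITION sales_q4_2018 WITH TABLE sales_load;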
Compact Synopses
The performance benefit of gathering statistics incrementally can come at the price of high disk storage for synopses (they are stored in
the SYSAUX tablespace). More storage is required for synopses for tables with a
high number of partitions and a large number of columns, particularly where the
number of distinct values (NDV) is high. Besides consuming storage space, the
performance overhead of maintaining very large synopses can become significant.
Oracle Database 12c Release 2 introduced a new algorithm for gathering and
storing NDV information, which results in much smaller synopses while maintaining
a similar level of accuracy to the previous algorithm.

CONCURRENT STATISTICS
When the global
statistics gathering preference CONCURRENT is set, Oracle employs the Oracle
Job Scheduler to create and manage one statistics gathering job per object
(tables and/or partitions) concurrently.
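For example (a sketch), concurrency can be enabled for both manual and automatic statistics gathering:

EXEC DBMS_STATS.SET_GLOBAL_PREFS('CONCURRENT', 'ALL')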
AUTOMATIC COLUMN GROUP DETECTION
Extended statistics were introduced in Oracle Database 11g. They help the
optimizer improve the accuracy of cardinality estimates for SQL statements that
contain predicates involving a function wrapped column (e.g. UPPER(LastName))
or multiple columns from the same table that are used in filter predicates,
join conditions, or group-by keys. Although extended statistics are extremely
useful, it can be difficult to know which extended statistics should be created
if you are not familiar with an application or data set. Auto column group
detection automatically determines which column groups are required for a
table based on a given workload. The detection and creation of column groups is a simple three-step procedure3.
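The three steps use existing DBMS_STATS calls (a minimal sketch; the 300-second monitoring window and the schema/table names are illustrative):

-- 1. Seed column usage monitoring for the workload (here: 300 seconds)
EXEC DBMS_STATS.SEED_COL_USAGE(NULL, NULL, 300)

-- 2. Review the column usage that was recorded
SELECT DBMS_STATS.REPORT_COL_USAGE('SH', 'CUSTOMERS') FROM dual;

-- 3. Create the column groups identified by the workload
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SH', 'CUSTOMERS') FROM dual;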
NEW REPORTING SUBPROGRAMS IN DBMS_STATS PACKAGE
Knowing when and how to gather statistics in a timely manner is critical to maintaining acceptable performance on any system. Determining what
statistics gathering operations are currently executing in an environment and
how changes to the statistics methodology will impact the system can be
difficult and time consuming. Reporting subprograms in the DBMS_STATS package make
it easier to monitor what statistics gathering activities are currently going
on and what impact changes to the parameter settings of these operations will
have. The DBMS_STATS subprograms are REPORT_STATS_OPERATIONS,
REPORT_SINGLE_STATS_OPERATION and REPORT_GATHER_*_STATS.
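For example (a sketch; I am assuming the documented since/until parameters of REPORT_STATS_OPERATIONS, and the seven-day window is illustrative):

SELECT DBMS_STATS.REPORT_STATS_OPERATIONS(
         since  => SYSTIMESTAMP - INTERVAL '7' DAY,
         until  => SYSTIMESTAMP,
         format => 'TEXT') report
FROM   dual;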
Figure 18 shows an example output from the REPORT_STATS_OPERATIONS function. The report shows
detailed information about what statistics gathering operations have occurred
during a specified time window. It gives details on when each operation occurred,
its status, and the number of objects covered and it can be displayed in either
text or HTML format. Figure 18: Reporting stats operations. 3 For information
on creating column groups, see the white paper Understanding Optimizer
Statistics With Oracle Database 19c.

SQL QUARANTINE
This Oracle Database 19c new feature is available
on certain Oracle Database platforms. Check the Oracle Database Licensing Guide
for more information. RUNAWAY SQL STATEMENTS Runaway SQL statements can be a
threat to database service levels by consuming significant system resources if
they remain unchecked. This is a particular problem if multiple runaway queries
are running concurrently. The Oracle Database Resource Manager (DBRM) has
always offered the ability to terminate a SQL statement that exceeds certain
resource utilization thresholds (such as CPU and IO) as well as the ability to
terminate queries that exceed maximum runtime thresholds. However, this does
not prevent repeated execution of runaway queries. A problem query may be run
over and over again, each time consuming significant resources before being
terminated. For example:

Figure 19 - Runaway SQL statements

For the
sake of example, we will assume that the machine CPU utilization looks
something like this in response to the runaway queries being executed. In this
case, when three queries are running concurrently, the CPU is nearly at 100%:
Figure 20 - CPU consumed by runaway SQL

SQL QUARANTINE
The new Oracle Database 19c feature SQL
Quarantine can be used to eliminate the overhead of runaway queries. When DBRM
detects a SQL statement is exceeding a resource or run-time limit, the SQL
execution plan used by the statement is quarantined. If the SQL statement is
executed again and it is using the same SQL execution plan then it will be
terminated immediately. This can significantly reduce the amount of system
resource that would otherwise be wasted. In the following example, there is
high utilization when the initial queries are executed, but once they are
quarantined they no longer consume system resources because they are terminated
prior to execution:

Figure 21 - CPU saved by SQL quarantine

DBRM must be configured
to apply a limit. In this case, queries executing for more than 5 seconds will
be terminated:

-- The plan will cancel the current SQL if elapsed time exceeds 5 seconds.
dbms_resource_manager.create_plan_directive(
   plan             => 'LIMIT_RESOURCE',
   group_or_subplan => 'TEST_RUNAWAY_PLANS',
   comment          => 'Kill statement after exceeding desired execution time',
   switch_group     => 'CANCEL_SQL',
   switch_time      => 5,
   switch_estimate  => false);

A DBRM
exception is raised as follows:

SQL> select /* Time consuming query */ * from ...;
*
ERROR at line 1:
ORA-00040: active time limit exceeded - call aborted

Elapsed: 00:00:07.59

Quarantine information is held in memory and
flushed to disk periodically. Once the quarantine configuration is persisted in
the data dictionary, the following error condition will be raised:

SQL> select /* Time consuming query */ * from ...;
*
ERROR at line 1:
ORA-56955: quarantined plan used

Elapsed: 00:00:00.00
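Quarantine configurations can be reviewed in the data dictionary (a sketch; the column list is trimmed for readability):

SELECT name, sql_text, plan_hash_value, enabled
FROM   dba_sql_quarantine;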
OPTIMIZER STATISTICS ADVISOR
It is well known that inferior
statistics cause query performance problems. It is relatively easy to identify
stale, out-of-date statistics and missing statistics, but poor quality
statistics can be harder to identify, such as inconsistencies between tables and
indexes or between primary-key/foreign-key relationships, and so on. Inconsistencies in
statistics are usually a result of not following recommended approaches, but it
is not always easy to strictly adhere to these for a number of reasons. For
example, Oracle continuously enhances statistics gathering features but
enhancements can be overlooked post-upgrade (a good example is the
recommendation to use AUTO_SAMPLE_SIZE rather than fixed percentages). DBAs may
use legacy scripts to gather statistics manually, and there may be a reluctance
to change “proven” procedures. Sometimes statistics gathering can be overlooked:
statistics might not be maintained during batch processing because there is a
perceived lack of time in the batch window. There are many “inherited” systems
too, where nobody understands the scripts that are used to maintain statistics.
From Oracle Database 12c Release 2, a feature called the Optimizer Statistics
Advisor is available. The goal of the advisor is to analyze how statistics are
gathered, validate the quality of statistics already gathered and check the
status of auto stats gathering (for example, checking for successful
completion). To achieve this, it examines the data dictionary with respect to a
set of rules. Where exceptions to the rules are found, findings may be
generated and these, in turn, may lead to specific recommendations. The advisor
will generate a report that lists findings (with the associated “broken” rule),
and then list specific recommendations to remedy the situation. Finally, the
recommendations can be implemented using a set of actions. Actions can be
output in the form of a SQL script or they can be implemented automatically.
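As an illustration, here is a minimal sketch using the documented DBMS_STATS advisor functions (the task name is arbitrary):

VARIABLE report CLOB

DECLARE
  tname VARCHAR2(128) := 'STATS_ADVISOR_DEMO';  -- arbitrary task name
  ename VARCHAR2(128);
BEGIN
  tname   := DBMS_STATS.CREATE_ADVISOR_TASK(tname);
  ename   := DBMS_STATS.EXECUTE_ADVISOR_TASK(tname);
  :report := DBMS_STATS.REPORT_ADVISOR_TASK(tname);
END;
/

PRINT report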
Full details can be found in Reference 2, Best Practices for Gathering
Optimizer Statistics with Oracle Database 19c.

SQL PLAN MANAGEMENT
SQL Plan Management (SPM) is an
exceptionally important feature to consider for critical applications that
require guaranteed SQL execution plan stability. SPM is furthermore fundamental
for any database upgrade to evolve execution plans from one optimizer version
to another in a controlled manner, managing execution plans and ensuring that
only known or verified plans are used.

A number of enhancements were made to SQL plan management in Oracle Database 12c:
· Automatic plan evolution
· Filtered auto capture
· Capture plans from AWR Repository

Oracle Database 18c adds:
· Support for a subset of SQL plan management functionality in Oracle Database Standard Edition

Oracle Database 19c adds:
· Prevention of SQL regression with a new default 'AUTO' evolution mode that locates and tests alternative SQL execution plans from multiple sources to find the most optimal

These features are covered in detail in Reference 4, SQL Plan
Management with Oracle Database 19c.

ENHANCEMENTS TO DBMS EXPLAIN PLAN
Hint Usage Reporting
Hints used in SQL statements can be difficult to 'debug'. For
example, it is not always easy to understand why a particular hint is not being
used. There is no error reporting for hints, so if they are incorrectly
specified in any way, they will be ignored silently. Hint usage reporting
addresses this. It provides detailed information such as invalid hints,
conflicting hints, ignored hints (e.g. because of the query block location
specified) and hints that successfully affected the final plan. Here is an
example:

EXPLAIN PLAN FOR
SELECT /*+ INDEX(t1) FULL(@sel$2 t1) */ COUNT(*)
FROM   jobs t1
WHERE  t1.job_id IN (SELECT /*+ FULL(t1) */ job_id FROM employees t1);

SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'typical'));

----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |    1 |    17 |     3  (34)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |    1 |    17 |            |          |
|   2 |   NESTED LOOPS       |           |   19 |   323 |     3  (34)| 00:00:01 |
|   3 |    SORT UNIQUE       |           |  107 |   963 |     2   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |  107 |   963 |     2   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN | JOB_ID_PK |    1 |     8 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------
...
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (U - Unused (1))
---------------------------------------------------------------------------
   4 -  SEL$5DA710D3 / "T1"@"SEL$2"
         U -  FULL(t1) / hint overridden by another in parent query block
           -  FULL(@sel$2 t1)
   5 -  SEL$5DA710D3 / "T1"@"SEL$1"
           -  INDEX(t1)

New format specifiers have been created to support hint
reporting:
· HINT_REPORT
· HINT_REPORT_USED
· HINT_REPORT_UNUSED

For example:

SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'basic +hint_report'));

SQL Plan Comparison
The plan comparison feature
(henceforth referred to as the compare plans tool) takes as input a reference plan
and an arbitrary list of test plans. It compares the reference plan with each
of the test plans in the list and produces a difference report. Such a tool is
useful in a variety of use cases:
· New plans: When a plan change occurs, users want to compare it with an old plan (for instance, stored in AWR).
· SQL Performance Analyzer (SPA), SQL Tune: How does a plan generated based on a new SQL profile or a plan generated through SPA differ from the original plan?
· SQL Plan Management: When an accepted baseline plan is not reproduced by SPM, how does the original plan that is stored in the baseline differ from the generated plan? What are the differences between the different plan baselines captured for the same query?
· Plan generated from a different SQL statement or based on the effect of hints/parameters: How does adding a particular hint or changing the value of a parameter affect the plan?
Plan comparison goes beyond a simple line-by-line comparison because SQL
execution plans are execution trees. A more complex analysis is required and
logical differences must be identified. Consider two queries:

select count(*) from products p, sales s
where p.prod_id = s.prod_id and p.prod_min_price > 200;

select count(*) from products p, sales s
where p.prod_id = s.prod_id and s.quantity_sold = 43;

They can be compared as follows:

VARIABLE v_rep CLOB

BEGIN
  :v_rep := DBMS_XPLAN.COMPARE_PLANS(
              reference_plan    => cursor_cache_object('0hxmvnfkasg6q', NULL),
              compare_plan_list => plan_object_list(cursor_cache_object('10dqxjph6bwum', NULL)),
              type              => 'TEXT',
              level             => 'TYPICAL',
              section           => 'ALL');
END;
/

A report can be generated as follows:

SET PAGESIZE 50000
SET LONG 100000
SET LINESIZE 210
COLUMN report FORMAT a200

SELECT :v_rep REPORT FROM DUAL;

REPORT
---------------------------------------------------------------------------------------------
COMPARE PLANS REPORT
---------------------------------------------------------------------------------------------
 Current user          : SH
 Total number of plans : 2
 Number of findings    : 1
---------------------------------------------------------------------------------------------

COMPARISON DETAILS
---------------------------------------------------------------------------------------------
 Plan Number           : 1 (Reference Plan)
 Plan Found            : Yes
 Plan Source           : Cursor Cache
 SQL ID                : 0hxmvnfkasg6q
 Child Number          : 0
 Plan Database Version : 19.0.0.0
 Parsing Schema        : "SH"
 SQL Text              : select count(*) from products p, sales s where
                         p.prod_id = s.prod_id and s.quantity_sold = 43

Plan
-----------------------------
 Plan Hash Value : 3519235612
-------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |      |       |  469 |          |
|   1 |  SORT AGGREGATE       |       |    1 |     3 |      |          |
|   2 |   PARTITION RANGE ALL |       |    1 |     3 |  469 | 00:00:01 |
| * 3 |    TABLE ACCESS FULL  | SALES |    1 |     3 |  469 | 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - filter("S"."QUANTITY_SOLD"=43)
---------------------------------------------------------------------------------------------
 Plan Number           : 2
 Plan Found            : Yes
 Plan Source           : Cursor Cache
 SQL ID                : 10dqxjph6bwum
 Child Number          : 0
 Plan Database Version : 19.0.0.0
 Parsing Schema        : "SH"
 SQL Text              : select count(*) from products p, sales s where
                         p.prod_id = s.prod_id and p.prod_min_price > 200

Plan
-----------------------------
 Plan Hash Value : 3037679890
--------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name           | Rows   | Bytes    | Cost | Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                |        |          |   34 |          |
|   1 |  SORT AGGREGATE                  |                |      1 |       13 |      |          |
| * 2 |   HASH JOIN                      |                | 781685 | 10161905 |   34 | 00:00:01 |
| * 3 |    TABLE ACCESS FULL             | PRODUCTS       |     61 |      549 |    2 | 00:00:01 |
|   4 |    PARTITION RANGE ALL           |                | 918843 |  3675372 |   29 | 00:00:01 |
|   5 |     BITMAP CONVERSION TO ROWIDS  |                | 918843 |  3675372 |   29 | 00:00:01 |
|   6 |      BITMAP INDEX FAST FULL SCAN | SALES_PROD_BIX |        |          |      |          |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("P"."PROD_ID"="S"."PROD_ID")
* 3 - filter("P"."PROD_MIN_PRICE">200)

Notes
-----
- This is an adaptive plan

Comparison Results (1):
-----------------------------
1. Query block SEL$1: Transformation JOIN REMOVED FROM QUERY BLOCK occurred only in the
   reference plan (result query block: SEL$A43D1678).

INITIALIZATION PARAMETER CHANGES
Oracle Database 18c made the following initialization parameters
available:

optimizer_ignore_hints
Setting to TRUE will allow the optimizer to ignore hints (other than hints intended to control parallel execution or direct path load). The default value is FALSE.

optimizer_ignore_parallel_hints
Setting
to TRUE will allow the optimizer to ignore hints that control parallel
operations (e.g. PARALLEL and NOPARALLEL). The default value is FALSE.
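For example (a sketch), both parameters can be set at session level while testing:

ALTER SESSION SET optimizer_ignore_hints = TRUE;
ALTER SESSION SET optimizer_ignore_parallel_hints = TRUE;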
REFERENCES
1. Understanding Optimizer Statistics with Oracle Database 19c
2. Best Practices for Gathering Optimizer Statistics with Oracle Database 19c
3. SQL Plan Management with Oracle Database 19c