Friday, 19 March 2021

The Optimizer In Oracle Database 19c

 

AUTOMATIC INDEXING

This Oracle Database 19c new feature is available on certain Oracle Database platforms. Check the Oracle Database Licensing Guide for more information.

INDEXING CHALLENGES

Indexes are useful for most types of workload and are particularly critical for high performance in online transaction processing (OLTP) and operational data store (ODS) environments. However, identifying the best indexes is not straightforward and can require a considerable amount of manual labor. This is largely because index management is a continuous process. Applications are rarely static; they change in response to changing business requirements. For this reason, there will be changes in data volumes, of course, but also less obvious changes in data value distributions and in the way data is organized, processed and accessed.

Traditionally, DBAs have been responsible for monitoring performance and deciding when and where to add, change or remove indexes in a tactical and often ad-hoc manner. This ad-hoc approach to index maintenance is prone to error because it is almost impossible to quantify the effect of any change, whether positive or negative. It can lead to a database that has many more indexes than necessary, where indexes have been gradually added over time and there is a reluctance to remove any of them for fear of negative consequences. This in turn increases the system resources required to maintain indexes when data is modified and processed. In addition, over-indexed environments often suffer from less stable SQL execution plans because the sheer number of indexes makes the optimizer's choice of index access path more and more finely balanced.

AUTOMATIC INDEXING

Automatic indexing addresses these issues. It is not a simple advisor; it is an expert system that implements indexes based on what a performance engineer skilled in index tuning would do. The Oracle Database analyzes the application workload and identifies the queries that will benefit from additional indexes. In other words, it identifies candidate indexes and validates them before implementation, and the entire process is fully automatic. Here is a summary of the workflow:

Figure 1: Automatic indexing workflow

The steps can be summarized as follows:

Capture
· Periodically capture SQL statements from the application workload in a SQL tuning set. This tuning set is called the automatic SQL tuning set (ASTS).

Identify
· Identify candidate indexes that may benefit the application workload.
· Create unusable and invisible index candidates (this is a data dictionary metadata change only).

Verify
· The optimizer verifies which index candidates will be used by the captured SQL statements.
· Materialize the successful candidates and measure their effect on the performance of the captured SQL statements.
· The indexes remain invisible to the application workload and all verification is done outside the application workload.

Decide
· Indexes that are found to offer a significant performance improvement are made visible to the application workload.
· Indexes that are found to provide insufficient performance benefits remain invisible to the application workload.
· Automatic indexing offers a mixed solution where necessary. If there is an overall benefit in using a particular auto index, then it will be marked visible. If individual queries suffer from a performance regression, SQL plan baselines are used to prevent them from using regressed plans.
Monitor
· SQL performance and index usage is continuously monitored. Indexes that have not been used for a configurable period of time will be dropped.

Configuring and Monitoring Automatic Indexing

Automatic indexing requires little to no manual intervention, but a package called DBMS_AUTO_INDEX is provided for changing a small number of defaults. The feature can be enabled as follows:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'IMPLEMENT')

And disabled:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'OFF')

The number of days unused auto indexes are retained is controlled as follows:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_RETENTION_FOR_AUTO', '373')

It is possible to specify which schemas are subject to auto indexing:

SQL> -- Exclude SH and HR
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'SH', FALSE)
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', FALSE)

SQL> -- Remove HR from exclusion list
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', NULL)

SQL> -- Remove all schemas from exclusion list
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', NULL, TRUE)

SQL> -- Include SH for auto indexing but exclude HR
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'SH', TRUE)
SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'HR', FALSE)

All parameter settings (and schemas that have been included and excluded) can be seen as follows:

SQL> select * from dba_auto_index_config;

A report on auto index activity can be generated. For example:

SQL> set linesize 300 trims on pagesize 1000 long 100000
SQL> column report format a120
SQL> SELECT dbms_auto_index.report_activity(sysdate-30, null, 'text', 'all', 'all') report
     FROM   dual;

Here is an extract from an example report:

REPORT
-------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
 Activity start              : 29-DEC-2018 05:09:37
 Activity end                : 28-JAN-2019 05:09:37
 Executions completed        : 743
 Executions interrupted      : 0
 Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
 Index candidates                             : 16
 Indexes created (visible / invisible)        : 2 (2 / 0)
 Space used (visible / invisible)             : 268.44 MB (268.44 MB / 0 B)
 Indexes dropped                              : 0
 SQL statements verified                      : 6
 SQL statements improved (improvement factor) : 3 (493.3x)
 SQL plan baselines created                   : 0
 Overall improvement factor                   : 247.4x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
 Unused indexes   : 0
 Space used       : 0 B
 Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
----------------------------------------------------------------------
| Owner | Table | Index                | Key   | Type   | Properties |
----------------------------------------------------------------------
| AUTOI | FACT1 | SYS_AI_0rn9u2kmxxbs7 | F1    | B-TREE | NONE       |
| AUTOI | FACT1 | SYS_AI_151bnmf3xsxaw | F2,F3 | B-TREE | NONE       |
----------------------------------------------------------------------

-------------------------------------------------------------------------------
VERIFICATION DETAILS
-------------------------------------------------------------------------------
1. The performance of the following statements improved:
-------------------------------------------------------------------------------
 Parsing Schema Name : AUTOI
 SQL ID              : 34xymh9usuxzz
 SQL Text            : select /* TESTQ3 */ sum(f4) from fact1 where f1 = 10
 Improvement Factor  : 2268.9x

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
 Plan Hash Value : 4087791341

-----------------------------------------------------------------------
| Id | Operation          | Name  | Rows | Bytes | Cost  | Time     |
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT   |       |      |       | 51280 |          |
|  1 |  SORT AGGREGATE    |       |    1 |     8 |       |          |
|  2 |  TABLE ACCESS FULL | FACT1 |   98 |   784 | 51280 | 00:00:03 |
-----------------------------------------------------------------------

- With Auto Indexes
-----------------------------
 Plan Hash Value : 911518680

--------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows | Bytes | Cost | Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |     8 |  102 | 00:00:01 |
|   1 |  SORT AGGREGATE                      |                      |    1 |     8 |      |          |
|   2 |  TABLE ACCESS BY INDEX ROWID BATCHED | FACT1                |   98 |   784 |  102 | 00:00:01 |
| * 3 |   INDEX RANGE SCAN                   | SYS_AI_0rn9u2kmxxbs7 |   98 |       |    3 | 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("F1"=10)

ADAPTIVE QUERY OPTIMIZATION

Adaptive Query Optimization is a set of capabilities that enable the optimizer to make run-time adjustments to execution plans and discover additional information that can lead to better statistics. This approach is extremely helpful when existing statistics are not sufficient to generate an optimal plan. There are two distinct aspects of Adaptive Query Optimization: adaptive plans, which focus on improving the execution of a query, and adaptive statistics, which use additional information to improve query execution plans.

Figure 2: The components that make up the Adaptive Query Optimization functionality

Adaptive Plans

An adaptive plan will be chosen by the optimizer if certain conditions are met; for example, when a query includes joins and complex predicates that make it difficult to estimate cardinality accurately. Adaptive plans enable the optimizer to defer the plan decision for a statement until execution time.
The optimizer instruments its chosen plan (the default plan) with statistics collectors so that, at runtime, it can detect if cardinality estimates differ greatly from the actual number of rows seen by the operations in the plan. If there is a significant difference, then the plan or a portion of it will be automatically adapted to avoid suboptimal performance.

From Oracle Database 12.2 onwards, adaptive plans are enabled and disabled using the database parameter optimizer_adaptive_plans. Use of the default value of TRUE (enabled) is recommended.

ADAPTIVE JOIN METHODS

The optimizer is able to adapt join methods on the fly by predetermining multiple sub-plans for portions of the plan. For example, in Figure 3, the optimizer's default plan choice for joining the orders and products tables is a nested loops join via an index access on the products table. An alternative sub-plan has also been determined that allows the optimizer to switch the join type to a hash join. In the alternative plan the products table will be accessed via a full table scan.

During the initial execution, the statistics collector gathers information about the execution and buffers a portion of the rows coming into the sub-plan. The optimizer determines which statistics are to be collected and how the plan should be resolved for different values of those statistics. It computes an "inflection point", which is the value of the statistic at which the two plan choices are equally good. For instance, if the nested loops join is optimal when the scan of the orders table produces fewer than 10 rows, and the hash join is optimal when the scan of orders produces more than 10 rows, then the inflection point for these two plans is 10. The optimizer computes this value and configures a buffering statistics collector to buffer and count up to 10 rows. If at least 10 rows are produced by the scan, then the join method is resolved to a hash join; otherwise it is resolved to a nested loops join.

In Figure 3, the statistics collector is monitoring and buffering rows coming from the full table scan of orders. Based on the information seen in the statistics collector, the optimizer will make the decision about which sub-plan to use. In this case, the hash join is chosen since the number of rows coming from the orders table is larger than the optimizer initially estimated.

Figure 3: Adaptive execution plan for the join between ORDERS and PRODUCTS. Default plan on the left, chosen plan on the right.

The optimizer can switch from a nested loops join to a hash join and vice versa. However, if the initial join method chosen is a sort merge join, no adaptation will take place.

By default, the explain plan command will show only the initial or default plan chosen by the optimizer, whereas the DBMS_XPLAN.DISPLAY_CURSOR function displays the plan actually used by the query.

Figure 4: Explain plan and DBMS_XPLAN.DISPLAY_CURSOR plan output for the scenario represented in Figure 3.

To see all of the operations in an adaptive plan, including the positions of the statistics collectors, the additional format parameter 'adaptive' must be specified in the DBMS_XPLAN functions. In this mode, an additional notation "-" appears in the Id column of the plan, indicating the operations in the plan that were not used (inactive).
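For example, the full adaptive plan for the last statement executed in the current session can be displayed as follows (a minimal sketch; the '+adaptive' modifier can be combined with other format specifiers such as 'ALLSTATS LAST'):

SQL> SELECT *
     FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => '+adaptive'));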
Figure 5: Complete adaptive plan displayed using the 'ADAPTIVE' format parameter in DBMS_XPLAN.DISPLAY_CURSOR

SQL Monitor visualizes all operations if "Full" is selected in the "Plan" drop down box. The inactive parts of the plan are grayed out (see Figure 6). If the "Plan Note" icon is clicked, a pop-up box will be displayed confirming that the plan is an adaptive plan.

Figure 6: SQL Monitor showing an adaptive plan

ADAPTIVE PARALLEL DISTRIBUTION METHODS

When a SQL statement is executed in parallel, certain operations, such as sorts, aggregations, and joins, require data to be redistributed among the parallel server processes executing the statement. The distribution method chosen by the optimizer depends on the operation, the number of parallel server processes involved, and the number of rows expected. If the optimizer inaccurately estimates the number of rows, then the distribution method chosen could be suboptimal and could result in some parallel server processes being underutilized.

With the new adaptive distribution method, HYBRID HASH, the optimizer can defer its distribution method decision until execution time, when it will have more information on the number of rows involved. A statistics collector is inserted before the operation and, if the actual number of rows buffered is less than the threshold, the distribution method will switch from HASH to BROADCAST. If, however, the number of rows buffered reaches the threshold, then the distribution method will be HASH. The threshold is defined as 2 times the degree of parallelism.

Figure 7 shows an example of a SQL Monitor execution plan for a join between EMP and DEPT that is executed in parallel. One set of parallel server processes (producers, shown with pink icons) scans the two tables and sends the rows to another set of parallel server processes (consumers, shown with blue icons) that actually do the join. The optimizer has decided to use the HYBRID HASH distribution method. The first table accessed in this join is the DEPT table. The rows coming out of the DEPT table are buffered in the statistics collector, on line 6 of the plan, until the threshold is exceeded or the final row is fetched. At that point the optimizer makes its decision on a distribution method.

Figure 7: SQL Monitor execution plan for a hash join between EMP and DEPT that uses the adaptive distribution method

The easiest way to find out which distribution method was chosen at runtime is to look at the OTHER column in SQL Monitor. This column shows a binocular icon on the lines with the PX SEND HYBRID HASH row source. When you click the icon, you can see the distribution method used at runtime.

Figure 8: Hybrid hash distribution method

For the adaptive distribution methods there are three possible values reported in this dialog box: 6 = BROADCAST, 5 = ROUND-ROBIN, and 16 = HASH distribution.

ADAPTIVE BITMAP INDEX PRUNING

When the optimizer generates a star transformation plan, it must choose the right combination of bitmap indexes to reduce the relevant set of ROWIDs as efficiently as possible. If there are many indexes, some of them might not reduce the ROWID set very substantially but will nevertheless introduce significant processing cost during query execution. Adaptive plans are therefore used to prune out indexes that do not significantly filter down the number of matched rows.
DBMS_XPLAN.DISPLAY_CURSOR will reveal adaptive bitmap pruning in a SQL execution plan with the adaptive keyword, in a similar manner to the example shown in Figure 4. For example, consider the following SQL execution plan showing the bitmap index CAR_MODEL_IDX being pruned:

Figure 9: Example of adaptive bitmap index pruning.

ADAPTIVE STATISTICS

The quality of the execution plans determined by the optimizer depends on the quality of the statistics available. However, some query predicates are too complex to rely on base table statistics alone, and the optimizer can now augment these statistics with adaptive statistics.

The use of adaptive statistics can extend SQL statement parse times, which can become an overhead in systems that have high hard parse rates. In addition, SQL execution plans are likely to be more dynamic and less consistent. For these reasons, this feature is disabled by default from Oracle Database 12c Release 2 onwards. Systems that have low hard parse rates and complex queries (particularly ad-hoc queries) may benefit from this feature by setting the database initialization parameter optimizer_adaptive_statistics to TRUE.

DYNAMIC STATISTICS

During the compilation of a SQL statement, the optimizer decides if the available statistics are sufficient to generate a good execution plan or if it should consider using dynamic sampling. Dynamic sampling is used to compensate for missing or insufficient statistics that would otherwise lead to a very bad plan. For the case where one or more of the tables in the query does not have statistics, dynamic sampling is used by the optimizer to gather basic statistics on these tables before optimizing the statement. The statistics gathered in this case are not of as high quality (due to sampling) or as complete as the statistics gathered using the DBMS_STATS package.

Beginning with Oracle Database 12c Release 1, dynamic sampling was enhanced to become dynamic statistics. Dynamic statistics allow the optimizer to augment existing statistics to get more accurate cardinality estimates, not only for single table accesses but also for joins and group-by predicates. Also from Oracle Database 12c Release 1, a new level 11 was introduced for the initialization parameter OPTIMIZER_DYNAMIC_SAMPLING. Level 11 enables the optimizer to automatically decide to use dynamic statistics for any SQL statement, even if all basic table statistics exist. The optimizer bases its decision to use dynamic statistics on the complexity of the predicates used, the existing base statistics, and the total execution time expected for the SQL statement. For example, dynamic statistics will kick in for situations where the optimizer would previously have used a guess, such as queries with LIKE predicates and wildcards. The default dynamic sampling level is 2, so when the level is set to 11, dynamic sampling will kick in much more often and will increase hard parse times.

AUTOMATIC RE-OPTIMIZATION

During the first execution of a SQL statement, an execution plan is generated as usual. During optimization, certain types of estimates that are known to be of low quality (for example, estimates for tables which lack statistics or tables with complex predicates) are noted, and monitoring is enabled for the cursor that is produced. If feedback monitoring is enabled for a cursor by the system, cardinality estimates in the plan are compared to the actual cardinalities seen during execution.
If estimates are found to differ significantly from the actual cardinalities, then the optimizer looks for a replacement plan on the next execution. The optimizer will use the information gathered during the previous execution to help determine an alternative plan. The optimizer can re-optimize a query several times, each time learning more and further improving the plan. Oracle Database 19c supports multiple forms of re-optimization.

Statistics Feedback

Statistics feedback (formerly known as cardinality feedback) is one form of re-optimization that automatically improves plans for repeated queries that have cardinality misestimates. During the first execution of a SQL statement, the optimizer generates an execution plan and decides if it should enable statistics feedback monitoring for the cursor. Statistics feedback is enabled in the following cases: tables with no statistics, multiple conjunctive or disjunctive filter predicates on a table, and predicates containing complex operators for which the optimizer cannot accurately compute cardinality estimates.

At the end of the execution, the optimizer compares its original cardinality estimates to the actual cardinalities observed during execution and, if the estimates differ significantly from the actual cardinalities, it stores the correct estimates for subsequent use. It will also create a SQL plan directive so that other SQL statements can benefit from the information learned during this initial execution. If the query executes again, then the optimizer uses the corrected cardinality estimates instead of its original estimates to determine the execution plan. If the initial estimates are found to be accurate, no additional steps are taken. After the first execution, the optimizer disables monitoring for statistics feedback.

Figure 10 shows an example of a SQL statement that benefits from statistics feedback. On the first execution of this two-table join, the optimizer underestimates the cardinality by 8x due to multiple, correlated, single-column predicates on the customers table.

Figure 10: Initial execution of a SQL statement that benefits from automatic re-optimization statistics feedback

Where estimates vary greatly from the actual number of rows returned, the cursor is marked IS_REOPTIMIZABLE and will not be used again. The IS_REOPTIMIZABLE attribute indicates that this SQL statement should be hard parsed on the next execution so the optimizer can use the execution statistics recorded on the initial execution to determine a better execution plan.

Figure 11: Cursor marked IS_REOPTIMIZABLE after initial execution statistics vary greatly from original cardinality estimates

A SQL plan directive is also created to ensure that the next time any SQL statement that uses similar predicates on the customers table is executed, the optimizer will be aware of the correlation among these columns.

On the second execution, the optimizer uses the statistics from the initial execution to determine a new plan that has a different join order. The use of statistics feedback in the generation of an execution plan is indicated in the note section under the execution plan.

Figure 12: New plan generated using execution statistics from initial execution

The new plan is not marked IS_REOPTIMIZABLE, so it will be used for all subsequent executions of this SQL statement.
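The re-optimization state of each child cursor can be confirmed in the cursor cache; a minimal sketch (the SQL_ID shown is hypothetical):

SQL> SELECT sql_id, child_number, is_reoptimizable
     FROM   v$sql
     WHERE  sql_id = 'abcd1234efgh5';  -- hypothetical SQL_ID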
Figure 13: New plan generated using execution statistics from initial execution

Performance Feedback

Another form of re-optimization is performance feedback, which helps to improve the degree of parallelism chosen for repeated SQL statements when Automatic Degree of Parallelism (Auto DOP) is enabled with parallel_degree_policy = ADAPTIVE.

When Auto DOP is enabled in adaptive mode, during the first execution of a SQL statement the optimizer determines whether the statement should execute in parallel and, if so, what parallel degree should be used. The parallel degree is chosen based on the estimated performance of the statement. Additional performance monitoring is also enabled for the initial execution of any SQL statement the optimizer decides to execute in parallel.

At the end of the initial execution, the parallel degree chosen by the optimizer is compared to the parallel degree computed based on the actual performance statistics (e.g. CPU time) gathered during the initial execution of the statement. If the two values vary significantly, then the statement is marked for re-optimization and the initial execution performance statistics are stored as feedback to help compute a more appropriate degree of parallelism for subsequent executions. If performance feedback is used for a SQL statement, then it is reported in the note section under the plan, as shown in Figure 14.

Figure 14: Execution plan for a SQL statement that was found to run better serially by performance feedback

SQL Plan Directives

SQL plan directives are automatically created based on information learned via automatic re-optimization. A SQL plan directive is additional information that the optimizer uses to generate a more optimal execution plan. For example, when joining two tables that have a data skew in their join columns, a SQL plan directive can direct the optimizer to use dynamic statistics to obtain a more accurate join cardinality estimate.

SQL plan directives are created on query expressions rather than at a statement or object level to ensure they can be applied to multiple SQL statements. It is also possible for multiple SQL plan directives to be used for a single SQL statement. The number of SQL plan directives used for a SQL statement is shown in the note section under the execution plan (Figure 15).

Figure 15: The number of SQL plan directives used for a statement is shown in the note section under the plan

The database automatically maintains SQL plan directives and stores them in the SYSAUX tablespace. Any SQL plan directive that is not used after 53 weeks will be automatically purged. SQL plan directives can also be manually managed (altered or deleted) using the package DBMS_SPD, but it is not possible to manually create a SQL plan directive. SQL plan directives can be monitored using the views DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS (see Figure 16).

Figure 16: Monitoring SQL plan directives automatically created based on information learned via re-optimization

There are two types of SQL plan directive: DYNAMIC_SAMPLING and DYNAMIC_SAMPLING_RESULT. The dynamic sampling type tells the optimizer that when it sees this particular query expression (for example, filter predicates on country_id, cust_city, and cust_state_province being used together) it should use dynamic sampling to address the cardinality misestimate.
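Both directive types, together with the objects and columns they relate to, can be inspected by joining the two views mentioned above; a minimal sketch, assuming the SH schema used in this example:

SQL> SELECT d.directive_id, o.owner, o.object_name, o.subobject_name AS col_name,
            d.type, d.state, d.reason
     FROM   dba_sql_plan_directives d
            JOIN dba_sql_plan_dir_objects o ON o.directive_id = d.directive_id
     WHERE  o.owner = 'SH'
     ORDER  BY d.directive_id;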
The dynamic sampling result type is present from Oracle Database 12c Release 2 onwards and signifies that the results of dynamic sampling queries are stored in the SQL plan directive repository (instead of in the server result cache, as used by Oracle Database 12c Release 1).

Figure 17: Dynamic sampling results stored in the SQL plan directive repository in Oracle Database 12c Release 2 onwards.

From Oracle Database 12c Release 2 onwards, SQL plan directives are created by default, but they are not used at parse time to modify SQL execution plans. This prioritizes plan stability over maximizing query execution performance. The optimizer's use of SQL plan directives is controlled by the database initialization parameter optimizer_adaptive_statistics, which has a default value of FALSE. This setting is recommended for most systems.

In addition to finding optimal SQL execution plans, SQL plan directives can be used by the database to determine whether extended statistics (see footnote 2), specifically column groups, are missing and would resolve the cardinality misestimates. After a SQL plan directive is used, the optimizer decides if the cardinality misestimate could be resolved with a column group. If so, the database can automatically create that column group the next time statistics are gathered on the appropriate table. This step is "always on" in Oracle Database 12c Release 1, but from Oracle Database 12c Release 2 it is controlled by the DBMS_STATS preference AUTO_STAT_EXTENSIONS. The recommended (and default) value is OFF, which again prioritizes plan stability over maximizing query performance. Automatic column group creation can be enabled using the following step:

EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTO_STAT_EXTENSIONS', 'ON')

Extended statistics will be used in place of the SQL plan directive when possible (equality predicates, group-bys and so on). If the SQL plan directive is no longer necessary, it will be automatically purged after 53 weeks.

2 More information on extended statistics can be found in the paper "Understanding Optimizer Statistics with Oracle Database 19c". See Reference 1.

OPTIMIZER STATISTICS

Optimizer statistics are a collection of data that describe the database and the objects in it. The optimizer uses these statistics to choose the best execution plan for each SQL statement. Being able to gather the appropriate statistics in a timely manner is critical to maintaining acceptable performance on any Oracle system. With each new release, Oracle strives to provide the necessary statistics automatically. A summary is presented here, but full details can be found in Reference 1, Understanding Optimizer Statistics with Oracle Database 19c.

High-frequency Automatic Statistics Gathering

This Oracle Database 19c new feature is available on certain Oracle Database platforms. Check the Oracle Database Licensing Guide for more information.

Stale statistics can mislead the optimizer and in some cases can result in a sub-optimal plan. To address this, automatic optimizer statistics collection was introduced in Oracle 11g. The automated maintenance task infrastructure schedules statistics gathering to occur in maintenance windows. By default, one window is scheduled for each day of the week and statistics collection runs in all predefined maintenance windows.
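Whether the standard maintenance-window task is currently enabled can be confirmed from the automated maintenance task views; a minimal sketch:

SQL> SELECT client_name, status
     FROM   dba_autotask_client
     WHERE  client_name = 'auto optimizer stats collection';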
For systems with rapidly changing datasets, data can change drastically in a short period of time, rapidly rendering statistics stale and making daily statistics gathering insufficient. New to Oracle Database 19c, high-frequency automatic statistics collection is introduced as a complement to the existing auto task. It is decoupled from the maintenance window. By configuring the relevant DBMS_STATS preferences, the DBA can customize the frequency and maximum run time according to their requirements. Note that the existing automatic statistics collection running in the predefined maintenance window will not be affected, and the high-frequency automatic statistics task will not start during the maintenance window.

Histograms

Histograms tell the optimizer about the distribution of data within a column. By default, the optimizer assumes a uniform distribution of rows across the distinct values in a column and will calculate the cardinality for a query with an equality predicate by dividing the total number of rows in the table by the number of distinct values in the column used in the equality predicate. The presence of a histogram changes the formula used by the optimizer to determine the cardinality estimate and allows it to generate a more accurate estimate.

Online Statistics Gathering

When an index is created, Oracle automatically gathers optimizer statistics as part of the index creation by piggybacking the statistics gather on the full data scan and sort necessary for the index creation (this has been available since Oracle Database 9i). The same technique is applied for direct path operations such as create table as select (CTAS) and insert as select (IAS) operations into empty tables. Piggybacking the statistics gather as part of the data loading operation means no additional full data scan is required to have statistics available immediately after the data is loaded. The additional time spent on gathering statistics is small compared to a separate statistics collection process, and it guarantees that accurate statistics are readily available from the get-go.

Real-time Statistics

This Oracle Database 19c new feature is available on certain Oracle Database platforms. Check the Oracle Database Licensing Guide for more information.

Online statistics gathering for create table as select (CTAS) and direct path insert was first introduced in Oracle Database 12c Release 1. This allows statistics to be accurate and available immediately, but prior to Oracle Database 19c this capability was not applicable to conventional data manipulation language (DML) operations, which dominate the way data is manipulated in (for example) traditional OLTP systems. Real-time statistics extends the online statistics gathering techniques to conventional insert, update and merge DML operations. In order to minimize the performance overhead of generating these statistics, only the most essential optimizer statistics are gathered during DML operations. The collection of the remaining statistics (such as the number of distinct values) is deferred to the automatic statistics gathering job, high-frequency statistics gathering and/or manual invocation of the DBMS_STATS API. See Reference 2, Best Practices for Gathering Optimizer Statistics with Oracle Database 19c, for more information.

Incremental Statistics

Gathering statistics on partitioned tables consists of gathering statistics at both the table level (global statistics) and at the (sub)partition level.
If data in the (sub)partitions changes in any way, or if partitions are added or removed, then the global-level statistics must be updated to reflect the changes so that there is correspondence between the partition-level and global-level statistics. For large partitioned tables, it can be very costly to scan the whole table to reconstruct accurate global-level statistics. For this reason, incremental statistics were introduced in Oracle Database 11g to address this issue, whereby synopses are created for each partition in the table. These data structures can be used to derive global-level statistics, including non-aggregatable statistics such as column cardinality, without scanning the entire table.

Incremental Statistics and Staleness

A DBMS_STATS preference called INCREMENTAL_STALENESS allows you to control when partition statistics will be considered stale and no longer good enough to generate global-level statistics. By default, INCREMENTAL_STALENESS is set to NULL, which means partition-level statistics are considered stale as soon as a single row changes (the same behavior as in Oracle Database 11g). Alternatively, it can be set to USE_STALE_PERCENT or USE_LOCKED_STATS. USE_STALE_PERCENT means the partition-level statistics will be used as long as the percentage of rows changed (in the respective partition or subpartition) is less than the value of the preference STALE_PERCENTAGE (10% by default). USE_LOCKED_STATS means that if statistics on a partition are locked, they will be used to generate global-level statistics regardless of how many rows have changed in that partition since statistics were last gathered.

Incremental Statistics and Partition Exchange Loads

One of the benefits of partitioning is the ability to load data quickly and easily, with minimal impact on the business users, by using the exchange partition command. The exchange partition command allows the data in a non-partitioned table to be swapped into a specified partition in the partitioned table. The command does not physically move data; instead it updates the data dictionary to exchange a pointer from the partition to the table and vice versa. The Oracle Database allows you to create the necessary statistics (synopses) on load tables before they are exchanged into the partitioned table. This means that global statistics can be maintained incrementally the moment the load table is exchanged with the relevant table partition.

Compact Synopses

The statistics gathering performance benefit of incremental statistics can come at the price of high disk storage for synopses (they are stored in the SYSAUX tablespace). More storage is required for synopses on tables with a high number of partitions and a large number of columns, particularly where the number of distinct values (NDV) is high. Besides consuming storage space, the performance overhead of maintaining very large synopses can become significant. Oracle Database 12c Release 2 introduced a new algorithm for gathering and storing NDV information, which results in much smaller synopses while maintaining a similar level of accuracy to the previous algorithm.

CONCURRENT STATISTICS

When the global statistics gathering preference CONCURRENT is set, Oracle employs the Oracle Job Scheduler to create and manage one statistics gathering job per object (tables and/or partitions) concurrently.

AUTOMATIC COLUMN GROUP DETECTION

Extended statistics were introduced in Oracle Database 11g.
They help the optimizer improve the accuracy of cardinality estimates for SQL statements that contain predicates involving a function-wrapped column (e.g. UPPER(LastName)) or multiple columns from the same table that are used in filter predicates, join conditions, or group-by keys. Although extended statistics are extremely useful, it can be difficult to know which extended statistics should be created if you are not familiar with an application or data set. Automatic column group detection determines which column groups are required for a table based on a given workload. The detection and creation of column groups is a simple three-step procedure (see footnote 3).

NEW REPORTING SUBPROGRAMS IN DBMS_STATS PACKAGE

Knowing when and how to gather statistics in a timely manner is critical to maintaining acceptable performance on any system. Determining what statistics gathering operations are currently executing in an environment, and how changes to the statistics methodology will impact the system, can be difficult and time consuming. Reporting subprograms in the DBMS_STATS package make it easier to monitor what statistics gathering activities are currently going on and what impact changes to the parameter settings of these operations will have. The DBMS_STATS subprograms are REPORT_STATS_OPERATIONS, REPORT_SINGLE_STATS_OPERATION and REPORT_GATHER_*_STATS.

Figure 18 shows example output from the REPORT_STATS_OPERATIONS function. The report shows detailed information about what statistics gathering operations have occurred during a specified time window. It gives details on when each operation occurred, its status, and the number of objects covered, and it can be displayed in either text or HTML format.

Figure 18: Reporting stats operations.

3 For information on creating column groups, see the white paper Understanding Optimizer Statistics With Oracle Database 19c.

SQL QUARANTINE

This Oracle Database 19c new feature is available on certain Oracle Database platforms. Check the Oracle Database Licensing Guide for more information.

RUNAWAY SQL STATEMENTS

Runaway SQL statements can be a threat to database service levels because they consume significant system resources if they remain unchecked. This is a particular problem if multiple runaway queries are running concurrently. The Oracle Database Resource Manager (DBRM) has always offered the ability to terminate a SQL statement that exceeds certain resource utilization thresholds (such as CPU and IO), as well as the ability to terminate queries that exceed maximum runtime thresholds. However, this does not prevent repeated execution of runaway queries. A problem query may be run over and over again, each execution consuming significant resources before being terminated. For example:

Figure 19: Runaway SQL statements

For the sake of example, we will assume that the machine CPU utilization looks something like this in response to the runaway queries being executed. In this case, when three queries are running concurrently, the CPU is nearly at 100%:

Figure 20: CPU consumed by runaway SQL

SQL QUARANTINE

The new Oracle Database 19c feature SQL Quarantine can be used to eliminate the overhead of runaway queries. When DBRM detects that a SQL statement is exceeding a resource or run-time limit, the SQL execution plan used by the statement is quarantined.
If the SQL statement is executed again and it uses the same SQL execution plan, then it will be terminated immediately. This can significantly reduce the amount of system resource that would otherwise be wasted. In the following example, there is high utilization when the initial queries are executed, but once they are quarantined they no longer consume system resources because they are terminated prior to execution:

Figure 21: CPU saved by SQL quarantine

DBRM must be configured to apply a limit. In this case, queries executing for more than 5 seconds will be terminated:

-- The plan will cancel the current SQL if elapsed time exceeds 5 seconds.
dbms_resource_manager.create_plan_directive(
   plan             => 'LIMIT_RESOURCE',
   group_or_subplan => 'TEST_RUNAWAY_PLANS',
   comment          => 'Kill statement after exceeding desired execution time',
   switch_group     => 'CANCEL_SQL',
   switch_time      => 5,
   switch_estimate  => false
);

A DBRM exception is raised as follows:

SQL> select /* Time consuming query */ * from ...;
            *
ERROR at line 1:
ORA-00040: active time limit exceeded - call aborted

Elapsed: 00:00:07.59

Quarantine information is held in memory and flushed to disk periodically. Once the quarantine configuration is persisted in the data dictionary, the following error condition will be raised instead:

SQL> select /* Time consuming query */ * from ...;
            *
ERROR at line 1:
ORA-56955: quarantined plan used

Elapsed: 00:00:00.00

OPTIMIZER STATISTICS ADVISOR

It is well known that inferior statistics cause query performance problems. It is relatively easy to identify stale, out-of-date statistics and missing statistics, but poor quality statistics can be harder to identify: for example, inconsistencies between tables and indexes, primary-key/foreign-key relationships and so on. Inconsistencies in statistics are usually a result of not following recommended approaches, but it is not always easy to strictly adhere to these for a number of reasons. For example, Oracle continuously enhances statistics gathering features, but enhancements can be overlooked post-upgrade (a good example is the recommendation to use AUTO_SAMPLE_SIZE rather than fixed percentages). DBAs may use legacy scripts to gather statistics manually, so there is a reluctance to change "proven" procedures. Sometimes statistics gathering can be overlooked: statistics might not be maintained during batch processing and there may be a perceived lack of time in batch windows. There are many "inherited" systems too, where nobody understands the scripts that are used to maintain statistics.

From Oracle Database 12c Release 2, a feature called the Optimizer Statistics Advisor is available. The goal of the advisor is to analyze how statistics are gathered, validate the quality of statistics already gathered and check the status of automatic statistics gathering (for example, checking for successful completion). To achieve this, it examines the data dictionary with respect to a set of rules. Where exceptions to the rules are found, findings may be generated and these, in turn, may lead to specific recommendations. The advisor will generate a report that lists findings (with the associated "broken" rule) and then lists specific recommendations to remedy the situation. Finally, the recommendations can be implemented using a set of actions. Actions can be output in the form of a SQL script or they can be implemented automatically.
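A minimal sketch of running the advisor on demand and reading back its report (the task name is arbitrary; DBMS_STATS also provides IMPLEMENT_ADVISOR_TASK to apply the recommended actions):

DECLARE
   tname VARCHAR2(128) := 'MY_STATS_ADVISOR_TASK';  -- arbitrary task name
   ename VARCHAR2(128);
BEGIN
   tname := DBMS_STATS.CREATE_ADVISOR_TASK(tname);   -- create the advisor task
   ename := DBMS_STATS.EXECUTE_ADVISOR_TASK(tname);  -- run the rule checks
END;
/

SQL> SELECT DBMS_STATS.REPORT_ADVISOR_TASK('MY_STATS_ADVISOR_TASK') AS report FROM dual;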
Full details can be found in Reference 2, Best Practices for Gathering Optimizer Statistics with Oracle Database 19c.

SQL PLAN MANAGEMENT

SQL Plan Management (SPM) is an exceptionally important feature to consider for critical applications that require guaranteed SQL execution plan stability. SPM is also fundamental to any database upgrade: it allows execution plans to evolve from one optimizer version to another in a controlled manner, managing execution plans and ensuring that only known or verified plans are used.

A number of enhancements were made to SQL plan management in Oracle Database 12c:
· Automatic plan evolution
· Filtered auto capture
· Capture of plans from the AWR repository

Oracle Database 18c adds:
· Support for a subset of SQL plan management functionality in Oracle Database Standard Edition

Oracle Database 19c adds:
· Prevention of SQL regressions with a new default 'AUTO' evolution mode that locates and tests alternative SQL execution plans from multiple sources to find the most optimal.

These features are covered in detail in Reference 4, SQL Plan Management with Oracle Database 19c.

ENHANCEMENTS TO DBMS_XPLAN

Hint Usage Reporting

Hints used in SQL statements can be difficult to 'debug'. For example, it is not always easy to understand why a particular hint is not being used. There is no error reporting for hints, so if they are incorrectly specified in any way, they will be ignored silently. Hint usage reporting addresses this. It provides detailed information on invalid hints, conflicting hints, ignored hints (e.g. because of the query block location specified) and hints that successfully affected the final plan. Here is an example:

EXPLAIN PLAN FOR
  SELECT /*+ INDEX(t1) FULL(@sel$2 t1) */ COUNT(*)
  FROM   jobs t1
  WHERE  t1.job_id IN (SELECT /*+ FULL(t1) */ job_id FROM employees t1);

SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'typical'));

----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |     1 |    17 |     3  (34)| 00:00:01 |
|   1 |  SORT AGGREGATE      |           |     1 |    17 |            |          |
|   2 |   NESTED LOOPS       |           |    19 |   323 |     3  (34)| 00:00:01 |
|   3 |    SORT UNIQUE       |           |   107 |   963 |     2   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |   107 |   963 |     2   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN | JOB_ID_PK |     1 |     8 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------
...
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (U - Unused (1))
---------------------------------------------------------------------------
   4 -  SEL$5DA710D3 / "T1"@"SEL$2"
         U -  FULL(t1) / hint overridden by another in parent query block
           -  FULL(@sel$2 t1)
   5 -  SEL$5DA710D3 / "T1"@"SEL$1"
           -  INDEX(t1)

New format specifiers have been created to support hint reporting:
· HINT_REPORT
· HINT_REPORT_USED
· HINT_REPORT_UNUSED

For example:

SELECT * FROM TABLE(DBMS_XPLAN.display(format => 'basic +hint_report'));

SQL Plan Comparison

The plan comparison feature (henceforth referred to as the compare plans tool) takes as input a reference plan and an arbitrary list of test plans. It compares the reference plan with each of the test plans in the list and produces a difference report.
Such a tool is useful in a variety of use cases:

· New plans: when a plan change occurs, users want to compare it with an old plan (for instance one stored in AWR).
· SQL Performance Analyzer (SPA) and SQL Tune: how does a plan generated based on a new SQL profile, or a plan generated through SPA, differ from the original plan?
· SQL Plan Management: when an accepted baseline plan is not reproduced by SPM, how does the original plan that is stored in the baseline differ from the generated plan? What are the differences between the different plan baselines captured for the same query?
· Plans generated from a different SQL statement or based on the effect of hints/parameters: how does adding a particular hint or changing the value of a parameter affect the plan?

Plan comparison goes beyond a simple line-by-line comparison because SQL execution plans are execution trees. A more complex analysis is required and logical differences must be identified. Consider two queries:

select count(*) from products p, sales s where p.prod_id = s.prod_id and p.prod_min_price > 200;

select count(*) from products p, sales s where p.prod_id = s.prod_id and s.quantity_sold = 43;

They can be compared as follows:

VARIABLE v_rep CLOB

BEGIN
  :v_rep := DBMS_XPLAN.COMPARE_PLANS(
              reference_plan    => cursor_cache_object('0hxmvnfkasg6q', NULL),
              compare_plan_list => plan_object_list(cursor_cache_object('10dqxjph6bwum', NULL)),
              type              => 'TEXT',
              level             => 'TYPICAL',
              section           => 'ALL');
END;
/

A report can be generated as follows:

SET PAGESIZE 50000
SET LONG 100000
SET LINESIZE 210
COLUMN report FORMAT a200
SELECT :v_rep REPORT FROM DUAL;

REPORT
---------------------------------------------------------------------------------------------
COMPARE PLANS REPORT
---------------------------------------------------------------------------------------------
 Current user          : SH
 Total number of plans : 2
 Number of findings    : 1
---------------------------------------------------------------------------------------------

COMPARISON DETAILS
---------------------------------------------------------------------------------------------
 Plan Number           : 1 (Reference Plan)
 Plan Found            : Yes
 Plan Source           : Cursor Cache
 SQL ID                : 0hxmvnfkasg6q
 Child Number          : 0
 Plan Database Version : 19.0.0.0
 Parsing Schema        : "SH"
 SQL Text              : select count(*) from products p, sales s where
                         p.prod_id = s.prod_id and s.quantity_sold = 43

Plan
-----------------------------
 Plan Hash Value : 3519235612

-------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |      |       |  469 |          |
|   1 |  SORT AGGREGATE       |       |    1 |     3 |      |          |
|   2 |   PARTITION RANGE ALL |       |    1 |     3 |  469 | 00:00:01 |
| * 3 |    TABLE ACCESS FULL  | SALES |    1 |     3 |  469 | 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - filter("S"."QUANTITY_SOLD"=43)

---------------------------------------------------------------------------------------------
 Plan Number           : 2
 Plan Found            : Yes
 Plan Source           : Cursor Cache
 SQL ID                : 10dqxjph6bwum
 Child Number          : 0
 Plan Database Version : 19.0.0.0
 Parsing Schema        : "SH"
 SQL Text              : select count(*) from products p, sales s where
                         p.prod_id = s.prod_id and p.prod_min_price > 200

Plan
-----------------------------
 Plan Hash Value : 3037679890
---------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name           | Rows   | Bytes    | Cost | Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                |        |          |   34 |          |
|   1 |  SORT AGGREGATE                  |                |      1 |       13 |      |          |
| * 2 |   HASH JOIN                      |                | 781685 | 10161905 |   34 | 00:00:01 |
| * 3 |    TABLE ACCESS FULL             | PRODUCTS       |     61 |      549 |    2 | 00:00:01 |
|   4 |    PARTITION RANGE ALL           |                | 918843 |  3675372 |   29 | 00:00:01 |
|   5 |     BITMAP CONVERSION TO ROWIDS  |                | 918843 |  3675372 |   29 | 00:00:01 |
|   6 |      BITMAP INDEX FAST FULL SCAN | SALES_PROD_BIX |        |          |      |          |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("P"."PROD_ID"="S"."PROD_ID")
* 3 - filter("P"."PROD_MIN_PRICE">200)

Notes
-----
- This is an adaptive plan

Comparison Results (1):
-----------------------------
1. Query block SEL$1: Transformation JOIN REMOVED FROM QUERY BLOCK occurred only in the
   reference plan (result query block: SEL$A43D1678).

INITIALIZATION PARAMETER CHANGES

Oracle Database 18c made the following initialization parameters available:

optimizer_ignore_hints
Setting this parameter to TRUE allows the optimizer to ignore hints (other than hints intended to control parallel execution or direct path load). The default value is FALSE.

optimizer_ignore_parallel_hints
Setting this parameter to TRUE allows the optimizer to ignore hints that control parallel operations (e.g. PARALLEL and NOPARALLEL). The default value is FALSE.

REFERENCES

1. Understanding Optimizer Statistics with Oracle Database 19c
2. Best Practices for Gathering Optimizer Statistics with Oracle Database 19c
3. SQL Plan Management with Oracle Database 19c
