Friday 2 February 2018

Collecting statistics for large table in oracle


To improve performance of default gather stats in ORACLE Database 11g

Sometime Default stats job will not collect statistic for large table. So, it is required to provide preference for those tables which will improve performance of overall database. 

Our Production Database (7 TB) consists of some large tables , which size is more than 200 GB 
 Stats Scheduler  is running every morning at 01:00 for 5 Hours . but scheduler is no picking large tables
which resulted statistics for some large tables are obsoletes 
The Oracle Statistics are being processed in parallel 
stats_parallel_degree = 2 , which means that basically up to 2 tables/indexes are being processed at the same time, but
one single thread processes each table. For the sake of simplicity, I call it "external parallelism"
It means that when the statistics of some large tables are obsoletes , and are automatically recalculated, it takes some significant amount of time, more than 9  hours for the stats jobs to be processed.
For all those reasons, I am trying to implement the "internal parallelism" , through the dbms_stats package, in order to configure several threads for the largest tables

Oracle uses the default degree of parallelism based on the number of CPUs. Note that the
default degree is NULL, which means that the database collects statistics using parallelism only if you set
the degree of parallelism at the table level with the DEGREE clause.


The maintenance tasks can run for a max of 8 hrs from 10pm on week nights, (will be ended at 6am, no matter if work is complete or not) 

Lets look at the GATHER_STATS_PROG and it’s details

It’s main job is to collect statistics based on the following:

1.  the tables found in the sys.DBA_TAB_MODIFICATIONS table.
2.  The sample size set by by the database, (commonly at 25% but I’ve seen it determine 100% in some databases)
3.  Deciding the method option for the tables, determining if histograms are required, etc..
4.  Determining the date the job is to be processing stats for, the priority of what objects should be gathered and in what order.
5. Exiting the gather stats job at the end of the maintenance window and not “over-running” this window.

If you have a database that is not a small, OLTP, I would like you to now go an inspect your sys.DBA_TAB_MODIFCATIONS table with the following query:

select * from sys.dba_tab_modifications
where table_owner not in (‘SYS’,’SYSTEM’)
order by timestamp;

Note

When Oracle is collecting the Optimizer Statistics, it will look into the following argument specified with
DBMA_STATS package
CASCADE- Gather statistics on the indexes as well
DEGREE -- Degree of parallelism
ESTIMATE_PERCENT- Percentage of rows to estimate (NULL
means compute):
METHOD_OPT
NO_INVALIDATE - Does not invalidate the dependent cursors if set to TRUE

GRANULARITY-- Granularity of statistics to collect (for partitioned tables).
PUBLISH -- Collect Statistics info directly in DD or in Private area
INCREMENTAL- Pertaining to Partition tables global Statistics information
STALE_PERCENT- When the Statistics is considered outdated, default to 10


Please Note that the arguments in Bold above are new in 11g.




















Improving the efficiency of gathering statistics

As data volumes grow and maintenance windows shrink, it is more important than ever to gather statistics in a timely manner. Oracle offers a variety of ways to speed up the statistics collection, from parallelizing the statistics gathering operations to generating statistics rather than collecting them.
Using parallelism
Parallelism can be leveraged in several ways for statistics collection

1) Intra object parallelism
2) Inter object parallelism
3)A combination of both intra and inter object parallelism

Intra object parallelism

Intra object parallelism is controlled by the DEGREE parameter in the DBMS_STATS.GATHER_*_STATS procedures.
The DEGREE parameter controls the number of parallel server processes that will be used to gather the statistics.
By default Oracle uses the same number of parallel server processes specified as an attribute of the table in the data dictionary (Degree of Parallelism).
All tables in an Oracle database have this attribute
set to 1 by default. It may be useful to explicitly set this parameter for the statistics collection on a large table to speed up statistics collection.
Alternatively you can set DEGREE to AUTO_DEGREE ;

Oracle will automatically determine the
appropriate number of parallel server processes that should be used to gather statistics, based on the size of an object. The value can be between 1 (serial execution) for small objects to DEFAULT_DEGREE
(PARALLEL_THREADS_PER_CPU X CPU_COUNT ) for larger objects.
15Best Practices for Gathering Optimizer Statistics
Figure 15. Use Intra object parallelism via the DEGREE parameter of the DBMS_STATS.GATHER_*_STATS
procedures
You should note that setting the DEGREE for a partitioned table means that multiple parallel sever
processes will be used to gather statistics on each partition but the statistics will not be gathered
concurrently on the different partitions. Statistics will be gathered on each partition one after the other.
Inter object parallelism
In Oracle Database 11.2.0.2, inter object parallelism was introduced and is controlled by the global
statistics gathering preference CONCURRENT 8 . When CONCURRENT is set to TRUE , Oracle employs the
Oracle Job Scheduler and Advanced Queuing components to create and manage multiple statistics
gathering jobs concurrently. Gathering statistics on multiple tables and (sub)partitions concurrently can
reduce the overall time it takes to gather statistics by allowing Oracle to fully utilize a multi-processor
environment.
The maximum number of active concurrent statistics gathering jobs is controlled by the
JOB_QUEUE_PROCESSES parameter. By default the JOB_QUEUE_PROCESSES is set to 1000. Typically
this is too high for a CONCURRENT statistics gathering operation especially if parallel execution will also
be employed. A more appropriate value would be 2 X total number of CPU cores (this is a per node
parameter in a RAC environment). You need to make sure that you set this parameter system-wise
( ALTER SYSTEM ... or in init.ora file) rather than at the session level ( ALTER SESSION ).
Combining Intra and Inter parallelism
Each of the statistics gathering jobs in a concurrent statistics gather operation can execute in parallel.
Combining concurrent statistics gathering and parallel execution can greatly reduce the time it takes to
gather statistics.
More information on concurrent statistics gathering can be found in part one of this series, Understanding
Optimizer Statistics.
8
16Best Practices for Gathering Optimizer Statistics
Figure 16. Use Inter and Intra object parallelism to speed up a DBMS_STATS.GATHER_TABLE_STATS on a
partitioned table.
When using parallel execution as part of a concurrent statistics gathering you should disable the
PARALLEL_ADAPTIVE_MULTI_USER initialization parameter to prevent the parallel jobs from being
down graded to serial. Again this should be done at a system level and not at a session level. That is;
Figure 17. Disable parallel_adaptive_mutli_user parameter
Incremental statistics
Gathering statistics on partitioned tables consists of gathering statistics at both the table level (global
statistics) and (sub)partition level. If the INCREMENTAL 9 preference for a partitioned table is set to
TRUE , the DBMS_STATS.GATHER_*_STATS parameter GRANULARITY includes GLOBAL, and
ESTIMATE_PERCENT is set to AUTO_SAMPLE_SIZE , Oracle will accurately derive all global level
statistics by scanning only those partitions that have been added or modified, and not the entire table.
Incremental global statistics works by storing a synopsis for each partition in the table. A synopsis is
statistical metadata for that partition and the columns in the partition. Aggregating the partition level
statistics and the synopses from each partition will accurately generate global level statistics, thus
eliminating the need to scan the entire table. When a new partition is added to the table, you only need
More information on Incremental Statistics can be found in part one of this series, Understanding
Optimizer Statistics.
9
17Best Practices for Gathering Optimizer Statistics
to gather statistics for the new partition. The table level statistics will be automatically and accurately
calculated using the new partition synopsis and the existing partitions’ synopses.
Note partition statistics are not aggregated from subpartition statistics when incremental statistics are
enabled.


eg

1) Identify by big tables and table having more transactions and set parallelism for those tables

 table_table                 
 table_tt_SHARE  
 table_SNAPSHOT      
 table_ACTION            
 tt_BNPL_table
 table_ACCOUNT               
 FORMATTED_table_LINE   
 FORMATTED_table        
 REWARD_table_DETAILS   
 tableTION                 

Pre-check

select dbms_stats.get_prefs ('DEGREE','table_table') Degree from dual;                 
select dbms_stats.get_prefs ('DEGREE','table_tt_SHARE') Degree from dual;  
select dbms_stats.get_prefs ('DEGREE','table_SNAPSHOT') Degree from dual;      
select dbms_stats.get_prefs ('DEGREE','table_ACTION') Degree from dual;            
select dbms_stats.get_prefs ('DEGREE','tt_BNPL_table') Degree from dual;
select dbms_stats.get_prefs ('DEGREE','table_ACCOUNT') Degree from dual;               
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table_LINE') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table') Degree from dual;        
select dbms_stats.get_prefs ('DEGREE','REWARD_table_DETAILS') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','tableTION') Degree from dual;                 


Implementation

select dbms_stats.get_prefs ('DEGREE','table_table') Degree from dual;                 
select dbms_stats.get_prefs ('DEGREE','table_tt_SHARE') Degree from dual;  
select dbms_stats.get_prefs ('DEGREE','table_SNAPSHOT') Degree from dual;      
select dbms_stats.get_prefs ('DEGREE','table_ACTION') Degree from dual;            
select dbms_stats.get_prefs ('DEGREE','tt_BNPL_table') Degree from dual;
select dbms_stats.get_prefs ('DEGREE','table_ACCOUNT') Degree from dual;               
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table_LINE') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table') Degree from dual;        
select dbms_stats.get_prefs ('DEGREE','REWARD_table_DETAILS') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','tableTION') Degree from dual;                 

exec dbms_stats.set_table_prefs('schema', 'table_table', 'DEGREE', '4');                 
exec dbms_stats.set_table_prefs('schema', 'table_tt_SHARE', 'DEGREE', '4');  
exec dbms_stats.set_table_prefs('schema', 'table_SNAPSHOT', 'DEGREE', '4');      
exec dbms_stats.set_table_prefs('schema', 'table_ACTION', 'DEGREE', '4');            
exec dbms_stats.set_table_prefs('schema', 'tt_BNPL_table', 'DEGREE', '4');
exec dbms_stats.set_table_prefs('schema', 'table_ACCOUNT', 'DEGREE', '4');               
exec dbms_stats.set_table_prefs('schema', 'FORMATTED_table_LINE', 'DEGREE', '4');   
exec dbms_stats.set_table_prefs('schema', 'FORMATTED_table', 'DEGREE', '4');        
exec dbms_stats.set_table_prefs('schema', 'REWARD_table_DETAILS ', 'DEGREE', '4');  
exec dbms_stats.set_table_prefs('schema', 'tableTION', 'DEGREE', '4');                 



select dbms_stats.get_prefs ('DEGREE','table_table') Degree from dual;                 
select dbms_stats.get_prefs ('DEGREE','table_tt_SHARE') Degree from dual;  
select dbms_stats.get_prefs ('DEGREE','table_SNAPSHOT') Degree from dual;      
select dbms_stats.get_prefs ('DEGREE','table_ACTION') Degree from dual;            
select dbms_stats.get_prefs ('DEGREE','tt_BNPL_table') Degree from dual;
select dbms_stats.get_prefs ('DEGREE','table_ACCOUNT') Degree from dual;               
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table_LINE') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','FORMATTED_table') Degree from dual;        
select dbms_stats.get_prefs ('DEGREE','REWARD_table_DETAILS') Degree from dual;   
select dbms_stats.get_prefs ('DEGREE','tableTION') Degree from dual;                 

Rollback

exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' table_table ', 'DEGREE');                
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' table_tt_SHARE', 'DEGREE');  
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' table_SNAPSHOT', 'DEGREE');      
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' table_ACTION', 'DEGREE');            
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' tt_BNPL_table', 'DEGREE');
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' table_ACCOUNT', 'DEGREE');               
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' FORMATTED_table_LINE ', 'DEGREE');  
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' FORMATTED_table', 'DEGREE');        
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' REWARD_table_DETAILS', 'DEGREE');   
exec DBMS_STATS.DELETE_TABLE_PREFS('schema', ' tableTION', 'DEGREE');                 


select dbms_stats.get_prefs ('DEGREE','table_table') Degree from dual;

exec dbms_stats.set_table_prefs('schema', table_table', 'DEGREE', '4')


Rollback

exec DBMS_STATS.DELETE_TABLE_PREFS('schema', 'table_table', 'DEGREE')


    Bug 16475397


DBMS_STATS.DELETE_TABLE_PREFS ('SCOTT', 'EMP', 'DEGREE');


SQL> select dbms_stats.get_prefs ('STALE_PERCENT','table_table') stale_percent from dual;


STALE_PERCENT
---------------------------------------------------------------------
SQL> select dbms_stats.get_prefs ('DEGREE','table_table') stale_percent from dual;


STALE_PERCENT
---------------------------------------------------------------
NULL

SQL>





No comments:

Post a Comment