Real-Time Statistics in Oracle Database 19c
Oracle Database 19c introduced real-time statistics to reduce the chances that stale statistics will adversely affect optimizer decisions when generating execution plans.
Oracle Database 12.1 introduced online statistics gathering for bulk loads. This feature allowed the database to gather a subset of statistics during CTAS and some direct path insert operations. Real-time statistics take this further, allowing statistics to be gathered as part of some conventional path operations.
This feature is only available on Engineered Systems such as Exadata and Exadata Cloud Service.
- Prerequisites
- Setup
- Format of Each Test
- Conventional Path INSERT ... SELECT
- Direct Path INSERT ... SELECT
- Conventional Path INSERT
- Conventional Path INSERT in Loop
- Bulk-Bind INSERT
- UPDATE
- DELETE
- Partitioned Tables
- NO_GATHER_OPTIMIZER_STATISTICS Hint
- Hidden Parameters
- Transfer Real-Time Statistics
- Thoughts
Related articles.
- Real-Time Statistics in Oracle Database 19c
- Online Statistics Gathering for Bulk Loads in Oracle Database 12c Release 1 (12.1)
- Cost-Based Optimizer (CBO) And Database Statistics
Prerequisites
This feature is currently restricted to Enterprise Edition on Engineered Systems, like Exadata and Exadata Cloud Service, as described here. There is a workaround for testing by enabling the "_exadata_feature_on" initialisation parameter. Don't use this on a real instance or you will be breaking your license agreement.
export ORACLE_SID=cdb1
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=YES

sqlplus / as sysdba <<EOF
alter system set "_exadata_feature_on"=true scope=spfile;
shutdown immediate;
startup;
exit;
EOF
Remember to reset this parameter once you have finished testing real-time statistics.
export ORACLE_SID=cdb1
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=YES

sqlplus / as sysdba <<EOF
alter system reset "_exadata_feature_on" scope=spfile;
shutdown immediate;
startup;
exit;
EOF
Setup
Connect to a privileged user and create a test user. Give it the CREATE SESSION and CREATE TABLE privileges.
CONN sys/SysPassword1@//localhost:1521/pdb1 AS SYSDBA

--DROP USER testuser1 CASCADE;

CREATE USER testuser1 IDENTIFIED BY testuser1 QUOTA UNLIMITED ON users;
GRANT CREATE SESSION, CREATE TABLE TO testuser1;
We also need a few extra grants, just for our testing.
-- For resetting the test data.
GRANT CREATE PROCEDURE TO testuser1;

-- So we can use the DBMS_XPLAN.DISPLAY_CURSOR table function.
GRANT SELECT_CATALOG_ROLE TO testuser1;

-- So we can use the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure.
GRANT ANALYZE ANY TO testuser1;
Connect to the test user and create a test table.
CONN testuser1/testuser1@//localhost:1521/pdb1

CREATE TABLE tab1 (
  id          NUMBER,
  description VARCHAR2(50)
);
Create a procedure to truncate the test table, populate it with some data and gather all table statistics.
CREATE OR REPLACE PROCEDURE reset_tab1 AS
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE tab1';

  INSERT /*+ APPEND */ INTO tab1 (id, description)
  SELECT level, 'Description of ' || level
  FROM   dual
  CONNECT BY level <= 1000;
  COMMIT;

  DBMS_STATS.gather_table_stats(NULL, 'tab1');
END;
/
Format of Each Test
Before we run a test we will reset the test table using the following procedure call.
EXEC reset_tab1;
It can take some time for monitoring information to be written to the *_TAB_MODIFICATIONS, *_TAB_STATISTICS and *_IND_STATISTICS views, so after each test we will use the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure to immediately flush the information to the dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We will then run the following queries to show some of the statistics information for the table and columns. These are not the only statistics that are affected, but they are useful to give us an indication of what is happening. The output below represents the starting point after the reset.
-- SQL*Plus and SQLcl formatting.
SET LINESIZE 100 PAGESIZE 1000
COLUMN table_name FORMAT A10
COLUMN partition_name FORMAT A15
COLUMN column_name FORMAT A11
COLUMN notes FORMAT A25
COLUMN high_value FORMAT 9999999999
COLUMN low_value FORMAT 9999999999

-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000

SQL>
The first test will have some extra detail, but subsequent tests will be less verbose.
Conventional Path INSERT ... SELECT
We reset the test data and run a conventional path INSERT ... SELECT to create 1000 rows.
-- Reset the test data.
EXEC reset_tab1;

-- Conventional path insert. (rows 1001-2000)
INSERT INTO tab1
SELECT level+1000 AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;
COMMIT;
We check the execution plan of the statement using the DBMS_XPLAN.DISPLAY_CURSOR table function and we can see an "OPTIMIZER STATISTICS GATHERING" operation in the plan.
SELECT * FROM TABLE(DBMS_XPLAN.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  5w9a7t69pzwcj, child number 0
-------------------------------------
INSERT INTO tab1 SELECT level+1000 AS id, 'Description of ' || level
AS description FROM dual CONNECT BY level <= 1000

Plan hash value: 1236776825

---------------------------------------------------------------------------------
| Id  | Operation                        | Name | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |      |       |     2 (100)|          |
|   1 |  LOAD TABLE CONVENTIONAL         | TAB1 |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |      |     1 |     2   (0)| 00:00:01 |
|   3 |    CONNECT BY WITHOUT FILTERING  |      |       |            |          |
|   4 |     FAST DUAL                    |      |     1 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------

SQL>
We flush the monitoring information to the dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000
TAB1             2000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000
TAB1       ID                    1        1885          11 STATS_ON_CONVENTIONAL_DML

SQL>
We see the impact of the real-time statistics.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action. This is common to all tests shown below. This value is only changed when new statistics are gathered using the DBMS_STATS package.
- An extra row has been added to the USER_TAB_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". The NUM_ROWS column of this new row contains the estimate of the rows in the table after the operation is complete. There were 1000 rows in the table and we added 1000 new rows, so the total is 2000 rows.
- An extra row has been added to the USER_TAB_COL_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". We see a sample size of 11 rows, which has allowed it to estimate the HIGH_VALUE of the ID column to be 1885. We know it is 2000, but the estimate of 1885 is pretty good compared to the original HIGH_VALUE of 1000.
We run a simple query against the TAB1 table and check the execution plan.
SELECT MAX(id) FROM tab1;

   MAX(ID)
----------
      2000

SQL>

SELECT * FROM TABLE(DBMS_XPLAN.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  0gp1p1ghjk17r, child number 0
-------------------------------------
SELECT MAX(id) FROM tab1

Plan hash value: 1117438016

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     6 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|   2 |   TABLE ACCESS FULL| TAB1 |  2000 |  8000 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
   - dynamic statistics used: statistics for conventional DML

SQL>
The estimated number of rows for the full table scan of the TAB1 table is 2000, so the optimizer has used the real-time statistics to get better information about the contents of the table. The note at the bottom tells us that "statistics for conventional DML" were used for this execution plan.
Direct Path INSERT ... SELECT
We reset the test data and run a direct path INSERT ... SELECT to create 1000 rows, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Direct path insert. (rows 1001-2000)
INSERT /*+ APPEND */ INTO tab1
SELECT level+1000 AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000

SQL>
We see real-time statistics are not triggered.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action.
- No extra row has been added to the USER_TAB_STATISTICS view.
- No extra row has been added to the USER_TAB_COL_STATISTICS view.
Remember, if we had performed this operation against a truncated table, statistics would have been gathered by online statistics gathering for bulk loads in the normal way.
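To illustrate that point, here is a minimal sketch reusing the TAB1 table. Truncating the table first means the direct path load qualifies for online statistics gathering for bulk loads, which we would expect to see as an "OPTIMIZER STATISTICS GATHERING" operation in the execution plan.

-- Truncate first, so the direct path load qualifies for online
-- statistics gathering for bulk loads.
TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;

-- We would expect an "OPTIMIZER STATISTICS GATHERING" operation in the plan.
SELECT * FROM TABLE(DBMS_XPLAN.display_cursor);

COMMIT;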
Conventional Path INSERT
We reset the test data and run a conventional path INSERT to create a single new row, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Conventional path insert. (row 1001)
INSERT INTO tab1 VALUES (1001, 'Description of 1001');
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000

SQL>
We see real-time statistics are not triggered.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action.
- No extra row has been added to the USER_TAB_STATISTICS view.
- No extra row has been added to the USER_TAB_COL_STATISTICS view.
I've repeated this many times and I've only seen one occasion where a single row insert has triggered real-time statistics.
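If you want to repeat that experiment yourself, the following is a minimal sketch that automates it, counting the "STATS_ON_CONVENTIONAL_DML" rows present after each single row insert.

SET SERVEROUTPUT ON
DECLARE
  l_count NUMBER;
BEGIN
  FOR i IN 1 .. 10 LOOP
    -- Reset the test data, insert a single row and flush the monitoring information.
    reset_tab1;
    INSERT INTO tab1 VALUES (1001, 'Description of 1001');
    COMMIT;
    DBMS_STATS.flush_database_monitoring_info;

    -- Count the real-time statistics rows for the table.
    SELECT COUNT(*)
    INTO   l_count
    FROM   user_tab_statistics
    WHERE  table_name = 'TAB1'
    AND    notes = 'STATS_ON_CONVENTIONAL_DML';

    DBMS_OUTPUT.put_line('Attempt ' || i || ': ' || l_count || ' real-time statistics row(s)');
  END LOOP;
END;
/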
Conventional Path INSERT in Loop
We reset the test data and run a conventional path INSERT to create a single new row, but run it 1000 times with a commit at the end, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Conventional path insert. One commit. (rows 1001-2000)
BEGIN
  FOR i IN 1001 .. 2000 LOOP
    INSERT INTO tab1 VALUES (i, 'Description of ' || i);
  END LOOP;
  COMMIT;
END;
/

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000
TAB1             2000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000
TAB1       ID                    1        1998           6 STATS_ON_CONVENTIONAL_DML

SQL>
We see the impact of the real-time statistics.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action.
- An extra row has been added to the USER_TAB_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". The NUM_ROWS column has a value of 2000 rows.
- An extra row has been added to the USER_TAB_COL_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". We see a sample size of 6 rows, which has allowed it to estimate the HIGH_VALUE of the ID column to be 1998.
We can repeat this test but commit after each insert to see if it makes a difference to the outcome.
-- Reset the test data.
EXEC reset_tab1;

-- Conventional path insert. Commit per row. (rows 1001-2000)
BEGIN
  FOR i IN 1001 .. 2000 LOOP
    INSERT INTO tab1 VALUES (i, 'Description of ' || i);
    COMMIT;
  END LOOP;
END;
/

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
The output below shows the result is similar. Of course, the HIGH_VALUE will vary a little with each test due to sampling.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000
TAB1             2000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000
TAB1       ID                    1        1946          14 STATS_ON_CONVENTIONAL_DML

SQL>
Bulk-Bind INSERT
We reset the test data and create 1000 new rows using a bulk-bind, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Bulk-bind insert. (rows 1001-2000)
DECLARE
  TYPE t_tab IS TABLE OF tab1%ROWTYPE;
  l_tab t_tab;
BEGIN
  SELECT level+1000 AS id,
         'Description of ' || level AS description
  BULK COLLECT INTO l_tab
  FROM   dual
  CONNECT BY level <= 1000;

  FORALL i IN l_tab.first .. l_tab.last
    INSERT INTO tab1 VALUES l_tab(i);
  COMMIT;
END;
/

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000
TAB1             2000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000
TAB1       ID                    1        1971           8 STATS_ON_CONVENTIONAL_DML

SQL>
We see the impact of the real-time statistics.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action.
- An extra row has been added to the USER_TAB_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". The NUM_ROWS column has a value of 2000 rows.
- An extra row has been added to the USER_TAB_COL_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". We see a sample size of 8 rows, which has allowed it to estimate the HIGH_VALUE of the ID column to be 1971.
UPDATE
We reset the test data and update all the rows, adding 10,000 to the ID column value, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Update ID+10,000.
UPDATE tab1 SET id = id+10000;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000
TAB1             1000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000
TAB1       ID                    1       10887           9 STATS_ON_CONVENTIONAL_DML

SQL>
We see the impact of the real-time statistics.
- The number of rows is not affected by this operation, so the USER_TABLES and USER_TAB_STATISTICS views are not so relevant.
- An extra row has been added to the USER_TAB_COL_STATISTICS view with the NOTES column set to "STATS_ON_CONVENTIONAL_DML". We see a sample size of 9 rows, which has allowed it to estimate the HIGH_VALUE of the ID column to be 10887. Notice the LOW_VALUE is unchanged, even though it should be +10,000.
DELETE
We reset the test data, delete all rows, then flush the monitoring data to the dictionary.
-- Reset the test data.
EXEC reset_tab1;

-- Delete all rows.
DELETE FROM tab1;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB1 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000

SQL>
We see real-time statistics are not triggered.
- The NUM_ROWS column in the USER_TABLES view is not amended by this action.
- No extra row has been added to the USER_TAB_STATISTICS view.
- No extra row has been added to the USER_TAB_COL_STATISTICS view.
I've repeated this many times and I've never seen a delete trigger real-time statistics.
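Although the delete doesn't trigger real-time statistics, regular DML monitoring still records it. As a quick sanity check (a sketch, run after the flush above), we can query the USER_TAB_MODIFICATIONS view mentioned earlier.

-- Confirm the deletes were recorded by standard DML monitoring,
-- even though no real-time statistics were created. We would
-- expect DELETES to show 1000 for TAB1.
SELECT table_name, inserts, updates, deletes
FROM   user_tab_modifications
WHERE  table_name = 'TAB1';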
Partitioned Tables
We create a partitioned table called TAB2 and gather statistics on the empty table.
DROP TABLE tab2 PURGE;

CREATE TABLE tab2 (
  id           NUMBER,
  created_date DATE
)
PARTITION BY RANGE (created_date)
(PARTITION tab2_2019 VALUES LESS THAN (TO_DATE('01/01/2020', 'DD/MM/YYYY')) TABLESPACE users,
 PARTITION tab2_2020 VALUES LESS THAN (TO_DATE('01/01/2021', 'DD/MM/YYYY')) TABLESPACE users);

EXEC DBMS_STATS.gather_table_stats(NULL, 'tab2');
We insert 1000 records into the table. Because of the date used they will all go into one partition. Then we flush the monitoring information to the dictionary.
INSERT INTO tab2
SELECT level, TO_DATE('01/01/2019', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB2 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB2';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB2                0

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name,
       NVL(partition_name, '{GLOBAL}') AS partition_name,
       num_rows,
       notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB2';

TABLE_NAME PARTITION_   NUM_ROWS NOTES
---------- ---------- ---------- -------------------------
TAB2       {GLOBAL}            0
TAB2       TAB2_2019           0
TAB2       TAB2_2020           0
TAB2       {GLOBAL}         1000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB2'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB2       ID
TAB2       ID                  276         519           4 STATS_ON_CONVENTIONAL_DML

SQL>
The main thing to note is the USER_TAB_STATISTICS view shows the real-time statistics at the global level, not at the partition level. The LOW_VALUE and HIGH_VALUE estimates seem to be consistently less accurate when working with partitions. We get the same behaviour even when we explicitly reference a partition, as shown below.
We truncate the table and gather fresh stats, insert 1000 records explicitly naming the partition they should go into, then we flush the monitoring information to the dictionary.
-- Empty the table and clear down the statistics.
EXEC DBMS_STATS.delete_table_stats(NULL, 'tab2');
TRUNCATE TABLE tab2;
EXEC DBMS_STATS.gather_table_stats(NULL, 'tab2');

INSERT INTO tab2 PARTITION (TAB2_2019)
SELECT level, TO_DATE('01/01/2019', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
We check some basic statistics for the TAB2 table.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB2';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB2                0

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name,
       NVL(partition_name, '{GLOBAL}') AS partition_name,
       num_rows,
       notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB2';

TABLE_NAME PARTITION_   NUM_ROWS NOTES
---------- ---------- ---------- -------------------------
TAB2       {GLOBAL}            0
TAB2       TAB2_2019           0
TAB2       TAB2_2020           0
TAB2       {GLOBAL}         1000 STATS_ON_CONVENTIONAL_DML

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB2'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB2       ID
TAB2       ID                  112         776           4 STATS_ON_CONVENTIONAL_DML

SQL>
The USER_TAB_STATISTICS view shows the real-time statistics at the global level.
NO_GATHER_OPTIMIZER_STATISTICS Hint
Similar to online statistics gathering for bulk loads, we can use the NO_GATHER_OPTIMIZER_STATISTICS hint to prevent real-time statistics from being triggered.
We repeat the first test, but add the NO_GATHER_OPTIMIZER_STATISTICS hint.
-- Reset the test data.
EXEC reset_tab1;

-- Conventional path insert. (rows 1001-2000)
INSERT INTO tab1
SELECT /*+ NO_GATHER_OPTIMIZER_STATISTICS */
       level+1000 AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

-- Flush monitoring information to the data dictionary.
EXEC DBMS_STATS.flush_database_monitoring_info;
The output below shows real-time statistics are not triggered.
-- NUM_ROWS from USER_TABLES.
SELECT table_name, num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS
---------- ----------
TAB1             1000

SQL>

-- NUM_ROWS from USER_TAB_STATISTICS.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

TABLE_NAME   NUM_ROWS NOTES
---------- ---------- -------------------------
TAB1             1000

SQL>

-- LOW_VALUE and HIGH_VALUE of ID column from USER_TAB_COL_STATISTICS.
SELECT table_name,
       column_name,
       UTL_RAW.cast_to_number(low_value) AS low_value,
       UTL_RAW.cast_to_number(high_value) AS high_value,
       sample_size,
       notes
FROM   user_tab_col_statistics
WHERE  table_name = 'TAB1'
AND    column_name = 'ID';

TABLE_NAME COLUMN_NAME   LOW_VALUE  HIGH_VALUE SAMPLE_SIZE NOTES
---------- ----------- ----------- ----------- ----------- -------------------------
TAB1       ID                    1        1000        1000

SQL>
Hidden Parameters
There are a number of hidden (underscore) parameters related to this feature including the following.
COLUMN parameter FORMAT A50
COLUMN description FORMAT A30 WORD_WRAPPED
COLUMN session_value FORMAT A15
COLUMN instance_value FORMAT A15

SELECT a.ksppinm AS parameter,
       a.ksppdesc AS description,
       b.ksppstvl AS session_value,
       c.ksppstvl AS instance_value
FROM   x$ksppi a,
       x$ksppcv b,
       x$ksppsv c
WHERE  a.indx = b.indx
AND    a.indx = c.indx
AND    a.ksppinm LIKE '/_%stats_on_conventional%' ESCAPE '/'
ORDER BY a.ksppinm;

PARAMETER                                          DESCRIPTION                    SESSION_VALUE   INSTANCE_VALUE
-------------------------------------------------- ------------------------------ --------------- ---------------
_optimizer_gather_stats_on_conventional_config     settings for optimizer online  0               0
                                                   stats gathering on
                                                   conventional DML

_optimizer_gather_stats_on_conventional_dml        optimizer online stats         TRUE            TRUE
                                                   gathering for conventional DML

_optimizer_stats_on_conventional_dml_sample_rate   sampling rate for online stats 100             100
                                                   gathering on conventional DML

_optimizer_use_stats_on_conventional_config        settings for optimizer usage   0               0
                                                   of online stats on
                                                   conventional DML

_optimizer_use_stats_on_conventional_dml           use optimizer statistics       TRUE            TRUE
                                                   gathered for conventional DML

SQL>
Setting the "_optimizer_gather_stats_on_conventional_dml" parameter at session or system level will stop the optimizer gathering real-time statistics.
ALTER SESSION SET "_optimizer_gather_stats_on_conventional_dml"=FALSE;
Alternatively, you can allow them to be gathered, but prevent the optimizer from using them by setting the "_optimizer_use_stats_on_conventional_dml" parameter at session or system level.
ALTER SESSION SET "_optimizer_use_stats_on_conventional_dml"=FALSE;
Transfer Real-Time Statistics
Real-time statistics use the normal method for transferring database statistics, so they can be exported and imported at database, schema or table level.
First the statistics must be collected into a statistics table. In the following examples the statistics for the TESTUSER1 schema are collected into a new statistics table, STATS_TABLE, which is owned by the TESTUSER1 schema.
BEGIN
  DBMS_STATS.create_stat_table('TESTUSER1','STATS_TABLE');

  DBMS_STATS.export_schema_stats(
    ownname       => 'TESTUSER1',
    stattab       => 'STATS_TABLE',
    statown       => 'TESTUSER1',
    stat_category => 'OBJECT_STATS,REALTIME_STATS');
END;
/
This table can be transferred to another server using your preferred method (Export/Import, SQL*Plus COPY etc.) and the stats imported into the data dictionary as follows.
BEGIN
  DBMS_STATS.import_schema_stats(
    ownname       => 'TESTUSER1',
    stattab       => 'STATS_TABLE',
    statown       => 'TESTUSER1',
    force         => TRUE,
    stat_category => 'OBJECT_STATS,REALTIME_STATS');

  DBMS_STATS.drop_stat_table('TESTUSER1', 'STATS_TABLE');
END;
/
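The examples above work at the schema level. If you only need to move a single table's statistics, here is a minimal sketch of the table-level equivalent, assuming the same STATS_TABLE statistics table. The DBMS_STATS.EXPORT_TABLE_STATS and DBMS_STATS.IMPORT_TABLE_STATS procedures accept the same STAT_CATEGORY parameter.

-- Export the TAB1 statistics, including real-time statistics, on the source.
BEGIN
  DBMS_STATS.export_table_stats(
    ownname       => 'TESTUSER1',
    tabname       => 'TAB1',
    stattab       => 'STATS_TABLE',
    statown       => 'TESTUSER1',
    stat_category => 'OBJECT_STATS,REALTIME_STATS');
END;
/

-- Import them into the dictionary on the destination.
BEGIN
  DBMS_STATS.import_table_stats(
    ownname       => 'TESTUSER1',
    tabname       => 'TAB1',
    stattab       => 'STATS_TABLE',
    statown       => 'TESTUSER1',
    force         => TRUE,
    stat_category => 'OBJECT_STATS,REALTIME_STATS');
END;
/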
Thoughts
- This feature is only available on Engineered Systems such as Exadata and Exadata Cloud Service.
- This feature will not keep the statistics perfect, but will make them "more representative" of the table contents if there is a lot of change between statistics gathering.
- Gathering statistics for a table will wipe out the real-time statistics, as they are no longer needed (see the sketch after this list).
- I suspect its main purpose is to get round the problems associated with load tables for ETL processes.
- Some people may be concerned about the level of change and the potential impact on execution plans. If so, this feature can be switched off using the hidden parameters, but it is enabled by default.
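As a minimal sketch of the point about gathering statistics wiping out the real-time statistics, the following reuses the TAB1 table and RESET_TAB1 procedure from the earlier tests.

-- Trigger real-time statistics with a conventional path insert.
EXEC reset_tab1;

INSERT INTO tab1
SELECT level+1000, 'Description of ' || level
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

EXEC DBMS_STATS.flush_database_monitoring_info;

-- We would expect an extra "STATS_ON_CONVENTIONAL_DML" row here.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';

-- Gather statistics in the normal way.
EXEC DBMS_STATS.gather_table_stats(NULL, 'tab1');

-- We would expect the "STATS_ON_CONVENTIONAL_DML" row to be gone,
-- leaving a single row with NUM_ROWS=2000 and no notes.
SELECT table_name, num_rows, notes
FROM   user_tab_statistics
WHERE  table_name = 'TAB1';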
For more information see:
- Real-Time Statistics
- DBMS_STATS
- Real-Time Statistics in Oracle Database 19c
- Online Statistics Gathering for Bulk Loads in Oracle Database 12c Release 1 (12.1)
- Cost-Based Optimizer (CBO) And Database Statistics
Hope this helps.

Regards

Tim...