8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
Online Statistics Gathering for Bulk Loads in Oracle Database 12c Release 1 (12.1)
Oracle is now able to gather statistics during some bulk operations, making it unnecessary to gather statistics subsequently. This means extra table scans for statistics collection are unnecessary and there is less chance that you will forget to gather statistics after bulk loads.
Related articles.
- Online Statistics Gathering for Bulk Loads
- Cost-Based Optimizer (CBO) And Database Statistics
- Statistics Collection Enhancements in Oracle Database 12c Release 1 (12.1)
- Real-Time Statistics in Oracle Database 19c
CREATE TABLE ... AS SELECT (CTAS)
Oracle is able to gather statistics during a CREATE TABLE ... AS SELECT
(CTAS) operation for most regular heap organized tables, as shown below.
We use a CREATE TABLE AS SELECT
or CTAS to create a table with 1000 rows. This is dummy data, but it could have been based on a select from an external table.
CONN test/test@pdb1 DROP TABLE tab1 PURGE; CREATE TABLE tab1 AS SELECT level AS id, 'Description of ' || level AS description FROM dual CONNECT BY level <= 1000; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 1000 SQL>
Checking the NUM_ROWS
column in the USER_TABLES
view shows us some stats have been gathered. There are other table-level stats, but for simplicity we'll use the NUM_ROWS
column to show if statistics have been gathered.
INSERT INTO ... SELECT
In a similar fashion, INSERT INTO ... SELECT
operations on tables with no segments using direct path can also benefit from online statistics gathering. These operations are performed using direct path if they are running in parallel, or if the APPEND
hint is specified.
We truncate the table to remove all the segments, then we do a direct path insert using the APPEND
hint. This time we load 500 rows.
TRUNCATE TABLE tab1; INSERT /*+ APPEND */ INTO tab1 SELECT level AS id, 'Description of ' || level AS description FROM dual CONNECT BY level <= 500; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 500 SQL>
A conventional path insert, without a PARALLEL
or APPEND
hint, will not result in the gathering of statistics. The following is a repeat of the first example but for 700 rows instead of 500, and it doesn't have the APPEND
hint. The NUM_ROWS
column still shows 500 rows, from the previous test.
TRUNCATE TABLE tab1; INSERT INTO tab1 SELECT level AS id, 'Description of ' || level AS description FROM dual CONNECT BY level <= 700; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 500 SQL>
The table must have no segments (missing segment due to delayed segment creation) for this to work. A regular delete will not leave the table in the correct state to allow the online gathering of statistics to kick in.
In this example, rather than truncating the table we delete the rows. The table is empty, but there are existing segments from the previous load. We do a direct path insert of 700 rows. Notice the APPEND hint is back.
DELETE FROM tab1; COMMIT; INSERT /*+ APPEND */ INTO tab1 SELECT level AS id, 'Description of ' || level AS description FROM dual CONNECT BY level <= 700; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 500 SQL>
Once again the stats were not gathered during the insert. With direct path inserts the stats are only gathered for tables with no segments.
The NO_GATHER_OPTIMIZER_STATISTICS
hint can be used to explicitly prevent the online gathering of statistics. In this example we truncate the table and do a direct path insert of 700 rows, but we include the hint.
TRUNCATE TABLE tab1; INSERT /*+ APPEND */ INTO tab1 SELECT /*+ NO_GATHER_OPTIMIZER_STATISTICS */ level AS id, 'Description of ' || level AS description FROM dual CONNECT BY level <= 700; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 500 SQL>
As expected, the hint prevented the stats from being gathered during the insert, even though the rest of the conditions were correct. We still see the 500 row value from a previous load.
Partitioned Tables
A bulk load into an empty partitioned table will trigger the gathering of global statistics, but no partition-level statistics.
We drop and recreate the test table as a partitioned table with 2 partitions. We do a direct path insert of 100 rows into the new table. All the data will go into the 2016 partition.
DROP TABLE tab1 PURGE; CREATE TABLE tab1 (id NUMBER, created_date DATE) PARTITION BY RANGE (created_date) (PARTITION tab1_2015 VALUES LESS THAN (TO_DATE('01/01/2016', 'DD/MM/YYYY')) TABLESPACE users, PARTITION tab1_2016 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE users); INSERT /*+ APPEND */ INTO tab1 SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY') FROM dual CONNECT BY level <= 100; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 100 SQL> COLUMN table_name FORMAT A20 COLUMN partition_name FORMAT A20 SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name; TABLE_NAME PARTITION_NAME NUM_ROWS -------------------- -------------------- ---------- TAB1 TAB1_2015 TAB1 TAB1_2016 SQL>
We see the USER_TABLES
view has the correct global value of 100 rows, but there are no rows recorded at partition-level in the USER_TAB_PARTITIONS
view.
A bulk load that explicitly references an empty partition will result in partition-level statistics, but no global statistics.
We delete the table stats, truncate the table, and do a direct path insert into the table, explicitly naming the table partition.
EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1'); TRUNCATE TABLE tab1; INSERT /*+ APPEND */ INTO tab1 PARTITION (TAB1_2016) SELECT level, TO_DATE('01/01/2016', 'DD/MM/YYYY') FROM dual CONNECT BY level <= 100; COMMIT; COLUMN table_name FORMAT A20 SELECT table_name, num_rows FROM user_tables WHERE table_name = 'TAB1'; TABLE_NAME NUM_ROWS -------------------- ---------- TAB1 SQL> COLUMN table_name FORMAT A20 COLUMN partition_name FORMAT A20 SELECT table_name, partition_name, num_rows FROM user_tab_partitions WHERE table_name = 'TAB1' ORDER BY partition_name; TABLE_NAME PARTITION_NAME NUM_ROWS -------------------- -------------------- ---------- TAB1 TAB1_2015 TAB1 TAB1_2016 100 SQL>
This time the USER_TABLES
view doesn't contain any global stats, but the partition stats are present in the USER_TAB_PARTITIONS
view.
Restrictions
There are a number of restrictions associated with online statistics gathering. They are not gathered for:
Index statistics or histograms. If those are required the must be gathered in a separate operation. The following call will gather missing statistics, but will not re-gather table or column statistics unless the existing statistics are already stale.
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TAB1', options => 'GATHER AUTO');
- Non-empty segments, as described above.
- Tables in built-in schemas. Only those in user-defined schemas.
- Nested, index-organized or external tables.
- Global temporary tables using the
ON COMMIT DELETE ROWS
clause. - Table with virtual columns.
- Tables if the
PUBLISH
preference is set to FALSE forDBMS_STATS
. - Tables with locked statistics.
- Partitioned tables using incremental statistics, where the insert is not explicitly referencing a partition using the
PARTITION
clause. - Tables loaded using multitable inserts.
For more information see:
- Online Statistics Gathering for Bulk Loads
- Online Statistics Gathering for Bulk Loads
- Cost-Based Optimizer (CBO) And Database Statistics
- Statistics Collection Enhancements in Oracle Database 12c Release 1 (12.1)
- Real-Time Statistics in Oracle Database 19c
Hope this helps. Regards Tim...