8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » 12c » Here

Online Statistics Gathering for Bulk Loads in Oracle Database 12c Release 1 (12.1)

Oracle is now able to gather statistics during some bulk operations, making it unnecessary to gather statistics subsequently. This means extra table scans for statistics collection are unnecessary and there is less chance that you will forget to gather statistics after bulk loads.

Related articles.

CREATE TABLE ... AS SELECT (CTAS)

Oracle is able to gather statistics during a CREATE TABLE ... AS SELECT (CTAS) operation for most regular heap organized tables, as shown below.

We use a CREATE TABLE AS SELECT or CTAS to create a table with 1000 rows. This is dummy data, but it could have been based on a select from an external table.

CONN test/test@pdb1

DROP TABLE tab1 PURGE;

CREATE TABLE tab1 AS
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                       1000

SQL>

Checking the NUM_ROWS column in the USER_TABLES view shows us some stats have been gathered. There are other table-level stats, but for simplicity we'll use the NUM_ROWS column to show if statistics have been gathered.

INSERT INTO ... SELECT

In a similar fashion, INSERT INTO ... SELECT operations on tables with no segments using direct path can also benefit from online statistics gathering. These operations are performed using direct path if they are running in parallel, or if the APPEND hint is specified.

We truncate the table to remove all the segments, then we do a direct path insert using the APPEND hint. This time we load 500 rows.

TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 500;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SQL>

A conventional path insert, without a PARALLEL or APPEND hint, will not result in the gathering of statistics. The following is a repeat of the first example but for 700 rows instead of 500, and it doesn't have the APPEND hint. The NUM_ROWS column still shows 500 rows, from the previous test.

TRUNCATE TABLE tab1;

INSERT INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SQL>

The table must have no segments (missing segment due to delayed segment creation) for this to work. A regular delete will not leave the table in the correct state to allow the online gathering of statistics to kick in.

In this example, rather than truncating the table we delete the rows. The table is empty, but there are existing segments from the previous load. We do a direct path insert of 700 rows. Notice the APPEND hint is back.

DELETE FROM tab1;
COMMIT;

INSERT /*+ APPEND */ INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SQL>

Once again the stats were not gathered during the insert. With direct path inserts the stats are only gathered for tables with no segments.

The NO_GATHER_OPTIMIZER_STATISTICS hint can be used to explicitly prevent the online gathering of statistics. In this example we truncate the table and do a direct path insert of 700 rows, but we include the hint.

TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1
SELECT /*+ NO_GATHER_OPTIMIZER_STATISTICS */
       level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        500

SQL>

As expected, the hint prevented the stats from being gathered during the insert, even though the rest of the conditions were correct. We still see the 500 row value from a previous load.

Partitioned Tables

A bulk load into an empty partitioned table will trigger the gathering of global statistics, but no partition-level statistics.

We drop and recreate the test table as a partitioned table with 2 partitions. We do a direct path insert of 100 rows into the new table. All the data will go into the 2016 partition.

DROP TABLE tab1 PURGE;

CREATE TABLE tab1
(id            NUMBER,
 created_date  DATE)
PARTITION BY RANGE (created_date)
(PARTITION tab1_2015 VALUES LESS THAN (TO_DATE('01/01/2016', 'DD/MM/YYYY')) TABLESPACE users,
 PARTITION tab1_2016 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE users);

INSERT /*+ APPEND */ INTO tab1
SELECT level,
       TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        100

SQL>


COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20

SELECT table_name,
       partition_name,
       num_rows
FROM   user_tab_partitions
WHERE  table_name = 'TAB1'
ORDER BY partition_name;

TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016

SQL>

We see the USER_TABLES view has the correct global value of 100 rows, but there are no rows recorded at partition-level in the USER_TAB_PARTITIONS view.

A bulk load that explicitly references an empty partition will result in partition-level statistics, but no global statistics.

We delete the table stats, truncate the table, and do a direct path insert into the table, explicitly naming the table partition.

EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');
TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1 PARTITION (TAB1_2016)
SELECT level,
       TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1

SQL>


COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20

SELECT table_name,
       partition_name,
       num_rows
FROM   user_tab_partitions
WHERE  table_name = 'TAB1'
ORDER BY partition_name;

TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016                   100

SQL>

This time the USER_TABLES view doesn't contain any global stats, but the partition stats are present in the USER_TAB_PARTITIONS view.

Restrictions

There are a number of restrictions associated with online statistics gathering. They are not gathered for:

For more information see:

Hope this helps. Regards Tim...

Back to the Top.