8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23ai | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
Automatic Column Group Detection in Oracle Database 12c Release 1 (12.1)
Extended statistics were introduced in Oracle 11g to allow statistics to be gathered on groups of columns, to highlight the relationship between them, or on expressions. In Oracle 11gR1, extended statistics had to be created manually for column groups. In Oracle 11gR2, the DBMS_STATS.SEED_COL_USAGE
procedure was added, allowing Oracle to determine which column group statistics would be useful, based on a specified SQL Tuning Set or a specific monitoring period.
Oracle 12c introduced automatic column group detection, but as you will see when working through the sections below, the circumstances where it is triggered can appear less than obvious.
- Setup
- Automatic Column Group Detection and Dynamic Statistics
- Automatic Column Group Detection and Statistics Feedback
- Automatic Column Group Detection and SQL Plan Directives
- Conclusion
Related articles.
- Extended Statistics - Multi-Column (Column Group) Statistics
- Extended Statistics Enhancements in Oracle Database 11g Release 2
Setup
Make sure the shared pool is flushed, so we know we are starting with a clean system.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba alter system flush shared_pool;
The following test table contains columns to indicate gender and the presence of a Y chromosome and the presence of a beard. These columns have check constraints to tell the optimizer about the allowable values, as well as indexes on the columns.
conn testuser1/testuser1@//localhost:1521/pdb1 drop table tab1 purge; create table tab1 ( id number, name varchar2(20), gender varchar2(1), y_chromosome varchar2(1), beard varchar2(1), constraint tab1_pk primary key (id), constraint tab1_gender_chk check (gender in ('M', 'F')), constraint tab1_has_y_chromosome_chk check (y_chromosome in ('Y', 'N')), constraint tab1_has_beard_chk check (beard in ('Y', 'N')) ); insert into tab1 select level, dbms_random.string('l',trunc(dbms_random.value(10,16))) as name, case when mod(rownum, 2) = 0 then 'M' else 'F' end as gender, case when mod(rownum, 2) = 0 then 'Y' else 'N' end as y_chromosome, case when mod(rownum, 2) = 0 then decode(trunc(dbms_random.value(1,3)), 1, 'Y', 'N') else 'N' end as beard from dual connect by level <= 10000; commit; create index tab1_gender_idx on tab1(gender); create index tab1_has_y_chromosome_idx on tab1(y_chromosome); create index tab1_has_beard_idx on tab1(beard); -- Don't explicitly gather that statistics at this point! --exec dbms_stats.gather_table_stats(null, 'tab1');
We can see the breakdown of the data below.
select gender, y_chromosome, beard, count(*) as total from tab1 group by gender, y_chromosome, beard order by 1,2,3; G Y B TOTAL - - - ---------- F N N 5000 M Y N 2523 M Y Y 2477 SQL>
We know that genetically speaking, all males have a Y chromosome and all females do not. Our data shows approximately 50% of the men have beards and none of the women do. The optimizer is unaware of the relationships between these columns, so it has to estimate based on each column having two possible allowable values.
Assuming the data is spread evenly amongst the allowable values it would expect half of the rows to be marked as male, half to be marked as having a Y chromosome and half to me marked as having a beard. If there were no relationship between the columns, it would expect 10000/2/2/2=1250 rows to be marked as female, not having a Y chromosome and not having a beard. The following sections use a that query pulls out all the rows matching this criteria and displays the execution plan used to return the data, including the expected and actual cardinalities.
Automatic Column Group Detection and Dynamic Statistics
Make sure you have repeated the setup before starting this test.
Run the following test query.
conn testuser1/testuser1@//localhost:1521/pdb1 select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID dvwdbmm4txayg, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.01 | 378 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 5000 | 5000 |00:00:00.01 | 378 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) Note ----- - dynamic statistics used: dynamic sampling (level=2) SQL>
A comparison of the expected and actual rows show the cardinality estimate was accurate. That lack of adequate statistics meant dynamic statistics were necessary to generate the optimum execution plan.
Notice the query is not reoptimizable.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba column sql_text format a40 column is_reoptimizable format a16 select sql_text, is_reoptimizable from v$sql where sql_id = 'dvwdbmm4txayg'; SQL_TEXT IS_REOPTIMIZABLE ---------------------------------------- ---------------- select /*+ gather_plan_statistics */ N * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' SQL>
Using the DBMS_STATS.REPORT_COL_USAGE
function, we can see the basic column usage has been detected, but there are no FILTER
predicates in the reports, which we would expect if a column group had been detected.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ ############################################################################### SQL>
There was no cardinality misestimate, so no SQL plan directives were created.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba exec dbms_spd.flush_sql_plan_directive; set linesize 200 column dir_id format a20 column owner format a10 column object_name format a10 column col_name format a12 select to_char(d.directive_id) dir_id, o.owner, o.object_name, o.subobject_name col_name, o.object_type, d.type, d.state, d.reason from dba_sql_plan_directives d, dba_sql_plan_dir_objects o where d.directive_id=o.directive_id and o.owner = 'TESTUSER1' ORDER BY 1,2,3,4,5; no rows selected SQL>
As expected, running the test query a second time results in no change in the execution plan.
conn testuser1/testuser1@//localhost:1521/pdb1 select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------ SQL_ID dvwdbmm4txayg, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.02 | 377 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 5000 | 5000 |00:00:00.02 | 377 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) Note ----- - dynamic statistics used: dynamic sampling (level=2) SQL>
More importantly, it doesn't alter the column group detection.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ ############################################################################### SQL>
Automatic Column Group Detection and Statistics Feedback
Make sure you have repeated the setup before starting this test.
Gather statistics and run the following test query.
conn testuser1/testuser1@//localhost:1521/pdb1 exec dbms_stats.gather_table_stats(null, 'tab1'); select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID dvwdbmm4txayg, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.01 | 377 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 1250 | 5000 |00:00:00.01 | 377 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) SQL>
The presence of statistics means dynamic statistics weren't used and as such, the optimizer was unaware of the relationship between the columns and estimated the cardinality incorrectly.
Notice the query is now marked as reoptimizable.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba column sql_text format a40 column is_reoptimizable format a16 select sql_text, is_reoptimizable from v$sql where sql_id = 'dvwdbmm4txayg'; SQL_TEXT IS_REOPTIMIZABLE ---------------------------------------- ---------------- select /*+ gather_plan_statistics */ Y * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' SQL>
Using the DBMS_STATS.REPORT_COL_USAGE
function, we can see the basic column usage has been detected, but there are no FILTER
predicates in the reports, which we would expect if a column group had been detected.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ ############################################################################### SQL>
If we run the test query again, we can see a more accurate cardinality has been estimated because statistics feedback has been used to adjust the estimate.
conn testuser1/testuser1@//localhost:1521/pdb1 select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------ SQL_ID dvwdbmm4txayg, child number 1 ------------------------------------- select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.01 | 379 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 5000 | 5000 |00:00:00.01 | 379 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) Note ----- - statistics feedback used for this statement SQL>
Using the DBMS_STATS.REPORT_COL_USAGE
function again, there is still no sign of the FILTER
predicate in the report.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ ############################################################################### SQL>
We can see the query is no longer reoptimizable.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba column sql_text format a40 column is_reoptimizable format a16 select sql_text, is_reoptimizable from v$sql where sql_id = 'dvwdbmm4txayg'; SQL_TEXT IS_REOPTIMIZABLE ---------------------------------------- ---------------- select /*+ gather_plan_statistics */ Y * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' select /*+ gather_plan_statistics */ N * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' SQL>
Automatic Column Group Detection and SQL Plan Directives
Make sure you have repeated the setup before starting this test.
We are effectively repeated the previous test here, but we will check the SQL plan directives along the way. Gather statistics and run the following test query.
conn testuser1/testuser1@//localhost:1521/pdb1 exec dbms_stats.gather_table_stats(null, 'tab1'); select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID dvwdbmm4txayg, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.01 | 377 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 1250 | 5000 |00:00:00.01 | 377 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) SQL>
The presence of statistics means dynamic statistics weren't used and as such, the optimizer was unaware of the relationship between the columns and estimated the cardinality incorrectly.
Using the DBMS_STATS.REPORT_COL_USAGE
function, we can see the basic column usage has been detected, but there are no FILTER
predicates in the reports, which we would expect if a column group had been detected.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ ############################################################################### SQL>
Notice the query is now marked as reoptimizable.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba column sql_text format a40 column is_reoptimizable format a16 select sql_text, is_reoptimizable from v$sql where sql_id = 'dvwdbmm4txayg'; SQL_TEXT IS_REOPTIMIZABLE ---------------------------------------- ---------------- select /*+ gather_plan_statistics */ Y * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' SQL>
Let's check to see if any SQL plan directives were created as a result of the previous cardinality misestimates. Remember to flush them first, or they might not be visible.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba exec dbms_spd.flush_sql_plan_directive; set linesize 200 column dir_id format a20 column owner format a10 column object_name format a10 column col_name format a12 select to_char(d.directive_id) dir_id, o.owner, o.object_name, o.subobject_name col_name, o.object_type, d.type, d.state, d.reason from dba_sql_plan_directives d, dba_sql_plan_dir_objects o where d.directive_id=o.directive_id and o.owner = 'TESTUSER1' ORDER BY 1,2,3,4,5; DIR_ID OWNER OBJECT_NAM COL_NAME OBJECT TYPE STATE REASON -------------------- ---------- ---------- ------------ ------ ---------------- ---------- ------------------------------------ 12539685487831241411 TEST TAB1 BEARD COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE 12539685487831241411 TEST TAB1 GENDER COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE 12539685487831241411 TEST TAB1 Y_CHROMOSOME COLUMN DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE 12539685487831241411 TEST TAB1 TABLE DYNAMIC_SAMPLING USABLE SINGLE TABLE CARDINALITY MISESTIMATE SQL>
So we now have SQL plan directives. Run the test query again.
conn testuser1/testuser1@//localhost:1521/pdb1 select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N'; -- Output edited for brevity. set linesize 200 pagesize 100 select * from table(dbms_xplan.display_cursor(format => 'allstats last')); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID dvwdbmm4txayg, child number 1 ------------------------------------- Select /*+ gather_plan_statistics */ * from tab1 where gender = 'F' and y_chromosome = 'N' and beard = 'N' Plan hash value: 2211052296 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 5000 |00:00:00.02 | 377 | |* 1 | TABLE ACCESS FULL| TAB1 | 1 | 5000 | 5000 |00:00:00.02 | 377 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(("GENDER"='F' AND "Y_CHROMOSOME"='N' AND "BEARD"='N')) Note ----- - statistics feedback used for this statement SQL>
The statistics feedback was still used in preference to the SQL plan directives and once again, the query is no longer reoptimizable.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba column sql_text format a40 column is_reoptimizable format a16 select sql_text, is_reoptimizable from v$sql where sql_id = 'dvwdbmm4txayg'; SQL_TEXT IS_REOPTIMIZABLE ---------------------------------------- ---------------- select /*+ gather_plan_statistics */ Y * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' select /*+ gather_plan_statistics */ N * from tab1 where gender = 'F' an d y_chromosome = 'N' and beard = ' N' SQL>
Using the DBMS_STATS.REPORT_COL_USAGE
function again, finally we can see the FILTER
predicate in the report.
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba set long 100000 set lines 120 set pages 50 select dbms_stats.report_col_usage('testuser1', 'tab1') from dual; DBMS_STATS.REPORT_COL_USAGE('TESTUSER1','TAB1') -------------------------------------------------------------------------------- LEGEND: ....... EQ : Used in single table EQuality predicate RANGE : Used in single table RANGE predicate LIKE : Used in single table LIKE predicate NULL : Used in single table is (not) NULL predicate EQ_JOIN : Used in EQuality JOIN predicate NONEQ_JOIN : Used in NON EQuality JOIN predicate FILTER : Used in single table FILTER predicate JOIN : Used in JOIN predicate GROUP_BY : Used in GROUP BY expression ............................................................................... ############################################################################### COLUMN USAGE REPORT FOR TEST.TAB1 ................................. 1. BEARD : EQ 2. GENDER : EQ 3. Y_CHROMOSOME : EQ 4. (GENDER, Y_CHROMOSOME, BEARD) : FILTER ############################################################################### SQL>
At this point, there are still no column group statistics available.
COLUMN extension_name FORMAT A30 COLUMN extension FORMAT A45 SELECT extension_name, extension FROM dba_stat_extensions WHERE table_name = 'TAB1'; no rows selected SQL>
Once we gather default statistics, the column group statistics are created.
exec dbms_stats.gather_table_stats('testuser1', 'tab1'); column extension_name format a30 column extension format a45 select extension_name, extension from dba_stat_extensions where table_name = 'TAB1'; EXTENSION_NAME EXTENSION ------------------------------ --------------------------------------------- SYS_STSJ_NI$1E_EDU917#V63HO2OA ("GENDER","Y_CHROMOSOME","BEARD") SQL>
We can see the resulting statistics in the USER_TAB_COL_STATISTICS
view.
column column_name format a30 select column_name, num_distinct, num_nulls, histogram from user_tab_col_statistics where table_name = 'TAB1' order by column_name; COLUMN_NAME NUM_DISTINCT NUM_NULLS HISTOGRAM ------------------------------ ------------ ---------- --------------- BEARD 2 0 FREQUENCY GENDER 2 0 FREQUENCY ID 10000 0 NONE NAME 10000 0 NONE SYS_STSJ_NI$1E_EDU917#V63HO2OA 3 0 FREQUENCY Y_CHROMOSOME 2 0 FREQUENCY SQL>
Conclusion
The column group is only automatically detected during a hard parse or a reoptimization of a statement in the presence of persisted SQL plan directives. The SQL plan directive doesn't have to be used in the generation of the final execution plan, for this to happen.
For a query that we would expect to trigger automatic column group detection, we seem to have at least three possible scenarios that result in different behaviour.
- If the lack of good statistics force the use of dynamic statistics directly, no automatic column group detection will be performed. At some point, the statistics will be refreshed and one of the other scenarios will then be available.
- If a cardinality misestimate results in both statistics feedback and SQL plan directives being produced, the statistics feedback takes priority during the reoptimization the next time the query is run. If the query is run a second time before the SQL plan directives are persisted, no automatic column group detection will be triggered during the reoptimization using statistics feedback. The next hard parse after the SQL plan directives are persisted will result in automatic column group detection, but depending on how long the statement stays in the shared pool, this could be quite a delay.
- If SQL plan directives are persisted before the query is run a second time, automatic column groups detection is triggered during the reoptimization, whether the SQL plan directives or statistics feedback are used for the reoptimization of the execution plan.
It is only after the column group is detected that default statistics gathering will be able to create the column group statistics.
For more information see:
- Extended Statistics - Multi-Column (Column Group) Statistics
- Extended Statistics Enhancements in Oracle Database 11g Release 2
- DBMS_STATS
- Optimizer with Oracle Database 12c
Hope this helps. Regards Tim...