8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » Misc » Here

Join Elimination

The Oracle optimizer can perform a number of query transformations to improve the performance of SQL. If the optimizer can see an opportunity to remove a table from a query, without affecting the resultset it will do so using join elimination.

Related articles.

Setup

Create the schema objects for a parent-child (master-detail) relationship.

CREATE TABLE parent (
  id          NUMBER NOT NULL,
  description VARCHAR2(50) NOT NULL,
  CONSTRAINT parent_pk PRIMARY KEY (id)
);

INSERT INTO parent VALUES (1, 'PARENT ONE');
INSERT INTO parent VALUES (2, 'PARENT TWO');
COMMIT;

CREATE TABLE child (
  id          NUMBER NOT NULL,
  parent_id   NUMBER NOT NULL,
  description VARCHAR2(50) NOT NULL,
  CONSTRAINT child_pk PRIMARY KEY (id),
  CONSTRAINT child_parent_fk FOREIGN KEY (parent_id) REFERENCES parent(id)
);

CREATE INDEX child_parent_fk_idx ON child(parent_id);

INSERT INTO child VALUES (1, 1, 'CHILD ONE');
INSERT INTO child VALUES (2, 1, 'CHILD ONE');
INSERT INTO child VALUES (3, 2, 'CHILD TWO');
INSERT INTO child VALUES (4, 2, 'CHILD TWO');
COMMIT;

EXEC DBMS_STATS.gather_table_stats(USER, 'PARENT');
EXEC DBMS_STATS.gather_table_stats(USER, 'CHILD');

Notice the foreign key (FK) between the CHILD.PARENT_ID column and the PARENT.ID column. Because of this we know that for the CHILD record to be present the PARENT record *must* exist. This gives the optimizer information to make some clever decisions. Assuming there are no additional filters, a join from the CHILD table to the PARENT table will never reduce the number of rows in the resultset, so if no columns from the PARENT table are referenced, Oracle can remove that table using join elimination.

Basic Join Elimination

The following query performs a join between the PARENT and CHILD tables, but only references columns from the CHILD table in the select list. Notice also, there are no filters are on any columns in the PARENT table.

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT c.id, c.parent_id, c.description
FROM   child c
       JOIN parent p ON c.parent_id = p.id;

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     4 |    64 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    64 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

The execution plan shows the optimizer was able to safely eliminate the join to the PARENT table.

Join Elimination in Views

Obviously, it's a bit stupid to join tables that aren't being used, but sometimes you might have views that include tables that in some contexts are not needed. For example, let's create a view based on the tables used in the last query.

CREATE VIEW parent_child_v AS
  SELECT p.id AS parent_id,
         p.description AS parent_description,
         c.id AS child_id,
         c.parent_id AS child_parent_id,
         c.description AS child_description
  FROM   child c
  JOIN   parent p ON c.parent_id = p.id;

As long as we don't reference columns in the parent table in the select list or in the WHERE cause as filters, the parent table can be eliminated.

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT child_id,
       child_parent_id,
       child_description
FROM   parent_child_v;

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     4 |    64 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    64 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

This is possible because of view merging. The contents of the view are merged into the query block that contains it, allowing subsequent optimizations like join elimination to happen.

Additional Transformation to Allow Join Elimination

As soon as we reference the columns in the PARENT table, we have to perform the join.

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT parent_description,
       child_id,
       child_parent_id,
       child_description
FROM   parent_child_v;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     4 |   120 |     5   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                     |       |       |            |          |
|   2 |   NESTED LOOPS               |                     |     4 |   120 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | PARENT              |     2 |    28 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | CHILD_PARENT_FK_IDX |     2 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD               |     2 |    32 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

This is not always the case though. Look what happens if we query the PARENT_CHILD_V.PARENT_ID column, which is sourced from the PARENT.ID column.

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT parent_id,
       child_id,
       child_parent_id,
       child_description
FROM   parent_child_v;

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     4 |    64 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    64 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

We can see this even more clearly if we query the tables directly.

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT p.id AS p_id, c.id, c.parent_id, c.description
FROM   child c
       JOIN parent p ON c.parent_id = p.id;

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     4 |    64 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    64 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

How can this be? Because of the foreign key between the CHILD.PARENT_ID and the PARENT.ID columns, Oracle knows that any reference to PARENT.ID can actually be replaced by a reference to CHILD.PARENT_ID. In both the above cases, Oracle transformed the query to use the column from the CHILD table, allowing the PARENT table to be removed by join elimination.

Breaking Join Elimination

It is very easy to prevent join elimination by not giving the optimizer enough information. Simply removing the foreign key removes the posibility of join elimination.

ALTER TABLE child DROP CONSTRAINT child_parent_fk;

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT parent_id,
       child_id,
       child_parent_id,
       child_description
FROM   parent_child_v;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     4 |    76 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                     |       |       |            |          |
|   2 |   NESTED LOOPS               |                     |     4 |    76 |     3   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN           | PARENT_PK           |     2 |     6 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | CHILD_PARENT_FK_IDX |     2 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD               |     2 |    32 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

With the constraint gone, the optimizer no longer knows about the special relationship between the CHILD.PARENT_ID and PARENT.ID columns. Indeed, there is nothing on the database to make sure the relationship is maintained, so even if our application enforces the behaviour, the optimizer can not take advantage of it.

For more information see:

Hope this helps. Regards Tim...

Back to the Top.