8i | 9i | 10g | 11g | 12c | 13c | 18c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » Misc » Here

SQL for Beginners (Part 5) : Joins

This is the fifth part of a series of articles showing the basics of SQL. In this article we take a look at some of the common joins, both ANSI and non-ANSI, available in SQL.

Related articles.

Setup

You can perform all these queries online for free using SQL Fiddle.

The examples in this article require the following tables to be present.

--DROP TABLE employees PURGE;
--DROP TABLE departments PURGE;

CREATE TABLE departments (
  department_id   NUMBER(2) CONSTRAINT departments_pk PRIMARY KEY,
  department_name VARCHAR2(14),
  location        VARCHAR2(13)
);

INSERT INTO departments VALUES (10,'ACCOUNTING','NEW YORK');
INSERT INTO departments VALUES (20,'RESEARCH','DALLAS');
INSERT INTO departments VALUES (30,'SALES','CHICAGO');
INSERT INTO departments VALUES (40,'OPERATIONS','BOSTON');
COMMIT;


CREATE TABLE employees (
  employee_id   NUMBER(4) CONSTRAINT employees_pk PRIMARY KEY,
  employee_name VARCHAR2(10),
  job           VARCHAR2(9),
  manager_id    NUMBER(4),
  hiredate      DATE,
  salary        NUMBER(7,2),
  commission    NUMBER(7,2),
  department_id NUMBER(2) CONSTRAINT emp_department_id_fk REFERENCES departments(department_id)
);

INSERT INTO employees VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20);
INSERT INTO employees VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);
INSERT INTO employees VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);
INSERT INTO employees VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20);
INSERT INTO employees VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);
INSERT INTO employees VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30);
INSERT INTO employees VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10);
INSERT INTO employees VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20);
INSERT INTO employees VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10);
INSERT INTO employees VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);
INSERT INTO employees VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20);
INSERT INTO employees VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30);
INSERT INTO employees VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20);
INSERT INTO employees VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10);
COMMIT;

These tables are a variant of the EMP and DEPT tables from the SCOTT schema. You will see a lot of Oracle examples on the internet using the tables from the SCOTT schema. You can find the original table definitions in the "$ORACLE_HOME/rdbms/admin/utlsampl.sql" script.

Introduction

Joins are used to combine data from multiple tables to form a single result set. Oracle provides two approaches to joining tables, the non-ANSI join syntax and the ANSI join syntax, which look quite different.

The non-ANSI join syntax has historically been the way you perform joins in Oracle and it is still very popular today. The tables to be joined are listed in the FROM clause and the join conditions are defined as predicates in the WHERE clause. Even if you don't like it, you are going to have to get used to it as there is a lot of code out there that still uses it. If you are not familiar with the syntax you will struggle to bug fix any existing code and some of the examples on the internet will look rather mysterious to you.

The ANSI join syntax was introduced in Oracle 9i. It has a number of advantages over the original syntax.

Despite all these advantages, many Oracle developers still use the non-ANSI join syntax. Partly this is just because of habit. Partly this is because the Oracle optimizer transforms most ANSI join syntax into the non-ANSI join syntax equivalent before it is executed.

For a beginner, my personal opinion is you should focus on the ANSI join syntax, but be aware of the non-ANSI equivalent. In this article I will show the ANSI and non-ANSI syntax for each example, where relevant.

Some join methods are more popular than others, so initially focus your attention on those you are most likely to see. The most common joins you are likely to see in code are the following.

The following are less common.

The following are very rare at this point.

If a word is surrounded by "[]" it means it is an optional keyword. Without any other qualifier, a join is an inner join, so using the INNER keyword is redundant. If the join includes the words LEFT, RIGHT or FULL, it is by definition an outer join, so the OUTER keyword is redundant. The choice to include or exclude these words is really personal preference, so follow the standard in your company, or do what feels right to you.

With all that in mind, lets have a look at some examples.

[INNER] JOIN ... ON

An INNER JOIN combines data from two tables where there is a match on the joining column(s) in both tables.

INNER JOIN

Remember, the INNER keyword is optional. In the examples below, we are returning the DEPARTMENT_NAME and the EMPLOYEE_NAME for each employee. The OPERATIONS department has a DEPARTMENT_ID of 40, so it is not removed by the filter condition, but there are no employees in this department, so there is no match and it is not returned in the result set.

Here is an example of an ANSI INNER JOIN.

SELECT d.department_name,
       e.employee_name
FROM   departments d
       JOIN employees e ON d.department_id = e.department_id
WHERE  d.department_id >= 30
ORDER BY d.department_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD

6 rows selected.

SQL>

Here is the non-ANSI equivalent of the previous statement.

SELECT d.department_name,
       e.employee_name
FROM   departments d, employees e
WHERE  d.department_id = e.department_id
AND    d.department_id >= 30
ORDER BY d.department_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD

6 rows selected.

SQL>

LEFT [OUTER] JOIN

A LEFT [OUTER] JOIN returns all valid rows from the table on the left side of the JOIN keyword, along with the values from the table on the right side, or NULLs if a matching row doesn't exist.

LEFT OUTER JOIN

Using the previous example, but switching to a LEFT OUTER JOIN means we will see the OPERATIONS department, even though it has no employees.

Here is an example of an ANSI LEFT OUTER JOIN.

SELECT d.department_name,
       e.employee_name     
FROM   departments d
       LEFT OUTER JOIN employees e ON d.department_id = e.department_id
WHERE  d.department_id >= 30
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
OPERATIONS
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD

7 rows selected.

SQL>

Here is the non-ANSI equivalent of the previous statement. Notice the "(+)" is used to indicate the side of the join condition that may be missing. For a multi-column join condition, each column must have the "(+)" present. Unlike the ANSI join syntax, the non-ANSI join syntax is not affected by the order of the tables.

SELECT d.department_name,
       e.employee_name      
FROM   departments d, employees e
WHERE  d.department_id = e.department_id (+) 
AND    d.department_id >= 30
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
OPERATIONS
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD

7 rows selected.

SQL>

Adding filters to columns returned from an outer joined table is a common cause for confusion. If you test for a specific value, for example "salary >= 2000", but the value for the SALARY column is NULL because the row is missing, a regular condition in the WHERE clause will throw the row away, therefore defeating the object of doing an outer join. Both the ANSI and non-ANSI methods have a way of dealing with this.

Using the ANSI join syntax, filters on columns from the outer joined table are included in the join itself, rather than being placed in the WHERE clause.

SELECT d.department_name,
       e.employee_name     
FROM   departments d
       LEFT OUTER JOIN employees e ON d.department_id = e.department_id AND e.salary >= 2000
WHERE  d.department_id >= 30
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
OPERATIONS
SALES          BLAKE

2 rows selected.

SQL>

Using the non-ANSI join syntax, the "(+)" is used to indicate a column may have a NULL value as a result of an outer join.

SELECT d.department_name,
       e.employee_name      
FROM   departments d, employees e
WHERE  d.department_id = e.department_id (+)
AND    e.salary (+) >= 2000
AND    d.department_id >= 30
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
OPERATIONS
SALES          BLAKE

2 rows selected.

SQL>

RIGHT [OUTER] JOIN

The RIGHT [OUTER] JOIN is the opposite of the LEFT [OUTER] JOIN. It returns all valid rows from the table on the right side of the JOIN keyword, along with the values from the table on the left side, or NULLs if a matching row doesn't exist. All points raised in the previous section apply here also.

RIGHT OUTER JOIN

The following example has altered the order of the tables so a RIGHT [OUTER] JOIN is now required.

SELECT d.department_name,
       e.employee_name     
FROM   employees e
       RIGHT OUTER JOIN departments d ON e.department_id = d.department_id
WHERE  d.department_id >= 30
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
OPERATIONS
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD

7 rows selected.

SQL>

Remember, the non-ANSI outer join syntax is not dependent on table order, so there is no real concept of right or left outer joins, just outer joins.

FULL [OUTER] JOIN

A FULL [OUTER] JOIN combines all the rows from the tables on the left and right sides of the join. If there is a conventional match it is made. If either side has missing data, it is replaced by NULLs, rather than throwing the row away.

FULL OUTER JOIN

To see a working example, we need to add another employee who is not assigned to a department.

INSERT INTO employees VALUES (8888,'JONES','DBA',null,to_date('02-1-1982','dd-mm-yyyy'),1300,NULL,NULL);
COMMIT;

Here is an example of an ANSI FULL OUTER JOIN.

SELECT d.department_name,
       e.employee_name     
FROM   employees e
       FULL OUTER JOIN departments d ON e.department_id = d.department_id
ORDER BY d.department_name, e.employee_name;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
ACCOUNTING     CLARK
ACCOUNTING     KING
ACCOUNTING     MILLER
OPERATIONS
RESEARCH       ADAMS
RESEARCH       FORD
RESEARCH       JONES
RESEARCH       SCOTT
RESEARCH       SMITH
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD
               JONES

16 rows selected.

SQL>

There is no direct equivalent of a full outer join using the non-ANSI join syntax, but we can recreate it by combining two outer join queries using a UNION ALL, as shown below.

SELECT d.department_name,
       e.employee_name      
FROM   employees e, departments d
WHERE  e.department_id = d.department_id (+)
UNION ALL
SELECT d.department_name,
       e.employee_name      
FROM   departments d, employees e
WHERE  d.department_id = e.department_id (+)
AND    e.employee_name IS NULL
ORDER BY 1, 2;

DEPARTMENT_NAM EMPLOYEE_N
-------------- ----------
ACCOUNTING     CLARK
ACCOUNTING     KING
ACCOUNTING     MILLER
OPERATIONS
RESEARCH       ADAMS
RESEARCH       FORD
RESEARCH       JONES
RESEARCH       SCOTT
RESEARCH       SMITH
SALES          ALLEN
SALES          BLAKE
SALES          JAMES
SALES          MARTIN
SALES          TURNER
SALES          WARD
               JONES

16 rows selected.

SQL>

Interestingly, when you run an ANSI FULL OUTER JOIN, the Oracle optimizer rewrites it to a non-ANSI join equivalent, so there is no performance improvement associated with it. It's just easier on the eye.

Let's remove that extra employee so it doesn't affect any other examples.

DELETE FROM employees WHERE employee_id = 8888;
COMMIT;

CROSS JOIN

A CROSS JOIN is the deliberate creation of a Cartesian product. There are no join columns specified, so every possible combination of rows between the two tables is produced.

CROSS JOIN

Here is an example of an ANSI CROSS JOIN.

SELECT e.employee_name,
       d.department_name
FROM   employees e
       CROSS JOIN departments d
ORDER BY e.employee_name, d.department_name;

EMPLOYEE_N DEPARTMENT_NAM
---------- --------------
ADAMS      ACCOUNTING
ADAMS      OPERATIONS
ADAMS      RESEARCH
ADAMS      SALES

... Output amended for brevity ...

WARD       ACCOUNTING
WARD       OPERATIONS
WARD       RESEARCH
WARD       SALES

56 rows selected.

SQL>

Here is the non-ANSI equivalent of the previous statement. Notice, there are no join conditions in the WHERE clause.

SELECT e.employee_name,
       d.department_name
FROM   employees e, departments d
ORDER BY e.employee_name, d.department_name;


EMPLOYEE_N DEPARTMENT_NAM
---------- --------------
ADAMS      ACCOUNTING
ADAMS      OPERATIONS
ADAMS      RESEARCH
ADAMS      SALES

... Output amended for brevity ...

WARD       ACCOUNTING
WARD       OPERATIONS
WARD       RESEARCH
WARD       SALES

56 rows selected.

SQL>

NATURAL JOIN

A NATURAL JOIN is a variant on an INNER JOIN. The join columns are determined implicitly, based on the column names. Any columns that share the same name between the two tables are assumed to be join columns. Here is an example using the ANSI join syntax.

SELECT e.employee_name,
       d.department_name
FROM   employees e
       NATURAL JOIN departments d
ORDER BY e.employee_name, d.department_name;

EMPLOYEE_N DEPARTMENT_NAM
---------- --------------
ADAMS      RESEARCH
ALLEN      SALES
BLAKE      SALES
CLARK      ACCOUNTING
FORD       RESEARCH
JAMES      SALES
JONES      RESEARCH
KING       ACCOUNTING
MARTIN     SALES
MILLER     ACCOUNTING
SCOTT      RESEARCH
SMITH      RESEARCH
TURNER     SALES
WARD       SALES

14 rows selected.

SQL>

There is no non-ANSI equivalent of this, as all join conditions must be specified.

Using a NATURAL JOIN is a bad idea for a couple of reasons. First, if someone adds a new column to one of the tables that happens to have the same name as a column in the other table, they may break any existing natural joins. It is effectively a bug waiting to happen. Second, you can't apply any filters to columns used in natural joins, as shown in the following example.

SELECT e.employee_name,
       d.department_name
FROM   employees e
       NATURAL JOIN departments d
WHERE  d.department_id = 20
ORDER BY e.employee_name;

WHERE  d.department_id = 20
       *
ERROR at line 5:
ORA-25155: column used in NATURAL join cannot have qualifier

SQL>

[INNER] JOIN ... USING

The INNER JOIN ... USING is almost a half-way house between a conventional INNER JOIN and a NATURAL JOIN. The join is made using columns with matching names in each table, but you have to specify the columns to be used, not the whole condition. This allows you to join on a subset of the columns common to both tables.

SELECT e.employee_name,
       d.department_name
FROM   employees e
       JOIN departments d USING (department_id)
ORDER BY e.employee_name;

EMPLOYEE_N DEPARTMENT_NAM
---------- --------------
ADAMS      RESEARCH
ALLEN      SALES
BLAKE      SALES
CLARK      ACCOUNTING
FORD       RESEARCH
JAMES      SALES
JONES      RESEARCH
KING       ACCOUNTING
MARTIN     SALES
MILLER     ACCOUNTING
SCOTT      RESEARCH
SMITH      RESEARCH
TURNER     SALES
WARD       SALES

14 rows selected.

SQL>

This is a safe join syntax as it can't be affected by addition of columns to either table, but it has a similar limitation to the NATURAL JOIN, in that you can't apply any filters to columns used in the join, as shown below.

SELECT e.employee_name,
       d.department_name
FROM   employees e
       JOIN departments d USING (department_id)
WHERE  d.department_id = 20
ORDER BY e.employee_name;

WHERE  d.department_id = 20
       *
ERROR at line 5:
ORA-25154: column part of USING clause cannot have qualifier

SQL>

Additional Joins

The CROSS APPLY and OUTER APPLY joins are available in Oracle, but they have only been supported for use in your application code from Oracle 12c onward, so you are unlikely to see them in application code for some time.

For more information see:

Hope this helps. Regards Tim...

Back to the Top.