8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
SQL for Beginners (Part 5) : Joins
This is the fifth part of a series of articles showing the basics of SQL. In this article we take a look at some of the common joins, both ANSI and non-ANSI, available in SQL.
- Setup
- Introduction
- [INNER] JOIN ... ON
- LEFT [OUTER] JOIN
- RIGHT [OUTER] JOIN
- FULL [OUTER] JOIN
- CROSS JOIN
- NATURAL JOIN
- [INNER] JOIN ... USING
- Additional Joins
Related articles.
- SQL for Beginners (Part 5) : Joins
- SQL for Beginners - Full Playlist
- Oracle SQL Articles - Getting Started
- LATERAL Inline Views, CROSS APPLY and OUTER APPLY Joins in Oracle Database 12c Release 1 (12.1)
Setup
You can perform all these queries online for free using SQL Fiddle.
The examples in this article require the following tables to be present.
--DROP TABLE employees PURGE; --DROP TABLE departments PURGE; CREATE TABLE departments ( department_id NUMBER(2) CONSTRAINT departments_pk PRIMARY KEY, department_name VARCHAR2(14), location VARCHAR2(13) ); INSERT INTO departments VALUES (10,'ACCOUNTING','NEW YORK'); INSERT INTO departments VALUES (20,'RESEARCH','DALLAS'); INSERT INTO departments VALUES (30,'SALES','CHICAGO'); INSERT INTO departments VALUES (40,'OPERATIONS','BOSTON'); COMMIT; CREATE TABLE employees ( employee_id NUMBER(4) CONSTRAINT employees_pk PRIMARY KEY, employee_name VARCHAR2(10), job VARCHAR2(9), manager_id NUMBER(4), hiredate DATE, salary NUMBER(7,2), commission NUMBER(7,2), department_id NUMBER(2) CONSTRAINT emp_department_id_fk REFERENCES departments(department_id) ); INSERT INTO employees VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20); INSERT INTO employees VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30); INSERT INTO employees VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30); INSERT INTO employees VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20); INSERT INTO employees VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30); INSERT INTO employees VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30); INSERT INTO employees VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10); INSERT INTO employees VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20); INSERT INTO employees VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10); INSERT INTO employees VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30); INSERT INTO employees VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20); INSERT INTO employees VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30); INSERT INTO employees VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20); INSERT INTO employees VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10); COMMIT;
These tables are a variant of the EMP and DEPT tables from the SCOTT schema. You will see a lot of Oracle examples on the internet using the tables from the SCOTT schema. You can find the original table definitions in the "$ORACLE_HOME/rdbms/admin/utlsampl.sql" script.
Introduction
Joins are used to combine data from multiple tables to form a single result set. Oracle provides two approaches to joining tables, the non-ANSI join syntax and the ANSI join syntax, which look quite different.
The non-ANSI join syntax has historically been the way you perform joins in Oracle and it is still very popular today. The tables to be joined are listed in the FROM
clause and the join conditions are defined as predicates in the WHERE
clause. Even if you don't like it, you are going to have to get used to it as there is a lot of code out there that still uses it. If you are not familiar with the syntax you will struggle to bug fix any existing code and some of the examples on the internet will look rather mysterious to you.
The ANSI join syntax was introduced in Oracle 9i. It has a number of advantages over the original syntax.
- It reads more like English, so it is much clearer.
- The tables and join conditions are all kept together in the
FROM
clause, so theWHERE
clause only contains filters, not join conditions. - The syntax makes it difficult, if not impossible, to forget to include the join condition.
- Filters on columns from outer joined tables are handled in a much clearer manner.
- It is more portable, being supported by a number of relational database engines.
- It provides some functionality that is not supported directly by the non-ANSI join syntax, without using significantly more effort.
Despite all these advantages, many Oracle developers still use the non-ANSI join syntax. Partly this is just because of habit. Partly this is because the Oracle optimizer transforms most ANSI join syntax into the non-ANSI join syntax equivalent before it is executed.
For a beginner, my personal opinion is you should focus on the ANSI join syntax, but be aware of the non-ANSI equivalent. In this article I will show the ANSI and non-ANSI syntax for each example, where relevant.
Some join methods are more popular than others, so initially focus your attention on those you are most likely to see. The most common joins you are likely to see in code are the following.
[INNER] JOIN ... ON
LEFT [OUTER] JOIN
RIGHT [OUTER] JOIN
The following are less common.
FULL [OUTER] JOIN
CROSS JOIN
NATURAL JOIN
[INNER] JOIN ... USING
The following are very rare at this point.
CROSS APPLY
OUTER APPLY
If a word is surrounded by "[]" it means it is an optional keyword. Without any other qualifier, a join is an inner join, so using the INNER
keyword is redundant. If the join includes the words LEFT
, RIGHT
or FULL
, it is by definition an outer join, so the OUTER
keyword is redundant. The choice to include or exclude these words is really personal preference, so follow the standard in your company, or do what feels right to you.
With all that in mind, lets have a look at some examples.
[INNER] JOIN ... ON
An INNER JOIN
combines data from two tables where there is a match on the joining column(s) in both tables.
Remember, the INNER
keyword is optional. In the examples below, we are returning the DEPARTMENT_NAME and the EMPLOYEE_NAME for each employee. The OPERATIONS department has a DEPARTMENT_ID of 40, so it is not removed by the filter condition, but there are no employees in this department, so there is no match and it is not returned in the result set.
Here is an example of an ANSI INNER JOIN
.
SELECT d.department_name, e.employee_name FROM departments d JOIN employees e ON d.department_id = e.department_id WHERE d.department_id >= 30 ORDER BY d.department_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD 6 rows selected. SQL>
Here is the non-ANSI equivalent of the previous statement.
SELECT d.department_name, e.employee_name FROM departments d, employees e WHERE d.department_id = e.department_id AND d.department_id >= 30 ORDER BY d.department_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD 6 rows selected. SQL>
LEFT [OUTER] JOIN
A LEFT [OUTER] JOIN
returns all valid rows from the table on the left side of the JOIN
keyword, along with the values from the table on the right side, or NULLs if a matching row doesn't exist.
Using the previous example, but switching to a LEFT OUTER JOIN
means we will see the OPERATIONS department, even though it has no employees.
Here is an example of an ANSI LEFT OUTER JOIN
.
SELECT d.department_name, e.employee_name FROM departments d LEFT OUTER JOIN employees e ON d.department_id = e.department_id WHERE d.department_id >= 30 ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- OPERATIONS SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD 7 rows selected. SQL>
Here is the non-ANSI equivalent of the previous statement. Notice the "(+)" is used to indicate the side of the join condition that may be missing. For a multi-column join condition, each column must have the "(+)" present. Unlike the ANSI join syntax, the non-ANSI join syntax is not affected by the order of the tables.
SELECT d.department_name, e.employee_name FROM departments d, employees e WHERE d.department_id = e.department_id (+) AND d.department_id >= 30 ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- OPERATIONS SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD 7 rows selected. SQL>
Adding filters to columns returned from an outer joined table is a common cause for confusion. If you test for a specific value, for example "salary >= 2000", but the value for the SALARY column is NULL because the row is missing, a regular condition in the WHERE clause will throw the row away, therefore defeating the object of doing an outer join. Both the ANSI and non-ANSI methods have a way of dealing with this.
Using the ANSI join syntax, filters on columns from the outer joined table are included in the join itself, rather than being placed in the WHERE
clause.
SELECT d.department_name, e.employee_name FROM departments d LEFT OUTER JOIN employees e ON d.department_id = e.department_id AND e.salary >= 2000 WHERE d.department_id >= 30 ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- OPERATIONS SALES BLAKE 2 rows selected. SQL>
Using the non-ANSI join syntax, the "(+)" is used to indicate a column may have a NULL value as a result of an outer join.
SELECT d.department_name, e.employee_name FROM departments d, employees e WHERE d.department_id = e.department_id (+) AND e.salary (+) >= 2000 AND d.department_id >= 30 ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- OPERATIONS SALES BLAKE 2 rows selected. SQL>
RIGHT [OUTER] JOIN
The RIGHT [OUTER] JOIN
is the opposite of the LEFT [OUTER] JOIN
. It returns all valid rows from the table on the right side of the JOIN
keyword, along with the values from the table on the left side, or NULLs if a matching row doesn't exist. All points raised in the previous section apply here also.
The following example has altered the order of the tables so a RIGHT [OUTER] JOIN
is now required.
SELECT d.department_name, e.employee_name FROM employees e RIGHT OUTER JOIN departments d ON e.department_id = d.department_id WHERE d.department_id >= 30 ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- OPERATIONS SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD 7 rows selected. SQL>
Remember, the non-ANSI outer join syntax is not dependent on table order, so there is no real concept of right or left outer joins, just outer joins.
FULL [OUTER] JOIN
A FULL [OUTER] JOIN
combines all the rows from the tables on the left and right sides of the join. If there is a conventional match it is made. If either side has missing data, it is replaced by NULLs, rather than throwing the row away.
To see a working example, we need to add another employee who is not assigned to a department.
INSERT INTO employees VALUES (8888,'JONES','DBA',null,to_date('02-1-1982','dd-mm-yyyy'),1300,NULL,NULL); COMMIT;
Here is an example of an ANSI FULL OUTER JOIN
.
SELECT d.department_name, e.employee_name FROM employees e FULL OUTER JOIN departments d ON e.department_id = d.department_id ORDER BY d.department_name, e.employee_name; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- ACCOUNTING CLARK ACCOUNTING KING ACCOUNTING MILLER OPERATIONS RESEARCH ADAMS RESEARCH FORD RESEARCH JONES RESEARCH SCOTT RESEARCH SMITH SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD JONES 16 rows selected. SQL>
There is no direct equivalent of a full outer join using the non-ANSI join syntax, but we can recreate it by combining two outer join queries using a UNION ALL
, as shown below.
SELECT d.department_name, e.employee_name FROM employees e, departments d WHERE e.department_id = d.department_id (+) UNION ALL SELECT d.department_name, e.employee_name FROM departments d, employees e WHERE d.department_id = e.department_id (+) AND e.employee_name IS NULL ORDER BY 1, 2; DEPARTMENT_NAM EMPLOYEE_N -------------- ---------- ACCOUNTING CLARK ACCOUNTING KING ACCOUNTING MILLER OPERATIONS RESEARCH ADAMS RESEARCH FORD RESEARCH JONES RESEARCH SCOTT RESEARCH SMITH SALES ALLEN SALES BLAKE SALES JAMES SALES MARTIN SALES TURNER SALES WARD JONES 16 rows selected. SQL>
Let's remove that extra employee so it doesn't affect any other examples.
DELETE FROM employees WHERE employee_id = 8888; COMMIT;
CROSS JOIN
A CROSS JOIN
is the deliberate creation of a Cartesian product. There are no join columns specified, so every possible combination of rows between the two tables is produced.
Here is an example of an ANSI CROSS JOIN
.
SELECT e.employee_name, d.department_name FROM employees e CROSS JOIN departments d ORDER BY e.employee_name, d.department_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS ACCOUNTING ADAMS OPERATIONS ADAMS RESEARCH ADAMS SALES ... Output amended for brevity ... WARD ACCOUNTING WARD OPERATIONS WARD RESEARCH WARD SALES 56 rows selected. SQL>
Here is the non-ANSI equivalent of the previous statement. Notice, there are no join conditions in the WHERE
clause.
SELECT e.employee_name, d.department_name FROM employees e, departments d ORDER BY e.employee_name, d.department_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS ACCOUNTING ADAMS OPERATIONS ADAMS RESEARCH ADAMS SALES ... Output amended for brevity ... WARD ACCOUNTING WARD OPERATIONS WARD RESEARCH WARD SALES 56 rows selected. SQL>
NATURAL JOIN
A NATURAL JOIN
is a variant on an INNER JOIN
. The join columns are determined implicitly, based on the column names. Any columns that share the same name between the two tables are assumed to be join columns. Here is an example using the ANSI join syntax.
SELECT e.employee_name, d.department_name FROM employees e NATURAL JOIN departments d ORDER BY e.employee_name, d.department_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS RESEARCH ALLEN SALES BLAKE SALES CLARK ACCOUNTING FORD RESEARCH JAMES SALES JONES RESEARCH KING ACCOUNTING MARTIN SALES MILLER ACCOUNTING SCOTT RESEARCH SMITH RESEARCH TURNER SALES WARD SALES 14 rows selected. SQL>
There is no non-ANSI equivalent of this, as all join conditions must be specified.
Using a NATURAL JOIN
is a bad idea. If someone adds a new column to one of the tables that happens to have the same name as a column in the other table, they may break any existing natural joins. It is effectively a bug waiting to happen.
You can't apply any aliased filters to columns used in natural joins, as shown in the following example.
SELECT e.employee_name, d.department_name FROM employees e NATURAL JOIN departments d WHERE d.department_id = 20 ORDER BY e.employee_name; WHERE d.department_id = 20 * ERROR at line 5: ORA-25155: column used in NATURAL join cannot have qualifier SQL>
Instead you must remove the alias, which in other circumstances would result in an ambiguous reference error.
SELECT e.employee_name, d.department_name FROM employees e NATURAL JOIN departments d WHERE department_id = 20 ORDER BY e.employee_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS RESEARCH FORD RESEARCH JONES RESEARCH SCOTT RESEARCH SMITH RESEARCH 5 rows selected. SQL>
[INNER] JOIN ... USING
The INNER JOIN ... USING
is almost a half-way house between a conventional INNER JOIN
and a NATURAL JOIN
. The join is made using columns with matching names in each table, but you have to specify the columns to be used, not the whole condition. This allows you to join on a subset of the columns common to both tables.
SELECT e.employee_name, d.department_name FROM employees e JOIN departments d USING (department_id) ORDER BY e.employee_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS RESEARCH ALLEN SALES BLAKE SALES CLARK ACCOUNTING FORD RESEARCH JAMES SALES JONES RESEARCH KING ACCOUNTING MARTIN SALES MILLER ACCOUNTING SCOTT RESEARCH SMITH RESEARCH TURNER SALES WARD SALES 14 rows selected. SQL>
This is a safe join syntax as it can't be affected by addition of columns to either table. Similar to the NATURAL JOIN
, you can't apply any aliased filters to columns used in the join, but if you remove the alias it works.
SELECT e.employee_name, d.department_name FROM employees e JOIN departments d USING (department_id) WHERE d.department_id = 20 ORDER BY e.employee_name; WHERE d.department_id = 20 * ERROR at line 5: ORA-25154: column part of USING clause cannot have qualifier SQL> SELECT e.employee_name, d.department_name FROM employees e JOIN departments d USING (department_id) WHERE department_id = 20 ORDER BY e.employee_name; EMPLOYEE_N DEPARTMENT_NAM ---------- -------------- ADAMS RESEARCH FORD RESEARCH JONES RESEARCH SCOTT RESEARCH SMITH RESEARCH 5 rows selected. SQL>
Additional Joins
The CROSS APPLY and OUTER APPLY joins are available in Oracle, but they have only been supported for use in your application code from Oracle 12c onward, so you are unlikely to see them in application code for some time.
For more information see:
- SQL for Beginners (Part 5) : Joins
- SQL for Beginners - Full Playlist
- Oracle SQL Articles - Getting Started
- Database SQL Language Reference : SELECT
- LATERAL Inline Views, CROSS APPLY and OUTER APPLY Joins in Oracle Database 12c Release 1 (12.1)
Hope this helps. Regards Tim...