8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
LISTAGG DISTINCT in Oracle Database 19c
The LISTAGG
function was introduced in Oracle 11gR2 to make string aggregation simpler. In Oracle 12cR2 it was extended to include overflow error handling. Oracle 19c includes the ability to remove duplicates from the LISTAGG
results by including the DISTINCT
keyword.
Related articles.
- LISTAGG Analytic Function
- LISTAGG Analytic Function in 11g Release 2
- LISTAGG Function Enhancements in Oracle Database 12c Release 2 (12.2)
- Analytic Functions : All Articles
Setup
The examples in this article use the following table.
-- DROP TABLE EMP PURGE; CREATE TABLE EMP ( EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY, ENAME VARCHAR2(10), JOB VARCHAR2(9), MGR NUMBER(4), HIREDATE DATE, SAL NUMBER(7,2), COMM NUMBER(7,2), DEPTNO NUMBER(2) ); INSERT INTO EMP VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20); INSERT INTO EMP VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30); INSERT INTO EMP VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30); INSERT INTO EMP VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20); INSERT INTO EMP VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30); INSERT INTO EMP VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30); INSERT INTO EMP VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10); INSERT INTO EMP VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20); INSERT INTO EMP VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10); INSERT INTO EMP VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30); INSERT INTO EMP VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20); INSERT INTO EMP VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30); INSERT INTO EMP VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20); INSERT INTO EMP VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10); COMMIT;
The Problem
The default action of the LISTAGG
function is shown below.
COLUMN employees FORMAT A40 SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
Let's add some extra people called "MILLER" into department 10, to give us duplicates in the aggregated list.
INSERT INTO emp VALUES (9998,'MILLER','ANALYST',7782,to_date('23-1-1982','dd-mm-yyyy'),1600,NULL,10); INSERT INTO emp VALUES (9999,'MILLER','MANADER',7782,to_date('23-1-1982','dd-mm-yyyy'),1500,NULL,10); COMMIT;
As expected, we now see multiple entries for the name "MILLER" in department 10.
COLUMN employees FORMAT A40 SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER,MILLER,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
If that's what we are expecting, great. If we want to remove duplicates, what do we do?
The Solution : Pre-19c
We could solve this in a number of ways. In the following example we use the ROW_NUMBER
analytic function to remove any duplicates, then use the conventional LISTAGG
function to aggregate the data.
COLUMN employees FORMAT A40 SELECT e2.deptno, LISTAGG(e2.ename, ',') WITHIN GROUP (ORDER BY e2.ename) AS employees FROM (SELECT e.*, ROW_NUMBER() OVER (PARTITION BY e.deptno, e.ename ORDER BY e.empno) AS myrank FROM emp e) e2 WHERE e2.myrank = 1 GROUP BY e2.deptno ORDER BY e2.deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
Alternatively we could use DISTINCT
in an inline view to remove the duplicate rows, then use the conventional LISTAGG
function call to aggregate the data.
COLUMN employees FORMAT A40 SELECT e2.deptno, LISTAGG(e2.ename, ',') WITHIN GROUP (ORDER BY e2.ename) AS employees FROM (SELECT DISTINCT e.deptno, e.ename FROM emp e) e2 GROUP BY e2.deptno ORDER BY e2.deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
The Solution : 19c Onward
Oracle 19c introduced a simpler solution. We can now include the DISTINCT
keyword directly in the LISTAGG
function call.
COLUMN employees FORMAT A40 SELECT deptno, LISTAGG(DISTINCT ename, ',') WITHIN GROUP (ORDER BY ename) AS employees FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
The default functionality is to include all results, which we can express explicitly using the ALL
keyword.
SELECT deptno, LISTAGG(ALL ename, ',') WITHIN GROUP (ORDER BY ename) AS employees FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO EMPLOYEES ---------- ---------------------------------------- 10 CLARK,KING,MILLER,MILLER,MILLER 20 ADAMS,FORD,JONES,SCOTT,SMITH 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 3 rows selected. SQL>
Quick Links
The "*" indicates the function supports the full analytic syntax, including the windowing clause.
For more information see:
- LISTAGG
- LISTAGG Analytic Function
- LISTAGG Analytic Function in 11g Release 2
- LISTAGG Function Enhancements in Oracle Database 12c Release 2 (12.2)
- Analytic Functions : All Articles
Hope this helps. Regards Tim...