8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
CUME_DIST Analytic Function
This article gives an overview of the CUME_DIST
analytic function. If you are new to analytic functions you should probably read this introduction to analytic functions first.
Related articles.
Setup
The examples in this article require the following table.
--DROP TABLE emp PURGE; CREATE TABLE emp ( empno NUMBER(4) CONSTRAINT pk_emp PRIMARY KEY, ename VARCHAR2(10), job VARCHAR2(9), mgr NUMBER(4), hiredate DATE, sal NUMBER(7,2), comm NUMBER(7,2), deptno NUMBER(2) ); INSERT INTO emp VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20); INSERT INTO emp VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30); INSERT INTO emp VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30); INSERT INTO emp VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20); INSERT INTO emp VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30); INSERT INTO emp VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30); INSERT INTO emp VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10); INSERT INTO emp VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20); INSERT INTO emp VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10); INSERT INTO emp VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30); INSERT INTO emp VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20); INSERT INTO emp VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30); INSERT INTO emp VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20); INSERT INTO emp VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10); COMMIT;
CUME_DIST as an Aggregate Function
The CUME_DIST
aggregate function is predictive, returning the cumulative distribution of the specified value in a set. As an aggregate function it reduces the number of rows, hence the term "aggregate". If the data isn't grouped we turn the 14 rows in the EMP
table to a single row with the aggregated value. In the following example we see the cumulative distribution of the specified value relative to the salaries for all employees.
SELECT CUME_DIST(2000) WITHIN GROUP (ORDER BY sal) AS cume_dist_sal FROM emp; CUME_DIST_SAL ------------- .6 SQL>
We can get more granularity of information by including a GROUP BY
clause. In the following example we see the cumulative distribution of the specified value relative to the salaries on a per-department basis.
SELECT deptno, CUME_DIST(2000) WITHIN GROUP (ORDER BY sal) AS cume_dist_sal_by_dept FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO CUME_DIST_SAL_BY_DEPT ---------- --------------------- 10 .5 20 .5 30 .857142857 SQL>
More than one column can be specified in the aggregate function.
SELECT deptno, CUME_DIST(2000, 0.1) WITHIN GROUP (ORDER BY sal, comm) AS cume_dist_sal_by_dept FROM emp GROUP BY deptno ORDER BY deptno; DEPTNO CUME_DIST_SAL_BY_DEPT ---------- --------------------- 10 .5 20 .5 30 .857142857 SQL>
In both cases we have aggregated the data to get the values, returning less rows than we started with. Analytic functions allow us to return these aggregate values while retaining the original row data.
CUME_DIST Analytic Function
The basic description for the CUME_DIST
analytic function is shown below. The analytic clause is described in more detail here.
CUME_DIST() OVER ([ query_partition_clause ] order_by_clause)
The CUME_DIST
analytic function is order sensitive so the ORDER BY
clause is mandatory. Omitting a partitioning clause from the OVER
clause means the whole result set is treated as a single partition. In the following example we display the cumulative distribution, or the relative position in the set, of each of all employees, as well as all the original data.
SELECT empno, ename, deptno, sal, CUME_DIST() OVER (ORDER BY sal) AS cume_dist_sal FROM emp; EMPNO ENAME DEPTNO SAL CUME_DIST_SAL ---------- ---------- ---------- ---------- ------------- 7369 SMITH 20 800 .0714285714 7900 JAMES 30 950 .142857143 7876 ADAMS 20 1100 .214285714 7521 WARD 30 1250 .357142857 7654 MARTIN 30 1250 .357142857 7934 MILLER 10 1300 .428571429 7844 TURNER 30 1500 .5 7499 ALLEN 30 1600 .571428571 7782 CLARK 10 2450 .642857143 7698 BLAKE 30 2850 .714285714 7566 JONES 20 2975 .785714286 7788 SCOTT 20 3000 .928571429 7902 FORD 20 3000 .928571429 7839 KING 10 5000 1 SQL>
Adding the partitioning clause allows us to display the cumulative distribution of each employee within a partition.
SELECT empno, ename, deptno, sal, CUME_DIST() OVER (PARTITION BY deptno ORDER BY sal) AS cume_dist_sal_by_dept FROM emp; EMPNO ENAME DEPTNO SAL CUME_DIST_SAL_BY_DEPT ---------- ---------- ---------- ---------- --------------------- 7934 MILLER 10 1300 .333333333 7782 CLARK 10 2450 .666666667 7839 KING 10 5000 1 7369 SMITH 20 800 .2 7876 ADAMS 20 1100 .4 7566 JONES 20 2975 .6 7788 SCOTT 20 3000 1 7902 FORD 20 3000 1 7900 JAMES 30 950 .166666667 7654 MARTIN 30 1250 .5 7521 WARD 30 1250 .5 7844 TURNER 30 1500 .666666667 7499 ALLEN 30 1600 .833333333 7698 BLAKE 30 2850 1 SQL>
Quick Links
The "*" indicates the function supports the full analytic syntax, including the windowing clause.
For more information see:
Hope this helps. Regards Tim...