8i | 9i | 10g | 11g | 12c | 13c | 18c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » Misc » Here

PERCENT_RANK Analytic Function

This article gives an overview of the PERCENT_RANK analytic function. If you are new to analytic functions you should probably read this introduction to analytic functions first.

Related articles.

Setup

The examples in this article require the following table.

--DROP TABLE emp PURGE;

CREATE TABLE emp (
  empno    NUMBER(4) CONSTRAINT pk_emp PRIMARY KEY,
  ename    VARCHAR2(10),
  job      VARCHAR2(9),
  mgr      NUMBER(4),
  hiredate DATE,
  sal      NUMBER(7,2),
  comm     NUMBER(7,2),
  deptno   NUMBER(2)
);

INSERT INTO emp VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20);
INSERT INTO emp VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);
INSERT INTO emp VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);
INSERT INTO emp VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20);
INSERT INTO emp VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);
INSERT INTO emp VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30);
INSERT INTO emp VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10);
INSERT INTO emp VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20);
INSERT INTO emp VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10);
INSERT INTO emp VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);
INSERT INTO emp VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20);
INSERT INTO emp VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30);
INSERT INTO emp VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20);
INSERT INTO emp VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10);
COMMIT;

PERCENT_RANK as an Aggregate Function

The PERCENT_RANK aggregate function is predictive, returning the rank as a percentage (between 0 and 1) of the specified value in a set. As an aggregate function it reduces the number of rows, hence the term "aggregate". If the data isn't grouped we turn the 14 rows in the EMP table to a single row with the aggregated value. In the following example we see the percent rank of the specified value relative to the salaries for all employees.

SELECT PERCENT_RANK(2000) WITHIN GROUP (ORDER BY sal) AS percent_rank_sal,
       ROUND(PERCENT_RANK(2000) WITHIN GROUP (ORDER BY sal)*100,2) AS percent_rank_sal_pct
FROM   emp;

PERCENT_RANK_SAL PERCENT_RANK_SAL_PCT
---------------- --------------------
      .571428571                57.14

SQL>

We can get more granularity of information by including a GROUP BY clause. In the following example we see the percent rank of the specified value relative to the salaries on a per-department basis.

SELECT deptno,
       PERCENT_RANK(2000) WITHIN GROUP (ORDER BY sal) AS percent_rank_sal_by_dept,
       ROUND(PERCENT_RANK(2000) WITHIN GROUP (ORDER BY sal)*100,2) AS percent_rank_sal_pct_by_dept
FROM   emp
GROUP BY deptno
ORDER BY deptno;

    DEPTNO PERCENT_RANK_SAL_BY_DEPT PERCENT_RANK_SAL_PCT_BY_DEPT
---------- ------------------------ ----------------------------
        10               .333333333                        33.33
        20                       .4                           40
        30               .833333333                        83.33

SQL>

More than one column can be specified in the aggregate function.

SELECT deptno,
       PERCENT_RANK(2000, 0.1) WITHIN GROUP (ORDER BY sal, comm) AS percent_rank_sal_by_dept,
       ROUND(PERCENT_RANK(2000, 0.1) WITHIN GROUP (ORDER BY sal, comm)*100,2) AS percent_rank_sal_pct_by_dept
FROM   emp
GROUP BY deptno
ORDER BY deptno;

    DEPTNO PERCENT_RANK_SAL_BY_DEPT PERCENT_RANK_SAL_PCT_BY_DEPT
---------- ------------------------ ----------------------------
        10               .333333333                        33.33
        20                       .4                           40
        30               .833333333                        83.33

SQL>

In both cases we have aggregated the data to get the values, returning less rows than we started with. Analytic functions allow us to return these aggregate values while retaining the original row data.

PERCENT_RANK Analytic Function

The basic description for the PERCENT_RANK analytic function is shown below. The analytic clause is described in more detail here.

PERCENT_RANK() OVER ([ query_partition_clause ] order_by_clause)

The PERCENT_RANK analytic function is order sensitive so the ORDER BY clause is mandatory. Omitting a partitioning clause from the OVER clause means the whole result set is treated as a single partition. The first row of the ordered set is assigned 0 and the last row of the set is assigned 1. If there is a single row in the set it is assigned 0. Ties are assigned the same value. In the following example we display the percent rank, or the relative position in the set, of each of all employees, as well as all the original data.

SELECT empno,
       ename,
       deptno,
       sal,
       PERCENT_RANK() OVER (ORDER BY sal) AS percent_rank_sal,
       ROUND(PERCENT_RANK() OVER (ORDER BY sal)*100) AS percent_rank_sal_pct
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL PERCENT_RANK_SAL PERCENT_RANK_SAL_PCT
---------- ---------- ---------- ---------- ---------------- --------------------
      7369 SMITH              20        800                0                    0
      7900 JAMES              30        950      .0769230769                    8
      7876 ADAMS              20       1100       .153846154                   15
      7521 WARD               30       1250       .230769231                   23
      7654 MARTIN             30       1250       .230769231                   23
      7934 MILLER             10       1300       .384615385                   38
      7844 TURNER             30       1500       .461538462                   46
      7499 ALLEN              30       1600       .538461538                   54
      7782 CLARK              10       2450       .615384615                   62
      7698 BLAKE              30       2850       .692307692                   69
      7566 JONES              20       2975       .769230769                   77
      7788 SCOTT              20       3000       .846153846                   85
      7902 FORD               20       3000       .846153846                   85
      7839 KING               10       5000                1                  100

SQL>

Adding the partitioning clause allows us to display the percent rank of each employee within a partition.

SELECT empno,
       ename,
       deptno,
       sal,
       PERCENT_RANK() OVER (PARTITION BY deptno ORDER BY sal) AS percent_rank_sal_by_dept,
       ROUND(PERCENT_RANK() OVER (PARTITION BY deptno ORDER BY sal)*100) AS percent_rank_sal_pct_by_dept
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL PERCENT_RANK_SAL_BY_DEPT PERCENT_RANK_SAL_PCT_BY_DEPT
---------- ---------- ---------- ---------- ------------------------ ----------------------------
      7934 MILLER             10       1300                        0                            0
      7782 CLARK              10       2450                       .5                           50
      7839 KING               10       5000                        1                          100
      7369 SMITH              20        800                        0                            0
      7876 ADAMS              20       1100                      .25                           25
      7566 JONES              20       2975                       .5                           50
      7788 SCOTT              20       3000                      .75                           75
      7902 FORD               20       3000                      .75                           75
      7900 JAMES              30        950                        0                            0
      7654 MARTIN             30       1250                       .2                           20
      7521 WARD               30       1250                       .2                           20
      7844 TURNER             30       1500                       .6                           60
      7499 ALLEN              30       1600                       .8                           80
      7698 BLAKE              30       2850                        1                          100

SQL>

Assigning a percentage allows us to do a type of Top-N query based on the percentage. The following query returns the top 30% of employees in the company based on their pay.

SELECT *
FROM   (SELECT empno,
               ename,
               deptno,
               sal,
               PERCENT_RANK() OVER (ORDER BY sal) AS percent_rank_sal
        FROM   emp)
WHERE  percent_rank_sal >= 0.7;

     EMPNO ENAME          DEPTNO        SAL PERCENT_RANK_SAL
---------- ---------- ---------- ---------- ----------------
      7566 JONES              20       2975       .769230769
      7788 SCOTT              20       3000       .846153846
      7902 FORD               20       3000       .846153846
      7839 KING               10       5000                1

SQL>

If there is a single row in the set it is assigned 0.

SELECT empno,
       ename,
       deptno,
       sal,
       PERCENT_RANK() OVER (PARTITION BY deptno ORDER BY sal) AS percent_rank_sal_by_dept,
       ROUND(PERCENT_RANK() OVER (PARTITION BY deptno ORDER BY sal)*100) AS percent_rank_sal_pct_by_dept
FROM   emp
WHERE  empno = 7934;

     EMPNO ENAME          DEPTNO        SAL PERCENT_RANK_SAL_BY_DEPT PERCENT_RANK_SAL_PCT_BY_DEPT
---------- ---------- ---------- ---------- ------------------------ ----------------------------
      7934 MILLER             10       1300                        0                            0

SQL>

Quick Links

The "*" indicates the function supports the full analytic syntax, including the windowing clause.

AVG * CLUSTER_DETAILS CLUSTER_DISTANCE CLUSTER_ID CLUSTER_PROBABILITY
CLUSTER_SET CORR * COUNT * COVAR_POP * COVAR_SAMP *
CUME_DIST DENSE_RANK FEATURE_DETAILS FEATURE_ID FEATURE_SET
FEATURE_VALUE FIRST FIRST_VALUE * LAG LAST
LAST_VALUE * LEAD LISTAGG MATCH_RECOGNIZE MAX *
MEDIAN MIN * NTH_VALUE * NTILE PERCENT_RANK
PERCENTILE_CONT PERCENTILE_DISC PREDICTION PREDICTION_COST PREDICTION_DETAILS
PREDICTION_PROBABILITY PREDICTION_SET RANK RATIO_TO_REPORT REGR_ (Linear Regression) Functions *
ROW_NUMBER STDDEV * STDDEV_POP * STDDEV_SAMP * SUM *
VAR_POP * VAR_SAMP * VARIANCE * String Aggregation Top-N Queries

For more information see:

Hope this helps. Regards Tim...

Back to the Top.