8i | 9i | 10g | 11g | 12c | 13c | 18c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » Misc » Here

STDDEV, STDDEV_POP and STDDEV_SAMP Analytic Functions

This article gives an overview of the STDDEV, STDDEV_POP and STDDEV_SAMP analytic functions. If you are new to analytic functions you should probably read this introduction to analytic functions first.

Related articles.

Setup

The examples in this article require the following table.

--DROP TABLE emp PURGE;

CREATE TABLE emp (
  empno    NUMBER(4) CONSTRAINT pk_emp PRIMARY KEY,
  ename    VARCHAR2(10),
  job      VARCHAR2(9),
  mgr      NUMBER(4),
  hiredate DATE,
  sal      NUMBER(7,2),
  comm     NUMBER(7,2),
  deptno   NUMBER(2)
);

INSERT INTO emp VALUES (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20);
INSERT INTO emp VALUES (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);
INSERT INTO emp VALUES (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);
INSERT INTO emp VALUES (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20);
INSERT INTO emp VALUES (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);
INSERT INTO emp VALUES (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30);
INSERT INTO emp VALUES (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10);
INSERT INTO emp VALUES (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20);
INSERT INTO emp VALUES (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10);
INSERT INTO emp VALUES (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);
INSERT INTO emp VALUES (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20);
INSERT INTO emp VALUES (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30);
INSERT INTO emp VALUES (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20);
INSERT INTO emp VALUES (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10);
COMMIT;

STDDEV, STDDEV_POP and STDDEV_SAMP as Aggregate Functions

The STDDEV, STDDEV_POP and STDDEV_SAMP aggregate functions are used to calculate the standard deviation, population standard deviation and cumulative sample standard deviation of a set of data respectively. As aggregate functions they reduce the number of rows, hence the term "aggregate". If the data isn't grouped we turn the 14 rows in the EMP table to a single row with the aggregated values.

SELECT STDDEV(sal) AS stddev_sal,
       STDDEV_POP(sal) AS stddev_pop_sal,
       STDDEV_SAMP(sal) AS stddev_samp_sal
FROM   emp;

STDDEV_SAL STDDEV_POP_SAL STDDEV_SAMP_SAL
---------- -------------- ---------------
1182.50322     1139.48862      1182.50322

SQL>

We can get more granularity of information by including a GROUP BY clause. In the following example we see the values on a per-department basis.

SELECT deptno,
       STDDEV(sal) AS stddev_sal,
       STDDEV_POP(sal) AS stddev_pop_sal,
       STDDEV_SAMP(sal) AS stddev_samp_sal
FROM   emp
GROUP BY deptno
ORDER BY deptno;

    DEPTNO STDDEV_SAL STDDEV_POP_SAL STDDEV_SAMP_SAL
---------- ---------- -------------- ---------------
        10 1893.62967     1546.14215      1893.62967
        20  1123.3321     1004.73877       1123.3321
        30 668.331255     610.100174      668.331255

SQL>

In both cases we have aggregated the data to get the values, returning less rows than we started with. Analytic functions allow us to return these aggregate values while retaining the original row data.

STDDEV Analytic Function

If there is more than one record in the sample after discarding nulls, the STDDEV function returns the result of the STDDEV_SAMP function, the cumulative sample standard deviation. If there is only a single row in the sample after discarding nulls, the STDDEV function returns the value "0". If there are no records in the set after discarding nulls, the return value is NULL.

The basic description for the STDDEV analytic function is shown below. The analytic clause is described in more detail here.

STDDEV([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]

Using an empty OVER clause turns the STDDEV function into an analytic function. The lack of a partitioning clause means the whole result set is treated as a single partition, so we get the standard deviation of the salary for all employees, as well as all the original data.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV(sal) OVER () AS stddev_sal
FROM   emp
ORDER BY deptno;

     EMPNO ENAME          DEPTNO        SAL STDDEV_SAL
---------- ---------- ---------- ---------- ----------
      7782 CLARK              10       2450 1182.50322
      7839 KING               10       5000 1182.50322
      7934 MILLER             10       1300 1182.50322
      7566 JONES              20       2975 1182.50322
      7902 FORD               20       3000 1182.50322
      7876 ADAMS              20       1100 1182.50322
      7369 SMITH              20        800 1182.50322
      7788 SCOTT              20       3000 1182.50322
      7521 WARD               30       1250 1182.50322
      7844 TURNER             30       1500 1182.50322
      7499 ALLEN              30       1600 1182.50322
      7900 JAMES              30        950 1182.50322
      7698 BLAKE              30       2850 1182.50322
      7654 MARTIN             30       1250 1182.50322

SQL>

Adding the partitioning clause allows us to display the standard deviation of the salary per department, along with the employee data for each department.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV(sal) OVER (PARTITION BY deptno) AS stddev_sal_by_dept
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL STDDEV_SAL_BY_DEPT
---------- ---------- ---------- ---------- ------------------
      7782 CLARK              10       2450         1893.62967
      7839 KING               10       5000         1893.62967
      7934 MILLER             10       1300         1893.62967
      7566 JONES              20       2975          1123.3321
      7902 FORD               20       3000          1123.3321
      7876 ADAMS              20       1100          1123.3321
      7369 SMITH              20        800          1123.3321
      7788 SCOTT              20       3000          1123.3321
      7521 WARD               30       1250         668.331255
      7844 TURNER             30       1500         668.331255
      7499 ALLEN              30       1600         668.331255
      7900 JAMES              30        950         668.331255
      7698 BLAKE              30       2850         668.331255
      7654 MARTIN             30       1250         668.331255

SQL>

STDDEV_POP Analytic Function

The STDDEV_POP function returns the population standard deviation, the square root of the VAR_POP function.

The basic description for the STDDEV_POP analytic function is shown below. The analytic clause is described in more detail here.

STDDEV_POP(expr) [ OVER (analytic_clause) ]

Using an empty OVER clause turns the STDDEV_POP function into an analytic function. The lack of a partitioning clause means the whole result set is treated as a single partition, so we get the population standard deviation of the salary for all employees, as well as all the original data.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV_POP(sal) OVER () AS stddev_pop_sal
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL STDDEV_POP_SAL
---------- ---------- ---------- ---------- --------------
      7369 SMITH              20        800     1139.48862
      7499 ALLEN              30       1600     1139.48862
      7521 WARD               30       1250     1139.48862
      7566 JONES              20       2975     1139.48862
      7654 MARTIN             30       1250     1139.48862
      7698 BLAKE              30       2850     1139.48862
      7782 CLARK              10       2450     1139.48862
      7788 SCOTT              20       3000     1139.48862
      7839 KING               10       5000     1139.48862
      7844 TURNER             30       1500     1139.48862
      7876 ADAMS              20       1100     1139.48862
      7900 JAMES              30        950     1139.48862
      7902 FORD               20       3000     1139.48862
      7934 MILLER             10       1300     1139.48862

SQL>

Adding the partitioning clause allows us to display the population standard deviation of the salary per department, along with the employee data for each department.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV_POP(sal) OVER (PARTITION BY deptno) AS stddev_pop_by_dept
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL STDDEV_POP_BY_DEPT
---------- ---------- ---------- ---------- ------------------
      7782 CLARK              10       2450         1546.14215
      7839 KING               10       5000         1546.14215
      7934 MILLER             10       1300         1546.14215
      7566 JONES              20       2975         1004.73877
      7902 FORD               20       3000         1004.73877
      7876 ADAMS              20       1100         1004.73877
      7369 SMITH              20        800         1004.73877
      7788 SCOTT              20       3000         1004.73877
      7521 WARD               30       1250         610.100174
      7844 TURNER             30       1500         610.100174
      7499 ALLEN              30       1600         610.100174
      7900 JAMES              30        950         610.100174
      7698 BLAKE              30       2850         610.100174
      7654 MARTIN             30       1250         610.100174

SQL>

STDDEV_SAMP Analytic Function

The STDDEV_SAMP function returns the cumulative sample standard deviation, the square root of the VAR_SAMP function.

The basic description for the STDDEV_SAMP analytic function is shown below. The analytic clause is described in more detail here.

STDDEV_SAMP(expr) [ OVER (analytic_clause) ]

Using an empty OVER clause turns the STDDEV_SAMP function into an analytic function. The lack of a partitioning clause means the whole result set is treated as a single partition, so we get the cumulative sample standard deviation of the salary for all employees, as well as all the original data.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV_SAMP(sal) OVER () AS stddev_samp_sal
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL STDDEV_SAMP_SAL
---------- ---------- ---------- ---------- ---------------
      7369 SMITH              20        800      1182.50322
      7499 ALLEN              30       1600      1182.50322
      7521 WARD               30       1250      1182.50322
      7566 JONES              20       2975      1182.50322
      7654 MARTIN             30       1250      1182.50322
      7698 BLAKE              30       2850      1182.50322
      7782 CLARK              10       2450      1182.50322
      7788 SCOTT              20       3000      1182.50322
      7839 KING               10       5000      1182.50322
      7844 TURNER             30       1500      1182.50322
      7876 ADAMS              20       1100      1182.50322
      7900 JAMES              30        950      1182.50322
      7902 FORD               20       3000      1182.50322
      7934 MILLER             10       1300      1182.50322

SQL>

Adding the partitioning clause allows us to display the cumulative sample standard deviation of the salary per department, along with the employee data for each department.

SELECT empno,
       ename,
       deptno,
       sal,
       STDDEV_SAMP(sal) OVER (PARTITION BY deptno) AS stddev_samp_by_dept
FROM   emp;

     EMPNO ENAME          DEPTNO        SAL STDDEV_SAMP_BY_DEPT
---------- ---------- ---------- ---------- -------------------
      7782 CLARK              10       2450          1893.62967
      7839 KING               10       5000          1893.62967
      7934 MILLER             10       1300          1893.62967
      7566 JONES              20       2975           1123.3321
      7902 FORD               20       3000           1123.3321
      7876 ADAMS              20       1100           1123.3321
      7369 SMITH              20        800           1123.3321
      7788 SCOTT              20       3000           1123.3321
      7521 WARD               30       1250          668.331255
      7844 TURNER             30       1500          668.331255
      7499 ALLEN              30       1600          668.331255
      7900 JAMES              30        950          668.331255
      7698 BLAKE              30       2850          668.331255
      7654 MARTIN             30       1250          668.331255

SQL>

Quick Links

The "*" indicates the function supports the full analytic syntax, including the windowing clause.

AVG * CLUSTER_DETAILS CLUSTER_DISTANCE CLUSTER_ID CLUSTER_PROBABILITY
CLUSTER_SET CORR * COUNT * COVAR_POP * COVAR_SAMP *
CUME_DIST DENSE_RANK FEATURE_DETAILS FEATURE_ID FEATURE_SET
FEATURE_VALUE FIRST FIRST_VALUE * LAG LAST
LAST_VALUE * LEAD LISTAGG MATCH_RECOGNIZE MAX *
MEDIAN MIN * NTH_VALUE * NTILE PERCENT_RANK
PERCENTILE_CONT PERCENTILE_DISC PREDICTION PREDICTION_COST PREDICTION_DETAILS
PREDICTION_PROBABILITY PREDICTION_SET RANK RATIO_TO_REPORT REGR_ (Linear Regression) Functions *
ROW_NUMBER STDDEV * STDDEV_POP * STDDEV_SAMP * SUM *
VAR_POP * VAR_SAMP * VARIANCE * String Aggregation Top-N Queries

For more information see:

Hope this helps. Regards Tim...

Back to the Top.