8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23ai | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
Analytic Function Syntax Enhancements (WINDOW, GROUPS, EXCLUDE) in Oracle Database 21c
This article describes the syntax enhancements to analytic function introduced in Oracle database 21c.
Related articles.
- Analytic Function Window Clause in Oracle Database 21c
- Analytic Function Groups Clause in Oracle Database 21c
- Analytic Function Exclude Clause in Oracle Database 21c
- Analytic Functions : All Articles
Setup
The examples in this article require the following tables.
--drop table emp purge; create table emp ( empno number(4) constraint pk_emp primary key, ename varchar2(10), job varchar2(9), mgr number(4), hiredate date, sal number(7,2), comm number(7,2), deptno number(2) ); insert into emp values (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20); insert into emp values (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30); insert into emp values (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30); insert into emp values (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20); insert into emp values (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30); insert into emp values (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30); insert into emp values (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10); insert into emp values (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20); insert into emp values (7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10); insert into emp values (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30); insert into emp values (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20); insert into emp values (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30); insert into emp values (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20); insert into emp values (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10); commit; --drop table t1 purge; create table t1 (id number, value number); insert into t1 (id, value) values (1, 1); insert into t1 (id, value) values (2, 2); insert into t1 (id, value) values (3, 3); insert into t1 (id, value) values (4, 3); insert into t1 (id, value) values (5, 4); insert into t1 (id, value) values (6, 6); insert into t1 (id, value) values (7, 6); insert into t1 (id, value) values (8, 7); insert into t1 (id, value) values (9, 7); insert into t1 (id, value) values (10, 8); commit;
WINDOW Clause
In previous releases the window frame was defined as part the analytic function call. The following query uses the FIRST_VALUE
analytic function to display the lowest salary in each department, along with the raw data about the employees in the department.
select empno, deptno, sal, first_value(sal) over (partition by deptno order by sal) as lowest_in_dept from emp; EMPNO DEPTNO SAL LOWEST_IN_DEPT ---------- ---------- ---------- -------------- 7934 10 1300 1300 7782 10 2450 1300 7839 10 5000 1300 7369 20 800 800 7876 20 1100 800 7566 20 2975 800 7788 20 3000 800 7902 20 3000 800 7900 30 950 950 7654 30 1250 950 7521 30 1250 950 7844 30 1500 950 7499 30 1600 950 7698 30 2850 950 SQL>
From Oracle 21c onward the window frame can be defined using a separate window clause, which is referenced in the analytic function call. This example uses a separate window clause to define the window frame.
select empno, deptno, sal, first_value(sal) over w1 as lowest_in_dept from emp window w1 as (partition by deptno order by sal); EMPNO DEPTNO SAL LOWEST_IN_DEPT ---------- ---------- ---------- -------------- 7934 10 1300 1300 7782 10 2450 1300 7839 10 5000 1300 7369 20 800 800 7876 20 1100 800 7566 20 2975 800 7788 20 3000 800 7902 20 3000 800 7900 30 950 950 7654 30 1250 950 7521 30 1250 950 7844 30 1500 950 7499 30 1600 950 7698 30 2850 950 SQL>
This means multiple analytic functions can reference the same window clause. In this example we add the RANK
analytic function to assign a rank to each person in the department based on their salary from lowest to highest.
select empno, deptno, sal, first_value(sal) over w1 as lowest_in_dept, rank() over w1 as sal_rank_in_dept from emp window w1 as (partition by deptno order by sal); EMPNO DEPTNO SAL LOWEST_IN_DEPT SAL_RANK_IN_DEPT ---------- ---------- ---------- -------------- ---------------- 7934 10 1300 1300 1 7782 10 2450 1300 2 7839 10 5000 1300 3 7369 20 800 800 1 7876 20 1100 800 2 7566 20 2975 800 3 7788 20 3000 800 4 7902 20 3000 800 4 7900 30 950 950 1 7654 30 1250 950 2 7521 30 1250 950 2 7844 30 1500 950 4 7499 30 1600 950 5 7698 30 2850 950 6 SQL>
Multiple window frames can be defined and referenced. The following query displays the lowest and highest salary in each department. It uses the FIRST_VALUE
analytic function twice, each referencing a different window frame.
select empno, deptno, sal, first_value(sal) over w1 as lowest_in_dept, first_value(sal) over w2 as highest_in_dept from emp window w1 as (partition by deptno order by sal), w2 as (partition by deptno order by sal desc); EMPNO DEPTNO SAL LOWEST_IN_DEPT HIGHEST_IN_DEPT ---------- ---------- ---------- -------------- --------------- 7934 10 1300 1300 5000 7782 10 2450 1300 5000 7839 10 5000 1300 5000 7369 20 800 800 3000 7876 20 1100 800 3000 7566 20 2975 800 3000 7788 20 3000 800 3000 7902 20 3000 800 3000 7900 30 950 950 2850 7521 30 1250 950 2850 7654 30 1250 950 2850 7844 30 1500 950 2850 7499 30 1600 950 2850 7698 30 2850 950 2850 SQL>
We can define a common WINDOW
clause, and extend it in each analytic function call.
select row_number () over w1 as row_order, sal, avg(sal) over (w1 rows between unbounded preceding and current row) as avg_rolling, avg(sal) over (w1 rows between unbounded preceding and unbounded following) as avg_all from emp window w1 as (order by sal); ROW_ORDER SAL AVG_ROLLING AVG_ALL ---------- ---------- ----------- ---------- 1 800 800 2073.21429 2 950 875 2073.21429 3 1100 950 2073.21429 4 1250 1025 2073.21429 5 1250 1070 2073.21429 6 1300 1108.33333 2073.21429 7 1500 1164.28571 2073.21429 8 1600 1218.75 2073.21429 9 2450 1355.55556 2073.21429 10 2850 1505 2073.21429 11 2975 1638.63636 2073.21429 12 3000 1752.08333 2073.21429 13 3000 1848.07692 2073.21429 14 5000 2073.21429 2073.21429 SQL>
It's worth keeping in mind this new syntax is a query transformation. We can see this if we trace the session. Check the trace file for the session.
select value from v$diag_info where name = 'Default Trace File'; VALUE ---------------------------------------------------------------- /u01/app/oracle/diag/rdbms/cdb1/cdb1/trace/cdb1_ora_23984.trc 1 row selected. SQL>
Perform a 10053 trace of a statement using the new syntax.
alter session set events '10053 trace name context forever'; select empno, deptno, sal, first_value(sal) over w1 as lowest_in_dept, first_value(sal) over w2 as highest_in_dept from emp window w1 as (partition by deptno order by sal), w2 as (partition by deptno order by sal desc); alter session set events '10053 trace name context off';
The section of the trace file beginning with "Final query after transformations" shows the statement that was actually processed, after the query transformation.
Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "EMP"."EMPNO" "EMPNO", "EMP"."DEPTNO" "DEPTNO", "EMP"."SAL" "SAL", FIRST_VALUE("EMP"."SAL") OVER ( PARTITION BY "EMP"."DEPTNO" ORDER BY "EMP"."SAL" RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) "LOWEST_IN_DEPT", FIRST_VALUE("EMP"."SAL") OVER ( PARTITION BY "EMP"."DEPTNO" ORDER BY "EMP"."SAL" DESC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) "HIGHEST_IN_DEPT" FROM "TESTUSER1"."EMP" "EMP"
As we can see, the statement has been rewritten to a form we might have used prior to 21c.
GROUPS Clause
The GROUPS
keyword has been added to the ROWS
and RANGE
keywords, so now there are three options for defining a windowing clause.
ROWS
: We are indicating a specific number of rows relative to the current row, either directly, or via an expression. There is no concept of tied rows. The cutoff is explicitly the row indicated. It can have any number of sort keys, of any ordered types.RANGE
: We are referring to a range of values in a specific column relative to the value in the current row. As a result, Oracle doesn't know how many rows are included in the range until the ordered set is created. There must be a single sort key, of a data type that allows addition and subtraction, such as a numeric, date or interval.GROUPS
: We are dividing the data into groups based on the ordered values. All ties are part of the same group. The current row is part of the current group, so references to preceding or following refer to the preceding or following group, not specific rows. Similar toRANGE
, the number of rows in the window are not known until the data is ordered. It can have any number of sort keys, of any ordered types.
The following query compares an average salary "between 1 preceding and current row" using the ROWS
, GROUPS
and RANGE
clauses.
select row_number () over (order by sal) as row_order, sal, avg(sal) over w1 as avg_rows, avg(sal) over w2 as avg_groups, avg(sal) over w3 as avg_range from emp window w1 as (order by sal rows between 1 preceding and current row), w2 as (order by sal groups between 1 preceding and current row), w3 as (order by sal range between 1 preceding and current row); ROW_ORDER SAL AVG_ROWS AVG_GROUPS AVG_RANGE ---------- ---------- ---------- ---------- ---------- 1 800 800 800 800 2 950 875 875 950 3 1100 1025 1025 1100 4 1250 1175 1200 1250 5 1250 1250 1200 1250 6 1300 1275 1266.66667 1300 7 1500 1400 1400 1500 8 1600 1550 1550 1600 9 2450 2025 2025 2450 10 2850 2650 2650 2850 11 2975 2912.5 2912.5 2975 12 3000 2987.5 2991.66667 3000 13 3000 3000 2991.66667 3000 14 5000 4000 3666.66667 5000 SQL>
If we work down the AVG_ROWS
column one row at a time we can see we always have the average for salary value of the current row and the row immediately preceding it.
The AVG_GROUPS
column starts off the same, but once the current row is 4, we have a salary tie with row 5, so rows 4 and 5 are treated as a single group, with row 3 as the preceding group. Rather returning the average of the salaries from rows 3 and 4, we see the average of salaries from rows 3, 4 and 5, the current group and the preceding group.
For this data the AVG_RANGE
column is essentially reporting the salary value. Remember RANGE
is a reporting range value, not a row reference. A range of "1 preceding" is the "value-1", not the "row-1". Ties do matter, but you can't see the effect using this data set.
Let's try a different data set by writing a similar query against the T1
table. Now we have increments of "1" in the VALUE
column, with a few ties and one gap, because value "5" is missing from the ordered set.
select row_number () over (order by value) as row_order, value, avg(value) over w1 as avg_rows, avg(value) over w2 as avg_groups, avg(value) over w3 as avg_range from t1 window w1 as (order by value rows between 1 preceding and current row), w2 as (order by value groups between 1 preceding and current row), w3 as (order by value range between 1 preceding and current row); ROW_ORDER VALUE AVG_ROWS AVG_GROUPS AVG_RANGE ---------- ---------- ---------- ---------- ---------- 1 1 1 1 1 2 2 1.5 1.5 1.5 3 3 2.5 2.66666667 2.66666667 4 3 3 2.66666667 2.66666667 5 4 3.5 3.33333333 3.33333333 6 6 5 5.33333333 6 7 6 6 5.33333333 6 8 7 6.5 6.5 6.5 9 7 7 6.5 6.5 10 8 7.5 7.33333333 7.33333333 SQL>
The AVG_ROWS
column displays the average between the current row and the preceding row.
The AVG_GROUPS
column looks different at row 3. Row 3 and 4 have a tie for the value of "3". This means row 3 and 4 are a single group and row 2 with the value "2" is the previous group. As a result we get the average for rows 2, 3 and 4 returned for rows 3 and 4. We have a similar situation with rows 6 and 7, which have a tie for the value "6" and are handled as a single group. The previous group jumps the missing value "5" and is a group with the value "4".
The AVG_RANGE
column looks similar to the AVG_GROUPS
column until we hit a gap in the value sequence. Rows 6 and 7 have a tie for the value "6". The range is from "6-1=5" to "6", but there is no value of "5" in the ordered set, so we are presented with the average of rows 6 and 7 only when the current row is either row 6 or 7.
EXCLUDE Clause
The EXCLUDE
clause has several options to allow us to exclude various rows from to the window frame. The following options are available.
EXCLUDE NO OTHERS
: Nothing is excluded. This is the equivalent not having anEXCLUDE
clause, so this is the default behaviour.EXCLUDE CURRENT ROW
: Excludes only the current row.EXCLUDE GROUP
: Excludes the current group, the current row and all other rows with the same value.EXCLUDE TIES
: Remove all rows with the same value as the current row, but does not exclude the current row.
All these exclusions can be used with ROWS
, RANGE
and GROUPS
window definitions.
The following query calculates the average of the values from one row preceding and one row following, but excludes the current row.
select row_number () over w1 as row_order, value, avg(value) over (w1 rows between 1 preceding and 1 following exclude current row) as ex_current_row from t1 window w1 as (order by value); ROW_ORDER VALUE EX_CURRENT_ROW ---------- ---------- -------------- 1 1 2 2 2 2 3 3 2.5 4 3 3.5 5 4 4.5 6 6 5 7 6 6.5 8 7 6.5 9 7 7.5 10 8 7 SQL>
The following query calculates the average of the values from one row preceding and one row following, but excludes the current group. Using a row definition for the main window and a group exclusion leads to an interesting result.
select row_number () over w1 as row_order, value, avg(value) over (w1 rows between 1 preceding and 1 following exclude group) as ex_group from t1 window w1 as (order by value); ROW_ORDER VALUE EX_GROUP ---------- ---------- ---------- 1 1 2 2 2 2 3 3 2 4 3 4 5 4 4.5 6 6 4 7 6 7 8 7 6 9 7 8 10 8 7 SQL>
The following query calculates a rolling average using a range from unbounded preceding to current row, but excludes ties, so the reporting range doesn't do the normal processing of including all ties with the current row.
select row_number () over w1 as row_order, value, avg(value) over (w1 range between unbounded preceding and current row exclude ties) as ex_ties from t1 window w1 as (order by value); ROW_ORDER VALUE EX_TIES ---------- ---------- ---------- 1 1 1 2 2 1.5 3 3 2 4 3 2 5 4 2.6 6 6 3.16666667 7 6 3.16666667 8 7 4 9 7 4 10 8 4.7 SQL>
The following query calculates a rolling average using a range from unbounded preceding to current row, but excludes no other rows, which is the equivalent on not having an EXCLUDE
clause.
select row_number () over w1 as row_order, value, avg(value) over (w1 range between unbounded preceding and current row exclude no others) as ex_no_others from t1 window w1 as (order by value); ROW_ORDER VALUE EX_NO_OTHERS ---------- ---------- ------------ 1 1 1 2 2 1.5 3 3 2.25 4 3 2.25 5 4 2.6 6 6 3.57142857 7 6 3.57142857 8 7 4.33333333 9 7 4.33333333 10 8 4.7 SQL>
Quick Links
The "*" indicates the function supports the full analytic syntax, including the windowing clause.
For more information see:
- Windowing Functions
- Practice: Using Enhanced Analytic Functions
- Analytic Function Window Clause in Oracle Database 21c
- Analytic Function Groups Clause in Oracle Database 21c
- Analytic Function Exclude Clause in Oracle Database 21c
- Analytic Functions : All Articles
Hope this helps. Regards Tim...