8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23ai | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
ANY_VALUE Aggregate Function in Oracle Database 21c
The ANY_VALUE
function allows us to safely drop columns out of a GROUP BY
clause to reduce any performance overhead.
Related articles.
Setup
The examples in this article require the following tables to be present.
-- drop table emp purge; -- drop table dept purge; create table dept ( deptno number(2) constraint pk_dept primary key, dname varchar2(14), loc varchar2(13) ) ; create table emp ( empno number(4) constraint pk_emp primary key, ename varchar2(10), job varchar2(9), mgr number(4), hiredate date, sal number(7,2), comm number(7,2), deptno number(2) constraint fk_deptno references dept ); insert into dept values (10,'ACCOUNTING','NEW YORK'); insert into dept values (20,'RESEARCH','DALLAS'); insert into dept values (30,'SALES','CHICAGO'); insert into dept values (40,'OPERATIONS','BOSTON'); insert into emp values (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,null,20); insert into emp values (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30); insert into emp values (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30); insert into emp values (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,null,20); insert into emp values (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30); insert into emp values (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,null,30); insert into emp values (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,null,10); insert into emp values (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,null,20); insert into emp values (7839,'KING','PRESIDENT',null,to_date('17-11-1981','dd-mm-yyyy'),5000,null,10); insert into emp values (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30); insert into emp values (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,null,20); insert into emp values (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,null,30); insert into emp values (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,null,20); insert into emp values (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,null,10); commit;
The Problem
We want to return a list of departments with a count of the number of employees in the department, so we use the COUNT
aggregate function and a GROUP BY
clause.
select d.deptno, d.dname, count(e.empno) as employee_count from dept d left outer join emp e on d.deptno = e.deptno group by d.deptno, d.dname order by 1; DEPTNO DNAME EMPLOYEE_COUNT ---------- -------------- -------------- 10 ACCOUNTING 3 20 RESEARCH 5 30 SALES 6 40 OPERATIONS 0 SQL>
We are forced to include all non-aggregate columns from the select list into the GROUP BY
or we will get an error. In this case we don't really care about including the DNAME
column in the GROUP BY
, but we are forced to do so. Adding extra columns in the GROUP BY
represents an overhead. To get around this, people will sometimes use the MIN
or MAX
functions.
select d.deptno, min(d.dname) as dname, count(e.empno) as employee_count from dept d left outer join emp e on d.deptno = e.deptno group by d.deptno order by 1; DEPTNO DNAME EMPLOYEE_COUNT ---------- -------------- -------------- 10 ACCOUNTING 3 20 RESEARCH 5 30 SALES 6 40 OPERATIONS 0 SQL> select d.deptno, max(d.dname) as dname, count(e.empno) as employee_count from dept d left outer join emp e on d.deptno = e.deptno group by d.deptno order by 1; DEPTNO DNAME EMPLOYEE_COUNT ---------- -------------- -------------- 10 ACCOUNTING 3 20 RESEARCH 5 30 SALES 6 40 OPERATIONS 0 SQL>
This allows us to remove the DNAME
column from the GROUP BY
, but now we have additional work associated with the MIN
and MAX
functions, which is a new overhead.
ANY_VALUE : The Solution
Oracle 21c introduced the ANY_VALUE
aggregate function to solve this problem. We use it in the same way we would use MIN
or MAX
, but it is optimized to reduce the overhead of the aggregate function. Rather than doing any type of comparison, ANY_VALUE
just presents the first non-NULL value it finds.
select d.deptno, any_value(d.dname) as dname, count(e.empno) as employee_count from dept d left outer join emp e on d.deptno = e.deptno group by d.deptno order by 1; DEPTNO DNAME EMPLOYEE_COUNT ---------- -------------- -------------- 10 ACCOUNTING 3 20 RESEARCH 5 30 SALES 6 40 OPERATIONS 0 SQL>
So now we can reduce the overhead of additional columns in the GROUP BY
, without having to add the overhead of the MIN
or MAX
functions.
Considerations
- It's non-deterministic. Don't assume any specific behaviour beyond what is stated.
- You are not going to notice dramatic performance improvements in small data sets like this, but as data sets increase in size, so do the overheads of additional columns in the
GROUP BY
or using theMIN
andMAX
functions. - Using
MIN
andMAX
have meaning. Someone new looking at the code won't know if you chose to useMIN
orMAX
just to remove the column from theGROUP BY
, or if there was a specific reason you chose it. TheANY_VALUE
aggregate function is non-deterministic, so using it is a clear message to any other developer that you are using it to drop the column out of theGROUP BY
. This extra clarity is a good thing from a support perspective. - The
ANY_VALUE
function supportsALL
andDISTINCT
keywords, but they have no function. - NULL values in the expression are ignored, so
ANY_VALUE
will return the first non-NULL value it finds. If all values in the expression are NULL, then the value NULL will be returned. - It supports any data type except XMLTYPE, ANYDATA, LOB, file, or collection data types, which result in an ORA-00932 error.
- Like most functions, the input expression can be a column, constant, bind variable, or an expression made up of them.
For more information see:
Hope this helps. Regards Tim...