8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
Efficient Function Calls From SQL
This article presents a number of solutions for making calls to PL/SQL functions from SQL more efficient.
- The Problem
- Scalar Subquery Caching
- DETERMINISTIC Hint
- Cross-Session PL/SQL Function Result Cache (11g)
- Manual Caching Using PL/SQL Collections
- Manual Caching Using Contexts
- Scalar Subquery Caching (Revisited)
- Functions in the WHERE Clause
- The Read Consistency Issue
Related articles.
- Scalar Subquery Caching
- The DETERMINISTIC Hint
- Scalar Subquery Caching vs PL/SQL Function Result Cache
- PL/SQL Function Result Cache
- Pipelined Table Functions
- Function-Based Indexes
- WITH Clause Enhancements in Oracle Database 12c Release 1 (12cR1)
The Problem
Sometimes we require functions in the select list of a query. These may be built in functions, or functions we've coded ourselves. The default action is to call the function for each row returned by the query. If the function is deterministic, for the same input parameter signature it gives the same output, and the number of unique parameter signatures used is low compared to the number of rows processed, this can represent a lot of wasted effort.
The example below creates a test table with 10 rows, of 3 distinct values. The function contains a call to DBMS_LOCK.SLEEP
, to represent a workload that requires a considerable amount of internal processing, taking approximately 1 second per call. The function is then used in the select list of a query from the test table.
CONN / AS SYSDBA GRANT EXECUTE ON dbms_lock TO test; CONN test/test DROP TABLE func_test; CREATE TABLE func_test ( id NUMBER ); INSERT INTO func_test SELECT CASE WHEN level = 10 THEN 3 WHEN MOD(level, 2) = 0 THEN 2 ELSE 1 END FROM dual CONNECT BY level <= 10; COMMIT; CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER AS BEGIN DBMS_LOCK.sleep(1); RETURN p_in; END; / SET TIMING ON SELECT slow_function(id) FROM func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:10.14 SQL>
The elapsed time clearly shows the function was called 10 times, once for each row, even though there were only 3 distinct input parameters.
The first thing you should do it attempt to remove any unnecessary function calls. It sounds obvious, but lazy programming and a lack of SQL knowledge may lead you into using PL/SQL where there is a SQL alternative. SQL keeps evolving and features such as analytic functions allow you to do incredible things without resorting to PL/SQL. Assuming this is not possible, the methods shown below should help reduce the overhead associated with the function calls.
Remember, the sweet spot for any of the caching solutions is where the number of unique parameter signatures are low compared to the total number of rows processed and any dependent data is not too volatile. If functions are dependent on tables and views containing volatile data, the contents of the cache will quickly become out of date.
Scalar Subquery Caching
Rewriting the function call into a scalar subquery allows Oracle to use scalar subquery caching to optimize the number of function calls needed to complete the query. Oracle builds an in-memory hash table to cache the results of scalar subqueries.
SET TIMING ON SELECT (SELECT slow_function(id) FROM dual) FROM func_test; (SELECTSLOW_FUNCTION(ID)FROMDUAL) --------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.03 SQL>
The elapsed time shows Oracle was able to cache the result of the subquery and reuse it for subsequent rows where necessary.
Caching is only available for the lifetime of the query. Subsequent queries have no access to the cached values of previous runs.
The size of the hash table is quite important, as it does limit the extent to which Oracle can cache scalar subqueries. In Oracle 10g and 11g, the hash table contains 255 buckets. If there are more than 255 distinct values, the 256th and onward values are not cached in the hash table. Likewise, if there are hash collisions, the subqueries resulting in the collisions are not placed in the hash table. Even in this situation you can still get performance improvements, because Oracle always keeps the latest subquery value, in addition to those placed in the hash table, so provided the data set is ordered appropriately, you can still get great performance when there are more than 255 distinct values, or hash collisions.
We will mention scalar subquery caching again later.
DETERMINISTIC Hint
The DETERMINSTIC
hint has been available since Oracle 8i, where it was first introduced to mark functions as deterministic to allow them to be used in function-based indexes, but it is only in Oracle 10gR2 onward that it has some affect on how the function is cached in SQL.
In the following example, labelling the function as deterministic does improve performance, but the caching is limited to a single fetch, so it is affected by the array size. In this example, the array size is varied using the SQL*Plus "SET ARRAYSIZE n
" command.
CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER DETERMINISTIC AS BEGIN DBMS_LOCK.sleep(1); RETURN p_in; END; / SET TIMING ON SET ARRAYSIZE 15 SELECT slow_function(id) FROM func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:04.04 SQL> SET TIMING ON SET ARRAYSIZE 2 SELECT slow_function(id) FROM func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:10.01 SQL>
The difference in array size produced drastically different performance, showing that caching is only available for the lifetime of the fetch. Subsequent queries (or fetches) have no access to the cached values of previous runs.
Cross-Session PL/SQL Function Result Cache (11g)
Oracle 11g introduced two new caching mechanisms:
- Cross-Session PL/SQL Function Result Cache : Used for caching the results of function calls.
- Query Result Cache : Used for caching the whole result set produced by a query.
We can use the first mechanism to cache the results of our slow function, allowing us to remove the need to rerun it for the same parameter signature.
CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER RESULT_CACHE AS BEGIN DBMS_LOCK.sleep(1); RETURN p_in; END; / SET TIMING ON SELECT slow_function(id) FROM func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.09 SQL>
The advantage of this method is the cached information can be reused by any session and dependencies are managed automatically. If we run the query again we get even better performance because we can used the cached values without calling the function at all.
SET TIMING ON SELECT slow_function(id) FROM func_test; SLOW_FUNCTION(ID) ----------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.02 SQL>
Manual Caching Using PL/SQL Collections
In versions prior to 11g, we can manually cache the values of function calls in a PL/SQL collection. The following code builds a caching layer in front of the calls to the slow function.
-- Recreate the original function to remove any amendments made in previous examples. CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER AS BEGIN DBMS_LOCK.sleep(1); RETURN p_in; END; / CREATE OR REPLACE PACKAGE cached_lookup_api AS FUNCTION get_cached_value (p_id IN NUMBER) RETURN NUMBER; PROCEDURE clear_cache; END cached_lookup_api; / CREATE OR REPLACE PACKAGE BODY cached_lookup_api AS TYPE t_tab IS TABLE OF NUMBER INDEX BY BINARY_INTEGER; g_tab t_tab; g_last_use DATE := SYSDATE; g_max_cache_age NUMBER := 10/(24*60); -- 10 minutes -- ----------------------------------------------------------------- FUNCTION get_cached_value (p_id IN NUMBER) RETURN NUMBER AS l_value NUMBER; BEGIN IF (SYSDATE - g_last_use) > g_max_cache_age THEN -- Older than 10 minutes. Delete cache. g_last_use := SYSDATE; clear_cache; END IF; BEGIN l_value := g_tab(p_id); EXCEPTION WHEN NO_DATA_FOUND THEN -- Call function and cache data. l_value := slow_function(p_id); g_tab(p_id) := l_value; END; RETURN l_value; END get_cached_value; -- ----------------------------------------------------------------- -- ----------------------------------------------------------------- PROCEDURE clear_cache AS BEGIN g_tab.delete; END; -- ----------------------------------------------------------------- END cached_lookup_api; /
If we query the caching layer, it only calls the slow function if a suitable value isn't already cached.
SET TIMING ON SELECT cached_lookup_api.get_cached_value(id) FROM func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.10 SQL>
Executing the query again in the same session allows it to reuse the cached values without calling the function at all.
SET TIMING ON SELECT cached_lookup_api.get_cached_value(id) FROM func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.01 SQL>
There are a few gotchas associated with this method.
- There is no dependency management in the cache. This implementation deletes the cache if it is older than 10 minutes. You could improve the granularity, but it would require extra work.
- If the session was part of a connection pool, you would have potential for information bleeding through between multiple calls. This could be solved with a package reset.
- There is no automatic mechanism to manage the size of the cache. You could include this.
- The cache is not shared between database sessions.
Manual Caching Using Contexts
The manual caching approach can also be performed using a context, rather than a PL/SQL collection. The advantage of this is the context can be defined as globally accessible, making the data shareable between sessions. The following package is a rewrite of the previous example, modified to use a context. Notice the first line is the creation of the context itself.
-- Recreate the original function to remove any amendments made in previous examples. CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER AS BEGIN DBMS_LOCK.sleep(1); RETURN p_in; END; / CREATE OR REPLACE CONTEXT cache_context USING cached_lookup_api ACCESSED GLOBALLY; CREATE OR REPLACE PACKAGE cached_lookup_api AS FUNCTION get_cached_value (p_id IN NUMBER) RETURN NUMBER; PROCEDURE clear_cache; END cached_lookup_api; / CREATE OR REPLACE PACKAGE BODY cached_lookup_api AS g_last_use DATE := SYSDATE; g_max_cache_age NUMBER := 10/(24*60); -- 10 minutes g_context_name VARCHAR2(20) := 'cache_context'; -- ----------------------------------------------------------------- FUNCTION get_cached_value (p_id IN NUMBER) RETURN NUMBER AS l_value NUMBER; BEGIN IF (SYSDATE - g_last_use) > g_max_cache_age THEN -- Older than 10 minutes. Delete cache. g_last_use := SYSDATE; clear_cache; END IF; l_value := SYS_CONTEXT(g_context_name, p_id); IF l_value IS NULL THEN l_value := slow_function(p_id); DBMS_SESSION.set_context(g_context_name, p_id, l_value); END IF; RETURN l_value; END get_cached_value; -- ----------------------------------------------------------------- -- ----------------------------------------------------------------- PROCEDURE clear_cache AS BEGIN DBMS_SESSION.clear_all_context(g_context_name); END; -- ----------------------------------------------------------------- END cached_lookup_api; /
Like the example in the previous section, if we query the caching layer, it only calls the slow function if a suitable value isn't already cached.
SET TIMING ON SELECT cached_lookup_api.get_cached_value(id) FROM func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:03.07 SQL>
Executing the query again in the same session allows it to reuse the cached values without calling the function at all.
SET TIMING ON SELECT cached_lookup_api.get_cached_value(id) FROM func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.01 SQL>
Creating the context as ACCESSED GLOBALLY
allows the cache to be shared between sessions, as shown in the example below, which starts by connecting to a new session.
-- Create new session. CONN test/test SET TIMING ON SELECT cached_lookup_api.get_cached_value(id) FROM func_test; CACHED_LOOKUP_API.GET_CACHED_VALUE(ID) -------------------------------------- 1 2 1 2 1 2 1 2 1 3 10 rows selected. Elapsed: 00:00:00.02 SQL>
Manually caching data in contexts shares most of the same gotchas as caching in PL/SQL collections.
Scalar Subquery Caching (Revisited)
We have discussed a number of caching mechanisms in addition to scalar subquery caching, but do these alternative caching methods negate the need for scalar subquery caching? The answer to that is no, because scalar subquery caching is the only mechanism that effectively reduces the number of context switches between SQL and PL/SQL. To show this we will build a new test table with 100,000 rows of the same value.
DROP TABLE t2; CREATE TABLE t2 ( id NUMBER ); INSERT /*+ APPEND */ INTO t2 SELECT 1 FROM dual CONNECT BY level <= 100000; COMMIT;
Recreate the slow function using the result cache, but without the sleep.
CREATE OR REPLACE FUNCTION slow_function (p_in IN NUMBER) RETURN NUMBER RESULT_CACHE AS BEGIN --DBMS_LOCK.sleep(1); RETURN p_in; END; /
Compare the CPU usage of a regular query using the result cache, with that of a scalar subquery using the result cache.
SET SERVEROUTPUT ON DECLARE l_start NUMBER; BEGIN l_start := DBMS_UTILITY.get_cpu_time; FOR cur_rec IN (SELECT slow_function(id) FROM t2) LOOP NULL; END LOOP; DBMS_OUTPUT.put_line('Regular Query (SELECT List): ' || (DBMS_UTILITY.get_cpu_time - l_start) || ' hsecs CPU Time'); l_start := DBMS_UTILITY.get_cpu_time; FOR cur_rec IN (SELECT (SELECT slow_function(id) FROM dual) FROM t2) LOOP NULL; END LOOP; DBMS_OUTPUT.put_line('Scalar Subquery (SELECT List): ' || (DBMS_UTILITY.get_cpu_time - l_start) || ' hsecs CPU Time'); END; / Regular Query (SELECT List): 42 hsecs CPU Time Scalar Subquery (SELECT List): 6 hsecs CPU Time PL/SQL procedure successfully completed. SQL>
This difference in CPU usage is also visible when using scalar subqueries in the WHERE
clause.
SET SERVEROUTPUT ON DECLARE l_start NUMBER; BEGIN l_start := DBMS_UTILITY.get_cpu_time; FOR cur_rec IN (SELECT 1 FROM t2 WHERE id = slow_function(id)) LOOP NULL; END LOOP; DBMS_OUTPUT.put_line('Regular Query (WHERE): ' || (DBMS_UTILITY.get_cpu_time - l_start) || ' hsecs CPU Time'); l_start := DBMS_UTILITY.get_cpu_time; FOR cur_rec IN (SELECT 1 FROM t2 WHERE id = (SELECT slow_function(id) FROM dual)) LOOP NULL; END LOOP; DBMS_OUTPUT.put_line('Scalar Subquery (WHERE): ' || (DBMS_UTILITY.get_cpu_time - l_start) || ' hsecs CPU Time'); END; / Regular Query (WHERE): 49 hsecs CPU Time Scalar Subquery (WHERE): 13 hsecs CPU Time PL/SQL procedure successfully completed. SQL>
What causes this difference in CPU usage? With the exception of scalar subquery caching, all the other caching methods discussed here still require calls to the PL/SQL function, which result in context switches between SQL and PL/SQL. These context switches account for the extra CPU load.
So even when you are using alternative caching features to improve performance of function calls between multiple execution, or between sessions, you should still use scalar subquery caching to reduce context switching.
Functions in the WHERE Clause
The caching methods discussed previously are also appropriate for the WHERE
clause, especially scalar subquery caching to reduce context switching.
Applying a function on a column in the WHERE
clause of a query can result in poor performance, because it prevents the optimizer from using regular indexes on that column. Assuming the query can't be rewritten to remove the need for the function call, one option is to use a function based index.
You should also consider virtual columns, introduced in Oracle 11g.
The Read Consistency Issue
Oracle's read consistency model uses undo to maintain a consistent view of the data as it was at the point in time when a SQL statement was issued. This same read consistency model applies to SQL statements issued directly, or from within PL/SQL. This has an interesting effect on SQL statements containing functions, which internally contain SQL statements. Each SQL statement in the function is read consistent with the point in time it is issued, not with the point in time the function is called and not with the point in time the outer SQL statement was issued. In effect, calling functions in SQL statements, that internally issue SQL breaks the Oracle read consistency model.
Bryn Llewellyn came to one of my presentations on efficient function calls from SQL and we had a conversation about this issue. During the conversation I asked the question, "Do you care?", which might seem strange, but we have to keep this issue in context.
- Many of the functions we are likely to call in SQL statements do not contains SQL, so they have zero impact.
- Those functions that do contain SQL, are often querying static reference data. In these cases, the chances of this causing a problem is zero.
- Even if the data the function is querying is non-static, is the data volatile enough to cause a problem during the lifespan of the query?
- Is the accuracy of the response from the query so vital that a potential inconsistency raises a concern?
- Is this application database-agnostic? In which case, it is probably going to run against several read-consistency models anyway, giving an unpredictable result in each.
You have to make a judgement for yourself how concerned you are about this issue on a case-by-case basis. If you really want to be certain you have a read-consistent result, your best option is to use flashback query in the following manner.
EXEC DBMS_FLASHBACK.enable_at_time(SYSTIMESTAMP); SELECT slow_function(id) FROM func_test; EXEC DBMS_FLASHBACK.disable;
Using this method, any SQL issued between the ENABLE_AT_TIME
and the DISABLE
calls is consistent with the same point in time.
In many cases, I think this is an academic exercise, not a real concern, but it is important you understand the issue or you may fall victim to it in the specific cases where it does matter.
For more information see:
- Scalar Subquery Caching
- The DETERMINISTIC Hint
- PL/SQL Function Result Cache
- Scalar Subquery Caching vs PL/SQL Function Result Cache
- Pipelined Table Functions
- Function-Based Indexes
- Cross-Session PL/SQL Function Result Cache in Oracle Database 11g Release 1
- Query Result Cache in Oracle Database 11g Release 1
- Scalar Subquery Expressions
- Function Based Indexes
- Virtual Columns in Oracle Database 11g Release 1
- WITH Clause Enhancements in Oracle Database 12c Release 1 (12cR1)
Hope this helps. Regards Tim...