8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » 21c » Here

Multivalue Function-Based Indexes for JSON_EXISTS in Oracle Database 21c

Multivalue function-based indexes allow us to index multiple scalar values in JSON documents stored in the JSON data type. This can improve the performance of queries containing the JSON_EXISTS condition.

Related articles.

Setup

Create and populate a test table with some JSON data.

-- drop table t1 purge;

create table t1 (
  id         number generated always as identity,
  name       varchar2(20),
  json_data  json,
  constraint t1_pk primary key (id)
);


begin
  insert into t1 (name, json_data)
  values ('book 1', json('{"words":[
                             {"word":"apple","pages":[5, 10, 15, 20, 25, 30]},
                             {"word":"orange","pages":[10, 20, 30, 40, 50, 60]}
                          ]}'));

  insert into t1 (name, json_data)
  values ('book 2', json('{"words":[
                             {"word":"apple","pages":[1, 6, 11, 16, 21, 26]},
                             {"word":"orange","pages":[11, 21, 31, 41, 51, 61]}
                          ]}'));

  commit;
  dbms_stats.gather_table_stats(null, 't1');
end;
/

Each row represents a book, with an array of words. For each word there is an array of pages that word can be found on.

Between each test we will flush the shared pool to make sure we get a hard parse each time. We use SET AUTOTRACE TRACE EXPLAIN to display the execution plan when we run queries, but not clutter the screen with the output rows or the statistics.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

Non-Composite Multivalue Indexes for Arrays

We use the following query to check for a row which has a reference to page 40. It's a bit of a silly query, since we are not checking what word is referenced on that page, but it serves our purpose for this test.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words.pages?(@.number() == 40)' FALSE ON ERROR)=1)

SQL>

There are no indexes present, so we get a full table scan to retrieve the data.

We create an index on the pages array, but this doesn't use the MULTIVALUE keyword. As a result the index is not used and the query still requires a full table scan.

drop index t1_mvi;

create index t1_mvi on t1 t
  (t.json_data.words.pages.number());

We run the query again and it still uses a full table scan.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words.pages?(@.number() == 40)' FALSE ON ERROR)=1)

SQL>

We recreate the index using the MULTIVALUE keyword, so now it can index the array properly. This is a non-composite multivalue index, so we must use a data type conversion method. In this case we use the "number()" method.

drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.number());

We run the same query again, but this time notice the INDEX RANGE SCAN (MULTI VALUE) operation.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words.pages.number()' RETURNING NUMBER ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=40)

SQL>

The index can still be used if it only represents part of our search term. In the following query we are looking for books with the word "apple" on the 25th page.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.number() == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*]?(@.word == "apple" && @.pages.number() == 25)' FALSE ON ERROR)=1)
   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words.pages.number()' RETURNING NUMBER ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=25)

SQL>

Data Type Conversion Methods

We mentioned previously that non-composite multivalue indexes must include a data type conversion method. The list of data type conversion methods are show below, but the full list of item methods and their descriptions can be found here.

When deciding on the conversion methods to use, remember the "Only" methods can be quite restrictive and reduce the chances of the index being used.

Our original multivalue index uses the "number()" method.

drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.number());

This can be used equally well by the following queries, using the "number()" and "numberOnly()" methods, and even when no conversion method is used.

-- number()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

-- numberOnly()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.numberOnly() == 40)');

-- No data type conversion method.
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@ == 40)');

We recreate the index using the "numberOnly()" method, which is more restrictive.

drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.numberOnly());

Now only the query using the "numberOnly()" conversion method can use the index. The other two do a full table scan.

-- numberOnly()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.numberOnly() == 40)');

Choose the data type conversion methods used in the index definition carefully, to make sure they are actually used by your application.

Composite Multivalue Indexes Using JSON_TABLE

We can create more complex composite multivalue indexes using the JSON_TABLE syntax. I find it easier to plan these out using a JSON_TABLE query before building the index itself. In this example we want to index all the pages that reference a specific word, so we use JSON_TABLE to flatten out the nested data.

set autotrace off

select id, name, word, page
from t1,
  (json_table(json_data, '$.words[*]'
     error on error null on empty null on mismatch
     columns (
       word varchar2(10) path '$.word',
       nested
         path '$.pages[*]'
         columns (page number(20) path '$'))));

        ID NAME                 WORD             PAGE
---------- -------------------- ---------- ----------
         1 book 1               apple               5
         1 book 1               apple              10
         1 book 1               apple              15
         1 book 1               apple              20
         1 book 1               apple              25
         1 book 1               apple              30
         1 book 1               orange             10
         1 book 1               orange             20
         1 book 1               orange             30
         1 book 1               orange             40
         1 book 1               orange             50
         1 book 1               orange             60
         2 book 2               apple               1
         2 book 2               apple               6
         2 book 2               apple              11
         2 book 2               apple              16
         2 book 2               apple              21
         2 book 2               apple              26
         2 book 2               orange             11
         2 book 2               orange             21
         2 book 2               orange             31
         2 book 2               orange             41
         2 book 2               orange             51
         2 book 2               orange             61

24 rows selected.

SQL>

We used the ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH error handler because it is mandatory for the multivalue indexes using JSON_TABLE, so it makes sense to test the query using it.

We can now create the multivalue index using this same JSON_TABLE call.

drop index t1_mvi;

create multivalue index t1_mvi on t1
  (json_table(json_data, '$.words[*]'
     error on error null on empty null on mismatch
     columns (
       word varchar2(10) path '$.word',
       nested
         path '$.pages[*]'
         columns (page number(20) path '$'))));

In the following example we search for books with the word "apple" on page 25, and we can see the index is used.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*].word' RETURNING VARCHAR2(10) ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)='apple' AND JSON_QUERY("JSON_DATA" /*+
              LOB_BY_VALUE */  FORMAT OSON , '$.words[*].pages[*]' RETURNING NUMBER(20,0) ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=25)

SQL>

The index can also be used for a query referencing one of the values, provided it is on the leading edge of the index. For example searching for books containing the word apple, with "word" being the first element in the composite index.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple")');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*].word' RETURNING VARCHAR2(10) ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)='apple')

SQL>

Notice we don't use the index if we don't reference the leading edge of the index. In this case we search for books with a reference to page 25, but do not reference to the "word" element. Notice the index is not used.

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.pages.numberOnly() == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words[*]?(@.pages.numberOnly() == 25)' FALSE ON ERROR)=1)

SQL>

Similar to normal composite indexes, we need to be careful when deciding the order of the indexed elements.

The JSON_TABLE approach is very tolerant of data type conversion methods. All the following queries are able to use the index.

-- No data type conversion methods.
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages == 25)');

-- number()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.number() == 25)');

-- numberOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.numberOnly() == 25)');

-- string()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.string() == "apple" && @.pages == 25)');

-- stringOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.stringOnly() == "apple" && @.pages == 25)');

-- stringOnly() and numberOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.stringOnly() == "apple" && @.pages.numberOnly() == 25)');

Considerations

Here are some things to consider when using multivalue function-based indexes.

For more information see:

Hope this helps. Regards Tim...

Back to the Top.