8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23ai | Misc | PL/SQL | SQL | RAC | WebLogic | Linux
MERGE Statement
The MERGE
statement was introduced in Oracle 9i to conditionally insert or update data depending on its presence, a process also known as an "upsert". The MERGE
statement reduces table scans and can perform the operation in parallel if required.
Related articles.
Syntax
Consider the following example where data from the HR_RECORDS
table is merged into the EMPLOYEES
table.
MERGE INTO employees e USING hr_records h ON (e.id = h.emp_id) WHEN MATCHED THEN UPDATE SET e.address = h.address WHEN NOT MATCHED THEN INSERT (id, address) VALUES (h.emp_id, h.address);
The source can also be a query.
MERGE INTO employees e USING (SELECT * FROM hr_records WHERE start_date > ADD_MONTHS(SYSDATE, -1)) h ON (e.id = h.emp_id) WHEN MATCHED THEN UPDATE SET e.address = h.address WHEN NOT MATCHED THEN INSERT (id, address) VALUES (h.emp_id, h.address);
Performance
The MERGE
statement is optimized for merging sets of data, rather than single rows, as shown in the example below.
Create the following test tables. The source table contains all the rows from the ALL_OBJECTS
view, while the destination table contains approximately half of the rows.
CREATE TABLE source_tab AS SELECT object_id, owner, object_name, object_type FROM all_objects; ALTER TABLE source_tab ADD ( CONSTRAINT source_tab_pk PRIMARY KEY (object_id) ); CREATE TABLE dest_tab AS SELECT object_id, owner, object_name, object_type FROM all_objects WHERE ROWNUM <= 25000; ALTER TABLE dest_tab ADD ( CONSTRAINT dest_tab_pk PRIMARY KEY (object_id) ); EXEC DBMS_STATS.gather_table_stats(USER, 'source_tab', cascade=> TRUE); EXEC DBMS_STATS.gather_table_stats(USER, 'dest_tab', cascade=> TRUE);
The following code compares the performance of four merge operations. The first uses the straight MERGE
statement. The second also uses the MERGE
statement, but in a row-by-row manner. The third performs an update, and conditionally inserts the row if the update touches zero rows. The fourth inserts the row, then performs an update if the insert fails with a duplicate value on index exception.
SET SERVEROUTPUT ON DECLARE TYPE t_tab IS TABLE OF source_tab%ROWTYPE; l_tab t_tab; l_start NUMBER; BEGIN l_start := DBMS_UTILITY.get_time; MERGE INTO dest_tab a USING source_tab b ON (a.object_id = b.object_id) WHEN MATCHED THEN UPDATE SET owner = b.owner, object_name = b.object_name, object_type = b.object_type WHEN NOT MATCHED THEN INSERT (object_id, owner, object_name, object_type) VALUES (b.object_id, b.owner, b.object_name, b.object_type); DBMS_OUTPUT.put_line('MERGE : ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs'); ROLLBACK; l_start := DBMS_UTILITY.get_time; SELECT * BULK COLLECT INTO l_tab FROM source_tab; FOR i IN l_tab.first .. l_tab.last LOOP MERGE INTO dest_tab a USING (SELECT l_tab(i).object_id AS object_id, l_tab(i).owner AS owner, l_tab(i).object_name AS object_name, l_tab(i).object_type AS object_type FROM dual) b ON (a.object_id = b.object_id) WHEN MATCHED THEN UPDATE SET owner = b.owner, object_name = b.object_name, object_type = b.object_type WHEN NOT MATCHED THEN INSERT (object_id, owner, object_name, object_type) VALUES (b.object_id, b.owner, b.object_name, b.object_type); END LOOP; DBMS_OUTPUT.put_line('ROW MERGE : ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs'); ROLLBACK; l_start := DBMS_UTILITY.get_time; SELECT * BULK COLLECT INTO l_tab FROM source_tab; FOR i IN l_tab.first .. l_tab.last LOOP UPDATE dest_tab SET owner = l_tab(i).owner, object_name = l_tab(i).object_name, object_type = l_tab(i).object_type WHERE object_id = l_tab(i).object_id; IF SQL%ROWCOUNT = 0 THEN INSERT INTO dest_tab (object_id, owner, object_name, object_type) VALUES (l_tab(i).object_id, l_tab(i).owner, l_tab(i).object_name, l_tab(i).object_type); END IF; END LOOP; DBMS_OUTPUT.put_line('UPDATE/INSERT: ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs'); ROLLBACK; l_start := DBMS_UTILITY.get_time; SELECT * BULK COLLECT INTO l_tab FROM source_tab; FOR i IN l_tab.first .. l_tab.last LOOP BEGIN INSERT INTO dest_tab (object_id, owner, object_name, object_type) VALUES (l_tab(i).object_id, l_tab(i).owner, l_tab(i).object_name, l_tab(i).object_type); EXCEPTION WHEN DUP_VAL_ON_INDEX THEN UPDATE dest_tab SET owner = l_tab(i).owner, object_name = l_tab(i).object_name, object_type = l_tab(i).object_type WHERE object_id = l_tab(i).object_id; END; END LOOP; DBMS_OUTPUT.put_line('INSERT/UPDATE: ' || (DBMS_UTILITY.get_time - l_start) || ' hsecs'); ROLLBACK; END; / MERGE : 119 hsecs ROW MERGE : 1453 hsecs UPDATE/INSERT: 1280 hsecs INSERT/UPDATE: 2443 hsecs PL/SQL procedure successfully completed. SQL>
The output shows the straight MERGE
statement is an order of magnitude faster than its nearest rival. The update/insert performs almost twice the speed of the insert/update and even out performs the row-by-row MERGE
.
Just comparing the update/insert and the insert/update methods in isolation, we have to remember the comparisons will vary depending on the data in the table. If most of the data will not be present already, the insert/update approach may be better. If most of the data is already present, the update/insert approach will probably be better. If you are not sure, just use merge as it is clearer.
In addition to the straight MERGE
statement being faster, because it is a DML statement it can easily be run in parallel to improve performance further, provided your server can handle the extra load.
For more information see:
Hope this helps. Regards Tim...