Strange behavior in MERGE with sequence NEXTVAL

Today we learned the following: “When using a sequence in a MERGE statement, the nextvalue is generated for all rows both updated and inserted (hence all rows in the source).”
Our initial reaction was that this must be a bug, but is it really?

Background

Today we found a sequence with very large gaps. The gaps corresponded to some timeouts in the application. Our first thought was that a rollback was happening somewhere. But when analysing our code and data from ASH we could not find any traces of rollbacks. When looking in v$sql we found that there where two SQLs using the particular sequence. These two statements where MERGE statements.

Test Case

Note! Big thanks to Lars Johan Ulveseth for the test case.

We created the following test case:

-- drop sequence test_seq;
-- drop table test_t;

CREATE SEQUENCE test_seq 
  START WITH 1 INCREMENT BY 1 CACHE 20 MAXVALUE 99999999999;
 

CREATE TABLE test_t (
   id               NUMBER(18) ,
   text             VARCHAR2(255),
   sys_created_time timestamp,
   sys_updated_time timestamp
);

The sequence before we run the MERGE statement the first time looks like this:

SQL> select sequence_name, cache_size, last_number 
     from user_sequences 
     where sequence_name='TEST_SEQ';

SEQUENCE_NAME        CACHE_SIZE LAST_NUMBER
-------------------- ---------- -----------
TEST_SEQ                     20           1

Then we use the following MERGE statement to the merge in our application:

SQL> MERGE INTO test_t target
USING (
   SELECT
      ROWNUM id,
      'Employee '||to_char(ROWNUM)||dbms_random.value(2, 9) * 1000 text
   FROM dual
   CONNECT BY level <= 100
) source ON ( target.id = source.id )
WHEN MATCHED THEN 
  UPDATE 
     SET   target.text = source.text, sys_updated_time = systimestamp
     WHERE target.text <> source.text
WHEN NOT MATCHED THEN
  INSERT ( id, text, sys_created_time)
    VALUES( test_seq.NEXTVAL, source.text, systimestamp);

100 rows merged.

The first time the MERGE is ran 100 rows are INSERTED (since the TEST_T table is empty).
We then find the row count of the TEST_T table, and take a look at the sequence:

SQL> select count(*) from test_t;

  COUNT(*)
----------
       100

SQL> select sequence_name, cache_size, last_number 
     from user_sequences 
     where sequence_name='TEST_SEQ';

SEQUENCE_NAME        CACHE_SIZE LAST_NUMBER
-------------------- ---------- -----------
TEST_SEQ                     20         101

We see that 100 rows are inserted into the table, and the sequence counter has increased by 100.
Ok. Then lets run the MERGE once more.

SQL> MERGE ... (Same statement as above);

100 rows merged.

If we count the rows in the TEST_T table we’ll see that now more rows are inserted, but the SYS_UPDATED_TIME column is updated. But what about the sequence?

SQL> select count(*) from test_t;

  COUNT(*)
----------
       100

SQL> select sequence_name, cache_size, last_number 
     from user_sequences 
     where sequence_name='TEST_SEQ';

SEQUENCE_NAME        CACHE_SIZE LAST_NUMBER
-------------------- ---------- -----------
TEST_SEQ                     20         201

WHAT??? No inserts are happening in the MERGE, but Oracle is still running NEXTVAL for every row in the source table. This is actually by design, and is documented:

For each row merged by a MERGE statement. The reference to NEXTVAL can appear in the merge_insert_clause or the merge_update_clause or both. The NEXTVALUE value is incremented for each row updated and for each row inserted, even if the sequence number is not actually used in the update or insert operation. If NEXTVAL is specified more than once in any of these locations, then the sequence is incremented once for each row and returns the same value for all occurrences of NEXTVAL for that row.

From Oracle 19c SQL Reference documentation

If you edit the table and use the sequence as the default value of the ID column, and refactor the MERGE statement to not implicitly call the NEXTVAL in the sequence, the same thing happens.

Possible Solution

Already in february 2009 (almost 15 years ago) Bob Jankovsky wrote a blog about this, and suggested a solution returning the nextvalue from a function. The PRAGMA UDF could possible be used in this function, but I’m a little unsure if I would recommend this solution. We’ll probably do some testing with the following two approaches:
1) Returning nextvalue from function (defined with PRAGMA UDF)
2) Rewriting MERGE to running both UPDATE and INSERT

We learn something new every day! This is why we love what we do :-).

Post a Comment

Your email is never published nor shared. Required fields are marked *