Merge Strategy

Overview

When using incremental or backfill modes with a primary_key, Sling needs to merge incoming data with existing records in the target table. The merge strategy controls exactly how this merge operation is performed. This is available in v1.5.7.

By default, Sling uses the optimal merge strategy for each database. However, you can override this behavior using the merge_strategy target option to gain more control over how data is merged.

circle-info

When Does Merge Strategy Apply?

Merge strategy only applies when:

  • Mode is incremental or backfill

  • A primary_key is defined for the stream

For full-refresh, truncate, or snapshot modes, data is inserted directly without merging.

Available Strategies

Sling supports four merge strategies:

Strategy
Description
Behavior

update_insert

Update existing rows, insert new rows

Standard upsert behavior. Matches on primary key, updates existing records, inserts new ones.

delete_insert

Delete matching rows, then insert all

Removes rows with matching primary keys from target, then inserts all source rows. Safe and reliable.

insert

Insert only, skip existing

Only inserts new rows. Does not update existing records (use for append-only scenarios).

update

Update only, skip new

Only updates existing rows. Does not insert new records (use when target should only contain pre-existing keys).

Strategy Comparison

Strategy
Updates Existing
Inserts New
Performance
Use Case

update_insert

Yes

Yes

Best

Default upsert behavior

delete_insert

Yes (via delete+insert)

Yes

Good

When native MERGE unavailable or for simpler logic

insert

No

Yes

Fast

Append-only, idempotent loads

update

Yes

No

Fast

Updating existing records only

Database Support

Not all databases support all merge strategies due to differences in SQL capabilities.

Database
Default
Supported Strategies

PostgreSQL

update_insert

all

Snowflake

update_insert

all

BigQuery

delete_insert

all

SQL Server

update_insert

all

Oracle

update_insert

all

Databricks

update_insert

all

SQLite

update_insert

all

Exasol

update_insert

all

Cloudflare D1

update_insert

all

MotherDuck

update_insert

all

MySQL

delete_insert

insert, delete_insert

MariaDB

delete_insert

insert, delete_insert

Redshift

delete_insert

insert, delete_insert

ClickHouse

delete_insert

insert, delete_insert

DuckDB

delete_insert

insert, update, delete_insert

StarRocks

delete_insert

insert, update_insert, delete_insert

circle-exclamation

Usage

In Replication YAML

Set the merge strategy in target_options:

Per-Stream Override

Override the strategy for specific streams:

CLI Usage

Use JSON format for --tgt-options:

Strategy Details

update_insert (Upsert)

The most common merge pattern. Uses the database's native MERGE statement (or equivalent) to:

  1. Match rows on primary key

  2. Update matched rows with new values

  3. Insert unmatched rows

SQL Pattern (varies by database):

Use when:

  • You need standard upsert behavior

  • Target database supports MERGE or equivalent

  • Performance is important

delete_insert

A two-step approach that's universally supported:

  1. Delete all rows in target that have matching primary keys in source

  2. Insert all rows from source

SQL Pattern:

Use when:

  • Database doesn't support native MERGE

  • You want simpler, more predictable behavior

  • Debugging merge issues

  • Working with databases like MySQL, Redshift, or ClickHouse

insert

Only inserts new rows, ignoring any that already exist:

  1. Insert rows from source that don't exist in target (based on primary key)

SQL Pattern (varies by database):

Use when:

  • Append-only tables (audit logs, event streams)

  • Idempotent data loads where updates shouldn't occur

  • First-wins conflict resolution

update

Only updates existing rows, never inserts new ones:

  1. Update rows in target that have matching primary keys in source

  2. Skip source rows that don't exist in target

SQL Pattern:

Use when:

  • Target table should be the source of truth for which records exist

  • Updating dimension tables with enrichment data

  • Preventing accidental record creation

Customizing Merge Templates

For advanced use cases, you can customize the SQL templates used for merge operations. See Template Overrides for details.

Merge strategy templates are defined in the core section:

Available Template Variables

Variable
Description

{tgt_table}

Target table name

{src_table}

Source (temp) table name

{src_fields}

Source fields with type casting

{insert_fields}

Fields for INSERT clause

{src_insert_fields}

Source fields for VALUES clause

{set_fields}

SET clause for UPDATE

{src_tgt_pk_equal}

Primary key equality condition

{src_pk_fields}

Source primary key fields

{tgt_pk_fields}

Target primary key fields

Disabling a Strategy

To disable a strategy (force users to use a different one), set it to null:

circle-info

Built-in Templates Reference

View the default merge templates in the sling-cli repositoryarrow-up-right. Each database file (e.g., postgres.yaml, snowflake.yaml) shows how merge strategies are implemented.

Best Practices

1. Use the Default When Possible

Each database has a default merge strategy optimized for its capabilities. Only override when you have a specific need.

2. Consider Performance

  • update_insert: Usually fastest when database has native MERGE

  • delete_insert: May be slower for large tables with few updates

  • insert / update: Fastest when you only need one operation

3. Handle Edge Cases

Scenario
Recommended Strategy

First-time load

Any (all behave same for empty table)

Append-only data

insert

Full record replacement

delete_insert

Partial field updates

update_insert

Cross-database compatibility

delete_insert

4. Testing

When changing merge strategies:

  1. Test with a small dataset first

  2. Verify row counts match expectations

  3. Check that updates/inserts behave correctly

  4. Monitor performance differences

Troubleshooting

"merge strategy not supported"

Error: merge strategy 'update_insert' not supported for mysql

Cause: The requested strategy isn't available for the target database.

Solution: Use a supported strategy for that database. Check the Database Support table above.

Data not updating

Cause: Using insert strategy when you need update_insert.

Solution: Change to update_insert or delete_insert.

Unexpected row deletions

Cause: Using delete_insert with partial data loads.

Solution: If source only contains some records, consider update_insert to avoid deleting non-present records.

Performance issues

Cause: delete_insert on large tables with few changes.

Solution:

  • Use update_insert if supported

  • Consider partitioning/filtering to reduce scope

  • Use update if new records shouldn't be inserted

Last updated

Was this helpful?