MySQL / MariaDB

CDC source setup for MySQL and MariaDB

Sling supports Change Data Capture from MySQL and MariaDB by reading the binary log (binlog). Each run reads row-level inserts, updates, and deletes from the binlog and merges them into the target table.

For general CDC concepts, the two-phase process, and all available options, see the Change Capture overview.

Prerequisites

Ensure your MySQL source has binary logging enabled with row-based format:

-- Verify binlog is enabled and using ROW format
SHOW VARIABLES LIKE 'log_bin';          -- Must be ON
SHOW VARIABLES LIKE 'binlog_format';    -- Must be ROW
SHOW VARIABLES LIKE 'binlog_row_image'; -- Should be FULL

The MySQL user must have the REPLICATION SLAVE and REPLICATION CLIENT privileges:

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'sling_user'@'%';
GRANT SELECT ON my_database.* TO 'sling_user'@'%';
circle-info

MariaDB uses the same binlog mechanism as MySQL. The same prerequisites apply.

Quick Start

# Source MySQL
sling conns set MY_MYSQL type=mysql host=mysql.example.com user=sling_user password=secret database=my_database

# Target PostgreSQL
sling conns set MY_POSTGRES type=postgres host=pg.example.com user=postgres password=secret database=analytics

# State store (required for CDC)
export SLING_STATE='MY_POSTGRES/sling_state'

Examples

Large Tables with Custom Chunk Size

For very large tables, adjust the chunk size to control memory usage and checkpointing frequency during the initial snapshot.

Time-Bounded Snapshots for Very Large Tables

For tables with hundreds of millions of rows, the initial snapshot can take hours. Use snapshot_run_duration to cap how long each run spends on the snapshot. The next run automatically resumes from the last completed chunk.

High-Throughput Workloads

For tables with heavy write activity, increase run_max_events and run_max_duration so each run captures more changes.

Soft Deletes

Keep deleted rows in the target instead of physically removing them. Deleted rows are marked with _sling_synced_op = 'D'. Useful for audit trails or when downstream queries need to detect deletions.

When a row is deleted in MySQL, the target row is preserved with _sling_synced_op set to 'D' and _sling_synced_at updated to the current timestamp. A subsequent re-insert of the same primary key restores the row with the appropriate operation type.

Mixed Streams with Per-Stream Overrides

Different tables can have different CDC options.

Replay / Backfill from a Point in Time

If target data becomes inconsistent, you can replay changes from an earlier position. The replay_from value is applied exactly once per unique value.

After the replay run completes, remove or change the replay_from value. Leaving it unchanged has no effect (it is only applied once).

Replay Formats

The replay_from option accepts these MySQL-specific position formats:

  • RFC 3339 timestamp: 2025-06-01T00:00:00Z

  • Binlog file:offset: mysql-bin.000003:12345

  • GTID set: cc51b500-0cda-11f1-9e7c-0242ac110002:1-100

Binlog Retention

Ensure binlog retention is long enough to cover the maximum gap between CDC runs. If MySQL purges binlogs that Sling hasn't read yet, the run will fail.

Troubleshooting

"CDC not supported for <type>"

Ensure your source connection is configured as type=mysql or type=mariadb.

Initial snapshot keeps restarting

Ensure SLING_STATE is configured. Without state persistence, Sling cannot track that the snapshot completed and will restart it on every run.

Missing changes after snapshot

This should not happen — Sling captures the binlog position before the snapshot begins. If you suspect missing data, check that the binlog retention period is long enough to cover the snapshot duration:

"could not resolve replay_from position"

The replay_from value must be in one of the formats listed in the Replay Formats section above. Ensure the value is valid and the binlog position still exists on the server.

Binlog purged before CDC could read

If MySQL purges binlogs that Sling hasn't read yet, the run will fail. Increase binlog retention to be longer than the maximum gap between CDC runs:

Last updated

Was this helpful?