# Oracle

Sling supports Change Data Capture from Oracle by connecting to Oracle GoldenGate Data Streams over WebSocket. Each run reads row-level inserts, updates, and deletes from the GoldenGate Data Stream and merges them into the target table. Sling uses GoldenGate Data Streams — not LogMiner — because Data Streams provide a reliable, low-overhead, real-time change feed without requiring direct access to redo logs from the client.

For general CDC concepts, the two-phase process, and all available options, see the [Change Capture overview](/concepts/change-capture.md).

## Prerequisites

### 1. Oracle Database Requirements

Oracle Database 19c or later is required. The following settings must be configured on the source database:

```sql
-- Verify ARCHIVELOG mode is enabled
SELECT LOG_MODE FROM V$DATABASE;  -- Must be ARCHIVELOG

-- Enable ARCHIVELOG mode (if not already enabled)
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Enable force logging (recommended)
ALTER DATABASE FORCE LOGGING;

-- Enable supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Enable GoldenGate replication parameter
ALTER SYSTEM SET enable_goldengate_replication = TRUE SCOPE=BOTH;
```

Verify the configuration:

```sql
SELECT SUPPLEMENTAL_LOG_DATA_MIN, FORCE_LOGGING FROM V$DATABASE;
-- Both should return YES
```

### 2. GoldenGate Requirements

Oracle GoldenGate **23ai or later** is required. The Data Streams feature — which Sling uses to consume changes over WebSocket — was introduced in GoldenGate 23ai.

Before running Sling, you must have:

1. A GoldenGate deployment connected to the source Oracle database.
2. An **Extract** process configured to read the source database's redo logs.
3. A **Data Stream** provisioned and attached to the Extract's trail file.

{% hint style="info" %}
**GoldenGate Free** is available at no cost for Oracle databases under 20 GB. It includes the Data Streams feature and is sufficient for development and small workloads. For production use with larger databases, a commercial GoldenGate license is required.
{% endhint %}

{% hint style="info" %}
Configuring GoldenGate Extract and Data Streams is done through the GoldenGate Admin Console or REST API. Refer to the [Oracle GoldenGate documentation](https://docs.oracle.com/en/middleware/goldengate/) for detailed setup instructions.
{% endhint %}

### 3. GoldenGate Connection Properties

The following properties are added to the Oracle connection definition and are only used when `mode: change-capture` is active:

<table data-full-width="false"><thead><tr><th>Property</th><th>Description</th></tr></thead><tbody><tr><td><code>gg_host</code> <strong>(required)</strong></td><td>GoldenGate REST/WebSocket hostname or IP address.</td></tr><tr><td><code>gg_port</code></td><td>GoldenGate REST/WebSocket port. Default is <code>443</code>.</td></tr><tr><td><code>gg_user</code></td><td>GoldenGate REST API user. Default is <code>oggadmin</code>.</td></tr><tr><td><code>gg_password</code> <strong>(required)</strong></td><td>GoldenGate REST API password.</td></tr><tr><td><code>gg_stream</code> <strong>(required)</strong></td><td>Name of the pre-provisioned Data Stream in GoldenGate.</td></tr><tr><td><code>gg_tls_skip_verify</code></td><td>Skip TLS certificate verification. Default is <code>false</code>. Set to <code>true</code> for self-signed certificates.</td></tr><tr><td><code>gg_tls_ca_cert</code></td><td>Path to a custom CA certificate file for TLS verification.</td></tr></tbody></table>

### 4. User Permissions

The Oracle database user used by Sling needs `SELECT` access on the source tables (used during the initial snapshot):

```sql
GRANT SELECT ON HR.EMPLOYEES TO sling_user;
GRANT SELECT ON HR.DEPARTMENTS TO sling_user;

-- Or grant schema-wide access
GRANT SELECT ANY TABLE TO sling_user;
```

{% hint style="info" %}
The GoldenGate Extract process typically runs under a dedicated GoldenGate admin user with broader privileges. The Sling connection user only needs `SELECT` on the source tables for the initial snapshot — change events are consumed from the GoldenGate Data Stream, not directly from the database.
{% endhint %}

## Quick Start

```bash
# Source Oracle (with GoldenGate CDC properties)
sling conns set MY_ORACLE type=oracle host=oracle.example.com user=sling_user password=secret sid=ORCL \
  gg_host=gg.example.com gg_password=gg_secret gg_stream=my_data_stream

# Target PostgreSQL
sling conns set MY_POSTGRES type=postgres host=pg.example.com user=postgres password=secret database=analytics

# State store (required for CDC)
export SLING_STATE='MY_POSTGRES/sling_state'
```

```yaml
# replication.yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: public.{stream_table}

streams:
  HR.EMPLOYEES:
  HR.DEPARTMENTS:
```

```bash
# First run: performs the initial snapshot
sling run -r replication.yaml

# Subsequent runs: captures and applies changes
sling run -r replication.yaml
```

## How It Works

Unlike MySQL (binlog streaming) and PostgreSQL (WAL logical replication), Oracle CDC uses an external middleware layer:

1. The GoldenGate **Extract** process reads the Oracle redo logs and writes change records to a trail file.
2. The GoldenGate **Data Stream** exposes the trail file as a WebSocket endpoint, streaming row-level change events in real time.
3. On each run, Sling connects to the Data Stream, reads change events from the last saved position, and merges them into the target table.
4. The GoldenGate position is saved in the state store.

Because Sling connects to GoldenGate (not directly to Oracle redo logs), the Oracle database sees **no additional load** from CDC reads. All redo log parsing is handled by the GoldenGate Extract process.

## Examples

### Large Tables with Custom Chunk Size

For very large tables, adjust the chunk size to control memory usage and checkpointing frequency during the initial snapshot.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: analytics.{stream_table}
  change_capture_options:
    snapshot_chunk_size: 50000  # 50k rows per chunk

streams:
  HR.TRANSACTIONS:
    # This 10M-row table will be loaded in ~200 chunks
    # If interrupted, it resumes from the last completed chunk
```

### Time-Bounded Snapshots for Very Large Tables

For tables with hundreds of millions of rows, the initial snapshot can take hours. Use `snapshot_run_duration` to cap how long each run spends on the snapshot. The next run automatically resumes from the last completed chunk.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: analytics.{stream_table}
  change_capture_options:
    snapshot_chunk_size: 50000
    snapshot_run_duration: 30m  # spend at most 30 minutes per run on the snapshot

streams:
  HR.HUGE_EVENTS:
    # 500M rows — will take multiple runs to complete the initial load
    # Each run processes ~30 minutes worth of chunks, then exits cleanly
```

### High-Throughput Workloads

For tables with heavy write activity, increase `run_max_events` and `run_max_duration` so each run captures more changes.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: warehouse.{stream_table}
  change_capture_options:
    run_max_events: 50000  # Process up to 50k events per run
    run_max_duration: 60s  # Wait up to 60 seconds for events

streams:
  HR.CLICK_EVENTS:
  HR.PAGE_VIEWS:
```

### Soft Deletes

Keep deleted rows in the target instead of physically removing them. Deleted rows are marked with `_sling_synced_op = 'D'`. Useful for audit trails or when downstream queries need to detect deletions.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: public.{stream_table}
  change_capture_options:
    soft_delete: true

streams:
  HR.CUSTOMERS:
  HR.SUBSCRIPTIONS:
```

When a row is deleted in Oracle, the target row is preserved with `_sling_synced_op` set to `'D'` and `_sling_synced_at` updated to the current timestamp. A subsequent re-insert of the same primary key restores the row with the appropriate operation type.

### Mixed Streams with Per-Stream Overrides

Different tables can have different CDC options.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: public.{stream_table}

streams:
  # High-volume table: larger batches
  HR.EVENTS:
    change_capture_options:
      run_max_events: 100000
      run_max_duration: 2m

  # Audit table: keep soft deletes
  HR.USER_ACCOUNTS:
    change_capture_options:
      soft_delete: true

  # Standard table: uses defaults
  HR.PRODUCTS:
```

### Replay / Backfill from a Point in Time

If target data becomes inconsistent, you can replay changes from an earlier position. The `replay_from` value is applied exactly once per unique value.

```yaml
source: MY_ORACLE
target: MY_POSTGRES

defaults:
  mode: change-capture
  primary_key: [id]
  object: public.{stream_table}
  change_capture_options:
    replay_from: "2025-06-01T00:00:00Z"  # Re-process all changes since June 1

streams:
  HR.ORDERS:
```

After the replay run completes, remove or change the `replay_from` value. Leaving it unchanged has no effect (it is only applied once).

## Replay Formats

The `replay_from` option accepts these Oracle-specific position formats:

* **RFC 3339 timestamp**: `2025-06-01T00:00:00Z`
* **GoldenGate position**: A GoldenGate position string as reported by the Data Stream (e.g., the position value from the state store)

## Why GoldenGate (Not LogMiner)

Sling uses GoldenGate Data Streams instead of LogMiner for Oracle CDC. Here's why:

* **LogMiner is resource-intensive**: LogMiner parses redo logs via SQL sessions on the source database, consuming CPU and memory on the production system. GoldenGate Extract offloads this work to a separate process.
* **LogMiner has version-specific limitations**: The `CONTINUOUS_MINE` option — which enabled continuous redo log reading — was deprecated in Oracle 19c. Without it, LogMiner requires manual log file management.
* **LogMiner is single-threaded**: It does not scale well for high-volume, low-latency CDC workloads.
* **GoldenGate Data Streams are purpose-built**: They provide a real-time, WebSocket-based change feed with built-in position tracking, designed specifically for CDC consumers.
* **GoldenGate Free is available**: For databases under 20 GB, [GoldenGate Free](https://www.oracle.com/integration/goldengate/free/) can be used at no cost, making it accessible for development and small deployments.

## GoldenGate Setup Overview

Setting up GoldenGate is a one-time process. Here is a condensed checklist:

1. **Install GoldenGate 23ai** (or later). The quickest method is Docker:

   ```bash
   docker login container-registry.oracle.com
   docker pull container-registry.oracle.com/goldengate/goldengate-free:latest

   docker run --name ogg-free -p 443:443 \
     -e OGG_ADMIN_PWD='YourPassword1!' \
     container-registry.oracle.com/goldengate/goldengate-free:latest
   ```
2. **Connect GoldenGate to the source Oracle database** via the GoldenGate Admin Console at `https://localhost:443`.
3. **Create an Extract** process that reads the source database's redo logs.
4. **Create a Data Stream** in the GoldenGate Console under Distribution Service > Data Streams, and point it to the Extract's trail file.
5. **Note the Data Stream name** — this becomes the `gg_stream` connection property in Sling.
6. **Ensure the GoldenGate REST API is accessible** from the machine running Sling (port 443 by default).

For detailed instructions, see:

* [Oracle GoldenGate Free Getting Started Guide](https://docs.oracle.com/en/middleware/goldengate/free/21/uggfe/get-started-goldengate-free.html)
* [Data Streams Components](https://docs.oracle.com/en/database/goldengate/core/26/coredoc/distribute-datastream-componentsofoggds.html)

## Troubleshooting

### "CDC not supported for \<type>"

Ensure your source connection is configured as `type=oracle` and includes the required `gg_host`, `gg_password`, and `gg_stream` properties.

### "could not connect to GoldenGate Data Stream"

Verify `gg_host` and `gg_port` are correct and that the GoldenGate REST API is reachable from the machine running Sling. Check firewall rules and TLS settings. For self-signed certificates, set `gg_tls_skip_verify: true` for testing.

### "GoldenGate Data Stream not found"

The `gg_stream` value must match an existing Data Stream name in GoldenGate. Verify the stream exists in the GoldenGate Admin Console under Distribution Service > Data Streams.

### "authentication failed for GoldenGate"

Check `gg_user` and `gg_password`. The default GoldenGate admin user is `oggadmin`. Ensure the password meets Oracle's complexity requirements.

### Initial snapshot keeps restarting

Ensure `SLING_STATE` is configured. Without state persistence, Sling cannot track that the snapshot completed and will restart it on every run.

### "could not resolve replay\_from position"

The `replay_from` value must be a valid RFC 3339 timestamp or GoldenGate position string. Ensure the position is still available in the GoldenGate trail files.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.slingdata.io/concepts/change-capture/oracle.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
