# Backfill

Backfill mode allows you to load historical data within a specific range based on an update key. This is useful when you need to reload data for a particular time period or range of values.

## Basic Backfill

Basic backfill requires:

* A primary key to uniquely identify records
* An update key to determine the range
* A range specification defining the start and end values

**Using CLI Flags**

{% code overflow="wrap" %}

```bash
$ sling run --src-conn MY_SOURCE_DB \
  --src-stream 'my_schema.orders' \
  --tgt-conn MY_TARGET_DB \
  --tgt-object 'target_schema.orders' \
  --mode backfill \
  --primary-key order_id \
  --update-key order_date \
  --range '2023-01-01,2023-12-31'
```

{% endcode %}

**Using Replication Config**

{% code title="replication.yaml" overflow="wrap" %}

```yaml
source: MY_SOURCE_DB
target: MY_TARGET_DB

defaults:
  mode: backfill
  primary_key: [id]
  object: target_schema.{stream_table}

streams:
  my_schema.orders:
    update_key: order_date
    source_options:
      range: '2023-01-01,2023-12-31'

  my_schema.transactions:
    update_key: id # same as primary key
    source_options:
      range: '100000,200000'
```

{% endcode %}

## Backfill Chunking

Backfill mode with chunking allows loading historical data in smaller ranges, optimizing for large datasets. See the [chunking documentation](https://docs.slingdata.io/examples/chunking#chunking-in-backfill-mode) for details.

```yaml
source: postgres
target: oracle

defaults:
  mode: backfill
  object: oracle.{stream_table_lower}
  primary_key: [id]

streams:
  public.orders_mariadb_pg:
    update_key: update_dt
    source_options:
      range: '2018-11-01,2018-12-01'
      chunk_size: 10d  # Process in 10-day chunks

  public.orders_sqlserver_pg:
    update_key: date
    source_options:
      range: '2019-01-01,2019-06-01'
      chunk_size: 2m   # Process in 2-month chunks

  public.orders_snowflake_pg:
    update_key: id  # same as primary key
    source_options:
      range: '1,800'
      chunk_size: 200  # Process in chunks of 200 IDs

  public.large_table_pg:
    update_key: created_at
    source_options:
      range: '2023-01-01,2023-12-31'
      chunk_count: 6  # Split into 6 equal time-based chunks

  public.user_data_pg:
    update_key: user_id
    source_options:
      range: '1000,50000'
      chunk_count: 10  # Split into 10 equal numeric chunks

env:
  SLING_THREADS: 3  # Process 3 streams concurrently
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.slingdata.io/examples/database-to-database/backfill.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
