# Structure

Below is the structure of the replication configuration file.

## Root Level

At the root level, we have the following keys:

```yaml
# 'source', 'target' and 'streams' keys are required
source: <connection name>
target: <connection name>

defaults: <replication stream map>

hooks: <replication level hooks map>

streams:
  <stream name>: <replication stream map>

env:
  <variable name>: <variable value>

```
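As a minimal sketch of how the root-level keys fit together, the file below replicates two tables. The connection names (`MY_POSTGRES`, `MY_SNOWFLAKE`), the table names and the `MY_VAR` variable are hypothetical placeholders chosen for illustration.

```yaml
# Hypothetical connection names; replace with your own.
source: MY_POSTGRES
target: MY_SNOWFLAKE

# Stream map used as the default for the streams below.
defaults:
  mode: full-refresh

streams:
  public.customers:
    object: analytics.customers

  public.orders:
    object: analytics.orders
    mode: incremental        # set at stream level instead of defaults.mode
    primary_key: [id]
    update_key: updated_at

env:
  MY_VAR: some_value         # hypothetical variable
```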

## Stream Level

The `<stream name>` identifies the stream to replicate. This can be either a source table name, a file path, or a wildcard pattern using `*`. Wildcards allow matching multiple tables within a schema or multiple files within a directory. For example, `my_schema.*` matches all tables in `my_schema`, while `data/*.csv` matches all CSV files in the `data` directory. See [Tags & Wildcards](https://docs.slingdata.io/concepts/replication/tags-wildcards) for more details.

The `<replication stream map>` is a map object which accepts the following keys:

```yaml
object: <target table or file name>
mode: full-refresh | incremental | truncate | snapshot | backfill
description: <stream description>
disabled: true | false

primary_key: [<array of column names to use as primary key>]
update_key: <column name to use as incremental key>

columns: {<map of column name to data type>}
select: [<array of column names to include or exclude>]
files: [<array of file paths to include or exclude>]
where: <SQL where clause. Also accepts placeholders update_key, incremental_value, and incremental_where_cond>
single: true | false
sql: <source custom SQL query>
transforms: [<array of transforms or map of column name to array of transforms>]
hooks: <stream level hooks map>

source_options: <source options map>
target_options: <target options map>
```
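The fragment below sketches a single wildcard stream using several of these keys. The schema, table and column names are hypothetical, and the `{stream_table}` placeholder in `object` is assumed to resolve to each matched table name; check the runtime variables documentation before relying on it.

```yaml
streams:
  # Wildcard: matches all tables in my_schema.
  my_schema.*:
    # {stream_table} is an assumed runtime placeholder for the matched table name.
    object: target_schema.{stream_table}
    description: Incremental load of every table in my_schema
    mode: incremental
    primary_key: [id]
    update_key: updated_at
    select: ["-internal_notes"]   # '-' prefix excludes the column
```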

## Hooks

The `<replication level hooks map>` and `<stream level hooks map>` accept the keys below. See [Hooks](https://docs.slingdata.io/concepts/hooks) for more details.

```yaml
# replication level, at start and end of replication
start: [<array of hooks>]
end: [<array of hooks>]

# stream level, before and after a stream run
pre: [<array of hooks>]
post: [<array of hooks>]
pre_merge: [<array of hooks>]   # since v1.4.24
post_merge: [<array of hooks>]  # since v1.4.24
```
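To illustrate where each hooks map sits, here is a sketch with replication-level hooks at the root and stream-level hooks inside a stream. The hook body (`type: log` with a `message` field) is an assumption about the hook schema and should be verified against the Hooks page; the stream and object names are hypothetical.

```yaml
# Replication level: runs once at the start and end of the replication.
hooks:
  start:
    - type: log                       # assumed hook type
      message: "replication starting" # assumed field name
  end:
    - type: log
      message: "replication finished"

streams:
  public.orders:
    object: analytics.orders
    # Stream level: runs after this stream's run.
    hooks:
      post:
        - type: log
          message: "orders stream finished"
```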

## Source Options

The `<source options map>` accepts the keys below. See [Source Options](https://docs.slingdata.io/concepts/replication/source-options) for more details.

```yaml
compression: auto | none | zip | gzip | snappy | zstd
chunk_size: <backfill chunk size>
datetime_format: auto | <ISO 8601 date format>
delimiter: <character to use as flat file delimiter>
encoding: latin1 | latin5 | latin9 | utf8 | utf8_bom | utf16 | windows1250 | windows1252
empty_as_null: true | false
escape: <character to use as flat file quote escape>
flatten: true | false
format: csv | xml | xlsx | json | parquet | avro | sas7bdat | jsonlines | arrow | delta | raw | geojson
header: true | false
jmespath: <JMESPath expression>
jq: <JQ expression>
limit: <integer>
null_if: <null_if expression>
range: <backfill range expression>
sheet: <excel sheet/range expression>
skip_blank_lines: true | false
```
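For example, a pipe-delimited CSV source could be configured roughly as follows. This is a fragment rather than a complete replication file, and the file path and target table name are hypothetical.

```yaml
streams:
  data/*.csv:
    object: my_schema.raw_data
    single: true              # treat the wildcard as a single stream
    source_options:
      format: csv
      delimiter: "|"
      header: true
      encoding: utf8
      empty_as_null: true
      skip_blank_lines: true
```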

## Target Options

The `<target options map>` accepts the keys below. See [Target Options](https://docs.slingdata.io/concepts/replication/target-options) for more details.

```yaml
add_new_columns: true | false
adjust_column_type: true | false
batch_limit: <integer>
column_casing: source | target | snake | upper | lower
column_typing: {<map of column type generation configuration>}
compression: auto | none | gzip | snappy | zstd
datetime_format: auto | <ISO 8601 date format>
delimiter: <character to use as flat file delimiter>
delete_missing: hard | soft
direct_insert: true | false
encoding: latin1 | latin5 | latin9 | utf8 | utf8_bom | utf16 | windows1250 | windows1252
isolation_level: default | read_uncommitted | read_committed | write_committed | repeatable_read | snapshot | serializable | linearizable
file_max_bytes: <integer>
file_max_rows: <integer>
format: csv | xlsx | json | parquet | raw
header: true | false
ignore_existing: true | false
merge_strategy: update_insert | delete_insert | insert | update
table_ddl: <ddl sql query>
table_keys: {<map of table key type to array of column names>}
table_tmp: <name of table>
use_bulk: true | false
```
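A sketch of target options set under `defaults` and on one stream is shown below. It is a fragment rather than a complete replication file; the stream and object names and the `batch_limit` value are hypothetical.

```yaml
defaults:
  target_options:
    add_new_columns: true
    column_casing: snake
    use_bulk: true

streams:
  public.events:
    object: analytics.events
    target_options:
      batch_limit: 50000      # illustrative value only
```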

## Replication Specification

The table below describes the accepted keys.

<table data-full-width="false"><thead><tr><th width="328.950030469226">Replication Config Key</th><th>Description</th></tr></thead><tbody><tr><td><code>source</code></td><td>The source connection (name, conn string or URL).</td></tr><tr><td><code>target</code></td><td>The target connection (name, conn string or URL).</td></tr><tr><td><code>hooks</code></td><td>The replication level hooks to apply (at start &#x26; end of replication). See <a href="../hooks">here</a> for details.</td></tr><tr><td><code>streams.&#x3C;key></code></td><td>The source table (schema.table) or local / cloud file path. Use <code>file://</code> for local paths.</td></tr><tr><td><p><code>streams.&#x3C;key>.object</code></p><p>or <code>defaults.object</code></p></td><td>The target table (schema.table) or local / cloud file path. Use <code>file://</code> for local paths.</td></tr><tr><td><p><code>streams.&#x3C;key>.columns</code></p><p>or <code>defaults.columns</code></p></td><td>The column types map. See <a href="columns">here</a> for details.</td></tr><tr><td><p><code>streams.&#x3C;key>.transforms</code></p><p>or <code>defaults.transforms</code></p></td><td>The transforms to apply. See <a href="transforms">here</a> for details.</td></tr><tr><td><p><code>streams.&#x3C;key>.hooks</code></p><p>or <code>defaults.hooks</code></p></td><td>The stream level hooks to apply (pre- &#x26; post-stream run). See <a href="../hooks">here</a> for details.</td></tr><tr><td><p><code>streams.&#x3C;key>.mode</code></p><p>or <code>defaults.mode</code></p></td><td>The target load <a href="modes">mode</a> to use: <code>incremental</code>, <code>truncate</code>, <code>full-refresh</code>, <code>backfill</code> or <code>snapshot</code>. Default is <code>full-refresh</code>.</td></tr><tr><td><code>streams.&#x3C;key>.select</code> or <code>defaults.select</code></td><td>Select or exclude specific columns from the source stream. Use a <code>-</code> prefix to exclude.</td></tr><tr><td><code>streams.&#x3C;key>.single</code> or <code>defaults.single</code></td><td>When using a wildcard (<code>*</code>) in the stream name, treat it as a single stream (don't expand into many streams).</td></tr><tr><td><code>streams.&#x3C;key>.sql</code> or <code>defaults.sql</code></td><td>The custom SQL query to use. Accepts <code>file://path/to.query.sql</code> as well.</td></tr><tr><td><p><code>streams.&#x3C;key>.primary_key</code></p><p>or <code>defaults.primary_key</code></p></td><td>The column(s) to use as primary key. For a composite key, use an array.</td></tr><tr><td><p><code>streams.&#x3C;key>.update_key</code></p><p>or <code>defaults.update_key</code></p></td><td>The column to use as update key (for <code>incremental</code> mode).</td></tr><tr><td><p><code>streams.&#x3C;key>.source_options</code></p><p>or <code>defaults.source_options</code></p></td><td>Options to further configure the source. See <a href="source-options">here</a> for details.</td></tr><tr><td><p><code>streams.&#x3C;key>.target_options</code></p><p>or <code>defaults.target_options</code></p></td><td>Options to further configure the target. See <a href="target-options">here</a> for details.</td></tr><tr><td><code>env</code></td><td>Environment variables to use for replication. See <a href="../../sling-cli/variables">here</a> for details.</td></tr></tbody></table>
