Target Options

Specification

The following keys are accepted in the target_options object:

add_new_columns

(Only for database target)

Whether to add new columns from the stream that are not found in the target table (when the mode is not full-refresh). Default is true.

adjust_column_type

(Only for database target)

Whether to adjust the column type when needed. Default is false.

batch_limit

(Only for database target, since v1.2.11)

The maximum number of records per transaction batch.

column_casing

Whether to convert the column name casing. This facilitates querying tables in target databases without using quotes. Accepts normalize (normalizes to the target casing, unless the column has varied casing; since v1.4.5), source (keeps the original casing), target (converts casing according to the target database), snake (converts to snake case according to the target database), upper or lower. Default is normalize (before v1.4.5, the default was source). See here for details.

column_typing

This allows precise control over how column types are generated, for example, manipulating string column lengths when transferring between databases with different character encoding requirements. See here for details.

compression

(Only for file target)

The type of compression to use when writing files. Valid inputs are none, auto, gzip, zstd and snappy. Default is auto.

datetime_format

(Only for file target)

The ISO 8601 date format to use when writing date values. Default is auto.

delete_missing

(Only for database target)

Whether to soft or hard delete records in the target table whose primary keys are missing from the stream. Accepts soft or hard values. See here for more details.

direct_insert

(Only for database target)

Whether to bypass temporary table and insert directly into the target table. This can improve performance but has limitations: incremental/backfill modes with primary keys are not supported as they require merge/upsert operations. Default is false.

encoding

(Only for file target)

The text encoding to use when writing files. Allows writing files in different encodings for compatibility with legacy systems or international data requirements. Options include: latin1, latin5, latin9, utf8, utf8_bom, utf16, windows1250, windows1252. Default is utf8.

isolation_level

(Only for database target)

The transaction isolation level to set when inserting/merging into the final table. Accepts: default, read_uncommitted, read_committed, write_committed, repeatable_read, snapshot, serializable, linearizable.

delimiter

(Only for file target)

The delimiter to use when writing tabular files. Default is , (comma).

file_max_bytes

(For file target or temp files for DB target)

The maximum number of bytes to write to a file. 0 means no limit. When a value greater than 0 is specified, the output location will be a folder containing multiple file parts. Default is 50000000. Does not work with the parquet file format (use file_max_rows instead).

file_max_rows

(For file target or temp files for DB target)

The maximum number of rows (usually lines) to write to a file. 0 means no limit. When a value greater than 0 is specified, the output location will be a folder containing multiple file parts. Default is 500000.

format

(Only for file target)

The format of the file(s). Options are: csv, parquet, xlsx, json and jsonlines.

ignore_existing

Skip writing if the target file/table already exists (do not overwrite or modify it). Default is false.

header

(Only for file target)

Whether to write the first line as a header. Default is true.

table_ddl

(Only for database target)

The table DDL to use when writing to a database. Default is auto-generated by Sling. Accepts the {col_types} runtime variable. See here for more details.

table_keys

(Only for database target)

The table keys to define when creating a table, such as partition, cluster, sort, etc. Each entry accepts an array of column names or expressions. See here for more details.

table_tmp

(Only for database target)

The temporary table name that should be used when loading into a database. Default is auto-generated by Sling.

use_bulk

(Only for database target)

Whether to use external bulk loading tools, if installed and available in the PATH environment variable. Sling uses bcp for SQL Server, sqlldr for Oracle and mysql for MySQL. If false, traditional batch INSERT loading is used. Default is true.
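
As a minimal sketch of how several of these keys combine in a replication, consider the two hypothetical configurations below; the connection names (MY_POSTGRES, MY_SNOWFLAKE), stream names and option values are illustrative placeholders, not prescriptions:

# Hypothetical database-to-database replication combining several of the keys above
source: MY_POSTGRES
target: MY_SNOWFLAKE

defaults:
  mode: incremental
  target_options:
    column_casing: snake    # convert column names to snake case for the target
    add_new_columns: true   # add stream columns missing from the target table
    batch_limit: 100000     # cap records per transaction batch (since v1.2.11)
    use_bulk: true          # prefer external bulk loaders when available

streams:
  public.orders:
    object: analytics.orders
    primary_key: [id]
    update_key: updated_at
    target_options:
      delete_missing: soft  # soft-delete records missing from the stream

# Hypothetical database-to-file export using the file-oriented keys
streams:
  public.events:
    object: "file://output/events/"
    target_options:
      format: csv
      compression: gzip      # gzip-compress the written files
      file_max_rows: 500000  # split output into a folder of parts, 500k rows each
      delimiter: "|"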

Encoding Options

When writing files for different systems or regions, you may need to specify a particular text encoding. This is essential for maintaining data integrity when working with international characters or legacy systems that expect specific encodings.

Supported Encodings

  • latin1 (ISO-8859-1): Western European languages

  • latin5 (ISO-8859-5): Cyrillic alphabet (Russian, Bulgarian, etc.)

  • latin9 (ISO-8859-15): Western European with Euro symbol

  • utf8: Unicode UTF-8 (default, most common)

  • utf8_bom: UTF-8 with Byte Order Mark

  • utf16: Unicode UTF-16

  • windows1250: Central European languages (Windows)

  • windows1252: Western European languages (Windows)

Examples

# Writing data to a Latin-1 encoded CSV file for a legacy French system
streams:
  public.customers:
    object: "file://output/customers_french.csv"
    target_options:
      format: csv
      encoding: latin1
      header: true

# Writing data to a Windows-1252 encoded file for compatibility with older Windows systems
streams:
  public.documents:
    object: "file://output/documents_legacy.csv"
    target_options:
      format: csv
      encoding: windows1252
      header: true

# Writing data to a UTF-8 file with BOM for systems that require it
streams:
  public.international_data:
    object: "file://output/international.csv"
    target_options:
      format: csv
      encoding: utf8_bom
      header: true

CLI Usage

# Write data to a Latin-1 encoded CSV file
sling run \
  --src-conn POSTGRES \
  --src-stream public.customers \
  --tgt-object "file://./output/customers_latin1.csv" \
  --tgt-options '{"format": "csv", "encoding": "latin1", "header": true}'

# Write data to a Windows-1252 encoded file for legacy compatibility
sling run \
  --src-conn MYSQL \
  --src-stream inventory.products \
  --tgt-object "file://./exports/products_windows.csv" \
  --tgt-options '{"format": "csv", "encoding": "windows1252", "header": true}'

Important: Choose the appropriate encoding based on your target system's requirements. Using the wrong encoding may cause special characters to appear corrupted or unreadable. UTF-8 is recommended for most modern systems and international data.

Table DDL

The table_ddl option allows you to define the table DDL to use when creating a table. This is useful when you want to define a table with specific properties. Sling also allows the injection of runtime variables, such as {object_name}, which will be replaced with the actual stream table name. See Runtime Variables for more information. The special {col_types} variable will be replaced with the actual column types of the stream table.

Here is an example of using the table_ddl option:

source: my_source_db
target: clickhouse

# apply to all streams
defaults:
  target_options:
    table_ddl: create table {object_name} ({col_types}) engine=MergeTree

streams:
  # use default target_options
  my_schema.my_table:

  # override default for this stream
  my_schema.another_table:
    target_options:
      table_ddl: create table {object_name} ({col_types}) engine=AggregatingMergeTree
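
The same option can also be passed on the command line via --tgt-options, following the CLI pattern shown earlier; the POSTGRES and CLICKHOUSE connection names here are hypothetical:

# Passing table_ddl as a JSON target option on the CLI
sling run \
  --src-conn POSTGRES \
  --src-stream public.my_table \
  --tgt-conn CLICKHOUSE \
  --tgt-object "default.my_table" \
  --tgt-options '{"table_ddl": "create table {object_name} ({col_types}) engine=MergeTree"}'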

Table Keys

Table keys define the keys of the target table. They are useful for presetting keys such as indexes, primary keys, partitions, etc.

Here are the accepted keys:

  • cluster (used in BigQuery and Snowflake)

  • index (used in PostgreSQL, MySQL, Oracle, SQL Server, SQLite and DuckDB)

  • partition (used in PostgreSQL, BigQuery, ClickHouse)

  • primary (used in PostgreSQL, MySQL, Oracle, SQL Server, SQLite)

  • sort (used in Redshift)

  • unique (used in PostgreSQL, MySQL, Oracle, SQL Server, SQLite and DuckDB)

  • aggregate, duplicate, distribution and hash (used in StarRocks)

Here is an example of using the table_keys option:

source: my_source_db
target: my_target_db

# apply to all streams
defaults:
  target_options:
    table_keys:
      primary: [id]

streams:
  # use default target_options
  my_schema.my_table:

  # override default for this stream
  my_schema.another_table:
    target_options:
      table_keys:
        index: [col1, col2]
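
Warehouse-specific keys follow the same shape. Here is a hedged sketch for a BigQuery target using partition and cluster keys; the connection, stream and column names are hypothetical:

# Hypothetical BigQuery target with partition and cluster keys
source: my_source_db
target: my_bigquery

streams:
  my_schema.events:
    object: analytics.events
    target_options:
      table_keys:
        partition: [date(created_at)]      # expression entries are accepted
        cluster: [account_id, event_type]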
