Source Options
Specification
The following keys are accepted as source options:
compression
(Only for file source)
The type of compression to use when reading files. Valid inputs are none, auto, gzip, zstd and snappy. Default is auto.
chunk_size
(Only for database source)
The chunk size for backfill processing. This tells Sling to split a stream into multiple smaller streams. Accepts values such as 12h, 7d or 1m. See here for more details.
datetime_format
The ISO 8601 date format to use when reading date values. Default is auto.
delimiter
(Only for file source)
The delimiter to use when parsing tabular files. Default is auto.
escape
(Only for file source - since v1.2.4)
The escape character to use when parsing tabular files. Default is " (double quote).
empty_as_null
Whether empty fields should be treated as NULL. Default is true.
flatten
(Only for file source)
Whether to flatten a semi-structured file source format (JSON, XML).
format
(Only for file source)
The format of the file(s). Options are: csv, parquet, xlsx, avro, json, jsonlines, sas7bdat and xml.
header
(Only for file source)
Whether to consider the first line as the header. Default is true.
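As an illustration of how the file-related keys above could be combined, here is a minimal sketch of a replication file. It assumes the usual Sling layout where source_options sits under a stream; the connection names (LOCAL, MY_POSTGRES), the file path and the target table are hypothetical placeholders, not values from this page.

```yaml
# Hypothetical replication file; connection names, path and object are placeholders.
source: LOCAL
target: MY_POSTGRES

streams:
  "file:///tmp/orders/*.csv.gz":
    object: public.orders
    mode: full-refresh
    source_options:
      format: csv          # one of the formats listed above
      compression: gzip    # default is auto
      delimiter: "|"       # default is auto
      header: true         # first line holds the column names
      empty_as_null: true  # empty fields are read as NULL
```

Leaving compression and delimiter at auto is usually the simpler choice; setting them explicitly, as sketched here, only makes sense when detection is known to guess wrong for a given set of files.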
jmespath
(Only for file and NoSQL database source)
A JMESPath expression used to filter or extract nested JSON data. See https://jmespath.org/ for more details.
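A sketch of how jmespath and flatten might be used together on a JSON file source. The connection names, file path, target table and the shape of the JSON document (a top-level data.items array with a status field) are all assumptions made for illustration.

```yaml
# Hypothetical example; assumes a JSON file shaped like {"data": {"items": [...]}}.
source: LOCAL
target: MY_POSTGRES

streams:
  "file:///tmp/export/items.json":
    object: public.items
    source_options:
      format: json
      flatten: true
      # JMESPath filter: keep only array entries whose status equals 'active'
      jmespath: "data.items[?status=='active']"
```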
limit
The maximum number of rows to pull from the source.
null_if
A case-sensitive value that should be treated as NULL when encountered. Default is NULL.
sheet
(Only for Excel source files)
The name of the sheet to use as a data source, for example Sheet1. Default is the first sheet. You can also specify the range (Sheet2!B:H, Sheet3!B1:H70).
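For instance, reading a fixed cell range from an Excel workbook could look like the sketch below. The sheet value reuses the Sheet3!B1:H70 form shown above; the connection names, file path and target table are hypothetical.

```yaml
# Hypothetical example; only the sheet value format comes from the text above.
source: LOCAL
target: MY_POSTGRES

streams:
  "file:///tmp/reports/sales.xlsx":
    object: public.sales_report
    source_options:
      format: xlsx
      sheet: "Sheet3!B1:H70"  # sheet name plus cell range; quoted because of the "!"
```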
range
The range of values to use for backfill mode, given as a start and an end value separated by a single comma. Example: 2021-01-01,2021-02-01 or 1,10000.
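To show how range and chunk_size might work together, here is a sketch of a backfill over one month, split into weekly chunks. It assumes that backfill mode is driven by an update_key column; the connection names, table names and column names are hypothetical.

```yaml
# Hypothetical backfill sketch; connections, tables and columns are placeholders.
source: MY_SOURCE_DB
target: MY_TARGET_DB

streams:
  sales.orders:
    object: analytics.orders
    mode: backfill
    primary_key: [order_id]
    update_key: order_date                 # assumed column the range applies to
    source_options:
      range: "2021-01-01,2021-02-01"       # start and end, single comma
      chunk_size: 7d                       # process the range in 7-day pieces
```

A numeric range such as 1,10000 would follow the same pattern, with the update_key pointing at a numeric column instead of a date.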
skip_blank_lines
Whether blank lines should be skipped when encountered. Default is false.