Constraints
Learn how to use constraints
Constraints let you evaluate every value of a given column and decide how failures are handled. They are written in a SQL-like syntax and are separated from the column type by a | symbol. The advantage of constraints is that data quality is enforced while the data is being ingested (at runtime), rather than much later in the pipeline.
source: source_name
target: target_name

streams:
  my_stream:
    columns:
      # id values cannot be null
      id: bigint | value is not null
      # status values can only be active or inactive
      status: string | value in ('active', 'inactive')
      # col_1 value length can only be 6, 7 or 8
      col_1: string(8) | value_len > 5 and value_len <= 8

Handling failures
When a value fails a constraint, Sling reads the environment variable SLING_ON_CONSTRAINT_FAILURE to decide how to handle it.
The allowed values for SLING_ON_CONSTRAINT_FAILURE are:
warn: This is the default. When using the Sling Platform, this will emit a warning status, which can notify you (Email, Slack, etc.).
skip: Skip the record (do not ingest into target).
abort: Will immediately abort the run, and fail/error. When using the Sling Platform, this can notify you (Email, Slack, etc.).
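As a rough sketch of how the failure mode might be pinned for a single replication, the fragment below sets SLING_ON_CONSTRAINT_FAILURE under an env key in the replication file. The text above only states that Sling reads the environment variable; the env key is an assumption here, so exporting the variable in the shell before the run is the safer fallback if it is not honored.

```yaml
source: source_name
target: target_name

# Assumption: a replication-level `env` key is used to set Sling
# environment variables for this run.
env:
  SLING_ON_CONSTRAINT_FAILURE: skip

streams:
  my_stream:
    columns:
      # with `skip`, records with a null id are not ingested into the target
      id: bigint | value is not null
```

With warn (the default) the same null record would still be ingested and only flagged, so skip or abort is the choice when bad records must not reach the target.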
Supported Operators
is null - Check if value is null
is not null - Check if value is not null
== - Equal to
!= or <> - Not equal to
> - Greater than
>= - Greater than or equal to
< - Less than
<= - Less than or equal to
~ - Matches regex pattern
!~ - Does not match regex pattern
in - Value matches any in list
not in - Value does not match any in list
and - Combine multiple conditions (all must be true)
or - Combine multiple conditions (at least one must be true)
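The operators above could be combined along the following lines. This is an illustrative sketch only: the column names are hypothetical, writing the regex pattern as a single-quoted string is an assumption, and and/or are not mixed in one expression because their precedence is not stated here.

```yaml
streams:
  my_stream:
    columns:
      # country_code must be exactly three uppercase letters (regex match)
      country_code: string | value ~ '^[A-Z]{3}$'
      # quantity must fall within a closed range
      quantity: integer | value >= 0 and value <= 1000
      # tier must not be one of the retired values
      tier: string | value not in ('legacy', 'deprecated')
      # comment may be missing, otherwise it must be short
      comment: string | value is null or value_len <= 280
```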
Special Variables
value - Record value for respective column
value_len - Length of record value for respective column
Using CLI Flags
Using YAML
Using the defaults and streams keys, you can specify different columns/constraints for each stream.
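A minimal sketch of that layout might look like the following. The stream names orders and customers are hypothetical, and whether a stream's columns block merges with or fully replaces the one under defaults is an assumption to verify before relying on it.

```yaml
source: source_name
target: target_name

defaults:
  columns:
    # intended to apply to every stream that does not define its own columns
    id: bigint | value is not null

streams:
  # hypothetical stream that relies on the defaults
  orders:

  # hypothetical stream with its own column constraints
  customers:
    columns:
      id: bigint | value is not null
      status: string | value in ('active', 'inactive')
```

Repeating the id constraint under customers keeps the example correct under either merge behavior.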