Runtime Variables

Learn how to use Runtime & Environment Variables with Sling

Runtime Variables

Runtime variables allow dynamic configuration: placeholders wrapped in curly braces are replaced at runtime with the corresponding values. For example, you could name your target object {target_schema}.{stream_schema}_{stream_table}, and at runtime each placeholder is filled in from the variables listed below.

  • run_timestamp: The run timestamp of the task (2006_01_02_150405)

  • source_account: the name of the account of the source connection (when source conn is AZURE)

  • source_bucket: the name of the bucket of the source connection (when source conn is GCS or S3)

  • source_container: the name of the container of the source connection (when source conn is AZURE)

  • source_name: the name of the source connection

  • stream_file_folder: the file parent folder name of the stream (when source is a file system)

  • stream_file_name: the file name of the stream (when source is a file system)

  • stream_file_ext: the file extension of the stream (when source is a file system)

  • stream_file_path: the file path of the stream (when source is a file system)

  • stream_name: the name of the stream

  • stream_schema / stream_schema_lower / stream_schema_upper: the schema name of the source stream (when source is a database)

  • stream_table / stream_table_lower / stream_table_upper: the table name of the source stream (when source is a database)

  • stream_full_name: the fully qualified table name of the source stream (when source is a database)

  • target_account: the name of the account of the target connection (when target is AZURE)

  • target_bucket: the name of the bucket of the target connection (when target is GCS or S3)

  • target_container: the name of the container of the target connection (when target is AZURE)

  • target_name: the name of the target connection

  • target_schema: the default target schema defined in connection (when target is a database)

  • object_schema: the target object table schema (when target is a database)

  • object_table: the target object table name (when target is a database)

  • object_full_name: the fully qualified table name of the target object (when target is a database)

  • object_name: the target object name
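
As a minimal sketch of how these variables might be combined, the replication below uses the object pattern mentioned earlier. The connection names MY_POSTGRES and MY_SNOWFLAKE and the stream names are hypothetical, and the resolved name in the comment simply follows from the definitions above rather than from an actual run.

```yaml
# Hypothetical connection names; substitute your own.
source: MY_POSTGRES
target: MY_SNOWFLAKE

defaults:
  mode: full-refresh
  # Quoted so that YAML treats the braces as plain text
  object: '{target_schema}.{stream_schema}_{stream_table}'

streams:
  # Assuming the target connection's default schema is "analytics",
  # this stream would be written to analytics.public_orders
  public.orders:
  public.customers:
```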

Timestamp Patterns

  • YYYY: The 4 digit year of the run timestamp of the task

  • YY: The 2 digit year of the run timestamp of the task

  • MMM: The abbreviation of the month of the run timestamp of the task

  • MM: The 2 digit month of the run timestamp of the task

  • DD: The 2 digit day of the run timestamp of the task

  • HH: The 2 digit 24-hour of the run timestamp of the task

  • hh: The 2 digit 12-hour of the run timestamp of the task

  • mm: The 2 digit minute of the run timestamp of the task

  • ss: The 2 digit second of the run timestamp of the task

  • ISO8601: The ISO-8601 format of the run timestamp of the task (2006-01-02T15:04:05Z)
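
To illustrate, here is a sketch of a file target whose object path is built from timestamp patterns. It reuses the MYSQL and S3_ZONE_A connection names from the examples on this page; the folder layout and stream name are assumptions, and the resolved path in the comment is derived from the pattern definitions above.

```yaml
source: MYSQL
target: S3_ZONE_A

defaults:
  # For a run at 2006-01-02 15:04:05 and the stream below, this would
  # resolve to exports/my_table/2006/01/02/my_table_150405.csv
  object: 'exports/{stream_table}/{YYYY}/{MM}/{DD}/{stream_table}_{HH}{mm}{ss}.csv'
  target_options:
    format: csv

streams:
  my_schema.my_table:
```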

Partition Patterns

This only applies when writing parquet files. You must specify the update_key along with a part_ variable in the object name, for example: object: my/folder/{part_year_month}/{part_day}.

  • part_year: The 4 digit year partition value of the update_key.

  • part_month: The 2 digit month partition value of the update_key.

  • part_year_month: Combination of the 4 digit year and the 2 digit month partition values of the update_key (e.g. 2024-11 as one value).

  • part_day: The 2 digit day partition value of the update_key.

  • part_week: The ISO-8601 2 digit week partition value of the update_key.

  • part_hour: The 2 digit hour partition value of the update_key.

  • part_minute: The 2 digit minute partition value of the update_key.
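
A replication using partition patterns might look like the sketch below. The created_at column is a hypothetical timestamp column, and the connection names are reused from the examples on this page. Based on the definitions above, rows whose created_at falls on 2024-11-05 would be expected to land under a path like my/folder/2024-11/05.

```yaml
source: MYSQL
target: S3_ZONE_A

defaults:
  target_options:
    format: parquet   # partition patterns only apply to parquet output

streams:
  my_schema.my_table:
    # created_at is a hypothetical timestamp column in the source table
    update_key: created_at
    object: 'my/folder/{part_year_month}/{part_day}'
```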

Environment Variables

Sling also lets you pass in environment variables to further customize configurations in a scalable manner. You can then reuse them in various places in your config files.

Definition

A convenient way to define global variables is in the env.yaml file. You can also simply define them in the environment, the traditional way.

env.yaml
connections:
  MYSQL:
    type: mysql
  S3_ZONE_A:
    type: s3

# this sets environment variables in sling process
variables:
  path_prefix: /my/path/prefix
  schema_name: main

Replication

replication.yaml
source: MYSQL
target: S3_ZONE_A

defaults:
  # {path_prefix} here is filled in from env var
  object: '{path_prefix}/{stream_schema}/{stream_table}/{YYYY}_{MM}_{DD}.parquet'
  target_options:
    format: parquet

streams:

  # all tables in schema
  my_schema.*:
    # overrides default object
    object: '{stream_schema}/{stream_table}/{YYYY}_{MM}_{DD}/'
    target_options:
      file_max_rows: 400000 # will split files into folder
  
  mysql.my_table:
    sql: |
      select * from mysql.my_table
      where date between '{start_date}' and '{end_date}'

env:
  # ${path_prefix} pulls from environment variables in sling process or env
  path_prefix: '${path_prefix}' # From env.yaml (not in Environment)
  start_date: '${START_DATE}'   # From Environment
  end_date: '${END_DATE}'       # From Environment

The replication above shows the full use of Environment Variables as well as Runtime Variables (such as stream_schema, stream_table, YYYY, MM and DD).

Global Environment Variables

Sling utilizes global environment variables to further configure the load behavior. You can simply define them in your environment, the env.yaml file or the env section in a task or replication. See Global Environment Variables for more details.
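
As a rough sketch of the last option, the env section of a replication could set global variables as shown here. SLING_THREADS and SLING_LOADED_AT_COLUMN are assumed to be among the documented global variables, and the values are illustrative only; check the Global Environment Variables reference for the actual names and accepted values.

```yaml
source: MYSQL
target: S3_ZONE_A

streams:
  my_schema.my_table:
    object: 'my_schema/my_table.parquet'

env:
  # Assumed global variable names and values; verify against the reference
  SLING_THREADS: 3
  SLING_LOADED_AT_COLUMN: true
```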