
Running Sling

Last updated 1 month ago

The sling run command is the primary mechanism for executing data movement operations in Sling CLI. It provides a flexible interface for transferring data between various sources and targets, with support for different replication modes and configuration options.

There are two primary ways to configure and run Sling:

  • CLI Flags: quick ad-hoc runs from your terminal shell or a script.

  • Replication: streams defined in a YAML or JSON file.
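The second approach, a replication file, can be sketched as follows. This is only a minimal sketch under stated assumptions: the path /tmp/replication.yaml is hypothetical, the connection names reuse the placeholders from this page, and the top-level keys (source, target, defaults, streams) and the -r flag are assumptions about Sling's replication format that should be checked against the Replication documentation.

```shell
# Write a minimal replication file.
# The path is hypothetical and the keys are assumptions, not taken from this page.
cat > /tmp/replication.yaml <<'EOF'
source: MY_SOURCE_DB
target: MY_TARGET_DB

defaults:
  mode: full-refresh
  object: 'target_schema.{stream_table}'

streams:
  'source_schema.*':
EOF

# Run the replication only if the sling binary is available on this machine.
if command -v sling >/dev/null 2>&1; then
  sling run -r /tmp/replication.yaml
else
  echo "sling not installed; wrote /tmp/replication.yaml only"
fi
```

Keeping the stream definitions in a file makes a run repeatable and easy to review, whereas CLI flags suit one-off transfers.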


Further examples of how to use Sling are available in the Examples section: File to Database, Database to Database, and Database to File.

CLI Flags Overview

For quickly running ad-hoc operations from the terminal, using CLI flags is often best. Here are some examples:

# Load all tables in a schema with 3 threads
$ export SLING_THREADS=3
$ sling run \
    --src-conn MY_SOURCE_DB \
    --src-stream 'source_schema.*' \
    --tgt-conn MY_TARGET_DB \
    --tgt-object 'target_schema.{stream_table}' \
    --mode full-refresh

# Pipe in a JSON file and flatten the nested keys into their own columns
$ cat /tmp/my_file.json | sling run \
    --src-options '{"flatten": "true"}' \
    --tgt-conn MY_TARGET_DB \
    --tgt-object 'target_schema.target_table' \
    --mode full-refresh

# Read folder containing many CSV files
$ sling run \
    --src-stream 'file:///tmp/my_csv_folder/' \
    --tgt-conn MY_TARGET_DB --tgt-object 'target_schema.target_table' \
    --mode full-refresh

# Load only latest data from one source DB to another.
$ sling run \
    --src-conn MY_SOURCE_DB \
    --src-stream 'source_schema.source_table' \
    --tgt-conn MY_TARGET_DB \
    --tgt-object 'target_schema.target_table' \
    --mode incremental \
    --primary-key 'id' --update-key 'last_modified_dt' 

# Export / Backup database tables to JSON files
$ sling run \
    --src-conn MY_SOURCE_DB \
    --src-stream 'source_schema.source_table' \
    --tgt-conn MY_S3_BUCKET \
    --tgt-object 's3://my-bucket/my_json_folder/' \
    --tgt-options '{"file_max_rows": 100000, "format": "jsonlines"}'
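The --select and --stdout flags listed under Interface Specifications do not appear in the examples above. The following is a minimal sketch that combines them, assuming the same placeholder connection; the column names (id, name) and the output path are hypothetical. The arguments are collected first so that the command can be printed as a dry run when sling is not installed.

```shell
# Stream one table to STDOUT, keeping only two columns.
# Column names and the output path are hypothetical.
set -- run \
    --src-conn MY_SOURCE_DB \
    --src-stream 'source_schema.source_table' \
    --select 'id,name' \
    --stdout

if command -v sling >/dev/null 2>&1; then
  # Redirect the streamed rows into a local file.
  sling "$@" > /tmp/source_table.csv
else
  # Dry run: print the command that would be executed.
  echo "sling $*"
fi
```

Per the flag description, prefixing a column name with - in --select excludes that column instead of selecting it.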

Interface Specifications

| CLI Flag | Description |
| --- | --- |
| --src-conn | The source database connection (name, connection string or URL). |
| --tgt-conn | The target database connection (name, connection string or URL). |
| --src-stream | The source table (schema.table) or local / cloud file path. Can also be the path of a SQL file or in-line SQL text to use as the query. Use file:// for local paths. |
| --tgt-object | The target table (schema.table) or local / cloud file path. Use file:// for local paths. See Runtime Variables for details on runtime variables. |
| --mode | The target load mode to use: incremental, truncate, full-refresh, backfill or snapshot. Default is full-refresh. |
| --primary-key | The column(s) to use as the primary key (for incremental mode). For a composite key, use a comma-delimited string. |
| --update-key | The column to use as the update key (for incremental mode). |
| --src-options | In-line options to further configure the source (JSON or YAML). See Source Options for details. |
| --tgt-options | In-line options to further configure the target (JSON or YAML). See Target Options for details. |
| --stdout | Output the stream to standard output (STDOUT). |
| --select | Select or exclude specific columns from the source stream (comma-separated). Use a - prefix to exclude. |
| --transforms | An object/map, or array/list, of built-in transforms to apply to records (JSON or YAML). |
| --columns | An object/map to specify the type that a column should be cast as (JSON or YAML). |
| --streams | Only run specific streams from a replication (comma-separated). See the Replications documentation for details. |

Features

  • Flexible Data Sources: Supports databases, files, cloud storage, and standard input

  • Multiple Load Modes: Includes full-refresh, incremental, truncate, snapshot, and backfill modes

  • Data Transformations: Allows column selection, type casting, and custom transformations

  • Progress Tracking: Monitors row counts, bytes transferred, and constraint violations

  • Error Handling: Provides detailed error reporting and validation

The sling run command is designed to be both powerful and flexible, accommodating various data movement scenarios while maintaining ease of use through consistent parameter patterns and comprehensive documentation.
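As an illustration of the type-casting feature, the --columns flag can be passed a JSON map of column names to types. This is a sketch under stated assumptions: the column names (id, amount) are hypothetical, and the type names bigint and decimal are assumed to be accepted by Sling, which should be confirmed in the Columns documentation. The source folder and target names reuse the placeholders from the CLI examples.

```shell
# Map of column name -> type to cast to.
# Column names are hypothetical; the type names are assumptions.
COLUMN_TYPES='{"id": "bigint", "amount": "decimal"}'

set -- run \
    --src-stream 'file:///tmp/my_csv_folder/' \
    --tgt-conn MY_TARGET_DB \
    --tgt-object 'target_schema.target_table' \
    --mode full-refresh \
    --columns "$COLUMN_TYPES"

if command -v sling >/dev/null 2>&1; then
  sling "$@"
else
  # Dry run: print the command that would be executed.
  echo "sling $*"
fi
```

Holding the JSON in a single-quoted variable keeps the inner double quotes intact when the value is passed to the command.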
