Custom SQL
Sling allows you to use custom DuckDB SQL statements to read from files, giving you more control over the data ingestion process. This is particularly useful when you need to perform transformations or filtering during the read operation.
CLI Flags Examples
Full Refresh Mode
In the example below, when we specify the source connection aws_s3, Sling will auto-inject the necessary secrets for proper auth.
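A minimal sketch of a full-refresh run follows. It assumes the DuckDB query is passed to Sling as the source stream through --src-stream, and that a target connection named postgres exists. The bucket, path, column names, and target table are hypothetical placeholders, not values from this guide.

```shell
# Hypothetical DuckDB query: bucket, path, and column names are placeholders.
QUERY="select id, customer_id, cast(amount as decimal(10,2)) as amount
from read_parquet('s3://my-bucket/orders/*.parquet')
where status = 'complete'"

# Assumption: the query is handed to Sling as the source stream.
# The guard skips the run on machines where sling is not installed.
if command -v sling >/dev/null 2>&1; then
  sling run \
    --src-conn aws_s3 \
    --src-stream "$QUERY" \
    --tgt-conn postgres \
    --tgt-object public.orders \
    --mode full-refresh
fi
```

Keeping the query in a variable makes it easier to review or reuse before handing it to Sling.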
Incremental Mode
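An incremental run might look like the sketch below. It assumes the same --src-stream convention as a full refresh, with the {incremental_where_cond} placeholder left in the query for Sling to fill in at run time. The bucket, path, target table, and the id and updated_at columns are hypothetical.

```shell
# {incremental_where_cond} is a Sling placeholder; it is passed through literally
# and replaced by Sling when the query runs.
QUERY="select *
from read_parquet('s3://my-bucket/events/*.parquet')
where {incremental_where_cond}"

# Assumption: id and updated_at are the primary and update keys of the data.
# The guard skips the run on machines where sling is not installed.
if command -v sling >/dev/null 2>&1; then
  sling run \
    --src-conn aws_s3 \
    --src-stream "$QUERY" \
    --tgt-conn postgres \
    --tgt-object public.events \
    --mode incremental \
    --primary-key id \
    --update-key updated_at
fi
```

The query is wrapped in double quotes so the single-quoted file path survives intact; the braces are not expanded by the shell inside quotes.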
Replication Configuration
You can also use DuckDB SQL in your replication configuration:
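The sketch below writes a small replication file and runs it. The layout is an assumption: it places the DuckDB query under a stream's sql key, and the stream name, bucket, path, columns, and target object are all hypothetical. Check the replication reference for the exact keys your Sling version expects.

```shell
# Write a hypothetical replication file; names and paths are placeholders.
cat > replication.yaml <<'EOF'
source: aws_s3
target: postgres

defaults:
  mode: full-refresh

streams:
  orders_by_day:
    # Assumption: the DuckDB query goes in the stream's sql key.
    sql: |
      select order_date, count(*) as order_count
      from read_parquet('s3://my-bucket/orders/*.parquet')
      group by order_date
    object: public.orders_by_day
EOF

# The guard skips the run on machines where sling is not installed.
if command -v sling >/dev/null 2>&1; then
  sling run -r replication.yaml
fi
```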
Features
SQL Functions: Access to DuckDB's rich SQL function library
File Format Support: Works with CSV, Parquet, JSON, and other formats supported by DuckDB
Aggregations: Perform aggregations and transformations during read
Joins: Join data from multiple files
Filtering: Apply filters to reduce data transfer
Type Casting: Use SQL CAST functions for type conversions
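Several of these features can be combined in one query. The sketch below builds two tiny sample CSV files (made-up data) and previews a query that joins, filters, casts, and aggregates, using the DuckDB CLI if it happens to be installed. This is only a way to try a query locally before giving it to Sling; the file names and columns are hypothetical.

```shell
# Work in a scratch directory with two tiny sample files (made-up data).
WORKDIR="$(mktemp -d)"
cd "$WORKDIR" || exit 1

cat > orders.csv <<'EOF'
order_id,customer_id,amount,status
1,10,19.99,complete
2,10,5.00,cancelled
3,11,42.50,complete
EOF

cat > customers.csv <<'EOF'
id,region
10,EU
11,US
EOF

# Join two files, filter rows, cast a column, and aggregate per region.
QUERY="select c.region, count(*) as order_count, sum(cast(o.amount as decimal(10,2))) as total_amount
from read_csv_auto('orders.csv') o
join read_csv_auto('customers.csv') c on o.customer_id = c.id
where o.status = 'complete'
group by c.region
order by c.region"

# Preview the result with the DuckDB CLI when it is available.
if command -v duckdb >/dev/null 2>&1; then
  duckdb -c "$QUERY"
fi
```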
Notes
Sling will auto-download the DuckDB binary into the Sling home directory. You can specify the desired DuckDB version with the env var DUCKDB_VERSION.
Use DuckDB's read_* functions to specify input files.
For incremental loads, use placeholder variables such as {incremental_where_cond} and {incremental_value}. See here for more details.
File paths support wildcards (*) for matching multiple files. See Reading Multiple Files.
Cloud storage paths (s3://, gs://, etc.) are supported with proper credentials. Make sure to specify the respective source connection; Sling will auto-inject the needed secrets before running the query. If auth is not working, please reach out to us at support@slingdata.io, on Discord, or open a GitHub issue here.