Structure
This document covers the fundamental structure of a Sling API specification file.
Root Level
At the root level, we have the following keys:
# 'name', 'description' and 'endpoints' keys are required
name: <API display name>
description: <API description>
queues: [<array of queue names>]
defaults: <endpoint configuration map>
authentication: <authentication configuration map>
endpoints:
  <endpoint name>: <endpoint configuration map>

Endpoint Level
The <endpoint name> identifies the API endpoint to interact with. This can be any descriptive name for the endpoint.
The <endpoint configuration map> is a map object which accepts the following keys:
name: <endpoint name>
description: <endpoint description>
docs: <documentation URL>
disabled: true | false
state: {<map of state variables>}
sync: [<array of state variable names to persist>]
request: <request configuration map>
pagination: <pagination configuration map>
response: <response configuration map>
iterate: <iteration configuration map>
setup: [<array of setup calls>]
teardown: [<array of teardown calls>]
depends_on: [<array of upstream endpoint names>]
overrides: <stream processor configuration overrides>

Request Configuration
The <request configuration map> accepts the keys below:
url: <endpoint URL>
method: GET | POST | PUT | PATCH | DELETE | HEAD | OPTIONS | TRACE | CONNECT
timeout: <timeout in seconds>
headers: {<map of header name to value>}
parameters: {<map of parameter name to value>}
payload: <request body data>
rate: <maximum requests per second>
concurrency: <maximum concurrent requests>

Pagination Configuration
The <pagination configuration map> accepts the keys below:
next_state: {<map of state variables to update for next page>}
stop_condition: <expression to determine when to stop paginating>

Response Configuration
The <response configuration map> accepts the keys below:
format: json | csv | xml
records: <records extraction configuration map>
processors: [<array of processor configurations>]
rules: [<array of response rule configurations>]

Records Configuration
The <records extraction configuration map> accepts the keys below:
jmespath: <JMESPath expression to extract records>
primary_key: [<array of column names for primary key>]
update_key: <column name for incremental updates>
limit: <maximum number of records to process>
duplicate_tolerance: <bloom filter settings: "capacity,error_rate">

💡 Primary Key Priority: When using API specs in replications, the primary key defined in the replication stream configuration takes priority over the primary key defined in the API spec. If no primary key is specified in the stream, the primary key from the spec will be used.
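As a rough illustration of how the records keys combine, the sketch below extracts rows from a hypothetical JSON response whose records sit under a top-level data array. The column names (id, updated_at), the limit value, and the response shape are assumptions chosen for illustration, not taken from a real API:

```yaml
response:
  format: json
  records:
    # Assumes the response body looks like {"data": [{...}, {...}]}
    jmespath: "data[]"
    # Hypothetical column names
    primary_key: ["id"]
    update_key: "updated_at"
    # Cap the number of records processed, e.g. while developing a spec
    limit: 1000
```

Given the priority rule above, a primary key set on the replication stream would take precedence over the ["id"] declared here.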
Processor Configuration
Each processor in the processors array accepts:
aggregation: none | maximum | minimum | collect | first | last
expression: <transformation expression>
output: <output destination (record field, state variable, queue, environment variable, or store)>
# Examples:
# - record.field_name (add/update field in record)
# - record (replace entire record)
# - state.variable_name (store in state, requires aggregation)
# - queue.queue_name (send to queue)
# - env.VAR_NAME (set environment variable, requires aggregation)
# - context.store.key_name (store in replication store, requires aggregation)

Response Rules
Each rule in the rules array accepts:
action: retry | continue | stop | fail
condition: <boolean expression>
max_attempts: <maximum retry attempts>
backoff: none | constant | linear | exponential | jitter
backoff_base: <base duration in seconds for backoff>
message: <custom message for rule execution>

Authentication Configuration
The <authentication configuration map> accepts the keys below:
type: none | static | basic | oauth2 | aws-sigv4 | hmac | sequence
expires: <re-authentication interval in seconds>
# Static header authentication
headers: {<map of header name to value>}
# Basic authentication
username: <username>
password: <password>
# OAuth2 authentication
flow: client_credentials | authorization_code | device_code
authentication_url: <OAuth token URL>
authorization_url: <OAuth authorization URL>
device_auth_url: <OAuth device auth URL>
client_id: <OAuth client ID>
client_secret: <OAuth client secret>
scopes: [<array of OAuth scopes>]
redirect_uri: <OAuth redirect URI>
# AWS Signature V4 authentication
aws_service: <AWS service name>
aws_access_key_id: <AWS access key>
aws_secret_access_key: <AWS secret key>
aws_session_token: <AWS session token>
aws_region: <AWS region>
aws_profile: <AWS profile>
# HMAC authentication
algorithm: sha256 | sha512
secret: <HMAC secret key>
signing_string: <template for string to sign>
request_headers: {<map of header name to value template>}
nonce_length: <random nonce length in bytes>
# Sequence authentication (custom calls)
sequence: [<array of authentication calls>]

Iteration Configuration
The <iteration configuration map> accepts the keys below:
over: <expression that evaluates to an array or queue>
into: <state variable name to store current iteration value>
if: <condition expression to evaluate before iteration>
concurrency: <maximum parallel iterations>

Endpoint Dependencies
The depends_on field explicitly declares that an endpoint depends on other endpoints completing first. This is useful for controlling execution order.
endpoints:
  # First endpoint: Collects customer IDs
  customers:
    request:
      url: "{state.base_url}/customers"
    response:
      processors:
        - expression: "record.id"
          output: "queue.customer_ids"
  # Second endpoint: Depends on customers endpoint
  customer_orders:
    depends_on: ["customers"] # Wait for customers to complete first
    iterate:
      over: "queue.customer_ids"
      into: "state.customer_id"
    request:
      url: "{state.base_url}/customers/{state.customer_id}/orders"

📝 Note: When using queues with iterate.over, Sling automatically infers dependencies. The depends_on field is optional but can make dependencies explicit.
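The explicit form matters most when no queue links the two endpoints, since in that case there is presumably nothing for Sling to infer. The sketch below is an illustration under that assumption; the endpoint names (refresh_cache, reports) and URLs are hypothetical:

```yaml
endpoints:
  # Hypothetical endpoint that triggers a server-side refresh
  refresh_cache:
    request:
      url: "{state.base_url}/cache/refresh"
      method: POST

  # No queue connects the two endpoints, so the ordering is declared explicitly
  reports:
    depends_on: ["refresh_cache"]
    request:
      url: "{state.base_url}/reports"
```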
Stream Overrides
The overrides field allows you to configure how the endpoint's data is processed when writing to a destination. This is used during replication to control stream-specific behavior.
Basic Overrides
Control the replication mode for specific endpoints:
endpoints:
  # Full refresh for dimension tables
  customers:
    request:
      url: "{state.base_url}/customers"
    response:
      records:
        jmespath: "data[]"
        primary_key: ["id"]
    overrides:
      mode: full-refresh # Always replace all data
  # Incremental for fact tables
  transactions:
    request:
      url: "{state.base_url}/transactions"
      parameters:
        updated_since: "{state.last_sync_timestamp}"
    response:
      records:
        jmespath: "data[]"
        primary_key: ["id"]
        update_key: "updated_at"
    overrides:
      mode: incremental # Only new/updated records. User would have to manually drop/truncate the table.

Available modes:
full-refresh: Replace all data (truncate and load)
incremental: Append new records only
snapshot: Create versioned snapshots
backfill: Historical data loading
Hooks Override
Add post-processing hooks for specific endpoints. This is powerful for merge operations, data cleanup, or custom transformations:
endpoints:
  customer_balance_transaction:
    request:
      url: "{state.base_url}/customers/{state.customer_id}/balance_transactions"
    iterate:
      over: "queue.customer_ids"
      into: "state.customer_id"
    response:
      records:
        jmespath: "data[]"
        primary_key: ["id"]
    overrides:
      mode: full-refresh
      hooks:
        post:
          # Check that parent customer data exists
          - type: check
            check: '!is_null(runs["customer"]) && run.total_rows > 0'
            failure_message: no customer records to merge with
            on_failure: break
          # Merge balance transactions into customer table
          - type: query
            id: customer-update-merge
            connection: '{target.name}'
            operation: merge
            on_failure: abort
            params:
              strategy: update
              source_table: '{run.object.full_name}'
              target_table: '{runs["customer"].object.full_name}'
              primary_key: [id]
          # Clean up temporary staging table
          - type: query
            connection: '{target.name}'
            operation: drop_table
            params:
              table: '{run.object.full_name}'

Hook Types Available:
check: Validate conditions before proceeding
query: Execute SQL operations (merge, drop, etc.)
log: Log messages for debugging
http: Call external APIs
command: Run shell commands
See Hooks documentation for complete details.
💡 Tip: Overrides are most useful when extracting large datasets that need special handling during the write phase, or when implementing complex merge/upsert logic.
State vs. Sync
Understanding the difference between state and sync:
State Variables
The state field defines variables available during endpoint execution. State is:
Temporary: Exists only during current run
Per-endpoint: Each endpoint has its own state
Per-iteration: Each iteration (if using iterate) gets its own state copy
endpoints:
  daily_data:
    state:
      start_date: "{date_format(date_add(now(), -1, 'day'), '%Y-%m-%d')}"
      end_date: "{date_format(now(), '%Y-%m-%d')}"
      page_size: 100
    request:
      url: "{state.base_url}/data"
      parameters:
        from: "{state.start_date}"
        to: "{state.end_date}"
        limit: "{state.page_size}"

Sync Variables
The sync field lists which state variables should persist between runs. This enables incremental data loading:
endpoints:
  incremental_data:
    state:
      # Initialize from previous run, or default to 7 days ago
      last_sync_timestamp: >
        {
          coalesce(
            sync.last_sync_timestamp,
            date_format(date_add(now(), -7, 'day'), '%Y-%m-%dT%H:%M:%SZ')
          )
        }
    # Persist this variable for next run
    sync: [last_sync_timestamp]
    request:
      url: "{state.base_url}/data"
      parameters:
        updated_since: "{state.last_sync_timestamp}"
    response:
      processors:
        # Track the maximum timestamp seen
        - expression: "record.updated_at"
          output: "state.last_sync_timestamp"
          aggregation: maximum

Key Differences:
| Aspect | State | Sync |
|---|---|---|
| Scope | Current run only | Persisted between runs |
| Purpose | Runtime variables | Incremental tracking |
| Declaration | state: {key: value} | sync: [key] |
| Access | state.key | sync.key (on load) → state.key (during run) |
| Use Case | Configuration, calculations | Timestamps, cursors, offsets |
Context Variables
Context variables are read-only runtime values passed from the replication configuration to the API spec. They enable endpoints to support both backfill and incremental modes with a single configuration.
Available Context Variables:
| Variable | Type | Description | Source |
|---|---|---|---|
| context.mode | string | Replication mode | Replication config mode field |
| context.limit | integer | Maximum records to fetch | Replication config source_options.limit |
| context.range_start | string | Backfill range start | Replication config source_options.range (first value) |
| context.range_end | string | Backfill range end | Replication config source_options.range (second value) |
Context vs. State vs. Sync:
| Aspect | Context | State | Sync |
|---|---|---|---|
| Source | Replication config | API spec | Persisted storage |
| Scope | Current run | Current run | Between runs |
| Modifiable | No (read-only) | Yes | Yes (via state) |
Common Pattern: Backfill with Incremental Fallback
This pattern supports backfill (with range), incremental (with sync state), and first run (with default):
endpoints:
  daily_events:
    sync: [last_date] # Persist for incremental runs
    iterate:
      # Priority: context.range_start → sync.last_date → default
      over: >
        range(
          coalesce(context.range_start, sync.last_date, date_format(date_add(now(), -7, "day"), "%Y-%m-%d")),
          coalesce(context.range_end, date_format(now(), "%Y-%m-%d")),
          "1d"
        )
      into: "state.current_date"
    request:
      url: "{state.base_url}/events/daily/{state.current_date}"
    response:
      records:
        jmespath: "events[]"
        primary_key: ["event_id"]
      processors:
        - expression: "state.current_date"
          output: "state.last_date"
          aggregation: "maximum"

Replication Configs:
# Backfill mode: Process specific date range
source_options:
  range: '2024-01-01,2024-01-31' # Sets context.range_start and context.range_end

# Incremental mode: Use sync state (no range specified)
# Falls back to sync.last_date from previous run

# Testing mode: Limit records
source_options:
  limit: 100 # Sets context.limit

Other Common Uses:
# Mode-specific behavior
state:
  batch_size: '{if(context.mode == "backfill", 1000, 100)}'

# Limit for testing/development
response:
  records:
    limit: '{coalesce(context.limit, null)}'

# Numeric ID ranges
iterate:
  over: >
    range(
      coalesce(context.range_start, sync.last_id, "1"),
      coalesce(context.range_end, "999999"),
      "1000"
    )

💡 Best Practice: Always use coalesce() with context variables to provide fallback values for when they're not set.
Using Inputs
Inputs are custom configuration values passed from the connection definition to the API spec. Unlike secrets (which are for credentials), inputs are for non-sensitive options like field mappings, account IDs, or feature flags. Inputs are accessed via {inputs.var_name}, similar to secrets and env.
Defining inputs in env.yaml:
# ~/.sling/env.yaml
connections:
  AIRTABLE:
    type: api
    spec: airtable
    secrets:
      api_key: "patXXXXXXXXXXXXXX"
    inputs:
      last_modified_field_map:
        'My Base Name':
          'My Table Name': 'Updated At'
        'Another Base':
          'Customers': 'Last Modified'

Accessing inputs in your API spec:
# In your API spec
state:
  modified_field: >
    {
      jmespath(
        coalesce(inputs.last_modified_field_map, object()),
        "\"" + state.base_name + "\".\"" + state.table_name + "\""
      )
    }

When to use inputs vs. secrets:
| Use Case | Use secrets | Use inputs |
|---|---|---|
| API keys, tokens, passwords | ✅ | |
| Client IDs/secrets | ✅ | |
| Account IDs (non-sensitive) | | ✅ |
| Field name mappings | | ✅ |
| Feature flags | | ✅ |
| Custom configuration options | | ✅ |
📝 Note: Inputs are defined by the API spec author. Check the specific API connector documentation to see what inputs are available.
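For a simpler case than the nested field map above, the sketch below passes a single scalar input and reads it with a fallback. The connection name (MY_API), spec name (my_api), input name (account_id), endpoint, and URL path are all hypothetical; the coalesce() fallback mirrors the pattern used above and is assumed to behave the same way for a missing input:

```yaml
# ~/.sling/env.yaml (hypothetical connection)
connections:
  MY_API:
    type: api
    spec: my_api
    secrets:
      api_key: "xxxxxxxx" # credential, so it belongs in secrets
    inputs:
      account_id: "12345" # non-sensitive option, so it belongs in inputs
---
# API spec fragment (hypothetical endpoint)
endpoints:
  reports:
    state:
      # Fall back to a placeholder value when the input is not set
      account_id: '{coalesce(inputs.account_id, "default")}'
    request:
      url: "{state.base_url}/accounts/{state.account_id}/reports"
```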
Queues
Queues allow you to pass data from one endpoint to another in a multi-step workflow:
queues:
  - order_ids
  - customer_ids
endpoints:
  list_orders:
    response:
      processors:
        - expression: "record.id"
          output: "queue.order_ids"
  get_order_details:
    iterate:
      over: "queue.order_ids"
      into: "state.current_order_id"
    request:
      url: "{state.base_url}/orders/{state.current_order_id}"

For detailed information on queues, see Queues.
Sequence of Calls
A sequence is an ordered array of API calls that can be executed in workflows, authentication processes, and lifecycle hooks. Sequences are perfect for multi-step operations like async job workflows, custom authentication flows, or complex setup/teardown processes.
For detailed information on sequences, see Sequences: Setup and Teardown.
Each call in a sequence accepts the keys below:
if: <condition expression to evaluate before executing the call>
request: <request configuration map>
pagination: <pagination configuration map>
response: <response configuration map>

Component Relationships
The following diagram shows how the major components relate to each other:
Basic Example
Here's a minimal example showing the essential components:
name: "GitHub API"
description: "API for accessing GitHub repositories and issues"
defaults:
  state:
    base_url: "https://api.github.com"
  request:
    headers:
      Accept: "application/vnd.github.v3+json"
endpoints:
  repos:
    description: "List repositories for a user"
    request:
      url: "{state.base_url}/users/{env.GITHUB_USERNAME}/repos"
    response:
      records:
        jmespath: "[*]"

API Specification
Here we have the definitions for the accepted keys.
name
The display name of the API specification.
description
Brief description of what the API does.
queues
Array of queue names for passing data between endpoints.
defaults
Default endpoint configuration applied to all endpoints.
authentication
Authentication configuration for the API. See Authentication for details.
endpoints.<key>
Named endpoints that define API interactions.
dynamic_endpoints
Array of endpoint configurations for dynamic endpoint generation. See Dynamic Endpoints for details.
endpoints.<key>.name or defaults.name
The endpoint name (defaults to the key).
endpoints.<key>.description or defaults.description
Description of what the endpoint does.
endpoints.<key>.docs or defaults.docs
URL to endpoint documentation.
endpoints.<key>.disabled or defaults.disabled
Whether the endpoint is disabled (default: false).
endpoints.<key>.state or defaults.state
Map of state variables available to the endpoint. See State vs. Sync section above.
endpoints.<key>.sync or defaults.sync
Array of state variable names to persist between runs. See State vs. Sync section above.
endpoints.<key>.request or defaults.request
HTTP request configuration. See Requests for details.
endpoints.<key>.pagination or defaults.pagination
Pagination configuration. See Pagination for details.
endpoints.<key>.response or defaults.response
Response processing configuration. See Response Processing for details.
endpoints.<key>.iterate or defaults.iterate
Iteration configuration for looping over data. See Iteration for details.
endpoints.<key>.setup or defaults.setup
Array of calls to execute before the main request. See Sequences for details.
endpoints.<key>.teardown or defaults.teardown
Array of calls to execute after the main request. See Sequences for details.
endpoints.<key>.depends_on or defaults.depends_on
Array of endpoint names this endpoint depends on. See Endpoint Dependencies section above.
endpoints.<key>.overrides or defaults.overrides
Stream processing overrides for destination writing. See Stream Overrides section above.
💡 Tip: Start with the basic example and gradually add complexity as needed. Use the defaults section to avoid repetition across endpoints.
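One possible next step from the basic example is sketched below: two endpoints share a base URL, headers, timeout, and rate limit through defaults, and one of them adds a retry rule. Everything specific here is an assumption made for illustration: the API name, URLs, endpoint names, secret names, and in particular the response.status expression in the rule condition, which this document does not define.

```yaml
name: "Example API"
description: "Hypothetical API used to illustrate shared defaults"

authentication:
  type: basic
  username: "{secrets.username}" # hypothetical secret names
  password: "{secrets.password}"

defaults:
  state:
    base_url: "https://api.example.com/v1"
  request:
    timeout: 30 # seconds
    rate: 5     # at most 5 requests per second
    headers:
      Accept: "application/json"

endpoints:
  customers:
    request:
      url: "{state.base_url}/customers"
    response:
      records:
        jmespath: "data[]"
        primary_key: ["id"]

  invoices:
    request:
      url: "{state.base_url}/invoices"
    response:
      records:
        jmespath: "data[]"
        primary_key: ["id"]
      rules:
        # Assumed condition syntax; retry when the API signals rate limiting
        - action: retry
          condition: "response.status == 429"
          max_attempts: 5
          backoff: exponential
          backoff_base: 2
```

Neither endpoint repeats the base URL, headers, or rate limit, so a change to any of them is made once in defaults.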