Advanced Features

This document covers advanced capabilities within Sling API specifications: pagination strategies, expression functions, incremental sync state, and response rules for error handling, including retries, backoff, and rate limit handling.

For response processing basics (format handling, record extraction, deduplication, processors), see Response Processing.

Pagination

Pagination controls how Sling navigates through multiple pages of results for each iteration (if iterate is used) or for the single endpoint execution (if iterate is not used).

Common Pagination Patterns

1. Cursor-based Pagination

Uses the ID of the last record to fetch the next page.

pagination:
  next_state:
    # Use ID of last record for next request
    starting_after: "{response.records[-1].id}"
  stop_condition: 'jmespath(response.json, "has_more") == false || length(response.records) == 0'
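
To make the loop concrete, here is a minimal Python sketch of what this configuration describes. It is not Sling's implementation: `fetch_page`, the in-memory `DATA` list, and the page size of 3 are hypothetical stand-ins for a real API.

```python
# Hypothetical in-memory "API": records with increasing ids.
DATA = [{"id": i} for i in range(1, 8)]

def fetch_page(starting_after=None, limit=3):
    """Return the records after the given id, plus a has_more flag."""
    start = 0
    if starting_after is not None:
        start = next(i for i, r in enumerate(DATA) if r["id"] == starting_after) + 1
    records = DATA[start:start + limit]
    return {"records": records, "has_more": start + limit < len(DATA)}

def paginate():
    """Cursor loop: the next request starts after the id of the last record."""
    state = {"starting_after": None}
    collected = []
    while True:
        response = fetch_page(state["starting_after"])
        collected.extend(response["records"])
        # stop_condition: has_more == false || length(records) == 0
        if response["has_more"] is False or len(response["records"]) == 0:
            break
        # next_state.starting_after = id of the last record
        state["starting_after"] = response["records"][-1]["id"]
    return collected

all_records = paginate()
```

The stop condition is checked before the cursor is advanced, so an empty page never reaches the `records[-1]` lookup.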

2. Page Number Pagination

Increments a page number for each request.

pagination:
  next_state:
    # Increment page number
    page: "{state.page + 1}"
  stop_condition: 'state.page >= jmespath(response.json, "total_pages") || length(response.records) == 0'

3. Offset Pagination

Advances the offset by the page size (limit) after each request, and stops when a page returns fewer records than the limit.

pagination:
  next_state:
    # Increase offset by limit
    offset: "{state.offset + state.limit}"
  stop_condition: "length(response.records) < state.limit"
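
As an illustration only, the same offset logic can be simulated in Python. The `DATA` list and the `fetch` function are hypothetical placeholders for an endpoint that accepts `offset` and `limit`.

```python
# Hypothetical dataset and fetch function standing in for a real endpoint.
DATA = list(range(25))

def fetch(offset, limit):
    return DATA[offset:offset + limit]

def paginate_by_offset(limit=10):
    state = {"offset": 0, "limit": limit}
    records, requests = [], 0
    while True:
        page = fetch(state["offset"], state["limit"])
        requests += 1
        records.extend(page)
        # stop_condition: length(response.records) < state.limit
        if len(page) < state["limit"]:
            break
        # next_state: offset = state.offset + state.limit
        state["offset"] += state["limit"]
    return records, requests

records, requests = paginate_by_offset()
```

With 25 records and a limit of 10, the loop issues three requests (10, 10, and 5 records). When the total is an exact multiple of the limit, this stop condition costs one extra request that returns an empty page.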

4. Link Header Pagination

Extracts the next page URL from the response headers.

pagination:
  next_state:
    # Extract URL from Link header (takes the first entry, so this assumes
    # the rel="next" link is listed first)
    url: >
      {
        if(
          contains(response.headers.link, "rel=\"next\""),
          trim(split_part(split(response.headers.link, ",")[0], ";", 0), "<>"),
          null
        )
      }
  stop_condition: '!contains(response.headers.link, "rel=\"next\"")'
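
The expression above reads the first entry of the Link header. For comparison, here is a hedged Python sketch of parsing that picks the entry tagged rel="next" wherever it appears. The helper name `next_link` and the example URLs are hypothetical; this is not how Sling parses the header.

```python
import re

def next_link(link_header):
    """Return the URL whose rel is "next", or None if there is none."""
    if not link_header:
        return None
    # Each entry looks like: <https://host/path?page=2>; rel="next"
    for url, params in re.findall(r'<([^>]*)>([^,]*)', link_header):
        if re.search(r'rel="?next\b', params):
            return url
    return None

header = ('<https://api.example.com/items?page=1>; rel="prev", '
          '<https://api.example.com/items?page=3>; rel="next"')
next_url = next_link(header)
```

Returning None when no next link exists mirrors the stop condition: pagination ends once the header no longer advertises a next page.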

💡 Tip: For better performance, avoid using response variables in next_state expressions when possible. This allows Sling to prepare the next request before the current one finishes, increasing parallelism.

Functions

Functions are the building blocks of dynamic expressions in Sling API specifications. They enable sophisticated data transformations, validations, and manipulations within your API configurations.

Using Functions

Functions can be used throughout your API specification wherever expressions are supported, including:

Request Configuration: Dynamic URLs, headers, parameters, and payloads
Response Processing: Data transformation and extraction
Pagination Logic: Computing next page parameters
Conditional Logic: Rules, iteration conditions, and stop conditions
State Management: Transforming and aggregating state variables

Common Function Patterns

Dynamic Request Construction

request:
  url: '{state.base_url}/users/{state.user_id}'
  headers:
    Authorization: "Bearer {auth.token}"
    X-Request-ID: "{uuid()}"
  parameters:
    updated_since: '{date_format(date_add(now(), -1, "day"), "%Y-%m-%dT%H:%M:%SZ")}'
    limit: '{coalesce(state.page_size, 100)}'

Data Transformation in Processors

processors:
  # Parse and format timestamps
  - expression: 'date_parse(record.created_at, "auto")'
    output: "record.created_timestamp"
  
  # Clean and validate data
  - expression: "trim(upper(record.status))"
    output: "record.status_clean"
  
  # Extract nested values
  - expression: 'get_path(record, "user.profile.email")'
    output: "record.user_email"
  
  # Conditional processing
  - expression: 'if(record.active, "ACTIVE", "INACTIVE")'
    output: "record.status_label"

Pagination with Functions

pagination:
  next_state:
    # Extract cursor from response
    cursor: '{get_path(response.json, "pagination.next_cursor")}'
    
    # Increment page number
    page: "{coalesce(state.page, 0) + 1}"
    
    # Dynamic limit based on response size
    limit: "{if(length(response.records) < 100, 50, state.limit)}"

  stop_condition: 'is_null(jmespath(response.json, "pagination.next_cursor")) || length(response.records) == 0'

Advanced Data Processing

processors:
  # Filter and transform arrays
  - expression: 'filter(record.tags, "length(value) > 0")'
    output: "record.valid_tags"
  
  # Extract specific fields using JMESPath
  - expression: 'jmespath(record, "items[?price > 100].{name: name, price: price}")'
    output: "record.expensive_items"
  
  # Hash sensitive data
  - expression: 'hash(record.email, "sha256")'
    output: "record.email_hash"
  
  # Generate derived fields
  - expression: 'join([record.first_name, record.last_name], " ")'
    output: "record.full_name"

Function Error Handling

Functions can help with graceful error handling and fallback values:

processors:
  # Safe parsing with fallback
  - expression: 'coalesce(try_cast(record.age, "int"), 0)'
    output: "record.age_int"
  
  # Required field validation
  - expression: 'require(record.user_id, "User ID is required")'
    output: "record.validated_user_id"
  
  # Conditional field access
  - expression: 'if(is_null(record.metadata), "", get_path(record.metadata, "source"))'
    output: "record.source"
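
The "safe parsing with fallback" pattern can be pictured in Python as follows. This is a sketch under the assumption that a failed cast yields a null value which coalesce then replaces; `try_cast_int` and `coalesce` here are hypothetical Python helpers, not Sling's functions.

```python
def try_cast_int(value):
    """Return int(value), or None when the value cannot be converted."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

def coalesce(*values):
    """Return the first value that is not None."""
    return next((v for v in values if v is not None), None)

# Mirrors: coalesce(try_cast(record.age, "int"), 0)
ages = [coalesce(try_cast_int(v), 0) for v in ["42", "n/a", None, 7]]
```

Note that this coalesce only skips None, so a legitimate value of 0 is kept rather than replaced.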

Best Practices

Use coalesce() for Defaults: Always provide fallback values for optional fields
Validate Required Fields: Use require() to ensure critical data is present
Handle Date Formats: Use date_parse() with "auto" format when possible
Escape Special Characters: Use encoding functions for URLs and other special contexts
Test Complex Expressions: Break down complex expressions into smaller, testable parts

For a complete reference of all available functions, see the Functions documentation.

💡 Tip: Use the log() function during development to debug complex expressions: log("Processing record: " + record.id)

⚠️ Warning: Functions are evaluated for each record or iteration. Avoid expensive operations in frequently-called expressions.

Sync State for Incremental Loads

The sync key allows persisting state variables between runs, enabling incremental data loading (fetching only new or updated data).

Example: Timestamp-Based Incremental Sync

endpoints:
  incremental_data:
    state:
      # Get previous timestamp or default to 7 days ago
      start_timestamp: >
        {
          coalesce(
            sync.last_sync_ts,
            date_format(date_add(now(), -7, 'day'), '%Y-%m-%dT%H:%M:%SZ')
          )
        }
      
      # Initialize tracking variable with start timestamp
      last_sync_ts: '{state.start_timestamp}'

    # List of state variables to persist for next run
    sync: [last_sync_ts]

    request:
      parameters:
        # Filter by timestamp from last run
        updated_since: '{state.start_timestamp}'

    response:
      processors:
        # Track maximum timestamp seen
        - expression: "record.updated_at"
          output: "state.last_sync_ts"
          aggregation: "maximum"

💡 Tip: Always use coalesce() with sync variables to handle the first run when no previous state exists.
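
The following Python sketch simulates two consecutive runs of the example above to show how the persisted value moves forward. It is an illustration, not Sling's code: `run_sync`, the sample records, and the strict "greater than" filter standing in for the API's `updated_since` parameter are all assumptions.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"

def run_sync(sync, records, now):
    """One run: filter by the stored timestamp, then persist the maximum seen."""
    # coalesce(sync.last_sync_ts, now - 7 days)
    start = sync.get("last_sync_ts") or (now - timedelta(days=7)).strftime(FMT)
    last_sync_ts = start  # initialize tracking variable with the start timestamp
    fetched = [r for r in records if r["updated_at"] > start]
    for r in fetched:
        # aggregation "maximum": same-format UTC strings sort chronologically
        last_sync_ts = max(last_sync_ts, r["updated_at"])
    return fetched, {"last_sync_ts": last_sync_ts}

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
source = [
    {"id": 1, "updated_at": "2024-01-05T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-08T12:00:00Z"},
]
first, state = run_sync({}, source, now)        # first run: no stored state
source.append({"id": 3, "updated_at": "2024-01-09T09:30:00Z"})
second, state2 = run_sync(state, source, now)   # second run: only newer rows
```

Initializing the tracking variable with the start timestamp means a run that fetches no records still persists a valid value instead of resetting the state.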

Combining Incremental Sync with Context Variables

For advanced scenarios, you can combine sync state with context variables to support both incremental loading and backfilling. Context variables are runtime values passed from the replication configuration.

Key context variables for incremental sync:

context.range_start - Start of backfill range (from source_options.range)
context.range_end - End of backfill range (from source_options.range)
context.mode - Replication mode (incremental, full-refresh, backfill)
context.limit - Maximum records to fetch (from source_options.limit)

Example: Incremental with Backfill Support

endpoints:
  events:
    sync: [last_date]

    iterate:
      # Backfill mode: Use context.range_start/range_end
      # Incremental mode: Use sync.last_date
      over: >
        range(
          coalesce(context.range_start, sync.last_date, date_format(date_add(now(), -7, "day"), "%Y-%m-%d")),
          coalesce(context.range_end, date_format(now(), "%Y-%m-%d")),
          "1d"
        )
      into: "state.current_date"

    request:
      url: "{state.base_url}/events"
      parameters:
        date: "{state.current_date}"

    response:
      records:
        jmespath: "events[]"
        primary_key: ["event_id"]

      processors:
        # Track last processed date for next incremental run
        - expression: "state.current_date"
          output: "state.last_date"
          aggregation: "maximum"

Backfill Usage:

# replication.yaml
source: MY_API
target: MY_TARGET_DB

streams:
  events:
    object: analytics.events
    source_options:
      # Backfill January 2024
      range: '2024-01-01,2024-01-31'

Incremental Usage:

# replication.yaml (without range)
source: MY_API
target: MY_TARGET_DB

streams:
  events:
    object: analytics.events
    # No range - uses sync.last_date

This pattern allows the same endpoint to handle both historical backfills and ongoing incremental updates. See Context Variables for full details.
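
The precedence of the two coalesce() calls can be sketched in Python. This is a simplified model: `dates_to_fetch` and `daily_range` are hypothetical helpers, and treating the range as inclusive of its end date is an assumption rather than documented behavior.

```python
from datetime import date, timedelta

def coalesce(*values):
    """Return the first value that is not None."""
    return next((v for v in values if v is not None), None)

def daily_range(start, end):
    """List of ISO dates from start to end (inclusive), one day apart."""
    d, last = date.fromisoformat(start), date.fromisoformat(end)
    out = []
    while d <= last:
        out.append(d.isoformat())
        d += timedelta(days=1)
    return out

def dates_to_fetch(context, sync, today):
    default_start = (today - timedelta(days=7)).isoformat()
    # Backfill range wins, then the synced date, then the 7-day default
    start = coalesce(context.get("range_start"), sync.get("last_date"), default_start)
    end = coalesce(context.get("range_end"), today.isoformat())
    return daily_range(start, end)

today = date(2024, 3, 10)
backfill = dates_to_fetch({"range_start": "2024-01-01", "range_end": "2024-01-03"},
                          {"last_date": "2024-03-08"}, today)
incremental = dates_to_fetch({}, {"last_date": "2024-03-08"}, today)
first_run = dates_to_fetch({}, {}, today)
```

Because the context values come first, a backfill ignores the stored sync date entirely, while an incremental run resumes from the last processed date.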

Rules & Retries

Rules define actions based on response conditions (status codes, headers, body content), providing fine-grained control over error handling and retries.

Rule Properties

| Property | Required | Description | Example |
| --- | --- | --- | --- |
| action | Yes | Action to take when condition is true | "retry", "continue", "stop", "break", "skip", "fail" |
| condition | Yes | Expression that triggers the action | "response.status == 429" |
| max_attempts | No (for retry) | Max number of retry attempts | 5 (default: 3) |
| backoff | No (for retry) | Strategy for delay between retries | "exponential", "linear", "constant", "jitter", "none" |
| backoff_base | No (for retry) | Initial delay in seconds | 2 (default: 1) |
| message | No | Message for logging (supports expressions) | "Rate limit hit, retrying..." |

Rule Actions

| Action | Description | Use Case |
| --- | --- | --- |
| retry | Retry the request after delay | Rate limits (429), server errors (>=500) |
| continue | Process response, ignore error | Non-critical errors (e.g., 404 for optional resources) |
| skip | Break out of the rule evaluation loop and skip this request | When a request should not be processed |
| break | Stop the current iteration gracefully without error | Stop iteration within loops when processing complete |
| stop | Stop current endpoint/iteration | When further requests would be useless |
| fail | Stop Sling run with error | Critical errors (auth failure, invalid parameters) |

Example Rules

rules:
  # Rule 1: Retry rate limits and server errors
  - action: "retry"
    condition: "response.status == 429 || response.status >= 500"
    max_attempts: 5
    backoff: "exponential"
    backoff_base: 2
    message: "Server error or rate limit hit, retrying..."

  # Rule 2: Fail on authentication errors
  - action: "fail"
    condition: "response.status == 401 || response.status == 403"
    message: "Authentication failed"

  # Rule 3: Ignore 404 errors
  - action: "continue"
    condition: "response.status == 404"
    message: "Resource not found, continuing"

  # Rule 4: Skip invalid records in iteration
  - action: "skip"
    condition: "is_null(record.id)"
    message: "Skipping record without ID"

  # Rule 5: Stop iteration when reaching limit
  - action: "break"
    condition: "state.records_processed >= state.limit"
    message: "Processed limit reached, breaking iteration"

📝 Note: Rules are evaluated in order. The first matching rule's action is executed.
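
A first-match evaluation can be pictured with the short Python sketch below. It is a simplified model, not Sling's engine: conditions are plain Python callables instead of expression strings, and `first_matching_action` is a hypothetical helper.

```python
def first_matching_action(rules, response):
    """Return (action, message) of the first rule whose condition holds."""
    for rule in rules:
        if rule["condition"](response):
            return rule["action"], rule.get("message")
    return None, None

# Conditions are Python callables here; Sling uses expression strings.
rules = [
    {"action": "retry",
     "condition": lambda r: r["status"] == 429 or r["status"] >= 500},
    {"action": "fail",
     "condition": lambda r: r["status"] in (401, 403),
     "message": "Authentication failed"},
    {"action": "continue",
     "condition": lambda r: r["status"] == 404},
]

outcome_503 = first_matching_action(rules, {"status": 503})
outcome_401 = first_matching_action(rules, {"status": 401})
outcome_200 = first_matching_action(rules, {"status": 200})
```

Since only the first match fires, a broad condition placed early shadows any narrower rule listed after it, so order rules from most specific to most general.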

Backoff Strategies

When a rule uses the retry action, the backoff strategy determines how long to wait between retry attempts.

Backoff Types

| Type | Calculation | Use Case | Example Delays (backoff_base=1) |
| --- | --- | --- | --- |
| none | No delay | Immediate retries (use cautiously) | 0s, 0s, 0s, ... |
| constant | Fixed delay | Predictable retry timing | 1s, 1s, 1s, ... |
| linear | base × attempt | Gradual backoff | 1s, 2s, 3s, 4s, 5s, ... |
| exponential | base × 2^(attempt-1) | Aggressive backoff (recommended) | 1s, 2s, 4s, 8s, 16s, ... |
| jitter | exponential + random(0-50%) | Avoid thundering herd | 1s, 3s, 5s, 10s, 20s, ... |
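
The calculations in the table can be written out as a small Python function. This is a sketch of the formulas as documented here, not Sling's source code; `backoff_delay` and `schedule` are hypothetical names, and the assumption that no wait follows the final attempt is taken from the retry timelines in this document.

```python
import random

def backoff_delay(strategy, base, attempt, rng=random):
    """Delay in seconds after failed attempt number `attempt` (1-based)."""
    if strategy == "none":
        return 0.0
    if strategy == "constant":
        return float(base)
    if strategy == "linear":
        return float(base * attempt)
    exponential = float(base * 2 ** (attempt - 1))
    if strategy == "exponential":
        return exponential
    if strategy == "jitter":
        # exponential delay plus a random extra of 0-50%
        return exponential * (1 + rng.uniform(0.0, 0.5))
    raise ValueError(f"unknown backoff strategy: {strategy}")

def schedule(strategy, base, max_attempts):
    """Waits between attempts; the final failed attempt has no wait after it."""
    return [backoff_delay(strategy, base, n) for n in range(1, max_attempts)]

linear_waits = schedule("linear", 3, 5)
exponential_waits = schedule("exponential", 2, 5)
jitter_waits = schedule("jitter", 2, 5)
```

With max_attempts set to 5 there are four waits, so linear backoff with a base of 3 and exponential backoff with a base of 2 both total 30 seconds, while jitter with a base of 2 lands somewhere between 30 and 45 seconds.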

Backoff Examples with Timing

None (No Backoff)

rules:
  - action: retry
    condition: "response.status >= 500"
    max_attempts: 3
    backoff: none

Retry Timeline:

Request 1 (fails) → 0s wait
Request 2 (fails) → 0s wait
Request 3 (fails) → Give up

⚠️ Warning: No backoff can overwhelm failing services. Use only when retries must be immediate.

Constant Backoff

rules:
  - action: retry
    condition: "response.status >= 500"
    max_attempts: 5
    backoff: constant
    backoff_base: 2  # 2 seconds between each retry

Retry Timeline:

Request 1 (fails) → Wait 2s
Request 2 (fails) → Wait 2s
Request 3 (fails) → Wait 2s
Request 4 (fails) → Wait 2s
Request 5 (fails) → Give up

Total time: ~8 seconds

Linear Backoff

rules:
  - action: retry
    condition: "response.status == 429"
    max_attempts: 5
    backoff: linear
    backoff_base: 3  # Base delay of 3 seconds

Retry Timeline:

Request 1 (fails) → Wait 3s (3 × 1)
Request 2 (fails) → Wait 6s (3 × 2)
Request 3 (fails) → Wait 9s (3 × 3)
Request 4 (fails) → Wait 12s (3 × 4)
Request 5 (fails) → Give up

Total time: ~30 seconds

Exponential Backoff (Recommended)

rules:
  - action: retry
    condition: "response.status == 429 || response.status >= 500"
    max_attempts: 5
    backoff: exponential
    backoff_base: 2  # Base delay of 2 seconds

Retry Timeline:

Request 1 (fails) → Wait 2s (2 × 2⁰ = 2)
Request 2 (fails) → Wait 4s (2 × 2¹ = 4)
Request 3 (fails) → Wait 8s (2 × 2² = 8)
Request 4 (fails) → Wait 16s (2 × 2³ = 16)
Request 5 (fails) → Give up

Total time: ~30 seconds

💡 Tip: Exponential backoff is the industry standard for API retries. It quickly backs off from transient failures while giving services time to recover.

Jitter Backoff (Best for High Concurrency)

rules:
  - action: retry
    condition: "response.status == 429"
    max_attempts: 5
    backoff: jitter
    backoff_base: 2

Retry Timeline (example with random jitter):

Request 1 (fails) → Wait 2.3s (2s + 15% jitter)
Request 2 (fails) → Wait 5.1s (4s + 28% jitter)
Request 3 (fails) → Wait 10.4s (8s + 30% jitter)
Request 4 (fails) → Wait 20.8s (16s + 30% jitter)
Request 5 (fails) → Give up

Total time: ~39 seconds (varies due to randomness)

📝 Note: Jitter adds 0-50% random delay to exponential backoff. This prevents multiple clients from retrying simultaneously (thundering herd problem).

Choosing the Right Backoff Strategy

| Scenario | Recommended Strategy | Reasoning |
| --- | --- | --- |
| Rate limits (429) | exponential or jitter | Gives API time to recover quota |
| Server errors (5xx) | exponential | Allows server recovery time |
| Temporary network issues | linear | Moderate, predictable backoff |
| Must retry immediately | constant with low base | Fast retries, simple timing |
| High-concurrency scenarios | jitter | Prevents retry storms |

Rate Limit Handling

Sling automatically detects and respects rate limit headers from API responses. This works in conjunction with backoff strategies to optimize retry timing.

Automatic Rate Limit Detection

When a retry rule triggers on a 429 status, Sling automatically checks for rate limit headers:

rules:
  - action: retry
    condition: "response.status == 429"
    max_attempts: 5
    backoff: exponential  # Fallback if no rate limit headers
    backoff_base: 2

Supported Rate Limit Headers

Sling checks for these headers in order of priority:

1. IETF Standard Headers (Preferred)

| Header | Description | Example |
| --- | --- | --- |
| RateLimit-Reset | Seconds until quota resets | 60 (wait 60 seconds) |
| RateLimit-Remaining | Requests remaining in window | 0 |
| RateLimit-Policy | Rate limit window and quota | "minute";q=100;w=60 |

HTTP/1.1 429 Too Many Requests
RateLimit-Reset: 30
RateLimit-Remaining: 0
RateLimit-Policy: "minute";q=60;w=60

Behavior: Sling will wait 30 seconds before retrying.

2. Legacy/Alternative Headers

| Header | Description | Example |
| --- | --- | --- |
| Retry-After | Seconds or HTTP date to retry after | 120 or Tue, 21 Oct 2025 07:28:00 GMT |
| X-RateLimit-Reset | Unix timestamp when quota resets | 1743158739 |

HTTP/1.1 429 Too Many Requests
Retry-After: 60

Behavior: Sling will wait 60 seconds before retrying.
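
To show how the three header formats translate into a wait, here is a hedged Python sketch. It is not Sling's implementation: `wait_seconds` is a hypothetical helper, and checking Retry-After before X-RateLimit-Reset simply follows the order of the table above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def wait_seconds(headers, now):
    """Seconds to wait according to rate limit headers, or None if absent."""
    h = {k.lower(): v for k, v in headers.items()}
    if "ratelimit-reset" in h:        # IETF: seconds until the quota resets
        return max(0.0, float(h["ratelimit-reset"]))
    if "retry-after" in h:            # either seconds or an HTTP date
        value = h["retry-after"].strip()
        if value.isdigit():
            return float(value)
        return max(0.0, (parsedate_to_datetime(value) - now).total_seconds())
    if "x-ratelimit-reset" in h:      # Unix timestamp of the reset
        return max(0.0, float(h["x-ratelimit-reset"]) - now.timestamp())
    return None

now = datetime(2025, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
from_seconds = wait_seconds({"Retry-After": "60"}, now)
from_date = wait_seconds({"Retry-After": "Tue, 21 Oct 2025 07:28:00 GMT"}, now)
from_unix = wait_seconds({"X-RateLimit-Reset": str(int(now.timestamp()) + 45)}, now)
```

A return value of None signals that no usable header was found, which is the case where the configured backoff strategy would apply.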

Rate Limit Header Processing

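When rate limit headers are detected, the wait they specify overrides the calculated backoff delay. The configured backoff strategy is used only as a fallback when no such headers are present.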

Rate Limit Policy Parsing

For APIs using the IETF RateLimit-Policy header:

RateLimit-Policy: "hour";q=1000;w=3600, "day";q=5000;w=86400
RateLimit-Remaining: 0

Format: "name";q=quota;w=window

q: Quota (number of requests)
w: Window duration (seconds)

Sling's behavior:

If RateLimit-Remaining is 0, waits for the full window
Otherwise, calculates proportional wait: window × (1 - remaining/quota)
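
The policy format and the proportional wait formula can be worked through in Python. This sketch covers the arithmetic only: `parse_policy` and `proportional_wait` are hypothetical helpers, and which policy applies when a header lists several is not stated here, so the example simply uses the first one.

```python
import re

def parse_policy(header):
    """Parse '"name";q=quota;w=window' items into a list of dicts."""
    policies = []
    for item in header.split(","):
        name = re.search(r'"([^"]*)"', item)
        q = re.search(r'\bq=(\d+)', item)
        w = re.search(r'\bw=(\d+)', item)
        if name and q and w:
            policies.append({"name": name.group(1),
                             "quota": int(q.group(1)),
                             "window": int(w.group(1))})
    return policies

def proportional_wait(policy, remaining):
    """window × (1 - remaining/quota); the full window when nothing remains."""
    if remaining <= 0:
        return float(policy["window"])
    return policy["window"] * (1 - remaining / policy["quota"])

policies = parse_policy('"hour";q=1000;w=3600, "day";q=5000;w=86400')
wait_when_empty = proportional_wait(policies[0], 0)
wait_when_quarter_left = proportional_wait(policies[0], 250)
```

For the hourly policy, no remaining requests gives the full 3600 seconds, and 250 of 1000 requests remaining gives 3600 × 0.75 = 2700 seconds.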

Complete Rate Limit Example

endpoints:
  api_data:
    request:
      url: "{state.base_url}/data"
      rate: 10  # Max 10 requests per second normally

    response:
      rules:
        # Rule 1: Handle rate limits with header-aware retry
        - action: retry
          condition: "response.status == 429"
          max_attempts: 5
          backoff: exponential  # Fallback strategy
          backoff_base: 2
          message: "Rate limited - waiting {response.headers['ratelimit-reset']}s"

        # Rule 2: Fail on repeated rate limits
        - action: fail
          condition: "response.status == 429 && request.attempts >= 5"
          message: "Rate limit exceeded after 5 retries"

        # Rule 3: Handle server errors differently
        - action: retry
          condition: "response.status >= 500"
          max_attempts: 3
          backoff: jitter
          backoff_base: 5

What happens:

1. On the first 429, Sling checks for the RateLimit-Reset header
2. If found, it waits that duration (ignoring the backoff calculation)
3. If not found, it uses exponential backoff (2s, 4s, 8s, ...)
4. It makes up to 5 attempts in total (max_attempts: 5)
5. It fails if the API still returns 429 after the final attempt

Testing Rate Limits

Use the trace flag to see rate limit handling in action:

sling conns test MY_API --endpoints data_endpoint --trace

Look for output like:

DBG r.0001.abc   response code=429 duration=234ms
DBG r.0001.abc   using rate limit headers for backoff: 30s
DBG r.0001.abc   rule met to retry (attempt=2) with backoff=30s: response.status == 429

💡 Tip: Most well-designed APIs include rate limit headers. Always use exponential or jitter backoff as a fallback for APIs that don't.
