Response Processing
This document explains how Sling processes API responses, including format handling, record extraction, and data transformations.
Response Flow Overview
Response Formats
Sling can automatically handle multiple response formats based on the API's Content-Type header or explicit configuration.
Automatic Format Detection
By default, Sling detects the format from the Content-Type response header:
| Content-Type | Format | Processing |
| --- | --- | --- |
| application/json | JSON | Direct JSON parsing |
| application/xml or text/xml | XML | Converted to JSON structure |
| text/csv | CSV | Converted to JSON records |
| Others | JSON (default) | Attempts JSON parsing |
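The detection rules above can be pictured as a small lookup. The following Python sketch is illustrative only and is not Sling's implementation; the function name `detect_format` and the stripping of parameters such as `; charset=utf-8` are assumptions made for the example.

```python
def detect_format(content_type: str) -> str:
    """Map a Content-Type header value to a format name (illustrative only)."""
    # Assumption: parameters such as "; charset=utf-8" are ignored.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return "json"
    if media_type in ("application/xml", "text/xml"):
        return "xml"
    if media_type == "text/csv":
        return "csv"
    # Any other content type falls back to JSON parsing.
    return "json"


print(detect_format("text/xml; charset=utf-8"))
print(detect_format("text/plain"))
```

If an API sends a misleading Content-Type, the fallback branch is the reason an explicit format setting is useful.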
Explicit Format Configuration
You can override automatic detection by specifying the format explicitly:
response:
  format: json # Force format interpretation
  records:
    jmespath: "data[]"
Supported format values:
json - Standard JSON response
csv - Comma-separated values
xml - XML response
jsonl or jsonlines - JSON Lines (one JSON object per line)
Format-Specific Processing
JSON Responses
The most common API response format. Sling parses JSON and extracts records using JMESPath:
response:
  format: json # Optional, auto-detected
  records:
    # Extract the array of user objects
    jmespath: "data.users[]"
Example JSON response:
{
  "data": {
    "users": [
      {"id": 1, "name": "Alice"},
      {"id": 2, "name": "Bob"}
    ]
  },
  "meta": {
    "total": 2
  }
}
CSV Responses
CSV responses are automatically converted to JSON records:
response:
  format: csv
  records:
    # For CSV, jmespath typically extracts all records
    jmespath: "[*]"
    primary_key: ["id"]
CSV Processing Rules:
First row is treated as the header row (column names)
Subsequent rows become records
Minimum 2 rows required (header + at least one data row)
Each row is converted to a JSON object with header names as keys
Example CSV response:
id,name,email
1,Alice,[email protected]
2,Bob,[email protected]
Becomes:
[
  {"id": "1", "name": "Alice", "email": "[email protected]"},
  {"id": "2", "name": "Bob", "email": "[email protected]"}
]
📝 Note: CSV values are always strings. Use processors to convert them to other types if needed.
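The CSV rules above (header row as keys, at least two rows, string values) can be sketched in a few lines of Python. This is an illustration of the described behavior, not Sling's code; `csv_to_records` is a hypothetical helper, and the error text simply reuses the message quoted in the troubleshooting section.

```python
import csv
import io


def csv_to_records(body: str) -> list[dict[str, str]]:
    """Convert CSV text into records keyed by the header row."""
    rows = list(csv.reader(io.StringIO(body)))
    if len(rows) < 2:
        # Documented minimum: header + at least one data row.
        raise ValueError("need at least 2 lines to build records from csv")
    header, data_rows = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data_rows]


records = csv_to_records("id,name\n1,Alice\n2,Bob\n")
print(records)

# Values stay strings, so convert explicitly where a number is needed.
ids = [int(record["id"]) for record in records]
print(ids)
```

The explicit `int(...)` conversion at the end plays the role that a cast processor would play in a Sling spec.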
XML Responses
XML responses are automatically converted to JSON before record extraction:
response:
  format: xml
  records:
    jmespath: "root.users.user[]"
Example XML response:
<root>
  <users>
    <user>
      <id>1</id>
      <name>Alice</name>
    </user>
    <user>
      <id>2</id>
      <name>Bob</name>
    </user>
  </users>
</root>
Becomes JSON:
{
  "root": {
    "users": {
      "user": [
        {"id": "1", "name": "Alice"},
        {"id": "2", "name": "Bob"}
      ]
    }
  }
}
⚠️ Warning: XML to JSON conversion follows common conventions: attributes become fields with an @ prefix, and text content becomes a #text field.
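To make the @ and #text conventions concrete, here is a minimal Python sketch of one such conversion using the standard library. It is an assumption-laden illustration, not Sling's converter: the helper names are hypothetical, and the choice to turn repeated sibling tags into a list (and leave a single child as a scalar) is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem: ET.Element):
    """Convert an element to a dict, or to a plain string for text-only leaves."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text
    # Attributes become fields with an "@" prefix.
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in children:
        value = element_to_value(child)
        if child.tag in node:
            # Assumption: repeated sibling tags collapse into a list.
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                node[child.tag] = [existing, value]
        else:
            node[child.tag] = value
    if text:
        # Text next to attributes or children is stored under "#text".
        node["#text"] = text
    return node


def xml_to_dict(body: str) -> dict:
    root = ET.fromstring(body)
    return {root.tag: element_to_value(root)}


doc = '<user id="1"><name lang="en">Alice</name><tag>a</tag><tag>b</tag></user>'
print(xml_to_dict(doc))
```

When writing a JMESPath expression for an XML response, check the trace output for the actual converted structure rather than relying on a sketch like this one.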
JSON Lines (JSONL)
For streaming JSON responses where each line is a complete JSON object:
response:
  format: jsonl
  records:
    # Each line is already a record
    jmespath: "[*]"
Example JSONL response:
{"id": 1, "name": "Alice", "email": "[email protected]"}
{"id": 2, "name": "Bob", "email": "[email protected]"}
{"id": 3, "name": "Charlie", "email": "[email protected]"}
Record Extraction
After format conversion, records are extracted using JMESPath expressions.
Basic Extraction
response:
  records:
    # Extract top-level array
    jmespath: "[*]"
Nested Extraction
response:
  records:
    # Extract nested array
    jmespath: "response.data.items[]"
Conditional Extraction
response:
  records:
    # Extract only active users
    jmespath: "users[?status=='active']"
Projection and Transformation
response:
  records:
    # Extract and reshape data
    jmespath: "data[].{user_id: id, full_name: name, contact: email}"
Deduplication
When primary_key is defined, Sling automatically deduplicates records:
response:
  records:
    jmespath: "data[]"
    primary_key: ["id"] # Single field
    # OR
    # primary_key: ["id", "location_id"] # Composite key
Deduplication Strategies
1. In-Memory Deduplication (Default)
For datasets with reasonable record counts:
response:
  records:
    primary_key: ["id"]
    # Uses hash map in memory
Characteristics:
Fast and accurate
Memory usage grows with unique record count
Suitable for datasets up to ~1 million records
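As a rough picture of what hash-based deduplication on a primary key involves, the Python sketch below tracks seen key tuples in a set. It is not Sling's implementation; `dedupe` is a hypothetical helper, and keeping the first record seen for each key is an assumption of this example.

```python
from typing import Any, Iterable, Iterator


def dedupe(records: Iterable[dict[str, Any]], primary_key: list[str]) -> Iterator[dict[str, Any]]:
    """Yield the first record seen for each primary-key value."""
    seen: set[tuple] = set()
    for record in records:
        # A composite key is the tuple of all primary-key field values.
        key = tuple(record.get(field) for field in primary_key)
        if key in seen:
            continue
        seen.add(key)
        yield record


rows = [
    {"id": 1, "location_id": 10},
    {"id": 1, "location_id": 20},
    {"id": 1, "location_id": 10},  # duplicate of the first row
]
unique = list(dedupe(rows, ["id", "location_id"]))
print(len(unique))
```

The `seen` set holds one entry per unique key, which is why memory use grows with the number of unique records.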
2. Bloom Filter Deduplication
For very large datasets where memory is constrained:
response:
  records:
    primary_key: ["id"]
    duplicate_tolerance: "10000000,0.001" # capacity,error_rate
Characteristics:
Probabilistic deduplication (small false positive rate, so a few unique records may be treated as duplicates)
Fixed memory footprint
Suitable for datasets with millions of records
Format: "capacity,error_rate"
capacity: Expected number of unique records
error_rate: Acceptable false positive rate (e.g., 0.001 = 0.1%)
💡 Tip: Use Bloom filter for datasets over 1 million records or when memory is limited. The error rate determines memory usage - lower rates use more memory.
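To get a feel for how capacity and error rate translate into memory, the sketch below applies the textbook Bloom filter sizing formulas to a "capacity,error_rate" string. Whether Sling sizes its filter exactly this way is an assumption; `bloom_filter_size` is a hypothetical helper used only for estimation.

```python
import math


def bloom_filter_size(duplicate_tolerance: str) -> tuple[int, int]:
    """Return (bits, hash_count) for a "capacity,error_rate" string."""
    capacity_text, error_rate_text = duplicate_tolerance.split(",")
    capacity = int(capacity_text)
    error_rate = float(error_rate_text)
    # Textbook sizing: m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2
    bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    hash_count = max(1, round(bits / capacity * math.log(2)))
    return bits, hash_count


bits, hashes = bloom_filter_size("10000000,0.001")
print(f"{bits / 8 / 1_000_000:.1f} MB, {hashes} hash functions")
```

Under these formulas, 10 million keys at a 0.1% error rate need roughly 18 MB, and tightening the error rate increases that figure.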
Response State
All response data is accessible in the response state variable for use in expressions:
| Variable | Description | Example |
| --- | --- | --- |
| response.status | HTTP status code | response.status == 200 |
| response.headers | Response headers | response.headers.link |
| response.text | Raw response body | length(response.text) > 0 |
| response.json | Parsed JSON response | response.json.has_more |
| response.records | Extracted records array | length(response.records) |
Using Response State in Pagination
pagination:
  next_state:
    cursor: '{jmespath(response.json, "pagination.next_cursor")}'
  stop_condition: 'jmespath(response.json, "has_more") == false'
Using Response State in Rules
rules:
  - action: retry
    condition: "response.status == 429"
    max_attempts: 5
  - action: stop
    condition: "length(response.records) == 0"
    message: "No more records available"
Using Response State in Processors
processors:
  # Add metadata from response to each record
  - expression: "response.json.request_id"
    output: "record.api_request_id"
  # Conditional processing based on response
  - if: "response.status == 206" # Partial content
    expression: "record.id"
    output: "queue.incomplete_records"
Conditional Processing with IF Conditions
Processors support an optional if field to conditionally execute based on runtime conditions.
Basic Syntax
processors:
  # Only process non-null values
  - expression: "lower(record.email)"
    if: "!is_null(record.email) && record.email != ''"
    output: "record.email_normalized"
  # Only queue US customers
  - expression: "record.id"
    if: "record.country == 'US'"
    output: "queue.us_customer_ids"
  # Track max timestamp only for completed records
  - expression: "record.updated_at"
    if: "record.status == 'completed'"
    output: "state.last_completed_timestamp"
    aggregation: "maximum"
How It Works
Evaluation: The if condition is evaluated before the expression
Skip on False: If false, the entire processor is skipped for that record
Access: Has access to record, state, response, env, secrets
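The evaluation order described above can be modeled in Python as follows. This is a conceptual sketch rather than Sling's evaluator: `run_processor` is a hypothetical function, and plain Python callables stand in for Sling expressions.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]


def run_processor(
    record: Record,
    expression: Callable[[Record], Any],
    output_field: str,
    condition: Optional[Callable[[Record], bool]] = None,
) -> None:
    """Apply one processor to one record, honoring an optional condition."""
    # The condition runs first; a false result skips the expression entirely.
    if condition is not None and not condition(record):
        return
    record[output_field] = expression(record)


records = [{"email": "Alice@Example.com"}, {"email": None}]
for rec in records:
    run_processor(
        rec,
        expression=lambda r: r["email"].lower(),
        output_field="email_normalized",
        condition=lambda r: r["email"] not in (None, ""),
    )
print(records)
```

Because the condition is checked before the expression runs, the record with a null email never reaches `.lower()`, which is the same protection the null check gives a Sling processor.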
Common Patterns
processors:
  # Null/empty checks
  - expression: 'cast(record.age, "int")'
    if: "!is_null(record.age)"
    output: "record.age_int"
  # Type validation with try_cast
  - expression: 'cast(record.value, "int")'
    if: "is_null(try_cast(record.value, 'int')) == false"
    output: "record.value_int"
  # Date filtering
  - expression: "record.id"
    if: "date_parse(record.created_at, 'auto') > date_add(now(), -7, 'day')"
    output: "queue.recent_ids"
  # Response-based conditions
  - expression: "response.json.request_id"
    if: "response.status == 200"
    output: "record.api_request_id"
💡 Tip: Always check for null before accessing field properties to avoid errors.
⚠️ Warning: IF conditions are evaluated for every record. Avoid expensive operations.
Overwriting Records with output: "record"
Setting output: "record" completely replaces the entire record with the result of the expression. All existing fields are discarded unless explicitly included.
Common Use Cases
1. Select Specific Fields
Keep only essential fields from large API responses:
processors:
  - expression: >
      object(
        "user_id", record.id,
        "username", record.username,
        "email", record.email
      )
    output: "record"
2. Rename Fields
Transform field names to match your schema:
processors:
  - expression: >
      object(
        "customer_id", record.id,
        "full_name", record.name,
        "contact_email", record.email
      )
    output: "record"
3. Flatten Nested Data
Convert nested structures into flat records using JMESPath:
processors:
  - expression: >
      jmespath(record, "{
        id: id,
        name: user.profile.name,
        email: user.contact.email,
        country: user.address.country,
        plan_type: subscription.plan.type
      }")
    output: "record"
4. Add Computed Fields
Create records with derived values:
processors:
  - expression: >
      object(
        "order_id", record.id,
        "subtotal", record.subtotal,
        "tax", record.subtotal * 0.08,
        "total", record.subtotal * 1.08
      )
    output: "record"
Important Warnings
⚠️ All previous fields are discarded - You must explicitly include every field you want to keep
⚠️ Order matters - Processors run in sequence, so overwrite the record first, then add fields to the simplified record:
processors:
  # First: Overwrite to simplify
  - expression: 'object("id", record.id, "name", record.name)'
    output: "record"
  # Then: Add new fields to simplified record
  - expression: "upper(record.name)"
    output: "record.name_upper"
⚠️ Include primary keys - For deduplication to work, primary key fields must be in the new record
💡 Tip: Use JMESPath projection syntax for cleaner nested data transformations.
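The warnings above follow from how a whole-record overwrite interacts with later processors. The Python sketch below models that behavior under stated assumptions: `apply_processors` is hypothetical, processors are represented as (expression, output) pairs, and only the "record" and "record.<field>" outputs are modeled.

```python
from typing import Any, Callable

Record = dict[str, Any]
Processor = tuple[Callable[[Record], Any], str]  # (expression, output)


def apply_processors(record: Record, processors: list[Processor]) -> Record:
    """Run processors in order; output "record" replaces the whole record."""
    for expression, output in processors:
        value = expression(record)
        if output == "record":
            record = dict(value)  # every earlier field is dropped
        elif output.startswith("record."):
            record[output[len("record."):]] = value
    return record


source = {"id": 7, "name": "alice", "internal_notes": "drop me"}
result = apply_processors(
    source,
    [
        (lambda r: {"id": r["id"], "name": r["name"]}, "record"),
        (lambda r: r["name"].upper(), "record.name_upper"),
    ],
)
print(result)
```

Here the overwrite keeps `id`, so a primary key of ["id"] would still be present in the result; leaving it out of the new object would remove it from the record.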
Error Handling
Invalid Response Format
When Sling cannot parse the response in the expected format:
rules:
  - action: fail
    condition: "response.status >= 400"
    message: "API returned error: {response.status}"
Empty or Missing Records
Handle cases where no records are found:
pagination:
  # Stop if no records returned
  stop_condition: "length(response.records) == 0"
Partial Responses
Some APIs return partial data on errors:
rules:
  # Continue processing partial results
  - action: continue
    condition: "response.status == 206"
    message: "Partial content received, processing available data"
Complete Example
Here's a comprehensive example showing all response processing features:
endpoints:
  user_activity:
    request:
      url: "{state.base_url}/users/activity"
      parameters:
        limit: 100
    response:
      # Explicitly set format (usually auto-detected)
      format: json
      records:
        # Extract nested records
        jmespath: "data.activities[]"
        # Deduplicate by composite key
        primary_key: ["user_id", "activity_id"]
        # Limit total records for testing
        limit: 5000
        # Use Bloom filter for large datasets
        duplicate_tolerance: "1000000,0.001"
      processors:
        # Transform timestamp field
        - expression: 'date_parse(record.timestamp, "auto")'
          output: "record.activity_date"
        # Add response metadata
        - expression: "response.json.request_id"
          output: "record.api_request_id"
        # Track max timestamp for incremental sync
        - expression: "record.timestamp"
          output: "state.last_activity_timestamp"
          aggregation: maximum
        # Send user IDs to queue for detail lookup
        - expression: "record.user_id"
          output: "queue.user_ids"
      rules:
        # Retry on rate limit
        - action: retry
          condition: "response.status == 429"
          max_attempts: 5
          backoff: exponential
        # Continue on not found (user may have been deleted)
        - action: continue
          condition: "response.status == 404"
          message: "Resource not found, continuing"
        # Fail on auth errors
        - action: fail
          condition: "response.status == 401 || response.status == 403"
          message: "Authentication failed"
    pagination:
      next_state:
        cursor: '{jmespath(response.json, "pagination.next_cursor")}'
      stop_condition: 'is_null(jmespath(response.json, "pagination.next_cursor")) || length(response.records) == 0'
Best Practices
1. Always Define Primary Keys
Even if the API doesn't explicitly require deduplication, defining primary keys helps ensure data quality:
response:
  records:
    primary_key: ["id"] # Prevents accidental duplicates
2. Use Appropriate Deduplication
Choose the right strategy based on your dataset size:
# For < 1M records (default)
primary_key: ["id"]
# For > 1M records
primary_key: ["id"]
duplicate_tolerance: "10000000,0.001"
3. Handle Multiple Content Types
If your API might return different formats:
rules:
  # Handle JSON errors
  - action: fail
    condition: 'response.status >= 400 && response.headers["content-type"] == "application/json"'
    message: "API error: {response.json.error}"
  # Handle HTML errors (often 500 errors)
  - action: fail
    condition: 'response.status >= 400 && jmespath(response.headers, "\"content-type\"") == "text/html"'
    message: "Server error (HTML response)"
4. Validate Records Structure
Use processors to validate critical fields:
processors:
  # Ensure required field exists
  - expression: 'require(record.id, "Record missing required id field")'
    output: "record.id_validated"
5. Log Response Details for Debugging
During development, use processors to log response information:
processors:
  # Log response summary
  - expression: >
      log("Response status: " + string(response.status) +
      ", Records: " + string(length(response.records)))
    output: "" # Empty output means don't store anywhere
Troubleshooting
No Records Extracted
If you're not getting any records:
Check your JMESPath expression:
sling conns test API_NAME --endpoints ENDPOINT_NAME --trace
Look at the raw response in trace output
Verify the path to your records array
Test JMESPath expressions using online tools
CSV Parsing Errors
Common CSV issues:
# Error: "need at least 2 lines to build records from csv"
# Solution: Ensure API returns header + at least one data row
Deduplication Not Working
Verify your primary key fields exist:
processors:
  # Log primary key values
  - if: "!is_null(record.id)"
    expression: 'log("Found ID: " + string(record.id))'
    output: ""
💡 Tip: Use the --trace flag to see detailed response processing including format detection, record extraction, and deduplication results.