
# Pipelines

A Pipeline in Sling allows you to execute multiple steps in sequence. Each step can be a different type of operation, enabling you to create complex workflows by chaining together various actions like running replications, executing queries, making HTTP requests, and more.

{% hint style="success" %}
Sling Pipelines integrate seamlessly with the [Sling VSCode Extension](/sling-cli/vscode.md). The extension provides schema validation, auto-completion, hover documentation, and diagnostics for your pipeline configurations, making it easier to author and debug complex workflows.
{% endhint %}

## Pipeline Configuration

A pipeline is defined in YAML format with a `steps` key at the root level containing an array of steps. Each step supports the same types and configurations as [Hooks](/concepts/hooks.md).

```yaml
steps:
  - type: log
    message: "Starting pipeline execution"

  - type: replication
    path: path/to/replication.yaml
    id: my_replication

  - type: query
    if: state.my_replication.status == "success"
    connection: my_database
    query: "UPDATE status SET completed = true"

env:
  MY_KEY: VALUE
```

## Available Step Types

Pipelines support the same step types as Hooks:

| Step Type   | Description                                                | Documentation                                      |
| ----------- | ---------------------------------------------------------- | -------------------------------------------------- |
| Check       | Validate conditions and control flow                       | [Check Step](/concepts/hooks/check.md)             |
| Command     | Run any command/process                                    | [Command Step](/concepts/hooks/command.md)         |
| Copy        | Transfer files between local or remote storage connections | [Copy Step](/concepts/hooks/copy.md)               |
| Delete      | Remove files from local or remote storage connections      | [Delete Step](/concepts/hooks/delete.md)           |
| Group       | Run sequences of steps or loop over values                 | [Group Step](/concepts/hooks/group.md)             |
| HTTP        | Make HTTP requests to external services                    | [HTTP Step](/concepts/hooks/http.md)               |
| Inspect     | Inspect a file or folder                                   | [Inspect Step](/concepts/hooks/inspect.md)         |
| List        | List files in folder                                       | [List Step](/concepts/hooks/list.md)               |
| Log         | Output custom messages and create audit trails             | [Log Step](/concepts/hooks/log.md)                 |
| Query       | Execute SQL queries against any defined connection         | [Query Step](/concepts/hooks/query.md)             |
| Read        | Read contents of files from storage connections            | [Read Step](/concepts/hooks/read.md)               |
| Replication | Run a Replication                                          | [Replication Step](/concepts/hooks/replication.md) |
| Routine     | Execute reusable step sequences from external files        | [Routine Step](/concepts/hooks/routine.md)         |
| Store       | Store values for later in-process access                   | [Store Step](/concepts/hooks/store.md)             |
| Write       | Write content to files in storage connections              | [Write Step](/concepts/hooks/write.md)             |

## Common Step Properties

Steps share the same common properties as hooks:

| Property     | Description                                                                       | Required                 |
| ------------ | --------------------------------------------------------------------------------- | ------------------------ |
| `type`       | The step type, e.g. `query`, `http`, `check`, `copy`, `delete`, `log`, `inspect` (see [Available Step Types](#available-step-types)) | Yes |
| `if`         | Optional condition to determine if the step should execute                        | No                       |
| `id`         | A specific identifier to refer to the step output data                            | No                       |
| `on_failure` | What to do if the step fails (see [On Failure Behaviors](#on-failure-behaviors))  | No (defaults to `abort`) |

### On Failure Behaviors

The `on_failure` property controls what happens **when a step fails**. It only takes effect if the step errors — on success it has no impact. The following values are supported:

| Value             | Behavior                                                                                                                                                                         |
| ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `abort` (default) | Stops the pipeline immediately and fails the run with the error. This is the behavior when `on_failure` is not set.                                                              |
| `warn`            | Logs a warning with the error message and **continues to the next step**. The run is not failed.                                                                                 |
| `quiet`           | Silently swallows the error (no log output) and **continues to the next step**. The run is not failed.                                                                           |
| `break`           | Stops the current sequence of steps gracefully (without failing the run) and moves on. Useful to stop early without marking the pipeline as failed.                              |
| `retry`           | Retries the failed step once. If it fails again, the error is raised.                                                                                                            |
| `defer`           | Only meaningful inside a [group](/concepts/hooks/group.md): records the error but lets the remaining steps in the group finish, then surfaces the error at the end of the group. |
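
As a rough illustration of how these values might be combined, the sketch below sets a different `on_failure` on each step. The URL, replication path, and connection name are placeholders rather than real resources, and only properties already shown on this page are used.

```yaml
steps:
  # Best-effort notification: log a warning and keep going if it fails
  - type: http
    url: https://api.example.com/webhook   # placeholder URL
    method: POST
    on_failure: warn

  # One more attempt before the error is raised
  - type: replication
    path: replications/daily_sync.yaml     # placeholder path
    on_failure: retry

  # No on_failure set, so a failure here aborts the pipeline (the default)
  - type: query
    connection: my_database                # placeholder connection
    query: "UPDATE status SET completed = true"
```

Leaving the default `abort` on the steps whose failure should stop the run, and relaxing it only for optional steps, keeps unexpected errors visible.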

## Variables Available

Pipeline steps have access to the runtime state, which holds variables that can be referenced in templates using curly braces: `{variable}`. The available variables include:

* `runtime_state` - Contains all state variables available
* `env.*` - All variables defined in the `env` block of the pipeline
* `timestamp.*` - Timestamp information, broken into its various parts
* `state.*` - Output data from previous steps (referenced by their `id`, e.g. `state.my_replication.status`)
* `execution.*` - Execution-level context, including:
  * `execution.cli_args.*` - The flags passed on the command line when running the pipeline (see [Reading CLI Flags](#reading-cli-flags) below).

You can view all available variables by using a log step:

```yaml
steps:
  - type: log
    message: '{runtime_state}'
```
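
The examples on this page use two reference styles: curly braces inside string fields such as `message`, and bare names inside `if` and `check` expressions. The sketch below shows both side by side. The `REPORT_NAME` variable and the replication path are hypothetical, and the `status` field is assumed to behave as in the other examples on this page.

```yaml
env:
  REPORT_NAME: daily_sync                  # hypothetical variable

steps:
  - type: replication
    path: replications/daily_sync.yaml     # hypothetical path
    id: daily_sync

  - type: log
    # Expression: referenced without curly braces
    if: state.daily_sync.status == "success"
    # Template: referenced with curly braces
    message: 'finished run for {env.REPORT_NAME}'
```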

### Reading CLI Flags

Starting in v1.5.22, the `execution.cli_args` map exposes the flags passed to `sling run`, so a pipeline can adapt its behavior based on command-line input. The key is the flag's long name with hyphens replaced by underscores (e.g. `--src-conn` becomes `execution.cli_args.src_conn`).

```yaml
# sling run -p pipeline.yaml --streams tag:transactions --mode full-refresh --limit 5
steps:
  - type: log
    message: 'requested streams => {execution.cli_args.streams}'

  - type: check
    check: execution.cli_args.streams[0] == "tag:transactions"
    message: 'expected the transactions stream selector'
```
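
A flag value can also gate a step through `if`. The following is a minimal sketch that assumes both `--mode` and `--limit` are passed on the command line, so both keys are present in `cli_args`:

```yaml
# sling run -p pipeline.yaml --mode full-refresh --limit 5
steps:
  - type: log
    # Runs only when the pipeline was started with --mode full-refresh
    if: execution.cli_args.mode == "full-refresh"
    message: 'full refresh requested, limit => {execution.cli_args.limit}'
```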

#### Available Keys

Only flags that are actually passed appear in `cli_args`, with one exception: boolean flags always appear and default to `0`. The valid keys are:

`replication`, `pipeline`, `directory`, `streams`, `select`, `primary_key`, `update_key`, `mode`, `limit`, `offset`, `range`, `where`, `src_conn`, `src_stream`, `src_options`, `tgt_conn`, `tgt_object`, `tgt_options`, `columns`, `transforms`, `env`, `cdc_options`, `debug`, `trace`, `stdout`, `examples`

The comma-separated flags `streams`, `primary_key` and `select` are always **arrays**, even when a single value is passed — index them to read individual entries (e.g. `execution.cli_args.streams[0]`).

#### Missing Keys

If a flag was **not** passed, its key is absent from `cli_args`. Referencing a missing key directly in a `check`/`if` expression raises an error (`object has no member "..."`), so use `jmespath` to read a possibly-missing key safely — it returns empty instead of erroring:

```yaml
steps:
  # safe: is_empty() is true when --where was not passed
  - type: check
    check: is_empty(jmespath(execution.cli_args, "where"))

  # or test presence with keys()
  - type: check
    check: contains(keys(execution.cli_args), "streams")
```

In a `log` message (or any rendered template), a missing key simply renders as the un-substituted `{execution.cli_args.where}` literal rather than erroring.
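
The same functions can be used in an `if` condition, so that a step which reads an optional flag only runs when that flag exists. This is a sketch built from the functions shown above and has not been verified against a specific Sling version:

```yaml
steps:
  # Runs only when --where was passed, so the template below resolves
  - type: log
    if: contains(keys(execution.cli_args), "where")
    message: 'filter => {execution.cli_args.where}'

  # Runs only when --where was omitted
  - type: log
    if: is_empty(jmespath(execution.cli_args, "where"))
    message: 'no --where filter was passed'
```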

## Example Pipeline

Here's a complete example that demonstrates various pipeline capabilities:

```yaml
env:
  DATABASE: production
  NOTIFY_URL: https://api.example.com/webhook

steps:
  # Log the start of execution
  - type: log
    message: "Starting pipeline execution"

  # Run a replication
  - type: replication
    path: replications/daily_sync.yaml
    id: daily_sync
    on_failure: warn

  # Validate the results
  - type: check
    check: state.daily_sync.status == "success"
    message: "Daily sync failed"
    on_failure: abort

  # Update status in database
  - type: query
    connection: "{env.DATABASE}"
    query: |
      UPDATE pipeline_status 
      SET last_run = current_timestamp
      WHERE name = 'daily_sync'
    on_failure: warn

  # Send notification
  - type: http
    url: "{env.NOTIFY_URL}"
    method: POST
    payload: |
      {
        "pipeline": "daily_sync",
        "status": "success"
      }

  # Log completion
  - type: log
    message: "Pipeline completed successfully"
```

## Best Practices

1. **Error Handling**: Use appropriate `on_failure` behaviors for each step
2. **Validation**: Include check steps to validate critical conditions
3. **Logging**: Add log steps for better observability
4. **Modularity**: Break down complex operations into multiple steps
5. **Conditions**: Use `if` conditions to control step execution
6. **Variables**: Leverage environment variables and runtime state for dynamic configuration
7. **Identifiers**: Use meaningful `id`s for steps when you need to reference their output later

## Running a Pipeline

You can run a pipeline using the Sling CLI:

```bash
sling run --pipeline path/to/pipeline.yaml
```

