# Pipelines

A Pipeline in Sling allows you to execute multiple steps in sequence. Each step can be a different type of operation, enabling you to create complex workflows by chaining together various actions like running replications, executing queries, making HTTP requests, and more.

{% hint style="success" %}
Sling Pipelines integrate seamlessly with the [Sling VSCode Extension](/sling-cli/vscode.md). The extension provides schema validation, auto-completion, hover documentation, and diagnostics for your pipeline configurations, making it easier to author and debug complex workflows.
{% endhint %}

## Pipeline Configuration

A pipeline is defined in YAML format with a `steps` key at the root level containing an array of steps. Each step supports the same types and configurations as [Hooks](/concepts/hooks.md).

```yaml
steps:
  - type: log
    message: "Starting pipeline execution"

  - type: replication
    path: path/to/replication.yaml
    id: my_replication

  - type: query
    if: state.my_replication.status == "success"
    connection: my_database
    query: "UPDATE status SET completed = true"

env:
  MY_KEY: VALUE
```

## Available Step Types

Pipelines support all the same types as Hooks:

| Step Type   | Description                                                | Documentation                                      |
| ----------- | ---------------------------------------------------------- | -------------------------------------------------- |
| Check       | Validate conditions and control flow                       | [Check Step](/concepts/hooks/check.md)             |
| Command     | Run any command/process                                    | [Command Step](/concepts/hooks/command.md)         |
| Copy        | Transfer files between local or remote storage connections | [Copy Step](/concepts/hooks/copy.md)               |
| Delete      | Remove files from local or remote storage connections      | [Delete Step](/concepts/hooks/delete.md)           |
| Group       | Run sequences of steps or loop over values                 | [Group Step](/concepts/hooks/group.md)             |
| HTTP        | Make HTTP requests to external services                    | [HTTP Step](/concepts/hooks/http.md)               |
| Inspect     | Inspect a file or folder                                   | [Inspect Step](/concepts/hooks/inspect.md)         |
| List        | List files in folder                                       | [List Step](/concepts/hooks/list.md)               |
| Log         | Output custom messages and create audit trails             | [Log Step](/concepts/hooks/log.md)                 |
| Query       | Execute SQL queries against any defined connection         | [Query Step](/concepts/hooks/query.md)             |
| Read        | Read contents of files from storage connections            | [Read Step](/concepts/hooks/read.md)               |
| Replication | Run a Replication                                          | [Replication Step](/concepts/hooks/replication.md) |
| Routine     | Execute reusable step sequences from external files        | [Routine Step](/concepts/hooks/routine.md)         |
| Store       | Store values for later in-process access                   | [Store Step](/concepts/hooks/store.md)             |
| Write       | Write content to files in storage connections              | [Write Step](/concepts/hooks/write.md)             |
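
Several of these step types can be combined in a single pipeline. The following sketch chains a query, a check, and a log step, using the step syntax shown elsewhere on this page; the connection and table names are placeholders:

```yaml
steps:
  # Count rows in a table (connection and table names are placeholders)
  - type: query
    connection: my_database
    query: "SELECT count(*) AS cnt FROM orders"
    id: row_count

  # Abort the pipeline if the query step did not succeed
  - type: check
    check: state.row_count.status == "success"
    message: "Row count query failed"

  - type: log
    message: "Row count check passed"
```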

## Common Step Properties

Each step shares the same common properties as hooks:

| Property     | Description                                                                       | Required                 |
| ------------ | --------------------------------------------------------------------------------- | ------------------------ |
| `type`       | The type of step (any of the step types listed above, e.g. `query`, `http`, `check`, `copy`, `delete`, `log`, `inspect`) | Yes                      |
| `if`         | Optional condition to determine if the step should execute                        | No                       |
| `id`         | A specific identifier to refer to the step output data                            | No                       |
| `on_failure` | What to do if the step fails (`abort` / `warn` / `quiet` / `skip`)                | No (defaults to `abort`) |
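
These properties work together: `id` names a step so later steps can reference its state, `if` gates execution, and `on_failure` controls error behavior. A minimal sketch (the connection name and table are placeholders):

```yaml
steps:
  - type: query
    id: cleanup
    connection: my_database
    query: "DELETE FROM staging WHERE processed = true"
    on_failure: warn   # log a warning but continue the pipeline

  - type: log
    if: state.cleanup.status == "success"
    message: "Staging cleanup succeeded"
```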

## Variables Available

Pipeline steps have access to the runtime state which includes various variables that can be referenced using curly braces `{variable}`. The available variables include:

* `runtime_state` - All state variables currently available
* `env.*` - All variables defined under `env`
* `timestamp.*` - Parts of the current timestamp
* `steps.*` - Output data from previous steps (referenced by their `id`)

You can view all available variables by using a log step:

```yaml
steps:
  - type: log
    message: '{runtime_state}'
```
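
Variables can be interpolated anywhere string values appear. In the sketch below, `timestamp.date` is an assumed key for illustration; log `{runtime_state}` as shown above to see the exact keys available in your version:

```yaml
env:
  TARGET_TABLE: daily_metrics

steps:
  - type: log
    message: "Refreshing {env.TARGET_TABLE} on {timestamp.date}"  # timestamp.date is an assumed key
```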

## Example Pipeline

Here's a complete example that demonstrates various pipeline capabilities:

```yaml
env:
  DATABASE: production
  NOTIFY_URL: https://api.example.com/webhook

steps:
  # Log the start of execution
  - type: log
    message: "Starting pipeline execution"

  # Run a replication
  - type: replication
    path: replications/daily_sync.yaml
    id: daily_sync
    on_failure: warn

  # Validate the results
  - type: check
    check: state.daily_sync.status == "success"
    message: "Daily sync failed"
    on_failure: abort

  # Update status in database
  - type: query
    connection: "{env.DATABASE}"
    query: |
      UPDATE pipeline_status 
      SET last_run = current_timestamp
      WHERE name = 'daily_sync'
    on_failure: warn

  # Send notification
  - type: http
    url: "{env.NOTIFY_URL}"
    method: POST
    payload: |
      {
        "pipeline": "daily_sync",
        "status": "success"
      }

  # Log completion
  - type: log
    message: "Pipeline completed successfully"
```

## Best Practices

1. **Error Handling**: Use appropriate `on_failure` behaviors for each step
2. **Validation**: Include check steps to validate critical conditions
3. **Logging**: Add log steps for better observability
4. **Modularity**: Break down complex operations into multiple steps
5. **Conditions**: Use `if` conditions to control step execution
6. **Variables**: Leverage environment variables and runtime state for dynamic configuration
7. **Identifiers**: Use meaningful `id`s for steps when you need to reference their output later

## Running a Pipeline

You can run a pipeline using the Sling CLI:

```bash
sling run --pipeline path/to/pipeline.yaml
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.slingdata.io/concepts/pipeline.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
