# Pipelines

A Pipeline in Sling allows you to execute multiple steps in sequence. Each step can be a different type of operation, enabling you to create complex workflows by chaining together various actions like running replications, executing queries, making HTTP requests, and more.

{% hint style="success" %}
Sling Pipelines integrate seamlessly with the [Sling VSCode Extension](https://docs.slingdata.io/sling-cli/vscode). The extension provides schema validation, auto-completion, hover documentation, and diagnostics for your pipeline configurations, making it easier to author and debug complex workflows.
{% endhint %}

## Pipeline Configuration

A pipeline is defined in YAML format with a `steps` key at the root level containing an array of steps. Each step supports the same types and configurations as [Hooks](https://docs.slingdata.io/concepts/hooks).

```yaml
steps:
  - type: log
    message: "Starting pipeline execution"

  - type: replication
    path: path/to/replication.yaml
    id: my_replication

  - type: query
    if: state.my_replication.status == "success"
    connection: my_database
    query: "UPDATE status SET completed = true"

env:
  MY_KEY: VALUE
```

## Available Step Types

Pipelines support all the same types as Hooks:

| Step Type   | Description                                                | Documentation                                                            |
| ----------- | ---------------------------------------------------------- | ------------------------------------------------------------------------ |
| Check       | Validate conditions and control flow                       | [Check Step](https://docs.slingdata.io/concepts/hooks/check)             |
| Command     | Run any command/process                                    | [Command Step](https://docs.slingdata.io/concepts/hooks/command)         |
| Copy        | Transfer files between local or remote storage connections | [Copy Step](https://docs.slingdata.io/concepts/hooks/copy)               |
| Delete      | Remove files from local or remote storage connections      | [Delete Step](https://docs.slingdata.io/concepts/hooks/delete)           |
| Group       | Run sequences of steps or loop over values                 | [Group Step](https://docs.slingdata.io/concepts/hooks/group)             |
| HTTP        | Make HTTP requests to external services                    | [HTTP Step](https://docs.slingdata.io/concepts/hooks/http)               |
| Inspect     | Inspect a file or folder                                   | [Inspect Step](https://docs.slingdata.io/concepts/hooks/inspect)         |
| List        | List files in a folder                                     | [List Step](https://docs.slingdata.io/concepts/hooks/list)               |
| Log         | Output custom messages and create audit trails             | [Log Step](https://docs.slingdata.io/concepts/hooks/log)                 |
| Query       | Execute SQL queries against any defined connection         | [Query Step](https://docs.slingdata.io/concepts/hooks/query)             |
| Read        | Read contents of files from storage connections            | [Read Step](https://docs.slingdata.io/concepts/hooks/read)               |
| Replication | Run a Replication                                          | [Replication Step](https://docs.slingdata.io/concepts/hooks/replication) |
| Routine     | Execute reusable step sequences from external files        | [Routine Step](https://docs.slingdata.io/concepts/hooks/routine)         |
| Store       | Store values for later in-process access                   | [Store Step](https://docs.slingdata.io/concepts/hooks/store)             |
| Write       | Write content to files in storage connections              | [Write Step](https://docs.slingdata.io/concepts/hooks/write)             |

## Common Step Properties

Each step shares the same common properties as hooks:

| Property     | Description                                                                       | Required                 |
| ------------ | --------------------------------------------------------------------------------- | ------------------------ |
| `type`       | The type of step (any of the step types listed above)                             | Yes                      |
| `if`         | Optional condition that determines whether the step executes                      | No                       |
| `id`         | An identifier used to reference the step's output data                            | No                       |
| `on_failure` | What to do if the step fails (`abort` / `warn` / `quiet` / `skip`)                | No (defaults to `abort`) |
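
These common properties can be combined on a single step. The sketch below gates a query on a prior replication's status and downgrades query failures to warnings (the connection name and step ids are illustrative):

```yaml
steps:
  - type: replication
    path: path/to/replication.yaml
    id: nightly_load

  # only runs if the replication succeeded; a failure here only warns
  - type: query
    if: state.nightly_load.status == "success"
    id: mark_done
    on_failure: warn
    connection: my_database
    query: "UPDATE status SET completed = true"
```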

## Variables Available

Pipeline steps have access to the runtime state which includes various variables that can be referenced using curly braces `{variable}`. The available variables include:

* `runtime_state` - The full runtime state, including all variables below
* `env.*` - All variables defined in the `env` section
* `timestamp.*` - Parts of the current timestamp
* `steps.*` - Output data from previous steps (referenced by their `id`)

You can view all available variables by using a log step:

```yaml
steps:
  - type: log
    message: '{runtime_state}'
```
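
Variables can also be interpolated individually. The sketch below references an `env` value and a previous step's output via its `id`; the exact fields each step type exposes are listed on its hook page, so the `{steps.row_count}` reference here is illustrative:

```yaml
env:
  TARGET_TABLE: orders

steps:
  - type: query
    connection: my_database
    query: "SELECT count(*) AS cnt FROM {env.TARGET_TABLE}"
    id: row_count

  - type: log
    message: "Finished counting {env.TARGET_TABLE}: {steps.row_count}"
```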

## Example Pipeline

Here's a complete example that demonstrates various pipeline capabilities:

```yaml
env:
  DATABASE: production
  NOTIFY_URL: https://api.example.com/webhook

steps:
  # Log the start of execution
  - type: log
    message: "Starting pipeline execution"

  # Run a replication
  - type: replication
    path: replications/daily_sync.yaml
    id: daily_sync
    on_failure: warn

  # Validate the results
  - type: check
    check: state.daily_sync.status == "success"
    message: "Daily sync failed"
    on_failure: abort

  # Update status in database
  - type: query
    connection: "{env.DATABASE}"
    query: |
      UPDATE pipeline_status 
      SET last_run = current_timestamp
      WHERE name = 'daily_sync'
    on_failure: warn

  # Send notification
  - type: http
    url: "{env.NOTIFY_URL}"
    method: POST
    payload: |
      {
        "pipeline": "daily_sync",
        "status": "success"
      }

  # Log completion
  - type: log
    message: "Pipeline completed successfully"
```

## Best Practices

1. **Error Handling**: Use appropriate `on_failure` behaviors for each step
2. **Validation**: Include check steps to validate critical conditions
3. **Logging**: Add log steps for better observability
4. **Modularity**: Break down complex operations into multiple steps
5. **Conditions**: Use `if` conditions to control step execution
6. **Variables**: Leverage environment variables and runtime state for dynamic configuration
7. **Identifiers**: Use meaningful `id`s for steps when you need to reference their output later

## Running a Pipeline

You can run a pipeline using the Sling CLI:

```bash
sling run --pipeline path/to/pipeline.yaml
```
