# Databricks

## Setup

The following credentials keys are accepted:

* `host` **(required)** -> The hostname of the Databricks workspace (e.g., `dbc-a1b2c3d4-e5f6.cloud.databricks.com`)
* `token` **(required)** -> The personal access token or password to access the instance
* `warehouse_id` **(required unless `http_path` is set)** -> The SQL warehouse ID to connect to
* `http_path` (optional) -> The HTTP path for the connection (if not using `warehouse_id`)
* `catalog` (optional) -> The initial catalog name to use in the session (default: `hive_metastore`)
* `schema` (optional) -> The initial schema name to use in the session (default: `default`)
* `port` (optional) -> The port number (default: `443`)
* `max_rows` (optional) -> Maximum number of rows fetched per request (default: `10000`)
* `internal_volume` (optional) -> Specifies a custom internal volume to use for bulk operations. If not provided, Sling will attempt to create a volume in the default schema named `SLING_SCHEMA.SLING_STAGING`.
* `timeout` (optional) -> Timeout in seconds for server query execution (no timeout by default)
* `user_agent_entry` (optional) -> Used to identify partners
* `ansi_mode` (optional) -> Boolean for ANSI SQL specification adherence (default: `false`)
* `timezone` (optional) -> Timezone setting (default: `UTC`)

### Using `sling conns`

Here are examples of setting a connection named `DATABRICKS`. We must provide the `type=databricks` property:

{% code overflow="wrap" %}

```bash
# Basic connection with warehouse
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> warehouse_id=<warehouse-id>

# Connection with custom HTTP path
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> http_path=<http-path>

# With catalog and schema
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> warehouse_id=<warehouse-id> catalog=<catalog> schema=<schema>

# Or use a URL
$ sling conns set DATABRICKS url="databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?schema=<schema>"
```

{% endcode %}

### Environment Variable

See [here](https://docs.slingdata.io/sling-cli/environment#dot-env-file-.env.sling) to learn more about the `.env.sling` file.

{% code overflow="wrap" %}

```bash
export DATABRICKS='databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?schema=<schema>'

# use JSON format
export DATABRICKS_CONN='{ "type": "databricks", "host": "<workspace-hostname>", "token": "<access-token>", "warehouse_id": "<warehouse-id>", "schema": "<schema>" }'

# use YAML format (with new lines)
export DATABRICKS='
type: databricks
host: <workspace-hostname>
token: <access-token>
warehouse_id: <warehouse-id>
schema: <schema>
'
```

{% endcode %}
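As a minimal sketch, the URL form can also be assembled from separate shell variables, which keeps the token out of a long hand-edited string. The `DBX_*` variable names and the values below are hypothetical placeholders for illustration only; Sling does not read them, it only reads the final `DATABRICKS` variable.

```bash
# Hypothetical helper variables (not read by Sling); replace the values with your own.
DBX_HOST="dbc-a1b2c3d4-e5f6.cloud.databricks.com"
DBX_TOKEN="example-access-token"
DBX_WAREHOUSE_ID="abc123def456"
DBX_SCHEMA="default"

# Assemble the URL in the same shape as the examples above.
export DATABRICKS="databricks://token:${DBX_TOKEN}@${DBX_HOST}:443/sql/1.0/warehouses/${DBX_WAREHOUSE_ID}?schema=${DBX_SCHEMA}"

# Print everything after the '@' to confirm the variable is set without echoing the token.
echo "${DATABRICKS#*@}"
```

With the variable exported, running `sling conns test DATABRICKS` should report whether Sling can reach the warehouse.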

### Sling Env File YAML

See [here](https://docs.slingdata.io/sling-cli/environment#sling-env-file-env.yaml) to learn more about the Sling `env.yaml` file.

```yaml
connections:
  DATABRICKS:
    type: databricks
    host: <workspace-hostname>
    token: <access-token>
    warehouse_id: <warehouse-id>
    schema: <schema>

  DATABRICKS_URL:
    url: "databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?catalog=<catalog>&schema=<schema>"
```

If you are facing issues connecting, please reach out to us at <support@slingdata.io>, on [Discord](https://discord.gg/q5xtaSNDvp), or open a GitHub issue [here](https://github.com/slingdata-io/sling-cli/issues).
