Databricks
Connect & Ingest data from / to a Databricks database
Setup
The following credential keys are accepted:
host (required) -> The hostname of the Databricks workspace (e.g., dbc-a1b2c3d4-e5f6.cloud.databricks.com)
token (required) -> The personal access token or password to access the instance
warehouse_id (required) -> The SQL warehouse ID to connect to
http_path (optional) -> The HTTP path for the connection (if not using warehouse_id)
catalog (optional) -> The initial catalog name to use in the session (default: hive_metastore)
schema (optional) -> The initial schema name to use in the session (default: default)
port (optional) -> The port number (default: 443)
max_rows (optional) -> Maximum number of rows fetched per request (default: 10000)
timeout (optional) -> Timeout in seconds for server query execution (no timeout by default)
user_agent_entry (optional) -> Used to identify partners
ansi_mode (optional) -> Boolean for ANSI SQL specification adherence (default: false)
timezone (optional) -> Timezone setting (default: UTC)
Using sling conns
Here are examples of setting a connection named DATABRICKS. We must provide the type=databricks property:
# Basic connection with warehouse
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> warehouse_id=<warehouse-id>
# Connection with custom HTTP path
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> http_path=<http-path>
# With catalog and schema
$ sling conns set DATABRICKS type=databricks host=<workspace-hostname> token=<access-token> warehouse_id=<warehouse-id> catalog=<catalog> schema=<schema>
# Or use url
$ sling conns set DATABRICKS url="databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?schema=<schema>"
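Once a connection is set, it can be worth checking that it actually works before using it in a replication. The sketch below wraps the `sling conns test` subcommand in a small shell function. The function name `check_conn` and the exit code 127 for a missing CLI are illustrative choices, not part of Sling.

```shell
# Hypothetical wrapper around the `sling conns test` subcommand.
# Returns 127 if the sling CLI is not on PATH, otherwise sling's own exit code.
check_conn() {
  if ! command -v sling >/dev/null 2>&1; then
    echo "sling CLI not found on PATH" >&2
    return 127
  fi
  # Attempts to connect using the credentials stored under the given name.
  sling conns test "$1"
}
```

Calling `check_conn DATABRICKS` after one of the `sling conns set` commands above should report whether the stored credentials can reach the workspace.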
Environment Variable
export DATABRICKS='databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?schema=<schema>'
# use JSON format
export DATABRICKS_CONN='{ "type": "databricks", "host": "<workspace-hostname>", "token": "<access-token>", "warehouse_id": "<warehouse-id>", "schema": "<schema>" }'
# use YAML format (with new lines)
export DATABRICKS='
type: databricks
host: <workspace-hostname>
token: <access-token>
warehouse_id: <warehouse-id>
schema: <schema>
'
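When the host, token and warehouse ID already live in separate variables (for example in a CI secret store), the URL form can be assembled rather than typed by hand. The following is a minimal sketch that follows the URL pattern shown above. The helper name `databricks_url` and the sample values are made up for illustration, and it assumes the token contains no characters that would need percent-encoding.

```shell
# Hypothetical helper: assemble a Databricks connection URL from its parts.
# Arguments: host, token, warehouse ID, schema.
databricks_url() {
  host="$1"; token="$2"; warehouse_id="$3"; schema="$4"
  printf 'databricks://token:%s@%s:443/sql/1.0/warehouses/%s?schema=%s\n' \
    "$token" "$host" "$warehouse_id" "$schema"
}

# Sample values are placeholders; substitute your own workspace details.
DATABRICKS="$(databricks_url 'dbc-a1b2c3d4-e5f6.cloud.databricks.com' 'dapiEXAMPLE' 'abc123' 'default')"
export DATABRICKS
echo "$DATABRICKS"
```

Keeping the pieces separate makes it easier to rotate the token without touching the rest of the URL.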
Sling Env File YAML
See here to learn more about the sling env.yaml file.
connections:
  DATABRICKS:
    type: databricks
    host: <workspace-hostname>
    token: <access-token>
    warehouse_id: <warehouse-id>
    schema: <schema>
  DATABRICKS_URL:
    url: "databricks://token:<access-token>@<workspace-hostname>:443/sql/1.0/warehouses/<warehouse-id>?catalog=<catalog>&schema=<schema>"
If you are facing issues connecting, please reach out to us at [email protected], on Discord, or open a GitHub issue here.