# Fabric

Microsoft Fabric is Microsoft's cloud data warehouse platform, which uses OneLake storage for data persistence. Sling provides optimized bulk loading for Fabric via OneLake staging.

## Setup

The following credentials keys are accepted:

* `host` **(required)** -> The hostname of the Fabric warehouse (e.g., `xxx.datawarehouse.fabric.microsoft.com`)
* `database` **(required)** -> The warehouse or database name
* `user` (optional) -> The username to access the warehouse (not required when using Azure AD auth)
* `password` (optional) -> The password to access the warehouse (not required when using Azure AD auth)
* `port` (optional) -> The port of the instance. Default is `1433`.
* `schema` (optional) -> The default schema to use
* `fedauth` (optional) -> The Azure Active Directory authentication string. See [here](https://github.com/microsoft/go-mssqldb?tab=readme-ov-file#azure-active-directory-authentication) for more details. Accepted values: `ActiveDirectoryDefault`, `ActiveDirectoryIntegrated`, `ActiveDirectoryPassword`, `ActiveDirectoryInteractive`, `ActiveDirectoryMSI`, `ActiveDirectoryManagedIdentity`, `ActiveDirectoryApplication`, `ActiveDirectoryServicePrincipal`, `ActiveDirectoryServicePrincipalAccessToken`, `ActiveDirectoryDeviceCode`, `ActiveDirectoryAzCli`.
* `client_id` (optional) -> Application ID of an Azure service principal. Can also set via env var `AZURE_CLIENT_ID`
* `tenant_id` (optional) -> ID of the application's Microsoft Entra tenant. Can also set via env var `AZURE_TENANT_ID`
* `client_secret` (optional) -> A client secret generated for the App Registration. Can also set via env var `AZURE_CLIENT_SECRET`
* `client_certificate_path` (optional) -> Path to a PEM or PKCS12 certificate file including private key. Can also set via env var `AZURE_CLIENT_CERTIFICATE_PATH`
* `client_certificate_password` (optional) -> Password protecting the certificate file (PFX/PKCS12 only). Can also set via env var `AZURE_CLIENT_CERTIFICATE_PASSWORD`
* `ssh_tunnel` (optional) -> The URL of the SSH server you would like to use as a tunnel (example `ssh://user:password@db.host:22`)
* `ssh_private_key` (optional) -> The private key to use to access an SSH server (raw string or path to file).
* `ssh_passphrase` (optional) -> The passphrase to use to access an SSH server.

### ABFS / OneLake Configuration

For **bulk import operations**, Fabric uses OneLake staging via Azure Blob File System (ABFS). The following properties are required for bulk operations:

* `abfs_endpoint` **(required for bulk)** -> The OneLake endpoint (e.g., `onelake.dfs.fabric.microsoft.com`) or Azure Data Lake Storage Gen2 endpoint (e.g., `myaccount.dfs.core.windows.net`)
* `abfs_filesystem` **(required for bulk)** -> The workspace UUID for staging files (OneLake) or container name (ADLS Gen2)
* `abfs_parent` **(required for bulk)** -> The lakehouse UUID with the folder path for staging files. This is typically a UUID with the `/Files` path, e.g.: `a6682eca-b677-40d0-85a0-71665example/Files`
* `format` (optional) -> File format for staging. Accepts `parquet` (default) or `csv`
* `copy_into_endpoint` (optional) -> Override the endpoint used in the COPY INTO SQL command. Useful when your staging endpoint differs from what Fabric Warehouse expects for reading. See [Troubleshooting](#copy-into-endpoint-issues) for details.

#### How to obtain ABFS values

Create a Lakehouse, then open **Properties** on its `Files` folder and copy the URL.

<div align="center"><img src="https://3453272330-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M93cpHl7B7NPZlDrubS%2Fuploads%2Fgit-blob-412f49b4b94b03355085ab36162fa4692b3efec2%2Ffabric-menu.png?alt=media" alt="Click Properties under the &#x60;Files&#x60; menu" width="400"></div>

<div align="center"><img src="https://3453272330-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M93cpHl7B7NPZlDrubS%2Fuploads%2Fgit-blob-c10e1fa6558dde5fbed7b31a034b733fbfa16a72%2Ffabric-properties.png?alt=media" alt="Copy the URL" width="400"></div>

You will get a URL such as: `https://onelake.dfs.fabric.microsoft.com/8e5f41f1-2677-4e95-78d8-7cd1eexample/a6682eca-b677-40d0-85a0-71665example/Files`

The `abfs_filesystem` would be `8e5f41f1-2677-4e95-78d8-7cd1eexample`. The `abfs_parent` would be `a6682eca-b677-40d0-85a0-71665example/Files`.
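The split can be sketched with plain shell parameter expansion, using the placeholder URL above:

```bash
# Placeholder OneLake URL copied from the Lakehouse `Files` properties
url="https://onelake.dfs.fabric.microsoft.com/8e5f41f1-2677-4e95-78d8-7cd1eexample/a6682eca-b677-40d0-85a0-71665example/Files"

path="${url#https://}"        # drop the scheme
abfs_endpoint="${path%%/*}"   # host -> abfs_endpoint
rest="${path#*/}"
abfs_filesystem="${rest%%/*}" # workspace UUID -> abfs_filesystem
abfs_parent="${rest#*/}"      # lakehouse UUID + /Files -> abfs_parent

echo "abfs_endpoint:   $abfs_endpoint"
echo "abfs_filesystem: $abfs_filesystem"
echo "abfs_parent:     $abfs_parent"
```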

**Authentication for ABFS** (choose one):

* `account_key` -> Storage account key for OneLake access
* `sas_svc_url` -> Shared Access Signature URL for OneLake
* `client_id`, `tenant_id`, `client_secret` -> Azure service principal with client secret
* `client_id`, `tenant_id`, `client_certificate_path` -> Azure service principal with certificate

{% hint style="info" %}
If ABFS properties are not provided, Sling will fall back to standard INSERT statements, which are much slower for large datasets. For optimal performance with bulk data loading, configure OneLake staging.
{% endhint %}

### Azure Environment Variables

When using service principal authentication (for both the SQL connection via `fedauth` and for ABFS/OneLake staging), Sling maps connection properties to the standard Azure SDK environment variables at runtime. This is how the Azure `DefaultAzureCredential` chain picks up your credentials.

| Connection Property           | Azure Env Var                       | Purpose                                              |
| ----------------------------- | ----------------------------------- | ---------------------------------------------------- |
| `client_id`                   | `AZURE_CLIENT_ID`                   | Application (client) ID of the App Registration      |
| `tenant_id`                   | `AZURE_TENANT_ID`                   | Microsoft Entra tenant (directory) ID                |
| `client_secret`               | `AZURE_CLIENT_SECRET`               | Client secret for the App Registration               |
| `client_certificate_path`     | `AZURE_CLIENT_CERTIFICATE_PATH`     | Path to PEM or PFX certificate file with private key |
| `client_certificate_password` | `AZURE_CLIENT_CERTIFICATE_PASSWORD` | Password for PFX certificate file (optional)         |

You can either:

1. Set these as connection properties in `env.yaml` (recommended) — Sling maps them to `AZURE_*` env vars automatically
2. Set the `AZURE_*` env vars directly in your shell environment — they will be picked up by `ActiveDirectoryDefault`
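For the second option, the shell setup is just the three exports (placeholder values shown; substitute your App Registration's values):

```bash
export AZURE_CLIENT_ID="your-client-id"
export AZURE_TENANT_ID="your-tenant-id"
export AZURE_CLIENT_SECRET="your-client-secret"
```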

{% hint style="warning" %}
When using `fedauth: ActiveDirectoryDefault`, the Azure SDK tries credentials in order: environment variables, workload identity, managed identity, Azure CLI, Azure Developer CLI. Make sure the appropriate credentials are available for your environment.
{% endhint %}

### Additional Parameters

Sling uses the `go-mssqldb` library and thus accepts any of the parameters listed [here](https://github.com/microsoft/go-mssqldb#connection-parameters-and-dsn). Some parameters that may be of interest:

* `encrypt` -> `strict`, `disable`, `false` or `true` - whether data between client and server is encrypted. For Fabric, typically set to `true`.
* `log` -> logging flags (a bit mask; values such as `1`, `2`, `4`, `8`, `16`, `32` can be combined, e.g. `63` for full logging).
* `trusted_connection` -> `true` or `false` - whether to connect with a trusted connection using integrated security
* `trust_server_certificate` -> `true` or `false` - whether the server certificate is checked. For Fabric, typically set to `false`.
* `certificate` -> The file that contains the public key certificate of the CA that signed the server certificate.
* `hostname_in_certificate` -> Specifies the Common Name (CN) in the server certificate. Default value is the server host.
* `server_spn` -> The Kerberos SPN (Service Principal Name) for the server. Default is `MSSQLSvc/host:port`.
* `driver` -> A way to override the sql driver to connect with. Default is `sqlserver`. Other option is `azuresql`.

### Kerberos Parameters

* `authenticator` - set this to `krb5` to enable Kerberos authentication. If not set, the default provider is `ntlm` on Unix and `winsspi` on Windows.
* `krb5_config_file` (optional) - path to kerberos configuration file. Defaults to `/etc/krb5.conf`. Can also be set using `KRB5_CONFIG` environment variable.
* `krb5_realm` (required with keytab and raw credentials) - Domain name for kerberos authentication. Omit this parameter if the realm is part of the user name like `username@REALM`.
* `krb5_keytab_file` - path to Keytab file. Can also be set using environment variable `KRB5_KTNAME`. If no parameter or environment variable is set, the `DefaultClientKeytabName` value from the krb5 config file is used.
* `krb5_cred_cache_file` - path to Credential cache. Can also be set using environment variable `KRB5CCNAME`.
* `krb5_dns_lookup_kdc` - Optional in all contexts. Whether to look up KDCs in DNS. Boolean. Default is `true`.
* `krb5_udp_preference_limit` - Optional in all contexts. Integer; `1` means always use TCP. MIT krb5 defaults this to 1465 and rejects values above 32700. Default is `1`.

Sling supports Kerberos authentication via three methods. See [here](https://github.com/microsoft/go-mssqldb?tab=readme-ov-file#kerberos-parameters) for more details.

* Keytab - Specify the username, keytab file, the krb5.conf file, and realm:

  ```yaml
  user: myuser
  authenticator: krb5
  krb5_realm: domain.com
  krb5_config_file: /etc/krb5.conf
  krb5_keytab_file: ~/user.keytab
  ```
* Credential Cache - Specify the krb5.conf file path and credential cache file path:

  ```yaml
  authenticator: krb5
  krb5_config_file: /etc/krb5.conf
  krb5_cred_cache_file: ~/user_cached_creds
  ```
* Raw credentials - Specify the krb5.conf file, username, password, and realm (the username and password come from the standard `user` and `password` properties):

  ```yaml
  user: myuser
  password: mypassword
  authenticator: krb5
  krb5_realm: company.com
  krb5_config_file: /etc/krb5.conf
  ```

***

## Connection Examples

### Using `sling conns`

Here are examples of setting a connection named `FABRIC`. We must provide the `type=fabric` property:

{% code overflow="wrap" %}

```bash
# Basic connection with Azure AD authentication
$ sling conns set FABRIC type=fabric host=myworkspace-uuid.datawarehouse.fabric.microsoft.com database=mywarehouse fedauth=ActiveDirectoryAzCli encrypt=true trust_server_certificate=false

# With OneLake bulk staging
$ sling conns set FABRIC type=fabric host=myworkspace-uuid.datawarehouse.fabric.microsoft.com database=mywarehouse fedauth=ActiveDirectoryAzCli encrypt=true trust_server_certificate=false abfs_endpoint=onelake.dfs.fabric.microsoft.com abfs_filesystem=filesystem-uuid abfs_parent=lakehouse-uuid/Files

# Or use url
$ sling conns set FABRIC url="fabric://myworkspace-uuid.datawarehouse.fabric.microsoft.com?database=mywarehouse&encrypt=true&trust_server_certificate=false&fedauth=ActiveDirectoryAzCli"
```

{% endcode %}

### Environment Variable

See [here](https://docs.slingdata.io/sling-cli/environment#dot-env-file-.env.sling) to learn more about the `.env.sling` file.

{% code overflow="wrap" %}

```bash
# Basic connection
export FABRIC='fabric://myworkspace-uuid.datawarehouse.fabric.microsoft.com?database=mywarehouse&encrypt=true&trust_server_certificate=false&fedauth=ActiveDirectoryAzCli'

# With structured format
export FABRIC='{
  type: fabric,
  host: "myworkspace-uuid.datawarehouse.fabric.microsoft.com",
  database: "mywarehouse",
  trust_server_certificate: false,
  encrypt: true,
  fedauth: ActiveDirectoryAzCli
}'

# With OneLake bulk staging
export FABRIC='{
  type: fabric,
  host: "myworkspace-uuid.datawarehouse.fabric.microsoft.com",
  database: "mywarehouse",
  trust_server_certificate: false,
  encrypt: true,
  fedauth: ActiveDirectoryAzCli,
  account_key: "your_account_key_here",
  abfs_endpoint: "onelake.dfs.fabric.microsoft.com",
  abfs_filesystem: "filesystem-uuid",
  abfs_parent: "lakehouse-uuid/Files"
}'
```

{% endcode %}

### Sling Env File YAML

See [here](https://docs.slingdata.io/sling-cli/environment#sling-env-file-env.yaml) to learn more about the sling `env.yaml` file.

```yaml
connections:
  FABRIC:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryAzCli  # after running `az login` using Azure CLI Tool

  FABRIC_WITH_BULK_STAGING:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryAzCli

    # OneLake staging for bulk operations
    abfs_endpoint: onelake.dfs.fabric.microsoft.com
    abfs_filesystem: filesystem-uuid
    abfs_parent: lakehouse-uuid/Files
    format: parquet  # Optional: parquet (default) or csv

  FABRIC_WITH_SERVICE_PRINCIPAL:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryServicePrincipal
    # Service principal credentials
    client_id: your-client-id
    tenant_id: your-tenant-id
    client_secret: your-client-secret

    # OneLake staging
    abfs_endpoint: onelake.dfs.fabric.microsoft.com
    abfs_filesystem: filesystem-uuid
    abfs_parent: lakehouse-uuid/Files

  FABRIC_WITH_CERTIFICATE:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryServicePrincipal
    # Service principal with certificate
    client_id: your-client-id
    tenant_id: your-tenant-id
    client_certificate_path: /path/to/cert.pem

    # OneLake staging
    abfs_endpoint: onelake.dfs.fabric.microsoft.com
    abfs_filesystem: filesystem-uuid
    abfs_parent: lakehouse-uuid/Files

  FABRIC_WITH_ACCOUNT_KEY:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryDefault
    account_key: your_account_key_here
    
    abfs_endpoint: onelake.dfs.fabric.microsoft.com
    abfs_filesystem: filesystem-uuid
    abfs_parent: lakehouse-uuid/Files

  FABRIC_WITH_SAS:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryDefault
    sas_svc_url: "https://account.dfs.fabric.microsoft.com/?sv=2021-06-08&ss=..."

    # OneLake staging
    abfs_endpoint: onelake.dfs.fabric.microsoft.com
    abfs_filesystem: filesystem-uuid
    abfs_parent: lakehouse-uuid/Files

  # Using Azure Data Lake Storage Gen2 instead of OneLake
  FABRIC_WITH_ADLS_GEN2:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    encrypt: true
    trust_server_certificate: false
    fedauth: ActiveDirectoryServicePrincipal
    client_id: your-client-id
    tenant_id: your-tenant-id
    client_secret: your-client-secret

    # ADLS Gen2 staging (Sling auto-converts DFS to Blob for COPY INTO)
    abfs_endpoint: myaccount.dfs.core.windows.net
    abfs_filesystem: my-container
    abfs_parent: sling-staging

  FABRIC_KERBEROS:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    user: myuser
    password: mypassword
    authenticator: krb5
    krb5_config_file: /etc/krb5.conf
    krb5_realm: domain.com
    krb5_keytab_file: ~/MyUserName.keytab
```

## Bulk Import Operations

Fabric supports high-performance bulk loading using OneLake staging. When properly configured with ABFS properties, Sling will:

1. **Stage data to OneLake** - Write data files (Parquet or CSV) to your specified OneLake location
2. **Execute COPY INTO** - Load data from OneLake into the Fabric warehouse using the optimized COPY INTO command
3. **Clean up** - Automatically remove staging files after successful load

### Performance Tips

* **Use Parquet format** - Parquet (the default) compresses better and loads faster than CSV
* **Configure OneLake staging** - Essential for large datasets (100K+ rows)
* **Use Azure AD authentication** - Recommended for production environments
* **Set appropriate file chunk sizes** - Use `file_max_rows` property to control staging file size (default: 500,000 rows)

### Example Replication with Bulk Loading

```yaml
source: POSTGRES
target: FABRIC

defaults:
  mode: full-refresh

streams:
  public.large_table:
    object: dbo.large_table
```

## Important Notes

* **Connection Type**: Use `type: fabric` (not `sqlserver`) for Fabric-specific optimizations
* **BCP Not Supported**: Unlike SQL Server, Fabric does not use the BCP utility for bulk loading
* **OneLake Staging**: Required for optimal bulk load performance
* **Azure AD Recommended**: Most Fabric deployments use Azure AD authentication
* **Endpoint Pattern**: Fabric warehouses use `.datawarehouse.fabric.microsoft.com` endpoints

## Troubleshooting

### Slow Bulk Loads

If bulk loads are slow, ensure ABFS properties are configured:

```yaml
abfs_endpoint: onelake.dfs.fabric.microsoft.com
abfs_filesystem: filesystem-uuid
abfs_parent: lakehouse-uuid/Files
```

Without these, Sling falls back to row-by-row inserts.

### Authentication Issues

For Azure AD authentication, ensure you've run:

```bash
az login
```

Or provide service principal credentials if using `ActiveDirectoryServicePrincipal`.

### Connection Errors

Verify your connection with:

```bash
sling conns test FABRIC --debug
```

### COPY INTO Endpoint Issues

When using Azure Data Lake Storage Gen2 (ADLS Gen2) as staging instead of OneLake, you may encounter this error:

```
mssql: Content of directory on path 'https://xxx.dfs.core.windows.net/...' cannot be listed.
```

This occurs because:

1. **File uploads** via ABFS require the `.dfs.core.windows.net` endpoint
2. **COPY INTO command** often works better with the `.blob.core.windows.net` endpoint (especially with Service Principal authentication)

**Manual Override**: If you need a different endpoint for COPY INTO, use the `copy_into_endpoint` property:

```yaml
connections:
  FABRIC:
    type: fabric
    host: myworkspace-uuid.datawarehouse.fabric.microsoft.com
    database: mywarehouse
    fedauth: ActiveDirectoryServicePrincipal
    client_id: your-client-id
    tenant_id: your-tenant-id
    client_secret: your-client-secret

    # ADLS Gen2 staging
    abfs_endpoint: myaccount.dfs.core.windows.net
    abfs_filesystem: my-container
    abfs_parent: staging/folder

    # Override endpoint for COPY INTO command (optional)
    copy_into_endpoint: myaccount.blob.core.windows.net
```

If you are facing issues connecting, please reach out to us at <support@slingdata.io>, on [discord](https://discord.gg/q5xtaSNDvp), or open a GitHub issue [here](https://github.com/slingdata-io/sling-cli/issues).
