Databricks Integration Setup
Honeydew requires access to Databricks in order to operate. There are two options for setting up Databricks access: use central org-level connection parameters, or map your individual Databricks user credentials to Honeydew. If you use a central org-level connection, it is advised to create a new dedicated service principal for the Honeydew integration. The following Databricks connection parameters are required for Honeydew setup:
- Server Hostname
- HTTP Path
- Catalog (for Unity Catalog environments)
Authentication Methods
Honeydew supports the following authentication methods for Databricks:

OAuth (M2M) authentication
This is the recommended method for org-level service accounts. OAuth provides stronger security through automated token generation and refresh, with each access token valid for one hour. For this method, you will need to provide a Client ID and Client Secret.

1. Create a service principal
If you do not already have a service principal, create one in your Databricks workspace:
- Log in to your Databricks workspace as an account admin
- Navigate to Settings > Identity and access
- Click the Service principals tab
- Click Add service principal
- Enter a name for the service principal (e.g., “Honeydew Integration”)
- Click Add
2. Generate OAuth client credentials
To generate OAuth credentials for the service principal:
- In the service principal details page, click the OAuth tab
- Click Generate secret
- Copy the Client ID and Client Secret immediately (the secret will not be shown again)
- Store these credentials securely
3. Grant workspace access
Grant the service principal access to your Databricks workspace:
- Navigate to Settings > Identity and access > Service principals
- Select your service principal
- Click Workspace access
- Add the service principal to the desired workspace(s)
4. Configure the Databricks connection in Honeydew
In the Honeydew App settings page, configure the Databricks connection using the Client ID and Client Secret from the previous step, along with the Server Hostname and HTTP Path.
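If you want to confirm that the service principal credentials are valid before entering them in Honeydew, you can test them with the Databricks SQL Connector for Python. The sketch below is a minimal, illustrative check; the hostname, HTTP path, client ID, and client secret are placeholders for your own values.

```python
# Minimal sketch (not part of Honeydew): verify OAuth M2M credentials
# for a Databricks service principal before configuring them in Honeydew.
# Requires: pip install databricks-sql-connector databricks-sdk
from databricks import sql
from databricks.sdk.core import Config, oauth_service_principal

SERVER_HOSTNAME = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"  # placeholder Server Hostname
HTTP_PATH = "<http-path-from-connection-details>"           # placeholder HTTP Path

def credential_provider():
    # Exchanges the Client ID / Client Secret for a short-lived OAuth token
    config = Config(
        host=f"https://{SERVER_HOSTNAME}",
        client_id="<service-principal-client-id>",
        client_secret="<service-principal-client-secret>",
    )
    return oauth_service_principal(config)

with sql.connect(
    server_hostname=SERVER_HOSTNAME,
    http_path=HTTP_PATH,
    credentials_provider=credential_provider,
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_user()")
        print(cursor.fetchone())  # should print the service principal identity
```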
Personal Access Token (PAT) authentication
For this method, you will need to provide a generated access token.

1. Generate a PAT in Databricks
To create a personal access token:
- Log in to your Databricks workspace
- Click your username in the top right corner
- Select Settings > Developer
- Click Access tokens
- Click Generate new token
- Enter a comment to describe the token’s purpose (e.g., “Honeydew”)
- Optionally set a lifetime for the token; leave it blank for no expiration (availability of non-expiring tokens depends on your Databricks workspace policy)
- Click Generate
- Copy the token immediately (it will not be shown again)
2. Configure the PAT in Honeydew
In the Honeydew App settings page, configure the Databricks connection using the Personal Access Token generated in Databricks, along with the Server Hostname and HTTP Path.
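As with OAuth, you can optionally confirm the token works before entering it in Honeydew. The sketch below uses the Databricks SQL Connector for Python with placeholder values.

```python
# Minimal sketch (not part of Honeydew): verify a Personal Access Token.
# Requires: pip install databricks-sql-connector
# All values are placeholders for your own connection parameters.
from databricks import sql

with sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",  # Server Hostname
    http_path="<http-path-from-connection-details>",           # HTTP Path
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_user()")
        print(cursor.fetchone())  # should print the user that owns the token
```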
OAuth User Authentication
This method allows individual users to authenticate with their Databricks credentials. Each user will need to authorize Honeydew to access their Databricks account using OAuth.

OAuth User Authentication requires account-level applications with redirect URIs. Honeydew will provide the redirect URIs when this method is enabled.
Connection Parameters
Server Hostname and HTTP Path
To get the connection details for your Databricks SQL warehouse or cluster:
- Log in to your Databricks workspace
- Navigate to SQL > SQL Warehouses (or Compute for clusters)
- Click the target warehouse or cluster name
- Go to the Connection Details tab
- Copy the Server hostname and HTTP path
The HTTP path will look similar to:
sql/protocolv1/o/{workspace_id}/{warehouse_id}
Unity Catalog
If you are using Unity Catalog, you will need to specify:
- Catalog - the catalog where Honeydew will access and deploy dynamic datasets
- Schema - the schema where Honeydew will deploy dynamic datasets as views or tables
- Dev Catalog - the catalog for dev branch deployments (optional, defaults to main catalog)
- Dev Schema - the schema for dev branch deployments
Unity Catalog uses a three-level namespace: catalog.schema.table.
Ensure your service principal or user has appropriate permissions to access these resources.

Allowing Honeydew Client IP Addresses
If you have IP-based access restrictions in Databricks, add the IP addresses displayed on the Databricks connection screen in the Honeydew App settings page to the “Allowed IP Addresses” list. For the Honeydew Cloud deployment, the following IP addresses are used:
- 34.86.209.90
- 34.145.147.92
Permissions
Honeydew does not extract or store your data. It only reads schema metadata and executes SQL queries inside your Databricks environment. You can find more security-related information here.

Required Permissions
If using a service principal or user account, the following permissions are required:

For Unity Catalog Environments
- USE CATALOG on catalogs used in the semantic layer
- USE SCHEMA on schemas used in the semantic layer
- SELECT on tables/views used in the semantic layer
- CREATE TABLE and CREATE VIEW on the schema where dynamic datasets will be deployed
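To make these grants concrete, here is a hypothetical sketch that applies them through the Databricks SQL Connector for Python using an admin connection. The catalog name, schema name, and service principal identifier are placeholders; the same GRANT statements can be run directly in a Databricks SQL editor, and you should adjust and extend them (for example, per-table SELECT grants or view creation) to match your environment.

```python
# Illustration only: apply the Unity Catalog grants listed above.
# Catalog, schema, and principal names are hypothetical placeholders.
# Run with an admin/owner connection; requires databricks-sql-connector.
from databricks import sql

CATALOG = "analytics"                      # catalog used in the semantic layer
SCHEMA = "analytics.honeydew"              # schema where dynamic datasets are deployed
PRINCIPAL = "`honeydew-integration-sp`"    # service principal (name or application ID)

grants = [
    f"GRANT USE CATALOG ON CATALOG {CATALOG} TO {PRINCIPAL}",
    f"GRANT USE SCHEMA ON SCHEMA {SCHEMA} TO {PRINCIPAL}",
    f"GRANT SELECT ON SCHEMA {SCHEMA} TO {PRINCIPAL}",       # or grant per table/view
    f"GRANT CREATE TABLE ON SCHEMA {SCHEMA} TO {PRINCIPAL}",
]

with sql.connect(
    server_hostname="<server-hostname>",
    http_path="<http-path>",
    access_token="<admin-personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        for statement in grants:
            cursor.execute(statement)
```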
Tracking Honeydew Queries in Databricks
You can track and monitor queries executed by Honeydew in Databricks using query tags and the system tables.

Query Tag Format
All Honeydew queries include a query tag with the following JSON format:
- application: Always set to “Honeydew”
- workspace: The Honeydew workspace name
- branch: The Honeydew workspace branch being used (e.g., “dev”, “prod”)
- user: The Honeydew user identifier (usually an email address)
- client: The client name, usually “Honeydew Server” for server-side operations
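For illustration, a tag with hypothetical values might look like the example below; every value other than the application name is a placeholder.

```python
# Hypothetical example of a Honeydew query tag (illustrative values only).
import json

example_tag = {
    "application": "Honeydew",        # always "Honeydew"
    "workspace": "sales_analytics",   # Honeydew workspace name
    "branch": "prod",                 # Honeydew workspace branch
    "user": "jane.doe@example.com",   # Honeydew user identifier
    "client": "Honeydew Server",      # client name
}
print(json.dumps(example_tag, indent=2))
```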
Tracking Methods
You can track Honeydew queries using any of the following approaches:

1. By Service Principal or User
If you’re using dedicated service principals or users for the Honeydew integration, you can filter queries by these identifiers in the system.query.history table (Unity Catalog) or the query history UI.
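For example, the sketch below uses the Databricks SQL Connector for Python to list recent statements executed by an assumed Honeydew service principal via system.query.history (system table access must be enabled in your account). The principal name, connection parameters, and lookback window are placeholders; you could also filter on the statement text if you rely on the query tag instead.

```python
# Sketch: list recent Databricks queries executed by the Honeydew principal
# using the Unity Catalog system table. All identifiers are placeholders.
from databricks import sql

QUERY = """
    SELECT start_time, executed_by, statement_text
    FROM system.query.history
    WHERE executed_by = '<honeydew-service-principal>'
      AND start_time >= current_timestamp() - INTERVAL 1 DAY
    ORDER BY start_time DESC
    LIMIT 100
"""

with sql.connect(
    server_hostname="<server-hostname>",
    http_path="<http-path>",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(QUERY)
        for row in cursor.fetchall():
            print(row)
```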