Honeydew support for Databricks is currently in Beta. Please contact [email protected] to get access to the Databricks integration.

Databricks Integration Setup

Honeydew requires access to Databricks in order to operate. You can set up Databricks access in one of two ways: use central org-level connection parameters, or map each user's individual Databricks credentials to Honeydew. For a central org-level connection, it is advised to create a new, dedicated service principal for the Honeydew integration. The following Databricks connection parameters are required for Honeydew setup:
  1. Server Hostname
  2. HTTP Path
  3. Catalog (for Unity Catalog environments)

Authentication Methods

Honeydew supports the following authentication methods for Databricks:

OAuth (M2M) authentication

This is the recommended method for org-level service accounts. OAuth provides stronger security through automated token generation and refresh, with each access token valid for one hour. For this method, you will need to provide a Client ID and Client Secret.
1. Create a service principal

If you do not already have a service principal, create one in your Databricks workspace:
  1. Log in to your Databricks workspace as an account admin
  2. Navigate to Settings > Identity and access
  3. Click the Service principals tab
  4. Click Add service principal
  5. Enter a name for the service principal (e.g., “Honeydew Integration”)
  6. Click Add
2. Generate OAuth client credentials

To generate OAuth credentials for the service principal:
  1. In the service principal details page, click the OAuth tab
  2. Click Generate secret
  3. Copy the Client ID and Client Secret immediately (the secret will not be shown again)
  4. Store these credentials securely
The client secret is only displayed once. Store it in a secure location immediately after generation.
3. Grant workspace access

Grant the service principal access to your Databricks workspace:
  1. Navigate to Settings > Identity and access > Service principals
  2. Select your service principal
  3. Click Workspace access
  4. Add the service principal to the desired workspace(s)
4. Configure the Databricks connection in Honeydew

On the Honeydew App settings page, configure the Databricks connection using the Client ID and Client Secret generated for the service principal, along with the Server Hostname and HTTP Path.
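Before entering the credentials in Honeydew, it can help to confirm that they work. The following is a minimal sketch, not Honeydew's own implementation: it builds (without sending) a client-credentials token request against the workspace-level Databricks token endpoint `https://<server-hostname>/oidc/v1/token` with the `all-apis` scope, and adds a small helper that reflects the one-hour token lifetime mentioned above. The function names and the five-minute refresh margin are assumptions made for illustration.

```python
import base64
import time
import urllib.parse
import urllib.request
from typing import Optional


def build_token_request(
    server_hostname: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build, but do not send, an OAuth M2M (client credentials) token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{server_hostname}/oidc/v1/token",
        data=body,
        method="POST",
        headers={
            # HTTP Basic auth: base64("client_id:client_secret")
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def token_needs_refresh(
    issued_at: float,
    expires_in: int = 3600,   # one hour, per the token lifetime described above
    margin: int = 300,        # hypothetical safety margin, in seconds
    now: Optional[float] = None,
) -> bool:
    """Return True once the token is within `margin` seconds of expiry."""
    current = time.time() if now is None else now
    return current >= issued_at + expires_in - margin
```

To actually request a token, pass the result to `urllib.request.urlopen(...)` and read `access_token` and `expires_in` from the JSON response. Keep the client secret out of source control, for example by reading it from an environment variable.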

Personal Access Token (PAT) authentication

For this method, you will need to provide a generated access token.
Databricks strongly recommends using OAuth instead of PATs for enhanced security. PATs are considered legacy authentication and should only be used when OAuth is not available.
1. Generate a PAT in Databricks

To create a personal access token:
  1. Log in to your Databricks workspace
  2. Click your username in the top right corner
  3. Select Settings > Developer
  4. Click Access tokens
  5. Click Generate new token
  6. Enter a comment to describe the token’s purpose (e.g., “Honeydew”)
  7. Optionally set a lifetime for the token, or leave it blank for no expiration (whether non-expiring tokens are available depends on your Databricks workspace policy)
  8. Click Generate
  9. Copy the token immediately (it will not be shown again)
For production use, it is recommended to set an appropriate token lifetime and implement a token rotation process.
2

Configure the PAT in Honeydew

On the Honeydew App settings page, configure the Databricks connection using the Personal Access Token generated in Databricks, along with the Server Hostname and HTTP Path.

OAuth User Authentication

This method allows individual users to authenticate with their Databricks credentials. Each user will need to authorize Honeydew to access their Databricks account using OAuth.
OAuth User Authentication requires an account-level application configured with redirect URIs. Honeydew provides the redirect URIs when this method is enabled.

Connection Parameters

Server Hostname and HTTP Path

To get the connection details for your Databricks SQL warehouse or cluster:
  1. Log in to your Databricks workspace
  2. Navigate to SQL > SQL Warehouses (or Compute for clusters)
  3. Click the target warehouse or cluster name
  4. Go to the Connection Details tab
  5. Copy the Server hostname and HTTP path
The HTTP path format depends on the compute type: a SQL warehouse uses /sql/1.0/warehouses/{warehouse_id}, while a cluster uses sql/protocolv1/o/{workspace_id}/{cluster_id}.
You can find these connection details in the Connection Details tab of any SQL Warehouse or Cluster in your Databricks workspace.
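A quick check of a copied HTTP path can catch paste errors before they surface as a failed connection. The sketch below assumes the two path shapes commonly shown in Databricks connection details: `/sql/1.0/warehouses/<id>` for SQL warehouses and `sql/protocolv1/o/<workspace_id>/<id>` for clusters. The function name and the returned keys are hypothetical.

```python
import re
from typing import Dict

# Assumed path shapes; the leading slash is treated as optional.
_WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/warehouses/(?P<warehouse_id>[^/]+)$")
_PROTOCOLV1_PATH = re.compile(
    r"^/?sql/protocolv1/o/(?P<workspace_id>[^/]+)/(?P<compute_id>[^/]+)$"
)


def classify_http_path(http_path: str) -> Dict[str, str]:
    """Identify the kind of HTTP path and extract the IDs embedded in it."""
    path = http_path.strip()
    match = _WAREHOUSE_PATH.match(path)
    if match:
        return {"kind": "sql_warehouse", **match.groupdict()}
    match = _PROTOCOLV1_PATH.match(path)
    if match:
        return {"kind": "protocolv1", **match.groupdict()}
    raise ValueError(f"unrecognized Databricks HTTP path: {http_path!r}")
```

For example, `classify_http_path("/sql/1.0/warehouses/abc123")` reports a `sql_warehouse` path with `warehouse_id` set to `abc123`, while a truncated or mistyped path raises `ValueError`.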

Unity Catalog

If you are using Unity Catalog, you will need to specify:
  1. Catalog - the catalog where Honeydew will access and deploy dynamic datasets
  2. Schema - the schema where Honeydew will deploy dynamic datasets as views or tables
  3. Dev Catalog - the catalog for dev branch deployments (optional, defaults to main catalog)
  4. Dev Schema - the schema for dev branch deployments
Unity Catalog uses a three-level namespace: catalog.schema.table. Ensure your service principal or user has appropriate permissions to access these resources.
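The way these four settings combine can be sketched as follows. This is an illustration under stated assumptions rather than Honeydew's actual resolution logic: the function names are hypothetical, the dev catalog falls back to the main catalog as described above, and identifiers are backtick-quoted in the usual Databricks SQL style.

```python
from typing import Optional, Tuple


def quote_ident(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, obj: str) -> str:
    """Build a three-level catalog.schema.object name."""
    return ".".join(quote_ident(part) for part in (catalog, schema, obj))


def resolve_target(
    catalog: str,
    schema: str,
    dev_schema: str,
    dev_catalog: Optional[str] = None,
    is_dev_branch: bool = False,
) -> Tuple[str, str]:
    """Pick the (catalog, schema) pair a deployment would land in."""
    if not is_dev_branch:
        return catalog, schema
    # Dev Catalog is optional and defaults to the main catalog.
    return dev_catalog or catalog, dev_schema
```

For example, with catalog `main`, schema `analytics`, and dev schema `analytics_dev`, a dev branch deployment with no dev catalog configured resolves to `main` and `analytics_dev`.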

Allowing Honeydew Client IP Addresses

If you have IP-based access restrictions in Databricks, add the IP addresses displayed on the Databricks connection screen of the Honeydew App settings page to the “Allowed IP Addresses” list.
For the Honeydew Cloud deployment, the following IP addresses are used:
  • 34.86.209.90
  • 34.145.147.92
If you are using a private Honeydew deployment, the IP addresses will be different. You can find them on the Databricks connection screen of the Honeydew App settings page.
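If you manage IP restrictions through the Databricks IP access lists REST API (`POST /api/2.0/ip-access-lists`) rather than the UI, the request body can be prepared as in the sketch below. It only validates the addresses and builds the JSON payload; it does not call the API. The label value and function name are hypothetical, and private deployments should substitute the addresses shown in their own Honeydew settings.

```python
import ipaddress
import json
from typing import Iterable

# Honeydew Cloud addresses listed above; private deployments use different ones.
HONEYDEW_CLOUD_IPS = ("34.86.209.90", "34.145.147.92")


def build_allow_list_payload(
    ips: Iterable[str] = HONEYDEW_CLOUD_IPS,
    label: str = "honeydew",  # hypothetical label for the access list
) -> str:
    """Validate the addresses and return a JSON body for an ALLOW list."""
    # ip_address() raises ValueError for anything that is not a valid IP.
    validated = [str(ipaddress.ip_address(ip)) for ip in ips]
    return json.dumps(
        {"label": label, "list_type": "ALLOW", "ip_addresses": validated},
        indent=2,
    )
```

Validating before sending means a typo in an address fails locally instead of producing an access list that silently blocks Honeydew.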

Permissions

Honeydew does not extract or store your data. It only reads schema metadata and executes SQL queries inside your Databricks environment. You can find more security-related information here.

Required Permissions

If using a service principal or user account, the following permissions are required:

For Unity Catalog Environments

  1. USE CATALOG on catalogs used in the semantic layer
  2. USE SCHEMA on schemas used in the semantic layer
  3. SELECT on tables/views used in the semantic layer
  4. CREATE TABLE and CREATE VIEW on the schema where dynamic datasets will be deployed
Example SQL commands to grant permissions:
-- Grant catalog usage
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<service_principal_name>`;

-- Grant schema usage
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

-- Grant select on all tables in a schema
GRANT SELECT ON ALL TABLES IN SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

-- Or grant select on specific tables in a schema
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<service_principal_name>`;

-- Grant create permissions for dynamic datasets
GRANT CREATE TABLE ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
GRANT CREATE VIEW ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
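When several schemas or principals are involved, the statements above can be generated rather than edited by hand. The following sketch simply templates the same GRANT statements shown in this section; it assumes, as the example does, that dynamic datasets are deployed to the same schema that is read from. The function name is hypothetical, and the output should be reviewed before it is run.

```python
from typing import List, Optional, Sequence


def _check(name: str) -> str:
    """Reject empty names and backticks so the generated SQL stays well-formed."""
    if not name or "`" in name:
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def honeydew_grants(
    catalog: str,
    schema: str,
    principal: str,
    tables: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the GRANT statements from this section for one schema."""
    catalog, schema, principal = _check(catalog), _check(schema), _check(principal)
    target = f"{catalog}.{schema}"
    grantee = f"`{principal}`"
    statements = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {grantee};",
        f"GRANT USE SCHEMA ON SCHEMA {target} TO {grantee};",
    ]
    if tables:
        # Narrow variant: SELECT on the listed tables only.
        statements += [
            f"GRANT SELECT ON TABLE {target}.{_check(t)} TO {grantee};" for t in tables
        ]
    else:
        # Broad variant: SELECT on every table in the schema.
        statements.append(f"GRANT SELECT ON ALL TABLES IN SCHEMA {target} TO {grantee};")
    statements += [
        f"GRANT CREATE TABLE ON SCHEMA {target} TO {grantee};",
        f"GRANT CREATE VIEW ON SCHEMA {target} TO {grantee};",
    ]
    return statements
```

Calling `print("\n".join(honeydew_grants("main", "analytics", "honeydew-sp")))` produces a script that can be pasted into the Databricks SQL editor; the catalog, schema, and principal names here are placeholders.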

Tracking Honeydew Queries in Databricks

You can track and monitor queries executed by Honeydew in Databricks using query tags and the system tables.

Query Tag Format

All Honeydew queries include a query tag with the following JSON format:
{
  "application": "Honeydew",
  "workspace": "some_workspace",
  "branch": "branch_name",
  "user": "[email protected]",
  "client": "Honeydew Server"
}
The query tag contains:
  • application: Always set to “Honeydew”
  • workspace: The Honeydew workspace name
  • branch: The Honeydew workspace branch being used (e.g., “dev”, “prod”)
  • user: The Honeydew user identifier (usually email address)
  • client: The client name, usually “Honeydew Server” for server-side operations
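If you export query history for offline analysis, the tag can be parsed as JSON instead of matched as a string. The sketch below assumes the tag arrives as the serialized JSON object shown above; the function name is hypothetical, and the sample values in the usage note are placeholders.

```python
import json
from typing import Dict, Optional


def parse_honeydew_tag(raw: Optional[str]) -> Optional[Dict[str, str]]:
    """Return the tag as a dict if it is a Honeydew query tag, else None."""
    if not raw:
        return None
    try:
        tag = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON, so not a Honeydew tag
    if not isinstance(tag, dict) or tag.get("application") != "Honeydew":
        return None
    return tag
```

For example, parsing `'{"application": "Honeydew", "workspace": "sales", "branch": "dev"}'` returns a dict whose `branch` is `dev`, while tags from other applications return `None`. Parsing is insensitive to whitespace and key order, which a substring match on the serialized tag is not.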

Tracking Methods

You can track Honeydew queries using any of the following approaches:

1. By Service Principal or User

If you’re using dedicated service principals or users for Honeydew integration, you can filter queries by these identifiers in the system.query.history table (Unity Catalog) or the query history UI.

2. By SQL Warehouse or Cluster

If you’re using dedicated SQL warehouses or clusters for Honeydew operations, you can filter by warehouse or cluster name in the query history.

3. By Query Tag

The most comprehensive method is to filter by the Honeydew query tag. This approach works regardless of your Databricks setup and allows you to track queries by workspace, branch, or user using the standardized query tag format.

Example query to track Honeydew queries in Unity Catalog:
SELECT
  query_id,
  query_text,
  user_name,
  start_time,
  end_time,
  total_duration_ms
FROM system.query.history
WHERE query_tag LIKE '%"application": "Honeydew"%'
ORDER BY start_time DESC;