Honeydew support for Databricks is currently in Beta. Please contact [email protected] to get access to the Databricks integration.

Databricks Integration Setup

Honeydew requires access to Databricks in order to operate. You have two options for setting up Databricks access: use central org-level connection parameters, or map your individual Databricks user credentials to Honeydew. If you would like to use a central org-level connection, it is advised to create a new dedicated service principal for the Honeydew integration. The following Databricks connection parameters are required for Honeydew setup:

Server Hostname
HTTP Path
Catalog (for Unity Catalog environments)

Authentication Methods

Honeydew supports the following authentication methods for Databricks:

OAuth (M2M) authentication

This is the recommended method for org-level service accounts. OAuth provides stronger security through automated token generation and refresh, with each access token valid for one hour. For this method, you will need to provide a Client ID and Client Secret.

Create a service principal

If you do not already have a service principal, create one in your Databricks workspace:

Log in to your Databricks workspace as an account admin
Navigate to Settings > Identity and access
Click the Service principals tab
Click Add service principal
Enter a name for the service principal (e.g., “Honeydew Integration”)
Click Add

Generate OAuth client credentials

To generate OAuth credentials for the service principal:

In the service principal details page, click the OAuth tab
Click Generate secret
Copy the Client ID and Client Secret immediately (the secret will not be shown again)
Store these credentials securely

The client secret is only displayed once. Store it in a secure location immediately after generation.

Grant workspace access

Grant the service principal access to your Databricks workspace:

Navigate to Settings > Identity and access > Service principals
Select your service principal
Click Workspace access
Add the service principal to the desired workspace(s)

Configure the Databricks connection in Honeydew

On the Honeydew App settings page, configure the Databricks connection using the Client ID and Client Secret from the previous step, along with the Server Hostname and HTTP Path.
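Before entering the credentials in Honeydew, it can help to confirm that the Client ID and Client Secret are accepted by Databricks. The sketch below only builds the client-credentials token request; it assumes the workspace-level token endpoint `/oidc/v1/token` and the `all-apis` scope described in the Databricks OAuth M2M documentation. The function name `build_token_request` and the hostname in the usage comment are hypothetical, and this is not how Honeydew itself connects.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    server_hostname: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request.

    Assumption: the workspace exposes the token endpoint at /oidc/v1/token
    and accepts the "all-apis" scope.
    """
    url = f"https://{server_hostname}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Usage (requires network access and real credentials; hostname is hypothetical):
#   import json
#   req = build_token_request("dbc-example.cloud.databricks.com", client_id, client_secret)
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(json.load(resp)["expires_in"])
```

A successful response indicates the service principal credentials are valid; it does not by itself confirm that the principal has the workspace access or data permissions Honeydew needs.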

Personal Access Token (PAT) authentication

For this method, you will need to provide a generated access token.

Databricks strongly recommends using OAuth instead of PATs for enhanced security. PATs are considered legacy authentication and should only be used when OAuth is not available.

Generate a PAT in Databricks

To create a personal access token:

Log in to your Databricks workspace
Click your username in the top right corner
Select Settings > Developer
Click Access tokens
Click Generate new token
Enter a comment to describe the token’s purpose (e.g., “Honeydew”)
Optionally set a lifetime for the token, or leave it blank for no expiration (availability of non-expiring tokens depends on your Databricks workspace policy)
Click Generate
Copy the token immediately (it will not be shown again)

For production use, it is recommended to set an appropriate token lifetime and implement a token rotation process.

Configure the PAT in Honeydew

On the Honeydew App settings page, configure the Databricks connection using the Personal Access Token generated in Databricks, along with the Server Hostname and HTTP Path.

OAuth User Authentication

This method allows individual users to authenticate with their Databricks credentials. Each user will need to authorize Honeydew to access their Databricks account using OAuth.

OAuth User Authentication requires an account-level OAuth application configured with redirect URIs. Honeydew provides the redirect URIs when this method is enabled.

Connection Parameters

Server Hostname and HTTP Path

To get the connection details for your Databricks SQL warehouse or cluster:

Log in to your Databricks workspace
Navigate to SQL > SQL Warehouses (or Compute for clusters)
Click the target warehouse or cluster name
Go to the Connection Details tab
Copy the Server hostname and HTTP path

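The HTTP path format depends on the compute type. For a SQL warehouse it typically looks like /sql/1.0/warehouses/{warehouse_id}; for a cluster it looks like sql/protocolv1/o/{workspace_id}/{cluster_id}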

You can find these connection details in the Connection Details tab of any SQL Warehouse or Cluster in your Databricks workspace.
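Copied connection details often carry a stray `https://` prefix or trailing slash, so a quick sanity check before pasting them into Honeydew can save a failed connection attempt. The sketch below is illustrative only: the two path patterns reflect the shapes Databricks commonly shows for SQL warehouses and clusters and are assumptions, and the helper names are hypothetical.

```python
import re

# Assumed shapes of the HTTP path, as commonly shown in the Connection Details tab.
WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/warehouses/(?P<id>[0-9a-f]+)$")
CLUSTER_PATH = re.compile(r"^/?sql/protocolv1/o/(?P<workspace_id>\d+)/(?P<id>[\w-]+)$")


def normalize_hostname(value: str) -> str:
    """Strip whitespace, a URL scheme, and trailing slashes from a hostname."""
    host = re.sub(r"^https?://", "", value.strip())
    return host.rstrip("/")


def classify_http_path(path: str) -> str:
    """Return "warehouse", "cluster", or "unknown" for an HTTP path."""
    path = path.strip()
    if WAREHOUSE_PATH.match(path):
        return "warehouse"
    if CLUSTER_PATH.match(path):
        return "cluster"
    return "unknown"
```

An "unknown" result does not necessarily mean the path is wrong; it only means the value does not match either of the assumed shapes and is worth re-copying from the Connection Details tab.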

Unity Catalog

If you are using Unity Catalog, you will need to specify:

Catalog - the catalog where Honeydew will access and deploy dynamic datasets
Schema - the schema where Honeydew will deploy dynamic datasets as views or tables
Dev Catalog - the catalog for dev branch deployments (optional, defaults to main catalog)
Dev Schema - the schema for dev branch deployments

Unity Catalog uses a three-level namespace: catalog.schema.table. Ensure your service principal or user has appropriate permissions to access these resources.
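To make the three-level namespace and the dev-branch fallback concrete, here is a minimal sketch. The function names are hypothetical, the backtick quoting follows the usual Databricks SQL identifier rules, and the fallback of Dev Catalog to the main catalog mirrors the default described above; how Honeydew resolves these values internally is not shown here.

```python
from typing import Optional, Tuple


def quote_identifier(name: str) -> str:
    """Quote one identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, obj: str) -> str:
    """Build a fully qualified catalog.schema.object name."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, obj))


def deployment_target(
    catalog: str,
    schema: str,
    dev_schema: str,
    dev_catalog: Optional[str] = None,
    dev: bool = False,
) -> Tuple[str, str]:
    """Pick the (catalog, schema) pair for a prod or dev branch deployment.

    The dev catalog falls back to the main catalog when it is not set.
    """
    if dev:
        return (dev_catalog or catalog, dev_schema)
    return (catalog, schema)
```

For example, `qualified_name("main", "analytics", "orders")` yields a name you can paste directly into a GRANT or SELECT statement when checking permissions.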

Allowing Honeydew Client IP Addresses

If you have IP-based access restrictions in Databricks, add the IP addresses displayed on the Databricks connection screen of the Honeydew App settings page to the “Allowed IP Addresses” list.

For the Honeydew Cloud deployment, the following IP addresses are used:

34.86.209.90
34.145.147.92

If you are using a private Honeydew deployment, the IP addresses will be different. You can find them on the Databricks connection screen of the Honeydew App settings page.
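If your allow list mixes single addresses and CIDR ranges, it is easy to miss one of the Honeydew addresses. The following sketch checks which required addresses are not yet covered. It uses only the Python standard library; the function name `missing_from_allowlist` and the sample allow list entries in the usage note are hypothetical.

```python
import ipaddress
from typing import Iterable, List

# Honeydew Cloud addresses listed above; private deployments use different ones.
HONEYDEW_CLOUD_IPS = ["34.86.209.90", "34.145.147.92"]


def missing_from_allowlist(
    required_ips: Iterable[str], allowed_entries: Iterable[str]
) -> List[str]:
    """Return the required IPs not covered by any allowed IP or CIDR entry."""
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowed_entries]
    return [
        ip
        for ip in required_ips
        if not any(ipaddress.ip_address(ip) in network for network in networks)
    ]
```

Calling `missing_from_allowlist(HONEYDEW_CLOUD_IPS, ["34.86.209.90/32", "10.0.0.0/8"])` would report `34.145.147.92` as still missing from that example list.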

Permissions

Honeydew does not extract or store your data. It only reads schema metadata and executes SQL queries inside your Databricks environment. You can find more security-related information here.

Required Permissions

If using a service principal or user account, the following permissions are required:

For Unity Catalog Environments

USE CATALOG on catalogs used in the semantic layer
USE SCHEMA on schemas used in the semantic layer
SELECT on tables/views used in the semantic layer
CREATE TABLE and CREATE VIEW on the schema where dynamic datasets will be deployed

Example SQL commands to grant permissions:

-- Grant catalog usage
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<service_principal_name>`;

-- Grant schema usage
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;

-- Grant select on all current and future tables and views in a schema
GRANT SELECT ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
-- Or

-- Grant select on specific tables in a schema
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<service_principal_name>`;

-- Grant create permissions for dynamic datasets
GRANT CREATE TABLE ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
GRANT CREATE VIEW ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_name>`;
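When several tables or schemas are involved, generating the statements can be less error-prone than editing them by hand. The sketch below is one possible way to do that; it mirrors the privilege names listed above and uses per-table SELECT grants. The function name `build_grants` and all argument values in the usage note are hypothetical, and the output should be reviewed before it is run.

```python
from typing import Iterable, List


def build_grants(
    principal: str,
    catalog: str,
    schema: str,
    tables: Iterable[str],
    deploy_schema: str,
) -> List[str]:
    """Build GRANT statements for one source schema and one deployment schema."""
    grantee = f"`{principal}`"
    statements = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {grantee};",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {grantee};",
    ]
    # Read access to each table or view used in the semantic layer.
    statements += [
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {grantee};"
        for table in tables
    ]
    # A separate deployment schema also needs USE SCHEMA.
    if deploy_schema != schema:
        statements.append(
            f"GRANT USE SCHEMA ON SCHEMA {catalog}.{deploy_schema} TO {grantee};"
        )
    # Create privileges on the schema where dynamic datasets are deployed.
    statements += [
        f"GRANT CREATE TABLE ON SCHEMA {catalog}.{deploy_schema} TO {grantee};",
        f"GRANT CREATE VIEW ON SCHEMA {catalog}.{deploy_schema} TO {grantee};",
    ]
    return statements
```

For instance, `print("\n".join(build_grants("honeydew-sp", "main", "sales", ["orders", "customers"], "honeydew")))` prints a script that can be pasted into a SQL editor.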

Tracking Honeydew Queries in Databricks

You can track and monitor queries executed by Honeydew in Databricks using query tags and the system tables.

Query Tag Format

All Honeydew queries include a query tag with the following JSON format:

{
  "application": "Honeydew",
  "workspace": "some_workspace",
  "branch": "branch_name",
  "user": "[email protected]",
  "client": "Honeydew Server"
}

The query tag contains:

application: Always set to “Honeydew”
workspace: The Honeydew workspace name
branch: The Honeydew workspace branch being used (e.g., “dev”, “prod”)
user: The Honeydew user identifier (usually email address)
client: The client name, usually “Honeydew Server” for server-side operations
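If you export query history and want to work with the tag fields programmatically, the tag can be parsed as ordinary JSON. The sketch below assumes the tag arrives as a JSON string with the five keys listed above; the function name `parse_honeydew_tag` is hypothetical.

```python
import json
from typing import Dict, Optional

EXPECTED_KEYS = ("application", "workspace", "branch", "user", "client")


def parse_honeydew_tag(raw: Optional[str]) -> Optional[Dict[str, Optional[str]]]:
    """Parse a query tag; return None if it is not a Honeydew tag."""
    try:
        tag = json.loads(raw)
    except (TypeError, ValueError):
        # Missing or non-JSON tags are treated as "not Honeydew".
        return None
    if not isinstance(tag, dict) or tag.get("application") != "Honeydew":
        return None
    # Keep only the documented keys; absent keys map to None.
    return {key: tag.get(key) for key in EXPECTED_KEYS}
```

Returning `None` for anything that is not a well-formed Honeydew tag keeps downstream filtering simple, since untagged queries from other tools need no special handling.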

Tracking Methods

You can track Honeydew queries using any of the following approaches:

1. By Service Principal or User

If you’re using dedicated service principals or users for Honeydew integration, you can filter queries by these identifiers in the system.query.history table (Unity Catalog) or the query history UI.

2. By SQL Warehouse or Cluster

If you’re using dedicated SQL warehouses or clusters for Honeydew operations, you can filter by warehouse or cluster name in the query history.

3. By Query Tag (Recommended)

The most comprehensive method is to filter by the Honeydew query tag. This approach works regardless of your Databricks setup and allows you to track queries by workspace, branch, or user using the standardized query tag format.

Example query to track Honeydew queries in Unity Catalog:

SELECT
  query_id,
  query_text,
  user_name,
  start_time,
  end_time,
  total_duration_ms
FROM system.query.history
WHERE query_tag LIKE '%"application": "Honeydew"%'
ORDER BY start_time DESC;
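Once rows from a query like the one above are fetched into Python, they can be grouped by any field of the tag. The sketch below is a minimal example under stated assumptions: each row is a dict containing `query_tag` (a JSON string) and `total_duration_ms`, matching the column names used in the example query. The function name `duration_by` is hypothetical.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def duration_by(rows: Iterable[Mapping], key: str) -> Dict[str, int]:
    """Sum total_duration_ms of Honeydew queries, grouped by a tag field.

    `key` is one of the tag fields, e.g. "workspace", "branch", or "user".
    """
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        try:
            tag = json.loads(row["query_tag"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a parseable tag
        if not isinstance(tag, dict) or tag.get("application") != "Honeydew":
            continue  # skip queries from other applications
        totals[tag.get(key, "unknown")] += int(row.get("total_duration_ms") or 0)
    return dict(totals)
```

Grouping in Python rather than in SQL avoids depending on the exact key order or spacing of the serialized tag, at the cost of pulling the rows out of Databricks first.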
