Introduction

Honeydew integrates deeply with Snowflake through a Snowflake native application.

It can be installed from the Snowflake Marketplace Listing.

Once the native app is installed, the Honeydew API can be accessed directly from a Snowflake connection or the web interface, to do things such as:

  • Query data based on a semantic query
  • Generate a Snowflake SQL query from a semantic query
  • Consume any metadata such as field names and descriptions
  • Update the semantic layer definitions

Installation

1. Honeydew account

The Honeydew native application requires a Honeydew account. If you don’t have one yet, schedule a 20-minute onboarding session here. Review the initial setup documentation for a full list of setup steps.

2. Native App installation name

Note the name you gave to the application (e.g. SEMANTIC_LAYER_ENTERPRISE_EDITION) and replace SEMANTIC_LAYER_ENTERPRISE_EDITION in all examples below with the actual application name. You can find the names of the installed applications by running the following command:

SHOW APPLICATIONS;

3. Create API Key

In the Honeydew application, navigate to the Settings page and generate an API Key to be used for this integration. Copy the generated API Key and API Secret values.

4. Set API Credentials

Set the Honeydew API Credentials for the Honeydew application, replacing api_key and api_secret with the generated API Key and API Secret values:

CALL SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SET_API_CREDENTIALS('api_key', 'api_secret');

5. Create Honeydew API Integration

Create a Snowflake integration to the Honeydew API:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION HONEYDEW_API_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.API_NETWORK_RULE,
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.AUTH_NETWORK_RULE)
  ALLOWED_AUTHENTICATION_SECRETS = (
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.HONEYDEW_API_HOSTNAME,
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.HONEYDEW_API_CLIENT_ID,
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.HONEYDEW_API_CLIENT_SECRET,
    SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_EXTERNAL_ACCESS.HONEYDEW_API_USERNAME_PASSWORD)
  ENABLED = TRUE;

This integration allows the Honeydew native application to access the Honeydew API. The API_NETWORK_RULE and AUTH_NETWORK_RULE are created automatically by the Honeydew native application setup process. They point to the following endpoints:

  • API_NETWORK_RULE: api.honeydew.cloud
  • AUTH_NETWORK_RULE: auth.honeydew.cloud

Make sure to name the integration HONEYDEW_API_ACCESS_INTEGRATION - do not change that name.
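To confirm that the integration exists under the required name, standard Snowflake commands can be used. This is a minimal sketch that assumes the current role is allowed to see the integration:

```sql
-- Check that the integration exists and is enabled
SHOW INTEGRATIONS LIKE 'HONEYDEW_API_ACCESS_INTEGRATION';

-- Inspect its properties, including the allowed network rules and secrets
DESCRIBE INTEGRATION HONEYDEW_API_ACCESS_INTEGRATION;
```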

6. Grant Integration Access

Grant the Honeydew API integration to the Honeydew application:

GRANT USAGE ON INTEGRATION HONEYDEW_API_ACCESS_INTEGRATION TO APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;

7. Enable External Access

Enable the Honeydew API integration for all Honeydew code functions and procedures:

CALL SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ENABLE_EXTERNAL_ACCESS();

You will receive the following result upon success:

[Row(anonymous block=None)]

8. Grant Native App Access

Optionally, grant native app access to any additional Snowflake roles:

GRANT APPLICATION ROLE SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_APP_PUBLIC TO ROLE MY_ROLE;

Only the ACCOUNTADMIN role has the CREATE INTEGRATION privilege by default. The privilege can be granted to additional roles as needed.
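If a role other than ACCOUNTADMIN should run the integration steps, the privilege can be granted at the account level. This is a sketch in which MY_ADMIN_ROLE is a hypothetical role name:

```sql
-- Run as a role that is allowed to grant account-level privileges
USE ROLE ACCOUNTADMIN;

-- MY_ADMIN_ROLE is a placeholder; replace it with the role that will create the integration
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE MY_ADMIN_ROLE;
```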

Upgrade

The Honeydew Snowflake Native Application is upgraded automatically when new versions are released.

Native App Access

Grant native app access to any additional Snowflake roles:

GRANT APPLICATION ROLE SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_APP_PUBLIC TO ROLE MY_ROLE;
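To check which application roles exist and who already holds them, the standard Snowflake SHOW commands may help. This sketch assumes the current role has enough privileges on the application to list them:

```sql
-- List the application roles defined by the native app
SHOW APPLICATION ROLES IN APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;

-- List the roles and users that were granted HONEYDEW_APP_PUBLIC
SHOW GRANTS OF APPLICATION ROLE SEMANTIC_LAYER_ENTERPRISE_EDITION.HONEYDEW_APP_PUBLIC;
```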

Usage Examples

Most calls require a workspace and a working branch.

The convention here is to set those as session variables, for example:

SET WORKSPACE='tpch';
SET BRANCH='prod';
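Session variables exist only for the current Snowflake session, so they need to be set again in each new connection. A quick way to confirm they are in place before calling the API:

```sql
-- Read back the values that later calls will use
SELECT $WORKSPACE AS workspace, $BRANCH AS branch;

-- List all variables defined in the current session
SHOW VARIABLES;
```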

Workspace and Branches

List all workspaces and branches:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_WORKSPACES());

Create a new branch named branch for a given workspace:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_WORKSPACE_BRANCH($WORKSPACE, 'branch');

Reload a given workspace and branch:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.RELOAD_WORKSPACE($WORKSPACE, $BRANCH);

Reload all existing workspaces:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.RELOAD_ALL_WORKSPACES();

Reload a given workspace and branch for all users:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.RELOAD_WORKSPACE_FOR_ALL_USERS($WORKSPACE, $BRANCH);

Reload all existing workspaces for all users:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.RELOAD_ALL_WORKSPACES_FOR_ALL_USERS();
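One possible flow is to create a branch and then point the BRANCH session variable at it, so that later calls target the new branch. In this sketch my_feature is a hypothetical branch name, and it is an assumption that the branch can be used by name right after it is created:

```sql
-- Create a branch (hypothetical name) in the current workspace
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_WORKSPACE_BRANCH($WORKSPACE, 'my_feature');

-- Point subsequent calls at the new branch
SET BRANCH='my_feature';

-- Confirm the branch shows up in the workspace listing
select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_WORKSPACES());
```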

Schema

Parameters

List all global parameters in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_GLOBAL_PARAMETERS($WORKSPACE, $BRANCH));

Entities

List all entities in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_ENTITIES($WORKSPACE, $BRANCH));

List all entity relations in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_RELATIONS($WORKSPACE, $BRANCH));

Fields Metadata

List all fields in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_FIELDS($WORKSPACE, $BRANCH));

List all broken fields (fields with error) in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_FIELDS($WORKSPACE, $BRANCH)) where error is not NULL;
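The same filter can serve as a quick health check, for example in an automated job. This sketch relies only on the error column used above:

```sql
-- Count the broken fields; zero means no field reported an error
select count(*) as broken_fields
from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_FIELDS($WORKSPACE, $BRANCH))
where error is not NULL;
```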

List all fields in the given workspace and branch, for a specific domain:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_FIELDS($WORKSPACE, $BRANCH, 'domain_name'));

Fields include both metrics and attributes.

Fields Add/Update/Delete

Create, alter, or drop the attribute field of the entity entity:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_ATTRIBUTE($WORKSPACE, $BRANCH, 'entity', 'field', 'expression');
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_ATTRIBUTE($WORKSPACE, $BRANCH, 'entity', 'field', 'expression');
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_ATTRIBUTE($WORKSPACE, $BRANCH, 'entity', 'field');
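As a concrete illustration, the calls might look as follows. The entity orders, the attribute order_year, and the column o_orderdate are hypothetical names chosen for the example; substitute names from your own workspace:

```sql
-- Create an attribute that extracts the year from a date column (hypothetical names)
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_ATTRIBUTE(
    $WORKSPACE, $BRANCH, 'orders', 'order_year', 'YEAR(orders.o_orderdate)');

-- Replace the expression of the same attribute; note the doubled single quotes inside the string
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_ATTRIBUTE(
    $WORKSPACE, $BRANCH, 'orders', 'order_year', 'DATE_TRUNC(''year'', orders.o_orderdate)');

-- Remove the attribute again
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_ATTRIBUTE(
    $WORKSPACE, $BRANCH, 'orders', 'order_year');
```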

Create, alter, or drop the metric field of the entity entity:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_METRIC($WORKSPACE, $BRANCH, 'entity', 'field', 'expression');
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_METRIC($WORKSPACE, $BRANCH, 'entity', 'field', 'expression');
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_METRIC($WORKSPACE, $BRANCH, 'entity', 'field');
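A metric follows the same pattern, with an aggregation as its expression. The entity orders, the metric total_revenue, and the column o_totalprice below are hypothetical names used only for illustration:

```sql
-- Create a metric that sums a numeric column (hypothetical names)
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.CREATE_METRIC(
    $WORKSPACE, $BRANCH, 'orders', 'total_revenue', 'SUM(orders.o_totalprice)');

-- Change the metric definition to a rounded sum
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_METRIC(
    $WORKSPACE, $BRANCH, 'orders', 'total_revenue', 'ROUND(SUM(orders.o_totalprice), 2)');

-- Remove the metric again
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_METRIC(
    $WORKSPACE, $BRANCH, 'orders', 'total_revenue');
```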

Domains

List all domains in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_DOMAINS($WORKSPACE, $BRANCH));

Queries

Get data from a semantic query

The following stored procedure runs a semantic query and returns the resulting data.

It is also possible to only generate the SQL (see next section).

You might need to grant access for the native application to the relevant data, for example:

GRANT USAGE ON DATABASE MY_DATABASE TO APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO APPLICATION SEMANTIC_LAYER_ENTERPRISE_EDITION;

call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SELECT_FROM_FIELDS(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- domain (can be NULL)
        'domain',
        -- dimensions - can be ad-hoc expressions
        ['entity.attr', ...],
        -- metrics - can be named or ad-hoc expressions
        ['entity.metric', 'SUM(entity.measure)', ...],
        -- filters - can be named attributes or ad-hoc expressions
        ['entity.attr', 'entity.attr > 0', 'entity.name like ''%cheese%''', ...],
        -- optional: transform SQL (e.g. add ORDER BY clause or LIMIT clause)
        'ORDER BY "entity.attr" LIMIT 10'
    );

If Honeydew parameters are used, their default values will be used. To control parameter values, generate the SQL and set parameter values within it - see the next section.

Attributes and metrics may either refer to named fields in the semantic layer, or to new ad-hoc calculations based on them.

Metric ad-hoc expressions can do anything a metric can do.

In particular, they may use qualifiers such as FILTER (WHERE ...) and GROUP BY (...) to create ad-hoc filtered and/or partial metrics.
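As an illustration, an ad-hoc metric with a FILTER qualifier can be passed next to an unfiltered one. This is a sketch only: the entity orders and its fields are hypothetical names, and the qualifier is written in the FILTER (WHERE ...) form named above:

```sql
call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SELECT_FROM_FIELDS(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- no domain
        NULL,
        -- dimensions (hypothetical attribute)
        ['orders.o_orderpriority'],
        -- metrics: an ad-hoc sum, and the same sum restricted to one order status
        ['SUM(orders.o_totalprice)',
         'SUM(orders.o_totalprice) FILTER (WHERE orders.o_orderstatus = ''F'')'],
        -- filters
        ['orders.o_totalprice > 0'],
        -- transform SQL
        'ORDER BY "orders.o_orderpriority"'
    );
```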

All filters passed are applied together.

To allow better performance optimizations, it is recommended to pass multiple filters that all apply, rather than a single filter that joins multiple conditions with AND.

Filters passed to the native app may use both attributes and metrics.

Attributes filter data similarly to how WHERE behaves in SQL: only rows that match the expression are returned.

When a metric is used in a filter (entity.count > 0), it is grouped by the attributes before filtering, similarly to how HAVING behaves in SQL. Only rows whose metric aggregation matches the expression are returned.
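To see how the different filter kinds end up in the query, the SQL can be generated instead of executed. The sketch below passes three separate filters, two on attributes and one on a metric; the entity orders and its fields (including the named fields order_year and total_revenue) are hypothetical:

```sql
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_FIELDS(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- no domain
        NULL,
        -- dimensions
        ['orders.o_orderpriority'],
        -- metrics
        ['orders.total_revenue'],
        -- filters, passed separately rather than joined with AND:
        --   the first two filter on attributes (WHERE-like),
        --   the third filters on a metric (HAVING-like)
        ['orders.o_orderstatus = ''F''',
         'orders.order_year >= 1995',
         'orders.total_revenue > 1000000'],
        -- transform SQL
        'ORDER BY "orders.o_orderpriority"'
    );
```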

Generate SQL for a query

The following function generates SQL for an ad-hoc semantic query:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_FIELDS(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- domain (can be NULL)
        'domain',
        -- dimensions - can be ad-hoc expressions
        ['entity.attr', ...],
        -- metrics - can be named or ad-hoc expressions
        ['entity.metric', 'SUM(entity.measure)', ...],
        -- filters - can be named attributes or ad-hoc expressions
        ['entity.attr', 'entity.attr > 0', 'entity.name like ''%cheese%''', ...],
        -- optional: transform SQL (e.g. add ORDER BY clause or LIMIT clause)
        'ORDER BY "entity.attr" LIMIT 10'
    );

Using parameters with generated SQL

If a query uses Honeydew parameters, they are generated as session variables in the query and can be set with SET.

This is typically used for automation, with code calling the API setting parameter values.

A sample Snowflake stored procedure that runs the generated SQL with parameters taken from session variables can look like:


-- Build a wrapper that gets SQL from Honeydew and executes it
-- immediately with parameters coming from session variables
CREATE OR REPLACE PROCEDURE
  SELECT_FROM_FIELDS_WITH_SESSION_VARIABLES_AS_PARAMETERS(
        workspace string, branch string, domain string,
        attributes variant, metrics variant, filters variant)
RETURNS TABLE()
-- Pass session variables into the procedure
EXECUTE AS CALLER
AS
$$
DECLARE
  query VARCHAR;
  rs RESULTSET;
BEGIN
  -- Get the query from Honeydew
  SELECT SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_FIELDS(
    :workspace, :branch, :domain, :attributes, :metrics, :filters) INTO :query;

  -- Run the query
  rs := (EXECUTE IMMEDIATE :query);

  RETURN TABLE(rs);
END;
$$
;

-- Set a Honeydew parameter called $PARAM
SET param=10;

-- Get data based on Honeydew generated SQL
CALL SELECT_FROM_FIELDS_WITH_SESSION_VARIABLES_AS_PARAMETERS(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- domain (can be NULL)
        'domain',
        -- dimensions - can be ad-hoc expressions
        ['entity.attr', ...],
        -- metrics - can be named or ad-hoc expressions
        ['entity.metric', 'SUM(entity.measure)', ...],
        -- filters - can be named attributes or ad-hoc expressions
        ['entity.attr', 'entity.attr > 0', 'entity.name like ''%cheese%''', ...]
);

BI SQL Interface wrapper

It is possible to use the native application as a wrapper to the live SQL interface:

Get data from ad-hoc SQL:

CALL SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SELECT_FROM_QUERY($WORKSPACE, $BRANCH, 'select ... from world');

Get compiled SQL query from ad-hoc SQL:

SELECT SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_QUERY($WORKSPACE, $BRANCH, 'select ... from world');

Dynamic Datasets

Metadata

List all dynamic datasets in the given workspace and branch:

select * from table(SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SHOW_DYNAMIC_DATASETS($WORKSPACE, $BRANCH));

Add/Update/Delete

Add or update a dynamic dataset dataset in the given workspace and branch, for a specific domain:

call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_DYNAMIC_DATASET(
        -- workspace / branch
        $WORKSPACE, $BRANCH,
        -- dataset name
        'dataset',
        -- domain (can be NULL for a global-level dataset)
        'domain',
        -- dimensions - can be ad-hoc expressions
        ['entity.attr', ...],
        -- metrics - can be named or ad-hoc expressions
        ['entity.metric', 'SUM(entity.measure)', ...],
        -- filters - can be named attributes or ad-hoc expressions
        ['entity.attr', 'entity.attr > 0', ...]
    );

Delete a dynamic dataset dataset from a given workspace and branch:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'dataset');

Get Data or SQL

Get the data for a dynamic dataset dataset:

call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SELECT_FROM_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'dataset');

Get the SQL query for a dynamic dataset dataset:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'dataset');

Deployment

Deploy the dynamic dataset dataset according to its deployment settings:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DEPLOY_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'dataset');
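Put together, the calls in this section form a simple lifecycle: define a dataset, read from it, inspect its SQL, and remove it. The following is a sketch with hypothetical names (dataset revenue_by_priority, entity orders, and its fields), not a prescribed workflow:

```sql
-- Define (or update) a global-level dataset; all names are hypothetical
call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ALTER_DYNAMIC_DATASET(
        $WORKSPACE, $BRANCH,
        -- dataset name
        'revenue_by_priority',
        -- domain (NULL for a global-level dataset)
        NULL,
        -- dimensions
        ['orders.o_orderpriority'],
        -- metrics
        ['SUM(orders.o_totalprice)'],
        -- filters
        ['orders.o_totalprice > 0']
    );

-- Read the data of the dataset
call SEMANTIC_LAYER_ENTERPRISE_EDITION.API.SELECT_FROM_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'revenue_by_priority');

-- Inspect the SQL behind it
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.GET_SQL_FOR_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'revenue_by_priority');

-- Remove the dataset when it is no longer needed
select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.DROP_DYNAMIC_DATASET($WORKSPACE, $BRANCH, 'revenue_by_priority');
```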

Plaintext questions to data (using Cortex LLMs)

Ask the semantic layer plaintext questions that are translated to the correct Snowflake query:

select SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ASK_QUESTION(
    workspace => $WORKSPACE,
    branch => $BRANCH,
    
    /* The plaintext data question to ask */
    question => 'show me revenue broken down by month and city, for the last 2 years',

    /* The following parameters are optional: */

    /* History of previous questions and answers, to support a chat experience.
     * The argument is an array of objects representing a conversation in chronological order.
     * Each object contains a role key and a content key.
     * The content value is a prompt or a response, depending on the role.
     * The role can be either 'user' or 'assistant'.
     */
    history => [
        {'role': 'user', 'content': 'My previous question'},
        {'role': 'assistant', 'content': 'The answer to the previous question'}
    ],

    domain => $MY_LLM_DOMAIN,   /* A dedicated domain, narrowing down the semantics for the LLM */

    /* Any model that is supported by CORTEX.COMPLETE */
    cortex_llm_name => 'mistral-large2', /* default is 'mistral-large2' */

    /* The default results limit for the query */
    default_results_limit => 10000,  /* default is 10,000 */
    
    /* A value from 0 to 1 (inclusive) that controls the randomness of the output of the language model.
     * A higher temperature (for example, 0.7) results in more diverse and random output,
     * while a lower temperature (such as 0.2) makes the output more deterministic and focused.
     */
    temperature => 0.2,  /* Default is 0.2 */

    /* Sets the maximum number of output tokens in the response. Small values can result in truncated responses. */
    max_tokens => 8192  /* Default is 8192 */
);

Returns a JSON object containing the following keys:

  • llm_response: A descriptive response from the language model
  • llm_response_json: The JSON representation of the semantic query
  • perspective: A JSON representing the attributes, metrics and filters of the query
  • sql: The generated SQL query
  • error: An error message if an error was encountered
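Individual keys can be pulled out of the response with Snowflake's semi-structured path syntax. This sketch assumes the function returns a VARIANT or OBJECT value; if it returns the JSON as a string instead, wrap the call in PARSE_JSON first:

```sql
WITH answer AS (
    SELECT SEMANTIC_LAYER_ENTERPRISE_EDITION.API.ASK_QUESTION(
        workspace => $WORKSPACE,
        branch => $BRANCH,
        question => 'show me revenue broken down by month and city, for the last 2 years'
    ) AS response
)
SELECT
    response:llm_response::string AS llm_response,   -- descriptive answer from the model
    response:sql::string          AS generated_sql,  -- SQL that can be run separately
    response:error::string        AS error_message   -- NULL when no error is reported
FROM answer;
```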

Only the mistral-large2 and llama3.1-405b models demonstrated adequate performance. Other models may not perform as expected.
