Entity Caching

Entities may have calculated attributes that are expensive to compute, for example those involving aggregations, JOINs, or large table scans.

By default, every access to a calculated attribute recomputes it.

The entity cache materializes entities to avoid recomputation. The cache is refreshed either by triggers or on a set schedule.

The cache is stored as tables in your Snowflake account and is recomputed according to its configuration.

When the entity cache is enabled, Honeydew automatically uses cached data where applicable.

Entity Cache refresh orchestration requires dbt integration. Contact support@honeydew.ai for any other orchestration requirements.

Set up with dbt

To set up dbt as a cache orchestrator:

  1. In dbt, create an entity cache model by using the Honeydew dbt cache macro
  2. In dbt, use the config macro to set up materialization settings such as clustering or dynamic tables
  3. In Honeydew, set the entity’s dbt delivery settings to the chosen dbt model name

Set up a dbt source for the entity table in Honeydew, and Honeydew will maintain the reference.

For example, this can be the customers model in dbt:

-- Set up materialization parameters for cache
{{ config(materialized='table') }}

-- Set up any additional dependencies in dbt with
-- depends_on: {{ ref('upstream_parent_model') }}

-- Cache for customers entity
{{ get_honeydew_entity_sql('customers') }}
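
As a variation on the model above, the config call can also carry Snowflake-specific materialization settings. The following is a minimal sketch that assumes the dbt-snowflake adapter and its cluster_by option; the clustering column customer_id is a hypothetical placeholder, not a value taken from Honeydew.

```sql
-- Materialize the cache as a table clustered by a (hypothetical) key column
{{ config(materialized='table', cluster_by=['customer_id']) }}

-- Cache for customers entity
{{ get_honeydew_entity_sql('customers') }}
```

For a dynamic table, the dbt-snowflake adapter instead takes materialized='dynamic_table' together with the target_lag and snowflake_warehouse options. Suitable values depend on how fresh the cache needs to be.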

The dbt is_incremental() function can be combined with the Honeydew SQL macro to build incremental caches. However, check whether the computation itself has changed between runs, to avoid mixing different versions of the logic in the same table.
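
An incremental cache might look roughly like the sketch below. It rests on several assumptions: that the macro renders a single SELECT statement that can be wrapped in a subquery, and that the entity exposes a key column and a change timestamp. The names customer_id and updated_at are hypothetical and should be replaced with the entity's actual columns.

```sql
-- Incremental cache sketch; customer_id and updated_at are hypothetical columns
{{ config(materialized='incremental', unique_key='customer_id') }}

select *
from (
    -- Assumes the macro renders a single SELECT statement
    {{ get_honeydew_entity_sql('customers') }}
) as entity_rows
{% if is_incremental() %}
-- On incremental runs, keep only rows newer than what the cache already holds
where entity_rows.updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

If the entity's logic has changed since the last run, a full rebuild with dbt run --full-refresh --select customers replaces the table, so that rows computed with the old and new logic are not mixed.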