Entity Caching
Entities may have calculated attributes that are expensive to compute - with aggregations, JOINs or large table scans.
By default, every access to a calculated attribute recomputes it. However, for most calculated attributes recomputation is not necessary.
Entity cache enables materializing entities, in order to avoid re-computation. The cache is updated upon triggers or according to a set schedule.
By employing calculated attributes for multi-step calculations, you can build complex ELT data transformations directly in Honeydew.
The cache is stored as tables in your Snowflake account, and is recomputed based on configuration.
In addition to Snowflake tables, entities can also be cached in supported BI tools that have memory capacity for cached data. See Power BI data caching for entity caching in Power BI.
When entity cache is enabled, Honeydew will automatically leverage data in the cache when applicable.
Entity caches are singular and are shared across domains and branches.
Configuration
In the entity YAML schema, configure the cache delivery settings for Snowflake:
type: entity
# ... entity configuration
delivery:
  # enable Snowflake as cache
  use_for_cache: snowflake
  snowflake:
    enabled: true
    # snowflake delivery settings (where the entity cache resides)
    name: <name_of_table>
    schema: <name_of_schema>
    target: table/view
Configuration when using dbt as orchestrator
In the entity YAML schema, configure the cache delivery settings for dbt:
type: entity
# ... entity configuration
delivery:
  # enable dbt materialization as cache
  use_for_cache: dbt
  dbt:
    enabled: true
    # dbt settings (name of dbt model that creates the table in Snowflake)
    dbt_model: name_of_model
Orchestration
Entity cache refresh relies on external orchestration (with dbt, Apache Airflow, or otherwise) or on manual deployments.
Set up with Honeydew
Use the Honeydew deploy functionality to write the entity cache to a table from the UI.
For Snowflake, you can also use the Native Application Deploy API:
select SEMANTIC_LAYER.API.DEPLOY_ENTITY(
-- workspace & branch
'workspace_name', 'branch_name',
-- entity name
'entity_name'
);
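Because cache refresh relies on orchestration, one option is to schedule the deploy call from inside Snowflake. The following is a minimal sketch, not a documented Honeydew setup: the task name REFRESH_ENTITY_CACHE and the warehouse MY_WH are hypothetical, and it assumes the role that owns the task is allowed to call the Native Application API.

```sql
-- Hypothetical names: REFRESH_ENTITY_CACHE (task) and MY_WH (warehouse).
-- Assumption: the task owner role has access to SEMANTIC_LAYER.API.
create or replace task REFRESH_ENTITY_CACHE
  warehouse = MY_WH
  -- run every day at 03:00 UTC
  schedule = 'USING CRON 0 3 * * * UTC'
as
  select SEMANTIC_LAYER.API.DEPLOY_ENTITY(
    -- workspace & branch
    'workspace_name', 'branch_name',
    -- entity name
    'entity_name'
  );

-- Tasks are created suspended; resume the task to start the schedule.
alter task REFRESH_ENTITY_CACHE resume;
```

An external orchestrator such as Apache Airflow could issue the same select statement on its own schedule instead.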
To build the cache table yourself in Snowflake, use the Native Application API to get the SQL for the entity cache:
select SEMANTIC_LAYER.API.GET_SQL_FOR_ENTITY(
-- workspace & branch
'workspace_name', 'branch_name',
-- entity name
'entity_name'
);
Create the table in Snowflake using that SQL. Honeydew uses the table update time to detect cache validity.
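The two steps (fetch the SQL, then create the table) can be combined in one Snowflake Scripting block. This is a sketch under stated assumptions: it assumes the API returns a single SELECT statement as a string, and MY_SCHEMA.MY_ENTITY_CACHE is a hypothetical target that should match the name and schema in the entity's delivery settings.

```sql
execute immediate $$
declare
  entity_sql string;
  ddl string;
begin
  -- Assumption: the API returns one SELECT statement as a string.
  entity_sql := (select SEMANTIC_LAYER.API.GET_SQL_FOR_ENTITY(
    'workspace_name', 'branch_name',
    'entity_name'
  ));
  -- Hypothetical target table; align it with the entity's delivery settings.
  ddl := 'create or replace table MY_SCHEMA.MY_ENTITY_CACHE as ' || entity_sql;
  execute immediate :ddl;
  return 'entity cache table rebuilt';
end;
$$;
```

Recreating the table also refreshes its update time, which is what Honeydew checks to detect cache validity.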
Set up with dbt
To set up dbt as a cache orchestrator:
- In dbt, create an entity cache model by using the Honeydew dbt cache macro
- In dbt, use the config macro to set up materialization settings such as clustering or dynamic tables
- In Honeydew, set the entity’s dbt delivery settings to the chosen dbt model name
For example, this can be the customers model in dbt:
-- Set up materialization parameters for cache
{{ config(materialized='table') }}
-- Set up any additional dependencies in dbt with
-- depends_on: {{ ref('upstream_parent_model') }}
-- Cache for customers entity
{{ honeydew.get_entity_sql('customers') }}
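As an illustration of materialization settings, the same model could request clustering through the dbt-snowflake cluster_by config. This is a sketch only: customer_id is a hypothetical column name, assumed to exist in the entity's output.

```sql
-- Same cache model, with the table clustered in Snowflake.
-- customer_id is a hypothetical column of the customers entity.
{{ config(materialized='table', cluster_by=['customer_id']) }}

-- Cache for customers entity
{{ honeydew.get_entity_sql('customers') }}
```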
The dbt is_incremental() function may be used in combination with the Honeydew SQL macro for incremental caches.
However, make sure to check whether the computation itself has changed between runs, to avoid mixing different versions of logic in the same table. See Incremental Aggregate Updates for the recommended incremental update pattern.
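A minimal sketch of an incremental cache model follows. It is not the recommended pattern referenced above, only an illustration of combining is_incremental() with the macro. It assumes the macro emits a SELECT statement that can be wrapped in a CTE, and customer_id and updated_at are hypothetical columns.

```sql
-- Hypothetical columns: customer_id (unique key) and updated_at (change timestamp).
{{ config(materialized='incremental', unique_key='customer_id') }}

with entity as (
    -- Assumption: the macro output is a SELECT that can be wrapped in a CTE.
    {{ honeydew.get_entity_sql('customers') }}
)
select * from entity
{% if is_incremental() %}
  -- on incremental runs, keep only rows newer than what is already cached
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

This sketch does not detect changes in the entity's logic. When the logic changes, running dbt with the --full-refresh flag rebuilds the table so that old and new versions of the computation are not mixed.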
See more in the Honeydew dbt documentation.