Documentation Index
Fetch the complete documentation index at: https://honeydew.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Pre-aggregated caches are typically rebuilt in full on each refresh. For large aggregates,
a full rebuild can be slow or expensive. Incremental updates let you compute only the changed
rows and merge them into the existing table.
How it works
The incremental pattern uses two dynamic datasets:
| Dataset | Purpose |
|---|
| Base aggregate | The permanent pre-aggregated table, deployed once in full |
| Delta aggregate | A copy of the base with an extra date filter - deployed on each incremental run |
After each delta deployment, you merge or append the delta rows into the base table
using a MERGE or INSERT INTO statement.
Setup
1. Create the base aggregate
Create a dynamic dataset with the dimension keys and metrics you want to cache.
Enable it as a pre-aggregate cache:
type: perspective
name: preagg_orders
owner: owner@example.com
attributes:
- date.date
- location.location_id
- menu.menu_item_id
metrics:
- order_detail.count
- order_detail.order_cost
- order_detail.revenue
filters: []
delivery:
use_for_cache: snowflake
snowflake:
enabled: true
name: <name of table>
database: <name of database>
schema: <name of schema>
target: table
type: perspective
name: preagg_orders
owner: owner@example.com
attributes:
- date.date
- location.location_id
- menu.menu_item_id
metrics:
- order_detail.count
- order_detail.order_cost
- order_detail.revenue
filters: []
delivery:
use_for_cache: databricks
databricks:
enabled: true
name: <name of table>
catalog: <name of catalog>
schema: <name of schema>
target: table
2. Create the delta aggregate
Duplicate the base aggregate and add a date filter for the incremental window
(e.g. last day or last 7 days). The delta is a staging table - do not set
use_for_cache on it.
type: perspective
name: preagg_orders_delta
owner: owner@example.com
attributes:
- date.date
- location.location_id
- menu.menu_item_id
metrics:
- order_detail.count
- order_detail.order_cost
- order_detail.revenue
filters:
- date.date >= CURRENT_DATE - 1
delivery:
snowflake:
enabled: true
name: <name of table>
database: <name of database>
schema: <name of schema>
target: table
type: perspective
name: preagg_orders_delta
owner: owner@example.com
attributes:
- date.date
- location.location_id
- menu.menu_item_id
metrics:
- order_detail.count
- order_detail.order_cost
- order_detail.revenue
filters:
- date.date >= CURRENT_DATE - 1
delivery:
databricks:
enabled: true
name: <name of table>
catalog: <name of catalog>
schema: <name of schema>
target: table
Both tables share the same schema. The delta is a filtered subset of the base:
Base aggregate (preagg_orders) - full history:
| date.date | location.location_id | menu.menu_item_id | order_detail.count | order_detail.order_cost | order_detail.revenue |
|---|
| 2024-01-01 | 101 | 5 | 40 | 300.00 | 500.00 |
| 2024-01-01 | 102 | 7 | 20 | 150.00 | 200.00 |
| 2024-01-02 | 101 | 5 | 35 | 280.00 | 450.00 |
| 2024-01-02 | 102 | 7 | 15 | 120.00 | 180.00 |
Delta (preagg_orders_delta) - yesterday only (date.date >= CURRENT_DATE - 1):
| date.date | location.location_id | menu.menu_item_id | order_detail.count | order_detail.order_cost | order_detail.revenue |
|---|
| 2024-01-03 | 101 | 5 | 50 | 400.00 | 600.00 |
| 2024-01-03 | 102 | 7 | 25 | 200.00 | 250.00 |
Base aggregate (preagg_orders) - after incremental run:
| date.date | location.location_id | menu.menu_item_id | order_detail.count | order_detail.order_cost | order_detail.revenue |
|---|
| 2024-01-01 | 101 | 5 | 40 | 300.00 | 500.00 |
| 2024-01-01 | 102 | 7 | 20 | 150.00 | 200.00 |
| 2024-01-02 | 101 | 5 | 35 | 280.00 | 450.00 |
| 2024-01-02 | 102 | 7 | 15 | 120.00 | 180.00 |
| 2024-01-03 | 101 | 5 | 50 | 400.00 | 600.00 |
| 2024-01-03 | 102 | 7 | 25 | 200.00 | 250.00 |
Do not set use_for_cache on the delta aggregate. If the delta is registered as a cache,
queries may be served from partial data.
3. Initial deployment
Deploy the base aggregate once to populate the full historical table, using the
Honeydew UI or deployment API.
Incremental update strategies
Both phases run entirely inside your orchestrator (Airflow, dbt, or similar):
- Deploy the delta - your orchestrator calls the Honeydew API to compute and
write the delta dataset to a staging table. See
Set up with ETL tools.
- Merge or append - your orchestrator then runs the SQL below to move the
delta rows into the base aggregate table.
| Strategy | When to use |
|---|
| Append | Source records are immutable after writing - no late-arriving updates |
| Merge | Source records can change after the fact - use a window wide enough to cover all possible late changes |
Append - new records only
Use this when source records are immutable after they are written (e.g. event logs where
past dates never change).
1. Call the Honeydew API to compute and write the delta dataset for yesterday.
2. Your orchestration tool runs:
INSERT INTO preagg_orders
SELECT * FROM preagg_orders_delta;
Merge - backfill and append
Use this when source records can change after the fact (e.g. orders that are updated
for up to 7 days). Widen the delta filter to cover the backfill window, then use
MERGE to update existing rows and insert new ones.
1. Update the filter in preagg_orders_delta to date.date >= CURRENT_DATE - 7.
2. Call the Honeydew API to compute and write the delta dataset.
3. Your orchestration tool executes:
-- List all columns explicitly - wildcards are not supported in MERGE statements
MERGE INTO preagg_orders target
USING preagg_orders_delta delta
ON (
-- Match condition: identify existing rows by their dimension keys
-- Use IS NOT DISTINCT FROM so NULL keys match correctly
target."date.date" IS NOT DISTINCT FROM delta."date.date"
AND target."location.location_id" IS NOT DISTINCT FROM delta."location.location_id"
AND target."menu.menu_item_id" IS NOT DISTINCT FROM delta."menu.menu_item_id"
)
-- Update metrics only - dimensions are assumed stable and are used as the match key
WHEN MATCHED THEN UPDATE SET
target."order_detail.count" = delta."order_detail.count",
target."order_detail.order_cost" = delta."order_detail.order_cost",
target."order_detail.revenue" = delta."order_detail.revenue"
-- Insert the full row (dimensions + metrics) for new dimension combinations
WHEN NOT MATCHED THEN INSERT (
"date.date",
"location.location_id",
"menu.menu_item_id",
"order_detail.count",
"order_detail.order_cost",
"order_detail.revenue"
) VALUES (
delta."date.date",
delta."location.location_id",
delta."menu.menu_item_id",
delta."order_detail.count",
delta."order_detail.order_cost",
delta."order_detail.revenue"
);
-- List all columns explicitly - wildcards are not supported in MERGE statements
MERGE INTO preagg_orders target
USING preagg_orders_delta delta
ON (
-- Use IS NOT DISTINCT FROM so NULL keys match correctly
target.`date.date` IS NOT DISTINCT FROM delta.`date.date`
AND target.`location.location_id` IS NOT DISTINCT FROM delta.`location.location_id`
AND target.`menu.menu_item_id` IS NOT DISTINCT FROM delta.`menu.menu_item_id`
)
-- Update metrics only - dimensions are assumed stable and are used as the match key
WHEN MATCHED THEN UPDATE SET
target.`order_detail.count` = delta.`order_detail.count`,
target.`order_detail.order_cost` = delta.`order_detail.order_cost`,
target.`order_detail.revenue` = delta.`order_detail.revenue`
-- Insert the full row (dimensions + metrics) for new dimension combinations
WHEN NOT MATCHED THEN INSERT (
`date.date`,
`location.location_id`,
`menu.menu_item_id`,
`order_detail.count`,
`order_detail.order_cost`,
`order_detail.revenue`
) VALUES (
delta.`date.date`,
delta.`location.location_id`,
delta.`menu.menu_item_id`,
delta.`order_detail.count`,
delta.`order_detail.order_cost`,
delta.`order_detail.revenue`
);
Honeydew column names follow the format entity.attribute. Use double quotes in
Snowflake ("date.date") and backticks in Databricks (`date.date`). List all
columns explicitly in MERGE statements - wildcards are not supported.
Use IS NOT DISTINCT FROM instead of = in the ON clause so that NULL dimension
values match correctly across rows.