Multi-Grain Tables

Introduction

A multi-grain table or a mixed-grain table mixes two or more such levels in the same structure. For instance:

A sales fact table that stores daily totals by store but also includes monthly totals in the same table.
A web analytics table where some events are logged at the session level and others at the page-view level.
A financial fact table that records both transaction-level entries and quarterly adjustments.

The grain of a table is the level of detail at which each row is stored. A fine grain example: one row per order line item (most detailed). A coarser grain example: one row per order or one row per day. In general, multi-grain data falls in two categories that lead to different modeling approaches:

Mixed-Grain: Each data point has a single grain, but they are mixed: some web events are at session-level and some are at page-view level.
Multi-Grain: Each data point has multiple grains: store sales are recorded both daily and monthly.

Why it matters

Complicates joins and aggregations: Queries can double-count if the analyst doesn’t know which grain they are working with.
Harder governance: Business logic like “one row = one transaction” no longer holds.

Honeydew can enforce single-grain consistency to avoid ambiguity.

Modeling Category 1: Different data, different grains (“Mixed-Grain”)

Each data point exists at a single grain, but grains differ across event types (e.g., session-level vs page-view-level events). To model different data recorded at different grains, separate grains to different fact tables. For example, for web analytics table that mixes session-level and page-view level events, separate to:

fact_sessions (grain = session)
fact_pageviews (grain = page view), typically connected to its session by session_id with a many:1 relationship.

Don’t denormalize into one wide table — it introduces double-counting traps and complex modeling.

Modeling Category 2: Same data, multiple versions (“Multi-Grain”)

Each data point exists at more than one grain (e.g., store sales recorded daily and monthly). The is a more complex case as that introduces duplicated data into the semantic data. Honeydew supports modeling multi-grain tables:

Ensure join consistency without row duplication.
Adjust metric logic by the chosen grain.
Hide the “multi-grain” data modeling complexity from the business user or AI agent.

Multiple grain tables for same data

While a multi-grain table is a single table with data at different grains, some data models have separate tables for the same data, each at a different grain, but not mixed together. For example:

sales_daily_fact: 1 row per store per day.
sales_monthly_fact: 1 row per store per month. The computation would include end-of-month adjustments, so it is not a simple transformation from sales_daily_fact.

To handle this case in Honeydew, transform multiple grain tables into a single multi-grain entity. Create an entity with Custom SQL that UNIONs the facts:

SELECT 'daily' as grain, * FROM sales_daily_fact
UNION ALL
SELECT 'monthly' as grain, * FROM sales_monthly_fact

Always use UNION ALL to union multiple grains rather than UNION for significantly faster performance

How to Handle Duplicate Grains

Consider a multi-grain fact fact_sales that records per store both the daily and the monthly sales volume, as indicated in the grain column. If naively counting per store sales volume as SUM(sales), the result would be wrong: it would double count daily and monthly totals. When operating with multi grain data, must always deduplicate it. There are two deduplication approaches:

At the domain level, applying to all calculations at once
At per-metric level, applying to only a particular metric

Domain-Level Deduplication

With domain-level deduplication, double counting is avoided by removing the duplicated data via a Domain Source Filter that enforces a choice of grain.

Domain-level deduplication ensures any single user query sees only one grain of the data - the one it chose to show.Different queries can operate at different grains, but no query can see multiple grains at once.

Create a table called dim_grain with the list of possible grains. That would be used by the user to choose current grain by filtering on it.
Create a source filter in a domain that enforces a specific grain is chosen:

type: domain
name: all

entities:
  - fact_sales
  - ...
  # Grain choice entity
  - dim_grain

# Ensure a single grain is always chosen in the fact - 'daily' by default
source_filters:
  - fact_sales.grain = GET_FIELD_SELECTION(dim_grain.grain, 'daily')

A metric that counts sales is simple:

fact_sales.total_sales = SUM(fact_sales.sales)

Since it runs after the source filter is applied, it only counts over a single version of the data. If different counting is required, can create more complex logic based on the currently chosen grain:

fact_sales.total_sales =
    CASE GET_FIELD_SELECTION(dim_grain.grain)
        -- simple summing at a daily grain
        WHEN 'daily' THEN SUM(fact_sales.sales)
        -- different logic for monthly grain - include adjustments
        WHEN 'monthly' THEN SUM(fact_sales.sales - fact_sales.monthly_adjustment)
    END

For more information about GET_FIELD_SELECTION, see conditional filtering.

Metric-Level Deduplication

With metric-level deduplication, double counting is avoided in metric definitions, but data itself stays duplicated.

Metric level deduplication does not reduce data in the fact entity.It requires attention in every metric. Summing up sales without a filter (by mistake) would otherwise result in duplicated data.

Use this approach when duplicated data is required for cross-grain computation and a domain level deduplication is not applicable. To create metric level deduplication use filtered metrics, such as:

fact_sales.total_daily_sales =
    SUM(fact_sales.sales) FILTER (WHERE fact_sales.grain = 'daily')

fact_sales.total_monthly_sales =
    SUM(fact_sales.sales) FILTER (WHERE fact_sales.grain = 'monthly')

To avoid multiple metrics with multiple names, can also create a single metric that enforces deduplication using conditional filtering:

-- This metric is computed at the 'daily' level, unless the user chooses 'monthly' by filtering on fact_sales.grain = 'monthly'
-- Regardless to whether the user chose a grain or not, it will always compute correctly.
fact_sales.total_sales =
   SUM(fact_sales.sales) FILTER (WHERE fact_sales.grain = GET_FIELD_SELECTION(fact_sales.grain, 'daily'))

This can be further extended to handle different logic for different grains, such as:

fact_sales.total_sales =
    CASE GET_FIELD_SELECTION(dim_grain.grain)
        -- simple summing at a daily grain
        WHEN 'daily' THEN SUM(fact_sales.sales) FILTER (WHERE fact_sales.grain = 'daily')
        -- different logic for monthly grain - include adjustments
        WHEN 'monthly' THEN SUM(fact_sales.sales - fact_sales.monthly_adjustment) FILTER (WHERE fact_sales.grain = 'monthly')
    END

For more information about GET_FIELD_SELECTION, see conditional filtering.

Quick Start

Migration Guides

Reference

Integrations

Security

Release Notes

Introduction

Modeling Category 1: Different data, different grains (“Mixed-Grain”)

Modeling Category 2: Same data, multiple versions (“Multi-Grain”)

Multiple grain tables for same data

How to Handle Duplicate Grains

Domain-Level Deduplication

Metric-Level Deduplication

Quick Start

Migration Guides

Reference

Integrations

Security

Release Notes

​Introduction

​Modeling Category 1: Different data, different grains (“Mixed-Grain”)

​Modeling Category 2: Same data, multiple versions (“Multi-Grain”)

​Multiple grain tables for same data

​How to Handle Duplicate Grains

​Domain-Level Deduplication

​Metric-Level Deduplication

Introduction

Modeling Category 1: Different data, different grains (“Mixed-Grain”)

Modeling Category 2: Same data, multiple versions (“Multi-Grain”)

Multiple grain tables for same data

How to Handle Duplicate Grains

Domain-Level Deduplication

Metric-Level Deduplication