Order of Filtering
Order of Filtering and Computation
Every query on Honeydew computes needed metrics and applies user filters based on context.
The context of a query includes the domain it operates on the source it came from (SQL interface or a Dynamic Dataset).
The logical order of applying filters:
- Source data: as defined in source tables, including any custom SQL filters
- Domain source filters that filter out data from specific source tables, before any computation
- Entities and their calculated attributes: including calculated attributes that use filtered metrics that add their own filter
- Domain semantic filters that control accessible data across the semantic layer
- User query filters:
- When using the SQL interface, filters in the SQL (
WHERE
andHAVING
) - When using Dynamic Datasets, filters in the Dynamic Dataset definition
- When using the SQL interface, filters in the SQL (
- Metrics, including filters of filtered metrics in use
To get the best performance, filters are pushed down and executed as early as possible without affecting results.
The engine automatically places the filter execution at the earliest point in the query flow.
Example
Consider:
- A semantic schema with
lineitem
,orders
andcustomers
entities - A domain set up with:
- Source filters
orders.order_date >= '1990-01-01'
andlineitem.status = 'F'
- Semantic filter
customers.c_nationkey = 10
- A user that asks for
orders.count
with a query filter oforders.order_date >= '1996-01-01'
The following filters will apply logically:
- Select from
orders
with source filterorders.order_date >= '1990-01-01'
- Select from
customers
- Apply semantic filter
customers.c_nationkey = 10
:- Filter the
customers
entity and join it toorders
- Note that
orders
gets the filter through its INNER join to filteredcustomers
- Filter the
- Apply query filter
orders.order_date >= '1996-01-01'
- Calculate
orders.count
on the resulting filtered data
Note that while logically this is the order, physically the filter in step (5) orders.order_date >= '1996-01-01'
will be actually executed at step (1), when selecting from orders
.
The reason is that it can be safely pushed down earlier in the query execution without affecting results.
The source filter on lineitem
did not apply as lineitem
did not participate in the query and a source filter
only acts on its source entity when it is needed for producing the result.
However, the semantic filter for customers
was applied to the result by adding a join to customers
.
Was this page helpful?