Order of Filtering and Computation
Every query on Honeydew computes needed metrics and applies user filters based on context.
The context of a query includes the domain it operates on the source it came from (SQL interface or a Dynamic Dataset).
The logical order of applying filters:
- Source data: as defined in source tables, including any custom SQL filters
- Domain source filters that filter out data from specific source tables, before any computation
- Entities and their calculated attributes: including calculated attributes that use filtered metrics that add their own filter
- Domain semantic filters that control accessible data across the semantic layer
- User query filters:
- Metrics, including filters of filtered metrics in use
To get the best performance, filters are pushed down and executed as early as possible without affecting results.The engine automatically places the filter execution at the earliest point in the query flow.
Example
Consider:
- A semantic schema with
lineitem, orders and customers entities
- A domain set up with:
- Source filters
orders.order_date >= '1990-01-01' and lineitem.status = 'F'
- Semantic filter
customers.c_nationkey = 10
- A user that asks for
orders.count with a query filter of orders.order_date >= '1996-01-01'
The following filters will apply logically:
- Select from
orders with source filter orders.order_date >= '1990-01-01'
- Select from
customers
- Apply semantic filter
customers.c_nationkey = 10:
- Filter the
customers entity and join it to orders
- Note that
orders gets the filter through its INNER join to filtered customers
- Apply query filter
orders.order_date >= '1996-01-01'
- Calculate
orders.count on the resulting filtered data
Note that while logically this is the order, physically the filter in step (5) orders.order_date >= '1996-01-01' will be actually executed at step (1), when selecting from orders.
The reason is that it can be safely pushed down earlier in the query execution without affecting results.
The source filter on lineitem did not apply as lineitem did not participate in the query and a source filter
only acts on its source entity when it is needed for producing the result.However, the semantic filter for customers was applied to the result by adding a join to customers.