Domains
Creating Business Context
A user in Finance or Marketing on a semantic layer does not want or may not be allowed access to everything. They want to see semantics relevant to them, with their context applied.
That concept is called in Honeydew a “Domain”.
A Domain is a lightweight governance object that allows to build context for users, as well as maintain access control on data and on metadata.
That context includes:
- Selection of entities and sub-selection of fields, that are accessible when using the domain
- Selection of relationships when multiple ways to connect entities are available
- Selection of filters that must be applied on every query on the domain
- Selection of parameter overrides that apply in the domain context
Selecting entities
Domain allows to select entities that participate. For example, a domain that select a subset of entities from TPCH:
type: domain
name: orders_domain
description: Orders Tracing Domain
# Entities and fields that participate
entities:
- name: lineitem
- name: orders
- name: partsupp
- name: part
Filters
Domain may control how data is filtered.
Semantic Filters
Filters within a domain apply to every query on the domain. Think of them as filters every user must add to every query.
For example,
type: domain
name: orders_domain
description: Domestic Orders Tracing Domain
# Entities and fields that participate
entities:
- name: lineitem
- name: orders
- name: partsupp
- name: part
# Semantic Filters that always apply to every query on the domain
filters:
- lineitem.l_shipmode in ('MAIL', 'RAIL', 'TRUCK')
Semantic filters can be ad-hoc expressions (as in the example above) or reference named predefined filters.
A semantic filter will be always added to a query in a domain context. That means that semantic filters may have a perfomrance impact when they introduce more JOINs.
For example, this SQL query on the domain above that asks for part count:
SELECT AGG("part.count") FROM domains.orders_domain
will include a JOIN to lineitems
even though it was not directly referenced in the query to make sure only parts that where shipped over ground are included.
Filtering based on User Context
A common use case is filtering data based on a parameter that sets user context.
For example, if:
- Every user operates within a tenant context
- Data in Snowflake is partitioned by a tenant column (in the example below in a tenant dimension
dim_tenant
, though can be a column in every table) - Users are only allowed to see data within their own tenant
Set $TENANT
user parameter, and use it as a domain filter:
type: domain
# Entities that participate ..
filters:
- dim_tenant.tenant_id = $TENANT
Apply the parameter in the manner appropriate for the user queries (whether it is through BI connection settings or Snowflake SET
statements).
If filtering data through a filtering dimension, make sure it is cross filtering the data with a one-to-many
filtering direction.
Source Filters
Source filters are filters that are applied at the source level (unlike semantic filters that apply to the semantic layer as a whole).
The main use case for source filter is improving performance with logically partitioned data by always pushing filters below calculated attributes.
Thus, source filters are only supported on attributes that come from an entity source table.
A source filter is only applied if the source is part of the query.
type: domain
name: orders_domain
description: Domestic Orders Tracing Domain
# Entities and fields that participate
entities:
- name: lineitem
- name: orders
- name: partsupp
- name: part
source_filters:
- lineitem.l_shipdate >= '1994-01-01' and lineitem.l_shipdate < '1995-01-01'
Caution: Source filters apply before any other computation is done, which can change the values of calculated attributed. See filtering order for more details.
When in doubt, use a semantic filter, not a source filter.
Roles
Domain may control active relationships.
Active Roles
When there are multiple possible ways to connect entities (for example in TPCH, a nation might be connected to supplier or customer) domain allows to select the active role to use.
Active roles enable to choose a relation for entities that have multiple relations. But what if want to enable easily using more than one relation path? See aliases below.
Example (based on TPCH derived example) - include all entities relevant to supply chain, and connect nation and region always through supplier.
type: domain
name: tpch_supply_chain
entities:
- name: lineitem
- name: orders
- name: partsupp
- name: part
- name: nation
- name: region
active_roles:
- supplier_nation
- Active roles must result in an acyclic graph
- If an entity that participates in the domain can assume more than one role, at most one of them may be set
Roles Aliases
Role Aliases in domains is in Beta - reach out to support@honeydew.ai to enable in your account
When a domain includes the same entity that connected via multiple relationships, may give an alias for each chosen role.
type: domain
name: tpch_example_when_nation_is_supplier
entities:
- name: nation
role_path: [supplier_nation]
alias: supplier_nation
- name: region
role_path: [supplier_nation]
alias: supplier_region
- name: nation
role_path: [customer_nation]
alias: customer_nation
- name: region
role_path: [customer_nation]
alias: customer_region
Notes:
- The entity under a chosen role(s) must have a unique alias (not a name of another entity)
- The entity should be able to assume the chosen role(s)
- Indirect entities are not allowed: when setting an explicit alias and role for an entity indirect to the role (i.e. for region when the role is on nation) , must also have an alias for every entity on the role path.
Interfaces
Domains on SQL interface
Domains are present as a flat table in the domains
schema.
- Attributes that are part of the domain are accessible as SQL columns.
- Metrics that are part of the domain are accessible as SQL columns that can be aggregated on.
- All filters of the domain apply (in addition to any filters in the SQL query).
See SQL interface for more details.
Domains as a context for dynamic datasets
A dynamic dataset can be associated with a domain. In that case, all domain configuration applies to the dynamic dataset query.
YAML Schema
Every domain is backed by a text file in git that defines it, and keeps history of every change.
The schema for a domain is:
type: domain
name: [name]
display_name: [display name]
owner: [owner]
description: |-
[desciption]
labels: [...]
folder: [folder]
hidden: [True/False/Yes/No]
entities:
- name: [entity name]
fields:
- name: [field name]
- ...
includes: [field pattern to include]
excludes: [field pattern to exclude]
role_path: [role]
alias: [alias for role]
- ...
filters:
- [filter expression]
- ...
source_filters:
- [source filter expression]
- ...
active_roles:
- [active role]
- ..
parameters:
- name: [parameter name]
value: [parameter value]
- ...
short_term_aggregate_cache_ttl_seconds: [the duration for which a query result can be reused as part of the short-term aggregate cache]
Fields:
name
: Name of domaindisplay_name
,owner
,description
,labels
,folder
,hidden
: Metadataentities
: List of entities that participate in the domainname
: Name of entityfields
: List of fields from the entity that participate in the domainname
: Name of field (attribute or metric)
includes
: Optional field pattern to includeexcludes
: Optional field pattern to excluderole_path
: Optional role pathalias
: Optional alias for entity
filters
: List of semantic filterssource_filters
: List of source filtersactive_roles
: List of active rolesparameters
(optional): List of parameters and values (override workspace values if they exist)short_term_aggregate_cache_ttl_seconds
(optional): Control short term aggregate caching. See Automatic Aggregate Caching.
Was this page helpful?