Facts and Shared Dimensions
Introduction
A shared (also called conformed) dimension is a dimension that has the same meaning to every fact with which it relates.
For example, a user
entity might appear in different facts:
logins
fact table describing login eventsvisits
fact table describing visits events
A possible query would be to get the count of visit and login events per user, counting separately on two facts and then joining on the user id. Or get a list of all user events (login and visit) together, requiring to filter the facts and then unioning the result of two facts.
Honeydew automatically detects shared dimensions and supports building the correct queries automatically.
You can see them in the entity graph as entities that have more than one arrow of a higher granularity pointing to them.
The dynamic dataset for counting visits and logins per user will look like:
- Attributes:
user.id
,user.name
- Metrics:
logins.count
,visits.count
Honeydew engine will detect that user is a shared dimension and build the query accordingly.
Not every combination of fields is possible with shared dimensions.
For example, a user can have both associated visits and associated logins from two fact tables that conform to user. However, without including a user, a visit by itself does not have related login events (a many to many relationship). Thus a query that asks for count of logins per visit is impossible to compute. Including it in a dynamic dataset will lead to failure.
Metrics that combine multiple facts
A common use case with shared dimensions is querying one fact based on information from another fact with a connected shared dimension.
For example, would want to build a metric that counts the visits of all users that had a login last month.
Since there is no direct connection between visits and logins to join on, need to build an intermediate field per user in the users
entity
that takes data from the logins
fact, so it can be used later on when joining users with visits
.
To implement the question above:
-
Build a calculated attribute in the
users
shared entity namedlogins_last_month
defined asThis counts per user the logins (assuming logins has a
count
metric) in the last month (assuming atime
attribute), at the granularity of a user (since the calculated attribute is part of the user entity). See also how metrics behave in calculated attributes. -
Can build a dynamic dataset that uses it as a filter to get list of visits to test:
- Attributes of visits such as
visits.visit_id
. - The filter
users.logins_last_month > 0
.
- Attributes of visits such as
Since that dataset only combines users with visits, it is possible to compute.
-
Build a reusable metric in
visits
calledcount_visits_of_recently_logged_in_users
defined asThis is a reusable metric based on data from two facts, counting first at the login fact granularity (logins last month) and then at the visit fact granularity (visits of users that logged in last month).
Was this page helpful?