Overview

Performance Optimization

Performance is a concern in few scenarios:

Complex multi-stage computations
Large data amounts for dashboard aggregations
Ad-hoc exploration over large datasets

Honeydew automatically optimizes any query it processes (see below). In addition, depending on the use case, users can leverage:

Filtering and sampling to reduce data scanning in ad-hoc workloads
Aggregate aware caching to reduce computation with dynamic dashboards
Dataset materialization when the dataset can be fully pre-computed
Entity Caching to reduce computation with complex calculated attributes

Different approaches benefit different use cases.

Workload	Queries	Performance approach
Ad-hoc data exploration	Unpredictable	Filtering and Sampling
Dynamic dashboards and BI	Predictable	Aggregate Aware Caching
Datasets for data science or 3rd party integration	Fully known	Materialization
Massive data transformation (ELT in Honeydew)	Batch	Entity Caching

Snowflake Source Tables

Honeydew does not control the Snowflake configuration of source data. It is advised to follow Snowflake’s best practices for clustering and table structure, as they will benefit any query generated by Honeydew.

Automatic Query Optimizations

All Honeydew generated queries are optimized for high performance and low cost.

Push down filters

If there are filters in the query, they will be pushed down to the data, limiting the amount of data scanned. See order of filtering for more details.

Filter pushdown works best when it matches the table clustering in Snowflake.

Join pruning

After pushing down filters and using caches, Honeydew engine will only join the tables that are actually needed for a specific user query.

Using a normalized schema coupled with join pruning is often faster than using pre-joined denormalized wide tables.

Aggregate aware caching

If a there is a valid pre-aggregate cache that matches the specific groups, filters and aggregations used in a query - Honeydew engine will leverage it instead of computing the aggregation. Honeydew can detect partial aggregations that leverage roll-up automatically. For example, if there is a pre-aggregated cache for revenue by day, Honeydew can use it to calculate revenue by year.

Pre-aggregate caches help best with workloads than operate on a repeating set of groups and metrics, such as BI dashboards or pivot tables.

Entity caching

If an entity calculated attribute is used in a query, and a valid cache exists for the entity, then the cache will be used instead of recomputing the calculated attribute. Filters, if applicable, are pushed down to the entity cache as well.

Entity calculated attributes that are using window functions or aggregations benefit the most from an entity cache.

Quick Start

Migration Guides

Reference

Integrations

Security

Release Notes

Performance Optimization

Snowflake Source Tables

Automatic Query Optimizations

Push down filters

Join pruning

Aggregate aware caching

Entity caching

Quick Start

Migration Guides

Reference

Integrations

Security

Release Notes

​Performance Optimization

​Snowflake Source Tables

​Automatic Query Optimizations

​Push down filters

​Join pruning

​Aggregate aware caching

​Entity caching

Performance Optimization

Snowflake Source Tables

Automatic Query Optimizations

Push down filters

Join pruning

Aggregate aware caching

Entity caching