Foundational delivers actionable cost-saving insights, empowering users to monitor and minimize their data warehouse expenses, whether in BigQuery or Snowflake. Thanks to Foundational’s unique, end-to-end data lineage, we can recommend cost optimizations that won’t disrupt your critical data products or dashboards.
Requirements
To identify cost-saving opportunities, you’ll need to grant Foundational access to your warehouse’s query history (e.g., BigQuery, Snowflake). Please refer to our documentation for instructions on configuring BigQuery or Snowflake.
Reminder: Access to the query history does not give Foundational access to your underlying data. Foundational reads only the queries that were executed, along with some related metadata; it never reads the data stored in your tables.
How Does It Work?
Foundational examines both your query history and pipeline code (to account for operations not visible in the query logs) to create:
A detailed lineage graph.
Table usage metrics, tracking how often each table is read from or written to.
A cost breakdown for each table, pipeline, and computation.
Foundational then applies a set of rules to pinpoint cost-saving opportunities, such as unused tables or pipelines generating unused data. Only high-impact recommendations are surfaced, so you won’t receive alerts for small-cost items like a $2 table, but significant costs (e.g., a $1K table) will be flagged.
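As an illustration of how a rule like this could combine usage metrics with a cost cutoff, here is a minimal Python sketch. It is not Foundational's implementation: the `TableStats` fields, the 90-day window, and the $100 threshold are all assumptions made for the example (the text above only says that a $2 table is ignored and a $1K table is flagged).

```python
from dataclasses import dataclass

# Hypothetical cutoff, chosen only for this example.
MIN_MONTHLY_COST_USD = 100.0


@dataclass
class TableStats:
    name: str
    reads_last_90d: int       # assumed to be derived from query history
    monthly_cost_usd: float   # assumed cost attributed to the table


def unused_table_recommendations(tables, min_cost=MIN_MONTHLY_COST_USD):
    """Return unread tables whose cost is high enough to be worth surfacing."""
    flagged = [
        t for t in tables
        if t.reads_last_90d == 0 and t.monthly_cost_usd >= min_cost
    ]
    # Most expensive first, so the highest-impact items lead the list.
    return sorted(flagged, key=lambda t: t.monthly_cost_usd, reverse=True)


tables = [
    TableStats("analytics.old_backup", reads_last_90d=0, monthly_cost_usd=1000.0),
    TableStats("analytics.tmp_scratch", reads_last_90d=0, monthly_cost_usd=2.0),
    TableStats("analytics.orders", reads_last_90d=420, monthly_cost_usd=800.0),
]

for t in unused_table_recommendations(tables):
    print(f"{t.name}: ${t.monthly_cost_usd:,.0f}/month")
```

In this sample only `analytics.old_backup` is reported: the scratch table is unused but too cheap to surface, and the orders table is expensive but actively read.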
Types of Cost-Saving Recommendations
Foundational offers multiple cost-saving recommendations. The table below summarizes each suggestion, its ease of implementation, and its typical impact on your data warehouse bill:
| Recommendation | Description | Ease of Implementation | Typical Impact |
| --- | --- | --- | --- |
| Unused Tables | Delete unused tables to save on storage costs. | Very Easy | 1-2% |
| Unused Data Pipelines | Remove redundant pipelines to reduce compute costs. | Easy | 5-10% |
| Pipeline Scheduling Mismatch | Align pipeline run frequencies to usage needs. | Very Easy | 5-7% |
| Incremental Processing | Switch to processing only new data to cut costs. | Moderate | 5-10% |
Below, we dive into each recommendation in more detail.
1. Unused Tables
Tables that are no longer accessed can often be safely removed, helping you save on storage costs. This is especially relevant for large tables that are outdated or no longer needed, such as old backups.
Implementation of the recommendation: Very easy—simply delete the tables identified in the recommendation.
2. Unused Data Pipelines
Pipelines that generate output data with no downstream consumers can typically be eliminated, reducing the associated compute costs. This often happens when a pipeline’s purpose is outdated, but the code was never removed. Removing such pipelines often leads to meaningful savings across various use cases.
Implementation of the recommendation: Easy—remove the pipeline code that is no longer required.
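To show how a lineage graph can answer the "no downstream consumers" question, the sketch below walks downstream from each pipeline and checks whether anything it feeds is eventually consumed. The graph, the node names, the `pipeline.` prefix convention, and the idea of a fixed set of end consumers are hypothetical; this is one possible way to express the check, not a description of Foundational's algorithm.

```python
from collections import deque

# Hypothetical lineage: each node maps to the nodes that read from it.
LINEAGE = {
    "raw.orders": ["pipeline.daily_revenue", "pipeline.legacy_export"],
    "pipeline.daily_revenue": ["mart.revenue"],
    "mart.revenue": ["dashboard.finance_weekly"],
    "pipeline.legacy_export": ["mart.legacy_orders"],
    "mart.legacy_orders": [],
}

# Hypothetical end consumers (dashboards, data products, exports).
END_CONSUMERS = {"dashboard.finance_weekly"}


def reaches_consumer(start, lineage, consumers):
    """Breadth-first search downstream from `start` for any end consumer."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node in consumers:
            return True
        for downstream in lineage.get(node, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return False


def unused_pipelines(lineage, consumers):
    """Return pipelines whose output never reaches an end consumer."""
    pipelines = [n for n in lineage if n.startswith("pipeline.")]
    return [p for p in pipelines if not reaches_consumer(p, lineage, consumers)]


print(unused_pipelines(LINEAGE, END_CONSUMERS))
```

Here `pipeline.legacy_export` is the only pipeline reported, because its output table has no readers, while `pipeline.daily_revenue` ultimately feeds a dashboard.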
3. Pipeline Scheduling Mismatch (Pipelines Running Too Frequently)
A mismatch between how often a pipeline runs and how often its output is actually used leads to unnecessary processing. For example, a pipeline that calculates total revenue might run daily even though its output is only consumed by a weekly report. In that case, assuming the pipeline isn’t incremental, it runs 7 times as often as necessary.
Foundational identifies these scenarios and provides recommendations on adjusting pipeline schedules to reduce costs while maintaining data freshness and quality. We won’t alert on incremental pipelines, as they only process new data, regardless of their frequency.
Implementation of the recommendation: Very easy—adjust the schedule to match actual usage needs.
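The arithmetic behind the daily-versus-weekly example can be written out as a short Python sketch. The function name and its parameters are hypothetical, and it assumes a non-incremental pipeline in which every run costs roughly the same.

```python
def schedule_mismatch(runs_per_week, reads_per_week):
    """Return (run ratio, fraction of runs that could be dropped).

    Assumes a non-incremental pipeline where each run costs about the same,
    and that one run per read is enough to keep the output fresh.
    """
    if runs_per_week <= 0 or reads_per_week <= 0:
        raise ValueError("both frequencies must be positive")
    ratio = runs_per_week / reads_per_week
    # If the pipeline already runs no more often than it is read,
    # nothing is avoidable.
    avoidable = max(0.0, 1 - 1 / ratio)
    return ratio, avoidable


# Daily pipeline feeding a weekly report.
ratio, avoidable = schedule_mismatch(runs_per_week=7, reads_per_week=1)
print(f"{ratio:.0f}x the needed runs, {avoidable:.0%} of runs avoidable")
```

For the daily pipeline and weekly report, six of every seven runs (about 86%) produce output that nobody reads before it is overwritten.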
4. Switch to Incremental Processing
Rather than processing the entire dataset with every pipeline run, you can shift to an incremental approach that processes only new or updated data. For example, instead of recalculating total company revenue from scratch every day, an incremental pipeline would only compute revenue for new data. Many frameworks, such as dbt’s incremental models, support this approach.
Foundational identifies pipelines with significant costs that rewrite entire datasets repeatedly, and recommends switching them to incremental models.
Implementation of the recommendation: Moderate—converting a pipeline to an incremental model requires thoughtful work and logic adjustments.
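The difference between the two approaches can be illustrated with the revenue example. This is a plain-Python sketch rather than dbt or warehouse SQL, and the order tuples, the `state` dictionary, and the date-based watermark are assumptions made for the illustration.

```python
from datetime import date


def full_refresh_revenue(orders):
    """Recompute total revenue from every order on each run."""
    return sum(amount for _, amount in orders)


def incremental_revenue(orders, state):
    """Add only orders newer than the stored watermark to the running total."""
    new_orders = [(d, amount) for d, amount in orders if d > state["watermark"]]
    if new_orders:
        state["total"] += sum(amount for _, amount in new_orders)
        state["watermark"] = max(d for d, _ in new_orders)
    return state["total"]


orders = [(date(2024, 1, 1), 100.0), (date(2024, 1, 2), 250.0)]
state = {"watermark": date.min, "total": 0.0}

incremental_revenue(orders, state)           # first run processes both orders
orders.append((date(2024, 1, 3), 50.0))
incremental_revenue(orders, state)           # second run processes only Jan 3

# Both approaches agree on the result; only the amount of work differs.
assert state["total"] == full_refresh_revenue(orders)
```

The watermark logic is where the "thoughtful work" comes in: this sketch silently ignores late-arriving orders dated on or before the watermark, as well as updates to already-processed orders, so a real conversion has to decide how to handle both.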