
OpenLineage Support in Foundational

Updated over 2 weeks ago

Introduction

Foundational supports OpenLineage. You can ingest OpenLineage data into Foundational and export Foundational lineage to catalogs that support OpenLineage.

OpenLineage support is in Preview and available to selected customers. To request access, contact the Foundational Support team at support@foundational.io.


Why this framework matters in your data stack

OpenLineage is an open standard for collecting, analyzing, and sharing data lineage across tools and systems. It tracks data flow using runtime information that pipelines emit as they execute. Because it gives tools such as Apache Airflow, Spark, and dbt a common language for lineage, it improves interoperability between them and gives teams a unified view of data dependencies.

OpenLineage provides several benefits:

  • Standard capture and sharing of lineage.

  • Interoperability between tools in the ecosystem.

  • Reuse of the same emitted OpenLineage information across many tools, including Foundational.

  • Reduced vendor lock-in.

  • Forward compatibility as the data ecosystem evolves.


How Foundational integrates with OpenLineage

Foundational supports OpenLineage in two ways:

  • Runtime event ingestion from pipelines that already emit OpenLineage data.

  • Code-time lineage export in OpenLineage format, produced through static code analysis.

This connection with OpenLineage helps data teams maintain data quality, run accurate impact analysis, and manage changes across the data stack.
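
To make the runtime side concrete, here is a minimal sketch of the kind of run event a pipeline might emit. It uses only the Python standard library and the core fields of the OpenLineage run event (eventType, eventTime, run, job, inputs, outputs, producer, schemaURL). The job name, dataset names, producer URI, and the spec version in schemaURL are assumptions for illustration. This article does not describe how events are delivered to Foundational, so the sketch only builds and prints the payload.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs):
    """Return a minimal OpenLineage-style run event as a plain dict."""

    def dataset(namespace, name):
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # for example START, COMPLETE, or FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ns, n) for ns, n in inputs],
        "outputs": [dataset(ns, n) for ns, n in outputs],
        # Hypothetical producer URI and an assumed spec version.
        "producer": "https://example.com/my-pipeline",
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


# Hypothetical job and dataset names, used only for illustration.
event = build_run_event(
    "COMPLETE",
    "example_scheduler",
    "daily_orders_load",
    inputs=[("postgres://db.example.com:5432", "shop.public.orders")],
    outputs=[("snowflake://example_account", "ANALYTICS.PUBLIC.ORDERS")],
)
print(json.dumps(event, indent=2))
```

In practice, integrations such as the Airflow, Spark, or dbt OpenLineage integrations emit these events for you; building one by hand is mainly useful for understanding the payload.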


Differences between Foundational & OpenLineage

Runtime lineage shows data flow during pipeline execution. However, it does not cover all situations. Some pipelines run rarely, some run only under specific conditions, and ad hoc processes can run outside normal schedules. Lineage for these flows is missing until they actually execute, which leaves runtime lineage incomplete.

Code-time lineage closes these gaps. Foundational extracts lineage directly from code. This shows all possible data flows, not only those that run in production. Code-time lineage helps teams understand:

  • Pipelines that rarely run

  • Code paths that run only after specific triggers

OpenLineage added static lineage support in 2023. The Job object represents code locations, and facets such as SourceCodeLocationJobFacet store related metadata. This structure supports both runtime and code-based lineage.
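
As a rough sketch of that structure, the snippet below builds a static job event that carries a source code location facet. Unlike a run event, it has no run and no eventType, because nothing has executed. The field names follow the OpenLineage job event and SourceCodeLocationJobFacet as we understand them; the repository URL, file path, job name, producer URI, and the version numbers in the schema URLs are assumptions for illustration, and this is not necessarily the exact payload Foundational exports.

```python
import json
from datetime import datetime, timezone

# Hypothetical producer URI and assumed spec/facet versions.
PRODUCER = "https://example.com/static-analyzer"
EVENT_SCHEMA = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/JobEvent"
FACET_SCHEMA = "https://openlineage.io/spec/facets/1-0-1/SourceCodeLocationJobFacet.json"


def build_job_event(namespace, name, repo_url, path, file_url, inputs, outputs):
    """Return a static (design-time) job event with a source code location facet."""
    source_code_location = {
        "_producer": PRODUCER,
        "_schemaURL": FACET_SCHEMA,
        "type": "git",
        "url": file_url,
        "repoUrl": repo_url,
        "path": path,
    }
    return {
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": EVENT_SCHEMA,
        "job": {
            "namespace": namespace,
            "name": name,
            "facets": {"sourceCodeLocation": source_code_location},
        },
        "inputs": [{"namespace": ns, "name": n} for ns, n in inputs],
        "outputs": [{"namespace": ns, "name": n} for ns, n in outputs],
    }


# Hypothetical repository, path, and dataset names.
job_event = build_job_event(
    "example_repo",
    "models.yearly_audit",
    repo_url="https://git.example.com/acme/analytics",
    path="models/yearly_audit.sql",
    file_url="https://git.example.com/acme/analytics/models/yearly_audit.sql",
    inputs=[("snowflake://example_account", "STAGING.PUBLIC.ORDERS")],
    outputs=[("snowflake://example_account", "ANALYTICS.PUBLIC.YEARLY_AUDIT")],
)
print(json.dumps(job_event, indent=2))
```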

Foundational uses this structure to combine runtime and code-time lineage in OpenLineage format. Runtime lineage shows what ran. Code-time lineage shows what could run. Together, they provide full coverage of active and dormant dependencies and reduce blind spots across the data stack.
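
The difference between "what ran" and "what could run" can be illustrated with plain set operations over lineage edges. This is a conceptual sketch, not Foundational's implementation: the edge sets, dataset names, and the classify_edges helper are all hypothetical.

```python
# Each edge is (upstream dataset, downstream dataset). All names are hypothetical.
runtime_edges = {
    ("raw.orders", "staging.orders"),
    ("staging.orders", "analytics.daily_revenue"),
}

# Code-time lineage includes a yearly job that has not run recently.
code_time_edges = runtime_edges | {
    ("staging.orders", "analytics.yearly_audit"),
}


def classify_edges(code_time, runtime):
    """Split lineage edges by whether they appear in code, at runtime, or both."""
    return {
        "active": code_time & runtime,        # defined in code and observed running
        "dormant": code_time - runtime,       # defined in code, not yet observed
        "runtime_only": runtime - code_time,  # observed, but not found in analyzed code
    }


coverage = classify_edges(code_time_edges, runtime_edges)
for kind, edges in coverage.items():
    print(kind, sorted(edges))
```

Here the yearly audit dependency appears only in the dormant group, which is the kind of gap that runtime lineage alone would miss.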

For more on Foundational’s collaboration with OpenLineage, see the OpenLineage article "Expanding the Horizon of OpenLineage: Extracting Lineage from Code with Foundational."


Advantages of Foundational's approach

Foundational adds early visibility to the OpenLineage model. It extracts lineage directly from source code and shows the effect of each change during development or inside pending pull requests. This helps teams catch downstream breakages and protect data quality before pipelines run.

Foundational also uses code as the source of truth. This removes the delays and inconsistencies that occur when tools rely only on deployed systems. Teams can run accurate impact analysis as soon as changes appear in the codebase.
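
Impact analysis of this kind amounts to walking the lineage graph downstream from whatever changed. The sketch below shows the idea with a breadth-first traversal over a small edge list. It illustrates the general technique under simple assumptions and is not a description of Foundational's internals; the downstream function and all dataset names are hypothetical.

```python
from collections import defaultdict, deque


def downstream(edges, changed):
    """Return every dataset reachable downstream of `changed`."""
    children = defaultdict(set)
    for src, dst in edges:
        children[src].add(dst)

    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    # Exclude the starting dataset itself, even if a cycle leads back to it.
    return seen - {changed}


# Hypothetical lineage edges: (upstream, downstream).
edges = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "analytics.daily_revenue"),
    ("staging.orders", "analytics.yearly_audit"),
    ("analytics.daily_revenue", "dashboards.revenue"),
]

impacted = downstream(edges, "staging.orders")
print(sorted(impacted))
```

Running the same traversal over code-time edges from a pending change, rather than over edges from the last production run, is what lets the analysis include dependencies that have not executed yet.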

The OpenLineage model strengthens this approach. It combines design-time detail with execution-time information, which gives Foundational a reliable way to present a complete view of lineage across code, runtime, and all dependencies.


Useful links
