Introduction
Foundational supports OpenLineage. You can ingest OpenLineage data into Foundational and export Foundational lineage to catalogs that support OpenLineage.
OpenLineage support is in Preview and available to selected customers. To request access, contact the Foundational Support team at support@foundational.io.
Why this framework matters in your data stack
OpenLineage defines a standard way to collect, analyze, and share data lineage across tools and systems. It tracks data flow using runtime information emitted from pipelines. It creates a common language for lineage, which improves interoperability between tools such as Apache Airflow, Spark, and dbt. This gives teams a unified view of data dependencies.
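As a rough illustration of the runtime information a pipeline emits, the sketch below builds a minimal OpenLineage-style run event as a plain Python dictionary. The top-level field names follow the OpenLineage run event structure. The producer URI, namespaces, job and dataset names, and the spec version in the schema URL are assumptions made for this example, not values taken from Foundational.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical producer URI; replace with one that identifies your pipeline.
PRODUCER = "https://example.com/my-pipeline"
# The spec version in this URL is an assumption; use the one your tooling targets.
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"


def build_run_event(event_type, job_name, inputs, outputs, run_id=None):
    """Return a minimal OpenLineage-style run event as a dict."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
        # A run is identified by a UUID; one is generated if none is given.
        "run": {"runId": run_id or str(uuid.uuid4())},
        # Namespaces below are placeholders for illustration only.
        "job": {"namespace": "example_namespace", "name": job_name},
        "inputs": [{"namespace": "example_db", "name": n} for n in inputs],
        "outputs": [{"namespace": "example_db", "name": n} for n in outputs],
    }


event = build_run_event(
    "COMPLETE", "daily_orders", ["raw.orders"], ["analytics.orders_daily"]
)
print(json.dumps(event, indent=2))
```

Because every tool reads the same job, input, and output fields, an event like this can be consumed by any OpenLineage-compatible system without tool-specific translation.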
OpenLineage provides several benefits:
Standard capture and sharing of lineage.
Interoperability between tools in the ecosystem.
Use of emitted OpenLineage information across many tools, including Foundational.
Fewer silos between platforms.
Reduced vendor lock-in.
Forward compatibility as the data ecosystem evolves.
How Foundational analyzes this framework
Foundational supports open standards to help customers reduce vendor lock-in. Its OpenLineage support follows the same principle as its support for open data contracts.
Foundational provides two capabilities:
Ingest runtime OpenLineage information from pipelines that already emit it.
Export Foundational code-time lineage, produced through static analysis of pipeline code, in OpenLineage format.
This connection with OpenLineage helps data teams maintain data quality, run accurate impact analysis, and manage changes across the data stack.
Foundational’s process to extract schema and lineage
Runtime lineage shows data flow during pipeline execution. However, it does not cover all situations. Some pipelines run rarely, some run only under specific conditions, and ad hoc processes can run outside normal schedules. These gaps make runtime lineage incomplete.
Code-time lineage closes these gaps. Foundational extracts lineage directly from code. This shows all possible data flows, not only those that run in production. Code-time lineage helps teams understand:
Pipelines that rarely run
Ad hoc processes
Code paths that run only after specific triggers
Data flows that can occur if the code executes
OpenLineage added static lineage support in 2023. In this model, a Job describes a process independently of any run, and job facets such as SourceCodeLocationJobFacet record related metadata, such as where the job's code lives. This lets one structure carry both runtime and code-based lineage.
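The sketch below shows what a static, code-derived job description could look like: a job event with no run attached and a source code location facet on the job. It is a hand-built dictionary under stated assumptions, not Foundational's export format. The producer URI, repository URL, file path, namespaces, and both schema versions are hypothetical placeholders.

```python
import json
from datetime import datetime, timezone

# Hypothetical producer URI for a static-analysis tool.
PRODUCER = "https://example.com/static-analyzer"
# Spec and facet versions in these URLs are assumptions for illustration.
SPEC_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json"
FACET_URL = "https://openlineage.io/spec/facets/1-0-0/SourceCodeLocationJobFacet.json"


def build_job_event(job_name, repo_url, path, inputs, outputs):
    """Return a static job event: lineage known from code, with no run."""
    return {
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": SPEC_URL + "#/$defs/JobEvent",
        "job": {
            "namespace": "example_namespace",  # placeholder namespace
            "name": job_name,
            "facets": {
                # Records where the code that defines this job lives.
                "sourceCodeLocation": {
                    "_producer": PRODUCER,
                    "_schemaURL": FACET_URL,
                    "type": "git",
                    "url": f"{repo_url}/blob/main/{path}",  # hypothetical layout
                    "repoUrl": repo_url,
                    "path": path,
                }
            },
        },
        "inputs": [{"namespace": "example_db", "name": n} for n in inputs],
        "outputs": [{"namespace": "example_db", "name": n} for n in outputs],
    }


job_event = build_job_event(
    "yearly_backfill",
    "https://example.com/acme/pipelines",
    "jobs/yearly_backfill.sql",
    ["raw.orders_archive"],
    ["analytics.orders_daily"],
)
print(json.dumps(job_event, indent=2))
```

Unlike a run event, this structure carries no run identifier or event type, which is what lets it describe a job that has not executed.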
Foundational combines code-time and runtime lineage in OpenLineage format. Runtime lineage shows what ran. Code-time lineage shows what could run. Together, these views provide full coverage of active and dormant dependencies and reduce blind spots.
This combined approach helps data teams manage changes, troubleshoot issues, and understand the full design and execution context of their pipelines.
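To make the distinction between what ran and what could run concrete, here is a minimal sketch that merges edges from code-time and runtime events and labels each one as active or dormant. It assumes events shaped like simplified OpenLineage dictionaries. The function names, the sample events, and the active/dormant classification rule are illustrative assumptions, not a description of Foundational's implementation.

```python
def edges(events):
    """Collect (input, job, output) edges from OpenLineage-style event dicts."""
    found = set()
    for ev in events:
        job = ev["job"]["name"]
        for src in ev.get("inputs", []):
            for dst in ev.get("outputs", []):
                found.add((src["name"], job, dst["name"]))
    return found


def combine(code_time_events, runtime_events):
    """Split lineage edges into those observed at runtime and code-only ones."""
    static_edges = edges(code_time_events)
    run_edges = edges(runtime_events)
    return {
        "active": run_edges,                  # observed during execution
        "dormant": static_edges - run_edges,  # present in code, not yet observed
        "all": static_edges | run_edges,
    }


# Hypothetical sample data: two jobs exist in code, only one has run.
code_time = [
    {"job": {"name": "daily_orders"},
     "inputs": [{"name": "raw.orders"}],
     "outputs": [{"name": "analytics.orders_daily"}]},
    {"job": {"name": "yearly_backfill"},
     "inputs": [{"name": "raw.orders_archive"}],
     "outputs": [{"name": "analytics.orders_daily"}]},
]
runtime = [code_time[0]]

combined = combine(code_time, runtime)
for label in ("active", "dormant"):
    print(label, sorted(combined[label]))
```

In this sample, an impact analysis on analytics.orders_daily that used runtime events alone would miss the yearly_backfill dependency, which appears only in the dormant set.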
Advantages of Foundational's approach
Foundational combines runtime and code-time lineage in one model. This gives a complete view of data flow across pipelines. It covers active and dormant dependencies and reduces blind spots. This approach improves data quality and supports accurate impact analysis.
The OpenLineage model strengthens this view. It blends execution context with design-time insights and helps data teams manage changes in the data stack with confidence.

