Skip to main content

How to upload manual lineage (xlsx)

Introduction

Manual lineage lets you add lineage that Foundational can't discover automatically. Foundational automatically builds lineage from your connected sources (dbt, Airflow, Snowflake, BigQuery, and 30+ others).

When some lineage lives outside those sources (a hand-built pipeline, a legacy system, or a vendor export), you can add it by uploading a spreadsheet (XLSX workbook) that describes your entities and the edges between them.

Foundational ingests the file through a dedicated Manual Upload connector and merges your entities and edges into the lineage graph. It supports tables, columns, dashboards, files, and warehouse tasks.

A blank template workbook (all sheets with headers) and a filled sample workbook (orders/customers/daily_revenue, with table-level and column-level lineage) are available. Ask your Foundational contact, or start from the worked example below.


How it works

Describe your entities: Each entity type (tables, columns, dashboards, files, warehouse tasks) has its own sheet in the workbook. Each row is one entity.

Assign ref_ids: Give each entity you need to reference elsewhere a short, unique identifier. Columns link to their parent table via parent_ref_id; edges link any two entities via source_ref_id and dest_ref_id.

Define edges: The edges sheet connects entities using those ref_id values, one row per lineage relationship.

Upload via API: POST the workbook to the Foundational API. Foundational creates (or updates) a Manual Upload connector, runs a scan, and adds your lineage to the graph.


Prepare the XLSX file

The workbook contains one sheet per entity type plus an edges sheet. The first row of every sheet is the header row. Each following row is one entity or edge.

Foundational ignores blank rows. Column order does not matter; only header names are used.

To get started, download foundational_manual_lineage_template.xlsx at the end of this article.

The ref_id system

ref_id is how rows reference each other:

  • Set a ref_id on any entity you need to reference elsewhere: every table that has columns, and every entity that appears in an edge. Put a short, human-friendly value in it (e.g. orders, orders.order_id, dash_revenue).

  • ref_ids must be unique across the whole workbook.

  • Columns link to their table via parent_ref_id. Edges link entities via source_ref_id and dest_ref_id. These values must exactly match a ref_id you defined.

  • If you omit ref_id, the entity is still imported, but you won't be able to reference it from a column or an edge.


Tables sheet

One row per table.

Column

Required

Notes

db_platform

Yes

The table's platform, e.g. SNOWFLAKE, BIGQUERY, REDSHIFT. Matched case-insensitively.
See the section Allowed values.

db_name

Yes

Database / catalog name

db_schema

Yes

Schema name

name

Yes

Table name

ref_id

No

Identifier to reference this table from columns / edges

account

No

Account / host identifier

description

No

Free text

tags

No

Comma-separated, e.g. pii,gold

owner

No

Owner name or email

custom_attributes

No

Comma-separated

Example

db_platform

db_name

db_schema

name

ref_id

SNOWFLAKE

analytics

public

orders

orders

SNOWFLAKE

analytics

public

customers

customers


Columns sheet

One row per column. Each column attaches to its parent table via parent_ref_id.

Column

Required

Notes

parent_ref_id

Yes

Must equal the ref_id of a table (or other parent entity) defined in the workbook

name

Yes

Column name

ref_id

No

Identifier to reference this column from edges (needed for column-level lineage and any edge referencing this column)

data_type

No

e.g. integer, varchar

description, tags, owner, custom_attributes

No

Same meaning as on tables

Example

parent_ref_id

name

data_type

ref_id

orders

order_id

integer

orders.order_id

customers

customer_id

integer

customers.customer_id


Dashboards sheet

One row per dashboard.

Column

Required

Notes

platform

Yes

BI platform, e.g. TABLEAU, LOOKER, POWER_BI. See the section Allowed values.

project

Yes

Project / workspace name

name

Yes

Dashboard name

ref_id

No

Identifier for use in edges

description, tags, owner, custom_attributes

No

Optional metadata


Files sheet

One row per file. Foundational builds the file URI as platform://bucket/dir/name.

Column

Required

Notes

platform

Yes

Storage platform, e.g. S3, GCS, AZURE_BLOB_STORAGE.

See the section Allowed values.

bucket

Yes

Bucket name

name

Yes

File / object name

dir

No

Folder path within the bucket

ref_id

No

Identifier for use in edges

description, tags, owner, custom_attributes

No

Optional metadata

Example (produces s3://my-bucket/raw/orders.parquet)

platform

bucket

dir

name

ref_id

s3

my-bucket

raw

orders.parquet

raw_orders_file


Warehouse tasks sheet

One row per warehouse job or task (e.g., a scheduled Snowflake task).

Column

Required

Notes

platform

Yes

The task's platform, e.g. SNOWFLAKE, BIGQUERY.

See the section Allowed values.

host

Yes

Warehouse host / account identifier

name

Yes

Task name

ref_id

No

Identifier for use in edges

description, tags, owner, custom_attributes

No

Optional metadata


Edges sheet

One row per lineage relationship. Connect two entities by their ref_id values.

Column

Required

Notes

source_ref_id

Yes

ref_id of the upstream entity (data flows from here)

dest_ref_id

Yes

ref_id of the downstream entity (data flows to here)

edge_type

No

LINEAGE (default) or DEPENDENCY

transformation

No

Free-text expression describing the transformation. Use a single * between two tables to draw an all-columns ("star") edge.

except_ref_ids

No

Comma-separated ref_ids to exclude from a star edge

include_ref_ids

No

Comma-separated ref_ids to explicitly include

description

No

Free text

Example (table-level star edge and a column-level edge)

source_ref_id

dest_ref_id

edge_type

transformation

orders

customers

LINEAGE

*

orders.order_id

customers.customer_id

LINEAGE


Allowed values

Foundational matches platform values case-insensitively. Foundational rejects any value not on this list with an "Unsupported platform value" error.

Table and warehouse task platforms
db_platform on tables
platform on warehouse tasks

SNOWFLAKE, BIGQUERY, POSTGRES, REDSHIFT, MYSQL, SQL_SERVER, MSSQL), ORACLE, SYNAPSE, TERADATA, NETEZZA, DB2, ATHENA, PRESTO, HIVE, DATABRICKS, SAP_HANA, SAP_ECC, MONGO_DB, COUCHBASE, ICEBERG

Dashboard platforms
(platform on dashboards)

TABLEAU, LOOKER, POWER_BI, SIGMA, MODE_ANALYTICS, LOOKER_STUDIO, SISENSE, SPOTFIRE, THOUGHTSPOT, HEX, DATABRICKS, TDV

File storage platforms
(platform on files

S3, GCS, AZURE_BLOB_STORAGE, AZURE_DATA_LAKE, HDFS, LOCAL, MAINFRAME

Edge types

LINEAGE (default), DEPENDENCY

List fields
(tags, custom_attributes, except_ref_ids, include_ref_ids)

comma-separated values


Worked example

The example workbook below models orders to customers lineage at both table level and column levels.

To see the worked example, download: foundational_manual_lineage_example.xlsx at the end of this article.

Workbook contents

tables sheet

db_platform

db_name

db_schema

name

ref_id

SNOWFLAKE

analytics

public

orders

orders

SNOWFLAKE

analytics

public

customers

customers

columns sheet

parent_ref_id

name

data_type

ref_id

orders

order_id

integer

orders.order_id

customers

customer_id

integer

customers.customer_id

edges sheet

source_ref_id

dest_ref_id

edge_type

transformation

orders

customers

LINEAGE

*

orders.order_id

customers.customer_id

LINEAGE


Upload the file

Manual lineage uploads go through the Foundational public API at https://api.foundational.io, where you can find the Foundational API specifications.

Authentication: All endpoints require a personal API token with the Admin role. Pass the token as Authorization: Bearer <token>. To generate a token, see the article Create API tokens.

If you prefer not to run the upload yourself, your Foundational representative can do it for you.

Create a new manual-upload connector

POST /api/v1/manual-lineage/snapshot (multipart/form-data)

Field

Required

Description

name

Yes

Display name for the new connector

file

Yes

The .xlsx file. Only .xlsx is accepted.

The response contains the connector's connectorId and a snapshotUploadId. Foundational starts a scan to process the file. Save the connectorId. You will need it to upload future versions.

Upload a new version to an existing connector

POST /api/v1/manual-lineage/connectors/{connectorId}/snapshot (multipart/form-data)

Set {connectorId} to the connectorId from a previous response.

Field

Required

Description

file

Yes

The .xlsx file. Only .xlsx is accepted.

The response contains the connectorId and a snapshotUploadId. Foundational starts a scan to process the file.


After uploading

Foundational parses the workbook and merges your entities and edges into the lineage graph. Once the scan completes, the new lineage appears in the Lineage and Catalog views, attributed to your Manual Upload connector.

It generally takes about 15 minutes for the lineage to load, but on occasions it can take up to two hours.


Validation and troubleshooting

If Foundational rejects the file, the error names the sheet, row, and column. Common issues:

Error

Fix

Only .xlsx files are supported

Save the workbook as .xlsx (not .xls or .csv)

Missing required columns: …

Add the listed header(s) to that sheet's header row

Unsupported columns: …

Remove or rename headers not listed in the reference above

Unknown parent_ref_id: X / Unknown source_ref_id: X

The referenced ref_id is not defined or does not match exactly. Define it on the target entity row.

Duplicate ref_id: X

ref_ids must be unique across the workbook

Unsupported db_platform / dashboard platform / edge_type value

Use one of the allowed values listed above

Uploaded file is empty

The file has no content. Re-export it.

TIP: It’s best to start from the smallest possible workbook. Use a couple of tables and one edge, confirm they import correctly, and then expand from there.


Useful links

Did this answer your question?