Connecting to dbt Core (artifacts from path)

Supported versions

Dataedo’s dbt connector works with both dbt Core and dbt Cloud, for dbt version 1.0 and later. The connector is platform-agnostic, so you can use it with any warehouse supported by dbt, such as Snowflake, BigQuery, or Redshift. For lineage support, make sure the warehouse is connected to Dataedo as well.

Prepare for import

Dataedo needs two dbt artifacts to import metadata:

  • manifest.json - produced by any dbt command except dbt deps, dbt clean, dbt debug, and dbt init,
  • catalog.json - produced by the dbt docs generate command.

See below for details on how to generate or retrieve these files.

Generate the dbt docs artifacts

After your regular dbt run, execute dbt docs generate in the terminal inside your project directory. This command creates the manifest.json and catalog.json files in the target folder (unless you changed the default path). These two files are required for Dataedo to import metadata from dbt.
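Before pointing Dataedo at the files, it can help to confirm that both artifacts exist and to see which dbt version wrote them. The following is a minimal sketch, not part of Dataedo or dbt: the helper name `check_artifacts` is hypothetical, and it assumes the default `target` folder, the default file names, and the `metadata.dbt_version` field that dbt writes into manifest.json.

```python
import json
from pathlib import Path


def check_artifacts(target_dir="target"):
    """Return (dbt_version, missing_files) for a dbt target folder.

    Hypothetical helper: assumes the default artifact file names and the
    metadata.dbt_version field in manifest.json.
    """
    target = Path(target_dir)
    required = ["manifest.json", "catalog.json"]
    # Collect the names of required artifacts that are not on disk.
    missing = [name for name in required if not (target / name).is_file()]

    dbt_version = None
    manifest_path = target / "manifest.json"
    if manifest_path.is_file():
        with manifest_path.open(encoding="utf-8") as fh:
            manifest = json.load(fh)
        dbt_version = manifest.get("metadata", {}).get("dbt_version")

    return dbt_version, missing


if __name__ == "__main__":
    version, missing = check_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("Both artifacts found, written by dbt", version)
```

If you changed the default target path in your project, pass that folder as `target_dir` instead.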

Tip

You don’t have to run dbt docs generate immediately after dbt run. You can execute it at any time as long as all models are already materialized. Otherwise, the generated documentation may be incomplete.
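One way to spot incomplete documentation is to compare the two artifacts: a model that is listed in manifest.json but absent from catalog.json was probably not materialized when dbt docs generate ran. The sketch below illustrates that check under stated assumptions. The function name `models_missing_from_catalog` is hypothetical, and the code assumes that both files key their `nodes` by the same unique ID and that ephemeral models never appear in the catalog.

```python
def models_missing_from_catalog(manifest, catalog):
    """List model IDs found in the manifest but absent from the catalog.

    Both arguments are the parsed JSON dictionaries of the two artifacts.
    """
    catalog_nodes = catalog.get("nodes", {})
    missing = []
    for unique_id, node in manifest.get("nodes", {}).items():
        # Only models are checked; seeds, tests, and snapshots are skipped.
        if node.get("resource_type") != "model":
            continue
        # Assumption: ephemeral models are never built in the warehouse.
        if node.get("config", {}).get("materialized") == "ephemeral":
            continue
        if unique_id not in catalog_nodes:
            missing.append(unique_id)
    return sorted(missing)
```

Load both files with `json.load` and pass the resulting dictionaries to the function. A non-empty result suggests running the missing models first and then repeating dbt docs generate.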

Filling in the connection form

Choose Artifacts from path (dbt Core) as Import type and fill in the following fields:

  • manifest.json path - the path to the manifest.json file. Select it with the Browse button, or type or paste the path. See the section above for how to generate the file.
  • catalog.json path - the path to the catalog.json file. Select it with the Browse button, or type or paste the path. See the section above for how to generate the file.

After the first import - activate automatic lineage

Lineage is empty after the first import

The initial import brings in objects and descriptions, but lineage stays empty until you set up the linked sources and run Import changes. Follow the steps below to create automatic lineage.

What is a linked source?

A linked source identifies which documentation set inside Dataedo represents a particular external system. Mapping it removes the ambiguity that appears when identical object names exist in several documentations, so Dataedo knows exactly where to look to resolve cross-database lineage.

More background

Need more context? See the Linked sources overview.

In dbt, Dataedo creates three kinds of linked sources:

Linked source type | When it is created | Name in Dataedo
Project | Always. Represents the whole dbt project. | Always Project warehouse
Seed | Only if the project contains seeds. | Always Seeds
Source | One per source defined in your dbt project. | Same as the source name in dbt.
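To preview which linked sources to expect for a given project, the rules in the table can be applied to manifest.json. This is an illustrative sketch only, not Dataedo's implementation: the function name `expected_linked_sources` is hypothetical, and the code assumes the manifest's `nodes` entries carry a `resource_type` field and its `sources` entries carry a `source_name` field.

```python
def expected_linked_sources(manifest):
    """Return the linked source names the table above predicts.

    `manifest` is the parsed manifest.json dictionary.
    """
    # The Project linked source is always created.
    names = ["Project warehouse"]

    # The Seed linked source is created only if the project contains seeds.
    has_seeds = any(
        node.get("resource_type") == "seed"
        for node in manifest.get("nodes", {}).values()
    )
    if has_seeds:
        names.append("Seeds")

    # One linked source per dbt source, named after the source.
    source_names = {
        source["source_name"]
        for source in manifest.get("sources", {}).values()
    }
    names.extend(sorted(source_names))
    return names
```

A source that defines several tables still yields a single entry, because the tables share one source name.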

Map each linked source to the right database

Linked source type | Map it to documentation that represents…
Project | The database where dbt models (and seeds) are materialized.
Seed | The database that holds the seed .csv files, not the database where the seeds are materialized.
Source | The upstream database referenced by that specific source in dbt.
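When deciding which documentation to map a Source linked source to, it may help to list the database and schema that each dbt source points at. The sketch below reads them from manifest.json. It is an assumption-laden illustration: the function name `source_locations` and the placeholder strings are hypothetical, and the code assumes each entry under `sources` has `source_name`, `database`, and `schema` fields.

```python
from collections import defaultdict


def source_locations(manifest):
    """Map each dbt source name to the database.schema pairs it references.

    `manifest` is the parsed manifest.json dictionary.
    """
    locations = defaultdict(set)
    for source in manifest.get("sources", {}).values():
        # Placeholders cover entries where a field is empty or absent.
        database = source.get("database") or "<default database>"
        schema = source.get("schema") or "<default schema>"
        locations[source["source_name"]].add(f"{database}.{schema}")
    return {name: sorted(found) for name, found in sorted(locations.items())}
```

Each key in the result corresponds to one Source linked source, and the listed locations indicate which Dataedo documentation is likely the right match.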

For step-by-step instructions on assigning a linked source to a documentation set, see the overview.

How many documentations to assign

A linked source can point to several documentations, for instance if you imported the same database several times, one schema at a time. In most scenarios, though, you will connect each linked source to a single documentation.

Run Import changes

After you assign the linked sources, run Import changes. Dataedo then creates the lineage automatically.

Keep linked sources up to date

Remember to update the linked sources whenever you add or change sources in your dbt project. Automatic data lineage only works if the linked sources are up to date.
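A simple way to notice that linked sources need attention is to compare the source names in the previously imported manifest.json with those in the current one. The following sketch assumes you keep a copy of the earlier manifest. The function names `source_names` and `source_changes` are hypothetical, and the `source_name` field under `sources` is assumed to be present in both files.

```python
def source_names(manifest):
    """Return the set of dbt source names defined in a parsed manifest."""
    return {
        source["source_name"]
        for source in manifest.get("sources", {}).values()
    }


def source_changes(old_manifest, new_manifest):
    """Return (added, removed) source names between two manifests."""
    old = source_names(old_manifest)
    new = source_names(new_manifest)
    return sorted(new - old), sorted(old - new)
```

Any name in the added list corresponds to a linked source that still has to be mapped before the next Import changes run can build lineage for it.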
