Connecting to Databricks Unity Catalog
Starting from Dataedo 25.3, connecting to Databricks requires the name of a SQL warehouse that Dataedo uses to execute SQL queries through the Databricks API. The compute resources of this warehouse are used by the Data Profiling and Data Quality modules and to retrieve data lineage faster using system tables. To import column lineage, the following privileges are now required: USE SCHEMA on the system.access schema and SELECT on the system.access.column_lineage table.
How to connect
Pre-requisites:
To scan Databricks Unity Catalog, Dataedo connects to the Databricks workspace API and authenticates with a personal access token (PAT).
You need to have a Databricks workspace that is Unity Catalog enabled and attached to the metastore you want to scan.
Dataedo will need the following information to connect to the Databricks instance:
NOTE: You can find all of these in your Databricks workspace.
- Workspace URL - Once you've opened your Databricks workspace, copy the URL from the address bar of your web browser and paste it into Dataedo.
- Token - A personal access token generated in the profile settings of your Databricks workspace. Check how to generate PAT. The user for whom the token is generated needs the following privileges:
  - For each table/view to be imported: SELECT on the object, USE CATALOG on the object’s catalog, and USE SCHEMA on the object’s schema.
  - To import column-level lineage: USE SCHEMA on the system.access schema and SELECT on the system.access.column_lineage table.
- Catalog - Name of the documented catalog. It can either be typed manually or chosen later from the list.
- Warehouse - Name of the SQL warehouse that will be used to execute SQL queries. It can either be typed manually or chosen later from the list.
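The privileges listed for the token's user can be granted with Unity Catalog GRANT statements. The sketch below only builds the statement text; it is an illustration, not something Dataedo runs for you. The principal name dataedo-scanner and the sales.crm.* table names are hypothetical, the helper assumes identifiers contain no dots, and it relies on SELECT ON TABLE also being accepted for views. Review the output with a metastore admin before running it in a SQL editor.

```python
from typing import Iterable, List


def quote(identifier: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def grant_statements(principal: str, tables: Iterable[str],
                     column_lineage: bool = True) -> List[str]:
    """Build GRANT statements for fully qualified catalog.schema.table names.

    Assumption: each name has exactly three dot-separated parts.
    """
    to = f"TO {quote(principal)}"
    statements: List[str] = []
    for full_name in tables:
        catalog, schema, table = full_name.split(".")
        statements.append(f"GRANT USE CATALOG ON CATALOG {quote(catalog)} {to}")
        statements.append(
            f"GRANT USE SCHEMA ON SCHEMA {quote(catalog)}.{quote(schema)} {to}")
        statements.append(
            "GRANT SELECT ON TABLE "
            f"{quote(catalog)}.{quote(schema)}.{quote(table)} {to}")
    if column_lineage:
        # Privileges this guide lists for importing column-level lineage.
        statements.append(f"GRANT USE SCHEMA ON SCHEMA system.access {to}")
        statements.append(
            f"GRANT SELECT ON TABLE system.access.column_lineage {to}")
    # Drop repeated catalog/schema grants while keeping the original order.
    return list(dict.fromkeys(statements))


if __name__ == "__main__":
    # Hypothetical principal and tables, for illustration only.
    for stmt in grant_statements("dataedo-scanner",
                                 ["sales.crm.customers", "sales.crm.orders"]):
        print(stmt + ";")
```

Passing column_lineage=False leaves out the two system.access grants when column-level lineage is not going to be imported.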
Create Connection
- From the list of connections, select Databricks.
- Enter the connection details. If you don't remember the Catalog or Warehouse name, you can select them from the list using the [...] buttons. Just remember to provide the Workspace URL and Token beforehand.
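If you want to confirm the Workspace URL and Token outside Dataedo, or look up the warehouse and catalog names yourself, the Databricks REST API can list both. The following is a minimal sketch using only the Python standard library; it is not necessarily how Dataedo fills the [...] lists. The helper names are hypothetical, the example host is a placeholder, and pagination of long catalog lists is ignored.

```python
import json
import urllib.request
from urllib.parse import urlsplit


def build_request(workspace_url: str, token: str, path: str) -> urllib.request.Request:
    """Build an authenticated GET request against the workspace API.

    Only the scheme and host of workspace_url are kept, so a URL copied
    from the browser address bar (with a path or query string) still works.
    """
    parts = urlsplit(workspace_url)
    url = f"{parts.scheme}://{parts.netloc}{path}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def names(payload: dict, key: str) -> list:
    """Extract the sorted 'name' values from a list response."""
    return sorted(item["name"] for item in payload.get(key, []))


def list_names(workspace_url: str, token: str, path: str, key: str) -> list:
    """Call the API and return the names found under the given key."""
    request = build_request(workspace_url, token, path)
    with urllib.request.urlopen(request, timeout=30) as response:
        return names(json.load(response), key)


if __name__ == "__main__":
    # Placeholder values; replace them with your own workspace URL and token.
    url, token = "https://example.cloud.databricks.com", "<personal-access-token>"
    print(list_names(url, token, "/api/2.0/sql/warehouses", "warehouses"))
    print(list_names(url, token, "/api/2.1/unity-catalog/catalogs", "catalogs"))
```

An HTTP 401 or 403 error from either call usually points to an invalid token or missing permissions rather than a wrong warehouse or catalog name.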

Importing metadata
- When the connection is successful, Dataedo reads the objects and shows a list of those found. You can choose which objects to import.
- You can also use the advanced filter to narrow down the list of objects.
Outcome
- Metadata for your Unity Catalog has been imported to new documentation in the repository.

- Lineage between the imported Unity Catalog objects has been documented. For more details, check here.
Watchouts
- For all the objects that you want to bring into Dataedo, the user needs to have at least the SELECT privilege on the tables/views, USE CATALOG on the object’s catalog, and USE SCHEMA on the object’s schema.
- To import column-level lineage, Dataedo uses a table from the system schema, and the user for whom the access token was generated needs the following privileges: USE SCHEMA on the system.access schema and SELECT on the system.access.column_lineage table.
- To scan all the objects in a Unity Catalog metastore, use a user with the metastore admin role. Learn more from Manage privileges in Unity Catalog and Unity Catalog privileges and securable objects.
- For classification, the user also needs the SELECT privilege on the tables/views to retrieve sample data.
- If your Azure Databricks workspace doesn’t allow access from the public network, or if your Microsoft Purview account doesn’t enable access from all networks, you can use the Managed Virtual Network Integration Runtime for the scan. You can set up a managed private endpoint for Azure Databricks as needed to establish private connectivity.
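One way to check the column-lineage privileges before an import is to run a small probe query through the Databricks SQL Statement Execution API with the same token. This is a sketch under assumptions, not Dataedo's own check: the probe query and helper names are hypothetical, and the API expects the warehouse ID rather than the warehouse name entered in Dataedo.

```python
import json
import urllib.request
from urllib.parse import urlsplit

# Hypothetical probe: reads at most one row from the lineage system table.
PROBE_SQL = "SELECT 1 FROM system.access.column_lineage LIMIT 1"


def build_probe_request(workspace_url: str, token: str,
                        warehouse_id: str) -> urllib.request.Request:
    """Build a POST request that runs PROBE_SQL on the given SQL warehouse."""
    parts = urlsplit(workspace_url)
    body = {
        "warehouse_id": warehouse_id,  # the warehouse ID, not its display name
        "statement": PROBE_SQL,
        "wait_timeout": "30s",         # wait up to 30 seconds for a result
    }
    return urllib.request.Request(
        f"{parts.scheme}://{parts.netloc}/api/2.0/sql/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def probe_succeeded(response: dict) -> bool:
    """Return True only if the statement finished in the SUCCEEDED state.

    A statement that is still pending or running after the wait timeout
    is reported as False here, even though it has not failed.
    """
    return response.get("status", {}).get("state") == "SUCCEEDED"


def run_probe(workspace_url: str, token: str, warehouse_id: str) -> bool:
    """Send the probe and report whether the lineage table was readable."""
    request = build_probe_request(workspace_url, token, warehouse_id)
    with urllib.request.urlopen(request, timeout=60) as response:
        return probe_succeeded(json.load(response))
```

If run_probe returns False, inspect the status and error fields of the raw response; a permission error there suggests that USE SCHEMA on system.access or SELECT on system.access.column_lineage is missing.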
How to generate the Personal Access Token?
- Use the latest manual from Databricks: Databricks personal access token authentication.
Depending on your implementation scenario, use one of the following:
- We recommend using a service principal to avoid losing the connection if a personal account is removed or the token expires. To set it up, follow the guide Databricks personal access tokens for service principals.
- For smaller non-production implementations, such as a POC, you can use a personal account. To set it up, follow the steps in Databricks personal access tokens for workspace users.