Azure Data Factory
Azure Data Factory (ADF) is a cloud ETL tool developed by Microsoft. It lets you build complex ETL pipelines using a drag-and-drop web UI.
This article describes the capabilities and limitations of the Dataedo connector for Azure Data Factory and walks you through the import process.
Supported elements and metadata
Objects imported
- Pipelines
  - Name
  - Folder (as schema)
  - Activities
    - Name
    - Data lineage (object and column level, details below)
- Datasets
  - Name
  - Columns (if linked service is imported to Dataedo or dataset schema is imported into ADF)
- Sources
  - Name
  - Columns (if linked service is imported to Dataedo or schema is imported into ADF)
- Destinations
  - Name
  - Columns (if linked service is imported to Dataedo or schema is imported into ADF)
Dataedo imports all the activities as data processes within pipelines. Nested activities like ForEach, Until, IfCondition, and Switch are also imported with all the activities inside them.
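To make the idea of nested activities concrete, here is a minimal Python sketch that walks a pipeline definition and lists every activity together with its parent. It is not how Dataedo performs the import; it only assumes the container property names used in ADF pipeline JSON (`activities`, `ifTrueActivities`, `ifFalseActivities`, `cases`, `defaultActivities`). The pipeline and activity names are made up for illustration.

```python
def iter_activities(activities, parent=None):
    """Yield (activity_name, parent_name) for every activity, nested ones included."""
    for activity in activities:
        yield activity["name"], parent
        props = activity.get("typeProperties", {})
        kind = activity.get("type")
        children = []
        if kind in ("ForEach", "Until"):
            children = props.get("activities", [])
        elif kind == "IfCondition":
            children = props.get("ifTrueActivities", []) + props.get("ifFalseActivities", [])
        elif kind == "Switch":
            for case in props.get("cases", []):
                children += case.get("activities", [])
            children += props.get("defaultActivities", [])
        # Recurse so that activities inside containers are reported as well.
        yield from iter_activities(children, parent=activity["name"])


# Hypothetical pipeline definition, trimmed to the fields the walk needs.
pipeline = {
    "name": "LoadSales",
    "properties": {
        "activities": [
            {"name": "LookupTables", "type": "Lookup"},
            {
                "name": "ForEachTable",
                "type": "ForEach",
                "typeProperties": {
                    "activities": [
                        {"name": "CopyTable", "type": "Copy"},
                        {
                            "name": "CheckRows",
                            "type": "IfCondition",
                            "typeProperties": {
                                "ifTrueActivities": [{"name": "LogSuccess", "type": "WebActivity"}],
                                "ifFalseActivities": [{"name": "LogFailure", "type": "WebActivity"}],
                            },
                        },
                    ]
                },
            },
        ]
    },
}

found = list(iter_activities(pipeline["properties"]["activities"]))
for name, parent in found:
    print(f"{name} (parent: {parent})")
```

Each yielded pair corresponds to one data process that would appear within the pipeline, with the parent showing which container activity it sits in.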
To present lineage, Dataedo creates Sources and Destinations from dataset information and, for parameterized datasets and linked sources, from runtime log information.
Activities we build automatic data lineage for
- Copy activity: object-level and column-level lineage
- Dataflow: object-level lineage
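As an illustration of what column-level lineage for a Copy activity looks like, the sketch below reads the explicit column mapping (`translator.mappings`) from a Copy activity definition and returns source-to-sink column pairs. This is an assumption about where such information can be found in ADF JSON, not a description of Dataedo's internal logic; the activity, dataset, and column names are hypothetical.

```python
def column_lineage(copy_activity):
    """Return (source_column, sink_column) pairs from an explicit Copy activity mapping."""
    translator = copy_activity.get("typeProperties", {}).get("translator", {})
    pairs = []
    for mapping in translator.get("mappings", []):
        source = mapping.get("source", {}).get("name")
        sink = mapping.get("sink", {}).get("name")
        # Skip entries that do not name both columns (for example path-based mappings).
        if source and sink:
            pairs.append((source, sink))
    return pairs


# Hypothetical Copy activity with an explicit column mapping.
copy_activity = {
    "name": "CopyCustomers",
    "type": "Copy",
    "inputs": [{"referenceName": "SourceCustomers", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SinkCustomers", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "AzureSqlSource"},
        "sink": {"type": "AzureSqlSink"},
        "translator": {
            "type": "TabularTranslator",
            "mappings": [
                {"source": {"name": "Id"}, "sink": {"name": "CustomerID"}},
                {"source": {"name": "FullName"}, "sink": {"name": "Name"}},
            ],
        },
    },
}

pairs = column_lineage(copy_activity)
for source, sink in pairs:
    print(f"{source} -> {sink}")
```

When a Copy activity has no explicit mapping, this function returns an empty list; column pairs would then have to come from elsewhere, such as matching column names in the dataset schemas.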
Automatic data lineage
The Dataedo ADF connector always creates automatic data lineage in the form source of activity -> task -> sink of activity. For the following linked service types, it also supports the extended chain external data source -> source of activity -> task -> destination of activity -> external data source:
- Microsoft SQL Server
- Azure SQL Server
- Azure Synapse Analytics
- Azure Blob Storage
- Azure Data Lake Storage
- Amazon S3
- Postgres
- MariaDB
- MySQL
- MongoDB
- Snowflake
- REST
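The two lineage chains described above can be pictured as a list of directed edges. The following Python sketch is only an illustrative data model, not Dataedo's; the `Edge` class, the `lineage_edges` function, and all object names are hypothetical. The external edges are added only when an external object is known, which mirrors the difference between the basic and the extended chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str


def lineage_edges(task, activity_source, activity_sink,
                  external_source=None, external_target=None):
    """Build the lineage chain for one activity as an ordered list of edges."""
    # Basic chain: source of activity -> task -> sink of activity.
    edges = [Edge(activity_source, task), Edge(task, activity_sink)]
    # Extended chain: prepend/append the external objects when they are known.
    if external_source:
        edges.insert(0, Edge(external_source, activity_source))
    if external_target:
        edges.append(Edge(activity_sink, external_target))
    return edges


basic = lineage_edges("CopyCustomers", "SourceCustomers", "SinkCustomers")
extended = lineage_edges(
    "CopyCustomers",
    "SourceCustomers",
    "SinkCustomers",
    external_source="sqlserver.dbo.Customers",
    external_target="snowflake.PUBLIC.CUSTOMERS",
)

for edge in extended:
    print(f"{edge.source} -> {edge.target}")
```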