SQL Server Integration Services (SSIS) - Automatic Data Lineage
What to Expect
Dataedo automatically imports:
- Packages,
- Tasks (as processes in Dataedo),
- Object-level lineage for supported task types,
- Column-level lineage for supported Data Flow transformations.
Prerequisites
To create SSIS lineage automatically in Dataedo:
- Import documentation for all source and destination systems used by the package.
- Define source and destination connections at the package level.
- Ensure Linked Sources are matched correctly.
Data Transfer Tasks
If prerequisites are met, Dataedo creates lineage for:
- Data Flow Task (object-level lineage for sources and destinations, and column-level lineage for supported transformations, including supported transformation-related sources such as Lookup)
- Bulk Insert Task (destination)
- FTP Task (file transfer flow)
For supported Data Flow transformations, at the Process level in Dataedo, full column-level lineage is created for:
- Balanced Data Distributor
- Conditional Split
- Lookup
- Multicast
- Cache Transform
- Export Column
- Percentage Sampling
- Aggregate
- Sort
- Union
- Pivot
- Row Sampling
- Unpivot
- Merge
- Merge Join
- Copy Column
- Derived Column (including newly created columns, e.g., combining multiple input columns into one output column)
For the following transformations, column lineage is created for mapped columns, but not for newly created output columns:
- Data Conversion
- Script Component
- Character Map
- Fuzzy Grouping
The source or destination may be relational databases, files, or ODBC sources.

In this view:
SourceandDestinationobjects are displayed on both sides of the package-level process node.- The central node (
Package) represents the Data Flow process between source and destination elements. - Column links can be expanded to see how individual fields are mapped.
- The legend distinguishes automatic lineage from manual lineage.
How is SSIS lineage shown in Dataedo?
Dataedo's documentation of a sample SSIS data flow is shown below.

At package level, Dataedo groups SSIS tasks into a single process view and shows incoming and outgoing objects around that process.
Source
For database sources, Dataedo recognizes the following Data Access Mode options:
-
Table or view
-
Table name or view name from variable
-
SQL Command
-
SQL Command from variable
For each of these options, Dataedo constructs a lineage diagram that shows how data is retrieved.
SSIS Lookup elements are also added as Source objects with the Lookup subtype, and their lineage is shown in Dataedo.

The screenshot shows Package.Data Flow Task.Lookup in the Sources tree and the corresponding lineage node with mapped output columns.
Lineage is also constructed for file-based sources, as shown in the screenshot below.

In the file-based example, Dataedo links the documented file object (for example NewCustomers.txt) to the SSIS Source node and preserves column-level mappings.
For sources that use SQL queries to create them, there will be a lineage based on the parser

In the SQL-based example, the Column flow details panel contains the parsed query text (including aliases and joins), and upstream table nodes are linked to source columns based on parser results.

Process in lineage

The process node in Dataedo represents what is between Source and Destination in SSIS Data Flow (for example additional transformation steps). In package-level lineage, these steps are collapsed into one process object for readability.
For Data Flow transformations that use Derived Column, Dataedo now shows lineage for newly created columns (for example, when two input columns are combined into one output column).

In the example, two input columns (Addressline1, addressline2) feed a newly created output column (Derived Column 1), and this dependency is visible directly on the column flow.
Destination

Dataedo links the SSIS Destination object to the actual target object (database table or file), based on the matched connection from the package.

FTP Tasks

For FTP tasks, Dataedo reads task properties (for example remote path and local path) and creates lineage between documented file objects when both endpoints are available in the repository.
Known Limitations
- Column-level lineage is available for the Data Flow Task transformation types listed above.
- Lineage based on third-party vendor elements is not fully validated.
- Lineage is not created when variables are used as addresses.
- Execute SQL Task lineage is not available
Troubleshooting
Lineage is created based on a matched Linked Source. Check the Dataedo desktop to make sure it is matched properly.

When validating Linked Source mapping, verify:
Source databasepoints to the documented repository object (for exampleAdventureWorks2019@localhost),SQL Dialectmatches the source technology,- connection metadata (host/server, database/catalog, provider) is populated correctly.