Amazon Redshift
Connector Features
Supported Versions
Dataedo supports importing data from both standard Amazon Redshift cluster instances and Amazon Redshift Serverless instances.
Data Catalog
Dataedo documents metadata of the following objects:
- tables
- external tables
- views
- materialized views
- functions
- copy commands
Data Lineage
For Redshift, Dataedo supports both manual and automatic lineage. Automatic lineage is available for views and external tables.
Data Quality
Users will be able to check if data in Redshift tables is accurate, consistent, complete, and reliable using Data Quality functionality. Data Quality requires SELECT permission over the tested object.
Data Profiling
Users will be able to run data profiling for a table, view, or materialized view, then save selected data in the repository. This data will be available from Desktop and Web.
Relationships (PK/FK) Tester
With Dataedo, you can check if selected columns are viable candidates to create a relationship between tables.
Data Classification
Users will be able to run classification on the Redshift database in the repository in search of columns containing potentially sensitive data. All built-in functions are supported.
Connector Specification
Imported Objects
| Object | Imported as |
|---|---|
| Table | Table |
| External Table | Table |
| View | View |
| Materialized View | View |
| Function | Function |
| Copy Command | SQL Script |
Imported Metadata
| Imported | Editable | |
|---|---|---|
| Tables | ||
| Columns | ||
| Data types | ||
| Nullability | ||
| Default value | ||
| Column comments | ||
| Table comments | ||
| Foreign keys | ||
| Primary keys | ||
| Unique indexes | ||
| Views, Materialized Views | ||
| Script | ||
| Columns | ||
| Data types | ||
| Nullability | ||
| Default value | ||
| Column comments | ||
| View comments | ||
| User-defined Functions | ||
| Script | ||
| Parameters | ||
| Returned Value | ||
| Parameter comments | ||
| Function comments | ||
| Copy Commands | ||
| Script |
Supported Features
| Feature | Imported |
|---|---|
| Import comments | |
| Write comments back | |
| Data profiling | |
| Reference data (import lookups) | |
| Importing from DDL | |
| Generating DDL | |
| FK relationship tester |
Data Profiling
Dataedo supports the following data profiling in Redshift:
| Profile | Support |
|---|---|
| Table row count | |
| Table sample data | |
| Column distribution (unique, non-unique, null, empty values) | |
| Min, max values | |
| Average | |
| Variance | |
| Standard deviation | |
| Min-max span | |
| Number of distinct values | |
| Top 10/100/1000 values | |
| 10 random values |
Read more about profiling in the Data Profiling documentation.
Data Lineage
| Source | Method | Version |
|---|---|---|
| Views - object level | From dependencies | 10.4 |
| Views - object level | From SQL parsing | 10.4 |
| Views - column level | From SQL parsing | 10.4 |
| External Tables - object level | From dependencies | 23.2 |
| External Tables - object level | From Linked Sources | 24.1 |
Known Issues and Limitations
- Copy Commands - Due to retention time in Redshift Copy Logs, after one week (default, can be changed in cluster options) Copy Commands would be impossible to import.
Required Access Level
Importing database schema requires a certain access level in the documented database. Granting USAGE permission for information_schema, pg_catalog, and sys schemas in the documented database to a user will allow them to import all objects from the database. Alternatively, you can grant SELECT permission only for specific objects you want to document.
The following objects are accessed during the schema import process:
- SVV_TABLES
- SVV_EXTERNAL_TABLES
- SVV_EXTERNAL_SCHEMAS
- SVV_COLUMNS
- INFORMATION_SCHEMA.ROUTINES
- INFORMATION_SCHEMA.VIEWS
- INFORMATION_SCHEMA.PARAMETERS
- INFORMATION_SCHEMA.VIEW_TABLE_USAGE
- INFORMATION_SCHEMA.TABLES
- PG_CATALOG.PG_PROC
- PG_CATALOG.PG_NAMESPACE
- PG_CATALOG.PG_LANGUAGE
- PG_CATALOG.PG_DESCRIPTION
- PG_CATALOG.PG_CONSTRAINT
- PG_CATALOG.PG_CLASS
- PG_CATALOG.PG_ATTRIBUTE
- SYS.SYS_QUERY_HISTORY (only for Copy History Import)