Skip to main content

Amazon Redshift

Connector Features

Supported Versions

Dataedo supports importing data from both standard Amazon Redshift cluster instances and Amazon Redshift Serverless instances.

Data Catalog

Dataedo documents metadata of the following objects:

  • tables
  • external tables
  • views
  • materialized views
  • functions
  • copy commands

Data Lineage

For Redshift, Dataedo supports both manual and automatic lineage. Automatic lineage is available for views and external tables.

Data Profiling

Users will be able to run data profiling for a table, view, or materialized view, then save selected data in the repository. This data will be available from Desktop and Web.

Relationships (PK/FK) Tester

With Dataedo, you can check if selected columns are viable candidates to create a relationship between tables.

Data Classification

Users will be able to run classification on the Redshift database in the repository in search of columns containing potentially sensitive data. All built-in functions are supported.

Connection Requirements

Cluster VPC Option

In order to import Redshift metadata, the cluster has to have Public accessibility turned on.

Redshift Action Menu
Public accessibility menu

How to Connect

  1. Host - provide an address of the Redshift endpoint.
  2. Port - change the default port of the Amazon Redshift instance if required.
  3. User - provide the username of the user (either root or IAM) that has access to the Redshift database.
  4. Password - provide the password for the given username.
  5. SSL mode:
    • Disable - don't use SSL.
    • Require - connect with SSL. If the server doesn't support SSL, the connection won't be established.
  6. Database - type in the database name.
Redshift Connector Control

Connector Specification

Imported Objects

ObjectImported as
TableTable
External TableTable
ViewView
Materialized ViewView
FunctionFunction
Copy CommandSQL Script

Imported Metadata

ImportedEditable
Tables
  Columns
   Data types
   Nullability
   Default value
   Column comments
  Table comments
  Foreign keys
  Primary keys
  Unique indexes
Views, Materialized Views
  Script
  Columns
   Data types
   Nullability
   Default value
   Column comments
  View comments
User-defined Functions
  Script
  Parameters
  Returned Value
  Parameter comments
  Function comments
Copy Commands
  Script

Supported Features

FeatureImported
Import comments
Write comments back
Data profiling
Reference data (import lookups)
Importing from DDL
Generating DDL
FK relationship tester

Data Profiling

Dataedo supports the following data profiling in Redshift:

ProfileSupport
Table row count
Table sample data
Column distribution (unique, non-unique, null, empty values)
Min, max values
Average
Variance
Standard deviation
Min-max span
Number of distinct values
Top 10/100/1000 values
10 random values

Read more about profiling in the Data Profiling documentation.

Data Lineage

SourceMethodVersion
Views - object levelFrom dependencies10.4
Views - object levelFrom SQL parsing10.4
Views - column levelFrom SQL parsing10.4
External Tables - object levelFrom dependencies23.2
External Tables - object levelFrom Linked Sources24.1

Known Issues and Limitations

  • Copy Commands - Due to retention time in Redshift Copy Logs, after one week (default, can be changed in cluster options) Copy Commands would be impossible to import.

Required Access Level

Importing database schema requires a certain access level in the documented database. Granting USAGE permission for information_schema, pg_catalog, and sys schemas in the documented database to a user will allow them to import all objects from the database. Alternatively, you can grant SELECT permission only for specific objects you want to document.

The following objects are accessed during the schema import process:

  • SVV_TABLES
  • SVV_EXTERNAL_TABLES
  • SVV_EXTERNAL_SCHEMAS
  • SVV_COLUMNS
  • INFORMATION_SCHEMA.ROUTINES
  • INFORMATION_SCHEMA.VIEWS
  • INFORMATION_SCHEMA.PARAMETERS
  • INFORMATION_SCHEMA.VIEW_TABLE_USAGE
  • INFORMATION_SCHEMA.TABLES
  • PG_CATALOG.PG_PROC
  • PG_CATALOG.PG_NAMESPACE
  • PG_CATALOG.PG_LANGUAGE
  • PG_CATALOG.PG_DESCRIPTION
  • PG_CATALOG.PG_CONSTRAINT
  • PG_CATALOG.PG_CLASS
  • PG_CATALOG.PG_ATTRIBUTE
  • SYS.SYS_QUERY_HISTORY (only for Copy History Import)