Apache ORC

Dataedo 9.3 added support for ORC files. Dataedo scans an ORC file and builds a structure that includes:

Supported Metadata

Metadata

  • Primitive types (boolean, byte, int, long, float, string, ...)
  • Compound types:
    • Struct
    • List
    • Map
    • Union

Each field contains:

  • Name
  • Data type
  • Nullability

Data profiling

Dataedo does not support data profiling in ORC files.

How to import an ORC file

To import an ORC file into Dataedo:

  • Right-click any database or the Structures folder, choose Add Object, then Add/Import Structure, or
  • On the main ribbon, select Add Object, then Structure/File, or
  • Select the Structures folder and, on the main ribbon, select Add Structure/File.

Then select Import from file and choose the ORC format. Point to an ORC file on disk and click Next. Dataedo scans the file's content and opens the Structure designer with the parsed structure. In this window you can edit names, data types and field types, then click Save to store the structure.

[Image: Structure designer showing the parsed ORC structure]

Guide: Adding files to the catalog