Raw Schema

The Raw Schema tab for Sources allows users to view the raw database attributes as well as raw metadata.


 

Raw Schema tab in Sources

 


Viewing Raw Schema

The Raw Schema tab is located in each source, between the Settings and Dependencies tabs. It lists the raw attributes for the source and the metadata associated with each attribute.

 

Data columns:

Column Name        Description
ID                 Serialized number assigned to attribute
Name               Column name from source data
Lineage (blank)    Use icons to view lineage graph
Description        Optional description for raw attribute
Column Normalized  Name converted to normalized format
Raw Metadata       See Raw Metadata
Last Input ID      ID of latest input where attribute was ingested
Data Type          Data type of each attribute
Version Number     Instance number of attribute with the same name and data type ingested for the source
Column Alias       Lower-cased alias of column name used for hub tables
Unique Flag        Indicates whether the attribute is unique for every record. Designated in source settings via Data Refresh Key, Sequence, or Timestamp columns
Targets Flag       Indicates whether the attribute is used downstream in enrichments or output column mappings
Inputs Flag        Indicates whether any inputs in the source contained the attribute when ingestion was run
Updated Date       Date and time of last update
 
Raw attribute data will appear after inputs have completed their data pull.
 

Raw Metadata

Raw Metadata popup
 
Database columns may contain additional metadata. This can be viewed by clicking on the table icon in the Raw Metadata column on the Raw Schema table. Raw Metadata is only provided after ingesting data through a connection that uses an agent for ingestion.
 
After clicking the icon, a popup will appear with all of the raw metadata associated with that raw attribute.
 
If no metadata is associated with a column, the icon is disabled. Hovering over the disabled icon shows the tooltip "No metadata defined".
 

Updating Raw Attribute Descriptions

 
Raw Attribute Description popup
 
Raw attribute descriptions can be created and updated via the Raw Schema tab. Click the cell in the Description column for the attribute's row.
 
A popup will appear where users can input a description for the specific raw attribute.

Raw Schema Management

Raw attributes are automatically added to the Raw Schema tab as inputs with new columns are ingested. Users may occasionally want to remove a raw attribute that was added to the Raw Schema. To do so, delete the inputs in which the attribute was ingested. The Raw Schema then updates automatically and removes the column if no other input still references it. The "Last Input ID" column indicates which input last ingested the attribute.

Unwanted attributes typically appear when an input contained bad data and a new attribute was created in the Raw Schema. They can also appear when data types are inferred and a column is inferred as a different data type on one input than the same column name on another input. When two columns exist in the source data with the same name and data type, the newer raw attribute is created with "_2" appended to the column alias.
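The removal behavior described above can be pictured with a small Python model. This is only an illustrative sketch, not the product's implementation: the RawSchema and RawAttribute classes, their method names, and the rule for recomputing the last input ID after a delete are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RawAttribute:
    name: str
    data_type: str
    input_ids: set = field(default_factory=set)  # inputs that contained this attribute
    last_input_id: Optional[int] = None          # mirrors the "Last Input ID" column


class RawSchema:
    """Hypothetical model: one raw attribute per (name, data type) pair."""

    def __init__(self):
        self.attributes = {}

    def ingest(self, input_id, columns):
        # columns: iterable of (name, data_type) pairs found in the input
        for name, data_type in columns:
            key = (name, data_type)
            attr = self.attributes.setdefault(key, RawAttribute(name, data_type))
            attr.input_ids.add(input_id)
            attr.last_input_id = input_id

    def delete_input(self, input_id):
        for key in list(self.attributes):
            attr = self.attributes[key]
            attr.input_ids.discard(input_id)
            if not attr.input_ids:
                # No other input references the attribute, so it is removed.
                del self.attributes[key]
            elif attr.last_input_id == input_id:
                # Assumption: fall back to the highest remaining input ID.
                attr.last_input_id = max(attr.input_ids)


schema = RawSchema()
schema.ingest(1, [("order_id", "int"), ("amount", "decimal")])
# Input 2 carries bad data: "amount" is inferred as a string, creating a new attribute.
schema.ingest(2, [("order_id", "int"), ("amount", "string")])
schema.delete_input(2)
remaining = sorted(schema.attributes)
```

After input 2 is deleted, the string-typed "amount" attribute disappears because no other input references it, while "order_id" stays because input 1 still contains it.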


Data Profiles

Clicking the Data Profile icon brings up the data profile of that raw attribute. The statistics shown depend on the attribute's data type.
 
Data Profile options
 
Clicking the data type label opens a modal showing the data profile. Older data profiles for the source can be accessed by using Select profiling timestamp.
 
The Data Profile
 
Data profiles provide the following statistics:

Common

  • Attribute Type
  • Data Type
  • Number of Rows
  • Min
  • Max
  • Unique %
  • Null %
  • Top 5 Values
  • Bottom 5 Values
  • Distribution Percentiles (10%, 25%, 50%, 75%, 90%)

Text

  • Min Length
  • Max Length
  • Avg Length
  • Numeric %
  • Blank %
  • Special Char %

Numeric

  • Average
  • Median
  • Standard Deviation
  • Zero %

Timestamp

  • Average
  • Median
  • Standard Deviation
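For readers who want a feel for what a few of these statistics mean, the following Python sketch computes some of the Common values for a single column. It is an illustration under assumptions, not the product's profiling logic: the function name profile_column is hypothetical, percentages use the total row count as the denominator, and Unique % is taken to mean distinct non-null values divided by rows.

```python
from collections import Counter
from statistics import mean


def profile_column(values):
    """Compute a few profile statistics for one column.

    `values` is a list in which None marks a null.
    """
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = Counter(non_null)
    return {
        "rows": total,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        # Assumption: percentages are relative to the total number of rows.
        "null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0,
        "unique_pct": 100.0 * len(distinct) / total if total else 0.0,
        # Average only makes sense for numeric columns.
        "average": mean(non_null) if non_null else None,
    }


profile = profile_column([3, 0, 7, None, 3])
```

For the sample column, the profile reports 5 rows, a minimum of 0, a maximum of 7, and one null out of five rows.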

 


Sub-Source Raw Schema

All Raw Schema tab features are available in a sub-source. The raw schema is updated automatically from the parent sub-source enrichment schema.

For full documentation, visit Sub-Sources.
