The Raw Schema tab for Sources allows users to view the raw database attributes as well as raw metadata.
Viewing Raw Schema
The Raw Schema tab is located within each source, between the Settings and Dependencies tabs. It shows the raw attributes for the source and the metadata associated with these attributes.
Data columns:
Column Name | Description |
ID | Serialized number assigned to attribute |
Name | Column name from source data |
Lineage (blank) | Use icons to view lineage graph |
Description | Optional description for raw attribute |
Column Normalized | Name converted to normalized format |
Raw Metadata | See Raw Metadata |
Last Input ID | ID of latest input where attribute was ingested |
Data Type | Data type of each attribute |
Version Number | Instance number of attribute with the same name and data type ingested for the source. |
Column Alias | Lower-cased alias of column name used for hub tables. |
Unique Flag | Indicates whether the attribute is unique for every record. Designated in source settings via Data Refresh Key, Sequence, or Timestamp columns. |
Targets Flag | Indicates whether the attribute is used down-stream in enrichments or output column mappings. |
Inputs Flag | Indicates whether any inputs in the source contained the attribute when ingestion was run |
Updated Date | Date and time of last update |
Raw Metadata
Updating Raw Attribute Descriptions
Raw Schema Management
Raw attributes are automatically added to the Raw Schema tab whenever an input containing new columns is ingested. Users may sometimes want to remove a raw attribute that was added to the Raw Schema. To do so, delete the Inputs where the attribute was ingested. The Raw Schema then updates automatically and removes the column if no other input references it. The "Last Input ID" column shows which Input most recently ingested the attribute.
Unwanted raw attributes can appear when an input contained bad data and a new attribute was created in Raw Schema. They can also appear when data types are inferred and a column on one input is inferred as a different data type than the column with the same name on another input. When two columns exist in the source data with the same name and data type, the newer raw attribute is created with "_2" appended to the column alias.
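The removal behavior described above can be illustrated with a minimal sketch. This is not the product's implementation. The function names, the input IDs, and the assumption that a higher input ID means a later ingestion are all hypothetical and used only to show the idea: an attribute stays in the schema while at least one remaining input references it.

```python
from typing import Dict, Set


def raw_schema(inputs: Dict[int, Set[str]]) -> Dict[str, int]:
    """Map each raw attribute to the ID of the latest input that ingested it.

    `inputs` maps a hypothetical input ID to the column names that input
    contained. Higher IDs are assumed to be later ingestions.
    """
    schema: Dict[str, int] = {}
    for input_id in sorted(inputs):
        for column in inputs[input_id]:
            schema[column] = input_id  # later inputs overwrite earlier ones
    return schema


def delete_input(inputs: Dict[int, Set[str]], input_id: int) -> Dict[int, Set[str]]:
    """Return a copy of `inputs` without the deleted input."""
    return {i: cols for i, cols in inputs.items() if i != input_id}


inputs = {
    101: {"customer_id", "order_total"},
    102: {"customer_id", "order_total"},
    103: {"customer_id", "order_total", "ordr_totl"},  # bad file added a stray column
}

before = raw_schema(inputs)                      # includes "ordr_totl", last input 103
after = raw_schema(delete_input(inputs, 103))    # "ordr_totl" is no longer referenced
```

In this sketch, deleting input 103 drops the stray attribute because no other input contains it, while the two legitimate attributes remain and their last input ID falls back to 102.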
Data Profiles
Common
- Attribute Type
- Data Type
- Number of Rows
- Min
- Max
- Unique %
- Null %
- Top 5 Values
- Bottom 5 Values
- Distribution Percentiles (10%, 25%, 50%, 75%, 90%)
Text
- Min Length
- Max Length
- Avg Length
- Numeric %
- Blank %
- Special Char %
Numeric
- Average
- Median
- Standard Deviation
- Zero %
Timestamp
- Average
- Median
- Standard Deviation
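To make a few of the profile metrics listed above concrete, here is a rough sketch of how some of the common and text statistics could be computed for a single column. The exact definitions the product uses are not stated here, so the choices below are assumptions: `None` represents null, percentages are taken over all rows, length statistics ignore nulls, "blank" means empty or whitespace-only, and "numeric" means digits only. The function name `profile_text` is hypothetical.

```python
from typing import Dict, List, Optional


def profile_text(values: List[Optional[str]]) -> Dict[str, float]:
    """Compute a few text profile statistics for one column.

    Assumptions (not taken from the product): None means null,
    percentages are over all rows, and length statistics ignore nulls.
    """
    total = len(values)
    if total == 0:
        raise ValueError("cannot profile an empty column")

    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]

    def pct(count: int) -> float:
        return 100.0 * count / total

    return {
        "rows": total,
        "null_pct": pct(total - len(present)),
        "unique_pct": pct(len(set(present))),
        # Blank: empty or whitespace-only strings
        "blank_pct": pct(sum(1 for v in present if v.strip() == "")),
        # Numeric: digits only; decimals and signs are not handled in this sketch
        "numeric_pct": pct(sum(1 for v in present if v.strip().isdigit())),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "avg_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }


profile = profile_text(["A12", "42", "", None, "42"])
```

For the five sample values, one is null (20%), three distinct non-null values exist (60%), one value is blank (20%), and two are digits only (40%).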
Sub-Source Raw Schema
All Raw Schema tab features are available in a sub-source. The raw schema of a sub-source is updated automatically from the schema of the sub-source enrichment on the parent source.
For full documentation, visit Sub-Sources.