Overview
This article provides a brief overview of the DataForge UI. For a more in-depth explanation of each page and its features, refer to the User Manual section of this documentation.
Sources and Processing
Sources Tabs
Tab | Summary |
Settings | The main interface for setup and configuration of a Source. Set the Connection and define how to get data and how to process it. |
Raw Schema | Metadata associated with the raw attributes ingested from the source connection. The tab also shows data profiles and how the raw attributes are used in the system. |
Dependencies | Sources may be dependent upon other Sources. This tab displays additional data flows the Source may be dependent upon. |
Relations | Additional data may be joined to a source through use of rules, relations, and output mappings. This tab displays the details of the relations between sources (similar to SQL Joins). |
Rules | Documentation of row-level manipulations of the data, inclusive of both enrichments and validation rules. Adding new enrichments allows the user to apply business logic to the data. |
Inputs | View of the associated raw data that is ingested through the Connection settings. Primary interface for ingesting new data and re-processing existing data. |
Process | Overview of the logical data flow and the calculations enacted. This view checks the status of the logical data flow and displays individual process results along with logs and Databricks job runs. |
Data View | Opens the source hub table in Databricks for users to query. |
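The Relations and Rules rows above compare relations to SQL joins and describe rules as row-level logic. As a rough analogy only, the sketch below joins two hypothetical tables with SQLite and adds one computed column. The table names, column names, and the `is_large` expression are assumptions made for illustration; this is plain SQL, not DataForge's own relation or rule syntax.

```python
import sqlite3

# Hypothetical tables standing in for two sources. Names and columns are
# illustrative only and do not come from DataForge.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, 11, 40.0), (3, 99, 5.0);
    INSERT INTO customers VALUES (10, 'Ada'), (11, 'Grace');
""")

# The join condition plays the role of a relation; the computed
# is_large column plays the role of a row-level enrichment rule.
rows = conn.execute("""
    SELECT o.order_id, c.name, o.amount, o.amount > 30 AS is_large
    FROM orders AS o
    LEFT JOIN customers AS c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

The left join keeps order 3 even though no customer matches it, which is the kind of unmatched row a validation rule would typically be written to catch.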
Processing
Outputs
Connections
Connections
Other Objects
Templates
Templates let users apply and manage similar transformation logic across many objects at the same time, whether through optional tokens, rule or relation templates, or source and output templates.
Schedules
Users manage source data ingestions based on schedule configurations defined and attached to sources. Schedules are set up using CRON syntax and include a tab for users to see all objects attached to the schedule in one view.
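For readers new to cron expressions, the following minimal sketch splits an expression into named fields. It assumes the standard five-field Unix layout; the exact dialect DataForge accepts is not stated here and may differ (some schedulers add a seconds field, for example). The helper `describe_cron` and the sample expression are hypothetical and are not part of DataForge.

```python
# Field order of a standard five-field Unix cron expression (an assumption;
# the scheduler you use may expect a different layout).
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each whitespace-separated cron field to its name."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(fields)}"
        )
    return dict(zip(FIELD_NAMES, fields))


# "0 6 * * 1-5" reads as: minute 0, hour 6, any day of month,
# any month, Monday through Friday.
schedule = describe_cron("0 6 * * 1-5")
for name, value in schedule.items():
    print(f"{name}: {value}")
```

Labeling the fields this way makes it easier to check an expression before attaching the schedule to a source.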
Agents
The Agents page allows users to manage remote Agent installations and the parameters that control how remote Agents ingest data. More information about DataForge Agents can be found in the Agent Installation Guide.
Lineage
Lineage icons appear throughout the DataForge user interface, allowing users to create lineage graphs for specific objects or for a broader scope. When a user starts a new lineage graph, they are redirected to this page to interact with it.
System Configuration
System configurations let users define custom Cleanup configurations and Cluster and Process configurations, and manage global service settings such as disabling all ingestions in the workspace or auto-upgrading to the latest version of DataForge. Cleanup and Cluster/Process configurations are assigned to specific sources, giving users flexibility in compute resource management and post-processing cleanup.
Projects
Projects are a top-level container within a DataForge workspace and represent separate sets of Source and Output configurations. Users utilize Project exports and imports to promote changes between projects. Projects are used to follow DevOps best practices with tools such as Git clients.
Users
User management is handled on this page for all Standard DataForge customers. Each customer has at least one admin user who can add and manage access for their organization. Private Enterprise customers manage user access to DataForge through Auth0.
Documentation (and Support)
Databricks
Sign Up and Log In (with a subscription)