User Manual

Sources

  • Sources Overview

    A Source represents a single schema of data. Sources are the logical grouping for Inputs, Relations, and Ru...

  • Source Settings

    The Settings tab for a Source allows a user to specify key information about the Source including input types ...

  • Raw Schema

    The Raw Schema tab for Sources allows users to view the raw database attributes as well as raw metadata. ...

  • Dependencies

    Dependencies allow configurators to modify the workflow engine to introduce waits to the processing queues ...

  • Relations

    Relations define inter-source connections and enable users to configure lookups and cross-source aggregates. ...

  • Rules

    Rules allow DataForge to modify and transform data. Rules Tab: The Rules tab allows users to select, ...

Processing

  • Processing Queue

    The Processing Queue tab provides an interactive overview of all processes completed, active, error...

  • Ingestion Queue

    The Ingestion Queue tab provides a view of all ingestions that are waiting to run, cur...

  • Workflow Queue

    The Workflow Queue tab provides a view of processes tha...

  • Job Runs

    Job Runs tab: The Job Runs table shows detailed information on all job...

  • Resetting Processes

    Types of Resets: During the development lifecycle, users will need to reset their sources often to change sourc...

  • Recommended Cluster Configurations during Development

    For environments that are undergoing daily development, it may be beneficial to set up your cluster configurat...

Outputs

  • Outputs Overview

    Outputs specify where or how DataForge exposes or exports data to external systems. Outputs Scre...

  • Output Settings

    In the Output Settings screen, users can see the various components that make up an Output, including tabs ...

  • Output Mapping

    Output Mapping controls the way data is sent to its final destination. It allows a user to rename columns, app...

  • Process (Output History)

    The Process page provides an operational dashboard of the processes completed or currently active for this Out...

Connections

  • Connections

    A connection holds the credentials, network locations, and any other parameters required to access data in the...

  • Generic JDBC Connection

    DataForge offers a generic JDBC connection type for ingesting data from external databases, outside of the pre...

  • Salesforce Connection

    DataForge offers a pre-built connector for Salesforce.  Before data can be ingested from Salesforce, users nee...

  • Kafka Events Connection

    DataForge integrates directly with Kafka Event Topics for batch or stream data Source ingestion and Output pub...

  • Unity Catalog

    DataForge supports reading from and writing to tables stored in Databricks Unity Catalog.  While DataForge wil...

Schedules

  • Schedules

    A schedule uses a CRON expression to determine how often source inputs are updated. Multiple sources can be as...

Lineage

  • Lineage Overview

    Lineage refers to a directed acyclic graph (DAG) generated by DataForge describing how data is processed, trac...

  • Lineage Edges: Dataflows and Relations

    What are Edges? Edges are the dataflow arrows connecting nodes on lineage. All edges show the dataflow from le...

  • Lineage Legend

    Where's the key? In Lineage, the different types of nodes are represented by combinations of colors and symbol...

  • Lineage Navigating Nodes

    Accessing the Navigation Menu: Once in a lineage session, users can navigate the dataflow related to any node v...

System Configuration

  • Global Service Configurations

    WARNING: Changes to these settings can have a severe impact on the platform and break functionality if set inc...

  • Cleanup Configuration

    Cleanup Configuration defines retention settings for data lake objects and metadata. It is accessible via m...

Projects

  • Projects Overview

    Projects represent a group of configurations in DataForge that users control and are the primary vehicle for m...

  • Managing Projects with GitHub

    Managing Project configurations with GitHub allows users to efficiently merge changes from one project to anot...

Import/Export

  • Export/Import Overview

    Intro: In DataForge, import/export functionality allows users to copy groups of configurations, in the...

Users - Access Administration

  • Manage Users

    This article explains how to manage user access to DataForge workspaces. Overview of User Access: There are ...

Templates and Tokens

  • Templates and Tokens Overview

    Templates combined with tokens enable centralized deployment and management of re-usable Rules and Relations a...

  • Tokens

    Token management screens overview: Tokens can be created and managed on the Tokens page, which is found by o...

  • Relation Templates

    Relation Templates Use Case: When configuring multiple Sources within DataForge, it is common for repeated ...

  • Rule Templates

    Rule Templates Use Case: When configuring multiple Sources within DataForge, it is common for repea...

  • Best Practices

    Best practices, patterns, and recommendations for template configuration and management. Do: When fi...

Agents

  • Agents

    Overview: Agents are lightweight applications that can be installed on a machine to allow data ingestion from ...

  • Logs

    UI Logs: Agent logs can be found in the UI by clicking on the Logs icon on the Agents screen. Logs can...

  • Installing a New Agent

    Details the requirements and configurations for installing an Agent on a server that can access an on-premise data ...

SDK

See all 7 articles

Cloning

  • Cloning Overview

    Summary and Use Case: Cloning allows users to create copies of multiple sources/outputs and the relat...