DataForge Cloud 9.0 Version Features Blog

DataForge Cloud 9.0 provides a revolutionized AI assistant experience in Talos, dark mode, updated connection management, and performance improvements.

Table of Contents

  1. Next generation of Talos: Your AI-powered data assistant
  2. Embrace the Dark Side: Introducing Dark Mode
  3. Upgraded Databricks Runtime 15.4 LTS
  4. Improved connection metadata search and multi-select
  5. Integrate connection data dictionaries
  6. Optimize full refresh sources with key column parameters
  7. SQL Server output upgrades

Next generation of Talos: Your AI-powered data assistant

Talos, DataForge’s intelligent AI assistant, has taken a giant leap forward in empowering users to work smarter and faster with their data. Talos offers a dramatically enhanced natural language interface, allowing users to interact with their data assets, build complex outputs, and automate data processes simply by describing what they want to accomplish. Whether you’re a data engineer, analyst, or business user, Talos bridges the gap between technical complexity and business intent, making advanced data tasks accessible to everyone.

Among the standout features in this release, Talos can automatically discover and suggest relevant tables, recommend optimal relations, and even generate Spark SQL expressions for custom rules and calculations, all from a plain English prompt. Users can trace data lineage, create and manage sources, and orchestrate outputs with just a few clicks or commands, drastically reducing the time spent on manual configuration. Talos also provides intuitive guidance throughout the process, surfacing best practices and actionable insights, so teams can focus on deriving value from their data rather than wrestling with technical hurdles.

This release marks a new era for DataForge users, where the power of AI-driven automation meets the flexibility of self-service analytics. With Talos, organizations can accelerate their data projects, boost productivity, and unlock new possibilities no matter their level of technical expertise.

talos creation of sources and output.gif

For more information on getting started, visit the Talos AI documentation.

Embrace the Dark Side: Introducing Dark Mode

You can now use a sleek, modern dark mode to make your experience easier on the eyes. Whether you are optimizing your data model or reviewing complex logic, dark mode reduces eye strain while giving the UI a polished look. Switch between light and dark mode through the profile menu in the top-right corner of the screen.

For more information about switching between light and dark mode, visit the Navigation and Interface documentation.

Upgraded Databricks Runtime 15.4 LTS

The default Databricks version for cluster configurations is now DBR 15.4 LTS. For more information, visit the Databricks 15.4 LTS documentation.

Improved connection metadata search and multi-select

Searching through the available schemas and tables in your connection metadata, and selecting several of them at once, is now easier. Search freely for schema or table names; DataForge retains any previously selected tables even if the search is cleared. To multi-select a group of tables for bulk creation, select the top table, then hold Shift and click the bottom table to select every table in between.
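
The selection behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DataForge's implementation: the function names, the table names, and the use of a plain set for the selection are all assumptions made for the example.

```python
def filter_tables(tables, query):
    """Return the tables whose name contains the query (case-insensitive)."""
    q = query.strip().lower()
    return [t for t in tables if q in t.lower()]


def shift_click(visible, selected, anchor, clicked):
    """Add every visible table between anchor and clicked (inclusive)."""
    start, end = sorted((visible.index(anchor), visible.index(clicked)))
    # Union with the existing selection so earlier picks are retained.
    return selected | set(visible[start:end + 1])


# Hypothetical connection metadata.
tables = ["sales.orders", "sales.order_lines", "sales.customers", "hr.employees"]

selected = {"hr.employees"}                  # picked before searching
visible = filter_tables(tables, "sales")     # the search narrows the list
selected = shift_click(visible, selected, "sales.orders", "sales.customers")
```

Because the selection is stored separately from the filtered list, clearing or changing the search does not drop tables that were already selected.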

connection metadata search and multiselect.gif

For more information, visit the connections documentation.

Integrate connection data dictionaries

You can define a data dictionary for the tables and columns from your connection. Once the data dictionaries are uploaded, Talos can better help you build your data model from natural language, because it can interpret what data is available in the connection and how the tables and columns relate to each other. For now, this is limited to table connections where metadata is collected.

To start, download the table and column template CSVs, fill them out, and upload them back into DataForge.

Provide descriptions for both tables and columns, and assign each column a category from the list of key, dimension, metric, name, or modified timestamp. Mark key columns with a "pk", "fk", or "pk+fk" definition, and identify the table each foreign key references so that Talos can easily create relations for you!
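
Before uploading, it can help to check a filled-out column CSV against the rules above. The following is a minimal sketch of such a check. The column headers (`table`, `column`, `description`, `category`, `key_type`, `references_table`) are hypothetical and will likely differ from the actual DataForge template, so adjust them to match the downloaded file. Only the category list and the "pk", "fk", "pk+fk" values come from this article.

```python
import csv
import io

CATEGORIES = {"key", "dimension", "metric", "name", "modified timestamp"}
KEY_TYPES = {"", "pk", "fk", "pk+fk"}  # empty means "not a key column"


def validate_column_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        category = row["category"].strip().lower()
        key_type = row["key_type"].strip().lower()
        if category not in CATEGORIES:
            problems.append(f"line {line_no}: unknown category {row['category']!r}")
        if key_type not in KEY_TYPES:
            problems.append(f"line {line_no}: unknown key type {row['key_type']!r}")
        elif "fk" in key_type and not row["references_table"].strip():
            problems.append(f"line {line_no}: foreign key needs a referenced table")
    return problems


# Hypothetical sample file with two deliberate mistakes.
SAMPLE = """table,column,description,category,key_type,references_table
orders,order_id,Unique order identifier,key,pk,
orders,customer_id,Customer who placed the order,key,fk,customers
orders,total,Order total in USD,metric,,
orders,region,Sales region,dimention,,
orders,store_id,Store that took the order,key,fk,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
problems = validate_column_rows(rows)
for problem in problems:
    print(problem)
```

In this sample, the check flags the misspelled category on line 5 and the foreign key without a referenced table on line 6.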

For more information, visit the connections documentation.

Optimize full refresh sources with key column parameters

DataForge now supports key column identification in the CDC parameters of full refresh sources. Because the identified key columns are marked as unique in the raw schema and can be referenced directly, you no longer need to create unique rules in full refresh sources before building 1:1 or 1:M relations.
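
Since the key columns are treated as unique, it is worth confirming that the chosen columns really do identify each row before setting the parameter. The sketch below shows one way to check this on a sample of the data. The function name, the row format, and the column names are assumptions for illustration and are not part of DataForge.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that appear more than once in rows."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample of a full refresh source.
sample = [
    {"order_id": 1, "line_no": 1, "sku": "A"},
    {"order_id": 1, "line_no": 2, "sku": "B"},
    {"order_id": 2, "line_no": 1, "sku": "A"},
]

single = duplicate_keys(sample, ["order_id"])               # order_id repeats
composite = duplicate_keys(sample, ["order_id", "line_no"])  # unique together
```

Here `order_id` alone is not a valid key, while the combination of `order_id` and `line_no` is.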

For more information, visit the source settings documentation.

SQL Server output upgrades

The Spark SQL connector that was previously used is not supported on the updated Databricks Runtime 15.4 LTS version. However, don't fret! DataForge has built a new SQL connector that keeps many of the basics of the previous Spark SQL connector and adds enhanced output capabilities to improve performance.

Single-channel outputs can now truncate the table when a full reload is detected instead of running a "DELETE FROM ... WHERE s_output_source_id = ...." pattern, which improves performance.

SQL Server outputs also include two new parameters: batch size and table lock. Adjusting the batch size helps optimize bulk insert performance: use higher values for narrower tables and lower values for wider tables. Enabling table lock also improves bulk insert performance, but it requires an exclusive lock on the full table.
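
The choice between truncating and deleting on a full reload can be sketched as follows. This is an assumption-laden illustration, not the connector's actual code: the function name, the `?` parameter placeholder, and the table name are hypothetical, and the sketch assumes that outputs with more than one channel keep using the delete pattern because other sources share the table.

```python
import re

# Accept only simple identifiers such as "sales" or "dbo.sales".
_IDENTIFIER = re.compile(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)?")


def full_reload_clear(table, source_id, single_channel):
    """Return (sql, params) that clears old rows before a full reload."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if single_channel:
        # Only one source writes to the table, so every row can go at once.
        return f"TRUNCATE TABLE {table}", ()
    # Other sources share the table, so remove only this source's rows.
    return f"DELETE FROM {table} WHERE s_output_source_id = ?", (source_id,)


truncate_sql, truncate_params = full_reload_clear("dbo.sales", 42, single_channel=True)
delete_sql, delete_params = full_reload_clear("dbo.sales", 42, single_channel=False)
```

Truncation avoids evaluating a filter and removing rows one by one, which is why it is typically faster than a filtered delete on large tables.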

For more information, visit the output settings documentation.
