This archived page lists maintenance updates for DataForge versions that are no longer supported.
Important!
This documentation has been retired. The DataForge versions listed are no longer supported. See DataForge release notes and version support.
DataForge Version releases
Maintenance Updates by version:
DataForge Version 7.0
- January 5, 2024
- Salesforce documentation hyperlinks were not opening the API docs correctly
- Salesforce API ingestions with Max Records parameter not downloading all batches of data
- December 12, 2023
- Workflow doesn't prevent manual Reset CDC on source with previously failed refresh
- Add connect and login timeout to Core system config connection on startup
- Remove single quotes around timestamp token for SFDC ingestion
- November 22, 2023
- Output mappings not fully importing from project
- Select enrichment parameters deleted during import
- Added check for import/export version number
- Unable to delete group if connection is tagged in use
- Removed deprecated scripts from database deployment during upgrades
- Validation of rule template erroring with message "error : Values not defined for token(s): "
- October 5, 2023
- Group renaming fails on save
- Output channel IPUs not updated after cloning
- Custom refresh source fails during refresh when keep current rule using window function exists
- Delete input and Bulk reset CDC may duplicate data
- Clone (output channel clone without Group) does not map cloned Output column mappings
- Unable to process large inputs (10GB+)
- September 21, 2023
- Update Source File Type description to clarify multipart should not be used with Watcher sources
- Reset All CDC fails on switching from Key to Custom Refresh when schema changed
- Source parser attribute is null in Postgres but displays as Spark in UI
- Connection cloud_credentials are blank after cloning
- s3 file connection allows "Public bucket" credential when using agent (removed)
- Can't update Raw Attribute Description anymore
- Convert query token substitution to string for spark ingestion to match agent ingestion
- Optimize reads for custom refresh bulk reset CDC
- September 15, 2023
- Changing from Key refresh to Custom Refresh fails on Reset All CDC
- Key Delete Breaks when raw_attribute only exists in input(s) being deleted
- Delta Lake Output with multiple channels creates loop of output processes
- Key CDC bulk reset breaks hub_history table
- Custom Data refresh delete input fails
- August 22, 2023
- Direct Mappings do not detect Relations
- Refresh calls effective record ranges function excessively during timeseries, none, sequence refresh
- Enrichments with Date type failing to save and restarting API
- July 19, 2023
- Sparky file ingest is missing file archiving
- Add the option to delete or remove existing sources from the dependencies page through the UI
- Keyed source is allowed to keep processing if refresh failed but a queued input is deleted
- Output records for deleted inputs are not deleted after reset all output
- Bad character in fixed width schema breaks file ingestion
- Source and output settings are not reverted after error during save
- Deleting Token from Source page throwing error message
- Manual attribute recalculation process waits on any pre-refresh process in the queue, creating circular wait in case of error
- Data Profile showing blank results for Rules
- Cluster Applied Objects page is not showing the source list when the Cluster is used as a process override cluster
- Mark Input as Failed if latest output retry is failed when multiple outputs are mapped
- Source filters error out with ' in search
- Output: changing Output Type from Virtual to File is throwing error message
- Delete group connection if group is deleted from project and doesn't exist in other projects
- Databricks Null Pointer error on Failure
- Hub View Name incorrect when creating source from connections page
- Output mapping non-primary relations not validating
- Do not allow mapping unmanaged external source to Output -> Source Mapping
- Do not allow Keep Current rules to be used with unmanaged source in relation/rule expression
- Schedule page is showing two Name columns
- Cloning a group sets hub_view_name to NULL for all new sources
- Searching for Agent name doesn't work but agent code does
- Process Configuration UI Search doesn't work
- Relation Expression can't save when casting part of expression from String to INT
- Enrichment parser does not allow writing enrichment using a relation out and into the same source
- CDC Failed from Driver Node reclaimed but source kept processing newer inputs and didn't queue them
- s_key using decimal column is calculated differently in regular CDC vs Reset_all_cdc, leading to data duplication
- Job Runs table is empty
- Adding source id to URL jumps to the right source but keeps the wrong project where it doesn't exist
- Input status is not updated when enrich/recalc or output query fails to generate
- Deleting source can get you to a null Project for the session
- Deadlock happening during processing
- Core failed to reconnect after Postgres shut down/restarted
- Get Date From File Name broken when using only date with no time component
- Deployment needs to be rerunnable by restarting deployment task only after failure
- Deployment container fails to rollback and restart Core, API, Agent, Usage Agent containers on failure
- Catch error if a valid Cron expression is saved that results in invalid date
- Reset all CDC hangs
- Cannot clear Generic JDBC password once it's been saved
- Extra slash in file path archive folder setting on watcher source creates infinite loop of errors
- A long source name causes logout button and source status to go off the screen
- CDC and Input Delete break after Raw Attribute with same name as Key Column and different data type is ingested
- Source settings parameters are not properly bolding or auto-expanding when non-default
- IPU is hidden in SaaS versions
- Remove reset options from triple-dot menu for inputs that failed on ingestion
- Admin user can't uncheck the Active flag on a non-admin user in the UI
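
The July 19, 2023 item about catching a valid Cron expression that results in an invalid date can be illustrated with a small check. This is a minimal sketch, not DataForge's implementation: it assumes a five-field cron expression (minute, hour, day of month, month, day of week), it only inspects plain numeric day and month fields, and the function names `max_day_in_month` and `cron_date_can_occur` are hypothetical.

```python
import calendar


def max_day_in_month(month: int) -> int:
    """Largest day the month can ever have (a leap year is used so Feb 29 counts)."""
    return calendar.monthrange(2024, month)[1]


def cron_date_can_occur(expr: str) -> bool:
    """Return False when the day-of-month/month pair can never match a real date.

    Assumption: five whitespace-separated fields. Wildcards, lists, ranges and
    steps are not analysed here and are simply treated as possible.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    day, month = fields[2], fields[3]
    if not (day.isdigit() and month.isdigit()):
        return True
    day_num, month_num = int(day), int(month)
    if not 1 <= month_num <= 12 or day_num < 1:
        return False
    return day_num <= max_day_in_month(month_num)


if __name__ == "__main__":
    for expr in ("0 0 30 2 *", "0 0 29 2 *", "0 0 31 4 *", "*/5 * * * *"):
        print(expr, "->", cron_date_can_occur(expr))
```

An expression such as `0 0 30 2 *` parses as valid cron syntax but names February 30, so a save handler could reject it up front instead of failing later when the next run date is computed.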
DataForge Version 6.2
- May 30, 2023
- Add log message to process logs when a watcher source failure triggers disable initiation to flip
- Input not updated to Success when output runs (process complete success)
- Keyed source allows later inputs to process when Refresh failed from driver node being shut down (spot availability)
- Reset Enrichment not an option when needed after failing from "other source Hub view does not exist" error
- Agent Restart Fails Ingestions but Retries don't run after the restart
- Fix deprecated image for Appstream
- Remove the use of ACLs from S3
- Increase Source Page load to 100 rows initially and increment in 100 row chunks
- April 6, 2023
- Core gets stuck on microservice restart
- Attribute Recalculation with Self-Relation in Rules on Multiple Sources triggers loop
- Can't save or update rule template with "null values detected" message
- March 24, 2023
- Import breaks due to not converting Output Column Mapping Expressions to lowercase column aliases during lowercasing process
- Import breaks due to lower-casing of column aliases but not converting Relation Expression column aliases to lowercase
- Self-related enrichment expression is not parsed correctly and returns wrong result
- Add user-agent tag to all databricks calls
- March 9, 2023
- Fixed attribute lowercasing conversion with additional hub table column name lowercasing. The bug resulted in new attributes being created during Refresh and null values. With this fix, a new data pull or reset corrects the hub table.
- Fixed Custom Post Output process failing by correcting missing parameters in request process.
- February 21, 2023
- Cleanup is running slow and not fully utilizing cluster capacity
- Dependency edit dialog is broken
- Deadlock during prc_proces_end during update of job_run table
- Queued Ingestion processes not able to be deleted, automatically failing processes associated with dead agents also not working
- Attribute recalc overrides hub table view definition of full refresh source with non-current input_id
- Clone import errors when tokens are used in relation expressions
- Output template view name does not convert ${GROUP} token dashes to underscores
- Relation and Rule templates continue spinning when they are invalid rather than giving an error
- Key Source: Changing the Key Columns values multiple times does not show "Save Changes and Reset CDC" popup option
- Usage-Agent errors out with Java heap space message
- Unable to save boolean values in service (system) configuration screen
- Input last_completed_process_type is not updated when process fails
- Increase wait time before Core deletes process if agent is unhealthy
- Private UI container results in Environment and Client parameters as "undefined" in UI
- Multipart sources throwing error in cleanup
- Import does not import post output commands option (post output notebook not imported)
- UI displays 2-3 redundant authentication prompts on startup
- Inactivating relation in source environment throws errors when importing to second environment (can't inactivate relation on import)
- Reset all CDC does not calculate record count
- Azure Spot with Fallback does not save
- Custom Ingestion Process stuck In Progress for hours and core restart didn't clear
- Full refresh sources have redundant key refresh attributes
- Prevent certain characters (/"&) in terraform db passwords OR escape them in deployment app/everywhere
- All Auth0 rules are "global"
- Grey out recalculate button if no passed CDC inputs on source
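
The February 21, 2023 item about the `${GROUP}` token describes dashes in the group name that should become underscores when the output template view name is rendered. The snippet below is a hedged sketch of that substitution only; the function name `render_view_name`, the template string, and the group value are hypothetical and do not come from DataForge itself.

```python
GROUP_TOKEN = "${GROUP}"


def render_view_name(template: str, group: str) -> str:
    """Substitute the ${GROUP} token, replacing dashes so the result is a safe view name.

    Only the token value is sanitised; the rest of the template is left untouched.
    """
    safe_group = group.replace("-", "_")
    return template.replace(GROUP_TOKEN, safe_group)


if __name__ == "__main__":
    # Hypothetical template and group name, for illustration only.
    print(render_view_name("sales_${GROUP}_view", "north-east"))
```

Sanitising only the token value, rather than the whole rendered name, keeps any characters the template author typed deliberately.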
DataForge Version 6.1
- January 12, 2023
- Relation validation throws intermittent error
- Unable to clone templated relation that uses enrichment template
- December 2, 2022
- Cross AWS Account S3 ingestion always gives 403 forbidden error
- Custom ingestion record count may be incorrect
- Globally-scoped processes keep retrying indefinitely when spark job fails
- Change hub table check wait time from geometric progression to constant
- November 11, 2022
- Multipart file ingestion creates input/process for every file in the directory
- Remove nullability schema check for sql server output connector
- Output is skipped with 0 records when job run is switched between refresh and output
- Manual reset enrichment on key source input may result in data loss
- SQL Server and Snowflake outputs do not delete temp table after error
- Timestamp sources not updating same time ranges / keeping old records
- Agent runs out of disk space in Azure when pulling large tables
- Icon links are missing from search-select component dropdowns
- Inactivated source continues scheduled ingestion and processing
- Agent log UI is slow, overloads DB and may crash API
- Changing tracking columns on source does not show reset CDC popup
- Delete Data generates error when hub table does not exist
- Effective Record Count higher than Record Count
- Relation Expression Validation Error covering whole page and blocking other attributes
- Import/Clone logs have disappeared from UI
- Watcher sources trying to ingest when no file is present (xml error after)
- Unlinking a source from Rules Templates page is throwing an error message
- SVC_Create and Update error trying to create/save dependency
- Source status doesn't match input statuses
- Hub View is not updated during schema check
- Core doesn't come back when Databricks api call fails
- Deadlocks between meta.prc_process_get_next (sparky) and meta.r_update_job_run (core)
- Disable ingestion didn't stop new input from running on schedule
- Failed cleanup keeps retrying indefinitely
- Reset all CDC on key source overwrites hub table with incorrect decimal types
- Source Data Deletes processing very slowly - potentially colliding with mini-sparky
- Graph Legend on Lineage doesn't show up on Azure - shows as a broken picture link
- Dead agent does not fail Inputs and keeps them permanently queued
- Heartbeat route throws error when Agent calls with invalid agent code
- Our config values do not override play API defaults
- Spark output connection None.get
- Self-Relation with the same name as a normal Relation breaks the ability to use Self-Relation
- Change name in output column adds new column when promoting and does not remove old column
- Source delete is allowed when active (Q or I) processes exist
- Ingestion breaks on raw attribute normalization when columns come in with _2 already on them
- Removing output columns from source environment that are still used in target environment breaks import
- Clearing Spark Conf value and saving the cluster config shows an error message
- Add count of enrichments/rules to rules page
- Inactive Rules are no longer highlighted in red on Enrichments list page
- Manual Refresh required to see new updated relations and rules in table
- Output column data type doesn't clear out
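
The December 2, 2022 item that changes the hub table check wait time from a geometric progression to a constant can be compared numerically. The sketch below is illustrative only: the starting wait, ratio, interval, and attempt count are assumed values, and the helper names are hypothetical rather than taken from the product.

```python
from typing import List


def geometric_waits(initial: float, ratio: float, attempts: int) -> List[float]:
    """Wait times that grow by a fixed ratio on every attempt."""
    return [initial * ratio ** i for i in range(attempts)]


def constant_waits(interval: float, attempts: int) -> List[float]:
    """Wait times that stay the same on every attempt."""
    return [interval] * attempts


if __name__ == "__main__":
    # Assumed values: 1 s doubling vs. a flat 5 s, six attempts each.
    geo = geometric_waits(1.0, 2.0, 6)
    const = constant_waits(5.0, 6)
    print("geometric:", geo, "total", sum(geo))
    print("constant: ", const, "total", sum(const))
```

With these assumed numbers the geometric schedule's last single wait (32 s) already exceeds the whole constant schedule (30 s), which shows why a constant interval keeps the worst-case delay between checks bounded.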
DataForge Version 6.0
- September 28, 2022
- r_update_job_run blows up during ingestion
- File Mask needs to have Inbox/ to find file correctly
- September 13, 2022
- Output query writes all enrichments as null
- Upgrade Spark JDBC driver used in API to address log4j vulnerability
- September 9, 2022
- Schema unlock error is not logged in pg logs and missing key info
- Link relation template to source blows up
- Rule Expression Validation Error covering whole page and blocking other attributes
- Export includes sources that have been loopbacks in the past
- Fix column widths on Sources table to stop cutting off data
- Rule Templates page must be refreshed manually after adding Source to see new Linked Source, and Updated By name is incorrect
- Show better error message in Databricks log when Hive query is incorrect
- Bulk reset CDC will fail when source schema varies
- Input stuck in workflow queue when hard dependency source (key) pulls in zero record input
- Output Failure Alerts not working
- Support link on Unauthorized page points to wrong DataForge support URL
- Enriched fields not being aliased in GROUP BY of aggregate output query
- Users are able to reset CDC and enrichment on single input that failed custom ingestion
- Remove redundant updates to job_run_status table
- Import process gets stuck in 'I' indefinitely when core restarts
- Import does not work right now for sources with custom notebook and no connection attached
- Reset all CDC fails after converting source Full->Key
- Can't use "Save and Reset All CDC" when changing source from Full to Keyed and only has zero record inputs
- Can't unlink rule template from sources when validation is auto generated
- Workflow is not checked when Cluster is terminated
- Cannot remove config for driver type on cluster config
- Source that has a rule created with a relation is always able to enqueue recalculate using the changed only button, even if nothing has changed
- If relation template points to source template, grey out sources in new linked source list that have no group assigned
- Rules tab no longer displays column headers with no rules
- UI is not showing error message for Relation Templates
- Input page must be refreshed after adding rule to see "Recalculate Change" option
- Space out Save, Save and Create Validation buttons
- Deadlock on Input Delete from Broadstreet
- Cancel button on Schedule Page does not work
- Output column name restrictions aren't followed when switching from Table to File type