6.2.0 Usability Enhancements and (BETA) Auto-Upgrade

  1. Lowercase Conversion of All Raw Attributes
  2. Source UI Navigation and Filtering
  3. (BETA) Auto-Upgrade
  4. SMTP Process Email Alerts for Azure and AWS
  5. Enhanced Source Triple-dot Menu Options
  6. Reprocess Options on Outputs
  7. Meta Monitor 2.0 and Cleanup
  8. Process Request Backoff on Request Limit Exceeded (AWS and Azure)

Please read the following notes about updated features in this release as there are functionality changes that require additional attention!

The process to upgrade to this version uses a new GitHub branch strategy but follows the usual Terraform steps, both updated in the AWS guide and Azure guide. However, there are new variables and Terraform version updates required for this upgrade to work. Please see the short list of changes required in the upgrade instructions here, under 6.2.0. Upgrade directly to the latest patch of 6.2.x instead of 6.2.0, but follow the same instructions linked here.
The Cleanup process is now managed with a standard schedule with the name of "Cleanup" (name is case-sensitive). This schedule name should never be changed. The schedule is defaulted to run every day at 12AM UTC, but users can adjust the day and time values as needed.
The Auto-Upgrade process is managed with a new standard schedule with the name of "Automatic Upgrade" (name is case-sensitive).  This schedule name should never be changed. This schedule is defaulted to run every Sunday at 12am UTC, but users can adjust the day and time values as needed to match the right upgrade window.
All Meta Monitor tables available in Databricks within the meta database are changing names. Any queries or notebooks referencing these tables should be updated to match the new table names. Please see the Meta Monitor 2.0 and Cleanup section below for the new table names and locations.

Lowercase Conversion of Raw Attributes

As of the 6.0.0 release of DataForge, any new raw attributes ingested are converted to lowercase column aliases. Due to this change, YAML imports will not work when promoting changes to sources with case-sensitive differences from 6.1.x to 6.2.x. This will continue to be the case moving forward for any sources created. However, sources created before the 6.0.0 upgrade may now have a mix of casing in their raw attributes from ingesting with the "Force Case Sensitive" toggle turned off.

This version release will complete the conversion of all raw attribute column aliases to lowercase to enforce consistency.  If the exact same lowercase version of the column being converted already exists in a separate attribute column alias, then the new column alias after conversion will be appended with "_2".
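As a rough illustration of that collision rule, here is a minimal Python sketch (not DataForge's actual implementation) that lowercases a list of aliases and appends "_2" when the lowercase value is already taken by another attribute, loosely mirroring the example tables below:

    def convert_aliases(aliases):
        converted = []
        taken = set()
        for alias in aliases:
            lowered = alias.lower()
            if lowered in taken:
                # The lowercase value already exists on a separate attribute,
                # so the converted alias gets "_2" appended.
                lowered = lowered + "_2"
            taken.add(lowered)
            converted.append(lowered)
        return converted

    print(convert_aliases(["Value", "value"]))  # ['value', 'value_2']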

Example of Attributes from Raw Schema Pre-6.2.0:

ID | Version first created | Column Name | Column Normalized | Data Type | Column Alias
1 | 5.2.0 | MyColumn | Value | string | MyColumn
2 | 6.0.0 | mycolumn | value | int | mycolumn

 

Example of Attributes from Raw Schema Post-6.2.0:

ID | Version first created | Column Name | Column Normalized | Data Type | Column Alias
1 | 5.2.0 | MyColumn | value | string | value
2 | 6.0.0 | mycolumn | value | int | value_2

Source UI Navigation and Filtering

There is a new and improved user interface for the Sources page, optimized for performance. Previously, if an environment had a few hundred sources or more, the Sources page could be slow to load and sometimes created a choppy landing experience. The UI has been refactored into a table with a "Load More" button at the bottom to load additional sources into the view as needed.


Users can enjoy advanced filtering on the Sources page by using the filter drop-down "Add Filter" on the top left of the page, selecting a filter type, and then using the drop-down to the right to select the filter value.  Filter options have been added for Agent, Schedule, and Token.


Users now have the ability to categorize sources using Tokens and filter Sources by token on the Sources page to easily navigate!  The Token filter option will provide all of the key:value pairs available to filter on.  There is an optional search bar at the top of the filter value selection to narrow down the list.


(BETA) Auto-Upgrade

WARNING: This is a Beta feature, and we do not recommend turning this setting on in any Production environments without first consulting support.

We encourage anyone comfortable with trying this feature out to enable it in any Development or Sandbox environments.  Please share your feedback with the DataForge team on this new feature!

Auto-upgrade is a new feature in DataForge with this release that allows users to automatically upgrade environments to the latest Minor and Patch versions as they become available. Major versions will continue to use the same manual Terraform upgrade process found here for AWS or here for Azure.

Every environment will have a setting called "auto-upgrade-enabled" in the Service Configuration page of DataForge that will be defaulted to False or Off.  This means the environment will not run the auto-upgrade process.  To enable auto-upgrades, users need to edit this parameter in the Service Configurations and set it to True or On.  

Every environment will also have a new schedule called "Automatic Upgrade". The name of this schedule should not be changed, as the process relies on it to determine an acceptable window for completing any available upgrades. Users can edit this schedule like any other schedule by adjusting the cron values and saving the changes. Ideally, set the Automatic Upgrade schedule to a window when no scheduled processes are due to run. At the beginning of the upgrade window, DataForge will pause all new ingestions and wait for existing processes to finish before starting an upgrade. The upgrade window lasts for one hour. If active processing has not finished by the end of the hour, DataForge will resume new ingestions and wait for the next upgrade window to start again.

Example upgrade schedule set to run upgrades between 12AM UTC and 1AM UTC every day
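For reference, the cron values behind these schedules can be sanity-checked outside the UI. The sketch below assumes standard five-field cron syntax (the exact fields shown in the DataForge schedule editor may differ) and uses the croniter library to confirm when the default weekly window and the daily example above would fire:

    from datetime import datetime
    from croniter import croniter  # pip install croniter

    weekly = croniter("0 0 * * 0", datetime(2024, 1, 1))  # every Sunday at 12AM UTC (default)
    daily = croniter("0 0 * * *", datetime(2024, 1, 1))   # every day at 12AM UTC (example above)

    print(weekly.get_next(datetime))  # 2024-01-07 00:00:00 (a Sunday)
    print(daily.get_next(datetime))   # 2024-01-02 00:00:00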

A bell icon will display in the top-left corner of the UI next to the main menu if there is an upgrade available, an upgrade is in progress, or an upgrade failed to complete.  Click the bell icon to view more details.  In the event of an upgrade failure (orange bell color), please reach out to DataForge Support to troubleshoot what went wrong.


 

SMTP Process Email Alerts for Azure and AWS

DataForge now supports email alert configuration for Azure and AWS environments through SMTP using Sendgrid.  AWS users using SNS for alerts may continue doing so as this is an additional option for configuration.  For more detail or setup instructions, please refer to the SMTP Alerts Guide.

Please note it is a known issue that when a cluster fails to launch, an alert is currently not sent. An example of when this can happen is if AWS or Azure does not have enough instance capacity at the time. We will be enhancing this feature in a later release to accommodate this occurrence.

Success or Failure Alerts can be sent for the following events: Ingestion, Data Process (Parse, CDC, Enrichment, Refresh, Recalculate), and Output.

SNS ARNs should be used for any alert parameter ending with "Topics" in AWS environments.  Email addresses should be entered directly in any alert parameter ending with "emails". 

Azure environments will only see parameters for emails.


Alert parameters for Source (ingestion or data process)

 


Alert parameters for Output

Enhanced Source Triple-dot Menu Options

Users can now navigate directly to mapped Outputs from a Source by using the source's triple-dot menu options, in the same menu where users select Pull Data Now. The output will open in a new tab when the output name is clicked.


Navigating to the Outputs a source is mapped to

Users also now have more options when using the source-level triple-dot menu to Reset Output. This option can now be used to reset output for all Outputs the source is mapped to (All Channels option), or to reset output for an individual Output. The individual outputs will be shown as Output Name - Channel Name. This is still the equivalent of resetting output for all inputs the source contains, but it can now be targeted.


Resetting Source Output for a specific Output

 

Reprocess Options on Outputs

Users now have the ability to reset output for a channel directly from the Output mapping screen by clicking on the triple-dot menu to the right of the channel name and selecting Reset Channel Output.  This is the equivalent of Reset Output for all Inputs from the channel/source. This gives users the ability to update output column mappings and reprocess Output all from the same page!


Reset Output for the channel directly from Output mapping

Users can also Reset All processes directly from the Output mapping screen through the same triple-dot menu on the channel/source.  When selecting the Reset Source option, users will be given the same options available from the Source Inputs screen such as Recalculate Changed/All and Pull Data Now.


Re-process options available from Output Channel

Meta Monitor 2.0 and Cleanup

Cleanup has now been combined with the Meta Monitor process and will run more efficiently due to the improvements made. We have optimized performance by leveraging a partitioned delta table for process history. As a result, Meta Monitor and Cleanup are expected to run several times faster than in previous versions.

Cleanup schedules will now be maintained on the Schedules page rather than in the Service Configurations. A default cleanup schedule with the name "Cleanup" will be created in every environment as part of the upgrade and will be set to run nightly at 12AM UTC. Do not change the name of this schedule, as it needs to keep the same name and spelling for the process to recognize the schedule and launch. To adjust the run times, open the Cleanup schedule as you would a normal schedule, change the cron values to adjust the start times, and save the changes. Any schedule change takes effect after the next cleanup process run completes or when the Core service is restarted.


Default Cleanup Schedule

A cleanup-concurrency-multiplier parameter has been added to the Service Configurations page to be used for tuning the cleanup process.  Generally, it is not recommended to change this value from the default unless necessary or if the DataForge Support team suggests a change.  

As mentioned, the table names produced by the Meta Monitor process that are available in the Databricks meta database have changed. Any notebooks or queries referencing these tables need to be updated with the new table names to avoid issues! Note that this is separate from querying the Postgres database directly. Below is a list of the previously available Databricks table names along with the new names they have been given.

Previous Databricks Table Name | New Databricks Table Name
meta.pg_input | meta.input
meta.pg_process | meta.process
meta.pg_source | meta.source
meta.pg_output | meta.output
meta.process_history | meta.process_history_v2
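For example, a notebook cell that previously queried one of the old table names would simply swap in the new name. Below is a minimal PySpark sketch (in a Databricks notebook the spark session is already defined; the query itself is only illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # already available in Databricks notebooks

    # Before 6.2.0:
    # df = spark.sql("SELECT * FROM meta.pg_process")

    # After 6.2.0:
    df = spark.sql("SELECT * FROM meta.process")
    df.show()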

For a full list of the tables available after upgrading to 6.2.0, navigate to Databricks and open the Data tab from the Menu.  Click the hive_metastore catalog and the meta database to see the tables available.


Process Request Backoff for Request Limit Exceeded (AWS and Azure)

On occasion, there may be instances where processes fail with a Databricks error of "Request Limit Exceeded".  This indicates that AWS or Azure was not able to fulfill the request for the instance type requested by the cluster launching in the provider Zone. 

To alleviate issues when this occurs, we have added a backoff strategy to DataForge job runs to better handle this error. When this error is returned, DataForge will wait for a short period of time before retrying the request. This approach is recommended by both Azure and AWS to reduce pressure on the request rate limits these vendors have in place. By briefly delaying further requests, there is additional time for the request token buckets to refill before they are drawn on again. For more information, please visit the AWS Request Throttling or Azure Request Limit documentation.
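Conceptually, the behavior resembles the hypothetical Python sketch below. The submit_run callable, the error type, and the intervals are illustrative stand-ins based on the changelog description, not DataForge's internal code:

    import random
    import time

    def launch_with_backoff(submit_run, max_retries=3):
        """Retry a job launch when the provider reports REQUEST_LIMIT_EXCEEDED."""
        # A small random delay before launching spreads out bursts of requests.
        time.sleep(random.uniform(0.5, 1.0))
        for attempt in range(max_retries + 1):
            try:
                return submit_run()
            except RuntimeError as err:
                if "REQUEST_LIMIT_EXCEEDED" not in str(err) or attempt == max_retries:
                    raise
                # Wait progressively longer (3, 6, 9, ... minutes) so the
                # provider's request token bucket has time to refill.
                time.sleep((attempt + 1) * 180)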

Full Changelog

  • Lowercase all raw attribute aliases and update all references
    Lower-case all legacy pre-6.0 raw attribute aliases and update all references
  • Move deploy schema/table creation and functions to Deployment repo
    Deployment app updates the deploy schema in database before running the main deploy
  • Update AWS Databricks security group
    Databricks security group in AWS now allows inbound traffic from itself on all ports for UDP and TCP
  • Add lifecycle ignore for sku and scale for Azure Bastion host
    Azure Bastion scale units and sku can be changed in Azure portal without being reverted next time terraform runs
  • Add upgrade status UI display + manual upgrade check and core restart
    New UI features:
    - bell icon displayed next to top level navigation menu with details about available new version, pending auto-upgrade or upgrade issues
    - ability to manually check for upgrade, restart core and api services from the service configurations page
  • Meta Monitor 2.0
    Consolidated legacy meta_monitor_refresh process with cleanup.
    Optimized performance by leveraging partitioned delta table for process history.
    Exposed live databricks views for several popular postgres metadata tables
  • Improve UI performance when environments have >1000 sources
    Refactor sources list (UI entry point) to improve performance and enable better UX (fast searching and sorting by extended list of attributes)
  • Build SMTP Alerts for Azure
    Email alerting is now available in AWS and Azure. Any SMTP provider can be configured in the platform via Terraform, as long as it has a public host and username and password authentication
  • Add backoff timeout with Jitter when job fails due to Insufficient Capacity Issues
    This fix will minimize the occurrence of cluster launch errors with the "REQUEST_LIMIT_EXCEEDED" error detail. It adds a random 0.5-1 second back-off during new job run launch, preventing the cloud API from being overloaded. It also checks failed job error details for the "REQUEST_LIMIT_EXCEEDED" message and retries failed processes at 3, 6, 9... minute intervals, up to the retry count parameter value specified for the source
  • Build variable in Terraform to allow customers to ignore Appstream
    The appStreamEnabled variable allows customers to control AppStream resource deployment in AWS environments. Valid values are yes/no.
  • Change mini sparky and data viewer to use spot with fallback cluster instead of sparky-pool
    Mini sparky and data viewer are no longer connected to the default sparky pool and use spot with fallback to provision instances
  • Prompt users to be sure they want to delete input for input deletes
  • Update Deployment App for Upgrades
    Deployment App now uses a separate version than the platform version and is started by Core. It will read a version from the secrets manager (manual upgrade) or deploy.upgrade table (automated) and download artifacts to do the deployment
  • Update TF to support Upgrades
    Terraform has been updated to support auto-upgrade process. imageVersion variable has been deprecated and replaced with manualUpgradeVersion. releaseUrl and deploymentToken have been added as variables
  • Build Upgrade API components (lambda functions)
    API components for new DataForge automatic upgrade infrastructure
  • Build Release database components
    Database components for new DataForge automatic upgrade infrastructure
  • Create Release API project in TF and deployment pipeline
    Added Infrastructure components for the auto upgrade feature into terraform modules where it can be deployed into clients' accounts.
  • Add diagnostics to Azure flexible server to get postgres logs in Log Analytics workspace
    Added the ability to query Postgres logs in the Log Analytics workspace in Azure. You will be able to access the logs using this query: AzureDiagnostics | where Category == "PostgreSQLLogs"
  • Incorporate Purge Protection variable into Terraform
    Added the ability to enable purge protection on a key vault via a variable in terraform.
  • Add ability to filter by tokens on Sources page
    New feature added to the new source list page: allows filtering sources by token name-value pair
  • Create variable in Terraform to let users switch between spot and on-demand for sparky-pool
    The sparkyPoolAvailability variable can be used in Terraform to switch sparky-pool between spot and on-demand. For AWS, the values are "SPOT" and "ON_DEMAND". For Azure, the values are "SPOT_AZURE" and "ON_DEMAND_AZURE". Once Terraform is run, the deployment container must be run on the current version as a manual version upgrade to update the cluster configurations with the new pool ID.
  • Prevent empty string variables in Terraform
    Empty string variables no longer allowed in Terraform
  • Create DataForge engineering/support read and read-write roles in Azure
    There are three new roles in Azure to restrict users' permissions in a resource group. The new roles are named <environment>read-only<client>, <environment>read-write<client>, and <environment>billing<client>.
  • Add button on mapping screen to recalculate all and reset all output per channel.
    Added several source reset options to output mapping screen, allowing users to initiate reset without needing to navigate away.
  • VPC Logs should be optional in terraform
    Added the ability to destroy VPC logs via a variable in terraform. This variable is called removeVpcLogs and the expected values are yes/no
