Follow the steps in this guide, in order, to use the DataForge Terraform Quick Start tool to create the resources and infrastructure required for a new DataForge Workspace.
Google Chrome is the supported browser for DataForge. Pages may not load correctly in unsupported browsers.
Databricks
- Create a Databricks Account. You can sign up for a free trial at https://www.databricks.com/try-databricks#account
GitHub
- Create a GitHub account if you do not already have one, and follow the signup instructions to confirm it. The Free account will work for this quick start: https://github.com/signup
- Once you're signed in to GitHub, open the DataForgeLabs Terraform Module Examples repo.
- Click the Fork option near the top-right, then click the Create Fork button to fork the repo into your own account.
- The forked repository contains an "aws" directory with two files, "main.tf" and "variables.tf", that define the default variables the quick start tool needs. You can change these files in two ways:
  - Switch the authentication from databricks_account_user/databricks_account_password to databricks_client_id/databricks_client_secret/databricks_workspace_admin_email
  - Customize the network Databricks will be deployed to using the remaining Optional Inputs from the guide
- Make either of these changes by replacing the variables in the "main.tf" and "variables.tf" files within the directory. Both files must list the same variables. Use the following guide to update your directory files with the appropriate variables: https://registry.terraform.io/modules/dataforgelabs/aws-databricks/dataforge/latest?tab=inputs
- Make sure the "main.tf" and "variables.tf" files contain the required inputs along with any other variables you want to define, then commit the changes to your repository.
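Because both files must list the same variables, a quick local check before committing can catch a mismatch. The following is a minimal sketch and not part of the DataForge tooling. It assumes that "variables.tf" declares inputs with `variable "name"` blocks and that "main.tf" passes them on as `var.name` references, which is the usual Terraform convention but has not been checked against the example repo. The function name and the sample file contents are hypothetical.

```python
import re

# Matches declarations such as: variable "environment_prefix" {
DECLARED = re.compile(r'^\s*variable\s+"([A-Za-z_][A-Za-z0-9_-]*)"', re.MULTILINE)
# Matches references such as: var.environment_prefix
REFERENCED = re.compile(r'\bvar\.([A-Za-z_][A-Za-z0-9_-]*)')


def compare_variables(main_tf: str, variables_tf: str) -> tuple[set[str], set[str]]:
    """Return (referenced but not declared, declared but not referenced)."""
    declared = set(DECLARED.findall(variables_tf))
    referenced = set(REFERENCED.findall(main_tf))
    return referenced - declared, declared - referenced


# Hypothetical file contents, used only to illustrate the check.
sample_main = """
module "example" {
  environment_prefix = var.environment_prefix
}
"""
sample_vars = """
variable "environment_prefix" {}
variable "databricks_client_id" {}
"""

undeclared, unused = compare_variables(sample_main, sample_vars)
print("Referenced in main.tf but not declared:", sorted(undeclared))
print("Declared in variables.tf but not referenced:", sorted(unused))
```

To run this against the fork, read the two files from the "aws" directory and pass their text to `compare_variables`. Both returned sets should be empty before you commit.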
Terraform
- Sign up for a Terraform Account if you do not already have one and confirm the new account through the email confirmation that is sent: https://app.terraform.io/public/signup/account
- Once you are signed in to Terraform, create a new Organization. You can give the organization any name that follows the Terraform creation guidelines listed.
- Create a new Workspace in Terraform:
  - Choose the Version Control Workflow, since the workspace needs to sync with your GitHub account.
  - Select GitHub, then GitHub.com, as the version control provider.
  - In the popup window that appears, select the Authorize Terraform Cloud button. If the popup window does not appear, you may need to adjust your browser settings to allow popups.
  - In the next popup that appears, select the Install button. If you instead see a Terraform page with a spinning icon and the text "GitHub App Installation", the popup may have opened behind another window, so look for it among your open windows.
  - You should now be on a screen to choose your repository. Select the repository that you forked from the DataForge Terraform Module Examples repo in the GitHub steps.
  - Expand the "Additional Options" section and enter "aws" into the Terraform Working Directory, then select the Create option at the bottom of the page.
- Terraform will show a Configure Terraform Variables page, automatically listing the variables needed. Enter the value of each variable from your AWS or Databricks accounts. Use the Inputs Guide to read about where to find the value of each variable. For environment_prefix, use only alphanumeric characters and dashes (underscores will cause failures). After entering the values, select the Save Variables option.
- Select the Start Run button, optionally give the run a name like "dataforge quickstart", and leave the Run Type as "Plan and apply (standard)". Select the Start button.
- When the Plan stage is complete, you will see a green checkmark and the message "Plan Finished". Scroll to the bottom of the page and select the Apply button so that Terraform creates all the resources in your cloud environment.
- When the Apply stage in Terraform is complete, you should see a green checkmark and the message "Apply Complete". Copy the Databricks "workspace_url" and the "instance_profile_arn" to be used in the next and final section of steps. These two values can also be found in the Overview -> Outputs section if needed.
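The naming rule for environment_prefix given in the variables step (alphanumeric characters and dashes only, no underscores) can be checked before the value is saved, which avoids a failed run. This is a minimal sketch. The function name is hypothetical, and the pattern encodes only the rule stated in this guide, so any further limits the module may impose, such as length or letter case, are not covered.

```python
import re

# Only letters, digits, and dashes, per the rule stated in this guide.
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_environment_prefix(prefix: str) -> bool:
    """Return True if the prefix uses only alphanumeric characters and dashes."""
    return PREFIX_PATTERN.fullmatch(prefix) is not None


# Hypothetical candidate values, used only to illustrate the check.
for candidate in ["dataforge-dev1", "dataforge_dev1"]:
    status = "ok" if is_valid_environment_prefix(candidate) else "rejected"
    print(f"{candidate}: {status}")
```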
Databricks
- Open the new Databricks Workspace URL that was copied from the previous step in Terraform.
- Open the drop-down marked with your initial in the top-right corner and select Settings. Databricks Example here.
- Select Security on the sub-menu and click the "Manage" option next to Instance Profiles.
- Click the "Add instance profile" option, paste the instance_profile_arn value copied from Terraform into the "Instance Profile ARN" text box, and click the Add button.
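Before pasting the value into Databricks, it can help to confirm that what was copied from Terraform has the shape of an instance profile ARN rather than, for example, a role ARN. This sketch assumes the standard AWS IAM form `arn:<partition>:iam::<12-digit account id>:instance-profile/<name>`. The function name and the sample ARNs are hypothetical.

```python
import re

# Standard IAM instance profile ARN shape; the name may include a path.
ARN_PATTERN = re.compile(
    r"arn:(aws|aws-us-gov|aws-cn):iam::\d{12}:instance-profile/[\w+=,.@/-]+",
    re.ASCII,
)


def looks_like_instance_profile_arn(value: str) -> bool:
    """Return True if the value is shaped like an IAM instance profile ARN."""
    # Strip stray whitespace that copying from a web page can introduce.
    return ARN_PATTERN.fullmatch(value.strip()) is not None


# Hypothetical values, used only to illustrate the check.
samples = [
    "arn:aws:iam::123456789012:instance-profile/example-profile",
    "arn:aws:iam::123456789012:role/example-role",
]
for sample in samples:
    print(sample, looks_like_instance_profile_arn(sample))
```

The check only confirms the format. It does not verify that the instance profile exists in your AWS account.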
You have now finished the DataForge Terraform Quick Start, and all of the resources and infrastructure needed to request a new DataForge Workspace have been created. A new Databricks Workspace should exist in your Databricks account for you to use. You will need a Databricks Personal Access Token to enter into the DataForge Workspace Request.
Please return to the New DataForge Workspace Creation form to finish your setup.
If issues arise or you need additional help, please open a support request with the DataForge team, and a team member will help you get the Quick Start working.