Table of Contents
Connect Row Zero to Databricks
Row Zero supports all cloud platforms (AWS, Azure, and GCP). The following instructions cover connecting Row Zero to a Databricks Lakehouse through two different methods of authentication and how to export your spreadsheet to Databricks:
Personal Access Token
In Row Zero click on the 'connections' icon in the upper right-hand corner and then click the button to '+ Add connection.'
The connection requires 4 pieces of information defined below.
- Name - This field can be anything that helps identify the data source being connected. (e.g. "Production Data" or "Product Metrics")
- Server hostname - This is the Databricks warehouse name, which can be found in your Databricks portal under the SQL Warehouses menu. In SQL Warehouses, click on the warehouse you wish to connect to. Your server hostname is listed under 'Connection details' and is the long string of letters and numbers under "server hostname" at the top of the screen (e.g. cbd-9hfliu9u-83jd.cloud.databricks.com).
- Warehouse ID - Warehouse ID is under the SQL Warehouses, Overview, Name (e.g. 0923idj093jd9j11)
- Access Token - Your access token can be generated by going to your account icon in the upper right-hand corner of the Databricks portal, selecting 'Settings' and then 'Developer.' Click the blue button to generate your access token. If you do not see a blue button, contact your Databricks account admin as they will need to update your Databricks personal access token permissions
Once all the information is entered, hit 'Test connection' to ensure the information is correct. If the test returns green, you can click '+ Add Source.' If the test returns red, check your connection credentials and try again.
Databricks OAuth
To connect to Databricks using OAuth, a Databricks Admin at your company will need to pre-configure the connection. Instructions for OAuth setup at your company can be found in our Databricks OAuth page. After your Databricks and Row Zero accounts have been configured to use Databricks OAuth, you can use the following instructions to login.
In the 'connections' tab, click on 'Databricks OAuth'
Then enter your Warehouse Id, which can be found under the SQL Warehouses, Overview, Name (e.g. 0923idj093jd9j11).
After entering your Warehouse Id and hitting 'Connect to Databricks', you will be asked to enter your Databricks username and password.
Click 'Sign-in' and you will see a black page saying 'Succeeded. You can close this window.' Your Databricks OAuth sign-in is complete and you can now return to your original tab or window and begin importing data from Databricks to a spreadsheet in Row Zero.
Write a query to import data
Now that Databricks is connected to Row Zero, write a SQL query to pull data into the spreadsheet. First select the 'Catalog' and 'Schema' from the two drop downs. Write a query in the query editor. The easiest query to write is the 'select *' statement, which pulls in the entire table. Example:
select * from TABLENAME
Row Zero is equipped to handle large data sets and will easily ingest the entirety of a table. Write more complicated queries to pull in various subsets of data stored in Databricks. All results will be displayed in the 'connected data table' in the spreadsheet.
Save Your Spreadsheet to Databricks
With a connected Databricks account, you can export data sets from Row Zero to your cloud data warehouse. Once you've defined the data you would like to save as a table in Databricks, right click on the data table or selected range and navigate to 'Export to > Databricks' in the context menu.
The export to Databricks modal will pop and ask you to select the database, schema, and specify a table name. New table names are forced to start with 'rz_' to avoid overwriting any tables already in the Databricks cloud data warehouse. Hit export and Row Zero will create a new table in Databricks and import all the data from your selected spreadsheet range or table. When the export is complete, you will see a notification in the upper right-hand corner.
To verify your table is in Databricks, you can use either of the commands below or simply write a query against it.
show tables
select * from rz_TABLENAME
You're connected to the most powerful spreadsheet!
Now that Row Zero is connected to Databricks, you can easily import Databricks data to a Row Zero spreadsheet. Row Zero is powerful enough to pull in entire Databricks tables, so you can create live pivot tables, build models, and run your favorite spreadsheet fuctions on top of your Databricks data without writing complex SQL queries. Easily adjust your query or refresh to get the latest data from Databricks without overwriting your work.