
At the Databricks Data and AI Summit in June, Databricks announced the general availability of Databricks Apps. In this article, we will walk through what it takes to build a chess app that launches inside Databricks.
However, before we dive into the fun, we need to understand what Databricks Apps is and whether it makes sense to start migrating your Vercel app to it.
Bringing security and the app to your data
One of the most important features of Databricks Apps is that it brings the app to where your data lives. While enterprises go to great lengths to secure their data, leaking valuable data through an application or a chatbot is a risk that is easily overlooked. The reason is that developers usually get access to all the data, develop logic for various scenarios, and release the same feature to production. In today’s fast-paced development world, properly setting up authentication and authorization tends to be the last thing developers want to deal with, especially when there are many kinds of permissions to handle, such as data security, endpoint security, and agent security, each of which is managed differently.
Databricks Apps is a security-first application platform
As illustrated in the diagram above, Databricks Apps is backed by Unity Catalog (UC), which already securely governs all the structured and unstructured data within the Lakehouse. Now UC governs Apps as well, so developers no longer need to maintain a separate set of permissions.
Permissions deep dive
Databricks Apps provides two forms of permissions. The first is app authorization, where the app acts under its own identity and its access is granted through the standard permission model. The second is user authorization, where the app acts on behalf of the signed-in user, with access granted by that user through the app interface.
App authorization
When you create an app, Databricks automatically creates a service principal for it. A service principal is similar to a service account, but it is tied to a Databricks workspace. You can find these credentials under Workspace settings > Identity and access:
As with a service account, every user of the app gets the same access through it. It is therefore recommended to use it only for shared tasks, such as reading metadata like table lineage.
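As a minimal sketch of app authorization, assume the app runtime injects the service principal's credentials as the environment variables `DATABRICKS_CLIENT_ID` and `DATABRICKS_CLIENT_SECRET` (verify the exact names against the Databricks Apps documentation for your workspace). Under that assumption, the Databricks SDK for Python can authenticate as the app with no secrets in code. The helper names `missing_app_credentials` and `app_client` are hypothetical.

```python
import os
from typing import Mapping

# Assumed names of the variables injected for the app's service principal.
APP_CREDENTIAL_VARS = ("DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def missing_app_credentials(env: Mapping[str, str]) -> list[str]:
    """Return the assumed credential variables that are absent or empty."""
    return [name for name in APP_CREDENTIAL_VARS if not env.get(name)]


def app_client():
    """Build an SDK client that acts as the app's service principal."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from databricks.sdk import WorkspaceClient

    missing = missing_app_credentials(os.environ)
    if missing:
        raise RuntimeError(f"Missing app credentials: {', '.join(missing)}")
    # With no arguments, the SDK resolves credentials from the environment.
    return WorkspaceClient()
```

Inside a running app, `app_client().current_user.me()` should return the identity of the service principal, which is a quick way to confirm which identity shared tasks run under.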
User authorization
Databricks Apps also creates an OAuth2 app client ID and client secret that allow the app to access resources on behalf of a user.
This is most useful for data security, where each user is restricted to the information they are permitted to see.
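A minimal sketch of user authorization follows. It assumes the platform forwards the signed-in user's access token to the app in the `x-forwarded-access-token` request header; check this against the current Databricks Apps documentation before relying on it. The helper names `user_token_from_headers` and `user_client` are hypothetical.

```python
from typing import Mapping, Optional

# Assumed header carrying the signed-in user's access token.
USER_TOKEN_HEADER = "x-forwarded-access-token"


def user_token_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the forwarded user token, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == USER_TOKEN_HEADER and value.strip():
            return value.strip()
    return None


def user_client(headers: Mapping[str, str], host: str):
    """Build an SDK client that acts on behalf of the requesting user."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from databricks.sdk import WorkspaceClient

    token = user_token_from_headers(headers)
    if token is None:
        raise PermissionError("No user token was forwarded with this request")
    # auth_type="pat" makes the SDK use this token rather than the app's own
    # service principal credentials found in the environment.
    return WorkspaceClient(host=host, token=token, auth_type="pat")
```

Queries issued through a client built this way run with the user's own permissions, so Unity Catalog restrictions on that user apply to what the app can read.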
The OAuth2 workflow can be difficult to understand, as illustrated in the diagram below:
Explaining the OAuth workflow is beyond the scope of this article. The good news is that the Databricks SDK takes care of the workflow for you automatically.
The link below explains the SDK in great detail.
https://docs.databricks.com/aws/en/dev-tools/sdk-python
For example, we can use the SDK to query a GenAI endpoint.
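The following is a minimal sketch of such a query, not the author's original listing. It assumes a chat-style model serving endpoint; the endpoint name is passed in by the caller, and `build_move_prompt` and `ask_endpoint` are hypothetical helpers written for the chess scenario.

```python
def build_move_prompt(fen: str) -> str:
    """Build a chat prompt asking for the next move from a FEN position."""
    return (
        "You are a chess engine. Given this position in FEN notation, "
        f"reply with the best next move in UCI format only: {fen}"
    )


def ask_endpoint(endpoint_name: str, fen: str) -> str:
    """Send the prompt to a chat serving endpoint and return the reply text."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

    client = WorkspaceClient()  # credentials are resolved from the environment
    response = client.serving_endpoints.query(
        name=endpoint_name,
        messages=[
            ChatMessage(role=ChatMessageRole.USER, content=build_move_prompt(fen))
        ],
        max_tokens=16,
    )
    return response.choices[0].message.content
```

Keeping prompt construction separate from the network call makes the prompt easy to test locally, while the SDK call itself only runs where workspace credentials are available.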
Apps Runtime
Now that we have covered one of the most important aspects, security, it’s time to develop the actual application. Unlike open platforms, Databricks Apps does not let developers choose their own runtime environment, so it’s critical to develop the backend with the runtime in mind and ensure the frontend code is compatible with it.
The following describes the system environment in which your Databricks app runs:
- Operating system: Ubuntu 22.04 LTS
- Python environment: Python 3.11, running in a dedicated virtual environment. All dependencies are isolated within this environment, including libraries defined in requirements.txt and pre-installed libraries.
- Node.js environment: Node.js version 22.16 for apps developed using JavaScript frameworks. Manage dependencies using npm and package.json.
- System resources: Each app can use up to 2 virtual CPUs (vCPUs) and 6 GB of memory. If your app exceeds these limits, Databricks might restart it.
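Because the runtime is fixed, it can help to check at startup that the app's assumptions match it. The sketch below uses the Python version and memory limit listed above; the function name `runtime_warnings` and the idea of passing in an estimated memory footprint are illustrative assumptions, not part of Databricks Apps.

```python
import sys
from typing import Sequence

EXPECTED_PYTHON = (3, 11)  # Python version listed for the app runtime
MEMORY_LIMIT_GB = 6        # memory limit listed for each app


def runtime_warnings(python_version: Sequence[int], estimated_memory_gb: float) -> list[str]:
    """Return human-readable warnings where the app's assumptions differ from the runtime."""
    warnings = []
    major_minor = tuple(python_version[:2])
    if major_minor != EXPECTED_PYTHON:
        warnings.append(
            f"Python {major_minor[0]}.{major_minor[1]} in use, "
            f"but the runtime provides {EXPECTED_PYTHON[0]}.{EXPECTED_PYTHON[1]}"
        )
    if estimated_memory_gb > MEMORY_LIMIT_GB:
        warnings.append(
            f"Estimated memory {estimated_memory_gb} GB exceeds the {MEMORY_LIMIT_GB} GB limit"
        )
    return warnings


# Example: check the current interpreter against the listed runtime.
for message in runtime_warnings(sys.version_info, estimated_memory_gb=1.5):
    print(f"WARNING: {message}")
```

Running the same check locally and in the deployed app is a cheap way to catch version drift before it shows up as a dependency error.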
Endpoint security
In a traditional web development framework, API endpoints are not secured by default. In other words, additional security work is required if we want to safeguard a sensitive API endpoint, for example /employee_salary.
Databricks Apps is designed so that endpoints whose path starts with the /api prefix are open without an auth token. Anything without this prefix requires a token to access.
⭐️ Tip: Use /api/endpoint for general backend access, for example to request the next best move.
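As a rough sketch of a backend route under /api, the example below uses only the standard-library WSGI interface so it runs anywhere. The route `/api/next_best_move`, the `choose_move` placeholder, and the response shape are all hypothetical; a real app would more likely use a framework such as Flask or FastAPI and call a chess engine or model.

```python
import json
from urllib.parse import parse_qs

JSON_HEADERS = [("Content-Type", "application/json")]


def choose_move(fen: str) -> str:
    """Placeholder move picker; a real app would call a chess engine or model."""
    return "e2e4"


def app(environ, start_response):
    """Tiny WSGI app exposing GET /api/next_best_move?fen=<position>."""
    path = environ.get("PATH_INFO", "")
    if path != "/api/next_best_move":
        start_response("404 Not Found", JSON_HEADERS)
        return [json.dumps({"error": "not found"}).encode("utf-8")]

    query = parse_qs(environ.get("QUERY_STRING", ""))
    fen = query.get("fen", [""])[0]
    if not fen:
        start_response("400 Bad Request", JSON_HEADERS)
        return [json.dumps({"error": "missing fen parameter"}).encode("utf-8")]

    body = json.dumps({"fen": fen, "move": choose_move(fen)}).encode("utf-8")
    start_response("200 OK", JSON_HEADERS)
    return [body]
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library.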
Putting it all together
In this article I intentionally worked around the pre-built templates in Databricks Apps to leave room for imagination. There are no restrictions on whether you deploy a Streamlit app or a chess app. Think of it as your Vercel or DigitalOcean environment. Databricks takes care of the security model for us using Unity Catalog and the principle of least privilege. That way, security is in place from the start rather than being deferred to phase 2 of development. It is indeed a smart move. With that in mind, we can now start migrating our MEAN stack, MERN stack, or Python stack.

Author
Jason Yip
Director of Data and AI, Tredence Inc.