Lakebase plugin
Provides a PostgreSQL connection pool for Databricks Lakebase Autoscaling with automatic OAuth token refresh.
Key features:
- Standard
pg.Poolcompatible with any PostgreSQL library or ORM - Automatic OAuth token refresh (1-hour tokens, 2-minute refresh buffer)
- Token caching to minimize API calls
- Built-in OpenTelemetry instrumentation (query duration, pool connections, token refresh)
- AppKit logger configured by default for query and connection events
Getting started with the Lakebase
The easiest way to get started with the Lakebase plugin is to use the Databricks CLI to create a new Databricks app with AppKit installed and the Lakebase plugin.
Prerequisites
- Node.js v22+ environment with
npm - Databricks CLI (v1.0.0 or higher): install and configure it according to the official tutorial.
- A new Databricks app with AppKit installed. See Bootstrap a new Databricks app for more details.
Steps
- Firstly, create a new Lakebase Postgres Autoscaling project according to the Get started documentation.
- To add the Lakebase plugin to your project, run the
databricks apps initcommand and interactively select the Lakebase plugin. The CLI will guide you through picking a Lakebase project, branch, and database.- When asked, select Yes to deploy the app to Databricks Apps right after its creation.
Basic usage
import { createApp, lakebase, server } from "@databricks/appkit";
await createApp({
plugins: [server(), lakebase()],
});
Accessing the pool
After initialization, access Lakebase through the AppKit.lakebase object:
const AppKit = await createApp({
plugins: [server(), lakebase()],
});
await AppKit.lakebase.query(`CREATE SCHEMA IF NOT EXISTS app`);
await AppKit.lakebase.query(`CREATE TABLE IF NOT EXISTS app.orders (
id SERIAL PRIMARY KEY,
user_id VARCHAR(255) NOT NULL,
amount DECIMAL(10, 2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)`);
const result = await AppKit.lakebase.query(
"SELECT * FROM app.orders WHERE user_id = $1",
[userId],
);
// Raw pg.Pool (for ORMs or advanced usage)
const pool = AppKit.lakebase.pool;
// ORM-ready config objects
const ormConfig = AppKit.lakebase.getOrmConfig(); // { host, port, database, ... }
const pgConfig = AppKit.lakebase.getPgConfig(); // pg.PoolConfig
Configuration
Environment variables
The required environment variables are:
| Variable | Description |
|---|---|
LAKEBASE_ENDPOINT | Endpoint resource path (e.g. projects/.../branches/.../endpoints/...) |
PGHOST | Lakebase host (auto-injected in production by the postgres Databricks Apps resource) |
PGDATABASE | Database name (auto-injected in production by the postgres Databricks Apps resource) |
PGSSLMODE | TLS mode - set to require (auto-injected in production by the postgres Databricks Apps resource) |
When deployed to Databricks Apps with a postgres database resource configured, PGHOST, PGDATABASE, PGSSLMODE, PGUSER, PGPORT, and PGAPPNAME are automatically injected by the platform. Only LAKEBASE_ENDPOINT must be set explicitly:
env:
- name: LAKEBASE_ENDPOINT
valueFrom: postgres
For local development, the .env file is automatically generated by databricks apps init with the correct values for your Lakebase project.
For the full configuration reference (SSL, pool size, timeouts, logging, ORM examples), see the @databricks/lakebase README.
Pool configuration
Pass a pool object to override any defaults:
await createApp({
plugins: [
lakebase({
pool: {
max: 10, // Max pool connections (default: 10)
connectionTimeoutMillis: 5000, // Connection timeout ms (default: 10000)
idleTimeoutMillis: 30000, // Idle connection timeout ms (default: 30000)
},
}),
],
});
On-Behalf-Of (OBO) — per-user connections
When your app needs Row-Level Security (RLS) or per-user data isolation, use asUser(req) to execute queries using a per-user Lakebase connection pool. Each user's pool is authenticated with their Databricks identity, so PostgreSQL's current_user reflects the actual user.
Prerequisites
-
Enable user authorization in your Databricks App with the
postgresscope. See User authorization for setup instructions. In yourdatabricks.yml:resources:
apps:
app:
user_api_scopes:
- postgresApps scaffolded with
databricks apps initand the Lakebase plugin include this automatically. -
Each app user needs a Postgres role in Lakebase. Create one with the Databricks CLI:
databricks postgres create-role "projects/{project_id}/branches/{branch_id}" \
--json '{"spec": {"identity_type": "USER", "postgres_role": "user@example.com"}}'Alternatively, create roles in the Lakebase UI under Branch Overview → Add role.
noteDo not grant
databricks_superuserto OBO users — superusers bypass RLS. Use fine-grained grants instead.
Usage
No configuration needed — just call asUser(req):
const AppKit = await createApp({
plugins: [server(), lakebase()],
});
// Service principal query (default — bypasses RLS as table owner)
const all = await AppKit.lakebase.query("SELECT * FROM app.orders");
// User-scoped query (per-user pool, RLS enforced)
app.get("/api/my-orders", async (req, res) => {
const result = await AppKit.lakebase
.asUser(req)
.query("SELECT * FROM app.orders ORDER BY created_at DESC");
res.json(result.rows);
});
When asUser(req) is called:
- The user's token and identity are extracted from
x-forwarded-access-tokenandx-forwarded-emailheaders (set automatically by Databricks Apps). - A per-user
pg.Poolis created (or reused) with the user's OAuth credentials. query()andpooluse the user's pool —current_userin PostgreSQL reflects the user's identity.
Row-Level Security example
-- As the service principal (during app setup):
ALTER TABLE app.orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_orders ON app.orders
FOR ALL TO PUBLIC
USING (owner = current_user);
-- Grant access so OBO users can query
GRANT USAGE ON SCHEMA app TO PUBLIC;
GRANT SELECT, INSERT ON ALL TABLES IN SCHEMA app TO PUBLIC;
How it works
- The service principal pool (
AppKit.lakebase.pool) is always created and used for DDL operations, seeding, and admin queries. - Per-user pools are created on the first
asUser(req)call and cached by user identity. Each pool has its own OAuth token refresh cycle. - Idle connections within per-user pools close automatically (30s idle timeout). Empty pool objects are cleaned up periodically.
- On shutdown, all pools (SP + user) are closed gracefully.
- In development mode (
NODE_ENV=development), if no user token is available,asUser(req)falls back to the SP pool with a warning.
PostgreSQL superusers bypass Row-Level Security entirely. Users with the databricks_superuser role will see all rows regardless of RLS policies. For RLS enforcement, use fine-grained grants instead of the superuser role.
Database Permissions
When you create the app with the Lakebase resource using the Getting started guide, the Service Principal is automatically granted CONNECT_AND_CREATE permission on the postgres resource. This lets the Service Principal connect to the database and create new objects, but not access any existing schemas or tables.
Local development
To develop locally against a deployed Lakebase database:
-
Deploy the app first. The Service Principal creates the database schema and tables on first deploy. Apps generated from
databricks apps inithandle this automatically - they check if tables exist on startup and skip creation if they do. -
Grant
databricks_superuser(skip if you are the Lakebase project owner — you already have full access):# Create a new role with databricks_superuser
databricks postgres create-role "projects/{project_id}/branches/{branch_id}" \
--json '{"spec": {"identity_type": "USER", "postgres_role": "user@example.com", "membership_roles": ["DATABRICKS_SUPERUSER"]}}'To grant superuser to an existing role, use
update-role:databricks postgres update-role \
"projects/{project_id}/branches/{branch_id}/roles/{role_id}" \
"spec.membership_roles" \
--json '{"spec": {"membership_roles": ["DATABRICKS_SUPERUSER"]}}'Alternatively, you can manage roles in the Lakebase Autoscaling UI under your project's Branch Overview page → Add role / Edit role.
-
Run locally - your Databricks user identity (email) is used for OAuth authentication. The
databricks_superuserrole gives full DML access (read/write data) but not DDL (creating schemas or tables) - that's why deploying first matters (see note below).
For other users, repeat step 2 to create an OAuth role with databricks_superuser for each user.
Postgres password authentication is a simpler alternative that avoids OAuth role permission complexity. However, it requires you to set up a password for the user in the Branch Overview page in the Lakebase Autoscaling UI.
When the app is deployed, the Service Principal creates schemas and tables and becomes their owner. databricks_superuser gives full DML access (read/write) but not DDL, so local development works only after the schema exists.
If you run npm run dev first, your credentials own the schema and the deployed app hits permission denied. To recover, export any data first (pg_dump or a temporary schema copy), then drop the schema and redeploy. After redeploying, the Service Principal recreates the schema on startup. (PostgreSQL schema ownership is tied to the role that created it and cannot be reassigned by regular users.)
Fine-grained permissions
For most use cases, databricks_superuser is sufficient. If you need schema-level grants instead, refer to the official documentation:
SQL script for fine-grained grants
Deploy and run the app at least once before executing these grants so the Service Principal initializes the database schema first.
Replace subject with the user email and schema with your schema name:
CREATE EXTENSION IF NOT EXISTS databricks_auth;
DO $$
DECLARE
subject TEXT := 'your-subject'; -- User email like name@databricks.com
schema TEXT := 'your_schema'; -- Replace 'your_schema' with your schema name
BEGIN
-- Create OAuth role for the Databricks identity
PERFORM databricks_create_role(subject, 'USER');
-- Connection and schema access
EXECUTE format('GRANT CONNECT ON DATABASE "databricks_postgres" TO %I', subject);
EXECUTE format('GRANT ALL ON SCHEMA %s TO %I', schema, subject);
-- Privileges on existing objects
EXECUTE format('GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL FUNCTIONS IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL PROCEDURES IN SCHEMA %s TO %I', schema, subject);
-- Default privileges on future objects
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON TABLES TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON SEQUENCES TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON FUNCTIONS TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON ROUTINES TO %I', schema, subject);
END $$;