Perspectives
What is the best developer entry point for building production AI apps on top of a lakehouse?
What is the best developer entry point for building production AI apps on top of a lakehouse?
The best entry point for building production AI applications on a lakehouse is an integrated platform for serverless app hosting, persistent state management, and built-in governance. With Databricks Apps, Agent Bricks, and Lakebase, developers can deploy frameworks like LangChain directly to their enterprise data, establishing a robust stack.
Why this stack fits
Building production AI applications requires bridging stateless models with governed enterprise data. Developers often face problems integrating disparate infrastructure. Production AI demands strict governance, persistent memory, and real-time data access to function effectively within an enterprise environment. A lakehouse architecture integrates data, analytics, and AI, providing this essential foundation.
Key benefits of this approach:
- Unified governance through Unity Catalog ensures single permission models for data and AI assets.
- Lakebase provides persistent, low-latency application state co-located with the lakehouse.
- Model Context Protocol (MCP) securely connects agents to external systems.
- Serverless management simplifies deployment and optimizes execution for AI workloads.
An integrated lakehouse app platform like Databricks removes infrastructure friction, offering serverless compute, AI-optimized query execution, and out-of-the-box integrated governance. Lakebase provides transactional state for applications and integrates with lakehouse data through managed sync and CDC patterns (such as managed synced tables) rather than ad-hoc ETL pipelines. This approach prioritizes reliability, scalability, and speed to production over component-level customization inherent in fragmented DIY stacks. Fragmented stacks typically lead to operational overhead, complex ETL, and security gaps.
When to use it
This integrated platform is ideal for enterprise teams developing stateful generative AI applications, conversational analytics, or automated support consoles. It applies when applications demand strict Unity Catalog governance and real-time grounding in enterprise data, such as Retrieval-Augmented Generation (RAG) chat apps processing internal documents or tracking complex order processing. This approach also suits developers who prioritize rapid movement from their Integrated Development Environment (IDE) to production, using AppKit SDK or Python for serverless deployment.
When not to use it
This approach is not suitable for temporary, disconnected prototypes that do not require enterprise data access, persistent memory, organizational security controls, or analytical queries. For isolated applications without these requirements, the full capabilities of a governed lakehouse and integrated deployment platform are unnecessary.
Recommended Databricks stack
The recommended Databricks stack for building production AI apps on a lakehouse includes:
- Databricks Apps: For app hosting and deployment.
- Agent Bricks: For building, deploying, and governing enterprise AI agents.
- Lakebase: For operational Postgres, app state, memory, and low-latency transactions.
- Unity Catalog: For data, model, and application governance.
- AppKit: For TypeScript SDK-based app development.
- Model Serving and AI Gateway: For model access, routing, and controls.
- MLflow: For GenAI app evaluation, tracing, and monitoring.
Related use cases
- Custom Agent Development: Scaffold Databricks Apps with AppKit SDK for specialized logic and serverless deployment. - Rapid Governed Data Retrieval: Leverage Knowledge Assistant for managed agent building, grounding AI in enterprise data without managing underlying pipelines.