Spring AI Source Code Architecture Overview

Welcome to the Spring AI Source Code handbook. This series dissects the internal machinery of Spring AI—the official Spring project that brings AI integration to the Java and Spring Boot ecosystem. By reading the source code we move beyond usage patterns and into the architectural decisions, design patterns, and implementation details that make Spring AI a robust, extensible enterprise AI framework.

Whether you are debugging a production issue, extending the framework with a custom model provider, or simply aiming to become a better Java architect, understanding the source code is the surest path to mastery. This page serves as your entry point: it provides a high‑level map of the architecture, explains the key abstractions and their interplay, and charts a learning path through the fourteen companion deep‑dive articles.

Why Read Spring AI Source Code?

Investing time in framework internals pays dividends across many dimensions:

  • Understand framework abstractions – Learn exactly how ChatClient, Prompt, ChatModel, and Advisor hide provider complexity so you can use them with confidence.
  • Debug production issues – Trace a mysterious tool‑calling failure or streaming timeout directly to the responsible code path.
  • Extend framework capabilities – Implement a custom ChatModel for an internal LLM, a proprietary VectorStore, or a novel Advisor with full compatibility.
  • Optimize performance – Spot unnecessary object creation, understand reactive back‑pressure, and tune your AI pipeline.
  • Learn enterprise architecture design – Study how the framework applies classic patterns like Facade, Strategy, Chain of Responsibility, and Adapter in a modern AI context.
  • Prepare for interviews and design discussions – Gain the depth that distinguishes a senior engineer from a framework user.
  • Contribute to open source – Navigate the codebase efficiently, write patches, and propose enhancements.

Spring AI Architecture at a Glance

Spring AI is built around a layered architecture that decouples application code from specific AI model providers. The diagram below captures the primary runtime layers.

graph TD
    App["Application Code"]
    ChatClient["ChatClient (Facade)"]
    Prompt["Prompt (Messages + Options)"]
    AdvisorChain["Advisor Chain<br/>(around, before, after)"]
    ChatModel["ChatModel Interface"]
    Provider["Provider Adapter<br/>(OpenAI, Azure, Ollama, …)"]
    LLM["LLM Provider API"]
    App --> ChatClient
    ChatClient --> Prompt
    ChatClient --> AdvisorChain
    AdvisorChain --> ChatModel
    ChatModel --> Provider
    Provider --> LLM
    LLM --> Provider
    Provider --> ChatModel
    ChatModel --> AdvisorChain
    AdvisorChain --> ChatClient

Layer Responsibilities:

  • Application – Uses the ChatClient builder API to send prompts and receive responses. No provider‑specific code.
  • ChatClient – The main entry point for conversational AI. It orchestrates the advisor chain and delegates to the ChatModel.
  • Prompt – Immutable value object carrying a list of Message instances and optional ChatOptions.
  • Advisor Chain – A configurable list of Advisor implementations (RequestResponseAdvisor in early milestones; CallAdvisor and StreamAdvisor as of Spring AI 1.0) that can inspect, modify, or augment prompts and responses. This is where cross‑cutting concerns like RAG, logging, and content filtering live.
  • ChatModel – The portable interface for AI chat models. Implementations exist for OpenAI, Azure OpenAI, Ollama, and many others.
  • Provider Adapter – Translates the generic Prompt into a provider‑specific HTTP request and normalizes the response back into a ChatResponse.
  • LLM Provider – The actual AI service, accessed via REST, gRPC, or local process.

This layered design ensures that the framework stays provider‑neutral and extensible. Higher layers depend only on abstractions, never on concrete provider details.
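
The layering described above can be illustrated with a minimal, dependency‑free sketch. Every type below (SketchPrompt, SketchChatModel, UpperCaseProvider, SketchChatClient) is a hypothetical stand‑in invented for this example, not a Spring AI class; the point is only to show a facade that depends on an interface while the provider detail stays behind it.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-ins for the layers described above.
// None of these types are Spring AI's real classes.

/** Immutable prompt: the defensive copy keeps the message list unmodifiable. */
record SketchPrompt(List<String> messages) {
    SketchPrompt {
        messages = List.copyOf(messages);
    }
}

/** Portable model contract; higher layers depend only on this interface. */
interface SketchChatModel {
    String call(SketchPrompt prompt);
}

/** Stand-in provider adapter; a real one would issue an HTTP request here. */
final class UpperCaseProvider implements SketchChatModel {
    @Override
    public String call(SketchPrompt prompt) {
        return String.join(" ", prompt.messages()).toUpperCase(Locale.ROOT);
    }
}

/** Facade: application code talks to this class and never to a provider. */
final class SketchChatClient {
    private final SketchChatModel model;

    SketchChatClient(SketchChatModel model) {
        this.model = model;
    }

    String ask(String userText) {
        return model.call(new SketchPrompt(List.of(userText)));
    }
}

final class LayeringDemo {
    public static void main(String[] args) {
        SketchChatClient client = new SketchChatClient(new UpperCaseProvider());
        System.out.println(client.ask("hello layers")); // prints "HELLO LAYERS"
    }
}
```

Because SketchChatClient only sees the interface, replacing UpperCaseProvider with another implementation (even a lambda in a test) requires no change to the calling code.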

Spring AI Request Lifecycle

The following sequence diagram illustrates a complete request lifecycle—from an application calling the ChatClient to receiving the final response.

sequenceDiagram
    participant App as Application
    participant Client as ChatClient
    participant Advisor as Advisor Chain
    participant Model as ChatModel
    participant Provider as Provider Adapter
    participant API as LLM API
    App->>Client: call(prompt)
    Client->>Advisor: pre-process (prompt)
    Advisor-->>Client: modified prompt
    Client->>Model: call(modified prompt)
    Model->>Provider: build provider request
    Provider->>API: HTTP POST
    API-->>Provider: raw response
    Provider->>Provider: normalize → ChatResponse
    Provider-->>Model: ChatResponse
    Model-->>Client: ChatResponse
    Client->>Advisor: post-process (response)
    Advisor-->>Client: modified response
    Client-->>App: final ChatResponse

  1. Pre‑processing – Advisors inspect and possibly enrich the Prompt (e.g., a RAG advisor injects retrieved knowledge).
  2. Model invocation – The ChatClient delegates to the ChatModel bean, which may be a single provider or a router.
  3. Provider communication – The provider adapter constructs an HTTP request conforming to the target API and sends it.
  4. Response normalization – The raw response is mapped into the framework‑neutral ChatResponse object, including token usage and tool calls.
  5. Post‑processing – Advisors can inspect or alter the response (e.g., content filtering, logging) before it reaches the application.

This lifecycle holds for both synchronous and streaming calls; streaming adds a reactive Flux of partial responses but follows the same structural path.
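
The five steps can be traced with a small, self‑contained sketch. LifecycleAdvisor and LifecycleSketch are hypothetical names made up for this illustration, and running the post‑processing hooks in reverse order is an assumption of the sketch rather than a statement about the framework's implementation.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical advisor with separate pre- and post-processing hooks (not a Spring AI type). */
interface LifecycleAdvisor {
    default String before(String prompt) { return prompt; }
    default String after(String response) { return response; }
}

final class LifecycleSketch {

    /** Pre-process in list order, invoke the model, then post-process in reverse order. */
    static String execute(String prompt, List<LifecycleAdvisor> advisors, Function<String, String> model) {
        String current = prompt;
        for (LifecycleAdvisor advisor : advisors) {          // step 1: pre-processing
            current = advisor.before(current);
        }
        String response = model.apply(current);              // steps 2-4: model call and normalization
        for (int i = advisors.size() - 1; i >= 0; i--) {     // step 5: post-processing
            response = advisors.get(i).after(response);
        }
        return response;
    }

    /** Appends (hypothetical) retrieved context to the prompt, as a RAG advisor might. */
    static LifecycleAdvisor contextInjector(String context) {
        return new LifecycleAdvisor() {
            @Override
            public String before(String prompt) {
                return prompt + " [context: " + context + "]";
            }
        };
    }

    /** Masks a word in the response, as a content filter might. */
    static LifecycleAdvisor redactor(String word) {
        return new LifecycleAdvisor() {
            @Override
            public String after(String response) {
                return response.replace(word, "[redacted]");
            }
        };
    }

    public static void main(String[] args) {
        Function<String, String> model = p -> "answer to (" + p + ") secret";
        String result = execute("Q", List.of(contextInjector("C"), redactor("secret")), model);
        System.out.println(result); // prints "answer to (Q [context: C]) [redacted]"
    }
}
```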

Core Framework Components

| Component | Responsibility | Related Guide |
| --- | --- | --- |
| ChatClient | Central API for building prompts, applying advisors, and executing model calls. | ChatClient Source Code Analysis |
| Prompt | Immutable aggregation of messages and model options. | Prompt Source Code Analysis |
| ChatModel | Portable interface for LLM providers. | ChatModel Source Code Analysis |
| ChatResponse | Encapsulates model output, token usage, and metadata. | ChatResponse Source Code Analysis |
| Advisor | Interceptor interface for prompt/response transformation. | Advisor Source Code Analysis |
| Tool Calling | Allows models to invoke Java methods as tools. | Tool Calling Source Code Analysis |
| Streaming | Reactive support for token‑by‑token responses. | Streaming Source Code Analysis |
| Memory | Manages conversation history across turns. | Memory Source Code Analysis |
| Structured Output | Converts model responses into typed Java objects. | Structured Output Source Code Analysis |
| VectorStore | Abstraction for vector databases used in RAG. | VectorStore Source Code Analysis |
| RAG | Retrieval‑augmented generation pipeline (Document → Embed → Retrieve → Augment). | RAG Source Code Analysis |
| Agent | Autonomous agent loop with planning and tool execution. | Agent Source Code Analysis |
| MCP | Model Context Protocol client for standard tool integration. | MCP Source Code Analysis |

Each guide in this handbook dissects the corresponding component’s source code: interfaces, implementations, design patterns, and runtime behavior.

Spring AI Package Organization

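The framework’s source tree is organized into functional areas that mirror the component table above. The tree below is a simplified, conceptual grouping rather than a literal directory listing: in the repository these areas are spread across several Maven modules, and some are nested (Prompt, for instance, lives in org.springframework.ai.chat.prompt). Understanding this hierarchy is the first step in navigating the codebase.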

org.springframework.ai
├── chat          – ChatClient, ChatModel, ChatResponse, streaming support
├── model         – Generic model interfaces (Model, StreamingModel)
├── prompt        – Prompt, Message, ChatOptions
├── advisor       – Advisor interface, advisor chain support
├── tool          – Tool calling, ToolCallback, annotations
├── memory        – Conversation memory abstractions
├── structured    – Structured output converters
├── vectorstore   – VectorStore interface and implementations
├── document      – Document readers, splitters, transformers
├── rag           – RetrievalAugmentationAdvisor, QueryTransformers, retrievers
├── agent         – Agent runtime, planning, execution loop
├── mcp           – MCP client, server, transport
└── autoconfigure – Spring Boot auto‑configuration for all providers

Package responsibilities:

  • chat – The primary runtime: ChatClient, ChatModel, ChatResponse, and streaming infrastructure.
  • model – Base interfaces inherited by chat, embedding, and image models.
  • prompt – The immutable message and options structures that travel through the advisor chain.
  • advisor – The interceptor contract (RequestResponseAdvisor and the AdvisedRequest wrapper in early milestones; CallAdvisor, StreamAdvisor, and ChatClientRequest as of Spring AI 1.0).
  • tool – Function calling abstraction; ToolCallback and @Tool annotation processing.
  • memory – Conversation memory (the ChatMemory interface) and implementations for sliding window, token‑based, etc.
  • structured – BeanOutputConverter, ListOutputConverter, and related infrastructure.
  • vectorstore – The VectorStore interface and implementations for Chroma, Milvus, Pgvector, etc.
  • document – DocumentReader, DocumentSplitter, DocumentTransformer for ingestion pipelines.
  • rag – Retrieval augmentation: RetrievalAugmentationAdvisor, retrievers, query transformers.
  • agent – Agent runtime, Agent, Planner, Step, execution loop.
  • mcp – Model Context Protocol integration, client transport, server interfaces.
  • autoconfigure – Spring Boot starters that detect provider libraries and register beans.

Key Design Principles

Spring AI’s architecture applies a set of well‑established enterprise Java design principles. Here are the most prominent ones you will encounter in the source code.

Abstraction Layer
The entire framework rests on interfaces—ChatModel, EmbeddingModel, VectorStore. Providers implement these interfaces, keeping the rest of the framework blissfully unaware of the underlying API. This is pure Dependency Inversion.

Provider Independence
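Every component, from the advisor chain to the agent runtime, depends on the abstract interfaces. Switching from OpenAI to Azure OpenAI to Ollama is therefore mostly a matter of swapping the provider starter and its configuration properties; application code written against ChatClient and ChatModel stays the same.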

Strategy Pattern
Wherever behavior varies, a strategy interface is defined. Examples in the 1.0 codebase include DocumentRetriever and QueryTransformer in the RAG module and ToolCallingManager for tool execution. The framework selects the appropriate strategy at runtime based on configuration and bean wiring.
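
To make the strategy idea concrete, here is a minimal sketch with two interchangeable retrieval strategies chosen by a configuration value. RetrievalSketch, KeywordOverlapRetrieval, FirstKRetrieval, and the strategy names "keyword" and "first" are all hypothetical and exist only in this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical strategy contract: pick the documents most relevant to a query. */
interface RetrievalSketch {
    List<String> retrieve(String query, List<String> documents, int topK);

    /** Selects an implementation from a configuration value. */
    static RetrievalSketch forName(String name) {
        return switch (name) {
            case "keyword" -> new KeywordOverlapRetrieval();
            case "first" -> new FirstKRetrieval();
            default -> throw new IllegalArgumentException("Unknown strategy: " + name);
        };
    }
}

/** Ranks documents by how many query words they contain. */
final class KeywordOverlapRetrieval implements RetrievalSketch {
    @Override
    public List<String> retrieve(String query, List<String> documents, int topK) {
        return documents.stream()
                .sorted(Comparator.comparingInt((String d) -> score(query, d)).reversed())
                .limit(topK)
                .toList();
    }

    private static int score(String query, String document) {
        Set<String> docWords = new HashSet<>(
                Arrays.asList(document.toLowerCase(Locale.ROOT).split("\\W+")));
        int hits = 0;
        for (String word : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (docWords.contains(word)) {
                hits++;
            }
        }
        return hits;
    }
}

/** Naive baseline: returns the first topK documents regardless of the query. */
final class FirstKRetrieval implements RetrievalSketch {
    @Override
    public List<String> retrieve(String query, List<String> documents, int topK) {
        return documents.stream().limit(topK).toList();
    }
}

final class StrategyDemo {
    public static void main(String[] args) {
        List<String> docs = List.of(
                "Spring Boot auto configuration",
                "Vector store for embeddings",
                "Advisor chain in Spring AI");
        RetrievalSketch strategy = RetrievalSketch.forName("keyword");
        System.out.println(strategy.retrieve("spring advisor chain", docs, 1));
        // prints "[Advisor chain in Spring AI]"
    }
}
```

The caller only holds a RetrievalSketch reference, so changing the configuration value changes the behavior without touching the calling code.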

Builder Pattern
ChatClient, Prompt, ChatOptions, and many others use fluent builders. This provides a readable, type‑safe way to construct complex objects with sensible defaults.
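
As a rough illustration of the fluent‑builder style, the sketch below builds an immutable options object with defaults. SketchOptions, its fields, and the default values are assumptions chosen for the example; they are not Spring AI's ChatOptions.

```java
import java.util.Objects;

/** Hypothetical immutable options object built with a fluent builder. */
final class SketchOptions {
    private final String model;
    private final double temperature;
    private final int maxTokens;

    private SketchOptions(Builder builder) {
        this.model = builder.model;
        this.temperature = builder.temperature;
        this.maxTokens = builder.maxTokens;
    }

    static Builder builder() {
        return new Builder();
    }

    String model() { return model; }
    double temperature() { return temperature; }
    int maxTokens() { return maxTokens; }

    static final class Builder {
        // Placeholder defaults, chosen only for this sketch.
        private String model = "default-model";
        private double temperature = 0.7;
        private int maxTokens = 256;

        Builder model(String model) {
            this.model = Objects.requireNonNull(model, "model");
            return this;
        }

        Builder temperature(double temperature) {
            if (temperature < 0.0 || temperature > 2.0) {
                throw new IllegalArgumentException("temperature must be in [0, 2]");
            }
            this.temperature = temperature;
            return this;
        }

        Builder maxTokens(int maxTokens) {
            if (maxTokens <= 0) {
                throw new IllegalArgumentException("maxTokens must be positive");
            }
            this.maxTokens = maxTokens;
            return this;
        }

        SketchOptions build() {
            return new SketchOptions(this);
        }
    }
}

final class BuilderDemo {
    public static void main(String[] args) {
        // Only the temperature is overridden; the other fields keep their defaults.
        SketchOptions options = SketchOptions.builder().temperature(0.2).build();
        System.out.println(options.model() + " " + options.temperature() + " " + options.maxTokens());
        // prints "default-model 0.2 256"
    }
}
```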

Adapter Pattern
Each provider module (e.g., spring-ai-openai) contains an adapter that translates between the Spring AI abstractions and the provider’s wire protocol. The adapter normalizes requests and responses, isolating the rest of the framework from provider idiosyncrasies.
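
The translation step can be sketched without any real provider. In the example below, VendorApi and its request and reply records stand in for a third‑party wire format, and VendorAdapter maps between that format and a neutral request/response pair. All names and the word‑counting "token" logic are invented for illustration.

```java
import java.util.List;

/** Framework-neutral request and response (hypothetical stand-ins). */
record NeutralRequest(String systemText, String userText) {}
record NeutralResponse(String text, int totalTokens) {}

/** Shapes owned by a hypothetical vendor SDK; the adapter cannot change them. */
record VendorMessage(String role, String content) {}
record VendorRequest(String model, List<VendorMessage> messages) {}
record VendorReply(List<String> choices, int promptTokens, int completionTokens) {}

/** Fake vendor client that returns a canned reply instead of calling a network API. */
final class VendorApi {
    VendorReply complete(VendorRequest request) {
        List<VendorMessage> messages = request.messages();
        String last = messages.get(messages.size() - 1).content();
        String text = "reply to: " + last;
        int promptTokens = messages.stream().mapToInt(m -> words(m.content())).sum();
        return new VendorReply(List.of(text), promptTokens, words(text));
    }

    // Counts whitespace-separated words as a crude stand-in for tokens.
    private static int words(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}

/** Adapter: neutral request in, vendor call in the middle, neutral response out. */
final class VendorAdapter {
    private final VendorApi api;
    private final String modelName;

    VendorAdapter(VendorApi api, String modelName) {
        this.api = api;
        this.modelName = modelName;
    }

    NeutralResponse call(NeutralRequest request) {
        VendorRequest wire = new VendorRequest(modelName, List.of(
                new VendorMessage("system", request.systemText()),
                new VendorMessage("user", request.userText())));
        VendorReply reply = api.complete(wire);
        // Normalize: first choice becomes the text, token counts are summed.
        return new NeutralResponse(
                reply.choices().get(0),
                reply.promptTokens() + reply.completionTokens());
    }
}

final class AdapterDemo {
    public static void main(String[] args) {
        VendorAdapter adapter = new VendorAdapter(new VendorApi(), "demo-model");
        NeutralResponse response = adapter.call(new NeutralRequest("Be brief", "Hi there"));
        System.out.println(response.text() + " / " + response.totalTokens());
        // prints "reply to: Hi there / 8"
    }
}
```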

Chain of Responsibility (Advisors)
The advisor chain lets you compose multiple cross‑cutting behaviors—RAG, logging, security—without coupling them to the ChatClient or ChatModel. Each advisor can decide whether to process the request, modify it, or pass it along.
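
A compact way to see this pattern is an around‑style chain in which each link either delegates to the next link or answers on its own. AroundAdvisor, SketchChain, and the two demo advisors are hypothetical types for this sketch and do not mirror Spring AI's advisor API.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical around-style advisor: it may call chain.next(...) or answer by itself. */
interface AroundAdvisor {
    String advise(String prompt, SketchChain chain);
}

/** Immutable chain position; the model call is the terminal link. */
final class SketchChain {
    private final List<AroundAdvisor> advisors;
    private final Function<String, String> model;
    private final int index;

    SketchChain(List<AroundAdvisor> advisors, Function<String, String> model) {
        this(List.copyOf(advisors), model, 0);
    }

    private SketchChain(List<AroundAdvisor> advisors, Function<String, String> model, int index) {
        this.advisors = advisors;
        this.model = model;
        this.index = index;
    }

    /** Hands the prompt to the next advisor, or to the model once all advisors have run. */
    String next(String prompt) {
        if (index == advisors.size()) {
            return model.apply(prompt);
        }
        return advisors.get(index).advise(prompt, new SketchChain(advisors, model, index + 1));
    }
}

final class ChainDemo {
    /** Short-circuits the chain when the prompt contains a blocked word. */
    static AroundAdvisor guard(String blockedWord) {
        return (prompt, chain) -> prompt.contains(blockedWord) ? "refused" : chain.next(prompt);
    }

    /** Trims the prompt on the way in and wraps the response in brackets on the way out. */
    static AroundAdvisor tagger() {
        return (prompt, chain) -> "[" + chain.next(prompt.trim()) + "]";
    }

    public static void main(String[] args) {
        SketchChain chain = new SketchChain(
                List.of(guard("password"), tagger()),
                prompt -> "model(" + prompt + ")");
        System.out.println(chain.next("  hello "));   // prints "[model(hello)]"
        System.out.println(chain.next("my password")); // prints "refused"
    }
}
```

In the second call the guard returns without invoking chain.next, so neither the tagger nor the model runs; this is the "decide whether to process or pass along" behavior in miniature.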

Dependency Injection
Spring AI is built on the Spring container. Providers are auto‑configured as beans; advisors are injected into the ChatClient; tool callbacks are discovered from the application context.

Auto Configuration
Spring Boot auto‑configuration classes (@AutoConfiguration) register beans conditionally based on the classpath and properties. This is how adding spring-ai-openai-spring-boot-starter (renamed spring-ai-starter-model-openai in the 1.0 release line) gives you a working OpenAiChatModel once the API key property is set.
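
The conditional idea can be modeled in a few lines of plain Java. ConditionalRegistry below is a hypothetical miniature, not Spring Boot's implementation: it registers a bean only when a marker class is loadable and a property is present, and it backs off if a bean with that name already exists, loosely analogous to @ConditionalOnClass, @ConditionalOnProperty, and @ConditionalOnMissingBean. The property name demo.api-key is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

/** Hypothetical miniature of conditional bean registration. */
final class ConditionalRegistry {
    private final Map<String, Object> beans = new LinkedHashMap<>();
    private final Properties properties;

    ConditionalRegistry(Properties properties) {
        this.properties = properties;
    }

    /** Registers the bean only if the class is loadable and the property is set. */
    ConditionalRegistry registerIf(String beanName, String requiredClass,
                                   String requiredProperty, Supplier<?> factory) {
        boolean classPresent = isPresent(requiredClass);
        boolean propertySet = properties.getProperty(requiredProperty) != null;
        if (classPresent && propertySet && !beans.containsKey(beanName)) {
            beans.put(beanName, factory.get()); // an existing bean with this name wins
        }
        return this;
    }

    /** Checks the classpath without initializing the class. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ConditionalRegistry.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    boolean contains(String beanName) {
        return beans.containsKey(beanName);
    }

    Object get(String beanName) {
        return beans.get(beanName);
    }
}

final class AutoConfigDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("demo.api-key", "not-a-real-key");

        ConditionalRegistry registry = new ConditionalRegistry(props)
                .registerIf("chatModel", "java.lang.String", "demo.api-key", () -> "model bean")
                .registerIf("missingLib", "com.example.NotOnClasspath", "demo.api-key", () -> "never");

        System.out.println(registry.contains("chatModel"));  // prints "true"
        System.out.println(registry.contains("missingLib")); // prints "false"
    }
}
```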

Extension Points
Nearly every component can be extended by implementing an interface and registering a bean. Custom ChatModel, custom VectorStore, custom Advisor—all work natively with the framework.

Separation of Concerns
Prompt construction, model invocation, tool execution, memory management, and observability are implemented as distinct, replaceable components. This makes the codebase modular and testable.

Reading the Source Code Efficiently

Jumping into a large framework can be daunting. Follow these guidelines to build understanding quickly.

  • Start with the interfaces – Read ChatModel, ChatClient, Advisor, Prompt, and ChatResponse. These define the contract.
  • Trace a simple synchronous call – Put a breakpoint in OpenAiChatModel.call() and follow the call stack from a unit test.
  • Follow the builder – ChatClient.create() builds the advisor chain. Understanding this initialization reveals how components are wired.
  • Study the advisor chain – The DefaultChatClient implements ChatClient. Look at how advisors are invoked before and after the model call.
  • Look at auto‑configuration – OpenAiAutoConfiguration shows how properties are read and beans are created. This is the pattern for all providers.
  • Examine the test suite – Spring AI has an extensive test suite. Tests often demonstrate how a component is meant to be used and provide a safe playground for debugging.
  • Use the IDE’s “Find Usages” – When you see a method like call(Prompt), find all implementations to see the provider adapters in action.

Recommended Learning Path

The source code series follows the natural execution flow of a Spring AI request. We recommend reading the articles in this order:

  1. ChatClient Source Code Analysis – Understand the facade that your application interacts with.
  2. Prompt Source Code Analysis – Learn how messages and options are modeled.
  3. ChatModel Source Code Analysis – Discover the central abstraction for LLMs.
  4. EmbeddingModel Source Code Analysis – Learn how embeddings are computed.
  5. ChatResponse Source Code Analysis – See how model output is normalized.
  6. Advisor Source Code Analysis – Explore the interceptor chain.
  7. Tool Calling Source Code Analysis – Understand how models invoke Java methods.
  8. Streaming Source Code Analysis – Delve into reactive streaming support.
  9. Memory Source Code Analysis – Learn about conversation history management.
  10. Structured Output Source Code Analysis – See how raw text is converted to typed objects.
  11. VectorStore Source Code Analysis – Study the vector database abstraction.
  12. RAG Source Code Analysis – Unpack the retrieval‑augmented generation pipeline.
  13. Agent Source Code Analysis – Examine the autonomous agent runtime.
  14. MCP Source Code Analysis – Explore the Model Context Protocol integration.

This sequence builds knowledge progressively: you start with the core message flow, then layer on tool calling, memory, RAG, and finally autonomous agents.

Spring AI Source Code Handbook

| Guide | Description |
| --- | --- |
| ChatClient Source Code Analysis | Detailed walkthrough of ChatClient and its builder API. |
| Prompt Source Code Analysis | How Prompt, Message, and ChatOptions are implemented. |
| ChatModel Source Code Analysis | The portable model interface and its provider adapters. |
| Embedding Source Code Analysis | How EmbeddingModel and EmbeddingClient are implemented. |
| ChatResponse Source Code Analysis | Response structure, token usage, and metadata normalization. |
| Advisor Source Code Analysis | Interceptor chain mechanics and pre/post‑processing. |
| Tool Calling Source Code Analysis | @Tool annotation, ToolCallback, and execution lifecycle. |
| Streaming Source Code Analysis | Reactive streaming with Flux<ChatResponse> and back‑pressure. |
| Memory Source Code Analysis | Conversation memory abstractions and implementations. |
| Structured Output Source Code Analysis | Converters for typed Java objects from LLM responses. |
| VectorStore Source Code Analysis | The vector database abstraction and its implementations. |
| RAG Source Code Analysis | End‑to‑end retrieval‑augmented generation internals. |
| Agent Source Code Analysis | Agent runtime, planning strategies, and tool use. |
| MCP Source Code Analysis | Model Context Protocol client, server, and transport. |

Who Should Read This Handbook?

This material is written for experienced engineers who want to go beyond API usage:

  • Java Developers – Strengthen your Spring skills by studying a modern, production‑grade framework.
  • Spring Developers – Add AI‑specific architecture patterns to your toolkit.
  • AI Engineers – Understand how an enterprise Java framework integrates LLMs, tools, and agents.
  • Software Architects – Learn design patterns for modular, provider‑independent AI systems.
  • Enterprise Architects – Evaluate Spring AI as the foundation for internal AI platforms.
  • Technical Leaders – Guide your team’s adoption of Spring AI with deep architectural context.

If you are comfortable with Spring Boot, dependency injection, and reactive programming, you will be able to follow the analysis.

Related Spring AI Documentation

This source code handbook complements the official Spring AI documentation and the Spring AI Alibaba learning center.

Summary

Spring AI is built on the same design philosophy that made Spring Framework the backbone of enterprise Java: abstractions, portability, and extensibility. By separating the what (interfaces) from the how (provider adapters), it allows you to build AI applications that can evolve with the rapidly changing LLM landscape. The advisor chain and tool calling mechanisms add cross‑cutting intelligence without coupling, while agents and MCP push the boundaries toward autonomous, interoperable AI systems.

This handbook unpacks every one of those layers. We encourage you to follow the recommended learning path, experiment with the source code, and contribute your insights.

Let’s begin the deep dive with the first article: Spring AI ChatClient Source Code Analysis.
