ChatClient is the primary entry point for conversational AI interactions in Spring AI. It provides a fluent, builder‑driven API that hides the complexity of prompt assembly, advisor chains, model invocation, and streaming. Understanding its source code reveals how Spring AI achieves portability, extensibility, and a consistent programming model across different LLM providers.
This chapter dissects ChatClient from the inside out: its architecture, builder pattern, fluent API design, request execution lifecycle, and integration with every other major Spring AI component. By the end, you will be able to trace a request from an application call through the entire framework and back.
What Is ChatClient? #
ChatClient is a facade that simplifies the process of building a prompt, applying cross‑cutting concerns, invoking a model, and processing the response. Its design goals are:
- Fluent API – Enable a readable, chainable style for constructing AI requests.
- Separation of concerns – Decouple prompt preparation, advisor processing, and model invocation.
- Portability – Work uniformly with any ChatModel implementation.
- Extensibility – Allow custom advisors to intercept and modify requests and responses.
Compared to calling ChatModel directly, ChatClient adds:
- Automatic prompt message assembly (system, user, function results).
- Advisor chain execution before and after the model call.
- Unified support for streaming and synchronous calls.
- Integration with tool calling, memory, and structured output converters.
ChatClient in Spring AI Architecture #
ChatClient sits at the top of the runtime stack, between the application code and the model abstraction.
- Application – Uses ChatClient.Builder to obtain an instance and call prompt().user("...").call().
- ChatClient – The facade. It holds an immutable configuration (default system message, advisors, model, tool callbacks) and orchestrates the lifecycle.
- Prompt – Constructed on the fly from the user input and the client’s defaults.
- Advisor Chain – Each advisor can modify the prompt before it reaches the model and the response before it is returned.
- ChatModel – The portable interface that the advisor chain wraps; ultimately delegates to a provider adapter.
- Provider Adapter – Translates to a vendor‑specific HTTP request.
Core Interfaces and Classes #
| Class / Interface | Responsibility |
|---|---|
| ChatClient | Public interface. Declares the prompt() method and the inner Builder interface. |
| DefaultChatClient | The only implementation. Holds configuration, builds the prompt, executes the advisor chain. |
| ChatClient.Builder | Builder for constructing a ChatClient. Allows setting defaults and injecting dependencies. |
| ChatClientRequestSpec | Fluent spec for building a request: adding messages, options, tools, advisors. |
| ChatClientResponseSpec | Fluent spec for executing the request and processing the response (call, stream, entity). |
| ChatResponse | The normalized response from the model. Contains generations, token usage, and metadata. |
| ChatModel | The model abstraction that DefaultChatClient delegates to after the advisor chain. |
| Prompt | Immutable list of Message objects with optional ChatOptions. |
DefaultChatClient is the central orchestrator. It implements both ChatClient and the logic behind the request/response spec interfaces.
Source Code Structure #
The relevant packages are:
org.springframework.ai.chat.client
├── ChatClient.java
├── DefaultChatClient.java
├── ChatClientRequest.java
├── ChatClientResponse.java
├── Advisor.java (RequestResponseAdvisor)
├── observation
│   ├── ChatClientObservationConvention.java
│   └── ...
└── advice
    ├── AbstractChatClientAdvisor.java
    └── ...
DefaultChatClient is the core; it contains inner classes that implement ChatClientRequestSpec and ChatClientResponseSpec. The advisor chain is built from RequestResponseAdvisor beans injected into the builder.
ChatClient Builder Pattern #
The builder is the primary way to create a ChatClient instance; the static ChatClient.create(ChatModel) factory is a shortcut that delegates to it.
The DefaultBuilder stores configuration in fields and creates an immutable DefaultChatClient instance in build(). The builder can be obtained via ChatClient.builder(chatModel) or injected as a prototype bean. In a Spring Boot application, auto‑configuration creates a pre‑configured builder for the available ChatModel; default advisors and tools are then added on that builder.
Key aspect: the DefaultChatClient is stateless and thread‑safe; all per‑request state is encapsulated in the ChatClientRequest object.
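The relationship between a mutable builder and the immutable client it produces can be sketched in plain Java. The types below (SketchClient and its nested Builder) are hypothetical stand-ins rather than Spring AI classes, and advisors are reduced to names. The sketch only illustrates that build() takes a snapshot, so later changes to the builder do not leak into a client that already exists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an immutable client; not a Spring AI class.
final class SketchClient {
    private final String defaultSystem;
    private final List<String> defaultAdvisors;

    private SketchClient(String defaultSystem, List<String> defaultAdvisors) {
        this.defaultSystem = defaultSystem;
        // List.copyOf takes an unmodifiable snapshot of the builder's list.
        this.defaultAdvisors = List.copyOf(defaultAdvisors);
    }

    String defaultSystem() { return defaultSystem; }
    List<String> defaultAdvisors() { return defaultAdvisors; }

    static Builder builder() { return new Builder(); }

    // Mutable builder: collects configuration, then hands a snapshot to the client.
    static final class Builder {
        private String defaultSystem = "";
        private final List<String> defaultAdvisors = new ArrayList<>();

        Builder defaultSystem(String text) { this.defaultSystem = text; return this; }
        Builder defaultAdvisor(String name) { this.defaultAdvisors.add(name); return this; }
        SketchClient build() { return new SketchClient(defaultSystem, defaultAdvisors); }
    }

    public static void main(String[] args) {
        Builder builder = SketchClient.builder().defaultSystem("You are helpful.").defaultAdvisor("logging");
        SketchClient first = builder.build();
        SketchClient second = builder.defaultAdvisor("memory").build();
        System.out.println(first.defaultAdvisors());  // [logging]
        System.out.println(second.defaultAdvisors()); // [logging, memory]
    }
}
```

Because the builder itself stays mutable, sharing one builder instance between unrelated components is risky, which is one plausible reason for handing it out as a prototype bean.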
Fluent API Design #
The fluent API is split into two phases:
- Request specification – ChatClientRequestSpec
- Response handling – ChatClientResponseSpec
chatClient.prompt()                       // returns ChatClientRequestSpec
    .user("What is Spring AI?")           // add user message
    .system("You are helpful.")           // add system message
    .advisors(myAdvisor)                  // attach request‑specific advisors
    .options(ChatOptions.builder().temperature(0.7).build())
    .call()                               // returns ChatClientResponseSpec
    .chatResponse();                      // synchronous execution
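Each method in the spec returns this, enabling chaining. The call() method ends the request phase and returns the response spec; the model is invoked when a terminal method such as chatResponse() is called. Before invoking the model, the spec builds a Prompt object from the messages and options.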
The DefaultChatClient uses two inner classes to implement the specs:
- DefaultChatClientRequestSpec – Collects messages, options, and callbacks.
- DefaultChatClientResponseSpec – Holds the executor logic for call(), stream(), and entity().
This design separates construction from execution, allowing the same request spec to be executed multiple times.
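The two-phase split can be illustrated with a small, self-contained sketch. RequestSpec and ResponseSpec here are hypothetical simplifications, not the Spring AI inner classes. Messages are plain strings, the model is a Function, and the sketch assumes the model is only invoked by the terminal content() method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical two-phase fluent API; names echo the text but this is not Spring AI code.
final class FluentSketch {

    // Phase 1: collects messages; every method returns this so calls can be chained.
    static final class RequestSpec {
        private final Function<List<String>, String> model;
        private final List<String> messages = new ArrayList<>();

        RequestSpec(Function<List<String>, String> model) { this.model = model; }

        RequestSpec system(String text) { messages.add("system: " + text); return this; }
        RequestSpec user(String text) { messages.add("user: " + text); return this; }

        // Moves to phase 2 with a snapshot of what has been collected so far.
        ResponseSpec call() { return new ResponseSpec(model, List.copyOf(messages)); }
    }

    // Phase 2: runs the model only when a terminal method is invoked.
    static final class ResponseSpec {
        private final Function<List<String>, String> model;
        private final List<String> prompt;

        ResponseSpec(Function<List<String>, String> model, List<String> prompt) {
            this.model = model;
            this.prompt = prompt;
        }

        String content() { return model.apply(prompt); }
    }

    public static void main(String[] args) {
        // Stub model: reports how many messages it received.
        Function<List<String>, String> stubModel = msgs -> "received " + msgs.size() + " messages";
        RequestSpec spec = new RequestSpec(stubModel).system("You are helpful.").user("What is Spring AI?");
        System.out.println(spec.call().content()); // received 2 messages
        System.out.println(spec.call().content()); // received 2 messages (same spec, executed again)
    }
}
```

Taking a snapshot in call() is what lets the same request spec be executed more than once without one execution affecting the next.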
Request Execution Lifecycle #
A synchronous call() execution proceeds through the following steps:
- Request assembly – The ChatClientRequestSpec collects messages, options, and any request‑specific tools. It also merges the client’s default system message and default advisors.
- Advisor pre‑processing – The advisor chain wraps the prompt. Each advisor receives the current Prompt and can return a new one. This is where RAG context is injected, logging occurs, etc.
- Model invocation – The final Prompt is sent to the ChatModel. The implementation uses the ChatModel bean injected during client construction.
- Advisor post‑processing – Advisors inspect the ChatResponse. They can modify it, record metrics, or trigger side‑effects.
- Response return – The final ChatResponse is returned to the application.
For streaming, the model returns a Flux<ChatResponse>, and the advisor post‑processing is applied to each emitted item.
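The five synchronous steps can be condensed into one method. This is a minimal sketch under simplifying assumptions: prompts and responses are plain strings, and the Advisor interface with before() and after() is a hypothetical stand-in for the real advisor contract.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the five lifecycle steps; prompts and responses are plain strings.
final class LifecycleSketch {

    interface Advisor {
        String before(String prompt);   // pre-processing: may return a new prompt
        String after(String response);  // post-processing: may return a new response
    }

    static String execute(String defaultSystem, String userText,
                          List<Advisor> advisors, Function<String, String> model) {
        // 1. Request assembly: merge the client default with the user input.
        String prompt = defaultSystem + "\n" + userText;
        // 2. Advisor pre-processing, in registration order.
        for (Advisor advisor : advisors) {
            prompt = advisor.before(prompt);
        }
        // 3. Model invocation.
        String response = model.apply(prompt);
        // 4. Advisor post-processing.
        for (Advisor advisor : advisors) {
            response = advisor.after(response);
        }
        // 5. Response return.
        return response;
    }

    public static void main(String[] args) {
        Advisor rag = new Advisor() {
            public String before(String prompt) { return prompt + "\nContext: Spring AI docs"; }
            public String after(String response) { return response; }
        };
        Advisor trim = new Advisor() {
            public String before(String prompt) { return prompt; }
            public String after(String response) { return response.trim(); }
        };
        // Stub model: reports how many lines the final prompt has, padded with spaces.
        Function<String, String> stubModel = p -> "  echo(" + p.lines().count() + " lines)  ";
        System.out.println(execute("You are helpful.", "What is Spring AI?", List.of(rag, trim), stubModel));
        // prints: echo(3 lines)
    }
}
```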
Integration with Prompt #
The ChatClientRequestSpec builds a Prompt from the accumulated messages and options. The default system message is prepended if present.
Internally, the spec uses a PromptTemplate when a templated string is provided (e.g., user(u -> u.text("Hello {name}").param("name", "World"))). The template is resolved before building the prompt. The resulting Prompt is immutable and passed to the advisor chain.
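To make the "resolve before building" step concrete, here is a simplified placeholder resolver for templates of the Hello {name} kind. It is an illustrative stand-in written with java.util.regex, not Spring AI's PromptTemplate, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical placeholder resolver; a simplified stand-in, not Spring AI's PromptTemplate.
final class TemplateSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    static String render(String template, Map<String, Object> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(variables.get(name))));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Hello {name}", Map.of("name", "World"))); // Hello World
    }
}
```

Failing fast on a missing variable is a design choice of this sketch; it keeps an unresolved placeholder from silently reaching the model.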
Advisor Chain Integration #
Advisors are the primary extension mechanism. The chain is constructed from:
- The client’s default advisors (set in the builder).
- Request‑scoped advisors (added via advisors(...) on the spec).
The DefaultChatClient wraps the model in an AdvisedChatModel that iterates over the advisor list. Each advisor implements RequestResponseAdvisor with two methods:
- Prompt adviseRequest(Prompt prompt, Map<String, Object> context)
- ChatResponse adviseResponse(ChatResponse response, Map<String, Object> context)
The advisors are called in the order they were registered. The context map carries request‑specific information like tool callbacks and the ChatClient instance itself.
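The sketch below shows how default and request-scoped advisors might be merged into one ordered chain, and how the shared context map lets one advisor leave information for another. Prompt, ChatResponse, and RequestResponseAdvisor are redeclared here as minimal hypothetical types that mirror the two-method contract described above; they are not the framework's classes, and the model is a stub.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins that mirror the two-method advisor contract described above.
final class AdvisorChainSketch {
    record Prompt(String text) {}
    record ChatResponse(String text) {}

    interface RequestResponseAdvisor {
        default Prompt adviseRequest(Prompt prompt, Map<String, Object> context) { return prompt; }
        default ChatResponse adviseResponse(ChatResponse response, Map<String, Object> context) { return response; }
    }

    // Adds retrieved context to the prompt and records that it did so in the shared map.
    static final class ContextAdvisor implements RequestResponseAdvisor {
        public Prompt adviseRequest(Prompt prompt, Map<String, Object> context) {
            context.put("augmented", true);
            return new Prompt(prompt.text() + " [context]");
        }
    }

    // Reads the flag left by an earlier advisor and tags the response.
    static final class TaggingAdvisor implements RequestResponseAdvisor {
        public ChatResponse adviseResponse(ChatResponse response, Map<String, Object> context) {
            boolean augmented = Boolean.TRUE.equals(context.get("augmented"));
            return new ChatResponse(response.text() + (augmented ? " (grounded)" : ""));
        }
    }

    static ChatResponse run(List<RequestResponseAdvisor> defaults,
                            List<RequestResponseAdvisor> requestScoped, Prompt prompt) {
        // Client defaults first, then request-scoped advisors, in registration order.
        List<RequestResponseAdvisor> chain = new ArrayList<>(defaults);
        chain.addAll(requestScoped);
        Map<String, Object> context = new HashMap<>(); // one map per request
        for (RequestResponseAdvisor advisor : chain) {
            prompt = advisor.adviseRequest(prompt, context);
        }
        ChatResponse response = new ChatResponse("answer to: " + prompt.text()); // stub model
        for (RequestResponseAdvisor advisor : chain) {
            response = advisor.adviseResponse(response, context);
        }
        return response;
    }

    public static void main(String[] args) {
        ChatResponse response = run(List.of(new ContextAdvisor()), List.of(new TaggingAdvisor()), new Prompt("hi"));
        System.out.println(response.text()); // answer to: hi [context] (grounded)
    }
}
```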
Streaming Implementation #
Streaming is triggered by calling stream() instead of call() on the request spec.
Flux<ChatResponse> flux = chatClient.prompt().user("...").stream().chatResponse();
Internally, DefaultChatClient delegates to ChatModel.stream(prompt). The returned Flux<ChatResponse> emits partial responses as tokens arrive. The advisor chain is applied per‑item, allowing real‑time filtering or augmentation.
The spec returned by stream() also offers content(), which returns a Flux<String> of the text content and simplifies common use cases.
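Per-item advisor processing can be approximated without Reactor. In the sketch below, java.util.stream.Stream is used purely as a stand-in for Flux, which is not part of the JDK. Both are lazy pipelines, but Stream is synchronous and pull-based, so this shows only the shape of the per-chunk transformation and not reactive back-pressure. All names are hypothetical.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: Stream stands in for Reactor's Flux; chunks are plain strings.
final class StreamingSketch {

    // Applies every post-processing step to each chunk as it passes through.
    static Stream<String> advise(Stream<String> chunks, List<UnaryOperator<String>> postProcessors) {
        Stream<String> result = chunks;
        for (UnaryOperator<String> step : postProcessors) {
            result = result.map(step);
        }
        return result;
    }

    public static void main(String[] args) {
        Stream<String> modelChunks = Stream.of("Spring ", "AI ", "streams ", "tokens");
        UnaryOperator<String> redact = chunk -> chunk.replace("AI", "**");
        String text = advise(modelChunks, List.of(redact)).collect(Collectors.joining());
        System.out.println(text); // Spring ** streams tokens
    }
}
```

A step that works per chunk cannot see text that spans two chunks, which is a general limitation of per-item filtering on partial responses.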
Tool Calling Integration #
ChatClient integrates tool calling through the builder’s defaultTools() and request‑spec tools() methods. Tools are registered as FunctionCallback beans.
When the request is built, the tool callbacks are added to the ChatOptions as functionCallbacks. The ChatModel implementation serializes them into the provider‑specific tool definitions.
After a response that contains a tool call, the framework (via the tool advisor or the agent runtime) executes the tool and feeds the result back into the conversation. This logic resides primarily in the ToolCallingAdvisor and Agent, which are higher‑level components; the ChatClient simply provides the tool list.
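The execute-and-feed-back loop can be sketched independently of any framework. Everything below is hypothetical: Reply models a response that is either final text or a request for a named tool, tools are plain functions keyed by name, and the round limit is an assumption added to keep the loop bounded.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a tool-execution loop; none of these types are Spring AI classes.
final class ToolLoopSketch {
    // A model reply is either final text or a request to run a named tool with one argument.
    record Reply(String text, String toolName, String toolArgument) {
        boolean isToolCall() { return toolName != null; }
    }

    static String converse(Function<String, Reply> model, Map<String, Function<String, String>> tools,
                           String userText, int maxRounds) {
        String input = userText;
        for (int round = 0; round < maxRounds; round++) {
            Reply reply = model.apply(input);
            if (!reply.isToolCall()) {
                return reply.text();
            }
            Function<String, String> tool = tools.get(reply.toolName());
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + reply.toolName());
            }
            // Feed the tool result back to the model as the next input.
            input = "tool " + reply.toolName() + " returned: " + tool.apply(reply.toolArgument());
        }
        throw new IllegalStateException("No final answer after " + maxRounds + " rounds");
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> tools = Map.of("weather", city -> "18C in " + city);
        // Stub model: asks for the tool once, then answers from the tool result.
        Function<String, Reply> stubModel = input -> input.startsWith("tool ")
                ? new Reply("Answer based on: " + input, null, null)
                : new Reply(null, "weather", "Paris");
        System.out.println(converse(stubModel, tools, "Weather in Paris?", 3));
        // prints: Answer based on: tool weather returned: 18C in Paris
    }
}
```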
Structured Output Support #
The ChatClientResponseSpec provides an entity(Class<T>) method that converts the response text to a typed Java object.
MyBean bean = chatClient.prompt().user("...").call().entity(MyBean.class);
Under the hood, it uses a StructuredOutputConverter. The converter expects a JSON string from the model and deserializes it. If the model returns plain text, the converter can attempt to extract JSON. The converter is selected based on the target type and registered through auto‑configuration.
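One simple way to extract JSON from a reply that wraps it in prose is to take the text between the first opening brace and the last closing brace. The helper below is a hypothetical illustration of that idea only. It assumes at most one top-level object, performs no validation, and leaves deserialization to whatever JSON library the application uses.

```java
// Hypothetical helper showing one way to pull a JSON object out of a chatty model reply.
// It assumes the reply contains at most one top-level object and does no JSON validation.
final class JsonExtractSketch {

    static String extractJsonObject(String reply) {
        int start = reply.indexOf('{');
        int end = reply.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("No JSON object found in reply");
        }
        return reply.substring(start, end + 1);
    }

    public static void main(String[] args) {
        String reply = "Sure, here it is: {\"name\": \"Ada\", \"age\": 36} Anything else?";
        System.out.println(extractJsonObject(reply)); // {"name": "Ada", "age": 36}
    }
}
```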
Memory Integration #
Memory is not built into ChatClient directly; instead, it is provided by a memory advisor such as MessageChatMemoryAdvisor. This advisor adds previous conversation messages to the prompt and stores new messages after the response.
The ChatClient builder’s defaultAdvisors() method can accept the memory advisor, making it part of the chain. The advisor uses a ChatMemory bean to persist state.
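The read-before, write-after behavior of a memory advisor can be sketched with an in-memory map keyed by conversation id. The class below is a hypothetical illustration with messages reduced to strings. The conversation id parameter is an assumption of the sketch, and a real store would also need eviction and thread safety.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory conversation store and advisor logic; not Spring AI classes.
final class MemorySketch {
    private final Map<String, List<String>> history = new HashMap<>();

    // Before the call: prepend stored messages for this conversation to the new user message.
    List<String> adviseRequest(String conversationId, String userText) {
        List<String> prompt = new ArrayList<>(history.getOrDefault(conversationId, List.of()));
        prompt.add("user: " + userText);
        return prompt;
    }

    // After the call: store both the user message and the assistant reply.
    void adviseResponse(String conversationId, String userText, String reply) {
        List<String> messages = history.computeIfAbsent(conversationId, id -> new ArrayList<>());
        messages.add("user: " + userText);
        messages.add("assistant: " + reply);
    }

    String chat(String conversationId, String userText, Function<List<String>, String> model) {
        String reply = model.apply(adviseRequest(conversationId, userText));
        adviseResponse(conversationId, userText, reply);
        return reply;
    }

    public static void main(String[] args) {
        MemorySketch memory = new MemorySketch();
        // Stub model: reports how many messages the prompt contained.
        Function<List<String>, String> stubModel = prompt -> "saw " + prompt.size() + " messages";
        System.out.println(memory.chat("c1", "Hi", stubModel));        // saw 1 messages
        System.out.println(memory.chat("c1", "And again", stubModel)); // saw 3 messages
    }
}
```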
Relationship with ChatModel #
| Aspect | ChatClient | ChatModel |
|---|---|---|
| Purpose | User‑facing API, orchestrator | Low‑level model abstraction |
| Responsibilities | Prompt building, advisor chain, tool registration, streaming, structured output | Communicating with LLM providers |
| Dependencies | ChatModel, advisors, tools | Provider‑specific adapters |
| Extension points | Advisors | New provider implementations |
| Called by | Application code | ChatClient (via advisor chain) |
ChatClient is the facade; ChatModel is the strategy.
Design Patterns Used #
- Facade Pattern – ChatClient provides a simplified interface to the complex subsystem of prompts, advisors, and models.
- Builder Pattern – ChatClient.Builder and ChatClientRequestSpec separate construction from representation.
- Strategy Pattern – ChatModel is the strategy interface; provider adapters are concrete strategies.
- Adapter Pattern – The advisor chain wraps the model, adapting its call/response interface.
- Template Method – The DefaultChatClient defines the skeleton of the request lifecycle, delegating steps to advisors and the model.
- Dependency Injection – All dependencies (model, advisors, tools) are injected through the builder, making the system highly configurable.
Extension Points #
Developers can customize ChatClient behavior by:
- Registering custom advisors – Implement RequestResponseAdvisor and add it via the builder or per request.
- Overriding the default system message – Set a default system prompt in the builder.
- Providing a custom ChatModel – Implement the interface and supply it to the builder.
- Adding default tools – Use FunctionCallback beans to register tools globally.
- Customizing the response converter – Provide a StructuredOutputConverter for a specific type.
All of these extensions can be defined as standard Spring beans and wired into the builder.
Performance Considerations #
- Object creation – Each request creates a new Prompt, ChatClientRequest, and ChatClientResponse. These are lightweight and quickly garbage collected.
- Immutability – Prompt and ChatOptions are immutable, making the advisor chain safe for concurrent requests without synchronization.
- Streaming efficiency – Streaming avoids buffering the entire response in memory; the reactive Flux respects back‑pressure.
- Thread safety – DefaultChatClient is stateless; the same instance can be shared across threads.
- Advisor cost – Advisors run synchronously; long‑running logic (e.g., external service calls) should be done asynchronously or moved to a tool call.
Source Code Reading Guide #
To understand the internals, start with these files in order:
- ChatClient.java – Understand the public API and the Builder interface.
- DefaultChatClient.java – The implementation: how the builder constructs the client, how DefaultChatClientRequestSpec and DefaultChatClientResponseSpec work.
- AdvisedChatModel.java – How advisors are applied around the model call.
- ChatClientRequest.java – The immutable request object passed through the advisor chain.
- ToolCallingAdvisor.java – How tools are injected and executed.
- MemoryAdvisor.java – How conversation memory is managed.
Unit tests in spring-ai-core/src/test/java/org/springframework/ai/chat/client/ demonstrate the behavior and are excellent for step‑through debugging.
Related Source Code Guides #
After mastering ChatClient, continue to:
- Prompt Source Code Analysis – The message model that flows through the client.
- ChatModel Source Code Analysis – The model interface and provider adapters.
- ChatResponse Source Code Analysis – How model output is normalized.
- Advisor Source Code Analysis – Deep dive into the interceptor chain.
- Tool Calling Source Code Analysis – How FunctionCallback bridges AI and Java.
- Memory Source Code Analysis – Conversation history management.
- Streaming Source Code Analysis – Reactive streaming internals.
- Structured Output Source Code Analysis – Converting AI text to Java objects.
Summary #
ChatClient is the central orchestrator of Spring AI’s conversational capabilities. It encapsulates the complexity of prompt assembly, advisor chains, tool registration, and streaming behind a clean, immutable builder‑driven facade. Its source code demonstrates how to apply classic enterprise patterns—Facade, Builder, Strategy, and Adapter—to a modern AI framework. Understanding ChatClient is the gateway to mastering the entire Spring AI architecture.
Proceed next to Prompt Source Code Analysis to see how the messages themselves are structured.