Spring AI ChatClient Source Code Analysis

ChatClient is the primary entry point for conversational AI interactions in Spring AI. It provides a fluent, builder‑driven API that hides the complexity of prompt assembly, advisor chains, model invocation, and streaming. Understanding its source code reveals how Spring AI achieves portability, extensibility, and a consistent programming model across different LLM providers.

This chapter dissects ChatClient from the inside out: its architecture, builder pattern, fluent API design, request execution lifecycle, and integration with every other major Spring AI component. By the end, you will be able to trace a request from an application call through the entire framework and back.

What Is ChatClient?

ChatClient is a facade that simplifies the process of building a prompt, applying cross‑cutting concerns, invoking a model, and processing the response. Its design goals are:

Fluent API – Enable a readable, chainable style for constructing AI requests.
Separation of concerns – Decouple prompt preparation, advisor processing, and model invocation.
Portability – Work uniformly with any ChatModel implementation.
Extensibility – Allow custom advisors to intercept and modify requests and responses.

Compared to calling ChatModel directly, ChatClient adds:

Automatic prompt message assembly (system, user, function results).
Advisor chain execution before and after the model call.
Unified support for streaming and synchronous calls.
Integration with tool calling, memory, and structured output converters.

ChatClient in Spring AI Architecture

ChatClient sits at the top of the runtime stack, between the application code and the model abstraction.

graph TD
    App["Application"]
    Client["ChatClient (DefaultChatClient)"]
    Prompt["Prompt (Messages + Options)"]
    AdvisorChain["Advisor Chain<br/>(RequestResponseAdvisor)"]
    Model["ChatModel"]
    Provider["Provider Adapter"]
    LLM["LLM API"]
    App --> Client
    Client --> Prompt
    Client --> AdvisorChain
    AdvisorChain --> Model
    Model --> Provider
    Provider --> LLM

Application – Obtains a ChatClient instance through ChatClient.Builder and calls prompt().user("...").call().
ChatClient – The facade. It holds an immutable configuration (default system message, advisors, model, tool callbacks) and orchestrates the lifecycle.
Prompt – Constructed on the fly from the user input and the client’s defaults.
Advisor Chain – Each advisor can modify the prompt before it reaches the model and the response before it is returned.
ChatModel – The portable interface that the advisor chain wraps; ultimately delegates to a provider adapter.
Provider Adapter – Translates to a vendor‑specific HTTP request.

Core Interfaces and Classes

Class / Interface	Responsibility
`ChatClient`	Public interface. Declares the `prompt()` method and the inner `Builder` interface.
`DefaultChatClient`	The only implementation. Holds configuration, builds the prompt, executes the advisor chain.
`ChatClient.Builder`	Builder for constructing a `ChatClient`. Allows setting defaults and injecting dependencies.
`ChatClientRequestSpec`	Fluent spec for building a request: adding messages, options, tools, advisors.
`ChatClientResponseSpec`	Fluent spec for executing the request and processing the response (call, stream, entity).
`ChatResponse`	The normalized response from the model. Contains generations, token usage, and metadata.
`ChatModel`	The model abstraction that `DefaultChatClient` delegates to after the advisor chain.
`Prompt`	Immutable list of `Message` objects with optional `ChatOptions`.

DefaultChatClient is the central orchestrator. It implements both ChatClient and the logic behind the request/response spec interfaces.

Source Code Structure

The relevant packages are:

org.springframework.ai.chat.client
├── ChatClient.java
├── DefaultChatClient.java
├── ChatClientRequest.java
├── ChatClientResponse.java
├── Advisor.java (RequestResponseAdvisor)
├── observation
│   ├── ChatClientObservationConvention.java
│   └── ...
└── advice
    ├── AbstractChatClientAdvisor.java
    └── ...

DefaultChatClient is the core; it contains inner classes that implement ChatClientRequestSpec and ChatClientResponseSpec. The advisor chain is built from RequestResponseAdvisor beans injected into the builder.

ChatClient Builder Pattern

The builder is the only way to create a ChatClient instance.

classDiagram
    class ChatClient {
        <<interface>>
        +prompt() ChatClientRequestSpec
        +Builder builder()
    }
    class ChatClient.Builder {
        <<interface>>
        +defaultSystem(String) Builder
        +defaultAdvisors(RequestResponseAdvisor...) Builder
        +defaultTools(FunctionCallback...) Builder
        +build() ChatClient
    }
    class DefaultChatClient {
        -ChatModel chatModel
        -List~RequestResponseAdvisor~ advisors
        -String defaultSystem
        -ToolRegistry toolRegistry
    }
    ChatClient.Builder <|.. DefaultChatClient.DefaultBuilder
    ChatClient <|.. DefaultChatClient

The DefaultBuilder stores configuration in fields and creates an immutable DefaultChatClient instance in build(). A builder can be obtained from the static factory method ChatClient.builder(chatModel) or injected as a prototype‑scoped bean. In a Spring Boot application, auto‑configuration provides a pre‑configured builder wired with the available ChatModel, advisors, and tool callbacks.

Key aspect: DefaultChatClient holds only immutable configuration and no per‑request state, which makes it thread‑safe; all per‑request state is encapsulated in the ChatClientRequest object.
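The relationship between a mutable builder and the immutable client it produces can be illustrated with a minimal sketch. The types below (SketchModel, SketchClient, and its nested Builder) are hypothetical, string-based stand-ins written for this illustration; they are not the Spring AI classes and they ignore messages, options, and tools.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for ChatModel: maps a prompt string to a reply.
interface SketchModel {
    String call(String prompt);
}

// Hypothetical stand-in for DefaultChatClient: every field is final.
final class SketchClient {
    private final SketchModel model;
    private final String defaultSystem;
    private final List<UnaryOperator<String>> advisors;

    private SketchClient(Builder b) {
        this.model = b.model;
        this.defaultSystem = b.defaultSystem;
        this.advisors = List.copyOf(b.advisors); // unmodifiable snapshot of the builder state
    }

    static Builder builder(SketchModel model) {
        return new Builder(model);
    }

    // Per-request state lives in local variables only, so one instance can be shared.
    String ask(String user) {
        String prompt = defaultSystem.isEmpty() ? user : defaultSystem + "\n" + user;
        for (UnaryOperator<String> advisor : advisors) {
            prompt = advisor.apply(prompt);
        }
        return model.call(prompt);
    }

    // Mutable builder: collects configuration, then hands it to the client constructor.
    static final class Builder {
        private final SketchModel model;
        private String defaultSystem = "";
        private final List<UnaryOperator<String>> advisors = new ArrayList<>();

        private Builder(SketchModel model) { this.model = model; }

        Builder defaultSystem(String text) { this.defaultSystem = text; return this; }

        Builder defaultAdvisor(UnaryOperator<String> advisor) { advisors.add(advisor); return this; }

        SketchClient build() { return new SketchClient(this); }
    }
}
```

Because build() copies the advisor list, later changes to the builder do not leak into clients that were already created.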

Fluent API Design

The fluent API is split into two phases:

Request specification – ChatClientRequestSpec
Response handling – ChatClientResponseSpec

chatClient.prompt()                // returns ChatClientRequestSpec
    .user("What is Spring AI?")   // add user message
    .system("You are helpful.")   // add system message
    .advisors(myAdvisor)          // attach request‑specific advisors
    .options(ChatOptions.builder().temperature(0.7).build())
    .call()                       // returns ChatClientResponseSpec
    .chatResponse();              // synchronous execution

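Each method in the request spec returns the spec itself, which enables chaining. The call() method returns the response spec and selects synchronous execution; the model is actually invoked when a terminal method such as chatResponse(), content(), or entity() is called. At that point the framework builds a Prompt object from the collected messages and options.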

The DefaultChatClient uses two inner classes to implement the specs:

DefaultChatClientRequestSpec – Collects messages, options, and callbacks.
DefaultChatClientResponseSpec – Holds the executor logic for call(), stream(), and entity().

This design separates construction from execution, allowing the same request spec to be executed multiple times.
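The two-phase split can be sketched in plain Java. RequestSpecSketch and ResponseSpecSketch are hypothetical names invented for this example, and the model is reduced to a function from a message list to a string. The sketch assumes that the model is invoked only by the terminal method (content() here), not by call() itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical request spec: accumulates messages and returns itself for chaining.
final class RequestSpecSketch {
    private final Function<List<String>, String> model; // stand-in for ChatModel
    private final List<String> messages = new ArrayList<>();

    RequestSpecSketch(Function<List<String>, String> model) { this.model = model; }

    RequestSpecSketch system(String text) { messages.add("system: " + text); return this; }

    RequestSpecSketch user(String text) { messages.add("user: " + text); return this; }

    // Phase switch: snapshot the messages, but do not invoke the model yet.
    ResponseSpecSketch call() { return new ResponseSpecSketch(model, List.copyOf(messages)); }
}

// Hypothetical response spec: the terminal method performs the model call.
final class ResponseSpecSketch {
    private final Function<List<String>, String> model;
    private final List<String> prompt;

    ResponseSpecSketch(Function<List<String>, String> model, List<String> prompt) {
        this.model = model;
        this.prompt = prompt;
    }

    String content() { return model.apply(prompt); }
}
```

Since call() takes an immutable snapshot, the same request spec can produce several response specs, each of which runs independently.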

Request Execution Lifecycle

The following sequence diagram shows a synchronous call() execution:

sequenceDiagram
    participant App
    participant ReqSpec as ChatClientRequestSpec
    participant Client as DefaultChatClient
    participant Advisor as Advisor Chain
    participant Model as ChatModel
    App->>ReqSpec: user("...").call()
    ReqSpec->>Client: create ChatClientRequest
    Client->>Advisor: adviseRequest(prompt)
    Advisor-->>Client: modified prompt
    Client->>Model: call(modified prompt)
    Model-->>Client: ChatResponse
    Client->>Advisor: adviseResponse(response)
    Advisor-->>Client: modified response
    Client-->>App: ChatResponse

Request assembly – The ChatClientRequestSpec collects messages, options, and any request‑specific tools. It also merges the client’s default system message and default advisors.
Advisor pre‑processing – The advisor chain processes the prompt before the model sees it. Each advisor receives the current Prompt and can return a new one. This is where RAG context is injected, requests are logged, and similar cross‑cutting work happens.
Model invocation – The final Prompt is sent to the ChatModel. The implementation uses the ChatModel bean injected during client construction.
Advisor post‑processing – Advisors inspect the ChatResponse. They can modify it, record metrics, or trigger side‑effects.
Response return – The final ChatResponse is returned to the application.

For streaming, the model returns a Flux<ChatResponse>, and the advisor post‑processing is applied to each emitted item.
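The five steps of the synchronous path can be written out as one method. This is a simplified sketch under stated assumptions: LifecycleAdvisor and LifecycleSketch are hypothetical names, prompts and responses are plain strings, and the model is a UnaryOperator. It is not how DefaultChatClient is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical advisor with a pre-processing and a post-processing hook.
interface LifecycleAdvisor {
    default String adviseRequest(String prompt) { return prompt; }
    default String adviseResponse(String response) { return response; }
}

final class LifecycleSketch {
    // Runs the five lifecycle steps for one synchronous call.
    static String execute(String defaultSystem,
                          String userText,
                          List<LifecycleAdvisor> defaultAdvisors,
                          List<LifecycleAdvisor> requestAdvisors,
                          UnaryOperator<String> model) {
        // 1. Request assembly: merge client defaults with request-specific input.
        String prompt = "system: " + defaultSystem + "\nuser: " + userText;
        List<LifecycleAdvisor> chain = new ArrayList<>(defaultAdvisors);
        chain.addAll(requestAdvisors);

        // 2. Advisor pre-processing: each advisor may return a new prompt.
        for (LifecycleAdvisor advisor : chain) {
            prompt = advisor.adviseRequest(prompt);
        }

        // 3. Model invocation.
        String response = model.apply(prompt);

        // 4. Advisor post-processing: each advisor may return a new response.
        for (LifecycleAdvisor advisor : chain) {
            response = advisor.adviseResponse(response);
        }

        // 5. Response return.
        return response;
    }
}
```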

Integration with Prompt

The ChatClientRequestSpec builds a Prompt from the accumulated messages and options. The default system message is prepended if present.

Internally, the spec uses a PromptTemplate when a templated string is provided (e.g., user(u -> u.text("Hello {name}").param("name", "World"))). The template is resolved before building the prompt. The resulting Prompt is immutable and passed to the advisor chain.
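Placeholder resolution of this kind can be approximated with a few lines of standard Java. The helper below, TemplateSketch.render, is a hypothetical illustration based on a regular expression; it is not the mechanism PromptTemplate uses and it supports only simple {name} placeholders.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    // Matches placeholders such as {name}; the word characters form the parameter key.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {key} with the matching map value and fails fast on a missing key.
    static String render(String template, Map<String, Object> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            if (!params.containsKey(key)) {
                throw new IllegalArgumentException("Missing template parameter: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being read as regex syntax.
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(params.get(key))));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Resolving the template before the prompt is built means that advisors and the model only ever see the final text.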

Advisor Chain Integration
#

Advisors are the primary extension mechanism. The chain is constructed from:

The client’s default advisors (set in the builder).
Request‑scoped advisors (added via advisors(...) on the spec).

The DefaultChatClient wraps the model in an AdvisedChatModel that iterates over the advisor list. Each advisor implements RequestResponseAdvisor with two methods:

Prompt adviseRequest(Prompt prompt, Map<String, Object> context)
ChatResponse adviseResponse(ChatResponse response, Map<String, Object> context)

The advisors are called in the order they were registered. The context map carries request‑specific information like tool callbacks and the ChatClient instance itself.
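The role of the context map is easiest to see in an advisor that writes a value during the request phase and reads it back during the response phase. The sketch below uses hypothetical, string-based types (ContextAdvisor, PromptSizeAdvisor, AdvisorChainSketch) that only mirror the shape of the two methods listed above, and it assumes both phases run in registration order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical, string-based analogue of the two advisor methods.
interface ContextAdvisor {
    String adviseRequest(String prompt, Map<String, Object> context);
    String adviseResponse(String response, Map<String, Object> context);
}

// Records the prompt length before the call and reports it after the call.
final class PromptSizeAdvisor implements ContextAdvisor {
    @Override
    public String adviseRequest(String prompt, Map<String, Object> context) {
        context.put("promptChars", prompt.length());
        return prompt;
    }

    @Override
    public String adviseResponse(String response, Map<String, Object> context) {
        return response + " [promptChars=" + context.get("promptChars") + "]";
    }
}

final class AdvisorChainSketch {
    // Applies advisors in registration order around a single model call.
    static String run(String prompt, List<ContextAdvisor> advisors, UnaryOperator<String> model) {
        Map<String, Object> context = new HashMap<>(); // one fresh map per request
        for (ContextAdvisor advisor : advisors) {
            prompt = advisor.adviseRequest(prompt, context);
        }
        String response = model.apply(prompt);
        for (ContextAdvisor advisor : advisors) {
            response = advisor.adviseResponse(response, context);
        }
        return response;
    }
}
```

Creating the map inside run() keeps advisor state scoped to one request, so advisor instances themselves can stay stateless.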

Streaming Implementation

Streaming is triggered by calling stream() instead of call() on the request spec.

Flux<ChatResponse> flux = chatClient.prompt().user("...").stream().chatResponse();

Internally, DefaultChatClient delegates to ChatModel.stream(prompt). The returned Flux<ChatResponse> emits partial responses as tokens arrive. The advisor chain is applied per‑item, allowing real‑time filtering or augmentation.

The streaming response spec also offers content(), so stream().content() returns a Flux<String> of the text content, which simplifies common use cases.
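Per-item processing followed by projection to text can be sketched without a reactive library. The example below deliberately swaps Flux for java.util.stream.Stream, which is lazy but pull-based and synchronous, so it shows only the shape of the transformation and not asynchronous delivery or back-pressure. Chunk and StreamingSketch are hypothetical names.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical partial response: one chunk of generated text.
record Chunk(String text) {}

final class StreamingSketch {
    // Applies a per-item advisor to every chunk, then projects each chunk to its text,
    // loosely mirroring the step from a stream of responses to a stream of strings.
    static Stream<String> content(Stream<Chunk> modelStream, UnaryOperator<Chunk> perItemAdvisor) {
        return modelStream.map(perItemAdvisor).map(Chunk::text);
    }

    // Convenience for callers that want the full text once the stream has ended.
    static String joined(Stream<Chunk> modelStream, UnaryOperator<Chunk> perItemAdvisor) {
        return content(modelStream, perItemAdvisor).collect(Collectors.joining());
    }
}
```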

Tool Calling Integration

ChatClient integrates tool calling through the builder’s defaultTools() and request‑spec tools() methods. Tools are registered as FunctionCallback beans.

When the request is built, the tool callbacks are added to the ChatOptions as functionCallbacks. The ChatModel implementation serializes them into the provider‑specific tool definitions.

After a response that contains a tool call, the framework (via the tool advisor or the agent runtime) executes the tool and feeds the result back into the conversation. This logic resides primarily in the ToolCallingAdvisor and Agent, which are higher‑level components; the ChatClient simply provides the tool list.
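The execute-and-feed-back loop can be sketched independently of any framework. In the example below, ModelReply and ToolLoopSketch are hypothetical names, tools are plain functions from an argument string to a result string, and the conversation is a single growing string. It illustrates the control flow only, not the Spring AI implementation.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical model reply: either final text or a request to run a named tool.
record ModelReply(String text, String toolName, String toolArgs) {
    boolean isToolCall() { return toolName != null; }
}

final class ToolLoopSketch {
    // Calls the model, executes any requested tool, and feeds the result back
    // until the model produces a final answer or the round limit is reached.
    static String run(String userText,
                      Map<String, Function<String, String>> tools,
                      Function<String, ModelReply> model,
                      int maxRounds) {
        String conversation = "user: " + userText;
        for (int round = 0; round < maxRounds; round++) {
            ModelReply reply = model.apply(conversation);
            if (!reply.isToolCall()) {
                return reply.text();
            }
            Function<String, String> tool = tools.get(reply.toolName());
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + reply.toolName());
            }
            // Append the tool result so the next model call can see it.
            conversation += "\ntool(" + reply.toolName() + "): " + tool.apply(reply.toolArgs());
        }
        throw new IllegalStateException("No final answer after " + maxRounds + " rounds");
    }
}
```

The round limit guards against a model that keeps requesting tools without ever producing a final answer.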

Structured Output Support

The ChatClientResponseSpec provides an entity(Class<T>) method that converts the response text to a typed Java object.

MyBean bean = chatClient.prompt().user("...").call().entity(MyBean.class);

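Under the hood, it uses a StructuredOutputConverter. The converter contributes format instructions that are appended to the prompt, asking the model to reply with JSON that matches the target type, and then deserializes the returned text. If the model wraps the JSON in surrounding prose, the converter can attempt to extract the JSON portion first. The converter is chosen from the target type; a custom StructuredOutputConverter can also be passed to entity(...) directly.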
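A converter of this kind has two jobs: telling the model which format to produce and turning the reply into an object. The sketch below shows both with hypothetical types (City, CityConverterSketch). To stay dependency-free it reads two flat fields with regular expressions, which is an assumption made for brevity; a real converter would rely on a JSON library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical target type for the conversion.
record City(String name, int population) {}

// Hypothetical converter: supplies format instructions and parses the reply.
final class CityConverterSketch {
    private static final Pattern NAME = Pattern.compile("\"name\"\\s*:\\s*\"([^\"]*)\"");
    private static final Pattern POPULATION = Pattern.compile("\"population\"\\s*:\\s*(\\d+)");

    // Text appended to the prompt so the model is asked for matching JSON.
    String format() {
        return "Reply only with JSON of the form {\"name\": string, \"population\": integer}.";
    }

    // Extracts the outermost {...} span, tolerating prose around it, then reads the fields.
    City convert(String modelText) {
        int start = modelText.indexOf('{');
        int end = modelText.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("No JSON object found in model output");
        }
        String json = modelText.substring(start, end + 1);
        Matcher name = NAME.matcher(json);
        Matcher population = POPULATION.matcher(json);
        if (!name.find() || !population.find()) {
            throw new IllegalArgumentException("Missing expected fields in: " + json);
        }
        return new City(name.group(1), Integer.parseInt(population.group(1)));
    }
}
```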

Memory Integration

Memory is not built into ChatClient directly; instead, it is provided by an advisor (MemoryAdvisor). This advisor adds previous conversation messages to the prompt and stores new messages after the response.

The ChatClient builder’s defaultAdvisors() method can accept the MemoryAdvisor, making it part of the chain. The advisor uses a ConversationMemory bean to persist state.
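The two halves of a memory advisor, prepending history before the call and storing the exchange after it, can be sketched as follows. MemorySketch is a hypothetical class that keeps messages as strings in an in-memory map keyed by a conversation id; the key and the storage strategy are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory conversation store with advisor-style hooks.
final class MemorySketch {
    private final Map<String, List<String>> history = new HashMap<>();

    // Request phase: prepend earlier messages of this conversation to the new user message.
    synchronized List<String> adviseRequest(String conversationId, String userMessage) {
        List<String> prompt = new ArrayList<>(history.getOrDefault(conversationId, List.of()));
        prompt.add("user: " + userMessage);
        return prompt;
    }

    // Response phase: store the new exchange so the next request can see it.
    synchronized void adviseResponse(String conversationId, String userMessage, String assistantReply) {
        List<String> messages = history.computeIfAbsent(conversationId, id -> new ArrayList<>());
        messages.add("user: " + userMessage);
        messages.add("assistant: " + assistantReply);
    }
}
```

Keeping the state in the advisor's store rather than in the client is what allows the client itself to remain shareable across conversations.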

Relationship with ChatModel

Aspect	ChatClient	ChatModel
Purpose	User‑facing API, orchestrator	Low‑level model abstraction
Responsibilities	Prompt building, advisor chain, tool registration, streaming, structured output	Communicating with LLM providers
Dependencies	ChatModel, advisors, tools	Provider‑specific adapters
Extension points	Advisors	New provider implementations
Called by	Application code	ChatClient (via advisor chain)

ChatClient is the facade; ChatModel is the strategy.

Design Patterns Used

Facade Pattern – ChatClient provides a simplified interface to the complex subsystem of prompts, advisors, and models.
Builder Pattern – ChatClient.Builder and ChatClientRequestSpec separate construction from representation.
Strategy Pattern – ChatModel is the strategy interface; provider adapters are concrete strategies.
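Adapter Pattern – Provider adapters translate the portable ChatModel contract into vendor‑specific requests. The advisor chain, which wraps the model call behind the same call/response interface, is closer to a Chain of Responsibility (interceptor) arrangement.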
Template Method – The DefaultChatClient defines the skeleton of the request lifecycle, delegating steps to advisors and the model.
Dependency Injection – All dependencies (model, advisors, tools) are injected through the builder, making the system highly configurable.

Extension Points

Developers can customize ChatClient behavior by:

Registering custom advisors – Implement RequestResponseAdvisor and add it via the builder or per request.
Overriding the default system message – Set a default system prompt in the builder.
Providing a custom ChatModel – Implement the interface and supply it to the builder.
Adding default tools – Use FunctionCallback beans to register tools globally.
Customizing the response converter – Provide a StructuredOutputConverter for a specific type.

Most of these extensions can be declared as standard Spring beans and injected into the builder; request‑scoped advisors and tools are passed directly on the request spec.

Performance Considerations

Object creation – Each request creates a new Prompt, ChatClientRequest, and ChatClientResponse. These are lightweight and quickly garbage collected.
Immutability – Prompt and ChatOptions are immutable, making the advisor chain safe for concurrent requests without synchronization.
Streaming efficiency – Streaming avoids buffering the entire response in memory; the reactive Flux respects back‑pressure.
Thread safety – DefaultChatClient is stateless; the same instance can be shared across threads.
Advisor cost – Advisors run synchronously; long‑running logic (e.g., external service calls) should be done asynchronously or moved to a tool call.
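The thread-safety point can be demonstrated with a small sketch in which several worker threads share one client. SharedClientSketch is a hypothetical class with final fields only; the pool size of four is an arbitrary choice for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical immutable client: final fields only, no per-request state on the instance.
final class SharedClientSketch {
    private final String systemText;
    private final UnaryOperator<String> model;

    SharedClientSketch(String systemText, UnaryOperator<String> model) {
        this.systemText = systemText;
        this.model = model;
    }

    String ask(String userText) {
        String prompt = systemText + " " + userText; // per-request state is a local variable
        return model.apply(prompt);
    }

    // Submits one request per input from a thread pool, all through the same client instance.
    static List<String> askAll(SharedClientSketch client, List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                futures.add(pool.submit(() -> client.ask(input)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // results are collected in submission order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Request failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

No synchronization is needed inside ask() because nothing on the instance is mutated; this only holds as long as the supplied model function is itself safe to call concurrently.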

Source Code Reading Guide

To understand the internals, start with these files in order:

ChatClient.java – Understand the public API and the Builder interface.
DefaultChatClient.java – The implementation: how the builder constructs the client, how DefaultChatClientRequestSpec and DefaultChatClientResponseSpec work.
AdvisedChatModel.java – How advisors are applied around the model call.
ChatClientRequest.java – The immutable request object passed through the advisor chain.
ToolCallingAdvisor.java – How tools are injected and executed.
MemoryAdvisor.java – How conversation memory is managed.

Unit tests in spring-ai-core/src/test/java/org/springframework/ai/chat/client/ demonstrate the behavior and are excellent for step‑through debugging.

Related Source Code Guides

After mastering ChatClient, continue to:

Prompt Source Code Analysis – The message model that flows through the client.
ChatModel Source Code Analysis – The model interface and provider adapters.
ChatResponse Source Code Analysis – How model output is normalized.
Advisor Source Code Analysis – Deep dive into the interceptor chain.
Tool Calling Source Code Analysis – How FunctionCallback bridges AI and Java.
Memory Source Code Analysis – Conversation history management.
Streaming Source Code Analysis – Reactive streaming internals.
Structured Output Source Code Analysis – Converting AI text to Java objects.

Summary

ChatClient is the central orchestrator of Spring AI’s conversational capabilities. It encapsulates the complexity of prompt assembly, advisor chains, tool registration, and streaming behind a clean, immutable builder‑driven facade. Its source code demonstrates how to apply classic enterprise patterns—Facade, Builder, Strategy, and Adapter—to a modern AI framework. Understanding ChatClient is the gateway to mastering the entire Spring AI architecture.

Proceed next to Prompt Source Code Analysis to see how the messages themselves are structured.
