Build an AI Agent Using ADK
A short guide on using ADK to build an AI Agent.
11/28/2025 · 2 min read
What Is an AI Agent?
A large language model (LLM) is a static model: it learns patterns in language from training data and then probabilistically predicts the next token in a sequence. On its own, it can’t take real-world actions, such as calling an API, invoking a function, querying a database, or searching for real-time information. You can think of it as an intelligent “brain.” An AI agent is this brain given hands: it combines the reasoning capabilities of an LLM with the ability to act in the world through tools, code, and external systems. In other words, an AI agent can both think and act, helping people achieve their goals and complete tasks end to end.
High-Level Overview of ADK Architecture
ADK is an agent development kit built by Google that helps developers build AI agents more easily and quickly. At a high level, an AI agent in ADK follows a loop of events, sketched in code after the list:
1. Take the latest input (user message plus prior context).
2. Send it to the model.
3. Let the model either respond with final text or issue a structured tool call.
4. Execute the tool and capture the result.
5. Append that result back into the agent’s context.
6. Repeat until the model produces a final answer.
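Here is a minimal, framework-agnostic sketch of that loop in Python. The call_model helper is a hypothetical stand-in, not an ADK API; as described below, ADK’s Runner implements this loop for you.

```python
# A minimal sketch of the agentic loop. `call_model` is a hypothetical
# helper (not an ADK API); ADK's Runner implements this loop for you.

def run_agent_loop(history: list[dict], tools: dict) -> str:
    while True:
        # Steps 1-2: send the latest input plus prior context to the model.
        response = call_model(history, tool_signatures=list(tools))

        # Step 3: the model either answers with final text...
        if response["type"] == "text":
            return response["text"]  # Step 6: a final answer ends the loop.

        # ...or issues a structured tool call. Step 4: execute it.
        call = response["function_call"]
        result = tools[call["name"]](**call["args"])

        # Step 5: append the result to the context and repeat.
        history.append({"role": "tool", "name": call["name"], "result": result})
```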
In this architecture, the LLM is the brain. It consumes the current history (conversation, tool results, system instructions) and produces a response, which can be plain text or a structured tool call. The LLM's internal thought process is often exposed as a special structured text field, which is invaluable for debugging the agent's decision-making. The model is configured with a list of available tools (their signatures) and decides which one to call based on the user's goal.
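As a concrete example, here is how this brain-plus-tools configuration looks in the Python ADK, following the pattern from the ADK quickstart; the model name and the weather tool are illustrative.

```python
from google.adk.agents import Agent

# An illustrative tool. ADK derives the tool's schema for the model
# from the function's signature and docstring.
def get_weather(city: str) -> dict:
    """Returns a current weather report for the given city."""
    return {"status": "success", "report": f"It is sunny in {city}."}

# The agent pairs a model (the brain) with a list of tools (the hands).
root_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",  # illustrative model name
    instruction="Answer weather questions using the get_weather tool.",
    tools=[get_weather],
)
```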
Tools are the hands. They allow the Agent to overcome the LLM's limitations (e.g., retrieving real-time data, running code, accessing private APIs). When the LLM calls a function, the Tool executes the corresponding external command and returns the raw results.
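For instance, a tool that fetches real-time data the model could never know on its own might look like the following sketch; the endpoint and response shape are assumptions, and production code would add error handling.

```python
import json
import urllib.request

def get_btc_price(currency: str = "usd") -> dict:
    """Returns the current Bitcoin price in the given currency."""
    # Illustrative public endpoint; the raw result goes back to the LLM.
    url = (
        "https://api.coingecko.com/api/v3/simple/price"
        f"?ids=bitcoin&vs_currencies={currency}"
    )
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return {"status": "success", "price": data["bitcoin"][currency]}
```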
Runner is the orchestrator. It oversees the Agentic Loop and manages the entire invocation (a single turn of conversation); see the usage sketch after this list.
Event Management: Every action, from the user's input to the LLM's response to the tool's final result, is recorded as an Event. The Runner’s job is to read the latest event and decide the next step.
Execution Control (The Interrupt): When the Runner detects a function-call Event, it immediately interrupts the LLM's stream, executes the required external tool, collects the results, and submits them back to the LLM so it can continue.
History Management: The Runner actively manages the conversation history by appending new Events and, if configured, trimming or summarizing old context to keep the prompt size manageable.
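A minimal sketch of wiring up the Runner in the Python ADK, following the documented Runner and SessionService pattern; the app, user, and session IDs are illustrative, and root_agent is the agent defined earlier.

```python
import asyncio

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

async def main() -> None:
    # The session service stores the conversation's Events (see below).
    session_service = InMemorySessionService()
    await session_service.create_session(
        app_name="weather_app", user_id="user_1", session_id="session_1"
    )

    # The Runner orchestrates the Agentic Loop for the agent defined earlier.
    runner = Runner(
        agent=root_agent, app_name="weather_app", session_service=session_service
    )

    message = types.Content(role="user", parts=[types.Part(text="Weather in Paris?")])
    async for event in runner.run_async(
        user_id="user_1", session_id="session_1", new_message=message
    ):
        # Every step (model output, tool call, tool result) arrives as an Event.
        if event.is_final_response() and event.content:
            print(event.content.parts[0].text)

asyncio.run(main())
```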
Session is the file system: the durable record of the conversation. It stores every Event and the current state of the Agent for a specific user interaction. An Event is the basic unit of communication, representing something that happens during a session (a user message, an agent reply, a tool use); together, the Events form the conversation history.
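Continuing the sketch above, the stored Events can be read back from the session at any time, which is what makes the record durable:

```python
# Continuing inside main() above: fetch the stored session and walk its history.
session = await session_service.get_session(
    app_name="weather_app", user_id="user_1", session_id="session_1"
)
for event in session.events:
    # Each Event records who acted (user, agent, or tool) and what happened.
    print(event.author, event.content)
```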
Because the LLM streams its output in small fragments (tokens for text, argument chunks for function calls), the Aggregator acts as the Interpreter and Reassembler; a conceptual sketch follows the list below. Its role is to:
Stitch: Collect all fragmented tokens and arguments and reassemble them into complete, usable objects (like a full, valid FunctionCall command).
Order: Ensure content is delivered to the Runner in the correct sequence, especially when switching between text and structured data.
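Here is a conceptual sketch of the stitching step. This is not ADK's internal code; the fragment shape is invented purely to illustrate how streamed chunks become one complete function call.

```python
import json

def stitch_function_call(fragments: list[dict]) -> dict:
    """Reassemble streamed fragments into one complete function call.

    The fragment shape is invented for illustration; real streaming
    payloads vary by model API.
    """
    name, args_json = "", ""
    for fragment in fragments:  # fragments arrive in stream order
        name = fragment.get("name", name)            # the name arrives once
        args_json += fragment.get("args_delta", "")  # arguments arrive in chunks
    return {"name": name, "args": json.loads(args_json)}

# Three streamed chunks reassemble into one valid call object:
chunks = [
    {"name": "get_weather", "args_delta": '{"city": '},
    {"args_delta": '"Paris"'},
    {"args_delta": "}"},
]
print(stitch_function_call(chunks))  # {'name': 'get_weather', 'args': {'city': 'Paris'}}
```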
By combining these components, ADK provides a robust, reliable, and observable framework for building intelligent, self-directed AI agents.
