1. Introduction
AI Agents are evolving from standalone applications into interactive experiences embedded in web pages. The core challenge for frontend teams is how to integrate server-side large language model (LLM) reasoning safely and efficiently with client-specific operations in the browser (such as GPS, clipboard access, and DOM manipulation). This article uses two mainstream frontend frameworks, React (based on the AG-UI framework) and Vue3, as examples to illustrate how to embed an AI Agent frontend in a web page.
The content covers client-server collaboration principles, definition and execution of frontend tool functions, streaming output integration, secure communication, and multi-Agent isolation strategies. After reading this article, you will master:
- Key steps and configuration points for integrating React AG-UI Agent components
- Complete code implementation for Vue3 SSE streaming Agent integration
- Definition standards and secure execution mechanisms for frontend tool functions
- Using AbortController to manage user aborts of streaming requests
- Best practices for multi-Agent context isolation
2. Core Concept: Frontend Collaboration Pattern for Embedding AI Agent on the Web
2.1 Client-Server Collaboration Principle
The essence of the frontend solution for embedding an AI Agent on the web is a division of responsibilities between the client and the server. The Agent instance runs on the server and invokes the LLM for reasoning and decision-making; the frontend handles UI interaction and browser-specific operations.
The specific collaboration flow is as follows:
Server-side led reasoning: The user inputs a message on the frontend, which is sent to the server via an API request. The server maintains the Agent’s conversation context and is responsible for invoking the LLM to generate replies or decisions.
When tool calls require client capabilities: When the LLM’s decision involves browser-specific operations (such as getting location, accessing clipboard, manipulating DOM elements), the server does not send the final answer but instead sends a “tool call request” to the frontend.
Frontend executes and returns the result: After receiving the tool call request, the frontend executes the corresponding frontend function and sends the result back to the server via a protocol (e.g., JSON-RPC).
Server continues reasoning: The server uses the tool execution result as additional context and continues to invoke the LLM to generate the final reply.
Taking the Microsoft AG-UI framework as an example, the Agent instance is created on the server, but tool functions can be marked as “frontend execution” via AIFunctionFactory.Create. When the LLM’s decision requires client capabilities, the server sends the tool call ID and parameters to the client, and the frontend executes the tool and returns the result. This division leverages the server’s computing resources and protects API keys while retaining the frontend’s control over the browser environment.
In practical engineering, the server and frontend exchange data through a standard protocol (such as JSON-RPC). The frontend does not directly expose API keys or LLM configurations, thus ensuring security.
Note: This pattern requires the frontend and backend to agree on a unified communication protocol, including the format of tool call requests, parameter passing methods, result return formats, and error handling. It is recommended to define the interface specification early in the project to avoid repeated adjustments later.
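As an illustration of what such an interface specification could contain, here is a minimal TypeScript sketch of a JSON-RPC 2.0 style message pair. The method name tool_call and the params fields (agentId, tool, arguments) are assumptions made for this example, not a format defined by AG-UI.

```typescript
// Hypothetical wire format for frontend tool calls, modelled on JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  method: "tool_call";
  id: string; // tool call ID, echoed back in the response
  params: { agentId: string; tool: string; arguments: Record<string, unknown> };
}

interface ToolCallSuccess {
  jsonrpc: "2.0";
  id: string;
  result: unknown;
}

interface ToolCallFailure {
  jsonrpc: "2.0";
  id: string;
  error: { code: number; message: string };
}

type ToolCallResponse = ToolCallSuccess | ToolCallFailure;

// Runtime check for untrusted input arriving from the network.
function isToolCallRequest(value: unknown): value is ToolCallRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const p = v.params as Record<string, unknown> | undefined;
  return (
    v.jsonrpc === "2.0" &&
    v.method === "tool_call" &&
    typeof v.id === "string" &&
    typeof p === "object" && p !== null &&
    typeof p.agentId === "string" &&
    typeof p.tool === "string" &&
    typeof p.arguments === "object" && p.arguments !== null
  );
}

// Success and error replies always carry the ID of the request they answer.
function okResponse(req: ToolCallRequest, result: unknown): ToolCallResponse {
  return { jsonrpc: "2.0", id: req.id, result };
}

function errorResponse(req: ToolCallRequest, code: number, message: string): ToolCallResponse {
  return { jsonrpc: "2.0", id: req.id, error: { code, message } };
}
```

Agreeing on the error shape up front matters as much as the success shape, because the frontend must be able to report a denied permission or a failed call in a form the server can pass on to the LLM.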
2.2 Definition and Execution Mechanism of Frontend Tool Functions
Frontend tool functions are the bridge for Agent interaction with the browser environment. Unlike backend tools, frontend tools directly manipulate DOM, call browser APIs, and read local state, thus requiring stricter definition standards. In the AG-UI example, the definition of frontend tools must follow these principles:
Clear description and signature: Each tool function must include a [Description] attribute that explains the tool’s purpose, parameters, and trigger conditions to the LLM. For example, the description of GetUserLocation() is “Get the user’s current geographic location from the browser.” The LLM decides when to call the tool based on this description. If the description is vague or missing, the LLM may not invoke the tool correctly.
No side effects: Frontend tools should avoid modifying global state or performing irreversible operations (such as popups). If side effects are unavoidable, they should be designed asynchronously with a cancellation mechanism (e.g., AbortController) to allow users to interrupt execution.
Permission levels: Classify tools into different permission levels based on operation sensitivity. For example, reading screen brightness is a low-risk operation and can be executed directly; accessing the clipboard or sending emails is a medium-risk operation requiring user confirmation; manipulating DOM structure is a high-risk operation requiring server-side signature authorization.
In AG-UI, the code example for creating a frontend tool is as follows:
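On the browser side, the same three principles can be captured in a small tool definition. The following is a minimal TypeScript sketch, not AG-UI's API: FrontendTool, PermissionLevel, and gateByLevel are hypothetical names, and treating location access as medium risk is an assumption of this example.

```typescript
// Hypothetical shapes for illustration; none of these names come from AG-UI.
type PermissionLevel = "low" | "medium" | "high";

interface FrontendTool<A, R> {
  name: string;        // identifier the server uses in tool call requests
  description: string; // tells the LLM what the tool does and when to call it
  permission: PermissionLevel;
  run: (args: A, signal: AbortSignal) => Promise<R>;
}

// What must happen before a tool of a given level is allowed to run.
const gateByLevel: Record<PermissionLevel, "none" | "user_confirmation" | "server_signature"> = {
  low: "none",
  medium: "user_confirmation",
  high: "server_signature",
};

const getUserLocation: FrontendTool<Record<string, never>, string> = {
  name: "get_user_location",
  description: "Get the user's current geographic location from the browser.",
  permission: "medium", // assumption: location is confirmed by the user before use
  run: () =>
    new Promise<string>((resolve) => {
      // navigator is looked up dynamically so the sketch also loads outside a browser.
      const geo = (globalThis as any).navigator?.geolocation;
      if (!geo) {
        resolve("Geolocation is not available in this environment");
      } else {
        geo.getCurrentPosition(
          (pos: any) => resolve(`${pos.coords.latitude},${pos.coords.longitude}`),
          () => resolve("User denied location permission"),
        );
      }
    }),
};
```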
When the frontend receives a tool call request, it must safely execute the corresponding function and return the result to the server. During execution, all exceptions should be caught to ensure that even if the call fails, the Agent’s subsequent operation is not affected.
Tip: The execution result of a tool function affects the LLM’s subsequent reasoning. If the tool returns null or error information, the LLM may repeatedly call the same tool. It is recommended to design a fallback return value in the frontend tool, such as returning “User denied location permission” when location acquisition fails, instead of returning null.
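The exception handling and fallback value described here could look roughly like the following sketch. runToolSafely and ToolOutcome are hypothetical names introduced for illustration.

```typescript
// A result that is always a readable string for the LLM, never null or a thrown exception.
interface ToolOutcome {
  ok: boolean;
  content: string;
}

// Hypothetical helper: runs a tool handler and converts every failure into a fallback message.
async function runToolSafely(
  handler: () => Promise<unknown>,
  fallback: string,
): Promise<ToolOutcome> {
  try {
    const value = await handler();
    if (value === null || value === undefined) {
      // An empty result is replaced so the LLM gets an explanation instead of null.
      return { ok: false, content: fallback };
    }
    return { ok: true, content: typeof value === "string" ? value : JSON.stringify(value) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, content: `${fallback} (${reason})` };
  }
}
```

For example, wrapping a location tool with the fallback "User denied location permission" gives the LLM a clear reason to stop retrying instead of an empty result.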
3. Hands-on: React AG-UI Integration with Agent Dialogue and Frontend Tools
3.1 Setting Up AG-UI Client and Registering Frontend Tools
Integrating the AG-UI client into a React project requires the following steps:
Install dependencies: Add the AG-UI npm package and related type definitions to the project. Assuming the AG-UI React component library is used, install @agentuity/react and @agentuity/sdk by running npm install @agentuity/react @agentuity/sdk.
Initialize the client: Create an AG-UI client instance at the application entry point, configuring the server address and communication protocol. The client is usually managed as a singleton to avoid duplicate instantiation.
Register frontend tools: Define a handler function for each frontend tool and register it with the client. AG-UI’s React hook useAgent automatically handles the distribution of tool calls. Example code:
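Independently of the exact useAgent signature, registration and dispatch by tool name can be sketched in plain TypeScript. ToolDispatcher and its methods are hypothetical and only illustrate asynchronous handlers and exact name matching; the returned coordinates are placeholder data.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical registry keyed by tool name; not AG-UI's API.
class ToolDispatcher {
  private handlers = new Map<string, ToolHandler>();

  register(toolName: string, handler: ToolHandler): void {
    // Enforce lowercase snake_case so client and server names cannot drift in casing.
    if (!/^[a-z][a-z0-9]*(_[a-z0-9]+)*$/.test(toolName)) {
      throw new Error(`Tool name must be lowercase snake_case: ${toolName}`);
    }
    if (this.handlers.has(toolName)) {
      throw new Error(`Tool already registered: ${toolName}`);
    }
    this.handlers.set(toolName, handler);
  }

  // Lookup is an exact, case-sensitive match on the name sent by the server.
  async dispatch(toolName: string, args: Record<string, unknown>): Promise<unknown> {
    const handler = this.handlers.get(toolName);
    if (!handler) throw new Error(`Unknown tool: ${toolName}`);
    return handler(args);
  }
}

const dispatcher = new ToolDispatcher();
// Handlers are async, so the caller can await the result before replying to the server.
dispatcher.register("get_user_location", async () => "31.2304,121.4737");
```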
Key notes:
- Tool handler functions must be asynchronous functions returning a Promise. AG-UI waits for the Promise to resolve and automatically sends the result back to the server.
- The tool name (e.g., get_user_location) must match the function name used in the server-side AIFunctionFactory.Create. It is case-sensitive; it is recommended to use lowercase snake_case consistently.
3.2 Integrating Streaming Output and User Interaction
Agent replies are usually returned in a streaming manner, rendered piece by piece on the frontend to enhance user experience. AG-UI’s useAgent hook encapsulates the SSE connection internally, automatically handling streaming text and tool call requests.
Streaming rendering: The messages state returned by useAgent updates in real time as the server pushes data. Each message contains role (user/assistant/tool) and content. When rendering, simply iterate over the messages array and use a Markdown rendering component for assistant-type messages.
User abort: Call the abort function returned by useAgent to interrupt the current streaming request. This is implemented internally using AbortController. When the user clicks the “Stop” button, call abort(); the frontend disconnects the SSE connection and notifies the server to cancel the reasoning.
Tool call UI feedback: When the Agent calls a frontend tool, a message with role: 'tool' appears in messages. It is recommended to display the status of the tool call in the UI, for example, showing an intermediate prompt like “Getting your current location…” to let the user know the Agent is executing an operation.
Handling of the corresponding resources and network requests on abort:
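A minimal sketch of the abort path, assuming the reply arrives as an async iterable of text deltas. collectReply is a hypothetical helper, not part of AG-UI; AbortController and AbortSignal are the standard web APIs.

```typescript
// Consumes text deltas until the stream ends or the caller aborts.
async function collectReply(
  deltas: AsyncIterable<string>,
  signal: AbortSignal,
  onDelta: (textSoFar: string) => void,
): Promise<{ text: string; aborted: boolean }> {
  let text = "";
  for await (const delta of deltas) {
    // Returning from the loop also closes the underlying iterator.
    if (signal.aborted) return { text, aborted: true };
    text += delta;
    onDelta(text); // e.g. update the message being rendered
  }
  return { text, aborted: signal.aborted };
}

// Typical wiring (shown as comments because it needs a real page and server):
//   const controller = new AbortController();
//   stopButton.onclick = () => controller.abort();
//   fetch(url, { method: "POST", body, signal: controller.signal });
```

Passing the same signal to fetch makes the browser close the connection, while the check inside the loop keeps the UI from appending text that arrives after the user pressed "Stop".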
Note: The AbortController interrupts the request on the frontend side. If the server does not implement a cancellation mechanism, the server may continue reasoning and waste computing resources. It is recommended that the backend also listen for request cancellation events and actively terminate the LLM interface call.
4. Hands-on: Vue3 SSE Integration with Agent Streaming Output
4.1 Setting Up Vue3 Project and Implementing SSE Long Connection
In a Vue3 project, since AG-UI is currently mainly oriented towards React, it is common to manually build SSE or use fetch + ReadableStream to connect to the server. Using fetch + ReadableStream is recommended because traditional EventSource only supports GET requests and cannot customize headers, while Agent dialogue is usually a POST request.
Below is a complete Vue3 composable example:
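The parsing core of such a composable does not depend on Vue, so it can be sketched on its own. SseParser is a hypothetical name, and the sketch assumes the server separates events with a blank line and sends payloads in data: lines; a production parser would also need to handle \r\n line endings and the other SSE fields.

```typescript
// Incremental parser for "data: ..." SSE frames arriving in arbitrary byte chunks.
class SseParser {
  private decoder = new TextDecoder("utf-8");
  private buffer = "";

  // Feed one network chunk; returns the data payload of every event completed by it.
  push(chunk: Uint8Array): string[] {
    // stream: true keeps an incomplete multi-byte character until the next chunk.
    this.buffer += this.decoder.decode(chunk, { stream: true });
    const events: string[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      // Multiple data: lines in one event are joined with newlines.
      const data = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data.length > 0) events.push(data);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

In a composable, each chunk read from response.body.getReader() would be passed to push(), and each returned payload would be parsed (for example with JSON.parse) and appended to a reactive message list.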
Key notes:
The main advantage of using fetch + ReadableStream is support for POST requests and custom headers (such as authentication tokens). EventSource only supports GET requests, which cannot meet the needs of Agent dialogue scenarios.
The buffer variable handles chunked data. SSE messages pushed by the server may span multiple data chunks; buffer ensures complete concatenation before parsing.
After reading the response body stream, manual decoding is required; otherwise, multi-byte characters such as Chinese text may not be handled correctly. decoder.decode(value, { stream: true }) ensures multi-byte characters are not truncated at chunk boundaries.
4.2 Frontend Tool Callback and Secure Communication
When the server pushes a tool_call type message, the frontend needs to execute the corresponding tool function and send the result back to the server via another API endpoint. This is the most critical part of SSE integration, involving security issues of cross-domain communication.
Tool callback flow:
- Parse the tool_call message to get the tool name and arguments.
- Call the corresponding function based on the tool name. It is recommended to define tool functions in an object mapping for easy maintenance and permission control.
- Send the tool execution result back to the server via a PUT /agent/tool-result or similar endpoint.
- After receiving the result, the server continues to push subsequent messages via SSE.
Three dimensions of secure communication:
Tool whitelist: The frontend only executes tool functions that are in the whitelist, eliminating the risk of executing unknown functions. The whitelist can be sent by the server during project initialization and enabled after verification by the frontend.
HTTPS + signature: All communications use HTTPS encryption. When the server pushes a tool_call message, it includes a signature in the message body (e.g., using HMAC-SHA256 to sign the tool identifier and agentId). The frontend verifies the signature before execution. This guards against server impersonation or man-in-the-middle attacks. Note that verifying an HMAC requires the shared secret, which should not be shipped to the browser; if verification must happen on the client, an asymmetric signature checked against a public key is the safer choice.
User secondary confirmation: For high-risk operations (such as modifying the DOM or reading sensitive data), pop up a confirmation dialog before tool execution. A permission level check can be added in the tool_call handler to decide whether to pop up the dialog based on the level. For example, reading location requires only one user confirmation, but sending clipboard data requires confirmation each time.
Vue3 tool callback example:
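The callback flow and the whitelist and confirmation checks can be sketched without Vue-specific code. handleToolCall, ToolCallMessage, and CallbackDeps are hypothetical names, and the message fields are assumptions; signature verification is left out of this sketch, so medium- and high-level tools are both gated by user confirmation only.

```typescript
type Level = "low" | "medium" | "high";

interface ToolCallMessage {
  type: "tool_call";
  toolCallId: string;
  agentId: string;
  tool: string;
  arguments: Record<string, unknown>;
}

interface ToolEntry {
  level: Level;
  run: (args: Record<string, unknown>) => Promise<string>;
}

// Dependencies are injected; in a Vue3 app confirm could open a dialog
// and sendResult could call fetch() against the tool-result endpoint.
interface CallbackDeps {
  whitelist: Record<string, ToolEntry>;
  confirm: (question: string) => Promise<boolean>;
  sendResult: (body: { toolCallId: string; agentId: string; content: string }) => Promise<void>;
}

async function handleToolCall(msg: ToolCallMessage, deps: CallbackDeps): Promise<void> {
  // Every reply carries the toolCallId so the server can match it to the pending call.
  const reply = (content: string) =>
    deps.sendResult({ toolCallId: msg.toolCallId, agentId: msg.agentId, content });

  // Own-property check so inherited keys such as "constructor" never match.
  const entry = Object.prototype.hasOwnProperty.call(deps.whitelist, msg.tool)
    ? deps.whitelist[msg.tool]
    : undefined;
  if (!entry) return reply(`Tool not allowed: ${msg.tool}`);

  if (entry.level !== "low" && !(await deps.confirm(`Allow the assistant to run ${msg.tool}?`))) {
    return reply("User declined the operation");
  }
  try {
    return reply(await entry.run(msg.arguments));
  } catch (err) {
    return reply(`Tool failed: ${err instanceof Error ? err.message : String(err)}`);
  }
}
```

Every branch sends some result back, so the server never waits indefinitely for a tool call that the frontend refused or failed to run.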
Tip: Each tool call must carry a toolCallId, used by the server to associate the tool execution result with the LLM reasoning context. If the frontend fails to correctly match the toolCallId to the server request, the Agent may incorrectly associate the result with a different dialogue context. It is recommended to log the full chain of tool calls during the development phase for easier troubleshooting.
5. Advanced Techniques and Pitfall Documentation
5.1 Context Isolation in Multi-Agent Collaboration
When embedding multiple Agents (e.g., assistant, data analyst) on the same page, contexts must be isolated by agentId to avoid state interference. Each Agent instance has its own dialogue history, tool list, and session state.
Design points:
Independent instances: Each Agent uses an independent client instance or context object. In React, multiple useAgent calls or nested Providers can be used; in Vue3, create an independent useAgentStream instance for each Agent.
Communication isolation: The backend must have an agentId route, and the frontend carries the unique identifier of the corresponding Agent when sending requests. The server distinguishes and maintains their respective contexts accordingly.
Frontend routing proxy: If multiple Agents use the same backend entry point, the frontend must inject the agentId into the request path. The server routes to the corresponding Agent instance based on the agentId. For example, using a path like /api/agent/:agentId/stream.
Namespace tools: Different Agents may call the same tool simultaneously (e.g., get_user_location), but the execution results should belong to their respective contexts. The frontend must ensure that each Agent’s tool callback carries the correct agentId.
Common issue: If context isolation is not done, the tool result of Agent A might be used by Agent B, causing confusion in B’s reasoning. This is especially noticeable in high-concurrency or long-running scenarios. It is recommended to adopt a strict isolation scheme from the development stage to avoid later refactoring.
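One way to enforce this isolation on the client is to key every piece of state by agentId. The sketch below is a hypothetical AgentContextStore, not part of AG-UI or Vue; it uses the /api/agent/:agentId/stream path pattern mentioned above and refuses tool results that the given Agent did not request.

```typescript
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface AgentContext {
  messages: ChatMessage[];
  pendingToolCalls: Set<string>;
}

// One isolated context per agentId; nothing is shared between Agents.
class AgentContextStore {
  private contexts = new Map<string, AgentContext>();

  private contextFor(agentId: string): AgentContext {
    let ctx = this.contexts.get(agentId);
    if (!ctx) {
      ctx = { messages: [], pendingToolCalls: new Set<string>() };
      this.contexts.set(agentId, ctx);
    }
    return ctx;
  }

  // The agentId is injected into the request path.
  streamUrl(agentId: string): string {
    return `/api/agent/${encodeURIComponent(agentId)}/stream`;
  }

  openToolCall(agentId: string, toolCallId: string): void {
    this.contextFor(agentId).pendingToolCalls.add(toolCallId);
  }

  // Accept a result only if this Agent actually issued the call.
  acceptToolResult(agentId: string, toolCallId: string, content: string): boolean {
    const ctx = this.contextFor(agentId);
    if (!ctx.pendingToolCalls.delete(toolCallId)) return false;
    ctx.messages.push({ role: "tool", content });
    return true;
  }

  history(agentId: string): readonly ChatMessage[] {
    return this.contextFor(agentId).messages;
  }
}
```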
5.2 Common Pitfalls: Tool Function Side Effects and Timeout Handling
Common issues and strategies for frontend tool execution:
Side effects not undone: Tool functions may trigger UI operations (e.g., confirmation dialogs). If the user does not respond or cancels, the tool call is left pending. Solutions:
- All tool functions should be designed to be cancellable, using the AbortController or CancelToken pattern.
- Automatic rejection on timeout: Set a hard timeout (e.g., 30 seconds) for each tool call. After the timeout, the frontend automatically returns a timeout result.
Correct cancellation of AbortController:
- When the user clicks “Stop Reply”, both the SSE stream and any executing tool functions should be interrupted.
- In the tool_call handler, create a local AbortController for each tool call and link it with the main request’s signal. When the main request is interrupted, all child tool tasks should also be cancelled.
- Ensure that after a tool call completes, the frontend does not attempt to send results for a session whose SSE connection has already been interrupted; otherwise a TypeError: Failed to fetch can be triggered.
Client state recovery after server timeout: When the server disconnects due to long reasoning or an LLM API timeout, the client should automatically return to an interactive state. Specific practices:
- In the exception handling branch of useAgentStream, set isStreaming.value = false and clear any partially buffered data.
- Provide a “Retry” button for the user to resend the last message.
- Keep the content of the last user message in the messages array so that it can be resent on retry.
- The server should return the offset of the last processed message, and the client restores the context based on it. When retrying, the frontend includes this offset, and the server continues reasoning from the breakpoint.
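The hard timeout and the linked AbortController can be combined in one small wrapper. This is a sketch under assumptions: runToolWithLimits is a hypothetical helper, and the result strings "timeout" and "cancelled" are placeholders for whatever the agreed protocol expects.

```typescript
// Runs one tool with a hard timeout and cancels it when the main request is aborted.
async function runToolWithLimits(
  tool: (signal: AbortSignal) => Promise<string>,
  mainSignal: AbortSignal,
  timeoutMs: number,
): Promise<string> {
  if (mainSignal.aborted) return "cancelled";

  const local = new AbortController(); // one controller per tool call
  let settle: (reason: string) => void = () => {};
  const interrupted = new Promise<string>((resolve) => {
    settle = resolve;
  });

  // Settle first, then abort: the reason is fixed before the tool reacts to the signal.
  const stop = (reason: string) => {
    settle(reason);
    local.abort();
  };
  const onMainAbort = () => stop("cancelled");
  mainSignal.addEventListener("abort", onMainAbort, { once: true });
  const timer = setTimeout(() => stop("timeout"), timeoutMs);

  try {
    // Whichever settles first wins: the tool itself, the timeout, or the main abort.
    return await Promise.race([tool(local.signal), interrupted]);
  } finally {
    clearTimeout(timer);
    mainSignal.removeEventListener("abort", onMainAbort);
  }
}
```

A caller would pass the signal of the controller that owns the streaming request as mainSignal, so that "Stop Reply" cancels the stream and every running tool together, and would check mainSignal.aborted again before posting the result.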
Precautions: Do not directly manipulate server state in frontend tool functions. Frontend tools are only responsible for getting or modifying browser environment data; server state (such as user login status, shopping cart) should be operated by the Agent through backend tools. If frontend tools incorrectly modify server data, data inconsistency may occur.
6. Summary and Extensions
6.1 Solution Summary
This article has detailed the implementation of the frontend solution for embedding AI Agent on the web. The core pattern is “server reasoning + frontend tool execution + secure communication”. The specific solutions for two mainstream frameworks are compared as follows:
| Dimension | React AG-UI Integration | Vue3 SSE Integration |
|---|---|---|
| Framework encapsulation level | High, AG-UI provides complete React components and hooks | Low, need to manually build SSE protocol and tool callbacks |
| Applicable scenarios | New projects, or projects where the AG-UI stack can be introduced | Existing Vue3 projects, or projects that need customized streaming communication |
| Tool execution method | AG-UI automatically schedules tools; frontend only needs to register | Need to manually parse tool_call messages and send results back |
| Streaming output | Automatic handling, useAgent returns incremental messages | Manually build ReadableStream and maintain message state |
| Secure communication | Framework has built-in signature and permission mechanisms | Need to implement whitelist + HTTPS + signature yourself |
Selection advice: If the team uses React and the project starts from scratch, AG-UI is preferred; if the project is based on Vue3 or requires highly customized communication protocols, the handwritten SSE solution is more flexible.
Practical implementation suggestions:
Frontend tool functions must be described, side-effect free, and graded by permission level. This is the foundation for the Agent to call tools correctly.
SSE is the recommended communication protocol for frontend integration of Agent streaming output, paired with AbortController for user abort.
Secure communication must be guaranteed from three dimensions: tool whitelist, HTTPS + signature, and user secondary confirmation. None are optional.
Multi-Agent collaboration must isolate contexts using agentId to avoid interference.
It is recommended to record tool calls and SSE message streams during the development phase for easier troubleshooting.
6.2 Extension Directions
Based on the current solution, the following directions can be explored in the future:
Execute tools in WebWorker to improve performance: Move the execution logic of frontend tools to a WebWorker to avoid tool calls blocking the UI thread, improving the responsiveness of the main flow, especially in scenarios with multiple concurrent tool calls.
Multimodal Agent frontend integration: Support Agents generating or manipulating canvases (e.g., drag-and-drop component libraries, flowchart editors). The frontend provides an interactive canvas area, and the Agent reads and modifies the canvas state through tool functions, enabling richer interaction capabilities.
Generic frontend Agent integration pattern based on LangChain: LangChain already provides frontend tool registration and callback interfaces, which can be used as a middle layer to uniformly manage frontend Agents. This helps reuse tool logic across frameworks (React, Vue3, Angular) and reduces future maintenance costs.
State persistence and offline capabilities: Store Agent dialogue context in IndexedDB or LocalStorage, automatically restore after page refresh. Combined with Service Worker, limited offline Agent interaction can be achieved.
The above directions can serve as references for the next phase of technical exploration and solution selection.