Frontend Solution for Embedding Agent on Web

1. Introduction

AI Agents are evolving from standalone applications into interactive experiences embedded in web pages. The core challenge for frontend teams is how to safely and efficiently combine server-side large language model (LLM) reasoning with client-only operations in the browser (such as GPS, clipboard access, and DOM manipulation). This article uses two mainstream frontend frameworks, React (with the AG-UI framework) and Vue3, as examples to illustrate concrete implementation paths for embedding an AI Agent on the web.

The content covers client-server collaboration principles, definition and execution of frontend tool functions, streaming output integration, secure communication, and multi-Agent isolation strategies. After reading this article, you will master:

Key steps and configuration points for integrating React AG-UI Agent components
Complete code implementation for Vue3 SSE streaming Agent integration
Definition standards and secure execution mechanisms for frontend tool functions
Using AbortController to manage user aborts of streaming requests
Best practices for multi-Agent context isolation

2. Core Concept: Frontend Collaboration Pattern for Embedding AI Agent on the Web

2.1 Client-Server Collaboration Principle

The essence of the frontend solution for embedding AI Agent on the web is the division of responsibilities and collaborative work between the client and the server. The Agent instance runs on the server, responsible for invoking the LLM for reasoning and decision-making; the frontend is primarily responsible for UI interaction and browser environment specific operations.

The specific collaboration flow is as follows:

Server-side led reasoning: The user inputs a message on the frontend, which is sent to the server via an API request. The server maintains the Agent’s conversation context and is responsible for invoking the LLM to generate replies or decisions.
When tool calls require client capabilities: When the LLM’s decision involves browser-specific operations (such as getting location, accessing clipboard, manipulating DOM elements), the server does not send the final answer but instead sends a “tool call request” to the frontend.
Frontend executes and returns the result: After receiving the tool call request, the frontend executes the corresponding frontend function and sends the result back to the server via a protocol (e.g., JSON-RPC).
Server continues reasoning: The server uses the tool execution result as additional context and continues to invoke the LLM to generate the final reply.

Taking the AG-UI integration in the Microsoft Agent Framework as an example, the Agent runs on the server, while tool functions wrapped with AIFunctionFactory.Create can be passed to the agent as frontend tools. When the LLM’s decision requires client capabilities, the server sends the tool call ID and parameters to the client, which executes the tool and returns the result. This division leverages the server’s computing resources and keeps API keys off the client, while retaining the frontend’s control over the browser environment.

In practical engineering, the server and frontend exchange data through a standard protocol (such as JSON-RPC). The frontend does not directly expose API keys or LLM configurations, thus ensuring security.

Note: This pattern requires the frontend and backend to agree on a unified communication protocol, including the format of tool call requests, parameter passing methods, result return formats, and error handling. It is recommended to define the interface specification early in the project to avoid repeated adjustments later.
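As a starting point for such an interface specification, the sketch below types the two messages that cross the wire in the flow described above. The field names (toolCallId, toolName, args, ok) and both helper functions are hypothetical; they are not taken from AG-UI or JSON-RPC and would need to be aligned with whatever protocol the backend actually speaks.

```typescript
// Hypothetical wire format for the tool-call round trip. Field names are
// illustrative assumptions, not part of AG-UI or JSON-RPC.
interface ToolCallRequest {
  type: 'tool_call';
  toolCallId: string; // correlates the request with its result
  toolName: string;
  args: Record<string, unknown>;
}

type ToolCallResult =
  | { type: 'tool_result'; toolCallId: string; ok: true; result: unknown }
  | { type: 'tool_result'; toolCallId: string; ok: false; error: string };

// Runtime check for data received from the network: anything that does not
// match the agreed shape is rejected before a tool is looked up.
function isToolCallRequest(value: unknown): value is ToolCallRequest {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === 'tool_call' &&
    typeof v.toolCallId === 'string' &&
    typeof v.toolName === 'string' &&
    typeof v.args === 'object' &&
    v.args !== null &&
    !Array.isArray(v.args)
  );
}

// Build the success or failure reply for a given request.
function makeResult(
  req: ToolCallRequest,
  outcome: { result: unknown } | { error: string },
): ToolCallResult {
  return 'error' in outcome
    ? { type: 'tool_result', toolCallId: req.toolCallId, ok: false, error: outcome.error }
    : { type: 'tool_result', toolCallId: req.toolCallId, ok: true, result: outcome.result };
}
```

Keeping success and failure in one discriminated union means error handling is part of the contract rather than an afterthought.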

2.2 Definition and Execution Mechanism of Frontend Tool Functions

Frontend tool functions are the bridge for Agent interaction with the browser environment. Unlike backend tools, frontend tools directly manipulate DOM, call browser APIs, and read local state, thus requiring stricter definition standards. In the AG-UI example, the definition of frontend tools must follow these principles:

Clear description and signature: Each tool function must include a [Description] attribute to explain the tool’s purpose, parameters, and trigger conditions to the LLM. For example, the description of GetUserLocation() is “Get the user’s current geographic location from the browser.” The LLM decides when to call the tool based on this description. If the description is vague or missing, the LLM may not correctly invoke the tool.
No side effects: Frontend tools should avoid modifying global state or performing irreversible operations (such as popups). If side effects are unavoidable, they should be designed asynchronously with a cancellation mechanism (e.g., AbortController) to allow users to interrupt execution.
Permission levels: Classify tools into different permission levels based on operation sensitivity. For example, reading screen brightness is a low-risk operation and can be executed directly; accessing the clipboard or sending emails is a medium-risk operation requiring user confirmation; manipulating DOM structure is a high-risk operation requiring server-side signature authorization.
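On the browser side, the three principles above could be captured in a small tool definition type. This is a minimal sketch, not an AG-UI API: FrontendTool, defineTool, gateFor and the get_time_zone tool are all hypothetical names introduced for illustration, and the mapping from permission level to gate simply mirrors the low, medium and high levels described above.

```typescript
type PermissionLevel = 'low' | 'medium' | 'high';
type Gate = 'execute' | 'confirm' | 'server-authorization';

// Hypothetical shape of a frontend tool definition.
interface FrontendTool<R = unknown> {
  name: string;
  description: string; // what the LLM reads when deciding whether to call the tool
  permission: PermissionLevel;
  // The signal lets the caller cancel a running tool.
  run: (args: Record<string, unknown>, signal: AbortSignal) => Promise<R>;
}

// Reject definitions the LLM could not use reliably.
function defineTool<R>(tool: FrontendTool<R>): FrontendTool<R> {
  if (!/^[a-z][a-z0-9_]*$/.test(tool.name)) {
    throw new Error(`Tool name must be lowercase snake_case: ${tool.name}`);
  }
  if (tool.description.trim() === '') {
    throw new Error(`Tool ${tool.name} needs a description`);
  }
  return tool;
}

// Map a permission level to the check required before execution.
function gateFor(level: PermissionLevel): Gate {
  switch (level) {
    case 'low':
      return 'execute';
    case 'medium':
      return 'confirm';
    case 'high':
      return 'server-authorization';
  }
}

// Example of a read-only, low-risk tool.
const getTimeZone = defineTool({
  name: 'get_time_zone',
  description: "Get the user's IANA time zone from the browser.",
  permission: 'low',
  run: async () => Intl.DateTimeFormat().resolvedOptions().timeZone,
});
```

Validating definitions at registration time surfaces a missing description during development instead of as a silent failure of the LLM to call the tool.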

In AG-UI, the code example for creating a frontend tool is as follows:

// Define a frontend tool function
[Description("Get the user's current location from GPS.")]
static string GetUserLocation()
{
    // Actually executed by frontend JavaScript calling browser API
    return "Amsterdam, Netherlands (52.37°N, 4.90°E)";
}

// Register as a frontend tool
AITool[] frontendTools = [AIFunctionFactory.Create(GetUserLocation)];

// Pass frontend tools when creating the Agent
AIAgent agent = chatClient.AsAIAgent(
    name: "agui-client",
    description: "AG-UI Client Agent",
    tools: frontendTools);

When the frontend receives a tool call request, it must safely execute the corresponding function and return the result to the server. During execution, all exceptions should be caught to ensure that even if the call fails, the Agent’s subsequent operation is not affected.

Tip: The execution result of a tool function affects the LLM’s subsequent reasoning. If the tool returns null or error information, the LLM may repeatedly call the same tool. It is recommended to design a fallback return value in the frontend tool, such as returning “User denied location permission” when location acquisition fails, instead of returning null.
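The exception handling and fallback value described above can be combined in one wrapper. The following is a minimal sketch under the assumption that every tool handler is an async function; runToolSafely and ToolOutcome are hypothetical names, not part of any framework.

```typescript
type ToolOutcome =
  | { ok: true; result: unknown }
  | { ok: false; error: string };

// Run a tool handler so that the caller always gets a descriptive outcome:
// exceptions are caught, and null/undefined is replaced by a fallback message.
async function runToolSafely(
  name: string,
  handler: () => Promise<unknown>,
  fallbackMessage: string,
): Promise<ToolOutcome> {
  try {
    const result = await handler();
    if (result === null || result === undefined) {
      return { ok: false, error: fallbackMessage };
    }
    return { ok: true, result };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `${name} failed: ${reason}` };
  }
}
```

A call such as runToolSafely('get_user_location', handler, 'User denied location permission') then gives the LLM a readable reason instead of an empty value.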

3. Hands-on: React AG-UI Integration with Agent Dialogue and Frontend Tools

3.1 Setting Up AG-UI Client and Registering Frontend Tools

Integrating the AG-UI client into a React project requires the following steps:

Install dependencies: Add the React client package and its type definitions to the project. This article assumes a React binding that exposes a useAgent hook and uses the package names @agentuity/react and @agentuity/sdk; confirm the actual package names in the documentation of the client library you adopt. Run npm install @agentuity/react @agentuity/sdk.
Initialize the client: Create an AG-UI client instance at the application entry point, configuring the server address and communication protocol. Usually a singleton pattern manages the client to avoid duplicate instantiation.
Register frontend tools: Define a handler function for each frontend tool and register it with the client. AG-UI’s React hook useAgent automatically handles the distribution of tool calls. Example code:

import { useAgent } from '@agentuity/react';

// Define frontend tool handling logic
const frontendTools = {
  get_user_location: async () => {
    return new Promise((resolve) => {
      // Always resolve, even on failure, so the Agent receives a usable result
      navigator.geolocation.getCurrentPosition(
        (pos) => resolve({ latitude: pos.coords.latitude, longitude: pos.coords.longitude }),
        () => resolve({ error: 'Location permission denied' })
      );
    });
  },
  // Other tools...
};

function AgentChat() {
  // messages is read from the hook (see 3.2) rather than kept in separate local state
  const { messages, sendMessage, isStreaming, abort } = useAgent({
    agentId: 'my-agent',
    tools: frontendTools,
  });

  // ...
}

Key notes:

Tool handler functions must be asynchronous functions returning a Promise. AG-UI waits for the Promise to resolve and automatically sends the result back to the server.
The tool name (e.g., get_user_location above) must match the function name used in the server-side AIFunctionFactory.Create. It is case-sensitive; it is recommended to use lowercase snake_case consistently.

3.2 Integrating Streaming Output and User Interaction

Agent replies are usually returned in a streaming manner, rendered piece by piece on the frontend to enhance user experience. AG-UI’s useAgent hook encapsulates the SSE connection internally, automatically handling streaming text and tool call requests.

Streaming rendering: The messages state returned by useAgent updates in real-time as the server pushes data. Each message contains role (user/assistant/tool) and content. When rendering, simply iterate over the messages array and use a Markdown rendering component for assistant-type messages.
User abort: Call the abort function returned by useAgent to interrupt the current streaming request. This is implemented internally using AbortController. When the user clicks the “Stop” button, call abort(); the frontend then disconnects the SSE stream and notifies the server to cancel the reasoning.
Tool call UI feedback: When the Agent calls a frontend tool, a message with role: 'tool' appears in messages. It is recommended to display the status of the tool call in the UI, for example, showing an intermediate prompt like “Getting your current location…” to let the user know the Agent is executing an operation.
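The tool call feedback described above can be derived from the message list with a small pure function. This sketch assumes a message shape with role, content and an optional toolName field; that shape, the statusLine function and the label table are hypothetical and should be adapted to what the hook actually returns.

```typescript
// Assumed message shape; toolName is a hypothetical field on tool messages.
interface ChatMessage {
  role: 'user' | 'assistant' | 'tool';
  content: string;
  toolName?: string;
}

// Human-readable labels for known tools.
const toolLabels = new Map<string, string>([
  ['get_user_location', 'Getting your current location…'],
]);

// Return the status text to show while the Agent is working,
// or null when nothing is in progress.
function statusLine(messages: ChatMessage[], isStreaming: boolean): string | null {
  if (!isStreaming) return null;
  const last = messages[messages.length - 1];
  if (last?.role === 'tool') {
    const name = last.toolName ?? 'tool';
    return toolLabels.get(name) ?? `Running ${name}…`;
  }
  return 'Generating reply…';
}
```

Because the function has no framework dependencies, the same logic can back a React component or a Vue computed property.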

The tool lifecycle callbacks and the abort button can be wired up as follows:

const { sendMessage, isStreaming, abort } = useAgent({
  agentId: 'my-agent',
  tools: frontendTools,
  onToolStart: (toolName) => console.log(`Tool ${toolName} started`),
  onToolEnd: (toolName, result) => console.log(`Tool ${toolName} ended with`, result),
});

// Implement the abort button
<button onClick={abort} disabled={!isStreaming}>Stop Reply</button>

Note: The AbortController interrupts the request on the frontend side. If the server does not implement a cancellation mechanism, the server may continue reasoning and waste computing resources. It is recommended that the backend also listen for request cancellation events and actively terminate the LLM interface call.
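On the server side, one way to follow this recommendation is to turn a dropped connection into an AbortSignal and pass that signal to the LLM call. The sketch below assumes a Node-style response object that exposes a writableEnded flag and emits a close event; the article does not prescribe a backend stack, and ClosableResponse and abortSignalFor are hypothetical names.

```typescript
// Minimal view of a Node-style HTTP response: only what this sketch needs.
interface ClosableResponse {
  writableEnded: boolean;
  on(event: 'close', listener: () => void): unknown;
}

// Create a signal that aborts when the client disconnects before the
// response has been finished.
function abortSignalFor(res: ClosableResponse): AbortSignal {
  const controller = new AbortController();
  res.on('close', () => {
    if (!res.writableEnded) controller.abort();
  });
  return controller.signal;
}
```

The returned signal can then be handed to any cancellable operation, for example as the signal option of the fetch call that talks to the LLM provider.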

4. Hands-on: Vue3 SSE Integration with Agent Streaming Output

4.1 Setting Up Vue3 Project and Implementing SSE Long Connection

In a Vue3 project, since AG-UI is currently mainly oriented towards React, it is common to manually build SSE or use fetch + ReadableStream to connect to the server. Using fetch + ReadableStream is recommended because traditional EventSource only supports GET requests and cannot customize headers, while Agent dialogue is usually a POST request.

Below is a complete Vue3 composable example:

// useAgentStream.ts
import { ref, onUnmounted } from 'vue';
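// Ponyfill entry point; only needed for browsers without a native AbortController
import { AbortController as PolyfillAbortController } from 'abortcontroller-polyfill/dist/cjs-ponyfill';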

interface StreamMessage {
  type: 'thinking' | 'tool_call' | 'tool_result' | 'message' | 'done';
  content?: string;
  tool?: string;
  arguments?: Record<string, any>;
  result?: any;
}

export function useAgentStream() {
  const isStreaming = ref(false);
  const messages = ref<StreamMessage[]>([]);
  let abortController: AbortController | null = null;

  const sendMessage = async (userMessage: string) => {
    if (isStreaming.value) return;

    abortController = new (window.AbortController || PolyfillAbortController)();
    isStreaming.value = true;

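    try {
      const response = await fetch('/api/agent/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: userMessage }),
        signal: abortController.signal,
      });
      // Fail early on HTTP errors instead of parsing an error page as SSE
      if (!response.ok || !response.body) {
        throw new Error(`Agent stream request failed with status ${response.status}`);
      }

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';
      let finished = false;

      while (!finished) {
        const { done, value } = await reader.read();
        if (done) break;

        // stream: true keeps an incomplete multi-byte character for the next chunk
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        // The last element may be an incomplete line; keep it for the next read
        buffer = lines.pop() || '';

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6).trim();
            if (data === '[DONE]') {
              messages.value.push({ type: 'done' });
              finished = true; // also leave the outer read loop
              break;
            }
            try {
              const parsed: StreamMessage = JSON.parse(data);
              messages.value.push(parsed);
            } catch (e) {
              console.warn('Failed to parse SSE data:', data);
            }
          }
        }
      }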
    } catch (error: any) {
      if (error.name === 'AbortError') {
        console.log('Stream aborted by user');
      } else {
        console.error('SSE stream error:', error);
      }
    } finally {
      isStreaming.value = false;
      abortController = null;
    }
  };

  const stopStream = () => {
    abortController?.abort();
  };

  onUnmounted(() => {
    abortController?.abort();
  });

  return { isStreaming, messages, sendMessage, stopStream };
}

Key notes:

The main advantage of using fetch + ReadableStream is support for POST requests and custom headers (such as authentication tokens). EventSource only supports GET requests, which cannot meet the needs of Agent dialogue scenarios.
The buffer variable handles chunked data. SSE messages pushed by the server may span multiple data chunks; buffer holds the incomplete trailing line until the rest arrives, so only complete lines are parsed.
The response body stream delivers raw bytes, so manual decoding is required. decoder.decode(value, { stream: true }) ensures that multi-byte characters (for example, Chinese text) split across chunks are not corrupted.
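The buffering and decoding behaviour in these notes can be isolated into a small parser that is easy to test without a network connection. This is a simplified sketch that only handles data lines (no event, id or retry fields, and no multi-line data); createSseParser is a hypothetical helper name.

```typescript
// Incremental parser for "data:" lines of an SSE stream.
// Feed it raw byte chunks; it calls onData once per complete data line.
function createSseParser(onData: (payload: string) => void) {
  const decoder = new TextDecoder();
  let buffer = '';
  return {
    push(chunk: Uint8Array): void {
      // stream: true keeps the bytes of an incomplete character for the next chunk
      buffer += decoder.decode(chunk, { stream: true });
      const lines = buffer.split('\n');
      // The last element is an incomplete line (or empty); keep it for later
      buffer = lines.pop() ?? '';
      for (const line of lines) {
        if (line.startsWith('data:')) {
          onData(line.slice(5).trim());
        }
      }
    },
  };
}

// Demonstration: a chunk boundary that falls inside a multi-byte character.
const received: string[] = [];
const parser = createSseParser((payload) => received.push(payload));
const bytes = new TextEncoder().encode('data: {"content":"你好"}\n\n');
parser.push(bytes.subarray(0, 19)); // ends in the middle of the first Chinese character
parser.push(bytes.subarray(19));
```

After both chunks have been pushed, received holds a single complete JSON payload, even though neither chunk was valid UTF-8 text on its own.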

Usage example of useAgentStream in a component:

<template>
  <div>
    <input v-model="input" @keyup.enter="send" />
    <div v-for="(msg, index) in messages" :key="index">
      <div v-if="msg.type === 'message'">{{ msg.content }}</div>
      <div v-else-if="msg.type === 'tool_call'">Calling tool: {{ msg.tool }}</div>
    </div>
    <button @click="stopStream" v-if="isStreaming">Stop</button>
  </div>
</template>

<script setup lang="ts">
import { ref } from 'vue';
import { useAgentStream } from './useAgentStream';

const input = ref('');
const { isStreaming, messages, sendMessage, stopStream } = useAgentStream();
const send = () => {
  if (input.value.trim()) {
    sendMessage(input.value);
    input.value = '';
  }
};
</script>

4.2 Frontend Tool Callback and Secure Communication

When the server pushes a tool_call type message, the frontend needs to execute the corresponding tool function and send the result back to the server via another API endpoint. This is the most critical part of SSE integration, involving security issues of cross-domain communication.

Tool callback flow:

Parse the tool_call message to get the tool name and arguments.
Call the corresponding function based on the tool name. It is recommended to define tool functions in an object mapping for easy maintenance and permission control.
Send the tool execution result back to the server via POST /api/agent/tool-result or a similar endpoint.
After receiving the result, the server continues to push subsequent messages via SSE.

Three dimensions of secure communication:

Tool whitelist: The frontend only executes tool functions that are in the whitelist, eliminating the risk of executing unknown functions. The whitelist can be sent by the server during project initialization, enabled after verification by the frontend.
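HTTPS + signature: All communications use HTTPS encryption. When the server pushes a tool_call message, it includes a signature in the message body that covers the tool identifier and agentId, and the frontend verifies the signature before execution. Because any secret shipped to the browser is readable by the user, an HMAC-SHA256 key cannot be kept confidential on the frontend; a frontend check therefore needs an asymmetric scheme (for example, ECDSA verified against the server’s public key), while HMAC-SHA256 remains suitable for checks performed on the server side. The signature adds a second layer on top of HTTPS against forged or tampered tool calls.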
User secondary confirmation: For high-risk operations (such as modifying DOM, reading sensitive data), pop up a confirmation dialog before tool execution. A permission level check can be added in the tool_call handler to decide whether to pop up based on the level. For example, reading location requires only one user confirmation, but sending clipboard data requires confirmation each time.
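The whitelist and the confirmation rules above can be combined into a single gate that runs before any tool handler. The following is a minimal sketch: ConfirmPolicy, ToolPolicy and createToolGate are hypothetical names, the policy table is assumed to come from the server at initialization, and askUser stands in for whatever confirmation dialog the application uses.

```typescript
type ConfirmPolicy = 'never' | 'once' | 'always';
type ToolPolicy = Record<string, ConfirmPolicy>;

// Build a gate from a whitelist with per-tool confirmation rules.
// askUser is injected so the gate can be tested without a real dialog.
function createToolGate(policy: ToolPolicy, askUser: (toolName: string) => boolean) {
  const alreadyConfirmed = new Set<string>();
  return function mayExecute(toolName: string): boolean {
    // Own-property check: inherited names such as "constructor" are not whitelisted
    if (!Object.prototype.hasOwnProperty.call(policy, toolName)) return false;
    const rule = policy[toolName];
    if (rule === 'never') return true;
    if (rule === 'once' && alreadyConfirmed.has(toolName)) return true;
    const granted = askUser(toolName);
    if (granted && rule === 'once') alreadyConfirmed.add(toolName);
    return granted;
  };
}
```

With a policy such as { get_user_location: 'once', get_clipboard: 'always' }, the user is asked once about location access and every time about the clipboard, matching the example in the list above.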

Vue3 tool callback example:

const toolHandlers: Record<string, (args: any) => Promise<any>> = {
  get_user_location: async () => {
    // Low risk, execute directly
    return new Promise((resolve) => {
      navigator.geolocation.getCurrentPosition(
        (pos) => resolve({ latitude: pos.coords.latitude, longitude: pos.coords.longitude }),
        () => resolve({ error: 'Location permission denied' })
      );
    });
  },
  get_clipboard: async () => {
    // Medium risk, needs user confirmation
    // window.confirm is used as the simplest dialog; replace it with a custom modal if needed
    const confirmed = window.confirm('Allow Agent to read clipboard content?');
    if (!confirmed) throw new Error('User cancelled clipboard access');
    return navigator.clipboard.readText();
  },
};

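// Post the outcome of one tool call back to the server
const postToolResult = (payload: Record<string, unknown>) =>
  fetch('/api/agent/tool-result', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ agentId: 'demo-agent', ...payload }),
  });

// toolCallId is assumed to arrive with the tool_call message from the server
const handleToolCall = async (toolCallId: string, toolName: string, args: any) => {
  // Own-property check so names like "constructor" never resolve to a handler
  const handler = Object.prototype.hasOwnProperty.call(toolHandlers, toolName)
    ? toolHandlers[toolName]
    : undefined;
  if (!handler) {
    console.warn(`Unknown tool: ${toolName}`);
    // Report the failure so the server is not left waiting for a result
    await postToolResult({ toolCallId, error: 'Tool not found' });
    return;
  }
  try {
    const result = await handler(args);
    // Send result back to server via API
    await postToolResult({ toolCallId, result });
  } catch (e) {
    // Send the error back instead of leaving the call unanswered
    await postToolResult({ toolCallId, error: String(e) });
  }
};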

Tip: Each tool call must carry a toolCallId, used by the server to associate the tool execution result with the LLM reasoning context. If the frontend fails to correctly match the toolCallId to the server request, the Agent may incorrectly associate the result with a different dialogue context. It is recommended to log the full chain of tool calls during the development phase for easier troubleshooting.

5. Advanced Techniques and Pitfall Documentation

5.1 Context Isolation in Multi-Agent Collaboration

When embedding multiple Agents (e.g., assistant, data analyst) on the same page, contexts must be isolated by agentId to avoid state interference. Each Agent instance has its own dialogue history, tool list, and session state.

Design points:

Independent instances: Each Agent uses an independent client instance or context object. In React, multiple useAgent calls or nested Providers can be used; in Vue3, create an independent useAgentStream instance for each Agent.
Communication isolation: The backend must have an agentId route, and the frontend carries the unique identifier of the corresponding Agent when sending requests. The server distinguishes and maintains their respective contexts accordingly.
Frontend routing proxy: If multiple Agents use the same backend entry point, the frontend must inject the agentId into the request path. The server routes to the corresponding Agent instance based on the agentId. For example, using a path like /api/agent/:agentId/stream.
Namespace tools: Different Agents may call the same tool simultaneously (e.g., get_user_location), but the execution results should belong to their respective contexts. The frontend must ensure that each Agent’s tool callback carries the correct agentId.

Common issue: If context isolation is not done, the tool result of Agent A might be used by Agent B, causing confusion in B’s reasoning. This is especially noticeable in high-concurrency or long-running scenarios. It is recommended to adopt a strict isolation scheme from the development stage to avoid later refactoring.
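One way to make the isolation explicit on the frontend is to keep all per-Agent state behind a registry keyed by agentId. The sketch below is an illustration only: AgentSessions and its methods are hypothetical, and the URL follows the /api/agent/:agentId/stream pattern mentioned above.

```typescript
interface AgentSession {
  agentId: string;
  history: string[];
  pendingToolCalls: Set<string>; // toolCallIds issued by this Agent and not yet answered
}

class AgentSessions {
  private sessions = new Map<string, AgentSession>();

  // Return the session for an Agent, creating it on first use.
  get(agentId: string): AgentSession {
    let session = this.sessions.get(agentId);
    if (!session) {
      session = { agentId, history: [], pendingToolCalls: new Set() };
      this.sessions.set(agentId, session);
    }
    return session;
  }

  // Inject the agentId into the request path.
  streamUrl(agentId: string): string {
    return `/api/agent/${encodeURIComponent(agentId)}/stream`;
  }

  registerToolCall(agentId: string, toolCallId: string): void {
    this.get(agentId).pendingToolCalls.add(toolCallId);
  }

  // True only if this Agent actually issued the call; a result that
  // belongs to another Agent is refused instead of being mixed in.
  acceptToolResult(agentId: string, toolCallId: string): boolean {
    return this.get(agentId).pendingToolCalls.delete(toolCallId);
  }
}
```

Refusing results whose toolCallId was not issued by the same Agent turns the cross-talk described above into an explicit, loggable error.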

5.2 Common Pitfalls: Tool Function Side Effects and Timeout Handling

Common issues and strategies for frontend tool execution:

Side effects not undone: Tool functions may trigger UI operations (e.g., confirmation dialogs). If the user does not respond or cancels, the tool state is suspended. Solutions:
- All tool functions should be designed to be cancellable, using AbortController or CancelToken pattern.
- Automatic rejection on timeout: Set a hard timeout (e.g., 30 seconds) for each tool call. After timeout, the frontend automatically returns a timeout result.
Correct cancellation of AbortController:
- When the user clicks “Stop Reply”, both the SSE stream and any executing tool functions should be interrupted.
- In the tool_call handler, create a local AbortController for each tool call and link it with the main request’s signal. When the main request is interrupted, all child tool tasks should also be cancelled.

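Ensure that after a tool call completes, the frontend checks whether the request has already been aborted (for example, via signal.aborted) before sending the result. A result posted for an interrupted request has no waiting consumer on the server, and a fetch bound to an aborted signal rejects with an AbortError.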
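The hard timeout and the linked cancellation can be implemented together in one helper. This is a sketch under the assumption that each tool accepts an AbortSignal; runWithTimeout is a hypothetical name, and the error messages 'timeout' and 'aborted' are placeholders for whatever the protocol defines.

```typescript
// Run a tool with a hard timeout and a local AbortController that is
// linked to the parent (main request) signal.
function runWithTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  parentSignal: AbortSignal,
  timeoutMs: number,
): Promise<T> {
  const local = new AbortController();
  return new Promise<T>((resolve, reject) => {
    // Abort the tool, release resources, and reject with the reason
    const fail = (reason: string) => {
      local.abort();
      cleanup();
      reject(new Error(reason));
    };
    const onParentAbort = () => fail('aborted');
    const timer = setTimeout(() => fail('timeout'), timeoutMs);
    const cleanup = () => {
      clearTimeout(timer);
      parentSignal.removeEventListener('abort', onParentAbort);
    };
    if (parentSignal.aborted) {
      fail('aborted');
      return;
    }
    parentSignal.addEventListener('abort', onParentAbort, { once: true });
    task(local.signal).then(
      (value) => { cleanup(); resolve(value); },
      (err) => { cleanup(); reject(err); },
    );
  });
}
```

When the user clicks “Stop Reply”, aborting the main controller rejects every pending tool call wrapped this way and signals each tool through its local signal.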

Client state recovery after server timeout: When the server disconnects due to long reasoning or LLM API timeout, the client should automatically restore to an interactive state. Specific practices:
- In the exception handling branch of useAgentStream, set isStreaming.value = false and discard any partially received data (for example, the line buffer).
- Provide a “Retry” button for the user to resend the last message.

It is recommended to keep the content of the last user message alongside the messages array so that a retry can resend it.

The server should return the offset of the last processed message, and the client restores the context based on it. When retrying, the frontend includes this offset, and the server continues reasoning from the breakpoint.
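The retry bookkeeping described above could look like the following sketch. It assumes the server reports a numeric offset with its messages and accepts it back on retry; the resumeFrom field and the createRetryTracker helper are hypothetical, since the article does not define the wire format.

```typescript
// Tracks what is needed to retry after a dropped stream.
function createRetryTracker() {
  let lastUserMessage: string | null = null;
  let lastOffset = 0;
  return {
    // Call when the user sends a message.
    recordSend(message: string): void {
      lastUserMessage = message;
    },
    // Call for every offset reported by the server; offsets never move backwards.
    recordOffset(offset: number): void {
      lastOffset = Math.max(lastOffset, offset);
    },
    // Request body for the "Retry" button, or null if nothing was sent yet.
    buildRetryBody(): { message: string; resumeFrom: number } | null {
      if (lastUserMessage === null) return null;
      return { message: lastUserMessage, resumeFrom: lastOffset };
    },
  };
}
```

The “Retry” button can stay disabled while buildRetryBody() returns null.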

Precautions: Do not directly manipulate server state in frontend tool functions. Frontend tools are only responsible for getting or modifying browser environment data; server state (such as user login status, shopping cart) should be operated by the Agent through backend tools. If frontend tools incorrectly modify server data, data inconsistency may occur.

6. Summary and Extensions

6.1 Solution Summary

This article has detailed the implementation of the frontend solution for embedding AI Agent on the web. The core pattern is “server reasoning + frontend tool execution + secure communication”. The specific solutions for two mainstream frameworks are compared as follows:

| Dimension | React AG-UI Integration | Vue3 SSE Integration |
| --- | --- | --- |
| Framework encapsulation level | High; AG-UI provides complete React components and hooks | Low; the SSE protocol handling and tool callbacks must be built manually |
| Applicable scenarios | New projects, or refactors where AG-UI can be introduced | Existing Vue3 projects, or cases that need customized streaming communication |
| Tool execution method | AG-UI schedules tools automatically; the frontend only needs to register them | `tool_call` messages must be parsed manually and results sent back |
| Streaming output | Handled automatically; `useAgent` returns incremental `messages` | `ReadableStream` must be read manually and message state maintained |
| Secure communication | Framework has built-in signature and permission mechanisms | Whitelist + HTTPS + signature must be implemented yourself |

Selection advice: If the team uses React and the project starts from scratch, AG-UI is preferred; if the project is based on Vue3 or requires highly customized communication protocols, the handwritten SSE solution is more flexible.

Practical implementation suggestions:

Frontend tool functions must be described, side-effect free, and permission-level graded. This is the foundation for the Agent to correctly call tools.
SSE is the recommended communication protocol for frontend integration of Agent streaming output, paired with AbortController for user abort.
Secure communication must be guaranteed from three dimensions: tool whitelist, HTTPS + signature, and user secondary confirmation. None are optional.
Multi-Agent collaboration must isolate contexts using agentId to avoid interference.
It is recommended to log tool call logs and SSE message streams during the development phase for easier troubleshooting.

6.2 Extension Directions

Based on the current solution, the following directions can be explored in the future:

Execute tools in Web Workers to improve performance: Move computation-heavy tool logic to a Web Worker so that tool calls do not block the UI thread, which improves the responsiveness of the main flow, especially in scenarios with multiple concurrent tool calls. Tools that need the DOM must stay on the main thread, because workers have no DOM access.
Multimodal Agent frontend integration: Support Agents generating or manipulating canvases (e.g., drag-and-drop component libraries, flowchart editors). The frontend provides an interactive canvas area, and the Agent reads and modifies the canvas state through tool functions, enabling richer interaction capabilities.
Generic frontend Agent integration pattern based on LangChain: A framework such as LangChain could serve as a middle layer that manages frontend tool registration and callbacks uniformly; whether its interfaces cover this use case should be verified against its current documentation. Such a layer helps reuse tool logic across frameworks (React, Vue3, Angular) and reduces future maintenance costs.
State persistence and offline capabilities: Store Agent dialogue context in IndexedDB or LocalStorage, automatically restore after page refresh. Combined with Service Worker, limited offline Agent interaction can be achieved.

The above directions can serve as references for the next phase of technical exploration and solution selection.
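For the state persistence direction, a first step could be a pair of helpers that save and restore the dialogue history per Agent. This is a minimal sketch: the storage is injected through a small interface that window.localStorage satisfies, and the key format, StoredMessage shape and function names are hypothetical.

```typescript
// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface StoredMessage {
  role: 'user' | 'assistant' | 'tool';
  content: string;
}

// One key per Agent keeps the persisted contexts isolated.
const historyKey = (agentId: string) => `agent-history:${agentId}`;

function saveHistory(store: KeyValueStore, agentId: string, messages: StoredMessage[]): void {
  store.setItem(historyKey(agentId), JSON.stringify(messages));
}

// Returns an empty history when nothing is stored or the stored data is unreadable.
function loadHistory(store: KeyValueStore, agentId: string): StoredMessage[] {
  const raw = store.getItem(historyKey(agentId));
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as StoredMessage[]) : [];
  } catch {
    return [];
  }
}
```

In the browser, calling loadHistory(window.localStorage, agentId) on page load restores the conversation; for larger histories IndexedDB would be the more suitable backend.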

Summary

After working through this article, you should have a clearer picture of “embedding AI in the web”. It is recommended to try these patterns in real projects. If you have any questions, feel free to discuss!