Developing and Integrating Custom Tools for Agents
1. Introduction
Agents extend their capabilities through tool calls (Function Call), which is currently the mainstream approach. Standard tool libraries (e.g., calculator, search, knowledge base retrieval) can cover general scenarios, but customized requirements in business systems—such as querying internal ticket systems, invoking proprietary recommendation algorithm interfaces, or operating internal approval workflows—typically need custom tools to be developed for integration.
This article explains how to complete the full development and integration of custom tools based on two frameworks: LangChain and Qwen-Agent. The content covers: using the @tool decorator, tool registration and parameter binding mechanisms, secure communication and permission levels, common pitfalls and advanced tips, and a complete hands‑on example from definition to deployment. After reading, you will be able to independently develop custom tools for your existing Agent system and understand the principles and limitations of the underlying Function Call protocol.
2. Core Concepts: Agent Tool Calling Principles and Function Call Mechanism
2.1 Basic Flow of Tool Calling
The core workflow of an Agent calling a tool is as follows:
- User Input: The user submits a natural language request to the Agent, e.g., “Calculate the area of a circle with radius 5”.
- LLM Reasoning: Based on the current conversation context and the list of registered tools and their descriptions, the LLM decides whether a tool should be invoked. If so, the LLM outputs a structured tool call request containing the tool name and a parameter dictionary.
- Runtime Parsing: The Agent framework receives the LLM output, parses the tool name and parameters, looks up the corresponding Python function, and executes it.
- Result Return: The function execution result (a string or structured data) is wrapped into a message and sent back to the LLM, which then generates the final answer.
This closed‑loop design ensures the Agent can “think → act → observe results → think again”, which is the core of the ReAct pattern.
2.2 Function Call Protocol and the @tool Decorator
Function Call is an output format constraint introduced by OpenAI in the GPT‑4 series: when the LLM determines that a tool call is needed, it outputs a JSON object containing name and arguments fields. Subsequent reasoning models (e.g., Claude 3, Qwen2.5) already support a similar protocol.
In LangChain, the @tool decorator is the standard way to simplify tool definition. Essentially, it transforms a plain Python function into a Tool object that conforms to the Agent framework, automatically performing the following:
- Extracting the function signature: Parses parameter names, type hints, and default values.
- Generating the parameter schema: Converts the function signature into JSON Schema format (including types, descriptions, and whether parameters are required).
- Injecting the docstring: The function’s
__doc__becomes the tool’sdescriptionfield, which the LLM uses to decide when to call this tool.
Therefore, the quality of a tool’s definition directly determines whether the LLM can correctly invoke it.
3. Ways to Define Custom Tools: @tool Decorator vs. Structured Tool Class
3.1 Using the @tool Decorator
LangChain’s @tool decorator provides the most concise way to define a tool. Example:
1 | |
The decorator accepts name and description parameters to explicitly override defaults. When the function name is not descriptive enough, you should set the description.
For more detailed parameter descriptions, use the Annotated type:
1 | |
Here, the schema generated by @tool includes a description for each parameter, greatly increasing the probability that the LLM generates correct parameters.
3.2 Tool Class vs. StructuredTool Class
Besides the decorator, LangChain also offers the Tool and StructuredTool classes.
Toolclass: Takes a single string as input, suitable when the tool has only one input parameter (e.g., reading file content, querying a single value).StructuredToolclass: Takes structured parameters (defined via a Pydantic model), suitable for scenarios with multiple parameters and rich types. The instance created by the@tooldecorator is actually aStructuredTool.
Recommendation: Prefer the @tool decorator; it is concise and handles parameter parsing automatically. Only manually instantiate a StructuredTool when you need fine‑grained control over the tool’s behavior (e.g., custom error handling, asynchronous execution).
3.3 Importance of the Docstring
The LLM relies on the tool’s description to decide when to call it and how to pass parameters. This means:
- The tool description (docstring) should clearly state its functionality, applicable scenarios, and boundary conditions.
- Parameter descriptions should clearly indicate what input is allowed and what range of values is expected (e.g., numeric units, string encoding).
- If the tool has side effects (e.g., writing to a database, sending an email), be sure to mention this in the description.
Note: The tool description should not be too verbose, but it must never be empty. A vague description (e.g., “query information”) can cause the LLM to mistakenly use this tool when another tool should be used.
4. Tool Registration and Parameter Binding: Practice with LangChain and Qwen‑Agent
4.1 Tool Registration in LangChain
In LangChain, tools are registered by placing the defined tool instances into a list and passing them to an Agent or Runnable:
1 | |
AgentExecutor manages the complete lifecycle of tool calls: receiving user input, calling the LLM, parsing tool calls, executing tools, returning results to the LLM, and generating the final answer.
At the call level, a tool object can be manually invoked to verify it works correctly:
1 | |
4.2 Registering Custom Tools in Qwen‑Agent
Qwen‑Agent (the Agent framework that accompanies Alibaba Cloud’s Qwen models) does not depend on LangChain. Instead, tools are registered using a declarative JSON Schema. Developers provide a list of functions, each containing fields like name, description, and parameters.
Typical registration flow:
1 | |
Qwen‑Agent also provides a FunctionPlugin class that allows developers to implement the tool’s lifecycle (initialization, execution, cleanup) in a class‑based way, suitable for tools with complex state.
4.3 Comparison of the Two Frameworks
| Dimension | LangChain | Qwen‑Agent |
|---|---|---|
| Tool definition style | @tool decorator / StructuredTool |
JSON Schema / FunctionPlugin class |
| Parameter parsing | Auto‑generated schema from function signature and type hints | Manual schema (must match function signature) |
| Recommended use case | Rapid prototyping, mixing multiple frameworks | Pure Qwen model, projects within Alibaba Cloud ecosystem |
| Learning curve | Low (intuitive decorator pattern) | Higher (need to understand JSON Schema specification) |
Both frameworks rely on the Function Call protocol; tool definitions can be swapped between them, but an adaptation layer is required during integration.
5. Secure Communication and Permission Levels: Tool Whitelist, User Confirmation, and Front‑end Integration
Once a tool is exposed to the Agent, the Agent can perform actions with side effects (e.g., deleting files, sending messages, modifying databases). In production environments, security mechanisms must be established.
5.1 Three‑Layer Tool Security Design
Layer 1: Front‑end Tool Whitelist
The front end (client) is only allowed to execute tools declared by the server. Specifically: when initializing the Agent, the server maintains a list of tool IDs. When the front end receives a tool call instruction via SSE or WebSocket, it first verifies whether the tool ID is within the current session’s whitelist. Calls with IDs not in the whitelist are rejected.
Layer 2: HTTPS + Signature Anti‑tampering
Tool call instructions transmitted between front end and back end should be signed (e.g., HMAC‑SHA256) to ensure they have not been tampered with during transmission. If an instruction is invalid, the front end should refuse to execute it and report back to the server.
Layer 3: User Secondary Confirmation
For high‑risk tools (e.g., writing to a database, sending a message, deleting a resource), the permission level should be marked in the tool metadata. When receiving such a tool call instruction, the front end should display a confirmation dialog and execute only after the user explicitly agrees.
5.2 Front‑end Integration Experience for Web‑embedded Agents
In the experience of “embedding an Agent in a web page”, the server is responsible for LLM reasoning and maintaining the tool catalog, while the front end handles tool execution and UI presentation. Key implementation points:
- The server pushes messages (including tool call instructions) downstream via SSE.
- Upon receiving a tool call message, the front end parses the instruction content and verifies that the tool ID is in the whitelist.
- If verification passes, the front end invokes the actual tool function implemented there (e.g.,
showConfirmDialog,updateDom); if verification fails, it shows an error to the front end. - Results are returned to the server via an SSE stream or an HTTP callback.
5.3 Permission Level Example
Add a permission field to the tool metadata:
1 | |
The front end checks the permission field to decide whether a confirmation dialog is needed. Suggested grading rules:
read_only: No side effects, execute automatically.write: Has side effects, must wait for user secondary confirmation.admin: Involves system‑level operations, requires both confirmation and an admin password.
6. Advanced Tips and Common Pitfalls
6.1 Tool Execution Timeout and Idempotency
The default behavior of Agent tool calls is synchronous waiting. If a tool takes too long to execute (e.g., an external API timeout), the Agent response cycle becomes longer and user experience degrades.
Recommended practices:
- Set a timeout for tools (e.g., 30s). In LangChain, this can be achieved with
asyncio.wait_fororThreadPoolExecutor. - Design tool functions to be idempotent: the same input repeated multiple times produces the same result without side effects. This is crucial to avoid data pollution when errors cause retries.
6.2 Context Isolation and Multi‑Agent Collaboration
In multi‑agent systems, different agents may call the same tool. If the tool holds state (e.g., a cache or counter), interference can occur.
Solutions:
- Use
agentIdas a key to isolate tool call context. For example, inside the tool function, usecontext.get("agentId")to identify which Agent is making the call. - Use
ChatMemoryto manage the history of tool calls for each Agent, ensuring each Agent can only access its own history.
6.3 Handling Parameter Parsing Failures
The LLM‑generated tool call parameters are not always as expected. Common problems include: wrong parameter type (a string where a number was expected), missing required parameters, or null parameter values.
Strategies:
- Perform type validation and fallback to default values at the entry point of the tool function.
- LangChain’s
ToolExceptionmechanism allows the tool to raise a special exception; the framework returns the exception information to the LLM for retry. - Log failure cases to later optimize the Agent’s prompt or the tool description.
6.4 Front‑end Performance Optimization
When an Agent frequently executes tools that manipulate the DOM (e.g., real‑time page layout modification, element insertion), directly operating the DOM can cause numerous reflows and repaints.
Recommended approach: Use off‑screen DOM (DocumentFragment) for batch modifications, then swap the fragment into the main DOM in one step. Tool calls can be aggregated and processed in batches instead of one by one.
7. Hands‑on Example: Developing and Integrating an Internal Knowledge Base Query Tool
7.1 Scenario Definition
Business requirement: Users can query the company’s internal Wiki knowledge base via the Agent and get document summaries. The tool should accept query and top_k parameters.
7.2 LangChain Implementation
1 | |
7.3 Equivalent Qwen‑Agent Implementation
1 | |
7.4 Verification
After executing executor.invoke, the normal output should be:
1 | |
If the Agent does not call the tool (and the LLM directly generates an answer), check whether the tool description is clear and whether the prompt explicitly requires tool invocation.
8. Summary and Further Directions
Key Recap
The critical steps in developing and integrating custom tools for Agents:
- Define the tool: Use the
@tooldecorator (LangChain) or manually write a JSON Schema (Qwen‑Agent), ensuring clear descriptions and correct parameter types. - Register and bind: Pass the list of tool instances to the Agent or register them via the
functionsparameter with the LLM. - Security control: Three‑layer security through front‑end whitelist, signature anti‑tampering, and user secondary confirmation.
- Front‑end/Back‑end integration: The server maintains the tool catalog; the front end receives instructions via SSE, checks permissions, executes, and feeds back results.
Further Directions
- SSE streaming to the front end: During tool execution, push real‑time status updates (e.g., “Calling tool A, parameters: …”) to the front end to improve user experience.
- Running a lightweight LLM on the client with WebAssembly: Use a small model (e.g., TinyLlama) on the client for partial tool decisions, reducing server load and supporting offline scenarios.
- Multi‑agent tool invocation via the A2A protocol: One Agent’s tool can be called by another Agent’s
send_task, enabling task decomposition and automated collaboration.
It is recommended to keep an eye on official updates from LangChain and Qwen‑Agent frameworks, and use the approaches in this article as a starting reference for custom tool development in your team, adapting and extending them to your specific business scenarios.
Conclusion
Through this article, we hope you have gained a deeper understanding of custom tool development for Agents. It is recommended to practice with real projects. If you have any questions, feel free to discuss!