Implementation of A2A Multi-Agent Inter-Communication and Invocation

1. Introduction

As AI Agent application scenarios evolve from single-point tasks to complex workflows, multi-agent cross-platform collaboration has become a necessity in engineering practice. Agents built on different frameworks (such as LangChain, CrewAI, and Vertex AI Agent Engine) or in different languages lack a standard communication protocol, which leads to high integration costs and duplicated effort. Google’s Agent2Agent (A2A) protocol addresses this problem by standardizing how agents discover and call each other.

This article first introduces the core principles of the A2A protocol (JSON-RPC 2.0 over HTTP(S)), then walks through the steps and code for two agents calling each other on Cloud Run and Agent Engine, and finally summarizes common pitfalls and debugging methods. By the end you should understand how A2A multi-agent communication works, how agents call each other over JSON-RPC 2.0, how to deploy an A2A Agent service on Cloud Run, which A2A settings matter in Agent Engine, and how to divide roles among agents and design A2A invocations.

2. Core Concepts of the A2A Protocol

The A2A protocol is an open standard designed to enable AI agents from different platforms and frameworks to communicate and collaborate, regardless of their underlying technical implementation. Its core design philosophy is the “discover-communicate-collaborate” three-phase model.

2.1 Communication Foundation Based on JSON-RPC 2.0

The A2A protocol selects JSON-RPC 2.0 over HTTP(S) as the core communication method. JSON-RPC 2.0 is a lightweight, stateless remote procedure call protocol where both requests and responses are encoded in JSON. A typical A2A request contains:

jsonrpc: Fixed to “2.0”
method: The RPC method name to call, e.g., “send_task”
params: Method parameters, usually including task_id, message, etc.
id: Unique identifier for the request, used to match requests with responses

{
  "jsonrpc": "2.0",
  "method": "send_task",
  "params": {
    "task_id": "task-001",
    "message": {
      "role": "user",
      "content": "Please analyze last quarter's sales data"
    }
  },
  "id": "req-abc-123"
}

Why choose JSON-RPC 2.0? It is simple enough, widely supported across languages, stateless by nature and therefore a natural fit for HTTP, and the community has accumulated many ready-made libraries and tools. Compared to gRPC, JSON-RPC 2.0 has a lower debugging threshold and is more suitable for heterogeneous system integration.
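The request envelope described above can be produced and checked with a few lines of Python. This is a minimal sketch, not a reference client: the helper names build_request and match_response are hypothetical, and the method name and parameters are the ones used in this article's example.

```python
import json
import uuid


def build_request(method, params):
    """Build a JSON-RPC 2.0 request envelope with a fresh unique id."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": str(uuid.uuid4()),
    }


def match_response(request, response):
    """Return the result of a response that answers the given request.

    Raises ValueError if the ids differ and RuntimeError if the response
    carries a JSON-RPC error object.
    """
    if response.get("id") != request["id"]:
        raise ValueError("response id does not match request id")
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"RPC error {err.get('code')}: {err.get('message')}")
    return response["result"]


req = build_request(
    "send_task",
    {
        "task_id": "task-001",
        "message": {"role": "user", "content": "Please analyze last quarter's sales data"},
    },
)
print(json.dumps(req, indent=2))
```

Generating the id inside the builder keeps callers from reusing or forgetting it, and checking it on the way back catches responses that were routed to the wrong request.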

2.2 AgentCard: Capability Declaration and Discovery

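Each A2A Agent must expose a well-known endpoint (or register via service metadata) that returns a JSON document describing its own capabilities, called the AgentCard. This article serves it at /.well-known/agent-card; the published specification has used different well-known file names across versions, so confirm the exact path for the version you target. The AgentCard contains the following key fields: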

name: Agent name
description: Functional description, used by other agents for text‑based matching and selection
url: Service endpoint URL
capabilities: Array of capabilities, declaring which RPC methods the agent supports (e.g., send_task, get_task, cancel_task)
authentication: Optional, describes supported authentication methods

In practice, it is recommended to define a clear AgentCard for each agent and keep its content consistent with the actual API endpoints. The presence of the AgentCard gives the A2A protocol “service discovery” capability—the caller can fetch the target agent’s AgentCard, determine whether it has the required capability, and only then initiate a task.
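The fetch-then-check flow described above might look like the following sketch. It uses only the standard library; the function names are hypothetical, the path is the one this article's sample service exposes, and the capabilities field is assumed to be a flat list of method names as in the AgentCard shown in this article.

```python
import json
from urllib.request import urlopen

AGENT_CARD_PATH = "/.well-known/agent-card"  # path used by this article's sample service


def fetch_agent_card(base_url, timeout=10):
    """Download and parse the AgentCard served by the agent at base_url."""
    with urlopen(base_url.rstrip("/") + AGENT_CARD_PATH, timeout=timeout) as resp:
        return json.load(resp)


def missing_capabilities(card, required):
    """Return the required RPC methods that the card does not declare."""
    declared = set(card.get("capabilities", []))
    return [method for method in required if method not in declared]


# Example with an in-memory card, so no network access is needed.
card = {"name": "Data Analysis Assistant", "capabilities": ["send_task", "get_task"]}
print(missing_capabilities(card, ["send_task", "cancel_task"]))
```

A caller would run fetch_agent_card once per target, cache the result, and refuse to send a task when missing_capabilities returns a non-empty list.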

2.3 Task Lifecycle Management

The A2A protocol abstracts a single agent invocation as a Task, whose complete lifecycle includes:

Create (send_task): The initiator sends a task request, the receiver returns a task_id, and the task enters a “pending” or “working” state.
Query (get_task): The initiator polls for the task status and intermediate/final results.
Update (task_update): The receiver actively pushes status changes (requires callback or streaming support from the initiator).
Cancel (cancel_task): The initiator requests task termination; the receiver cleans up resources and returns confirmation.

A2A supports both synchronous (the initiator blocks waiting for the result) and asynchronous (the initiator submits the task and then polls or waits for push) modes. In streaming scenarios, the receiver can return results incrementally, typically as Server-Sent Events (SSE) over a long-lived HTTP response or as a sequence of JSON segments.
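One way to keep the lifecycle above consistent on the receiver side is a small state machine. The sketch below is an assumption about how the states relate, using only the state names that appear in this article (pending, working, completed, cancelled); the transition table is illustrative and not taken from the specification.

```python
from enum import Enum


class TaskState(str, Enum):
    PENDING = "pending"
    WORKING = "working"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Allowed transitions; completed and cancelled are terminal.
TRANSITIONS = {
    TaskState.PENDING: {TaskState.WORKING, TaskState.CANCELLED},
    TaskState.WORKING: {TaskState.COMPLETED, TaskState.CANCELLED},
    TaskState.COMPLETED: set(),
    TaskState.CANCELLED: set(),
}


def advance(current, new):
    """Return the new state, or raise ValueError for an illegal transition."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


def is_terminal(state):
    """A task in a terminal state needs no further polling."""
    return not TRANSITIONS[state]


state = TaskState.PENDING
state = advance(state, TaskState.WORKING)
state = advance(state, TaskState.COMPLETED)
print(state.value, is_terminal(state))
```

Rejecting illegal transitions early means a late cancel_task cannot silently overwrite a task that has already completed.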

2.4 Messages and Multimodal Support

The Message structure in A2A supports text, structured data (e.g., JSON), and references to external resources (e.g., image URLs, binary Base64 encodings). This ensures that agents can exchange non‑textual information—for example, a vision agent receives an image URL, analyzes it, and returns structured results.
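To make the three kinds of content concrete, here is a sketch of a message assembled from typed parts. The field names (parts, type, url, bytes, mime_type) are hypothetical and chosen for illustration; consult the specification for the exact schema.

```python
import base64
import json


def text_part(text):
    """Plain text content."""
    return {"type": "text", "text": text}


def data_part(obj):
    """Structured data; the object must be JSON-serializable."""
    json.dumps(obj)  # fail early if it is not serializable
    return {"type": "data", "data": obj}


def file_url_part(url, mime_type):
    """Reference to an external resource such as an image URL."""
    return {"type": "file", "url": url, "mime_type": mime_type}


def file_bytes_part(raw, mime_type):
    """Inline binary content, Base64-encoded so it survives JSON transport."""
    return {
        "type": "file",
        "bytes": base64.b64encode(raw).decode("ascii"),
        "mime_type": mime_type,
    }


message = {
    "role": "user",
    "parts": [
        text_part("Count the products on this shelf"),
        file_url_part("https://example.com/shelf.jpg", "image/jpeg"),
    ],
}
print(json.dumps(message, indent=2))
```

Passing a URL keeps the request small; inline Base64 is only worth it for small payloads, since it inflates the data by roughly a third.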

3. Development Environment and Prerequisites

Before building an A2A Agent, prepare the following environment and resources:

Google Cloud Project: The A2A protocol’s Codelab and official examples usually rely on Google Cloud Run and Agent Engine. You need an active project with both the Cloud Run API and Vertex AI API enabled (if using Agent Engine).
Install and configure the gcloud CLI: Run gcloud init to authenticate and set the default project. Ensure Python 3.10+ and pip are installed.
Obtain the sample code repository: Run git clone https://github.com/google/agent2agent-codelab (if this repository is unavailable, you can build from the simplified code provided below).
Install dependencies: Create a requirements.txt in the project directory containing the following core libraries:
flask>=2.3
requests>=2.31
gunicorn>=20.1
If you need to use A2A on Agent Engine, also install the google-cloud-aiplatform SDK.

Run pip install -r requirements.txt.
(Optional) Prepare Docker environment: Cloud Run deployment builds a container image, so it is recommended to have Docker Desktop installed locally for debugging.

4. Hands‑on: Deploying the First A2A Agent Service (Cloud Run)

This section deploys a simple Python Agent service that exposes A2A protocol endpoints and can be discovered and invoked by other agents.

4.1 Service Code Implementation

Create the file agent_service.py implementing the HTTP server and A2A endpoints:

from flask import Flask, request, jsonify
import uuid
import json

app = Flask(__name__)

# Simulated task storage (use persistent storage in production)
tasks = {}

@app.route('/.well-known/agent-card', methods=['GET'])
def agent_card():
    """Return this agent's capability declaration"""
    card = {
        "name": "Sales Data Analysis Assistant",
        "description": "Analyze sales data and generate monthly/quarterly reports",
        "url": request.host_url.rstrip('/'),
        "capabilities": ["send_task", "get_task", "cancel_task"],
        "authentication": {}
    }
    return jsonify(card)

@app.route('/', methods=['POST'])
def handle_generic():
    """
    Generic A2A request handler
    It is recommended to dispatch to different handling functions based on the method
    """
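    try:
        data = request.get_json(force=True)
    except Exception:
        return jsonify({"jsonrpc": "2.0", "error": {"code": -32700, "message": "Parse error"}, "id": None}), 400

    # Valid JSON that is not an object (e.g. a list or a string) cannot be a single request
    if not isinstance(data, dict):
        return jsonify({"jsonrpc": "2.0", "error": {"code": -32600, "message": "Invalid Request"}, "id": None}), 400

    method = data.get('method')
    params = data.get('params', {})
    req_id = data.get('id', str(uuid.uuid4()))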

    if method == 'send_task':
        # Create a task
        task_id = params.get('task_id', str(uuid.uuid4()))
        tasks[task_id] = {
            "status": "working",
            "messages": [params.get('message')]
        }
        # Execute agent logic (actual LLM call omitted here, just return confirmation)
        tasks[task_id]["status"] = "completed"
        tasks[task_id]["messages"].append({
            "role": "agent",
            "content": f"Analyzed task {task_id}. Result: Sales trend is stable; it is recommended to increase investment in customer care.\n"
        })
        return jsonify({
            "jsonrpc": "2.0",
            "result": {
                "task_id": task_id,
                "status": "completed"
            },
            "id": req_id
        })

    elif method == 'get_task':
        task_id = params.get('task_id')
        task = tasks.get(task_id)
        if not task:
            return jsonify({
                "jsonrpc": "2.0",
                "error": {"code": -32000, "message": "Task not found"},
                "id": req_id
            }), 404
        return jsonify({
            "jsonrpc": "2.0",
            "result": task,
            "id": req_id
        })

    elif method == 'cancel_task':
        task_id = params.get('task_id')
        tasks.pop(task_id, None)
        return jsonify({
            "jsonrpc": "2.0",
            "result": {"task_id": task_id, "status": "cancelled"},
            "id": req_id
        })

    else:
        return jsonify({
            "jsonrpc": "2.0",
            "error": {"code": -32601, "message": "Method not found"},
            "id": req_id
        }), 405

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)

4.2 Local Verification

Start the service in the terminal: python agent_service.py. Then verify the AgentCard and task creation with curl:

# Get the AgentCard
curl http://localhost:8080/.well-known/agent-card | jq .

# Create a task
curl -X POST http://localhost:8080/ \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "send_task",
    "params": {
        "task_id": "test-001",
        "message": {
            "role": "user",
            "content": "Analyze July sales data"
        }
    },
    "id": "req-1"
  }' | jq .

4.3 Deploy to Cloud Run

Write a Dockerfile (omitted; base image python:3.11-slim, expose port 8080, CMD using gunicorn)
Build and push the image: gcloud builds submit --tag gcr.io/[PROJECT_ID]/a2a-agent-v1
Deploy the service: gcloud run deploy a2a-agent-v1 --image gcr.io/[PROJECT_ID]/a2a-agent-v1 --platform managed --region us-central1 --allow-unauthenticated
Note the returned HTTPS service URL; other agents will access it later via this URL.

Note: In production, disable --allow-unauthenticated and configure service account or API key authentication.

5. Hands‑on: Deploy a Second Agent and Implement Mutual Invocation

Now deploy a second Agent (e.g., “Strategy Recommendation Assistant”) that depends on the first Agent’s data analysis results to generate strategy suggestions.

5.1 Second Agent Service Code

Create agent_planner.py. In its internal send_task handling logic, actively call the task endpoint of the first Agent (the data analysis assistant):

import uuid

import requests

# Cloud Run service URL of the first Agent
DATA_AGENT_URL = "https://a2a-agent-v1-xxxxxx-uc.a.run.app"

def call_data_agent(query):
    """Send a task to the data analysis assistant and wait for the result"""
    payload = {
        "jsonrpc": "2.0",
        "method": "send_task",
        "params": {
            # uuid4() returns a UUID object, so convert to str before slicing
            "task_id": f"plan-{str(uuid.uuid4())[:8]}",
            "message": {
                "role": "user",
                "content": query
            }
        },
        "id": str(uuid.uuid4())
    }
    resp = requests.post(DATA_AGENT_URL, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()

In the send_task handler, when a strategy generation request is received, first call the data analysis assistant and then generate the strategy recommendation based on the returned result. The full code pattern is the same as Section 4.1, with the addition of an internal RPC call.
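The handler logic described above could be sketched as follows. The function handle_send_task, the "failed" status, and the result fields analysis and strategy are hypothetical; the data agent call is passed in as a parameter so that the real call_data_agent can be swapped for a stub during testing.

```python
import uuid


def handle_send_task(params, call_data_agent):
    """Handle send_task for the planner: ask the data agent first, then plan.

    call_data_agent is any callable that takes a query string and returns the
    data agent's JSON-RPC response as a dict.
    """
    task_id = params.get("task_id", str(uuid.uuid4()))
    query = params["message"]["content"]

    data_resp = call_data_agent(query)
    if "error" in data_resp:
        # Surface the upstream failure instead of planning on missing data.
        return {"task_id": task_id, "status": "failed", "error": data_resp["error"]}

    analysis = data_resp["result"]
    # Placeholder for the real strategy generation (e.g. an LLM call).
    strategy = f"Strategy for '{query}' based on upstream task {analysis['task_id']}"
    return {
        "task_id": task_id,
        "status": "completed",
        "analysis": analysis,
        "strategy": strategy,
    }


def fake_data_agent(query):
    """Stub standing in for the real HTTP call, so the sketch runs offline."""
    return {
        "jsonrpc": "2.0",
        "id": "stub",
        "result": {"task_id": "plan-1234", "status": "completed"},
    }


result = handle_send_task(
    {"message": {"role": "user", "content": "customer retention"}},
    fake_data_agent,
)
print(result["status"], result["strategy"])
```

Returning the upstream analysis alongside the strategy makes the chain auditable: the caller can see which data task the recommendation was based on.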

5.2 Deploy and Integrated Test

Deploy the second Agent to Cloud Run (same steps as above). Then send a task to the strategy recommendation assistant via curl:

curl -X POST https://agent-planner-xxxxxx-uc.a.run.app/ \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "send_task",
    "params": {
        "message": {"role": "user", "content": "Based on last month'\''s sales data, give a customer retention strategy"}
    },
    "id": "req-2"
  }'

The expected result should contain intermediate results from the data analysis assistant (simplified as a string in the example) as well as the output from the strategy recommendation assistant itself. This chained invocation pattern is very common in multi‑agent collaboration.

6. Key Code Interpretation: Implementing JSON-RPC 2.0 Agent Mutual Invocation

The A2A protocol requires special attention to the following aspects at the code level:

6.1 Core RPC Methods

Method Name	Purpose	Key Parameters	Typical Response
`send_task`	Create and start executing a task	`task_id`, `message`	`{task_id, status}`
`get_task`	Query task status and results	`task_id`	`{status, messages, artifacts}`
`cancel_task`	Cancel a running task	`task_id`	`{task_id, status:"cancelled"}`

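Sharing one set of method names means different agents do not need to negotiate API naming conventions. The names send_task, get_task, and cancel_task are the simplified forms used throughout this article; the published A2A specification uses slash-separated names such as tasks/get and tasks/cancel, so check the exact names in the spec version you target. The id must be unique among a caller's outstanding requests: if two in-flight requests share an id, the caller cannot tell which response belongs to which request. In practice, use UUIDs.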

6.2 AgentCard JSON Structure

{
  "name": "Data Analysis Assistant",
  "description": "Supports sales data statistics, trend analysis, anomaly detection",
  "url": "https://my-agent.example.com",
  "capabilities": ["send_task", "get_task"],
  "authentication": {
    "schemes": [
      { "type": "http", "in": "header", "scheme": "bearer" }
    ]
  }
}

Before sending a task, the caller should retrieve the target agent’s AgentCard and verify that its capabilities include the required RPC method(s). If the capabilities do not match, the call should be refused rather than blindly retried.

6.3 Streaming vs. Non‑Streaming Responses

A2A supports two modes:

Non‑streaming: The server returns the complete result (including intermediate and final results) in a single HTTP response. Suitable for tasks that do not require real‑time progress display.
Streaming: The server uses Transfer-Encoding: chunked or SSE, first returning an intermediate status (e.g., working), then incrementally returning results or the final result. The client must support stream parsing.

On Cloud Run with Flask, implement streaming responses using flask.Response with a generator function. Note: Cloud Run enforces a request timeout (5 minutes by default, configurable up to 60 minutes); long-running streaming tasks should be split into segments or fall back to polling.
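The generator side of a streaming response can be written and tested without Flask. In this sketch, sse_events formats each update as a Server-Sent Events frame; task_updates is a hypothetical progress source that stands in for real agent work.

```python
import json


def sse_events(updates):
    """Yield each update dict as one Server-Sent Events frame.

    An SSE frame is a 'data:' line followed by a blank line.
    """
    for update in updates:
        yield f"data: {json.dumps(update)}\n\n"


def task_updates(task_id):
    """Hypothetical progress source: one 'working' update, then the result."""
    yield {"task_id": task_id, "status": "working"}
    yield {"task_id": task_id, "status": "completed", "content": "Sales trend is stable"}


frames = list(sse_events(task_updates("task-001")))
for frame in frames:
    print(frame, end="")

# In a Flask view the same generator could be returned as:
#   flask.Response(sse_events(task_updates(task_id)), mimetype="text/event-stream")
```

Because the generator is lazy, Flask sends each frame as soon as it is produced, so the client sees the working status before the final result is ready.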

7. Advanced Tips: Optimizing Agent Engine A2A Protocol Configuration

If you are using Vertex AI Agent Engine (part of Vertex AI Agent Builder), the following configuration items allow fine-grained control over A2A behavior:

7.1 Configuration Item Description

Configuration Item	Purpose	Recommended Value
`agent_name`	Unique identifier of the agent within Agent Engine	Should match the `name` in the AgentCard
`authentication`	Sets the `authentication` field in the AgentCard	Use `NONE` for internal calls; API Key is mandatory for cross‑project calls
`allowed_hosts`	Whitelist to restrict requests from specific domains	`["*.mycompany.com"]`
`logging.enabled`	Whether to log A2A requests	true
`timeout`	Maximum wait time for a single RPC call (seconds)	300 (adjust based on actual task complexity)

7.2 Enable Detailed Logging via Environment Variables

Add environment variables in the Agent Engine deployment configuration:

env:
  - name: A2A_LOGGING_LEVEL
    value: DEBUG
  - name: A2A_TRACE_ENABLED
    value: "true"

The logs will output the request body, response body, and processing time for each RPC, which is very helpful for troubleshooting cross‑project call failures.

7.3 Performance Recommendations

Timeout: For internal calls, set 30–60 seconds; for external dependencies (e.g., third‑party APIs), set 10 seconds to avoid blocking the agent with slow calls.
Retry strategy: For send_task returning 5xx or network errors, use exponential backoff (max 3 retries). For get_task polling, an interval of 1–5 seconds is recommended to avoid overwhelming the server.
Connection pool reuse: Use requests.Session or urllib3.PoolManager to manage connections and reduce TLS handshake overhead.
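The retry and polling recommendations above can be sketched with two small helpers. TransientError is a hypothetical stand-in for a 5xx response or a network error, and the sleep function is injectable so the example runs instantly; the defaults mirror the numbers suggested in this section.

```python
import time


class TransientError(Exception):
    """Stand-in for a 5xx response or a network failure."""


def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientError retry up to max_retries times.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))


def poll_until_done(get_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll get_status() at a fixed interval until it reports a final state."""
    for _ in range(max_polls):
        status = get_status()
        if status in ("completed", "cancelled"):
            return status
        sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")


# Demo: a call that fails twice and then succeeds. Sleeping is replaced by
# recording the delays, so nothing actually waits.
attempts = {"n": 0}
delays = []


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError()
    return "ok"


outcome = call_with_backoff(flaky, sleep=delays.append)
print(outcome, delays)
```

Only errors that are safe to retry should be mapped to TransientError; a 4xx response or a JSON-RPC error such as "Method not found" will not succeed on a second attempt.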

8. Pitfall Log and Communication Failure Troubleshooting Guide

The following summarizes typical causes of A2A Agent communication failures and their fixes:

8.1 Network Firewall Blocking Ports

Symptom: Request times out from the initiator; curl returns Connection timed out.
Analysis: Cloud Run serves public traffic over HTTPS on port 443 and redirects plain HTTP requests to HTTPS. If the agent is deployed inside a VPC and listens on a non-standard port, firewall rules or a load balancer need to be configured.
Solution: Ensure both sides use the default HTTPS port (443) and verify that the service endpoint is reachable.

As a quick check, run nc -vz [host] [port] to test port connectivity.

8.2 HTTPS Certificate Mismatch or Invalid

Symptom: requests.exceptions.SSLError.
Analysis: The target service uses a self-signed certificate, or its certificate does not match the host name being called. Cloud Run provisions Google-managed certificates for its run.app URLs and for mapped custom domains, but if you call the service by IP address instead of its domain, the certificate will not match.
Solution: Always use the HTTPS URL generated by Cloud Run; do not resolve to an IP directly.

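If you must call a service with a private or self-signed certificate on an internal network, point verify at the CA bundle (for example verify="/path/to/ca.pem"); verify=False disables certificate checking entirely and is for development only.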

8.3 Duplicate or Missing `id` in JSON‑RPC Request

Symptom: The receiver returns "error": {"code": -32600, "message": "Invalid Request"}.
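Analysis: In JSON-RPC 2.0, a request without an id is a notification, and the server sends no response to it, so a call that expects a result must carry an id. The id should be a string or number; null is discouraged, and an empty string is easy to mistake for a missing value. The specification does not forbid reusing ids, but if two in-flight requests share one, the caller cannot match responses to requests, and a strict receiver may reject the request as in the symptom above.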
Solution: When building the request body, always generate the id using str(uuid.uuid4()) and validate that it is non‑empty in the code.
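On the receiving side, the same rules can be enforced before dispatching. The sketch below is one possible strict validator; the function name validate_request is hypothetical, and rejecting missing or empty ids is a policy choice for agents that always expect to reply, not a requirement of JSON-RPC itself.

```python
def validate_request(data):
    """Return None if data is a usable request, else a JSON-RPC error object."""
    invalid = {"code": -32600, "message": "Invalid Request"}
    if not isinstance(data, dict):
        return invalid
    if data.get("jsonrpc") != "2.0":
        return invalid
    method = data.get("method")
    if not isinstance(method, str) or not method:
        return invalid
    req_id = data.get("id")
    # Policy of this sketch: every call must be answerable, so missing,
    # null, and empty-string ids are rejected.
    if req_id is None or req_id == "":
        return invalid
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(req_id, bool) or not isinstance(req_id, (str, int, float)):
        return invalid
    return None


good = {"jsonrpc": "2.0", "method": "get_task", "params": {"task_id": "t1"}, "id": "req-1"}
bad = {"jsonrpc": "2.0", "method": "get_task", "params": {"task_id": "t1"}, "id": ""}
print(validate_request(good), validate_request(bad))
```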

8.4 Incompatible Agent Capabilities Leading to Task Rejection

Symptom: Returns "error": {"code": -32601, "message": "Method not found"}.
Analysis: The caller tries to invoke an RPC method that the target agent has not declared support for in its AgentCard. For example, the target agent’s AgentCard only lists send_task and get_task, but the caller calls cancel_task.
Solution: Before calling, fetch the target agent’s AgentCard, assert that the required method is in the capabilities array; otherwise, return an error directly and log it.

8.5 Agent Engine Default Authentication Not Disabled Leading to Cross‑project Call Failures

Symptom: 401 or 403 is returned in cross‑project calls.
Analysis: Agent Engine has authentication enabled by default, and the authentication field is not declared in the AgentCard. The caller does not carry a valid token.
Solution:

Set authentication: NONE in the Agent Engine configuration (only for non‑sensitive internal scenarios).
Or generate a token on the caller side and include it in the Authorization header.

The simplest method is to use a Google Cloud ID token (the caller must have the appropriate permissions).
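For a Cloud Run target, fetching an ID token and attaching it might look like the sketch below. It assumes the google-auth package is installed and that credentials are available in the environment (for example a service account attached to the calling service); the audience is assumed to be the target service URL. Whether Agent Engine accepts the same token type depends on its configuration, so treat this as illustrative.

```python
def fetch_id_token(audience):
    """Fetch a Google-signed ID token for the given audience.

    The imports are local so that this module loads even where google-auth
    is not installed; calling the function without it raises ImportError.
    """
    import google.auth.transport.requests
    import google.oauth2.id_token

    auth_request = google.auth.transport.requests.Request()
    return google.oauth2.id_token.fetch_id_token(auth_request, audience)


def auth_headers(token):
    """Build the headers for an authenticated A2A POST request."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# Usage (not executed here because it needs real credentials):
#   token = fetch_id_token("https://a2a-agent-v1-xxxxxx-uc.a.run.app")
#   requests.post(DATA_AGENT_URL, json=payload, headers=auth_headers(token), timeout=30)
print(auth_headers("example-token")["Authorization"])
```

ID tokens expire, so a long-running caller should fetch a fresh token periodically rather than caching one for the lifetime of the process.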

8.6 CORS Policy Refusing Browser Calls

Symptom: When a front‑end web page directly calls the Agent service, the browser console shows a CORS error.
Analysis: Cloud Run does not allow cross‑origin requests by default.
Solution: Add a CORS middleware in the Flask application:

from flask_cors import CORS
CORS(app, resources={r"/*": {"origins": "*"}})

In production, limit origins to specific trusted domains.

9. Multi‑Agent Collaboration Patterns and Design Suggestions

9.1 Common Collaboration Patterns

Sequential invocation (task chain): Agent A calls Agent B, Agent B calls Agent C, forming a pipeline. Suitable for data processing flows (e.g., cleaning -> analysis -> report).
Parallel invocation (result aggregation): A supervisor agent sends tasks to multiple child agents simultaneously and aggregates the results after all have returned. Suitable for multi‑dimensional analysis (e.g., querying sales, inventory, and customer feedback at the same time).
Dynamic discovery (registry): Maintain a central AgentCard registry (e.g., via a service mesh or a simple configuration center). When a new agent registers, other agents can learn about its capabilities and endpoints by querying the registry, enabling hot‑plugging.
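The parallel invocation pattern above can be sketched with a thread pool. The function fan_out and the agent URLs are hypothetical; call_agent is injected so that the real A2A call can be replaced by a stub, as it is here.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(agents, query, call_agent, max_workers=8):
    """Send the same query to every agent in parallel and collect the results.

    agents maps a label to an endpoint URL; call_agent(url, query) performs the
    actual A2A call. A failure of one agent is recorded, not raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(call_agent, url, query)
            for name, url in agents.items()
        }
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "result": future.result()}
            except Exception as exc:
                results[name] = {"ok": False, "error": str(exc)}
    return results


def fake_call(url, query):
    """Stub for the real A2A call; the inventory agent is made to fail."""
    if "inventory" in url:
        raise RuntimeError("agent unavailable")
    return f"{url} answered '{query}'"


agents = {
    "sales": "https://sales.example.com",
    "inventory": "https://inventory.example.com",
    "feedback": "https://feedback.example.com",
}
summary = fan_out(agents, "Q3 overview", fake_call)
for name, entry in summary.items():
    print(name, entry)
```

Recording failures per agent lets the supervisor decide whether a partial answer is acceptable instead of failing the whole request when one child agent is down.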

9.2 Role Division Design Principles

Single responsibility: Each agent is responsible for the business logic of only one domain; do not mix functionalities. For example, a data analysis agent should not send emails; an email agent should not generate reports.
Stateless design: Keep agent services stateless where possible and store task state in an external database or cache. In-memory state, as in the sample code above, is lost on restart and is not shared across instances. Statelessness is what enables horizontal scaling.
Orchestration by an entry agent: Typically, an “orchestration agent” acts as the external entry point, responsible for receiving user requests, splitting tasks, making A2A calls, collecting results, and assembling the final output.

The orchestration agent itself should not execute concrete business logic—only scheduling.

9.3 A2A Protocol vs. Simple HTTP Calls

Comparison Item	Simple HTTP Call	A2A Protocol
Standardization	None, requires bilateral negotiation	Yes, based on JSON‑RPC 2.0
Self‑discovery	None	Supported via AgentCard
Task management	None	Full lifecycle
Cross‑platform	Requires adaptation	Natively supported
Debugging tools	Few	Can be debugged with curl + jq

Therefore, for multi-agent collaboration, the A2A protocol is preferred unless there is only a single point-to-point call and scalability is not a concern.

10. Summary and Further Exploration

This article has taken you from core principles to complete hands‑on examples, step‑by‑step, explaining how to implement cross‑platform multi‑agent inter‑communication and invocation based on the A2A protocol. Key takeaways:

The A2A protocol is an open standard based on JSON‑RPC 2.0 over HTTP(S). It enables capability discovery via AgentCard and manages task status through the Task lifecycle.
Hands-on deployment: You can quickly set up an A2A Agent service using Python and Flask on Cloud Run and enable agents to call each other via HTTP POST requests.
The core of JSON-RPC 2.0 agent inter-calling lies in send_task, get_task, and cancel_task, together with conventions such as id uniqueness and capability checking.
Agent Engine configuration optimization includes authentication mode, log level, whitelisting, and timeout policies.
Troubleshooting should first check network connectivity, certificates, JSON‑RPC format, agent capability compatibility, and authentication configuration.

As the AI Agent ecosystem evolves, the A2A protocol will complement MCP (Model Context Protocol)—MCP serves as the universal interface between models and tools, while A2A is the collaboration protocol between agents. In the future, large organizations may form agent meshes managed by a central registry that unifies the A2A endpoints of hundreds of agents.

The community is also discussing support for additional transport layers (e.g., WebSocket, gRPC) and higher security requirements (e.g., mutual TLS).

I recommend that teams gradually build an agent capability library in internal projects: first define the AgentCard specification, ensure that newly developed agents expose standard A2A endpoints; then validate sequential and parallel invocation patterns in limited scopes; and finally implement an orchestrated multi‑agent collaboration system to improve business response efficiency.

Conclusion

Through this article, you should now have a deeper understanding of the “A2A Protocol Multi‑Agent Communication Principles.” I encourage you to practice by applying it to real projects. If you have any questions, feel free to discuss!