Beyond Monoliths: Crafting Interoperable AI Systems with Google’s A2A Agent Architecture

Sandeep Belgavi

In today’s interconnected digital landscape, building intelligent systems often means orchestrating multiple specialized services. Google’s Agent-to-Agent (A2A) protocol provides an elegant, standardized way for these services — or “agents” — to discover, interact, and collaborate. This article provides a technical deep dive into a generic A2A architecture and its components.

---

1. High-Level Architecture Diagram

At its essence, an A2A system orchestrates specialized agents (servers) to expose their capabilities and clients (orchestrators or other agents) to intelligently consume those capabilities.

+-----------------------------------+                    +-------------------------------------+
|      A2A Client (Orchestrator)    |                    |      A2A Agent (Server)             |
|                                   |                    | +---------------------------------+ |
| 1. Discover Agent Manifest        |     HTTP GET       | | /.well-known/agent.json         | |
|    (Identifies Skills, Schemas)   |<-------------------| | (The Agent's Public Resume)     | |
|                                   |                    | +---------------------------------+ |
|                                   |                    |                                     |
| 2. Invoke Skill (Send Task)       |     HTTP POST      | +---------------------------------+ |
|    (A2A Request, JSON-RPC 2.0)    |------------------->| | /tasks/send                     | |
|                                   |                    | | (The Agent's Action Center)     | |
|                                   |                    | +---------+-----------------------+ |
|                                   |                    |           |                         |
|                                   |                    |           | Internal Processing     |
|                                   |                    |           | (NLP, Vector Embedding, |
|                                   |                    |           | Database Queries,       |
|                                   |                    |           | AI Logic)               |
|                                   |                    |           |                         |
|                                   |                    | +---------v-----------------------+ |
| 3. Receive Response               |<-------------------| | Structured A2A Response         | |
|    (A2A Response, JSON-RPC 2.0)   |                    | | (Results, Status, Artifacts)    | |
+-----------------------------------+                    +-------------------------------------+
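
Concretely, steps 2 and 3 in the diagram exchange JSON-RPC 2.0 envelopes over HTTP. The minimal, illustrative pair below (expressed as Python dicts, with placeholder values and field names mirroring the snippets later in this article) shows the shape of a request and its response:

# Illustrative JSON-RPC 2.0 envelopes for steps 2 and 3 above.
# Values are placeholders; field names mirror the server/client code below.
a2a_request = {
    "jsonrpc": "2.0",
    "id": "req-001",  # request ID, echoed back in the response
    "params": {
        "id": "task-001",  # task ID
        "message": {"parts": [{"type": "text", "text": "Alice"}]}
    }
}

a2a_response = {
    "jsonrpc": "2.0",
    "id": "req-001",
    "result": {
        "id": "task-001",
        "status": {"state": "completed", "timestamp": "2025-01-01T00:00:00"},
        "history": [],
        "artifacts": [{"parts": [{"type": "text", "text": "Hello, Alice!"}], "index": 0}]
    }
}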
---

The Core Pillars of A2A

At its heart, an A2A system comprises two main actors:

- The A2A Agent (server): a specialized service that publishes a manifest describing its skills and executes the tasks sent to it.
- The A2A Client (orchestrator): an application, user interface, or another agent that discovers those skills and invokes them to accomplish a goal.

Let’s break down how they interact.

---

The A2A Agent: Your Intelligent Service’s Public Face

Every A2A agent acts as a server, making its functionalities known to the world through two crucial endpoints: the Manifest and the Task Endpoint.

a. The Agent Manifest: Your Service’s Resume (/.well-known/agent.json)

Imagine a central registry where every intelligent service publishes its capabilities. The A2A manifest fulfills this role. It’s a JSON document served at a standard, predictable URL (e.g., http://your-agent.com/.well-known/agent.json).

This manifest isn’t just metadata; it’s a contract. It details:

- Identity: the agent’s name, description, version, and provider.
- Location: the base url at which the agent can be reached.
- Skills: each skill’s id, name, and description, together with the input_schema and output_schema that define exactly what the skill accepts and returns.

Code Snippet: A2A Agent Manifest (from a2a_server.py)

# a2a_server.py
from fastapi import FastAPI, Request        # Request is used by the /tasks/send endpoint below
from fastapi.responses import JSONResponse  # JSONResponse is used by the /tasks/send endpoint below

app = FastAPI()

@app.get("/.well-known/agent.json")
async def agent_manifest():
    return {
        "name": "Simple Greeting Agent",
        "description": "An agent that can generate personalized greetings.",
        "url": "http://localhost:8000/",
        "version": "1.0.0",
        "provider": {
            "organization": "Sample Co.",
            "url": "https://example.com"
        },
        "skills": [
            {
                "id": "greet-user",
                "name": "Greet User",
                "description": "Generates a personalized greeting message for a given name.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "message": {
                            "type": "object",
                            "properties": {
                                "parts": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "type": {"type": "string", "enum": ["text"]},
                                            "text": {"type": "string", "description": "The name to greet."}
                                        },
                                        "required": ["type", "text"]
                                    }
                                }
                            },
                            "required": ["parts"]
                        }
                    },
                    "required": ["message"]
                },
                "output_schema": {
                    "type": "object",
                    "properties": {
                        "greeting_message": {
                            "type": "string",
                            "description": "The personalized greeting generated by the agent."
                        }
                    },
                    "required": ["greeting_message"]
                }
            }
        ]
    }
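
The manifest endpoint above assumes a running FastAPI application. A minimal, illustrative entry point for the same file (assuming uvicorn is installed) could look like this:

# a2a_server.py (continued) -- illustrative entry point; assumes
# `pip install fastapi uvicorn` and matches the port advertised in the manifest.
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)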

b. The Task Endpoint: The Operational Core (/tasks/send)

Once a client has parsed an agent’s manifest and identified the desired skill, it sends an HTTP POST request to the agent’s /tasks/send endpoint. This endpoint is the agent’s operational heart: it triggers execution of the requested skill.

Upon receiving the payload, the agent’s internal logic performs a sequence of steps such as:

- Parsing the JSON-RPC 2.0 envelope to extract the request ID, task ID, and message parts.
- Executing the core skill logic (for example NLP, vector embeddings, database queries, or other AI logic).
- Assembling a structured, JSON-RPC 2.0 compliant response containing the task status and output artifacts.

Code Snippet: Illustrative A2A Task Endpoint (from a2a_server.py)

# a2a_server.py (continued; the FastAPI app, Request, and JSONResponse are set up above)
from datetime import datetime  # Used for the status timestamp
@app.post("/tasks/send")
async def send_task(request: Request):
    payload = await request.json()
    # Extract task_id, request_id from payload (simplified for brevity)
    task_id = payload.get("params", {}).get("id") or payload.get("id", "default_task_id")
    request_id = payload.get("id", "default_request_id")

    # Extract user text from the message parts
    message_parts = payload.get("params", {}).get("message", {}).get("parts", [])
    user_text = ""
    if message_parts and isinstance(message_parts[0], dict) and "text" in message_parts[0]:
        user_text = message_parts[0]["text"]
    elif message_parts and isinstance(message_parts[0], str): # Fallback for plain string message
        user_text = message_parts[0]
    # Core logic based on user_text (e.g., generate a simple greeting)
    greeting = f"Hello, {user_text}! A pleasure to interact with you!" if user_text else "Greetings, esteemed user!"

    # Construct A2A compliant response adhering to JSON-RPC 2.0 and output_schema
    response_payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "id": task_id,
            "status": {
                "state": "completed", # Can also be "failed"
                "timestamp": datetime.utcnow().isoformat()
            },
            "history": [], # For stateful agents, this would contain history of state transitions
            "artifacts": [
                {
                    "parts": [
                        {
                            "type": "text",
                            "text": greeting # Primary textual output
                        }
                    ],
                    "index": 0
                }
            ]
        },
        "greeting_message": greeting # Custom field as per output_schema
    }
    return JSONResponse(response_payload)
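
Since the manifest’s input_schema is a contract, the server can also enforce it. The sketch below is an optional addition, not part of the original handler; it assumes the jsonschema package is installed and checks the incoming params before the core logic runs:

# a2a_server.py (optional, illustrative) -- validate incoming params against the
# skill's input_schema before processing; assumes `pip install jsonschema`.
from jsonschema import ValidationError, validate

GREET_USER_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "message": {
            "type": "object",
            "properties": {"parts": {"type": "array"}},
            "required": ["parts"]
        }
    },
    "required": ["message"]
}

def validate_task_params(params: dict) -> str | None:
    """Return None if params satisfy the schema, otherwise a readable error message."""
    try:
        validate(instance=params, schema=GREET_USER_INPUT_SCHEMA)
        return None
    except ValidationError as exc:
        return exc.message

On a validation failure, the handler could return a response whose status.state is "failed" instead of "completed", mirroring the comment in the snippet above.
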
---

The A2A Client: The Intelligent Orchestrator

An A2A client is any application or service that wants to leverage the capabilities of an A2A agent. This could be a user interface, a chatbot, or even another A2A agent acting as an orchestrator for a multi-step workflow.

The client’s workflow is straightforward:

1. Discover: fetch the agent’s manifest from /.well-known/agent.json.
2. Select: parse the manifest to find a skill that matches the task at hand.
3. Invoke: send a JSON-RPC 2.0 request to /tasks/send with the message structure the skill expects.
4. Process: parse the structured response (status, artifacts, and any custom fields) and act on the result.

Code Snippet: Illustrative A2A Client Interaction (from a2a_client.py)

# a2a_client.py
import requests
import json
from datetime import datetime # Import datetime for dynamic IDs

AGENT_BASE_URL = "http://localhost:8000" # Ensure this matches your server's address

def discover_agent_manifest(agent_url: str) -> dict:
    """
    Fetches and returns the agent's manifest, raising an exception on HTTP errors.
    """
    manifest_url = f"{agent_url}/.well-known/agent.json"
    print(f"Client: Attempting to discover manifest at: {manifest_url}")
    response = requests.get(manifest_url)
    response.raise_for_status() # Raises HTTPError for bad responses (4xx or 5xx)
    manifest = response.json()
    print("\nClient: --- Agent Manifest Discovered ---")
    print(json.dumps(manifest, indent=2))
    return manifest

def invoke_agent_skill(agent_url: str, skill_id: str, message_text: str) -> dict:
    """
    Invokes a specific skill on the agent with a given message text.
    """
    task_send_url = f"{agent_url}/tasks/send"

    # Construct the A2A compliant request payload
    request_payload = {
        "jsonrpc": "2.0",
        "id": "req-client-" + datetime.now().strftime("%Y%m%d%H%M%S%f"), # Dynamic unique request ID
        "params": {
            "id": "task-client-" + datetime.now().strftime("%Y%m%d%H%M%S%f"), # Dynamic unique task ID
            "message": {
                "parts": [
                    {"type": "text", "text": message_text}
                ]
            }
        }
    }

    print(f"\nClient: --- Sending Task to Agent for skill '{skill_id}' ---")
    print(f"Client: Request Payload: {json.dumps(request_payload, indent=2)}")

    response = requests.post(task_send_url, json=request_payload)
    response.raise_for_status()
    result = response.json()
    print("\nClient: --- Agent Response Received ---")
    print(json.dumps(result, indent=2))
    return result

if __name__ == "__main__":
    # Ensure the A2A server (a2a_server.py) is running before executing the client
    try:
        # Step 1: Client dynamically discovers the agent's capabilities
        manifest = discover_agent_manifest(AGENT_BASE_URL)

        # In a production client, you'd parse the manifest to find the desired skill
        # using more sophisticated matching logic based on user intent.
        # Here, we simply look up the 'greet-user' skill by its ID.
        skill_to_invoke = None
        for skill in manifest.get("skills", []):
            if skill.get("id") == "greet-user":
                skill_to_invoke = skill.get("id")
                break

        if skill_to_invoke:
            # Step 2 & 3: Client constructs and sends a request to invoke the skill
            print(f"\nClient: Invoking '{skill_to_invoke}' Skill with input 'Alice'...")
            response = invoke_agent_skill(AGENT_BASE_URL, skill_to_invoke, "Alice")

            # Step 4: Client parses the response and takes action
            status = response.get("result", {}).get("status", {}).get("state")
            # Safely extract text from artifacts, checking nested dictionaries
            output_artifact_text = "N/A"
            if response.get("result", {}).get("artifacts"):
                first_artifact = response["result"]["artifacts"][0]
                if first_artifact.get("parts"):
                    first_part = first_artifact["parts"][0]
                    output_artifact_text = first_part.get("text", "N/A")

            # Accessing custom fields from the output_schema
            custom_greeting = response.get("greeting_message", "N/A")

            print(f"\nClient: --- Processed Results ---")
            print(f"Client: Task Status: {status}")
            print(f"Client: Agent's Primary Text Artifact: {output_artifact_text}")
            print(f"Client: Agent's Custom Greeting Field: {custom_greeting}")

            if status == "completed" and "Alice" in output_artifact_text:
                print("\nClient: Successfully received a personalized greeting for Alice!")
            else:
                print("\nClient: Task did not complete successfully or response was unexpected.")
        else:
            print("\nClient: No 'greet-user' skill found in the agent manifest.")
    except requests.exceptions.ConnectionError:
        print(f"\nClient: Error: Could not connect to the agent at {AGENT_BASE_URL}.")
        print("Please ensure the 'a2a_server.py' is running and accessible.")
    except requests.exceptions.RequestException as e:
        print(f"\nClient: An unexpected error occurred during interaction: {e}")
    except Exception as e:
        print(f"\nClient: An unhandled error occurred: {e}")
---

The Strategic Benefits of A2A: Why it Matters for Your Architecture

Adopting an A2A-inspired architecture offers several concrete advantages, making it an increasingly attractive paradigm for modern system design:

- Interoperability: standardized discovery (the manifest) and invocation (JSON-RPC 2.0 over HTTP) let agents built by different teams, on different stacks, work together.
- Loose coupling: each agent exposes a contract rather than an implementation, so its internals can evolve independently.
- Dynamic discovery: clients learn an agent’s skills and schemas at runtime instead of hard-coding integrations.
- Composability: agents can act as clients of other agents, enabling orchestrated, multi-step workflows.
- Incremental adoption: existing services can be wrapped with a manifest and a task endpoint rather than rewritten.

---

Conclusion

Google’s A2A protocol provides a robust blueprint for constructing intelligent, scalable, and interoperable systems. By adhering to its standardized discovery and interaction mechanisms, developers can unlock the potential of distributed intelligence and build sophisticated, adaptive applications that were once prohibitively complex to integrate. Whether your goal is to integrate existing legacy services, expose new AI capabilities, or orchestrate complex workflows, an A2A-inspired architecture is a powerful, forward-looking paradigm for your next engineering endeavor.