Beyond Monoliths: Crafting Interoperable AI Systems with Google’s A2A Agent Architecture

Sandeep Belgavi

In today’s interconnected digital landscape, building intelligent systems often means orchestrating multiple specialized services. Google’s Agent-to-Agent (A2A) protocol provides an elegant, standardized way for these services — or “agents” — to discover, interact, and collaborate. This article provides a technical deep dive into a generic A2A architecture and its components.

---

1. High-Level Architecture Diagram

At its essence, an A2A system orchestrates specialized agents (servers) to expose their capabilities and clients (orchestrators or other agents) to intelligently consume those capabilities.

+-----------------------------------+                    +-------------------------------------+
|      A2A Client (Orchestrator)    |                    |      A2A Agent (Server)             |
|                                   |                    | +---------------------------------+ |
| 1. Discover Agent Manifest        |     HTTP GET       | | /.well-known/agent.json         | |
|    (Identifies Skills, Schemas)   |<-------------------| | (The Agent's Public Resume)     | |
|                                   |                    | +---------------------------------+ |
|                                   |                    |                                     |
| 2. Invoke Skill (Send Task)       |     HTTP POST      | +---------------------------------+ |
|    (A2A Request, JSON-RPC 2.0)    |------------------->| | /tasks/send                     | |
|                                   |                    | | (The Agent's Action Center)     | |
|                                   |                    | +---------+-----------------------+ |
|                                   |                    |           |                         |
|                                   |                    |           | Internal Processing     |
|                                   |                    |           | (NLP, Vector Embedding, |
|                                   |                    |           | Database Queries,       |
|                                   |                    |           | AI Logic)               |
|                                   |                    |           |                         |
|                                   |                    | +---------v-----------------------+ |
| 3. Receive Response               |<-------------------| | Structured A2A Response         | |
|    (A2A Response, JSON-RPC 2.0)   |                    | | (Results, Status, Artifacts)    | |
+-----------------------------------+                    +-------------------------------------+
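
Concretely, steps 2 and 3 in the diagram exchange JSON-RPC 2.0 envelopes over HTTP. The minimal, illustrative pair below (expressed as Python dicts, with placeholder values and field names mirroring the snippets later in this article) shows the shape of a request and its response:

# Illustrative JSON-RPC 2.0 envelopes for steps 2 and 3 above.
# Values are placeholders; field names mirror the server/client code below.
a2a_request = {
    "jsonrpc": "2.0",
    "id": "req-001",  # request ID, echoed back in the response
    "params": {
        "id": "task-001",  # task ID
        "message": {"parts": [{"type": "text", "text": "Alice"}]}
    }
}

a2a_response = {
    "jsonrpc": "2.0",
    "id": "req-001",
    "result": {
        "id": "task-001",
        "status": {"state": "completed", "timestamp": "2025-01-01T00:00:00"},
        "history": [],
        "artifacts": [{"parts": [{"type": "text", "text": "Hello, Alice!"}], "index": 0}]
    }
}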
---

The Core Pillars of A2A

At its heart, an A2A system comprises two main actors:

- The A2A Agent (server): a specialized service that publishes a manifest describing its skills and executes the tasks sent to it.
- The A2A Client (orchestrator): an application, user interface, or another agent that discovers those skills and invokes them to accomplish a goal.

Let’s break down how they interact.

---

The A2A Agent: Your Intelligent Service’s Public Face

Every A2A agent acts as a server, making its functionalities known to the world through two crucial endpoints: the Manifest and the Task Endpoint.

a. The Agent Manifest: Your Service’s Resume (/.well-known/agent.json)

Imagine a central registry where every intelligent service publishes its capabilities. The A2A manifest fulfills this role. It’s a JSON document served at a standard, predictable URL (e.g., http://your-agent.com/.well-known/agent.json).

This manifest isn’t just metadata; it’s a contract. It details:

- Identity: the agent’s name, description, version, and provider.
- Location: the base url at which the agent can be reached.
- Skills: each skill’s id, name, and description, together with the input_schema and output_schema that define exactly what the skill accepts and returns.

Code Snippet: A2A Agent Manifest (from a2a_server.py)

# a2a_server.py
from fastapi import FastAPI, Request        # Request is used by the /tasks/send endpoint below
from fastapi.responses import JSONResponse  # JSONResponse is used by the /tasks/send endpoint below

app = FastAPI()

@app.get("/.well-known/agent.json")
async def agent_manifest():
    return {
        "name": "Simple Greeting Agent",
        "description": "An agent that can generate personalized greetings.",
        "url": "http://localhost:8000/",
        "version": "1.0.0",
        "provider": {
            "organization": "Sample Co.",
            "url": "https://example.com"
        },
        "skills": [
            {
                "id": "greet-user",
                "name": "Greet User",
                "description": "Generates a personalized greeting message for a given name.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "message": {
                            "type": "object",
                            "properties": {
                                "parts": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "type": {"type": "string", "enum": ["text"]},
                                            "text": {"type": "string", "description": "The name to greet."}
                                        },
                                        "required": ["type", "text"]
                                    }
                                }
                            },
                            "required": ["parts"]
                        }
                    },
                    "required": ["message"]
                },
                "output_schema": {
                    "type": "object",
                    "properties": {
                        "greeting_message": {
                            "type": "string",
                            "description": "The personalized greeting generated by the agent."
                        }
                    },
                    "required": ["greeting_message"]
                }
            }
        ]
    }
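
The manifest endpoint above assumes a running FastAPI application. A minimal, illustrative entry point for the same file (assuming uvicorn is installed) could look like this:

# a2a_server.py (continued) -- illustrative entry point; assumes
# `pip install fastapi uvicorn` and matches the port advertised in the manifest.
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)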

b. The Task Endpoint: The Operational Core (/tasks/send)

Once a client has parsed an agent’s manifest and identified the desired skill, it sends an HTTP POST request to the agent’s /tasks/send endpoint. This endpoint is the agent’s operational heart: it triggers execution of the requested skill.

Upon receiving the payload, the agent’s internal logic performs a sequence of steps such as:

- Parsing the JSON-RPC 2.0 envelope to extract the request ID, task ID, and message parts.
- Executing the core skill logic (for example NLP, vector embeddings, database queries, or other AI logic).
- Assembling a structured, JSON-RPC 2.0 compliant response containing the task status and output artifacts.

Code Snippet: Illustrative A2A Task Endpoint (from a2a_server.py)

# a2a_server.py (continued; the FastAPI app, Request, and JSONResponse are set up above)
from datetime import datetime  # Used for the status timestamp
@app.post("/tasks/send")
async def send_task(request: Request):
    payload = await request.json()
    # Extract task_id, request_id from payload (simplified for brevity)
    task_id = payload.get("params", {}).get("id") or payload.get("id", "default_task_id")
    request_id = payload.get("id", "default_request_id")

    # Extract user text from the message parts
    message_parts = payload.get("params", {}).get("message", {}).get("parts", [])
    user_text = ""
    if message_parts and isinstance(message_parts[0], dict) and "text" in message_parts[0]:
        user_text = message_parts[0]["text"]
    elif message_parts and isinstance(message_parts[0], str): # Fallback for plain string message
        user_text = message_parts[0]
    # Core logic based on user_text (e.g., generate a simple greeting)
    greeting = f"Hello, {user_text}! A pleasure to interact with you!" if user_text else "Greetings, esteemed user!"

    # Construct A2A compliant response adhering to JSON-RPC 2.0 and output_schema
    response_payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "id": task_id,
            "status": {
                "state": "completed", # Can also be "failed"
                "timestamp": datetime.utcnow().isoformat()
            },
            "history": [], # For stateful agents, this would contain history of state transitions
            "artifacts": [
                {
                    "parts": [
                        {
                            "type": "text",
                            "text": greeting # Primary textual output
                        }
                    ],
                    "index": 0
                }
            ]
        },
        "greeting_message": greeting # Custom field as per output_schema
    }
    return JSONResponse(response_payload)
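
Since the manifest’s input_schema is a contract, the server can also enforce it. The sketch below is an optional addition, not part of the original handler; it assumes the jsonschema package is installed and checks the incoming params before the core logic runs:

# a2a_server.py (optional, illustrative) -- validate incoming params against the
# skill's input_schema before processing; assumes `pip install jsonschema`.
from jsonschema import ValidationError, validate

GREET_USER_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "message": {
            "type": "object",
            "properties": {"parts": {"type": "array"}},
            "required": ["parts"]
        }
    },
    "required": ["message"]
}

def validate_task_params(params: dict) -> str | None:
    """Return None if params satisfy the schema, otherwise a readable error message."""
    try:
        validate(instance=params, schema=GREET_USER_INPUT_SCHEMA)
        return None
    except ValidationError as exc:
        return exc.message

On a validation failure, the handler could return a response whose status.state is "failed" instead of "completed", mirroring the comment in the snippet above.
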
---

The A2A Client: The Intelligent Orchestrator

An A2A client is any application or service that wants to leverage the capabilities of an A2A agent. This could be a user interface, a chatbot, or even another A2A agent acting as an orchestrator for a multi-step workflow.

The client’s workflow is straightforward:

1. Discover: fetch the agent’s manifest from /.well-known/agent.json.
2. Select: parse the manifest to find a skill that matches the task at hand.
3. Invoke: send a JSON-RPC 2.0 request to /tasks/send with the message structure the skill expects.
4. Process: parse the structured response (status, artifacts, and any custom fields) and act on the result.

Code Snippet: Illustrative A2A Client Interaction (from a2a_client.py)

# a2a_client.py
import requests
import json
from datetime import datetime # Import datetime for dynamic IDs

AGENT_BASE_URL = "http://localhost:8000" # Ensure this matches your server's address

def discover_agent_manifest(agent_url: str) -> dict:
    """
    Fetches and returns the agent's manifest, raising an exception on HTTP errors.
    """
    manifest_url = f"{agent_url}/.well-known/agent.json"
    print(f"Client: Attempting to discover manifest at: {manifest_url}")
    response = requests.get(manifest_url)
    response.raise_for_status() # Raises HTTPError for bad responses (4xx or 5xx)
    manifest = response.json()
    print("\nClient: --- Agent Manifest Discovered ---")
    print(json.dumps(manifest, indent=2))
    return manifest

def invoke_agent_skill(agent_url: str, skill_id: str, message_text: str) -> dict:
    """
    Invokes a specific skill on the agent with a given message text.
    """
    task_send_url = f"{agent_url}/tasks/send"

    # Construct the A2A compliant request payload
    request_payload = {
        "jsonrpc": "2.0",
        "id": "req-client-" + datetime.now().strftime("%Y%m%d%H%M%S%f"), # Dynamic unique request ID
        "params": {
            "id": "task-client-" + datetime.now().strftime("%Y%m%d%H%M%S%f"), # Dynamic unique task ID
            "message": {
                "parts": [
                    {"type": "text", "text": message_text}
                ]
            }
        }
    }

    print(f"\nClient: --- Sending Task to Agent for skill '{skill_id}' ---")
    print(f"Client: Request Payload: {json.dumps(request_payload, indent=2)}")

    response = requests.post(task_send_url, json=request_payload)
    response.raise_for_status()
    result = response.json()
    print("\nClient: --- Agent Response Received ---")
    print(json.dumps(result, indent=2))
    return result

if __name__ == "__main__":
    # Ensure the A2A server (a2a_server.py) is running before executing the client
    try:
        # Step 1: Client dynamically discovers the agent's capabilities
        manifest = discover_agent_manifest(AGENT_BASE_URL)

        # In a production client, you'd parse the manifest to find the desired skill
        # using more sophisticated matching logic based on user intent.
        # Here, we simply look up the 'greet-user' skill by its ID.
        skill_to_invoke = None
        for skill in manifest.get("skills", []):
            if skill.get("id") == "greet-user":
                skill_to_invoke = skill.get("id")
                break

        if skill_to_invoke:
            # Step 2 & 3: Client constructs and sends a request to invoke the skill
            print(f"\nClient: Invoking '{skill_to_invoke}' Skill with input 'Alice'...")
            response = invoke_agent_skill(AGENT_BASE_URL, skill_to_invoke, "Alice")

            # Step 4: Client parses the response and takes action
            status = response.get("result", {}).get("status", {}).get("state")
            # Safely extract text from artifacts, checking nested dictionaries
            output_artifact_text = "N/A"
            if response.get("result", {}).get("artifacts"):
                first_artifact = response["result"]["artifacts"][0]
                if first_artifact.get("parts"):
                    first_part = first_artifact["parts"][0]
                    output_artifact_text = first_part.get("text", "N/A")

            # Accessing custom fields from the output_schema
            custom_greeting = response.get("greeting_message", "N/A")

            print(f"\nClient: --- Processed Results ---")
            print(f"Client: Task Status: {status}")
            print(f"Client: Agent's Primary Text Artifact: {output_artifact_text}")
            print(f"Client: Agent's Custom Greeting Field: {custom_greeting}")

            if status == "completed" and "Alice" in output_artifact_text:
                print("\nClient: Successfully received a personalized greeting for Alice!")
            else:
                print("\nClient: Task did not complete successfully or response was unexpected.")
        else:
            print("\nClient: No 'greet-user' skill found in the agent manifest.")
    except requests.exceptions.ConnectionError:
        print(f"\nClient: Error: Could not connect to the agent at {AGENT_BASE_URL}.")
        print("Please ensure the 'a2a_server.py' is running and accessible.")
    except requests.exceptions.RequestException as e:
        print(f"\nClient: An unexpected error occurred during interaction: {e}")
    except Exception as e:
        print(f"\nClient: An unhandled error occurred: {e}")
---

The Strategic Benefits of A2A: Why it Matters for Your Architecture

Adopting an A2A-inspired architecture offers several concrete advantages, making it an increasingly attractive paradigm for modern system design:

- Interoperability: standardized discovery (the manifest) and invocation (JSON-RPC 2.0 over HTTP) let agents built by different teams, on different stacks, work together.
- Loose coupling: each agent exposes a contract rather than an implementation, so its internals can evolve independently.
- Dynamic discovery: clients learn an agent’s skills and schemas at runtime instead of hard-coding integrations.
- Composability: agents can act as clients of other agents, enabling orchestrated, multi-step workflows.
- Incremental adoption: existing services can be wrapped with a manifest and a task endpoint rather than rewritten.

---

Conclusion

Google’s A2A protocol provides a robust blueprint for constructing intelligent, scalable, and interoperable systems. By adhering to its standardized discovery and interaction mechanisms, developers can unlock the potential of distributed intelligence and build sophisticated, adaptive applications that were once prohibitively complex to integrate. Whether your goal is to integrate existing legacy services, expose new AI capabilities, or orchestrate complex workflows, an A2A-inspired architecture is a powerful, forward-looking paradigm for your next engineering endeavor.