

Background Work, Frontline Gains: Unlocking Async with A2A

Exploring the power of Google's Agent to Agent (A2A) protocol for building scalable, asynchronous AI systems. Learn how async patterns are foundational for intelligent agent-based applications and why A2A could become the standard for modern AI workflows.


One of the earliest realizations I had as a software engineer was just how long certain tasks can take in production environments. Early in my career, large PDFs could exhaust memory or block other incoming connections while they were being read. Working in healthcare and academic systems, I became all too familiar with the challenge of handling oversized files.

In the Ruby ecosystem, we had a fantastic library called Sidekiq. It made offloading long-running jobs to background workers incredibly straightforward. It let us queue up tasks, track their progress, and define success or failure callbacks that integrated cleanly with the rest of the application. It fundamentally changed how I thought about asynchronous processing.

Today, I'm building AI-enabled applications where one of our primary goals is to empower both an AI copilot and users with tools that reduce repetitive workload. One of my favorite examples is how our copilot can autofill tedious forms. Users can simply paste the content of an email request into the app, and the copilot intelligently extracts the relevant details and fills in the form. It's a small feature with a massive productivity gain.
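In production, that extraction is done by the copilot's language model, but the shape of the feature can be sketched with a toy stand-in. The field names and patterns below are purely illustrative, not our actual schema:

```python
import re

# Illustrative only: extract labeled fields from a pasted email request.
# A real copilot would use an LLM here; regexes are a toy stand-in.
FIELD_PATTERNS = {
    "name": re.compile(r"(?im)^name:\s*(.+)$"),
    "email": re.compile(r"(?im)^email:\s*(.+)$"),
    "request": re.compile(r"(?im)^request:\s*(.+)$"),
}

def extract_fields(email_body: str) -> dict:
    """Pull out whichever labeled fields appear in the email text."""
    fields = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(email_body)
        if match:
            fields[key] = match.group(1).strip()
    return fields

sample = """Hi team,
Name: Jane Doe
Email: jane@example.com
Request: Update billing address
Thanks!"""

print(extract_fields(sample))
```

The point is the workflow, not the parsing: the user pastes free-form text, the system maps it onto structured form fields, and the form is pre-filled instead of typed out by hand.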

The next frontier for us is optimizing more of our long-running tasks to run asynchronously. One example is processing lengthy PDFs into our Retrieval Augmented Generation (RAG) database. As the data grows, running these processes in a blocking, synchronous way becomes unsustainable.

The Power of Google's A2A Protocol

This is where Google's Agent to Agent (A2A) protocol really shines. In the official documentation, a main agent communicates with a remote agent that has access to various tools and advertises its capabilities. This model allows for robust agent interactions without relying on persistent memory or a shared cache of the conversation history.

I particularly appreciated how elegantly the A2A documentation describes the architecture. For example, here's a simplified snippet from server.py that allows an agent to interact with a queue-based task system:

async def _process_request(self, request: Request):
    try:
        body = await request.json()
        json_rpc_request = A2ARequest.validate_python(body)

        if isinstance(json_rpc_request, GetTaskRequest):
            result = await self.task_manager.on_get_task(json_rpc_request)
        elif isinstance(json_rpc_request, SendTaskRequest):
            result = await self.task_manager.on_send_task(json_rpc_request)
        elif isinstance(json_rpc_request, SendTaskStreamingRequest):
            result = await self.task_manager.on_send_task_subscribe(
                json_rpc_request
            )
        elif isinstance(json_rpc_request, CancelTaskRequest):
            result = await self.task_manager.on_cancel_task(json_rpc_request)
        elif isinstance(json_rpc_request, SetTaskPushNotificationRequest):
            result = await self.task_manager.on_set_task_push_notification(
                json_rpc_request
            )
        elif isinstance(json_rpc_request, GetTaskPushNotificationRequest):
            result = await self.task_manager.on_get_task_push_notification(
                json_rpc_request
            )
        elif isinstance(json_rpc_request, TaskResubscriptionRequest):
            result = await self.task_manager.on_resubscribe_to_task(
                json_rpc_request
            )
        else:
            logger.warning(f"Unexpected request type: {type(json_rpc_request)}")
            raise ValueError(f"Unexpected request type: {type(json_rpc_request)}")

        return self._create_response(result)

    except Exception as e:
        return self._handle_exception(e)

This basic interface is deceptively powerful. It enables any agent with network access and permissions to drop a task into the queue, and then later fetch results or monitor progress without synchronous dependencies.
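To make the lifecycle behind those `on_send_task` / `on_get_task` hooks concrete, here's a toy in-memory task manager, my own simplified model of the pattern, not the real A2A SDK. A caller submits work, gets a task ID back immediately, and polls for the state later:

```python
import asyncio
import uuid

# Toy model of an async task lifecycle (submit -> work in background -> poll).
# This is a simplified illustration, not the actual A2A implementation.
class InMemoryTaskManager:
    def __init__(self) -> None:
        self.tasks: dict[str, dict] = {}

    async def on_send_task(self, payload: str) -> str:
        """Accept a task, start it in the background, return an ID immediately."""
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = {"state": "working", "result": None}
        asyncio.create_task(self._run(task_id, payload))
        return task_id  # the caller is free to do other things now

    async def _run(self, task_id: str, payload: str) -> None:
        await asyncio.sleep(0.01)  # stand-in for the long-running job
        self.tasks[task_id] = {"state": "completed", "result": payload.upper()}

    async def on_get_task(self, task_id: str) -> dict:
        """Fetch the current state and (if finished) the result."""
        return self.tasks[task_id]

async def main() -> None:
    manager = InMemoryTaskManager()
    task_id = await manager.on_send_task("summarize this pdf")
    print((await manager.on_get_task(task_id))["state"])  # "working"
    await asyncio.sleep(0.05)
    print((await manager.on_get_task(task_id))["state"])  # "completed"

asyncio.run(main())
```

A2A layers streaming updates and push notifications on top of this same submit-then-poll core, which is what makes it viable for agents that can't hold a connection open while a job runs.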

Why Async Patterns Will Define the Future of AI Systems

Asynchronous operations are not just an optimization; they are foundational for scalable, intelligent agent-based systems. Protocols like A2A show us the path forward for building cooperative AI systems that delegate, distribute, and deliver.

Terms like RAG (Retrieval Augmented Generation) may seem domain-specific, but they're quickly becoming standard for handling context-aware prompts in large language models. Similarly, the Model Context Protocol (MCP) offers a pragmatic way to think about agent interoperability. A2A shares that same spirit, but with more architectural guidance and production-readiness.

Final Thoughts

MCP caught on because it packaged a genuinely new idea in a form that felt familiar, and A2A does the same. It's simple to understand and comply with, and it scales elegantly. I predict it will become a standard building block for modern AI applications.

If you're building agent-based applications or looking to scale AI workflows, adopting async-first design thinking and exploring A2A could be the most impactful move you make this year.