
fan-out/fan-in for parallel agent tasks: where email fits
Fan-out/fan-in lets agents split work across parallel sub-agents and merge results. Here's how the pattern works, with email-specific examples.
Your agent has 200 unread emails sitting in its inbox. Each one needs classification, priority scoring, and routing to the right workflow. Processing them one message at a time takes about eleven minutes. Processing them in parallel, with twenty sub-agents each handling a batch of ten, takes under forty seconds.
The difference isn't a better model or a faster API. It's an orchestration pattern called fan-out/fan-in. Most guides cover it for generic research or data-processing tasks, but email workloads are one of the strongest natural fits for the pattern, and almost nobody writes about that angle.
What is fan-out/fan-in for parallel agent tasks?
Fan-out/fan-in is an orchestration pattern where a single coordinating agent spawns N independent sub-agents, each handling one slice of a larger task simultaneously (fan-out), then collects and merges their outputs once complete (fan-in). Instead of sequentially summarizing five emails, spawn five agents in parallel and merge their outputs in one aggregation step.
You'll encounter this pattern under other names in distributed systems: scatter-gather, fork-join, MapReduce. The core idea is identical. Break a big job into independent pieces, process the pieces at the same time, combine the results when they finish.
What makes it interesting for AI agents specifically is that each "piece" can be handled by a fully autonomous sub-agent with its own context window, tools, and decision-making. You're not just parallelizing a function call. You're parallelizing judgment.
How fan-out works
The orchestrator's job during fan-out is straightforward: split the workload into chunks and assign each chunk to a sub-agent.
The critical requirement is independence. If sub-agent B needs the output of sub-agent A to do its work, they can't run in parallel. Fan-out only applies when tasks don't depend on each other.
Good candidates for fan-out:
- Classifying a batch of inbound emails (each email is independent)
- Researching five competitors simultaneously
- Generating personalized replies to twenty customer inquiries
- Sending campaign emails to different audience segments
- Extracting structured data from invoices, receipts, or purchase orders in parallel
In Python, asyncio.TaskGroup (available in 3.11+) handles the mechanics cleanly:
```python
import asyncio

# Agent is assumed to be your own agent class with an async classify() method
async def classify_email(agent, email):
    return await agent.classify(email)

async def fan_out_classification(emails):
    tasks = []
    async with asyncio.TaskGroup() as tg:
        for email in emails:
            tasks.append(tg.create_task(classify_email(Agent(), email)))
    # the TaskGroup context exits only after every task has finished
    return [t.result() for t in tasks]
```
Tip: Prefer asyncio.TaskGroup over asyncio.gather() for agent fan-out. TaskGroup cancels the remaining tasks automatically when one fails, preventing orphaned agents from holding open connections or inboxes.
With asyncio.gather, a failed task can leave orphaned coroutines running in the background. That gets messy fast when agents hold resources like open inboxes or API connections. TaskGroup cleans up after itself.
One thing to keep in mind during fan-out is resource allocation. Each sub-agent you spawn consumes memory, holds open network connections, and occupies a slot in your LLM provider's rate limit budget. If you spawn 100 agents at once and your provider allows 60 requests per minute, you've just created a traffic jam. Batch your fan-out in waves that respect your API limits, or implement a semaphore to throttle concurrent agent launches.
```python
SEM = asyncio.Semaphore(20)  # max 20 concurrent agents

async def throttled_classify(agent, email):
    async with SEM:
        return await agent.classify(email)
```
How fan-in merges results
Fan-in is the aggregation step. All sub-agents have finished (or timed out), and the orchestrator combines their outputs into something usable.
Simple fan-in concatenates results into a list. More useful fan-in applies logic: deduplicating findings, ranking outputs by confidence score, filtering low-quality results, or resolving conflicts when two sub-agents disagree about the same input.
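As a concrete sketch of that merge logic, here's one way a fan-in step could deduplicate conflicting classifications, filter by confidence, and rank the survivors. The result shape (`id`, `label`, `confidence` dicts) is illustrative, not a fixed schema:

```python
def fan_in(results, min_confidence=0.5):
    """Merge sub-agent outputs: resolve duplicate ids by confidence,
    drop low-quality calls, and rank what remains."""
    best = {}
    for r in results:
        # if two agents classified the same email, keep the more confident call
        if r["id"] not in best or r["confidence"] > best[r["id"]]["confidence"]:
            best[r["id"]] = r
    kept = [r for r in best.values() if r["confidence"] >= min_confidence]
    return sorted(kept, key=lambda r: r["confidence"], reverse=True)

results = [
    {"id": "e1", "label": "urgent", "confidence": 0.9},
    {"id": "e1", "label": "newsletter", "confidence": 0.4},  # conflicting call on e1
    {"id": "e2", "label": "billing", "confidence": 0.7},
    {"id": "e3", "label": "spam", "confidence": 0.3},        # below threshold, dropped
]
merged = fan_in(results)
print([r["id"] for r in merged])  # → ['e1', 'e2']
```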
For email workloads, fan-in might look like building a priority-sorted inbox from classification results, combining delivery confirmations and bounce reports from parallel send agents, or aggregating extracted data from invoices that were processed by separate agents.
The fan-in step is also where you handle incomplete results. If you fanned out to ten agents and two timed out, do you block the entire pipeline or proceed with eight? For most email use cases, partial fan-in is the right call. Eight classified emails are more useful than zero while you wait on the slowest sub-agent. You can implement progressive fan-in with asyncio.as_completed(), which yields results as each sub-agent finishes rather than waiting for the entire batch.
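A minimal sketch of progressive fan-in with asyncio.as_completed, again with simulated sub-agents in place of real classifiers: results are consumed as they land, and a timeout lets the orchestrator proceed with whatever partial set it has.

```python
import asyncio
import random

async def classify(email_id):
    # simulated sub-agent with variable latency
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return email_id

async def progressive_fan_in(email_ids, timeout=1.0):
    tasks = [asyncio.create_task(classify(e)) for e in email_ids]
    ready = []
    for fut in asyncio.as_completed(tasks, timeout=timeout):
        try:
            ready.append(await fut)   # act on each result as it finishes
        except asyncio.TimeoutError:
            break                     # deadline hit: proceed with partial results
    return ready

done = asyncio.run(progressive_fan_in(["e1", "e2", "e3"]))
print(sorted(done))  # → ['e1', 'e2', 'e3']
```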
A well-designed fan-in step also records metadata about the aggregation itself. Track how many sub-agents succeeded, how many failed, and how long each one took. This telemetry helps you tune batch sizes and timeout thresholds over time, so your fan-out gets more efficient with every run.
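One lightweight way to capture that telemetry is to wrap every sub-agent coroutine in an instrumentation shim before fanning out. The `work` function below is a stand-in for a real agent call:

```python
import asyncio
import time

async def instrumented(agent_id, coro):
    # wrap each sub-agent so fan-in can record outcome and duration
    start = time.monotonic()
    try:
        result = await coro
        return {"agent": agent_id, "ok": True,
                "secs": time.monotonic() - start, "result": result}
    except Exception as exc:
        return {"agent": agent_id, "ok": False,
                "secs": time.monotonic() - start, "error": str(exc)}

async def work(x):
    # stand-in for a real sub-agent task; agent 2 fails on purpose
    if x == 2:
        raise ValueError("bad input")
    await asyncio.sleep(0.01)
    return x * 10

async def main():
    records = await asyncio.gather(*[instrumented(i, work(i)) for i in range(4)])
    succeeded = sum(r["ok"] for r in records)
    return succeeded, len(records) - succeeded

print(asyncio.run(main()))  # → (3, 1)
```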
Where email workloads fit the pattern
Most fan-out/fan-in tutorials use research tasks as their go-to example. But email is arguably a better fit, for a simple reason: emails are naturally independent. A message from your largest client doesn't change how you should classify a shipping notification. That independence is exactly what fan-out requires.
Consider inbound email triage. Your agent receives messages from customers, vendors, newsletters, and automated systems. Each email needs independent classification: is it urgent? Who handles it? Does it contain a prompt injection attempt? These decisions don't depend on each other, which makes triage a textbook fan-out workload. Spawn a classifier per batch, run them in parallel, fan in the results into a prioritized queue.
The same logic applies to outbound campaigns. If you're sending personalized emails to different audience cohorts, each segment is independent. Fan out your sending agents by segment, let them run simultaneously, fan in the delivery reports to see what landed. A campaign targeting five segments that would normally take fifteen minutes to send sequentially finishes in three minutes when each segment gets its own agent.
Provisioning works this way too. Onboarding ten users at once? Each sub-agent can provision its own inbox, send a welcome message, and report back to the coordinator. With agent-first email infrastructure like LobsterMail, the agent itself handles inbox creation, so there's no central bottleneck, no admin queue, no approval step slowing down the fan-out.
Another strong fit is multi-format extraction. Imagine your support inbox receives PDFs, images of receipts, and plain-text complaints. You can fan out to specialized sub-agents: one handles OCR on images, another parses PDFs, a third classifies plain-text messages. Each sub-agent uses different tools, but the fan-in step unifies them into a single structured feed your downstream workflow can consume.
If your agents need to coordinate on a single email where one writes a draft, another reviews it, and a third sends it, that's not fan-out. That's sequential coordination, and we covered the patterns for that in multi-agent email: when agents need to talk to each other.
Handling timeouts and failures
The happy path is easy. The hard part is what happens when sub-agents fail mid-execution.
Start with timeouts. Set an explicit timeout on every sub-agent, with a reasonable default of 2-3x the expected execution time for a single sub-task. If classification normally takes 2 seconds per email, a 5-second timeout is generous enough to handle occasional slowdowns without letting a hung agent block the pipeline forever.
```python
async def fan_out_with_timeout(emails, timeout=5.0):
    coros = [classify_email(Agent(), email) for email in emails]
    done = await asyncio.gather(
        *[asyncio.wait_for(c, timeout=timeout) for c in coros],
        return_exceptions=True,
    )
    # a TimeoutError appears in the results list like any other exception
    successful = [r for r in done if not isinstance(r, Exception)]
    failed = [r for r in done if isinstance(r, Exception)]
    return successful, failed
```
Duplicate work is another risk. If your task partitioning isn't clean, two agents might process the same email. Use deterministic assignment (explicit ID lists, or index-based partitioning) to guarantee each message maps to exactly one agent. Never let sub-agents pull from a shared queue without a locking mechanism.
Shared state causes the worst bugs. If parallel agents read and write to the same resource, you get race conditions. The fix is architectural: keep each agent stateless during fan-out. Every agent gets its own input slice, writes to its own output buffer. All merging happens exclusively at the fan-in step, after parallel execution completes.
For real-time awareness of when each sub-agent's send completes, event-driven delivery via webhooks lets your coordinator react immediately instead of polling for status updates.
When not to fan out
Fan-out adds coordination overhead: spawning agents, distributing work, collecting results, handling failures. For small workloads under ten items, sequential processing is often faster because you skip all of that overhead entirely.
Skip the pattern when tasks depend on each other's outputs, when the workload is small enough that sequential processing finishes in acceptable time, when you're hitting rate limits that make parallelism pointless (ten agents hitting the same API get throttled the same as one making ten calls), or when the cost of N parallel agents exceeds the value of faster completion.
There's also a debugging cost to consider. Parallel agent workflows are harder to trace than sequential ones. When something goes wrong, you're reading interleaved logs from twenty agents instead of a single linear execution path. Make sure your observability tooling can handle correlated traces across sub-agents before you commit to fan-out in production.
The practical starting point: build your workflow sequentially first. Measure where the bottleneck actually sits. If it's in processing independent items and you have enough of them to justify the coordination cost, fan-out/fan-in will give you a real speedup. For email workloads where you're classifying hundreds of inbound messages or sending campaigns across multiple segments, that crossover point arrives fast.
Frequently asked questions
What does 'fan-out' mean in the context of parallel agent tasks?
Fan-out is the moment an orchestrator agent distributes independent sub-tasks to multiple agents that run simultaneously. Each sub-agent works on its own slice of the problem without waiting for any other agent to finish first.
What does 'fan-in' mean and how does it merge outputs?
Fan-in is the aggregation step where the orchestrator collects results from all sub-agents and combines them into a single output. The merge logic can range from simple list concatenation to complex conflict resolution, depending on the workload.
How much faster is fan-out compared to sequential execution?
If each task takes 2 seconds and you have 50 of them, sequential processing takes 100 seconds. Fan-out with 10 agents processing 5 tasks each finishes in roughly 10 seconds plus coordination overhead. Speedup is close to linear when tasks are truly independent.
When should I use fan-out instead of a single agent with multiple tools?
Use fan-out when you have many independent tasks that don't share state. A single agent with tools works fine for sequential workflows, but it can't process ten emails at the same time the way ten parallel agents can.
What happens to the overall workflow if one sub-agent fails mid fan-out?
It depends on your error handling strategy. With asyncio.TaskGroup, the remaining tasks are cancelled automatically. With asyncio.gather(return_exceptions=True), the failed task's exception is returned alongside successful results, and your orchestrator decides whether to retry, skip, or halt.
How do I prevent duplicate work across parallel sub-agents?
Use deterministic assignment so every input maps to exactly one agent. Assign work by explicit ID lists or index-based partitioning rather than letting agents pull from a shared queue, which risks two agents grabbing the same item.
Can fan-out/fan-in handle inbound email classification?
Yes, and it's one of the best use cases. Each inbound email is naturally independent, so you can classify them in parallel batches and merge the results into a prioritized queue. The speedup scales nearly linearly with the number of sub-agents.
What is the scatter-gather pattern and how does it relate to fan-out/fan-in?
Scatter-gather is an older term from messaging systems and distributed databases that describes the same concept. "Scatter" corresponds to fan-out (distributing work) and "gather" corresponds to fan-in (collecting results). The terms are interchangeable.
How does fan-out/fan-in differ from a traditional task queue?
A task queue (like Celery or SQS) decouples producers and consumers through persistent storage and is built for durability. Fan-out/fan-in is an in-process orchestration pattern where the coordinator directly manages sub-agents and waits for their results, making it better for low-latency agent coordination.
Can a fan-in step aggregate partial results if some agents are still running?
Yes. Using asyncio.as_completed(), your orchestrator can process results as each sub-agent finishes rather than waiting for the entire batch. This is useful when acting on early results matters more than having the complete set.
How do parallel agents avoid race conditions?
Keep each sub-agent stateless during fan-out. Give every agent its own input slice and its own output buffer. All merging happens at the fan-in step after parallel execution completes, which eliminates contention entirely.
Should I use asyncio.gather or TaskGroup for agent fan-out?
asyncio.TaskGroup (Python 3.11+) is generally the better choice. It automatically cancels remaining tasks when one raises an exception, which prevents orphaned agents from holding open connections. asyncio.gather requires more manual cleanup but works on older Python versions.
How much context should each sub-agent receive during fan-out?
Give each sub-agent only the context it needs for its specific sub-task. Passing the full workload to every agent wastes tokens and adds latency. If an agent only needs to classify one email, send it that email and the classification schema, not the entire inbox.


