When to use fan-out
- Getting opinions from multiple experts (legal + engineering + product).
- Searching multiple sources in parallel (web + docs + database).
- Running the same task with different model configs to compare quality.
The basic recipe
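// Agent stands for whatever type your specialists share; it must expose invoke(...).
List<Agent> specialists = List.of(legalAgent, engAgent, productAgent);
// Start every specialist on the shared executor so the calls run in parallel.
List<CompletableFuture<AgentResponse>> futures = specialists.stream()
    .map(a -> CompletableFuture.supplyAsync(
        () -> a.invoke(userMessage),
        executor))
    .toList();
// Wait for all branches, then collect the responses in the original order.
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
List<AgentResponse> results = futures.stream().map(CompletableFuture::join).toList();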
Timeouts per branch
Give each branch its own timeout so one slow specialist doesn't stall the whole fan-out.
CompletableFuture<AgentResponse> withTimeout = future
    .orTimeout(15, TimeUnit.SECONDS)
    .exceptionally(e -> AgentResponse.error(e));
Combining results
How you combine depends on your goal: majority vote, best-of-N (via a judge agent), or simple concatenation for a summary agent to synthesize.
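As a rough illustration, two of these strategies can be sketched as follows: majority vote and labeled concatenation. The sketch works on plain strings because the accessors of AgentResponse are not shown here. The class name CombineResults and both method names are hypothetical. Best-of-N is left out, since it needs a judge agent call.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class; operates on the text of each branch's answer.
class CombineResults {

    // Majority vote: return the answer that occurs most often.
    // Ties are broken arbitrarily; normalize answers (trim, lowercase) first if needed.
    static Optional<String> majorityVote(List<String> answers) {
        Map<String, Long> counts = answers.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    // Simple concatenation: label each branch so a summary agent can tell them apart.
    static String concatenate(List<String> labels, List<String> answers) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < answers.size(); i++) {
            sb.append("[").append(labels.get(i)).append("]\n")
              .append(answers.get(i)).append("\n\n");
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        List<String> votes = List.of("approve", "reject", "approve");
        System.out.println(majorityVote(votes).orElse("no answers")); // prints "approve"
        System.out.println(concatenate(List.of("legal", "eng", "product"), votes));
    }
}
```

Majority vote only makes sense when branches answer the same closed question (yes/no, a label, a category). For free-form expert opinions, the labeled concatenation is usually the better input for a summary agent.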
Cost warning
Fan-out multiplies LLM cost: N branches cost roughly N times a single call. Set a budget cap per invocation and log the actual token spend of each branch.
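One possible way to enforce such a cap is to reserve an estimated token cost before starting each branch, then replace the estimate with the actual usage when the branch returns. The sketch below assumes each branch can report its own token count. The names FanOutBudget, TokenBudget, Branch and BranchResult are hypothetical, and System.out stands in for a real logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

class FanOutBudget {

    // Hypothetical types: a branch result carrying its token usage, and a branch to run.
    record BranchResult(String name, String text, long tokens) {}
    record Branch(String name, long estimatedTokens, Supplier<BranchResult> call) {}

    // Tracks token spend for a single fan-out invocation against a fixed cap.
    static final class TokenBudget {
        private final long cap;
        private final AtomicLong spent = new AtomicLong();

        TokenBudget(long cap) { this.cap = cap; }

        // Reserve the estimated cost up front; returns false if it would exceed the cap.
        boolean tryReserve(long estimate) {
            long current;
            do {
                current = spent.get();
                if (current + estimate > cap) return false;
            } while (!spent.compareAndSet(current, current + estimate));
            return true;
        }

        // Swap the estimate for the actual usage once the branch has returned.
        void settle(long estimate, long actual) { spent.addAndGet(actual - estimate); }

        long spent() { return spent.get(); }
    }

    // Runs every branch that fits in the budget and logs actual spend per branch.
    static List<BranchResult> fanOut(List<Branch> branches, TokenBudget budget, ExecutorService executor) {
        List<CompletableFuture<BranchResult>> futures = new ArrayList<>();
        for (Branch b : branches) {
            if (!budget.tryReserve(b.estimatedTokens())) {
                System.out.println("skipped " + b.name() + ": budget cap reached");
                continue;
            }
            futures.add(CompletableFuture.supplyAsync(() -> {
                BranchResult r = b.call().get();
                budget.settle(b.estimatedTokens(), r.tokens());
                System.out.println("branch=" + r.name() + " tokens=" + r.tokens());
                return r;
            }, executor));
        }
        return futures.stream().map(CompletableFuture::join).toList();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            TokenBudget budget = new TokenBudget(2_000);
            List<Branch> branches = List.of(
                new Branch("legal", 900, () -> new BranchResult("legal", "ok", 850)),
                new Branch("eng", 900, () -> new BranchResult("eng", "ok", 700)),
                new Branch("product", 900, () -> new BranchResult("product", "ok", 800)));
            List<BranchResult> results = fanOut(branches, budget, executor);
            // The third branch does not fit under the cap, so two branches run.
            System.out.println(results.size() + " branches ran, " + budget.spent() + " tokens spent");
        } finally {
            executor.shutdown();
        }
    }
}
```

Reserving before the call keeps the cap from being overshot by branches that are already in flight. The sketch does not release a reservation when a branch throws; a production version would likely settle in a finally block.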