When to use fan-out

  • Getting opinions from multiple experts (legal + engineering + product).
  • Searching multiple sources in parallel (web + docs + database).
  • Running the same task with different model configs to compare quality.

The basic recipe

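import java.util.List;
import java.util.concurrent.CompletableFuture;

// Agent is an assumed name for your framework's agent type.
List<Agent> specialists = List.of(legalAgent, engAgent, productAgent);

// Start every specialist on the shared executor; nothing blocks here.
List<CompletableFuture<AgentResponse>> futures = specialists.stream()
    .map(a -> CompletableFuture.supplyAsync(
        () -> a.invoke(userMessage),
        executor))
    .toList();

// Wait for all branches (join() rethrows if any branch failed), then collect in order.
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
List<AgentResponse> results = futures.stream().map(CompletableFuture::join).toList();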

Timeouts per branch

Give each branch its own timeout so one slow specialist doesn't stall the whole fan-out.

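// On timeout or any other failure, this branch yields an error response instead of throwing.
CompletableFuture<AgentResponse> withTimeout = future
    .orTimeout(15, TimeUnit.SECONDS)   // orTimeout requires Java 9+
    .exceptionally(e -> AgentResponse.error(e));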
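To see the per-branch timeout in isolation, here is a minimal runnable sketch. It assumes plain String results instead of AgentResponse, and the names BranchTimeoutSketch, fanOut, and slow are hypothetical helpers made up for the illustration. Note that orTimeout only completes the future exceptionally; it does not stop the underlying task, so the sketch shuts the executor down afterwards.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class BranchTimeoutSketch {

    // Runs every branch in parallel. A branch that fails or exceeds timeoutMillis
    // yields the fallback value instead of failing the whole fan-out.
    static List<String> fanOut(List<Supplier<String>> branches, long timeoutMillis, String fallback) {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, branches.size()));
        try {
            List<CompletableFuture<String>> futures = branches.stream()
                .map(b -> CompletableFuture.supplyAsync(b, executor)
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                    .exceptionally(e -> fallback))
                .toList();
            // Every future now completes normally, so join() cannot throw here.
            return futures.stream().map(CompletableFuture::join).toList();
        } finally {
            // Interrupt branches that are still running after their timeout.
            executor.shutdownNow();
        }
    }

    // Hypothetical stand-in for a slow specialist call.
    static String slow(String value, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    public static void main(String[] args) {
        List<Supplier<String>> branches = List.of(
            () -> "legal: ok",
            () -> slow("eng: ok", 2_000),
            () -> "product: ok");
        System.out.println(fanOut(branches, 200, "timed out"));
    }
}
```

Attaching the timeout and fallback to each future, rather than to the allOf future, is what keeps the fast branches' results usable when one specialist is slow.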

Combining results

How you combine depends on your goal: majority vote, best-of-N (via a judge agent), or simple concatenation for a summary agent to synthesize.
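Two of these strategies are small enough to sketch directly. The code below assumes each branch's answer has already been reduced to a String; CombineSketch, majorityVote, and concatForSummary are hypothetical names, and the normalization (trim plus lower-casing) is only one possible way to decide that two answers match.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class CombineSketch {

    // Majority vote: returns the answer given by strictly more than half
    // of the branches, or empty if no answer reaches a majority.
    static Optional<String> majorityVote(List<String> answers) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String a : answers) {
            counts.merge(a.trim().toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts.entrySet().stream()
            .filter(e -> e.getValue() * 2 > answers.size())
            .map(Map.Entry::getKey)
            .findFirst();
    }

    // Simple concatenation: labels each branch's answer so a summary agent
    // can tell which specialist said what. Iteration order follows the map.
    static String concatForSummary(Map<String, String> answersByBranch) {
        return answersByBranch.entrySet().stream()
            .map(e -> "[" + e.getKey() + "] " + e.getValue())
            .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(majorityVote(List.of("Approve", "approve ", "reject")));
        Map<String, String> byBranch = new LinkedHashMap<>();
        byBranch.put("legal", "ok");
        byBranch.put("eng", "risky");
        System.out.println(concatForSummary(byBranch));
    }
}
```

Majority vote suits short, comparable answers such as labels or yes/no decisions. For free-form text, concatenation followed by a summary or judge agent is usually the more practical route.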

Cost warning

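Fan-out multiplies LLM cost by roughly N: each of the N branches makes its own model call, and a judge or summary agent adds one more. Set a budget cap per invocation and log actual token spend per branch.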
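One way to enforce such a cap is a small thread-safe counter shared by all branches of a single invocation. This is a sketch under assumptions: TokenBudget and its methods are hypothetical, and it presumes your agent framework reports a token count per call that you can pass to record.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class TokenBudget {
    private final long capTokens;
    private final AtomicLong spent = new AtomicLong();
    private final Map<String, Long> perBranch = new ConcurrentHashMap<>();

    TokenBudget(long capTokens) {
        this.capTokens = capTokens;
    }

    // Records one branch's token usage and logs it.
    // Returns false once the running total exceeds the cap.
    boolean record(String branch, long tokens) {
        perBranch.merge(branch, tokens, Long::sum);
        long total = spent.addAndGet(tokens);
        System.out.printf("branch=%s tokens=%d total=%d cap=%d%n", branch, tokens, total, capTokens);
        return total <= capTokens;
    }

    // Total tokens recorded so far across all branches.
    long total() {
        return spent.get();
    }

    // Snapshot of tokens recorded per branch.
    Map<String, Long> perBranch() {
        return Map.copyOf(perBranch);
    }

    public static void main(String[] args) {
        TokenBudget budget = new TokenBudget(1_000);
        budget.record("legal", 400);
        budget.record("eng", 500);
        boolean withinCap = budget.record("product", 200);
        System.out.println("within cap: " + withinCap);
    }
}
```

A caller could check the return value of record after each branch completes and skip the judge or summary step, or cancel remaining branches, once it turns false.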