How zenflow turns a YAML agent workflow into a running plan. An LLM coordinator routes events between agents through hub-and-spoke mailboxes; the executor schedules a DAG of steps; goai handles the tool loop. Every box, arrow, and label below is a real component you can grep for in the source.
edges
function call
tool loop · llm call
message delivery (mailbox)
data / setup
Hub-and-spoke. Solid green = function calls inside the process. Dashed cyan = messages routed through the per-agent mailbox (animated to convey flow direction). Solid amber = the LLM tool loop running through the goai SDK. Peer agents never address each other directly -- every cross-agent message goes through the coordinator's forward_to_agent tool.
Three roles you keep meeting
Orchestrator · Coordinator · Executor
the owner
Orchestrator
The public Go type. One instance per process. Holds your model, progress sinks, storage, the coordinator runner, and the orchestrator-wide options. It doesn't run agents itself -- it constructs an Executor on each Run* call and exposes the entry points.
lifetime
long-lived (per process)
api
zenflow.New(opts...)
entry
RunFlow · RunGoal · RunAgent
source
zenflow.go
the hub
Coordinator
An LLM-backed AgentRunner the orchestrator hosts. Receives every cross-agent message in its own mailbox, decides where each goes (forward_to_agent), narrates progress (narrate), and may finalize the run (finalize). The hub of hub-and-spoke.
the engine
Executor
A per-run internal struct. Walks the workflow's DAG topologically, spawns one AgentRunner per step, owns the MessageRouter / Mailbox / delivery-engine lifecycle, and assembles the WorkflowResult. Lives only for the duration of one Run* call.
lifetime
per Run* call (transient)
api
internal -- not user-constructed
owns
MessageRouter · Mailbox · deliveryEngine
source
executor.go
Three modes, one engine
How a workflow starts
flow
Run a YAML DAG
Run a fully-declared .yaml workflow end to end. The agents, dependencies, and conditions are all explicit before the first LLM call -- diffable, reviewable in a PR.
cli
zenflow flow workflow.yaml
library
orch.RunFlow(ctx, wf)
good for
known plans, repeated runs
goal
Decompose a goal
Hand a single sentence to the coordinator and let it generate a workflow on the fly. The decomposition is auditable -- you can --plan it before approving.
cli
zenflow goal "ship the launch"
library
orch.RunGoal(ctx, goal)
good for
exploratory, ad-hoc work
agent
Single agent chat
One agent, one task, one tool loop. Zero coordinator. Useful for embedded chat surfaces and quick one-shots; it is also the foundation the other two modes are built on.
cli
zenflow agent "review the diff"
library
orch.RunAgent(ctx, cfg)
good for
chat, scripts, embedders
Hub-side primitives
Coordinator message tools
The coordinator wires three tools: forward_to_agent for routing, narrate for progress events, and finalize for terminal synthesis. Its agent-side counterpart, send_message, is documented in the next section.
tool
forward_to_agent
{ target_step_id, text, kind? }
Routes a message to a step's mailbox via MessageRouter.Send. Wrapper steps (loop · include) are rejected at the tool boundary so the coordinator can't address a container that has no agent.
tool
narrate
{ text }
Pushes an EventCoordinatorNarration onto the progress sink. Stdout sink renders with ≋; JSON sink emits a structured event for downstream observability.
tool
finalize
{ summary? }
Signals "the workflow is done, here's my synthesis." CLI treats it as advisory -- DAG completion is authoritative. Embedders may select on runner.Done() to honour it.
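The routing rule behind forward_to_agent — deliver to a per-step mailbox, but refuse wrapper steps at the tool boundary — can be sketched like this. All names here are illustrative, not zenflow's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

type step struct {
	kind    string // "agent", "loop", or "include"
	mailbox chan string
}

type router struct{ steps map[string]*step }

var errWrapperStep = errors.New("target is a wrapper step with no agent")

// forwardToAgent mirrors the coordinator tool: containers (loop/include)
// are rejected before anything touches a mailbox.
func (r *router) forwardToAgent(target, text string) error {
	s, ok := r.steps[target]
	if !ok {
		return fmt.Errorf("unknown step %q", target)
	}
	if s.kind == "loop" || s.kind == "include" {
		return errWrapperStep
	}
	s.mailbox <- text
	return nil
}

func main() {
	r := &router{steps: map[string]*step{
		"review": {kind: "agent", mailbox: make(chan string, 1)},
		"retry":  {kind: "loop"},
	}}
	fmt.Println(r.forwardToAgent("review", "please check the diff")) // <nil>
	fmt.Println(r.forwardToAgent("retry", "hi"))                     // rejected
}
```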
Auto-injected on the spoke
Agent-side tools
tool
send_message
{ text }
Auto-injected on every non-coordinator step runner that has a MessageRouter wired. Routes the message back to the coordinator's mailbox -- the spoke-to-hub leg of hub-and-spoke.
tool
submit_result
{ ...schema fields }
Auto-injected when the step's cfg.ResultSchema is non-nil. Returns the structured output that ends up in StepResult.Output; goai validates the payload against the schema before the runner accepts it.
tool
agent
{ name, prompt, ... }
Auto-injected when a child spawner is wired (AgentToolDef). Spawns a child AgentRunner for hierarchical decomposition -- the parent step waits on the child's result before continuing its own loop.
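The submit_result gate — validate the payload against the schema before the runner accepts it — reduces to a check like the following. This is a deliberate simplification (required-keys only) of goai's actual JSON-Schema validation; the names are illustrative.

```go
package main

import "fmt"

// resultSchema is a toy stand-in for a JSON Schema: just required keys.
type resultSchema struct{ required []string }

// submitResult accepts the payload only if every required field is present,
// mirroring the validate-before-accept contract described above.
func submitResult(schema resultSchema, payload map[string]any) (map[string]any, error) {
	for _, key := range schema.required {
		if _, ok := payload[key]; !ok {
			return nil, fmt.Errorf("schema violation: missing field %q", key)
		}
	}
	return payload, nil // becomes the step's structured output
}

func main() {
	schema := resultSchema{required: []string{"verdict", "notes"}}
	if _, err := submitResult(schema, map[string]any{"verdict": "pass"}); err != nil {
		fmt.Println(err) // rejected: "notes" is missing
	}
	out, _ := submitResult(schema, map[string]any{"verdict": "pass", "notes": "lgtm"})
	fmt.Println(out["verdict"]) // pass
}
```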
Delivery contract
Race-safe Mailbox + Wake
guarantee
Per-agent mailbox · per-step lock
Each step has its own InMemoryMailboxStore and an RWMutex the executor takes when transitioning lifecycle (start, idle, terminal). The delivery engine ticks every 500 ms and signals an idle agent's Wake channel only when its inbox grew, so we never enter goai.GenerateText (or goai.StreamText when streaming is enabled) on a still-empty queue.
store
MailboxStore (interface)
delivery
deliveryEngine.tick(500ms)
wake
chan struct{} (cap 1)
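The Wake channel contract — capacity 1, signalled only when the inbox grew — is the classic coalescing-notify pattern in Go. A self-contained sketch (the real types are MailboxStore and deliveryEngine; these are simplified stand-ins):

```go
package main

import "fmt"

type mailbox struct {
	inbox []string
	wake  chan struct{} // cap 1: any number of signals coalesce into one wake
}

func newMailbox() *mailbox {
	return &mailbox{wake: make(chan struct{}, 1)}
}

// deliver appends a message and signals the wake channel without blocking;
// if a wake is already pending, the extra signal is dropped (coalesced).
func (m *mailbox) deliver(msg string) {
	m.inbox = append(m.inbox, msg)
	select {
	case m.wake <- struct{}{}:
	default: // already signalled; the agent drains everything on wake
	}
}

// awaitWork blocks until the inbox is known to be non-empty, so the caller
// never starts an LLM turn on an empty queue.
func (m *mailbox) awaitWork() []string {
	<-m.wake
	msgs := m.inbox
	m.inbox = nil
	return msgs
}

func main() {
	m := newMailbox()
	m.deliver("a")
	m.deliver("b") // second signal coalesces; both messages drain together
	fmt.Println(m.awaitWork()) // [a b]
}
```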
contract
Zero silent drops · typed reasons
Every dropped message produces exactly one EventMessageDropped with a typed DropReason. The reasons cover every observable outcome: workflow-cancelled, target-terminal, unknown-step, mailbox-closed-by-finalize, max-wake-cycles, hold-timeout, mailbox-full, no-transcript, transcript-too-large, resume-shutdown, resolver-error. No message disappears without a verdict.
event
EventMessageDropped
reasons
11 typed drop categories
error
*DropError (errors.As)
limits
Hard caps
The engine enforces a small set of safety ceilings so a runaway DAG, an infinitely chatty coordinator, or a flooded mailbox can't blow past process bounds. Each cap is overrideable via the matching With* option where the use-case justifies it.
steps / workflow
MaxStepsPerWorkflow = 100
include depth
MaxIncludeDepth = 5
nesting depth
MaxNestingDepth = 20
mailbox size
DefaultMaxMailboxSize = 10000
coord wake cycles
coordDefaultMaxWakeCycles = 100
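"Overrideable via the matching With* option" is the standard functional-options-over-defaults pattern. A sketch using two of the documented caps — the option and constructor names here are illustrative, not zenflow's actual surface:

```go
package main

import "fmt"

// limits holds the safety ceilings, seeded with the documented defaults.
type limits struct {
	maxStepsPerWorkflow int
	maxMailboxSize      int
}

type Option func(*limits)

// WithMaxMailboxSize overrides one cap; the rest keep their defaults.
func WithMaxMailboxSize(n int) Option {
	return func(l *limits) { l.maxMailboxSize = n }
}

func newLimits(opts ...Option) limits {
	l := limits{maxStepsPerWorkflow: 100, maxMailboxSize: 10000}
	for _, opt := range opts {
		opt(&l)
	}
	return l
}

func main() {
	l := newLimits(WithMaxMailboxSize(500))
	fmt.Println(l.maxStepsPerWorkflow, l.maxMailboxSize) // 100 500
}
```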
Lifecycle & recovery
From plan to result
executor
DAG scheduler · loops · includes
Kahn-style topological scheduling over the workflow's steps, respecting dependsOn, condition (CEL), and maxConcurrency. Loops resolve their inner DAGs lazily; includes import sub-workflows; forEach spawns one inner DAG per item, capped at the loop's maxConcurrency.
schedule
topoSort(steps)
parallel
sem ← chan struct{}
loop
repeatUntil · forEach
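Kahn-style scheduling over dependsOn boils down to: seed the ready set with zero-indegree steps, emit, decrement children, repeat. A self-contained sketch (the real topoSort also threads CEL conditions and a buffered-channel semaphore for maxConcurrency, omitted here):

```go
package main

import "fmt"

// topoSort is a Kahn-style ordering over steps keyed by id, where deps maps
// each step to the ids it dependsOn. Returns nil if a cycle is detected.
func topoSort(deps map[string][]string) []string {
	indegree := map[string]int{}
	children := map[string][]string{}
	for id, ds := range deps {
		indegree[id] += 0 // ensure every step appears, even with no deps
		for _, d := range ds {
			indegree[id]++
			children[d] = append(children[d], id)
		}
	}
	var ready, order []string
	for id, n := range indegree {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	for len(ready) > 0 {
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		for _, c := range children[id] {
			if indegree[c]--; indegree[c] == 0 {
				ready = append(ready, c)
			}
		}
	}
	if len(order) != len(indegree) {
		return nil // cycle: some steps never reached indegree zero
	}
	return order
}

func main() {
	fmt.Println(topoSort(map[string][]string{
		"plan":   nil,
		"build":  {"plan"},
		"test":   {"build"},
		"deploy": {"build", "test"},
	})) // [plan build test deploy]
}
```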
output
Result · transcript · OTel
On completion the executor returns a WorkflowResult with per-step status, token counts, and content. A TranscriptStore persists every LLM turn for resume-from-checkpoint. The optional OTel adapter exports each step as a span tree so latency, errors, and token usage drop straight into your existing observability stack.