Errors
zenflow surfaces errors at three layers: typed sentinel error values, DropReason codes for messaging failures, and CLI exit codes for the zenflow binary.
Sentinel errors
Use errors.Is(err, X) to classify; errors.As(err, &X) to extract a typed wrapper.
Orchestrator lifecycle
zenflow.ErrOrchestratorClosed-RunAgent/RunAgentAsyncwas called on an orchestrator afterClose()ran. The orchestrator's background goroutines are gone and its handle registry is drained; new calls would leak resources with no lifecycle to attach to. Fix: construct a freshOrchestrator, or remove the stale entry from your factory cache so the next call rebuilds it.zenflow.ErrModelRequired-RunFlow/RunGoal/RunAgentwas called on an orchestrator that has noprovider.LanguageModelconfigured. Fix: passWithModel(...)at orchestrator construction, or setcfg.Modelper call.zenflow.ErrStorageRequired-ResumeFlowwas called on an orchestrator that has noStorageconfigured. Fix: passWithStorage(...).zenflow.ErrWorkflowNil-RunFlow/ResumeFlowwas passed a nil*Workflow. Fix: load the workflow viaLoadWorkflowor construct one explicitly before calling.zenflow.ErrPlanDenied-RunGoal's LLM-decomposed plan was rejected by the configuredApprovalHandler(the handler returnedfalse). Distinct fromErrApprovalTimeout(the handler ran but exceeded its window). The flow is aborted cleanly with no executor side effects.zenflow.ErrApprovalTimeout- the configuredApprovalHandler.ApprovePlandid not return withinWithApprovalTimeout's window. The flow is aborted; the handler's late return, if any, is ignored.zenflow.ErrNilAgentHandle/zenflow.ErrNilOrchestrator- returned by methods called on a nil receiver (defensive guards for callers that race a Close with concurrent use).zenflow.ErrResumeNoModel-Executor.ResumeStepcould not construct an agent runner because neither the saved transcript nor the executor had a model resolver. Fix: installWithModelResolveror ensure the executor's default runner model matches the transcript.
Async agent handles
AgentError wraps a sentinel with optional human-readable text. errors.Is(AgentError{Sentinel: X}, X) returns true.
zenflow.ErrAgentHandleTimeout- the async handle exceeded its TTL (default 30 minutes;DefaultAgentHandleTTL). The agent goroutine is cancelled via its context; its later arrival, if any, is discarded. Override the TTL. SDK consumers: callzenflow.WithAgentHandleTTL(d). CLI users: setZENFLOW_AGENT_HANDLE_TTLenv var (the CLI maps it to the option; the library never reads env vars directly).zenflow.ErrAgentCancelled- the handle was cancelled viaAgentHandle.Cancel()before the agent completed. SubsequentDone()reads see this sentinel.zenflow.ErrAgentPanicked- the agent goroutine recovered a panic. The recovered value is inAgentError.Msg. Inspect it before retrying - panics usually indicate a real bug, not a transient failure.
Storage cleanup
In zenflow, "storage cleanup" means evicting per-run state from the configured Storage backend after the caller has finished consuming the WorkflowResult. The in-memory *MemoryStorage accumulates metadata, per-step results, and shared-memory entries for every run it has seen; without explicit eviction, long-lived embedders (HTTP servers, queue workers) leak that state across runs. *MemoryStorage exposes DeleteRun(runID string) (no ctx, no error return) to evict a completed run's metadata, step results, and shared memory from the in-process map. Call it after consuming WorkflowResult in long-lived embedders to bound memory growth. Idempotent: no-op when runID is unknown.
Transcript store
zenflow.ErrNoTranscript-Loadfound no matching transcript. Returned asDropReasonNoTranscriptwhen a Send to a closed step has no history to resume from. Typically observed for steps that ran before the transcript store was wired up, or whose transcript was explicitly deleted.zenflow.ErrTranscriptTooLarge- an Append would exceed the configured cap (WithMaxTranscriptMessages/WithMaxTranscriptBytes). The messages are NOT appended; the store's slot is sealed. Routed toDropReasonTranscriptTooLargeon subsequent resume attempts. To preserve operability (at the cost of potentially-incomplete history), callWithTruncationOnCapReached()(paired withWithoutTruncationOnCapReached()to disable) and use a store that implementsTranscriptTruncatedLoader.
Resume path
zenflow.ErrResumeShutdown- the workflow's context was cancelled mid-resume; the in-flight resume goroutine exited early. Routed toDropReasonResumeShutdown. Typically observed when a long-running resume races against a workflow timeout or explicit cancel.zenflow.ErrModelResolverMissing- a saved transcript references a model identifier that does not match the executor's default runner model and noModelResolverwas configured to resolve it. Without a resolver, the resume fails loudly rather than silently falling back to the wrong model. Fix: installWithModelResolverthat maps the saved model string to the rightprovider.LanguageModel.zenflow.ErrModelResolverError- aModelResolverwas configured but returned an error (or returnednilmodel with no error). Routed toDropReasonResolverError. Distinct fromErrModelResolverMissingso operators can tell "no resolver installed" from "resolver ran and failed".zenflow.ErrMailboxFullOnResume- a queued resume attempt was rejected because the active resume's mailbox was already at its configured cap. Routed toDropReasonMailboxFull.
Coordinator tool arguments
The four built-in coordinator tools (forward_to_agent, send_message, narrate, finalize) validate their inputs before doing anything. Argument failures bubble back to the LLM as a tool error, so the model can self-correct on the next step.
zenflow.ErrForwardTargetRequired- the coordinator calledforward_to_agentwithout atarget_step_id(or with an empty string). When you'll see it: a coord prompt regression where the model emits{"text": "..."}and forgets the target. Surfaces as a tool error in the next coord turn; the model usually retries with the right arg. Repeat offenses suggest the system prompt needs a stronger reminder.zenflow.ErrSendMessageEmpty- the coordinator calledsend_message(hub broadcast) with emptytext. When you'll see it: prompt drift where the model invokes the tool with no payload. The tool refuses; the next coord turn includes the error and the model can retry.zenflow.ErrNarrateEmpty- the coordinator callednarratewith emptytext. When you'll see it: same shape asErrSendMessageEmpty- the model invoked the narration tool without supplying narration text. Refused; next turn carries the error.
Agent runner
AgentRunner enforces the agent contract (turn cap, terminal submit_result, handle identity). All sentinels here are wrapped by AgentError when surfaced through AgentHandle.Done().
zenflow.ErrAgentToolDirectInvocation- an agent-as-tool definition was invoked directly through the goai tool loop instead of being intercepted by the spawner hook. When you'll see it: a misconfigured orchestrator whereOnBeforeToolExecutedid not register the agent spawner, so the call fell through to the defaultExecutebody. Fix: ensure the orchestrator's spawner hook is wired beforeRunFlow/RunAgent; never callagentTool.Executefrom custom code.zenflow.ErrAgentTurnLimitExceeded- the agent ran the configuredMaxTurnsLLM calls without ever callingsubmit_result. When you'll see it: a runaway agent that keeps thinking aloud or invoking tools instead of terminating. Fix: raise the turn cap viaWithMaxTurns(n)if the task genuinely needs more steps, otherwise tighten the agent prompt to push towardsubmit_resultsooner.zenflow.ErrAgentNoSubmitResult- the agent finished its turn budget (or returnedfinishReason=stop) but never calledsubmit_result, despite aresultSchemabeing configured. When you'll see it: the agent emitted a final assistant message in plain text and stopped. The last assistant text is appended to the error message for diagnostic context. Fix: strengthen the system prompt's "you MUST call submit_result" clause, or supply a fallback handler.zenflow.ErrInvalidAgentHandleID-NewAgentHandlewas called with an empty ID. When you'll see it: a custom embedder constructing handles by hand and forgetting to populate the ID; never seen via the standardRunAgentAsyncpath (which generates IDs). Fix: pass a non-empty stable ID (UUID or step-derived string) toNewAgentHandle.
Orchestrator wiring
These guards fire when a consumer of the orchestrator API supplied a missing or malformed argument. They are returned synchronously from the calling method; no goroutines are spawned and no events are emitted.
zenflow.ErrRunnerNil-Executor.Runwas invoked on anExecutorwhoseRunnerfield isnil. When you'll see it: custom embedders that build anExecutordirectly (instead of going throughOrchestrator) and forgot to set the runner. Fix: preferOrchestrator, which wires the runner internally; if you must useExecutordirectly, setexec.Runner = ...before callingRun.zenflow.ErrEmptyGoal-RunGoalwas called with an empty (or whitespace-only) goal string. When you'll see it: a CLI consumer that forwards user input without trimming, or a programmatic caller that builds the goal from a template that resolved to an empty string. Fix: validate the goal is non-empty before callingRunGoal.zenflow.ErrRunNotFound-Storage.LoadRunwas called with a run ID the configured backend has never seen. Wrapped with the run ID for context. When you'll see it: aResumeFlowagainst a run that was never persisted, or a stale run ID from a previous storage backend. Fix: verify the run ID exists viaStorage.ListRunsbefore resuming.zenflow.ErrStepNotFound-Storage.LoadStepResultwas asked for a step that has no persisted result yet (the step never completed, or never ran). Wrapped withrunID/stepID. When you'll see it: custom resume logic that probes step results before checking the run's step graph. Fix: consultWorkflowResult.Stepsfor the canonical list of completed steps before loading by ID.zenflow.ErrIncludePathEscape- a step'sinclude:reference resolved to a path outside the workflow's base directory (via..traversal or a leading/). Wrapped with the offending step ID and ref. When you'll see it: a workflow attempting to pull a system file (e.g./etc/passwd) or escape into a sibling directory. Fix: keepinclude:paths relative and inside the workflow root; if cross-directory inclusion is genuinely needed, set a differentBaseDir.zenflow.ErrIncludeDepthExceeded- aninclude:chain exceededMaxIncludeDepth(5). Wrapped with step ID and ref. When you'll see it: an accidental cycle (A includes B includes A) or a deeply nested fragment library. Fix: flatten the include graph; the cap protects against infinite recursion and is not configurable on purpose.zenflow.ErrRefPathEscape- an@-prefixed file ref (in step input or agent prompt) resolved to a path outside the workflow's base directory. Wrapped with the offending path. When you'll see it: same shape asErrIncludePathEscape, but for inline@./file.txtrefs rather thaninclude:blocks. Fix: keep ref paths relative and inside the workflow root.zenflow.ErrNilFactoryInner-NewFactoryCache(nil)was called. When you'll see it: custom plumbing that wraps an inner factory but forgot to construct it first. Fix: pass a real factory; the cache is a memoizer, it has no useful behavior wrappingnil.
Permission policy
DecidePermission and the orchestrator's permission gate return these sentinels (wrapped) when a tool call is blocked. The CLI's interactive prompt does not fire these - they're for declarative --allow / --deny / --strict decisions.
zenflow.ErrToolDenied- a tool name matched the policy'sDenylist (typically from--deny). When you'll see it: an agent attempting to run a tool the operator explicitly forbade. The tool call fails immediately and the LLM sees the error in the next turn. Fix: if the deny is intentional, no action - the agent will route around it. If the deny was a misconfiguration, remove the offending entry from--deny.zenflow.ErrToolNotAllowed- strict mode is on (--strict) and the requested tool is not on theAllowlist. When you'll see it: an agent calling a tool the operator never explicitly allowed under strict mode. Distinct fromErrToolDeniedso operators can route alerts separately ("explicit deny" vs "not in allowlist"). Fix: add the tool to--allowif it should be permitted, otherwise let the agent fail and re-plan.
MessageRouter
The router exposes one bare sentinel for back-pressure conditions; most router failures are surfaced through DropEvent / DropError instead (see ## DropReason).
zenflow.ErrMailboxFull- aSendwas rejected because the target's bounded mailbox is at itsMaxMailboxSizecap (oldest-wins fairness). When you'll see it: production deployments that set a positive cap and have a producer outpacing the consumer. Surfaced both as this sentinel (for direct callers) and asDropReasonMailboxFull(for observers viaWithDropCallback/EventMessageDropped). Fix: raiseWithMaxMailboxSize, slow the producer, or speed up the consumer.
DropReason
Every router message that fails to reach its target's mailbox produces exactly one DropEvent with a typed DropReason. There are no silent drops. Subscribers receive both EventMessageDropped (via ProgressSink) and the optional WithDropCallback invocation.
DropReason.String() returns the canonical wire-format value used in Event.Data["reason"]. The values are stable and safe to match on in dashboards / alert rules.
| Constant | Wire string | Cause | Mitigation |
|---|---|---|---|
DropReasonUnspecified | "unspecified" | Zero value; never emitted in practice | None - if you see this, file a bug. |
DropReasonWorkflowCancelled | "workflow-cancelled" | Workflow context cancelled (or the abort strategy fired) before the message could be delivered | Expected during clean cancellation. Investigate only if seen on a successful run. |
DropReasonTargetTerminal | "target-terminal" | Send to a step whose mailbox was closed (the step reached terminal lifecycle state) | Common in flows where coord narrates after step completion. Avoid by routing late messages to coord instead, or by enabling resume so terminated steps can re-engage. |
DropReasonUnknownStep | "unknown-step" | Send to a stepID that was never registered AND has no pending senders | Typo in step ID, message routed to a step that doesn't exist in the DAG, or send to an external identity (e.g., "coordinator") that wasn't pre-registered via WithExternalInbox. |
DropReasonMailboxClosedByFinalize | "mailbox-closed-by-finalize" | Mailbox raced with a concurrent close; the closed flag won | Race during step termination. Usually benign - if observed frequently, check for senders that don't open a sender slot via MessageRouter.OpenSender. |
DropReasonMaxWakeCycles | "max-wake-cycles" | The wake-loop hit the MaxWakeCycles cap with messages still pending; remaining messages drained as drops | Raise WithMaxWakeCycles (default 10; coord default 100). If raising doesn't help, investigate the producer for a hot loop. |
DropReasonHoldTimeout | "hold-timeout" | The hold-timeout fired before the agent's three-invariant termination rule could converge; remaining mailbox messages drained as drops | Raise WithHoldTimeout (default 30 seconds) for chat-style workflows; lower it for batch pipelines where idle gaps signal stuck steps. |
DropReasonMailboxFull | "mailbox-full" | The bounded mailbox is at the MaxMailboxSize cap; the newest message is rejected (oldest-wins fairness) | Raise WithMaxMailboxSize, slow the producer, or speed up the consumer. Default is unbounded; production deployments should set a positive cap. |
DropReasonNoTranscript | "no-transcript" | Target mailbox was closed AND the executor's TranscriptStore has no saved transcript for the step | Step ran before transcript wiring, or the transcript was explicitly deleted. Resume cannot proceed. |
DropReasonTranscriptTooLarge | "transcript-too-large" | The saved transcript exceeds the configured cap (WithMaxTranscriptMessages / WithMaxTranscriptBytes); resume would exceed the size bound | Raise the caps, prune transcript history, or enable WithTruncationOnCapReached() (paired with WithoutTruncationOnCapReached() to disable) to fall back to a truncated tail. |
DropReasonResumeShutdown | "resume-shutdown" | The workflow context was cancelled mid-resume; the in-flight resume goroutine exited early | Expected during clean shutdown. The original sender sees the drop and can retry on the next workflow run. |
DropReasonResolverError | "resolver-error" | A configured ModelResolver was consulted for a saved-transcript model identifier and returned an error | Inspect resolver logs - likely a misconfigured provider, missing API key, or transient infra failure. Distinct from the catch-all target-terminal so operators can route resolver alerts separately. |
DropError
MessageRouter.Send returns *zenflow.DropError on every drop. The error's Error() method returns the canonical "dropped: <reason>" string (matching DropReason.String()) so existing substring-matching consumers - like LLM tool results that pass err.Error() through verbatim - continue to work without modification.
For routing-decision callers that need to act on a specific reason, extract the typed value via errors.As:
if err := router.Send(stepID, msg); err != nil {
var de *zenflow.DropError
if errors.As(err, &de) {
switch de.Reason {
case zenflow.DropReasonUnknownStep:
// append "valid step IDs: …" hint, retry with corrected target
case zenflow.DropReasonMailboxFull:
// back off + retry; the consumer is behind
case zenflow.DropReasonTargetTerminal:
// step has finished; route the message somewhere else
default:
// log and surface the canonical text via err.Error()
}
}
return err
}The struct definition is intentionally minimal:
type DropError struct {
Reason DropReason
}DropError.Reason is the typed enum, not a string - immune to format-string drift if DropReason.String() is ever revised. The same DropReason is also surfaced through the WithDropCallback(fn) observer (typed DropEvent.Reason) and through EventMessageDropped events (canonical wire string in Event.Data["reason"]).
Detecting cancelled vs failed
When iterating WorkflowResult.Steps:
StepStatus == StepCompleted- the step succeeded.StepStatus == StepFailed- the step errored.StepResult.Erroris non-nil; checkerrors.Is(sr.Error, ctx.Err())to distinguish workflow-level cancellation from step-level errors.StepStatus == StepCancelled- the step was cancelled because the workflow was aborted (under thecascadestrategy) or its context was cancelled before completion.StepStatus == StepSkipped- the step was skipped because a dependency failed (under theskip-dependentsstrategy) or itsConditionevaluated false.
For workflow-level state:
result, err := orch.RunFlow(ctx, wf)
switch {
case errors.Is(err, context.Canceled):
// Workflow was cancelled by the caller.
case errors.Is(err, context.DeadlineExceeded):
// Workflow ran past its context deadline.
case err != nil:
// Configuration or storage error; result may be nil.
case result.Status == zenflow.StatusCompleted:
// All steps succeeded.
case result.Status == zenflow.StatusFailed:
// No steps completed.
case result.Status == zenflow.StatusPartial:
// Mixed: walk result.Steps for the per-step breakdown.
}Error wrapping conventions
zenflow follows standard Go error-wrapping conventions:
- Public methods wrap underlying errors via
fmt.Errorf("...: %w", err)soerrors.Isanderrors.Astraverse the chain. - Sentinel values (e.g.,
ErrOrchestratorClosed) are returned directly without wrapping when no further context is meaningful. - Typed wrappers (
AgentError,JSONParseError,CoordinatorValidationError) embed a sentinel viaUnwrap()soerrors.Is(err, sentinel)works on a*CoordinatorValidationError.
Workflow validation errors
LoadWorkflow and ParseWorkflow return typed error structs when YAML parses but fails schema/topology validation. Callers should errors.As to extract structured fields (which step ID, which dependency, etc.) for tooling.
| Type | Fields | Meaning |
|---|---|---|
*ValidationError | Message string | Generic validation error (e.g., "step "x": retries must be non-negative"). Catch-all for rules that don't have their own type. |
*CycleError | Message string | Topological sort detected a cycle in dependsOn. Message names the offending edge. |
*MissingAgentError | Message string, Agent string, StepID string | Step references an agent name that's not declared in agents:. |
*DuplicateStepError | Message string, StepID string | Two steps share the same id. |
*MissingDepError | Message string, Dep string, StepID string | Step's dependsOn lists a step ID that doesn't exist. |
*NoStepsError | Message string | Workflow has zero steps. |
*MissingNameError | Message string | Workflow name: field is empty. |
*IncludeConflictError | Message string, StepID string, Field string | An include: reference collides with an inline step ID or another include. |
*LoopValidationError | Message string, StepID string | A loop: block violates one of the loop-specific rules (e.g., forEach mutually exclusive with until, invalid outputMode). |
Coordinator validation errors
RunGoal calls into the coordinator LLM to decompose a goal into a workflow. Failures during that decomposition return typed wrappers around the underlying cause:
| Type | Fields | Meaning |
|---|---|---|
*JSONParseError | Err error (Unwrap) | The coordinator's response was not valid JSON. The wrapped error is the JSON decoder's message. |
*CoordinatorValidationError | Err error (Unwrap) | The coordinator returned valid JSON but the workflow it produced failed validate(). The wrapped error is one of the workflow validation errors above. |
*ToolNotFoundError | Tool string, Agent string | The coordinator referenced a tool name in an agents.<name>.tools: list that isn't registered with the orchestrator. |
RunGoal retries up to 2 times on JSONParseError and 1 time on CoordinatorValidationError before giving up.
Portability errors
LintPortability and SanitizeUnicode return typed errors so callers can decide whether to surface them as warnings or fatal:
| Type | Fields | Meaning |
|---|---|---|
*HostSpecificEnvError | Field string, Var string | Workflow string interpolates a host-specific env var ($USER, $HOSTNAME, $HOME, $PWD). The workflow won't load on another host. |
*UnicodeUnsafeError | Reason string, Rune rune | Workflow text contains a Unicode bidi-override codepoint (U+202A-E, U+2066-9). These can hide malicious content from human review. |
CLI exit codes
The zenflow CLI translates internal failures to stable exit codes (defined in cmd/zenflow/main.go):
| Code | Meaning | When it fires |
|---|---|---|
0 | Success | Workflow / goal / agent completed without errors; all steps reached StepCompleted. |
1 | Workflow failed | WorkflowResult.Status is StatusFailed or StatusPartial; one or more steps did not complete successfully. |
2 | Validation/coordinator error | Invalid YAML, schema rejection, JSONParseError, CoordinatorValidationError, ToolNotFoundError. |
3 | Usage error | Unknown flag, missing positional argument, --resume on goal/agent, --plan on goal/agent, mutually exclusive flags supplied together. The usage() line is printed to stderr. |
124 | Watchdog timeout | The --timeout value was exceeded. The watchdog grace period (default 30 seconds) gave clean shutdown a chance; on grace expiry the process exits via os.Exit(124). Orphan goroutines, if any, are killed with the process. |
continue-on-error style wrappers in CI should treat 0 as pass and any other code as fail. If you want to distinguish workflow failure (1) from setup failure (2/3), check the exit code explicitly.
124 and partial output
When the watchdog fires, any progress events emitted before timeout have already been written to stdout (and to file when --json redirected to one). The watchdog kills the process; it does not flush stdio buffers. For NDJSON consumers, this means the last line of output may be truncated mid-event. Defensive parsers should skip non-parseable trailing lines.
Retries
zenflow does not retry the CLI invocation itself - one CLI run = one execution. Retries within a workflow happen at the step level, configured via Step.Retries / Step.MaxRetries / WorkflowOptions.MaxRetries. The CLI exit code reflects the final status after retries are exhausted.
If your CI orchestrator retries failed jobs automatically, prefer narrow retries (transient API errors, timeouts) over broad ones (configuration errors). Exit code 3 rarely benefits from a retry; exit code 124 sometimes does (transient slowness); exit code 1 depends on the failed step.
Diagnostic patterns
A few patterns that come up often when debugging zenflow runs.
"I'm seeing message_dropped events but my workflow finished"
That's expected in two scenarios:
- Coordinator narrating after step completion. The coord LLM tries to push a narration to a step that already terminated. Reason:
target-terminal. Benign - the message had no business reaching a finished step, and the coord can keep narrating the next step. - Workflow cancellation flush. When a workflow context is cancelled, every still-pending message in every mailbox is flushed as a drop with reason
workflow-cancelled. This satisfies the "no silent drops" contract and lets observers see exactly what was lost.
If you see drops with reason unknown-step, mailbox-full, or max-wake-cycles, that's worth investigating - they signal misrouting, back-pressure, or a hot loop respectively.
"RunFlow returned successfully but result.Steps says some steps failed"
RunFlow only returns an error for orchestrator-level failures (nil workflow, missing model). Storage write failures are NOT propagated through this return value - they surface as EventError events and slog.Warn log entries. Subscribe to events to detect them. Per-step failures show up in the Steps map - the workflow completed, but with a StatusPartial or StatusFailed overall status.
Always inspect both the error return and result.Status before declaring success:
result, err := orch.RunFlow(ctx, wf)
if err != nil {
return fmt.Errorf("run flow: %w", err)
}
if result.Status != zenflow.StatusCompleted {
return fmt.Errorf("workflow finished with status %s", result.Status)
}"RunAgentAsync handle never delivers a result"
Three things to check:
- TTL expired. The handle was force-completed with
AgentError{Sentinel: ErrAgentHandleTimeout}. Read it fromDone()to confirm. - Orchestrator was Closed. New
RunAgentAsynccalls returnErrOrchestratorClosedsynchronously; existing handles are force-cancelled. If you're seeing handles that never resolve, you may be reading from a stale handle whose buffered result was already consumed. - Internal goroutine reading from
Done(). Only the external caller should read fromAgentHandle.Done(). Internal lifecycle code that needs to know "the handle is terminal" should use the unexportedfinishedchannel pattern (seeagent_handle.go); usingDone()would steal the buffered result.
"Resume keeps failing with no-transcript"
Resume requires both a TranscriptStore (default InMemoryTranscriptStore for intra-run, persistent store for cross-run) AND for the original step to have actually appended messages. Steps that error out before any LLM call still get metadata seeded (so the resume reconstructs the system prompt and model), but a step that crashes immediately may have no messages to load.
If you need cross-process resume (different CLI runs, different containers), wire a persistent TranscriptStore via WithTranscriptStore. The default in-memory store is per-orchestrator and lost on exit.