Plan and Step State Machines

The two interlocking finite state machines that govern research plan and step lifecycles, including every valid state, transition, and the derivePlanStatus function.

The plan engine uses two finite state machines -- one for the overall plan and one for each step. Both are implemented as pure functions in packages/mcp/src/plan-engine/plan-transitions.ts with no database access, no side effects, and full test coverage.

Plan State Machine

A research plan has six possible states:

State	Description	Terminal?
`planning`	Plan created, no steps started yet	No
`executing`	At least one step is `in_progress`	No
`awaiting_review`	Paused for user checkpoint	No
`stalled`	A step has been `in_progress` beyond the stall threshold	No
`completed`	All steps are in a terminal state (completed, skipped, or failed)	Yes
`failed`	Explicit failure via branching condition or user rejection	Yes

Valid Plan Transitions

planning       --> executing         (first get_next_step call)
planning       --> failed            (branching condition failure)

executing      --> awaiting_review   (request_user_review)
executing      --> stalled           (stall detected by get_plan_status)
executing      --> completed         (all steps terminal)
executing      --> failed            (branching condition failure)

awaiting_review --> executing        (submit_user_decision: approve/modify/skip)
awaiting_review --> failed           (submit_user_decision: reject)

stalled        --> executing         (get_next_step resumes the plan)
stalled        --> failed            (explicit failure)

completed      --> (none)            terminal state
failed         --> (none)            terminal state

The transition map is defined as a constant:

const VALID_PLAN_TRANSITIONS: Record<PlanStatus, PlanStatus[]> = {
  planning: ["executing", "failed"],
  executing: ["awaiting_review", "stalled", "completed", "failed"],
  awaiting_review: ["executing", "failed"],
  stalled: ["executing", "failed"],
  completed: [],
  failed: [],
};

Step State Machine

Each step in a plan has six possible states:

State	Description	Terminal?
`pending`	Not yet started	No
`in_progress`	Currently being executed by the AI platform	No
`awaiting_input`	Paused for user review	No
`completed`	Step finished successfully	Yes
`skipped`	Step was skipped (by branching or user decision)	Yes
`failed`	Step failed (but the plan continues)	Partially -- allows retry

Valid Step Transitions

pending        --> in_progress       (get_next_step starts the step)
pending        --> skipped           (branching skip_to skips intermediate steps)

in_progress    --> awaiting_input    (request_user_review)
in_progress    --> completed         (submit_step_result)
in_progress    --> failed            (modify_plan fail_step)

awaiting_input --> in_progress       (submit_user_decision: modify)
awaiting_input --> completed         (submit_user_decision: approve)
awaiting_input --> skipped           (submit_user_decision: skip)
awaiting_input --> failed            (submit_user_decision: reject)

failed         --> pending           (retry)

completed      --> (none)            terminal state
skipped        --> (none)            terminal state

The key non-obvious transition: failed --> pending. This is the retry path. A failed step can be reset to pending so that get_next_step picks it up again. The completed and skipped states are fully terminal with no outgoing transitions.

const VALID_STEP_TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  pending: ["in_progress", "skipped"],
  in_progress: ["awaiting_input", "completed", "failed"],
  awaiting_input: ["in_progress", "completed", "skipped", "failed"],
  completed: [],
  skipped: [],
  failed: ["pending"], // retry
};

Transition Functions

The plan engine exposes four functions for state transitions:

`canTransitionPlan(from, to): boolean`

Checks whether a transition is valid without throwing. Use this for conditional logic where an invalid transition is a normal code path (e.g., checking if a plan can be resumed).

`canTransitionStep(from, to): boolean`

Same pattern for step transitions.

`transitionPlan(from, to): PlanStatus`

Validates the transition and returns the new state. Throws AppError("INVALID_TRANSITION") if the transition is not in the valid map. Use this when the transition must succeed -- the throw signals a bug in the calling code.

`transitionStep(from, to): StepStatus`

Same pattern for step transitions. Both transitionPlan and transitionStep include the from and to states in the error metadata for debugging.

Derived Plan Status

The derivePlanStatus function computes what the plan's status should be based on the aggregate states of all its steps. This is used after step completion to determine the new plan status without hardcoding specific transitions in each tool handler.

function derivePlanStatus(stepStatuses: StepStatus[]): PlanStatus;

The rules are evaluated in strict order:

Empty steps -- returns planning (plan has no steps yet)
Any step awaiting_input -- returns awaiting_review (user needs to respond)
All steps terminal (completed, skipped, or failed) -- returns completed
Otherwise -- returns executing

The "failed steps complete the plan" rule

The most important design decision in derivePlanStatus: a failed step counts as terminal. If all steps are completed, skipped, or failed, the plan is completed, not failed. A plan with steps ["completed", "failed", "skipped"] derives to completed.

This means:

Individual step failures are informational, not plan-breaking
The plan continues past failed steps to the next pending step
Plans only fail through explicit branching fail actions or user reject decisions

This rule matches the tests:

it("continues executing when a step has failed", () => {
  expect(derivePlanStatus(["completed", "failed", "pending"])).toBe("executing");
});

it("returns completed if all steps are terminal", () => {
  expect(derivePlanStatus(["completed", "failed", "skipped"])).toBe("completed");
});

Rule priority: `awaiting_input` wins

If any step is awaiting_input, the plan is awaiting_review regardless of other step states. This is checked first, before the all-terminal check:

it("prioritizes awaiting_input even when a step has failed", () => {
  expect(derivePlanStatus(["failed", "awaiting_input"])).toBe("awaiting_review");
});

Invariants

These are documented in the source via @invariant TSDoc tags:

All transition functions are pure -- no side effects, no DB access
completed and failed are terminal plan states with no outgoing transitions
Failed steps do NOT fail the plan -- they are treated like skipped steps
derivePlanStatus evaluates rules in strict order: awaiting_input first, then all-terminal, then executing

Branching -- how branching conditions trigger skip_to and fail actions that interact with these state machines
Execution Loop -- how MCP tool handlers call these pure functions within database transactions
Data & Storage -- Schema Design -- the research_plan.status and plan_step.status columns that store these states

Plan and Step State Machines

Plan State Machine

Valid Plan Transitions

Step State Machine

Valid Step Transitions

Transition Functions

canTransitionPlan(from, to): boolean

canTransitionStep(from, to): boolean

transitionPlan(from, to): PlanStatus

transitionStep(from, to): StepStatus