Execution Loop

The end-to-end flow from plan creation through step execution to completion, including human-in-the-loop review and cross-session resume.

Every interaction in this lifecycle is initiated by the AI platform (Claude Code, ChatGPT) calling MCP tools -- the server never pushes work to the client.

Phase 1: Plan Creation

The AI platform calls create_research_plan with a structured definition. The handler then performs these steps:

Insert plan -- a research_plan row with status planning, the research question, and optional formatting notes
Insert steps -- plan_step rows with sequential step_order (1-based), each starting in pending status
Insert branching conditions -- plan_branching_condition rows linked to specific steps by resolving afterStepOrder to step IDs
Auto-link skill execution -- if a sessionId is provided, the tool looks up the most recent started skill execution for that session and links it to the new plan by setting its planId
Audit log -- writes a plan_modified event with action created

The response includes the planId, step IDs, and the first step's details so the AI platform can immediately begin execution.
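The ordering and condition-linking steps above can be sketched in memory as follows. This is a minimal illustration, not the server's implementation: the interfaces, `buildSteps`, `resolveConditions`, and the `condition` field are hypothetical names; only `afterStepOrder`, the 1-based `step_order`, and the initial `pending` status come from the description above.

```typescript
// Hypothetical input shapes; the real tool schema may differ.
interface StepInput {
  stepType: string;
  instructions: string;
}

interface BranchingConditionInput {
  afterStepOrder: number; // 1-based order of the step the condition follows
  condition: string; // hypothetical payload field
}

interface InsertedStep {
  id: string;
  stepOrder: number;
  status: "pending";
}

// Assign sequential 1-based step orders; every new step starts as pending.
function buildSteps(steps: StepInput[], newId: () => string): InsertedStep[] {
  return steps.map((_, index) => ({
    id: newId(),
    stepOrder: index + 1,
    status: "pending" as const,
  }));
}

// Resolve each afterStepOrder to the ID of the step with that order.
function resolveConditions(
  conditions: BranchingConditionInput[],
  steps: InsertedStep[],
): { stepId: string; condition: string }[] {
  const idByOrder = new Map(steps.map((s) => [s.stepOrder, s.id] as const));
  return conditions.map((c) => {
    const stepId = idByOrder.get(c.afterStepOrder);
    if (stepId === undefined) {
      throw new Error(`No step with order ${c.afterStepOrder}`);
    }
    return { stepId, condition: c.condition };
  });
}
```

Rejecting an unknown `afterStepOrder` is a design choice of this sketch; the actual handler may validate differently.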

Step Types

Each step has a stepType that tells the AI platform what kind of work to do:

Type	Purpose
`search`	Find and evaluate sources
`extract`	Pull structured data from sources
`analyze`	Interpret data, identify patterns
`critique`	Evaluate quality, identify gaps
`synthesize`	Combine findings into conclusions
`checkpoint`	Pause for user review
`custom`	Anything not covered above

Step types are informational -- they guide the AI platform's behavior through the step instructions, not through different server-side logic.
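On the client side, the seven step types in the table can be modeled as a string union. The sketch below is illustrative only; `STEP_TYPES` and `isStepType` are hypothetical names and are not part of the documented API.

```typescript
// The seven step types listed in the table above.
const STEP_TYPES = [
  "search",
  "extract",
  "analyze",
  "critique",
  "synthesize",
  "checkpoint",
  "custom",
] as const;

type StepType = (typeof STEP_TYPES)[number];

// Narrow an arbitrary string to a known step type.
function isStepType(value: string): value is StepType {
  return (STEP_TYPES as readonly string[]).indexOf(value) !== -1;
}
```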

Phase 2: Step Execution Loop

After plan creation, the AI platform enters the core execution loop.

Getting the next step

The AI platform calls get_next_step with the planId. The handler:

Loads the plan and checks its status
If completed -- returns { status: "plan_complete" } with formatting notes
If failed -- returns { status: "plan_failed" }
If awaiting_review -- returns { status: "awaiting_review" }
Otherwise -- finds the first step with pending status (ordered by step_order)
Transitions the step from pending to in_progress via transitionStep
If the plan is in planning or stalled state, transitions it to executing via transitionPlan
Writes a step_started audit log entry
Returns the step's ID, order, type, and instructions

If no pending steps exist but some are in_progress or failed, the handler returns { status: "no_pending_steps" } with a count of in-progress and failed steps.
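The decision order described above can be sketched as a pure function. This is a simplified model under stated assumptions: it omits the database transaction, the `transitionStep` and `transitionPlan` calls, and the audit log write; the `step_ready` status value, the count field names, and all type names are hypothetical.

```typescript
type PlanStatus =
  | "planning"
  | "executing"
  | "awaiting_review"
  | "stalled"
  | "completed"
  | "failed";

type StepStatus =
  | "pending"
  | "in_progress"
  | "awaiting_input"
  | "completed"
  | "failed"
  | "skipped";

interface Step {
  id: string;
  stepOrder: number;
  stepType: string;
  instructions: string;
  status: StepStatus;
}

type NextStepResult =
  | { status: "plan_complete" | "plan_failed" | "awaiting_review" }
  | { status: "no_pending_steps"; inProgress: number; failed: number }
  | {
      status: "step_ready"; // hypothetical label for "here is your next step"
      stepId: string;
      stepOrder: number;
      stepType: string;
      instructions: string;
    };

function decideNextStep(planStatus: PlanStatus, steps: Step[]): NextStepResult {
  // Terminal and waiting plan states short-circuit before any step lookup.
  if (planStatus === "completed") return { status: "plan_complete" };
  if (planStatus === "failed") return { status: "plan_failed" };
  if (planStatus === "awaiting_review") return { status: "awaiting_review" };

  // First pending step by step order.
  const next = [...steps]
    .sort((a, b) => a.stepOrder - b.stepOrder)
    .find((s) => s.status === "pending");

  if (next === undefined) {
    // Assumption: this sketch also returns zero counts when nothing is
    // in progress or failed; the source does not describe that case.
    return {
      status: "no_pending_steps",
      inProgress: steps.filter((s) => s.status === "in_progress").length,
      failed: steps.filter((s) => s.status === "failed").length,
    };
  }

  return {
    status: "step_ready",
    stepId: next.id,
    stepOrder: next.stepOrder,
    stepType: next.stepType,
    instructions: next.instructions,
  };
}
```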

Executing the step

Between get_next_step and submit_step_result, the AI platform does the actual work. The work is driven entirely by the AI platform -- Trovella's server acts only when the platform calls one of its tools. The AI platform may:

Search the web for sources
Call get_step_context to load prior step results and related artifacts
Call search_sources to query Trovella's hybrid search index
Call extract_data for structured extraction (the one tool that makes server-side LLM calls)
Call store_research to persist intermediate artifacts

Submitting results

The AI platform calls submit_step_result with the step's result data. The handler:

Loads the step and validates it is in_progress (also accepts pending for timing edge cases)
Transitions the step to completed
Stores the resultSummary, confidence, stepExecutionReport, and outputFormattingNotes
Writes a step_completed audit log entry
Evaluates branching conditions for this step
If a fail branch action fires, sets the plan to failed and returns immediately
Otherwise, calls derivePlanStatus on all step states to determine the new plan status
Updates the research_plan status and completedAt (if completed)

The AI platform then calls get_next_step again to continue the loop.
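The ordering of the submission steps can be sketched in memory as below. It is a rough model, not the handler itself: it skips the transaction, the audit log, and persistence of `confidence` and the report fields. `derivePlanStatus` is passed in as a parameter because its rules are not described on this page, and `firedBranchAction` is a hypothetical stand-in for the outcome of branching evaluation.

```typescript
type StepStatus =
  | "pending"
  | "in_progress"
  | "awaiting_input"
  | "completed"
  | "failed"
  | "skipped";

interface StepState {
  id: string;
  status: StepStatus;
  resultSummary?: string;
}

// Returns the new plan status after the step is completed.
function submitStepResult(
  steps: StepState[],
  stepId: string,
  resultSummary: string,
  firedBranchAction: "fail" | undefined, // hypothetical branching outcome
  derivePlanStatus: (statuses: StepStatus[]) => string,
): string {
  const step = steps.find((s) => s.id === stepId);
  if (step === undefined) {
    throw new Error(`Unknown step ${stepId}`);
  }
  // in_progress is the normal case; pending is accepted for the timing edge case.
  if (step.status !== "in_progress" && step.status !== "pending") {
    throw new Error(`Step ${stepId} cannot be completed from ${step.status}`);
  }

  step.status = "completed";
  step.resultSummary = resultSummary;

  // A fail branch action short-circuits status derivation.
  if (firedBranchAction === "fail") {
    return "failed";
  }
  return derivePlanStatus(steps.map((s) => s.status));
}
```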

Timing edge case: pending step submission

submit_step_result accepts steps in both in_progress and pending status. This handles a race condition where the AI platform starts working on a step before the get_next_step transaction has fully committed. Rather than rejecting valid work, the handler auto-advances the step through both transitions:

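// Step was submitted before get_next_step committed: advance it first.
if (step.status === "pending") {
  transitionStep("pending", "in_progress");
}
// Normal path: complete the in-progress step.
transitionStep("in_progress", "completed");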

Phase 3: Human-in-the-Loop Review

At any point during execution, the AI platform can pause for user input.

Requesting review

The AI platform calls request_user_review with a summary and optional questions. The handler:

Validates the step is in_progress (can transition to awaiting_input)
Validates the plan is executing (can transition to awaiting_review)
Transitions the step to awaiting_input
Transitions the plan to awaiting_review
Writes a user_reviewed audit log entry with action review_requested

After this, calling get_next_step returns { status: "awaiting_review" } until the user responds.

User decisions

The user responds through submit_user_decision with one of four decisions:

Decision	Step effect	Plan effect
`approve`	Step completed	Plan resumes to `executing`
`reject`	Step failed	Plan fails
`modify`	Step returns to `in_progress`	Plan resumes to `executing`
`skip`	Step skipped	Plan resumes to `executing`

The modify decision is particularly important: it appends the user's feedback to the step's instructions (separated by a horizontal rule and "User feedback:" prefix), then returns the step to in_progress so the AI platform can re-execute it with the new guidance.
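The decision table and the modify behavior can be expressed compactly, as in the sketch below. `DECISION_EFFECTS` and `appendFeedback` are hypothetical names, and the exact separator (`---`) and blank-line spacing are assumptions; the source states only that a horizontal rule and a "User feedback:" prefix are used.

```typescript
type Decision = "approve" | "reject" | "modify" | "skip";

// Resulting step and plan statuses for each decision, per the table above.
const DECISION_EFFECTS: Record<Decision, { step: string; plan: string }> = {
  approve: { step: "completed", plan: "executing" },
  reject: { step: "failed", plan: "failed" },
  modify: { step: "in_progress", plan: "executing" },
  skip: { step: "skipped", plan: "executing" },
};

// Append feedback below a horizontal rule with a "User feedback:" prefix.
// The "---" rule and the surrounding blank lines are assumed formatting.
function appendFeedback(instructions: string, feedback: string): string {
  return `${instructions}\n\n---\n\nUser feedback: ${feedback}`;
}
```

Because the feedback is appended rather than substituted, the original instructions stay visible to the AI platform when it re-executes the step.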

Phase 4: Plan Completion

When derivePlanStatus returns completed (all steps are in a terminal state), the plan is marked as completed. The get_next_step response includes:

status: "plan_complete"
outputMediaType -- from the linked skill execution metadata
outputFormattingInstructions -- from the linked skill execution metadata
planFormattingNotes -- from the plan's outputFormattingNotes field
stepFormattingNotes -- from each completed step's outputFormattingNotes field

This formatting metadata guides the AI platform in generating the final deliverable via store_research_output.
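As an illustration of how these fields might be assembled, the sketch below builds a completion payload from in-memory data. The input shape, the function name, and the per-step `{ stepOrder, notes }` structure are assumptions; only the five top-level field names come from the list above.

```typescript
interface CompletedPlanInput {
  planFormattingNotes?: string;
  steps: { stepOrder: number; status: string; outputFormattingNotes?: string }[];
  // Hypothetical view of the linked skill execution metadata.
  skill?: { outputMediaType?: string; outputFormattingInstructions?: string };
}

function buildPlanCompleteResponse(input: CompletedPlanInput) {
  return {
    status: "plan_complete" as const,
    outputMediaType: input.skill?.outputMediaType,
    outputFormattingInstructions: input.skill?.outputFormattingInstructions,
    planFormattingNotes: input.planFormattingNotes,
    // Only completed steps that actually carry notes contribute.
    stepFormattingNotes: input.steps
      .filter((s) => s.status === "completed" && s.outputFormattingNotes !== undefined)
      .sort((a, b) => a.stepOrder - b.stepOrder)
      .map((s) => ({ stepOrder: s.stepOrder, notes: s.outputFormattingNotes as string })),
  };
}
```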

Cross-Session Resume

Plans survive session disconnects because all state is in PostgreSQL. To resume a plan from a new session:

Call get_research_context with the planId -- returns the full plan context including all steps, their statuses, and prior results
Call get_next_step -- if the plan was stalled, the handler transitions it back to executing; if a step was in_progress but never completed, the handler skips to the next pending step

The stalled --> executing plan transition is what enables this resume flow. See Stall Detection for how stalls are identified.

Plan Modification During Execution

The AI platform can modify plans that are in planning or executing state via modify_plan. Five modification actions are available:

Action	What it does	Constraint
`add_steps`	Inserts new steps, shifting subsequent step orders	Specify `insertAfterOrder` or append to end
`remove_step`	Deletes a step and compacts step orders	Step must be `pending`
`reorder_steps`	Reassigns step orders from a provided ID sequence	Provide complete step ID list
`update_step_instructions`	Replaces a step's instruction text	Any step status
`fail_step`	Marks a step as failed with a reason	Step must be `pending` or `in_progress`

Plans in awaiting_review, stalled, completed, or failed states cannot be modified.

Every modification writes a plan_modified audit log entry with the action type and an optional modificationRationale explaining why the change was made.
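The order-shifting behavior of add_steps and remove_step, together with the plan-state guard, can be sketched on an in-memory array. This is an approximation under assumptions: the real handlers operate on database rows, and `PlanStep`, `canModifyPlan`, `addSteps`, and `removeStep` are hypothetical names.

```typescript
interface PlanStep {
  id: string;
  stepOrder: number;
  status: string;
}

// Only plans in planning or executing state accept modifications.
function canModifyPlan(planStatus: string): boolean {
  return planStatus === "planning" || planStatus === "executing";
}

// Renumber steps 1..n in their current array order.
function renumber(steps: PlanStep[]): PlanStep[] {
  return steps.map((s, i) => ({ ...s, stepOrder: i + 1 }));
}

// Insert new pending steps after insertAfterOrder, or append when omitted.
function addSteps(steps: PlanStep[], newIds: string[], insertAfterOrder?: number): PlanStep[] {
  const ordered = [...steps].sort((a, b) => a.stepOrder - b.stepOrder);
  let at = ordered.length;
  if (insertAfterOrder !== undefined) {
    const idx = ordered.findIndex((s) => s.stepOrder === insertAfterOrder);
    if (idx === -1) throw new Error(`No step with order ${insertAfterOrder}`);
    at = idx + 1;
  }
  const added = newIds.map((id) => ({ id, stepOrder: 0, status: "pending" }));
  return renumber([...ordered.slice(0, at), ...added, ...ordered.slice(at)]);
}

// Delete a pending step and compact the remaining step orders.
function removeStep(steps: PlanStep[], stepId: string): PlanStep[] {
  const target = steps.find((s) => s.id === stepId);
  if (target === undefined) throw new Error(`Unknown step ${stepId}`);
  if (target.status !== "pending") throw new Error("Only pending steps can be removed");
  const remaining = steps
    .filter((s) => s.id !== stepId)
    .sort((a, b) => a.stepOrder - b.stepOrder);
  return renumber(remaining);
}
```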

Audit Trail

Every state transition and mutation writes to the plan_audit_log table via writeAuditLog(). This is a hard requirement documented in packages/mcp/CLAUDE.md -- all plan/step mutations must call writeAuditLog inside the transaction, before commit.

The eight event types map to specific points in the execution loop:

Event type	When it fires
`step_started`	`get_next_step` transitions a step to `in_progress`
`step_completed`	`submit_step_result` completes a step
`step_failed`	A step is marked failed by a branching condition or by `fail_step`
`plan_modified`	Plan created, steps added/removed/reordered, instructions updated
`user_reviewed`	Review requested or user decision submitted
`session_resumed`	Plan resumed from a new session
`skill_started`	Skill execution begins
`skill_completed`	Skill execution finishes

In addition to the business-level audit log, every MCP tool call (except ping) is recorded in mcp_tool_call_log via the withToolCallLogging middleware. This captures request/response metadata, duration, and error flags as a fire-and-forget async write.
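The fire-and-forget pattern mentioned above can be sketched as a generic wrapper. This is not the actual withToolCallLogging middleware: `withLoggingSketch`, `ToolCallLogEntry`, and the injected `writeLog` function are hypothetical, and the real middleware records more metadata than shown here.

```typescript
interface ToolCallLogEntry {
  tool: string;
  durationMs: number;
  isError: boolean;
}

type Handler<A, R> = (args: A) => Promise<R>;

function withLoggingSketch<A, R>(
  tool: string,
  handler: Handler<A, R>,
  writeLog: (entry: ToolCallLogEntry) => Promise<void>,
): Handler<A, R> {
  return async (args: A): Promise<R> => {
    // ping is excluded from logging.
    if (tool === "ping") return handler(args);

    const startedAt = Date.now();
    let isError = false;
    try {
      return await handler(args);
    } catch (err) {
      isError = true;
      throw err;
    } finally {
      // Fire-and-forget: start the write without awaiting it, and swallow
      // failures so logging can never break the tool call itself.
      void writeLog({ tool, durationMs: Date.now() - startedAt, isError }).catch(
        () => undefined,
      );
    }
  };
}
```

The trade-off of not awaiting the write is that a log entry can be lost if the write fails, which is acceptable for diagnostics but is why the business-level audit log is written inside the transaction instead.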

Related Pages

State Machines -- the transition rules enforced at each step of this loop
Branching -- how conditions evaluated after step completion alter the flow
Stall Detection -- how the system identifies steps stuck in in_progress
Progress Tracking -- how get_plan_status summarizes progress during execution
Identity & Access -- Tenant Isolation -- the withTenantContext wrapper used in every tool handler