Trovella Wiki

Execution Flow

End-to-end lifecycle of a research skill invocation -- from user prompt through interview, routing, planning, step execution, output delivery, and feedback capture.

Every research request follows the same six-phase lifecycle. The phases are sequential -- each must complete before the next begins. The entire flow is driven by skill prompts that instruct the AI platform what to do, with MCP tools providing persistence and structure.

Phase 1: Interview

The router skill (/research) parses the user's message and extracts eight elements:

| Element | Description | Always extracted? |
| --- | --- | --- |
| topic | The core research subject | Yes |
| researchType | general_topic, competitive_analysis, or decision_support | Yes |
| scope | Breadth and depth boundaries | Yes |
| depthSignals | Words indicating desired depth ("quick", "thorough", etc.) | Yes |
| decisionContext | What decision this research informs | If applicable |
| knownContext | What the user already knows | If provided |
| outputMediaType | Preferred output format | Asked if not detectable |
| outputFormattingInstructions | Styling, layout, or length requirements | If provided |
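
As a rough illustration, the extraction could be modeled as a typed record. The field names come from the table above; the concrete types and the `missingRequiredFields` helper are assumptions made for this sketch, not part of the documented skill.

```typescript
// Hypothetical shape of the interview extraction. Field names follow the
// table above; the types are assumptions.
type ResearchType = "general_topic" | "competitive_analysis" | "decision_support";

interface InterviewExtraction {
  topic: string;
  researchType: ResearchType;
  scope: string;
  depthSignals: string[];
  decisionContext?: string;              // if applicable
  knownContext?: string;                 // if provided
  outputMediaType?: string;              // asked if not detectable
  outputFormattingInstructions?: string; // if provided
}

// Hypothetical helper: lists the always-extracted fields that are still unset.
function missingRequiredFields(e: Partial<InterviewExtraction>): string[] {
  const required = ["topic", "researchType", "scope", "depthSignals"] as const;
  return required.filter((key) => e[key] === undefined);
}
```

A draft extraction with only `topic` and `scope` set would report `researchType` and `depthSignals` as missing.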

Clarification Rules

The skill follows strict rules about when to ask follow-up questions:

Proceed without asking when the question is focused and self-contained, the depth is clear from context, or there is enough information to build a plan. When proceeding without questions, the skill stores a skipQuestionsReason explaining why.

Ask 1-2 questions when the topic is ambiguous, the scope is unclear, or decision criteria are missing. Never more than 2. The question-answer exchanges are stored as clarifyingQuestions: [{ question, answer }].
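
One way to picture the two outcomes is as a record that holds either the question-answer exchanges or the skip reason, never both. The sketch below assumes that reading; `buildClarificationRecord` and `MAX_QUESTIONS` are hypothetical names introduced for illustration.

```typescript
interface ClarifyingQuestion {
  question: string;
  answer: string;
}

// Either the exchanges or the reason for skipping them is stored.
type ClarificationRecord =
  | { clarifyingQuestions: ClarifyingQuestion[] }
  | { skipQuestionsReason: string };

const MAX_QUESTIONS = 2; // "Never more than 2"

// Hypothetical helper that enforces the rules described above.
function buildClarificationRecord(
  exchanges: ClarifyingQuestion[],
  skipReason?: string
): ClarificationRecord {
  if (exchanges.length === 0) {
    if (!skipReason) {
      throw new Error("skipQuestionsReason is required when no questions are asked");
    }
    return { skipQuestionsReason: skipReason };
  }
  if (exchanges.length > MAX_QUESTIONS) {
    throw new Error(`at most ${MAX_QUESTIONS} clarifying questions are allowed`);
  }
  return { clarifyingQuestions: exchanges };
}
```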

Phase 2: Route

Based on the interview extraction, the router classifies the request as scan or deep. The user never sees this decision. See Routing Logic for the full classification criteria.

After deciding, the router composes a routingRationale string explaining the full decision logic -- how many options, how many dimensions, ambiguity level, depth signals, and the final reasoning. This rationale is stored in skill execution metadata for later analysis of routing accuracy.

Phase 3: Initialize and Check Prior Research

Two things happen in sequence:

Tracking Initialization

The router calls log_skill_execution to create a tracking record:

log_skill_execution({
  skillName: "research",
  status: "started",
  metadata: {
    topic, researchType, scope, depthSignals,
    decisionContext, knownContext,
    outputMediaType, outputFormattingInstructions,
    clarifyingQuestions,   // or skipQuestionsReason when no questions were asked
    routingRationale,
    delegatedTo: "scan",   // or "deep"
    originalQuery: "<user's message, up to 5000 chars>"
  }
})

This returns an executionId that is used for all subsequent tracking updates.

Prior Research Check

The skill calls search_sources({ query: "<key terms>", limit: 5 }) to find existing research artifacts. The number of artifacts found, together with the user's decision, determines one of four paths:

| Artifacts found | User decision | What happens |
| --- | --- | --- |
| 0 | (automatic) | Proceed silently to planning |
| 1-5 | Use existing | Session ends -- skill execution marked completed with closedReason |
| 1-5 | Refresh | Proceed to planning with user's additional context |
| 1-5 | Unique | Proceed to planning as new research |

When artifacts are found, each is presented with its original research prompt, date, and summary. The user's decision and the full exchange are stored in priorResearchCheck metadata.
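
The decision table can be read as a small pure function. This is a sketch only: the `UserDecision` values, the `NextAction` shape, and the `closedReason` text are hypothetical and chosen for illustration.

```typescript
type UserDecision = "use_existing" | "refresh" | "unique";

type NextAction =
  | { kind: "plan"; basis: "no_prior_research" | "refresh" | "new_research"; additionalContext?: string }
  | { kind: "end_session"; closedReason: string };

// Maps the search result count and the user's choice to the next action.
function resolvePriorResearch(
  artifactCount: number,
  decision?: UserDecision,
  additionalContext?: string
): NextAction {
  // No artifacts: proceed silently, without asking the user anything.
  if (artifactCount === 0) return { kind: "plan", basis: "no_prior_research" };
  if (!decision) throw new Error("a user decision is required when artifacts are found");
  switch (decision) {
    case "use_existing":
      // The closedReason wording here is an assumption.
      return { kind: "end_session", closedReason: "user chose existing research" };
    case "refresh":
      return { kind: "plan", basis: "refresh", additionalContext };
    case "unique":
      return { kind: "plan", basis: "new_research" };
  }
}
```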

Phase 4: Plan Creation

The skill designs a research plan based on the research type and routed mode. Plan structure varies by type (see Skill Definitions for templates).

The skill calls create_research_plan with:

  • name -- formatted as [Scan] <topic> or [Deep] <topic>
  • researchQuestion -- the core question from the interview
  • steps -- ordered list of { stepType, instructions } entries
  • branchingConditions -- quality gates (deep only)
  • planDesignRationale -- explanation of why this plan structure was chosen
  • outputFormattingNotes -- initial ideas for output layout
  • sessionId -- for audit tracking
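
A minimal sketch of assembling these arguments follows. The parameter names match the list above, but the TypeScript types, the `formatPlanName` and `buildPlanInput` helpers, and the choice to drop branchingConditions for scan plans are assumptions for illustration.

```typescript
type Mode = "scan" | "deep";

interface PlanStep {
  stepType: string;
  instructions: string;
}

interface CreatePlanInput {
  name: string;
  researchQuestion: string;
  steps: PlanStep[];
  branchingConditions?: unknown[]; // quality gates, deep only; shape not specified here
  planDesignRationale: string;
  outputFormattingNotes?: string;
  sessionId?: string;
}

// Formats the plan name as "[Scan] <topic>" or "[Deep] <topic>".
function formatPlanName(mode: Mode, topic: string): string {
  const label = mode === "scan" ? "Scan" : "Deep";
  return `[${label}] ${topic.trim()}`;
}

// Hypothetical builder: fills in the name and omits quality gates for scans.
function buildPlanInput(
  mode: Mode,
  topic: string,
  rest: Omit<CreatePlanInput, "name">
): CreatePlanInput {
  const input: CreatePlanInput = { ...rest, name: formatPlanName(mode, topic) };
  if (mode === "scan") delete input.branchingConditions;
  return input;
}
```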

Auto-Linking

When create_research_plan receives a sessionId, the server automatically finds the most recent skill execution for that session with status "started" and updates it with the planId and status: "executing". This removes the need for a separate log_skill_execution call that the AI platform might forget.

The auto-link logic in packages/mcp/src/tools/create-research-plan.ts:

  1. Query skill_execution WHERE claude_code_session_id = sessionId AND status = 'started' ORDER BY created_at DESC LIMIT 1
  2. If found: update plan_id, status = 'executing', merge planDesignRationale into metadata
  3. If not found: no-op (plan still works, just not linked to a skill execution)
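
The three steps above can be sketched against an in-memory list instead of the skill_execution table. This is not the code in create-research-plan.ts; the `SkillExecution` shape and the `autoLink` function are hypothetical stand-ins that mirror the described query and update.

```typescript
interface SkillExecution {
  id: string;
  sessionId: string; // stands in for claude_code_session_id
  status: "started" | "executing" | "completed" | "failed";
  createdAt: number; // stands in for created_at
  planId?: string;
  metadata: Record<string, unknown>;
}

// Links the newest "started" execution of the session to the plan, if any.
function autoLink(
  executions: SkillExecution[],
  sessionId: string,
  planId: string,
  planDesignRationale: string
): SkillExecution | undefined {
  // Step 1: WHERE session matches AND status = 'started' ORDER BY created_at DESC LIMIT 1
  const candidates = executions
    .filter((e) => e.sessionId === sessionId && e.status === "started")
    .sort((a, b) => b.createdAt - a.createdAt);
  if (candidates.length === 0) return undefined; // Step 3: no-op
  // Step 2: set the plan, advance the status, merge the rationale into metadata
  const latest = candidates[0];
  latest.planId = planId;
  latest.status = "executing";
  latest.metadata = { ...latest.metadata, planDesignRationale };
  return latest;
}
```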

Phase 5: Step Execution

The skill enters the pull-based execution loop. This is the core research cycle:

LOOP:
  response = get_next_step({ planId })

  IF response.status == "plan_complete": EXIT LOOP
  IF response.status == "awaiting_review": EXIT LOOP (wait for user)
  IF response.status == "plan_failed": EXIT LOOP (handle error)

  step = response.step

  IF step.stepOrder > 1:
    context = get_step_context({ planId, stepId })

  // AI platform does the actual research work here
  // (searches, reads, analyzes, reasons, writes)

  submit_step_result({
    planId, stepId,
    result: <structured>,
    confidence: <0-1>,
    stepExecutionReport: { thinking, webSearches, webFetches, otherToolCalls, subagents },
    outputFormattingNotes: <optional>
  })

  // Store significant outputs
  IF meaningful output:
    store_research({ artifactType, title, content, contentText, confidence, planId, stepId })

END LOOP
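
For readers who prefer typed code, the loop can be sketched as below. The `PlanTools` interface is a hypothetical wrapper around get_next_step, get_step_context, and submit_step_result; the calls are synchronous only to keep the sketch short (real MCP tool calls would be asynchronous), and the stepExecutionReport and store_research parts are left out.

```typescript
interface Step {
  stepId: string;
  stepOrder: number;
  instructions: string;
}

interface NextStepResponse {
  status: string;
  step?: Step;
}

interface StepResult {
  result: unknown;
  confidence: number; // 0-1
}

// Hypothetical wrapper around the MCP tools used by the loop.
interface PlanTools {
  getNextStep(planId: string): NextStepResponse;
  getStepContext(planId: string, stepId: string): unknown;
  submitStepResult(planId: string, stepId: string, outcome: StepResult): void;
}

// Statuses that end the loop, as listed in the pseudocode above.
const EXIT_STATUSES = new Set(["plan_complete", "awaiting_review", "plan_failed"]);

// Pulls steps until an exit status is returned; returns that status.
function runPlan(
  tools: PlanTools,
  planId: string,
  doResearch: (step: Step, context: unknown) => StepResult
): string {
  for (;;) {
    const response = tools.getNextStep(planId);
    if (EXIT_STATUSES.has(response.status) || !response.step) return response.status;
    const step = response.step;
    // Prior-step context is only loaded from the second step onward.
    const context = step.stepOrder > 1 ? tools.getStepContext(planId, step.stepId) : undefined;
    tools.submitStepResult(planId, step.stepId, doResearch(step, context));
  }
}
```

Because the loop is pull-based, the caller never tracks which step comes next; the server decides on every get_next_step call.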

Step Types and AI Platform Behavior

Each step type instructs the AI platform to adopt a different analytical perspective:

| Step Type | What the AI platform does |
| --- | --- |
| search | Investigate using knowledge, web search, and search_sources for existing artifacts |
| extract | Use extract_data for structured extraction or structure data manually |
| analyze | Review prior steps, identify patterns, assess confidence |
| critique | Challenge the analysis -- missing info, biases, alternative interpretations |
| synthesize | Combine all findings into the final deliverable |
| checkpoint | Present findings to the user via request_user_review, pause for input |
| custom | Follow step-specific instructions (typically follow-up on checkpoint feedback) |

Step Execution Reports

Every step submission includes a stepExecutionReport that captures the AI platform's complete reasoning process:

  • thinking -- reasoning narrative, hypotheses, decisions, discarded approaches, confidence factors
  • webSearches -- exact queries, result counts, which results were used and why
  • webFetches -- URLs fetched, content summaries, extracted facts
  • otherToolCalls -- other tool invocations with inputs and results
  • subagents -- subagent delegations with task descriptions and linked report IDs

This is the primary observability mechanism for research quality. All fields are required even if empty (use empty arrays, not omitted fields).
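
The "empty arrays, not omitted fields" rule is easy to enforce with a small normalizer. In this sketch the five field names come from the list above; the element types are left as `unknown` because their shapes are not specified here, and `completeReport` is a hypothetical helper.

```typescript
interface StepExecutionReport {
  thinking: string;
  webSearches: unknown[];
  webFetches: unknown[];
  otherToolCalls: unknown[];
  subagents: unknown[];
}

// Fills every missing field with an empty value so nothing is omitted.
function completeReport(partial: Partial<StepExecutionReport>): StepExecutionReport {
  return {
    thinking: partial.thinking ?? "",
    webSearches: partial.webSearches ?? [],
    webFetches: partial.webFetches ?? [],
    otherToolCalls: partial.otherToolCalls ?? [],
    subagents: partial.subagents ?? [],
  };
}
```

A step that used no tools would then still submit all five fields, with the four lists empty.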

Dynamic Plan Modification

During execution, the AI platform can modify the plan when circumstances change:

  • Add steps -- when gaps are discovered (especially after critique)
  • Remove steps -- when a pending step becomes irrelevant
  • Reorder steps -- when execution order should change
  • Update instructions -- when step context has changed
  • Fail a step -- when a step cannot be completed (plan continues)

Every modification requires a modificationRationale explaining why the change was needed.
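
The five modification kinds and the mandatory rationale could be modeled as a discriminated union. The `action` names and field shapes below are hypothetical; only modificationRationale and the five kinds of change are taken from the text above.

```typescript
type PlanModification =
  | { action: "add_step"; stepType: string; instructions: string }
  | { action: "remove_step"; stepId: string }
  | { action: "reorder_steps"; stepIds: string[] }
  | { action: "update_instructions"; stepId: string; instructions: string }
  | { action: "fail_step"; stepId: string };

type ModifyPlanRequest = PlanModification & {
  planId: string;
  modificationRationale: string;
};

// Rejects any modification that arrives without a rationale.
function buildModification(
  planId: string,
  change: PlanModification,
  modificationRationale: string
): ModifyPlanRequest {
  const rationale = modificationRationale.trim();
  if (rationale.length === 0) {
    throw new Error("modificationRationale is required for every plan modification");
  }
  return { ...change, planId, modificationRationale: rationale };
}
```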

Phase 6: Output Delivery and Feedback

When get_next_step returns plan_complete, the skill invokes /research-output to generate the final deliverable.

Delivery Flow

  1. Load full research context via get_research_context
  2. Generate formatted output according to user's media type preference
  3. Store the deliverable via store_research_output for later retrieval
  4. Present to the user
  5. Ask for feedback: "How did this turn out? Anything I should have done differently?"

Feedback Collection

Two feedback moments bracket the end of the research experience:

Initial feedback -- captured right after delivery. The skill waits for the user's response and records their satisfaction level, feedback summary, improvement suggestions, and whether follow-up research was requested.

Closing feedback -- captured when the conversation shifts away from this research topic. This captures accumulated signals from the post-delivery discussion. Skipped if initial feedback already captured everything.

Both are persisted via submit_research_feedback to the research_feedback table.

Tracking Completion

After output delivery and feedback, the skill updates the execution record:

log_skill_execution({
  executionId,
  skillName: "research",
  status: "completed",
  metadata: { stepsCompleted, artifactsStored, checkpointFeedback }
})

Error Handling

If any step fails during execution:

  1. Check get_plan_status({ planId }) to diagnose the failure
  2. If recoverable: use modify_plan to skip or replace the step, continue the loop
  3. If not recoverable: present what was gathered so far, indicate where research was interrupted
  4. Always log the failure by calling log_skill_execution({ executionId, status: "failed", errorMessage })

The output skill has its own error handling: if store_research_output fails, the content is still presented to the user. If submit_research_feedback fails, it is logged silently to avoid disrupting the user experience.
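
The output skill's tolerance for storage and feedback failures can be sketched as follows. The `OutputTools` interface is a hypothetical synchronous wrapper around store_research_output and submit_research_feedback, and the `present` and `log` callbacks are placeholders for whatever the platform uses.

```typescript
// Hypothetical wrapper around the two MCP tools used by the output skill.
interface OutputTools {
  storeResearchOutput(planId: string, content: string): void;
  submitResearchFeedback(planId: string, feedback: string): void;
}

// Presents the deliverable even when storing it fails.
function deliverOutput(
  tools: OutputTools,
  planId: string,
  content: string,
  present: (content: string) => void,
  log: (message: string) => void
): { stored: boolean } {
  let stored = true;
  try {
    tools.storeResearchOutput(planId, content);
  } catch (err) {
    stored = false;
    log(`store_research_output failed: ${String(err)}`);
  }
  present(content); // always reached, whether or not storage succeeded
  return { stored };
}

// Logs feedback failures instead of surfacing them to the user.
function recordFeedback(
  tools: OutputTools,
  planId: string,
  feedback: string,
  log: (message: string) => void
): boolean {
  try {
    tools.submitResearchFeedback(planId, feedback);
    return true;
  } catch (err) {
    log(`submit_research_feedback failed: ${String(err)}`);
    return false;
  }
}
```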

Cross-Session Resume

Research plans persist server-side, enabling resume across sessions:

  1. Call list_active_plans() to find non-terminal plans
  2. Present active plans and let the user choose
  3. Call get_research_context({ planId, sessionId }) to load full state (this call also writes a session_resumed audit log entry)
  4. Call get_next_step({ planId }) to pick up where it left off
  5. Continue the execution loop

This works across AI tools -- research started in Claude Code can be resumed from any MCP-compatible client.
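
A sketch of the resume sequence, under the assumption that the three tools are reachable through a thin client. `ResumeTools`, `ActivePlan`, and the `choose` callback are hypothetical names, and the calls are synchronous only for brevity.

```typescript
interface ActivePlan {
  planId: string;
  name: string;
}

// Hypothetical wrapper around list_active_plans, get_research_context,
// and get_next_step.
interface ResumeTools {
  listActivePlans(): ActivePlan[];
  getResearchContext(planId: string, sessionId: string): unknown;
  getNextStep(planId: string): { status: string };
}

interface ResumedPlan {
  planId: string;
  context: unknown;
  next: { status: string };
}

// Returns undefined when there is nothing to resume or the user declines.
function resumePlan(
  tools: ResumeTools,
  sessionId: string,
  choose: (plans: ActivePlan[]) => ActivePlan | undefined
): ResumedPlan | undefined {
  const plans = tools.listActivePlans();
  if (plans.length === 0) return undefined;
  const chosen = choose(plans);
  if (!chosen) return undefined;
  // Loading the context is what records the resume on the server side.
  const context = tools.getResearchContext(chosen.planId, sessionId);
  const next = tools.getNextStep(chosen.planId);
  return { planId: chosen.planId, context, next };
}
```

From the returned `next` response, the caller would re-enter the same execution loop used for a fresh plan.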
