Execution Flow
End-to-end lifecycle of a research skill invocation -- from user prompt through interview, routing, planning, step execution, output delivery, and feedback capture.
Every research request follows the same six-phase lifecycle. The phases are sequential -- each must complete before the next begins. The entire flow is driven by skill prompts that instruct the AI platform what to do, with MCP tools providing persistence and structure.
Phase 1: Interview
The router skill (/research) parses the user's message and extracts eight elements:
| Element | Description | Always extracted? |
|---|---|---|
| topic | The core research subject | Yes |
| researchType | general_topic, competitive_analysis, or decision_support | Yes |
| scope | Breadth and depth boundaries | Yes |
| depthSignals | Words indicating desired depth ("quick", "thorough", etc.) | Yes |
| decisionContext | What decision this research informs | If applicable |
| knownContext | What the user already knows | If provided |
| outputMediaType | Preferred output format | Asked if not detectable |
| outputFormattingInstructions | Styling, layout, or length requirements | If provided |
Clarification Rules
The skill follows strict rules about when to ask follow-up questions:
Proceed without asking when the question is focused and self-contained, the depth is clear from context, or there is enough information to build a plan. When proceeding without questions, the skill stores a skipQuestionsReason explaining why.
Ask 1-2 questions when the topic is ambiguous, the scope is unclear, or decision criteria are missing. Never more than 2. The question-answer exchanges are stored as clarifyingQuestions: [{ question, answer }].
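The extraction and clarification rules above can be sketched as a typed record plus a consistency check. This is a minimal sketch, not the skill's actual schema: the field names follow the element table, but the TypeScript types, the `checkClarification` helper, and the assumption that `clarifyingQuestions` and `skipQuestionsReason` are mutually exclusive are illustrative.

```typescript
// Hypothetical shapes: field names come from the element table, types are assumed.
type ResearchType = "general_topic" | "competitive_analysis" | "decision_support";

interface ClarifyingExchange {
  question: string;
  answer: string;
}

interface InterviewExtraction {
  topic: string;
  researchType: ResearchType;
  scope: string;
  depthSignals: string[];
  decisionContext?: string;
  knownContext?: string;
  outputMediaType?: string;
  outputFormattingInstructions?: string;
  clarifyingQuestions?: ClarifyingExchange[];
  skipQuestionsReason?: string;
}

// Returns a list of rule violations; an empty list means the record is consistent.
function checkClarification(extraction: InterviewExtraction): string[] {
  const problems: string[] = [];
  const asked = extraction.clarifyingQuestions ?? [];
  const skipped = extraction.skipQuestionsReason !== undefined;

  if (asked.length > 2) {
    problems.push("more than 2 clarifying questions");
  }
  if (asked.length === 0 && !skipped) {
    problems.push("no questions asked but skipQuestionsReason is missing");
  }
  if (asked.length > 0 && skipped) {
    // Assumption: the two fields are treated as mutually exclusive.
    problems.push("both clarifyingQuestions and skipQuestionsReason are set");
  }
  return problems;
}
```

A check like this could run just before the tracking record is created, so an inconsistent interview record is caught before it is persisted.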
Phase 2: Route
Based on the interview extraction, the router classifies the request as scan or deep. The user never sees this decision. See Routing Logic for the full classification criteria.
After deciding, the router composes a routingRationale string explaining the full decision logic -- how many options, how many dimensions, ambiguity level, depth signals, and the final reasoning. This rationale is stored in skill execution metadata for later analysis of routing accuracy.
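One way to keep the rationale analyzable later is to compose it from named parts. The sketch below is illustrative only: the `RoutingInputs` shape, the three ambiguity levels, and the `key=value` layout are assumptions, and the real rationale may well be free-form prose.

```typescript
// Hypothetical input shape covering the items the rationale is said to explain.
interface RoutingInputs {
  optionCount: number;
  dimensionCount: number;
  ambiguity: "low" | "medium" | "high"; // assumed scale
  depthSignals: string[];
  decision: "scan" | "deep";
  reasoning: string;
}

// Builds a single routingRationale string in a fixed, searchable order.
function composeRoutingRationale(inputs: RoutingInputs): string {
  const signals = inputs.depthSignals.length > 0 ? inputs.depthSignals.join(", ") : "none";
  return [
    `options=${inputs.optionCount}`,
    `dimensions=${inputs.dimensionCount}`,
    `ambiguity=${inputs.ambiguity}`,
    `depthSignals=${signals}`,
    `decision=${inputs.decision}`,
    `reasoning=${inputs.reasoning}`,
  ].join("; ");
}
```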
Phase 3: Initialize and Check Prior Research
Two things happen in sequence:
Tracking Initialization
The router calls log_skill_execution to create a tracking record:
    log_skill_execution({
      skillName: "research",
      status: "started",
      metadata: {
        topic, researchType, scope, depthSignals,
        decisionContext, knownContext,
        outputMediaType, outputFormattingInstructions,
        clarifyingQuestions OR skipQuestionsReason,
        routingRationale,
        delegatedTo: "scan" or "deep",
        originalQuery: "<user's message, up to 5000 chars>"
      }
    })
This returns an executionId that is used for all subsequent tracking updates.
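As a rough illustration of how the start payload might be assembled, the sketch below merges the interview fields with the routing outcome and truncates the user's message. The `buildStartPayload` helper and the `StartPayload` type are hypothetical, and treating "5000 chars" as 5000 UTF-16 code units is an assumption.

```typescript
// Limit taken from the metadata description; the unit (UTF-16 code units) is assumed.
const MAX_ORIGINAL_QUERY_CHARS = 5000;

type RoutedMode = "scan" | "deep";

interface StartPayload {
  skillName: "research";
  status: "started";
  metadata: Record<string, unknown>;
}

// Hypothetical helper: combines interview fields, routing outcome, and the raw query.
function buildStartPayload(
  interview: Record<string, unknown>,
  routingRationale: string,
  delegatedTo: RoutedMode,
  userMessage: string
): StartPayload {
  return {
    skillName: "research",
    status: "started",
    metadata: {
      ...interview,
      routingRationale,
      delegatedTo,
      // Truncate instead of rejecting, so tracking never fails on long prompts.
      originalQuery: userMessage.slice(0, MAX_ORIGINAL_QUERY_CHARS),
    },
  };
}
```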
Prior Research Check
The skill calls search_sources({ query: "<key terms>", limit: 5 }) to find existing research artifacts. The number of artifacts found, and the user's decision when any exist, determine what happens next:
| Artifacts found | User decision | What happens |
|---|---|---|
| 0 | (automatic) | Proceed silently to planning |
| 1-5 | Use existing | Session ends -- skill execution marked completed with closedReason |
| 1-5 | Refresh | Proceed to planning with user's additional context |
| 1-5 | Unique | Proceed to planning as new research |
When artifacts are found, each is presented with its original research prompt, date, and summary. The user's decision and the full exchange are stored in priorResearchCheck metadata.
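The decision table above maps naturally onto a small pure function. This is a sketch under stated assumptions: the `UserDecision` and `NextAction` names, the `note` strings, and the `closedReason` value are all hypothetical, since the page does not specify them.

```typescript
type UserDecision = "use_existing" | "refresh" | "unique";

type NextAction =
  | { kind: "plan"; note: string }
  | { kind: "end_session"; closedReason: string };

// Maps (artifact count, user decision) to the next action, row by row.
function resolvePriorResearch(artifactCount: number, decision?: UserDecision): NextAction {
  if (artifactCount === 0) {
    // No prior research: proceed silently to planning.
    return { kind: "plan", note: "no prior research found" };
  }
  if (decision === undefined) {
    throw new Error("a user decision is required when artifacts are found");
  }
  if (decision === "use_existing") {
    // Hypothetical closedReason value; the real one is not documented here.
    return { kind: "end_session", closedReason: "user_chose_existing_research" };
  }
  if (decision === "refresh") {
    return { kind: "plan", note: "refresh using the user's additional context" };
  }
  return { kind: "plan", note: "treat as new research" };
}
```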
Phase 4: Plan Creation
The skill designs a research plan based on the research type and routed mode. Plan structure varies by type (see Skill Definitions for templates).
The skill calls create_research_plan with:
- name -- formatted as [Scan] <topic> or [Deep] <topic>
- researchQuestion -- the core question from the interview
- steps -- ordered list of { stepType, instructions } entries
- branchingConditions -- quality gates (deep only)
- planDesignRationale -- explanation of why this plan structure was chosen
- outputFormattingNotes -- initial ideas for output layout
- sessionId -- for audit tracking
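A minimal sketch of assembling these arguments follows. The `CreatePlanArgs` type and `buildPlanArgs` helper are hypothetical; in particular, the shape of `branchingConditions` is not described on this page, so it is typed loosely, and dropping it for scan plans is an assumption based on the "deep only" note.

```typescript
type RoutedMode = "scan" | "deep";

interface PlanStep {
  stepType: string;
  instructions: string;
}

interface CreatePlanArgs {
  name: string;
  researchQuestion: string;
  steps: PlanStep[];
  branchingConditions?: unknown[]; // shape not documented here
  planDesignRationale: string;
  outputFormattingNotes?: string;
  sessionId?: string;
}

// Hypothetical helper: derives the plan name and strips deep-only fields for scans.
function buildPlanArgs(mode: RoutedMode, topic: string, rest: Omit<CreatePlanArgs, "name">): CreatePlanArgs {
  const label = mode === "scan" ? "Scan" : "Deep";
  const args: CreatePlanArgs = { ...rest, name: `[${label}] ${topic.trim()}` };
  if (mode === "scan") {
    delete args.branchingConditions; // quality gates apply to deep plans only
  }
  return args;
}
```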
Auto-Linking
When create_research_plan receives a sessionId, the server automatically finds the most recent started skill execution for that session and updates it with the planId and status: "executing". This eliminates a separate log_skill_execution call that the AI platform might forget.
The auto-link logic in packages/mcp/src/tools/create-research-plan.ts:
- Query skill_execution WHERE claude_code_session_id = sessionId AND status = 'started' ORDER BY created_at DESC LIMIT 1
- If found: update plan_id, status = 'executing', merge planDesignRationale into metadata
- If not found: no-op (plan still works, just not linked to a skill execution)
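The three steps above can be sketched against an in-memory array standing in for the skill_execution table. This is not the code in create-research-plan.ts: the `SkillExecutionRow` type, its camelCase field names, and the `autoLinkPlan` function are assumptions made for illustration.

```typescript
// In-memory stand-in for a skill_execution row; field names are assumed.
interface SkillExecutionRow {
  id: string;
  claudeCodeSessionId: string;
  status: "started" | "executing" | "completed" | "failed";
  createdAt: number; // epoch milliseconds
  planId?: string;
  metadata: Record<string, unknown>;
}

// Links the newest "started" execution of a session to a plan, or does nothing.
function autoLinkPlan(
  rows: SkillExecutionRow[],
  sessionId: string,
  planId: string,
  planDesignRationale: string
): SkillExecutionRow | undefined {
  // Equivalent of: WHERE session AND status = 'started' ORDER BY created_at DESC LIMIT 1
  const candidates = rows
    .filter((row) => row.claudeCodeSessionId === sessionId && row.status === "started")
    .sort((a, b) => b.createdAt - a.createdAt);
  const latest = candidates[0];
  if (latest === undefined) {
    return undefined; // no-op: the plan still works, it is just not linked
  }
  latest.planId = planId;
  latest.status = "executing";
  latest.metadata = { ...latest.metadata, planDesignRationale };
  return latest;
}
```

Returning `undefined` instead of throwing mirrors the no-op branch: a missing execution record should not block plan creation.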
Phase 5: Step Execution
The skill enters the pull-based execution loop. This is the core research cycle:
    LOOP:
      response = get_next_step({ planId })

      IF response.status == "plan_complete": EXIT LOOP
      IF response.status == "awaiting_review": EXIT LOOP (wait for user)
      IF response.status == "plan_failed": EXIT LOOP (handle error)

      step = response.step

      IF step.stepOrder > 1:
        context = get_step_context({ planId, stepId })

      // AI platform does the actual research work here
      // (searches, reads, analyzes, reasons, writes)

      submit_step_result({
        planId, stepId,
        result: <structured>,
        confidence: <0-1>,
        stepExecutionReport: { thinking, webSearches, webFetches, otherToolCalls, subagents },
        outputFormattingNotes: <optional>
      })

      // Store significant outputs
      IF meaningful output:
        store_research({ artifactType, title, content, contentText, confidence, planId, stepId })
    END LOOP
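The control flow of the loop can be sketched in TypeScript against a small tool interface. Several things here are assumptions: the `ResearchTools` interface and `runPlan` function are hypothetical, the `"step_ready"` status is invented because the page only names the three exit statuses, real MCP tool calls would be asynchronous rather than synchronous, and the execution report and artifact storage are left out for brevity.

```typescript
interface ResearchStep {
  stepId: string;
  stepOrder: number;
  stepType: string;
  instructions: string;
}

type ExitReason = "plan_complete" | "awaiting_review" | "plan_failed";

// "step_ready" is a placeholder; the real status for a runnable step is not documented here.
type NextStepResponse =
  | { status: "step_ready"; step: ResearchStep }
  | { status: ExitReason };

// Synchronous stand-ins for the MCP tools used by the loop.
interface ResearchTools {
  getNextStep(planId: string): NextStepResponse;
  getStepContext(planId: string, stepId: string): unknown;
  submitStepResult(planId: string, stepId: string, result: unknown, confidence: number): void;
}

function runPlan(
  tools: ResearchTools,
  planId: string,
  doResearch: (step: ResearchStep, context: unknown) => { result: unknown; confidence: number }
): ExitReason {
  for (;;) {
    const response = tools.getNextStep(planId);
    if (response.status !== "step_ready") {
      return response.status; // complete, awaiting review, or failed
    }
    const step = response.step;
    // Only steps after the first have prior results to load.
    const context = step.stepOrder > 1 ? tools.getStepContext(planId, step.stepId) : undefined;
    const outcome = doResearch(step, context);
    tools.submitStepResult(planId, step.stepId, outcome.result, outcome.confidence);
  }
}
```

Because the loop is pull-based, the caller never decides which step comes next; it asks the server each iteration and reacts to whatever status comes back.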
Step Types and AI Platform Behavior
Each step type instructs the AI platform to adopt a different analytical perspective:
| Step Type | What the AI platform does |
|---|---|
| search | Investigate using knowledge, web search, and search_sources for existing artifacts |
| extract | Use extract_data for structured extraction or structure data manually |
| analyze | Review prior steps, identify patterns, assess confidence |
| critique | Challenge the analysis -- missing info, biases, alternative interpretations |
| synthesize | Combine all findings into the final deliverable |
| checkpoint | Present findings to the user via request_user_review, pause for input |
| custom | Follow step-specific instructions (typically follow-up on checkpoint feedback) |
Step Execution Reports
Every step submission includes a stepExecutionReport that captures the AI platform's complete reasoning process:
- thinking -- reasoning narrative, hypotheses, decisions, discarded approaches, confidence factors
- webSearches -- exact queries, result counts, which results were used and why
- webFetches -- URLs fetched, content summaries, extracted facts
- otherToolCalls -- other tool invocations with inputs and results
- subagents -- subagent delegations with task descriptions and linked report IDs
This is the primary observability mechanism for research quality. All fields are required even if empty (use empty arrays, not omitted fields).
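The "required even if empty" rule could be enforced with a small normalizer that fills in whatever the caller left out. This is a sketch: the `StepExecutionReport` type is inferred from the field list, `thinking` is assumed to be a plain string, and the entry shapes are left as `unknown` because they are not specified on this page.

```typescript
interface StepExecutionReport {
  thinking: string; // assumed to be a plain narrative string
  webSearches: unknown[];
  webFetches: unknown[];
  otherToolCalls: unknown[];
  subagents: unknown[];
}

// Fills any missing field with an empty value so no field is ever omitted.
function completeReport(partial: Partial<StepExecutionReport>): StepExecutionReport {
  return {
    thinking: partial.thinking ?? "",
    webSearches: partial.webSearches ?? [],
    webFetches: partial.webFetches ?? [],
    otherToolCalls: partial.otherToolCalls ?? [],
    subagents: partial.subagents ?? [],
  };
}
```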
Dynamic Plan Modification
During execution, the AI platform can modify the plan when circumstances change:
- Add steps -- when gaps are discovered (especially after critique)
- Remove steps -- when a pending step becomes irrelevant
- Reorder steps -- when execution order should change
- Update instructions -- when step context has changed
- Fail a step -- when a step cannot be completed (plan continues)
Every modification requires a modificationRationale explaining why the change was needed.
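The five modification kinds and the mandatory rationale can be modeled as a discriminated union with one guard. Everything in this sketch is hypothetical: the `action` names, the per-action fields, and the `validateModification` helper are illustrative and do not describe the real modify_plan arguments.

```typescript
// Hypothetical request shapes, one per modification kind listed above.
type PlanModification =
  | { action: "add_step"; stepType: string; instructions: string }
  | { action: "remove_step"; stepId: string }
  | { action: "reorder_steps"; stepIds: string[] }
  | { action: "update_instructions"; stepId: string; instructions: string }
  | { action: "fail_step"; stepId: string };

interface ModifyPlanRequest {
  planId: string;
  modification: PlanModification;
  modificationRationale: string;
}

// Rejects requests whose rationale is missing or blank.
function validateModification(request: ModifyPlanRequest): ModifyPlanRequest {
  if (request.modificationRationale.trim().length === 0) {
    throw new Error("modificationRationale is required for every plan modification");
  }
  return request;
}
```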
Phase 6: Output Delivery and Feedback
When get_next_step returns plan_complete, the skill invokes /research-output to generate the final deliverable.
Delivery Flow
- Load full research context via get_research_context
- Generate formatted output according to user's media type preference
- Store the deliverable via store_research_output for later retrieval
- Present to the user
- Ask for feedback: "How did this turn out? Anything I should have done differently?"
Feedback Collection
Two feedback moments bracket the end of the research experience:
Initial feedback -- captured right after delivery. The skill waits for the user's response and records their satisfaction level, feedback summary, improvement suggestions, and whether follow-up research was requested.
Closing feedback -- captured when the conversation shifts away from this research topic. This captures accumulated signals from the post-delivery discussion. Skipped if initial feedback already captured everything.
Both are persisted via submit_research_feedback to the research_feedback table.
Tracking Completion
After output delivery and feedback, the skill updates the execution record:
    log_skill_execution({
      executionId,
      skillName: "research",
      status: "completed",
      metadata: { stepsCompleted, artifactsStored, checkpointFeedback }
    })
Error Handling
If any step fails during execution:
- Check get_plan_status({ planId }) to diagnose the failure
- If recoverable: use modify_plan to skip or replace the step, then continue the loop
- If not recoverable: present what was gathered so far and indicate where research was interrupted
- Always log the failure: log_skill_execution({ executionId, status: "failed", errorMessage })
The output skill has its own error handling: if store_research_output fails, the content is still presented to the user. If submit_research_feedback fails, it is logged silently to avoid disrupting the user experience.
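The output skill's "present even if storage fails" behavior amounts to isolating the storage call. The sketch below assumes synchronous callbacks standing in for store_research_output, the presentation step, and a logger; the `deliver` function and its return shape are hypothetical.

```typescript
// Callbacks stand in for store_research_output, user presentation, and logging.
function deliver(
  content: string,
  store: (content: string) => void,
  present: (content: string) => void,
  log: (message: string) => void
): { stored: boolean } {
  let stored = true;
  try {
    store(content);
  } catch (error) {
    // A storage failure is recorded but must not block delivery.
    stored = false;
    log(`store_research_output failed: ${String(error)}`);
  }
  present(content);
  return { stored };
}
```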
Cross-Session Resume
Research plans persist server-side, enabling resume across sessions:
- Call list_active_plans() to find non-terminal plans
- Present active plans and let the user choose
- Call get_research_context({ planId, sessionId }) to load full state (writes session_resumed audit log)
- Call get_next_step({ planId }) to pick up where it left off
- Continue the execution loop
This works across AI tools -- research started in Claude Code can be resumed from any MCP-compatible client.
Related Pages
- Skill Definitions -- plan templates and step types per skill
- Routing Logic -- how the router decides between scan and deep
- Lifecycle Tracking -- database schema and admin API
- Plan Orchestration -- the state machine governing plan and step transitions
- Tool Protocol -- MCP authentication, tool catalog, and logging middleware