Debug Endpoint
The debugSearch tRPC procedure that returns keyword, semantic, and fused results separately for search quality analysis.
The debugSearch procedure exposes the internals of the query pipeline by returning keyword, semantic, and fused results as three separate arrays. It powers the Hybrid Search Admin View and is useful for diagnosing ranking issues, comparing what the keyword and semantic engines each contribute, and tuning relevance.
How It Differs from search
| Aspect | search | debugSearch |
|---|---|---|
| Response | Single FusedResult[] | Three arrays: keyword, semantic, fused |
| Filters | Full filter set (8 optional fields) | Only sourceTable filter |
| Authorization | read on ResearchArtifact | read on Organization (admin only) |
| Feature tag | "hybrid-search-query" | "hybrid-search-debug" |
| Use case | Production search | Search quality debugging |
The debugSearch procedure uses a separate AI feature tag (hybrid-search-debug) so debug embedding costs are tracked independently from production search costs in the ai_usage table.
Implementation
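The procedure runs the same pipeline stages as search (query classification, query embedding, parallel keyword and semantic retrieval, and reciprocal rank fusion) but returns the intermediate results alongside the fused list: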
debugSearch: authorizedProcedure
  .input(
    z.object({
      query: z.string().min(1).max(1000),
      limit: z.number().min(1).max(50).default(20),
      sourceTable: z.enum([
        "research_artifact", "research_output", "extraction_result",
      ]).optional(),
    }),
  )
  .query(async ({ ctx, input }) => {
    if (ctx.ability.cannot("read", "Organization")) {
      throw new TRPCError({ code: "FORBIDDEN" });
    }
    const queryType = classifyQuery(input.query);
    const { embedding } = await ctx.ai.embedQuery(
      input.query, "hybrid-search-debug"
    );
    const [kwResults, semResults] = await Promise.all([
      keywordSearch({
        query: input.query,
        organizationId: ctx.organizationId,
        limit: input.limit,
        sourceTable: input.sourceTable,
      }),
      semanticSearch(ctx.db, embedding, input.limit, input.sourceTable),
    ]);
    // ... map to RankedResult[] ...
    const fused = reciprocalRankFusion(kwRanked, semRanked, input.limit);
    return { queryType, keyword: kwRanked, semantic: semRanked, fused };
  }),
Unlike search, which requests limit * 2 candidates from each engine, debugSearch requests exactly limit from each. This keeps the three columns comparable in the admin UI.
Response Shape
{
  queryType: "keyword" | "balanced" | "semantic";
  keyword: RankedResult[];   // BM25 results with score and rank
  semantic: RankedResult[];  // pgvector results with similarity and rank
  fused: FusedResult[];      // Merged results with RRF score and source flags
}
Each RankedResult includes the raw score from its source engine and the rank position. The FusedResult adds rrfScore, inKeyword, and inSemantic for provenance tracking.
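To make the provenance fields concrete, here is a minimal sketch of reciprocal rank fusion in its textbook form. It is not the project's reciprocalRankFusion implementation: the fuseSketch name, the id, score, and rank field names, and the constant k = 60 are assumptions for illustration. Only rrfScore, inKeyword, and inSemantic come from this page.

```typescript
// Assumed shapes: id, score, and rank are illustrative field names.
interface RankedResult {
  id: string;
  score: number; // raw engine score (BM25 or vector similarity)
  rank: number;  // 1-based position within its own engine's list
}

interface FusedResult {
  id: string;
  rrfScore: number;
  inKeyword: boolean;
  inSemantic: boolean;
}

// Textbook RRF: each list contributes 1 / (k + rank) for every document it contains.
// k = 60 is a commonly used default and is an assumption here.
function fuseSketch(
  keyword: RankedResult[],
  semantic: RankedResult[],
  limit: number,
  k = 60,
): FusedResult[] {
  const byId = new Map<string, FusedResult>();

  const add = (results: RankedResult[], source: "inKeyword" | "inSemantic"): void => {
    for (const r of results) {
      const entry: FusedResult =
        byId.get(r.id) ?? { id: r.id, rrfScore: 0, inKeyword: false, inSemantic: false };
      entry.rrfScore += 1 / (k + r.rank);
      entry[source] = true; // record which engine returned this document
      byId.set(r.id, entry);
    }
  };

  add(keyword, "inKeyword");
  add(semantic, "inSemantic");

  // Highest fused score first, truncated to the requested limit.
  return [...byId.values()]
    .sort((a, b) => b.rrfScore - a.rrfScore)
    .slice(0, limit);
}
```

A document that appears in both lists accumulates two reciprocal-rank terms, which is why results with both flags set tend to rise toward the top of the fused column.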
The stats Endpoint
The stats procedure complements debugSearch by providing index-level metrics:
stats: authorizedProcedure.query(async ({ ctx }) => {
  if (ctx.ability.cannot("read", "Organization")) {
    throw new TRPCError({ code: "FORBIDDEN" });
  }
  const rows = await ctx.db
    .select({
      sourceTable: documentChunk.sourceTable,
      count: count(),
    })
    .from(documentChunk)
    .groupBy(documentChunk.sourceTable);
  const totalChunks = rows.reduce((sum, r) => sum + r.count, 0);
  return { totalChunks, bySourceTable: rows };
});
This returns the total number of indexed chunks and the breakdown by source table. The admin view uses this to display KPI cards and the "Chunks by Content Type" table.
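As an illustration of how a client could turn this response into per-table figures, the sketch below adds each table's share of the total to its row. The StatsResponse type mirrors the return value of the procedure, but typing sourceTable as a plain string and the withShare helper itself are assumptions, not part of the codebase.

```typescript
// Mirrors the stats return value; sourceTable is typed loosely as string (assumption).
interface StatsResponse {
  totalChunks: number;
  bySourceTable: { sourceTable: string; count: number }[];
}

// Hypothetical helper: annotates each row with its fraction of all indexed chunks.
function withShare(stats: StatsResponse) {
  return stats.bySourceTable.map((row) => ({
    ...row,
    // Guard against an empty index to avoid dividing by zero.
    share: stats.totalChunks === 0 ? 0 : row.count / stats.totalChunks,
  }));
}
```

The zero check matters for a freshly provisioned organization, where no chunks have been indexed yet and every share would otherwise be NaN.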
When to Use Debug vs. Production Search
Use debugSearch when:
- Investigating why a specific document ranks high or low
- Comparing how keyword and semantic engines interpret the same query
- Evaluating whether a query benefits more from BM25 or vector similarity
- Testing filter behavior on a new content type
Use search for all production application code -- it returns only the fused results and supports the full filter set.
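For the first two debugging scenarios, it can help to line up the keyword and semantic arrays by document. The following is a hypothetical helper, not part of the API: it assumes each result carries an id field (a name this page does not confirm) alongside its rank, and reports which documents both engines returned and which only one of them found.

```typescript
// Assumed minimal shape of a debugSearch result row; id is an illustrative field name.
interface RankedRow {
  id: string;
  rank: number;
}

interface EngineComparison {
  shared: { id: string; keywordRank: number; semanticRank: number }[];
  keywordOnly: string[];
  semanticOnly: string[];
}

// Hypothetical helper: splits results into documents found by both engines
// and documents found by only one of them.
function compareEngines(keyword: RankedRow[], semantic: RankedRow[]): EngineComparison {
  const semanticRankById = new Map<string, number>(
    semantic.map((r): [string, number] => [r.id, r.rank]),
  );
  const keywordIds = new Set(keyword.map((r) => r.id));

  const shared: EngineComparison["shared"] = [];
  const keywordOnly: string[] = [];
  for (const r of keyword) {
    const semanticRank = semanticRankById.get(r.id);
    if (semanticRank === undefined) {
      keywordOnly.push(r.id);
    } else {
      shared.push({ id: r.id, keywordRank: r.rank, semanticRank });
    }
  }

  const semanticOnly = semantic.filter((r) => !keywordIds.has(r.id)).map((r) => r.id);
  return { shared, keywordOnly, semanticOnly };
}
```

A large gap between keywordRank and semanticRank for a shared document, or a document that shows up in only one list, is usually the first place to look when a fused ranking seems surprising.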
Related Pages
- Hybrid Search Admin View -- the admin UI that renders these results in a three-column layout
- Query Pipeline Overview -- full pipeline flow
- Fusion Algorithm -- how the fused results are calculated