Query Performance

PostgreSQL performance patterns used in the codebase -- initPlan caching, parallel queries, pagination, and batch inserts.

initPlan Caching

The RLS policies use a subselect wrapper around current_setting:

// packages/db/src/schema/base.ts
const tenantIdExpr = sql`(SELECT current_setting('app.tenant_id'))`;

This is not an accident. When PostgreSQL sees current_setting('app.tenant_id') as a bare function call, it may evaluate it for every row during a sequential scan. Wrapping it in (SELECT ...) tells the query planner to treat it as an initPlan -- a subquery evaluated once and cached for the entire statement execution.

For a table with 10,000 rows, this is the difference between 1 function call and 10,000 function calls per query. The pattern is documented in the schema conventions as "Guide 3.1" and is mandatory for all RLS policy expressions.

Rule: If you write raw SQL that references current_setting('app.tenant_id'), always wrap it in a SELECT:

-- Correct
WHERE organization_id = (SELECT current_setting('app.tenant_id'))

-- Incorrect (evaluated per-row)
WHERE organization_id = current_setting('app.tenant_id')

Parallel Queries

Multiple independent queries within the same transaction can run in parallel using Promise.all:

const [items, [totalRow]] = await Promise.all([
  ctx.db
    .select()
    .from(researchArtifact)
    .where(where)
    .orderBy(desc(researchArtifact.createdAt))
    .limit(input.limit)
    .offset(input.offset),
  ctx.db.select({ total: count() }).from(researchArtifact).where(where),
]);

This pattern is used throughout the research plan router for paginated endpoints that need both the data page and the total count.

When it works: Both queries use the same transaction (ctx.db), which is backed by a single PostgreSQL connection. PostgreSQL processes commands sequentially on a single connection, so Promise.all here does not achieve true parallel execution at the database level -- but it does allow the Node.js event loop to send both queries without awaiting each one individually, which can reduce roundtrip overhead when the database is on a network hop.

When to use it: Pair a data query with a count query, or run independent reads that do not depend on each other's results. The pattern is safe inside a transaction because both queries see the same snapshot.

The hybrid search router takes this further by parallelizing a Typesense keyword search with a pgvector semantic search:

const [keywordResults, semanticResults] = await Promise.all([
  keywordSearch({ query: input.query, organizationId: ctx.organizationId, ... }),
  semanticSearch(ctx.db, embedding, input.limit * 2, input.sourceTable),
]);

Here, the keyword search hits an external service (Typesense) while the semantic search hits PostgreSQL, so true parallelism is achieved.

Pagination

All paginated endpoints follow the same pattern: LIMIT + OFFSET + parallel count.

.input(z.object({
  limit: z.number().min(1).max(100).default(20),
  offset: z.number().min(0).default(0),
}))
.query(async ({ ctx, input }) => {
  const where = eq(researchOutput.planId, input.planId);

  const [items, [totalRow]] = await Promise.all([
    ctx.db.select().from(researchOutput)
      .where(where)
      .orderBy(desc(researchOutput.createdAt))
      .limit(input.limit)
      .offset(input.offset),
    ctx.db.select({ total: count() }).from(researchOutput).where(where),
  ]);

  return { items, total: totalRow?.total ?? 0 };
});

The limit is capped at 100 via Zod validation to prevent clients from requesting unbounded result sets.

OFFSET/LIMIT trade-offs: For small datasets (hundreds to low thousands of rows per tenant), OFFSET-based pagination is straightforward and sufficient. If a table grows beyond tens of thousands of rows per tenant, consider cursor-based pagination (keyset pagination using WHERE created_at < $cursor ORDER BY created_at DESC LIMIT $n).

Batch Inserts in Loops

The MCP tool create-research-plan inserts steps sequentially in a loop:

for (let i = 0; i < steps.length; i++) {
  const step = steps[i];
  if (!step) continue;
  const stepId = randomUUID();
  stepIds.push(stepId);

  await tx.insert(planStep).values({
    id: stepId,
    planId,
    organizationId,
    stepOrder: i + 1,
    stepType: step.stepType,
    status: "pending",
    instructions: step.instructions,
  });
}

This is acceptable for small batches (research plans have single-digit step counts). For larger batch inserts, Drizzle supports multi-row .values([...]) which generates a single INSERT statement:

await tx.insert(planStep).values(
  steps.map((step, i) => ({
    id: randomUUID(),
    planId,
    organizationId,
    stepOrder: i + 1,
    stepType: step.stepType,
    status: "pending",
    instructions: step.instructions,
  })),
);

The single-statement approach is more efficient but makes it harder to collect the generated IDs for later use. Choose based on whether you need the IDs.

Background Job Inserts

The Inngest index-content function inserts document chunks one at a time inside a withTenantContext call:

await withTenantContext(data.organizationId, data.userId, async (tx) => {
  for (const row of chunkRows) {
    await tx.insert(documentChunk).values(row);
  }
});

This keeps the transaction short per-insert while still benefiting from RLS. For the current content volumes (single-digit to low tens of chunks per document), sequential inserts are fine. At scale, this could be batched into multi-row inserts.

Conditional Aggregation

PostgreSQL's FILTER (WHERE ...) clause is cleaner than CASE expressions for conditional counts:

const rows = await ctx.db
  .select({
    totalCalls: count(),
    errorCount: sql<number>`count(*) filter (where ${aiUsage.stopReason} = 'error')`,
    streamingCount: sql<number>`count(*) filter (where ${aiUsage.streaming} = true)`,
  })
  .from(aiUsage);

This computes multiple conditional aggregates in a single table scan.

Connection Pool Sizing

The pool is set to max: 5 connections. This is intentionally small because:

The production VM runs a single Node.js process
Cloud SQL's db-g1-small tier supports ~25 connections max
withTenantContext holds a connection for the duration of a request, so 5 concurrent tenant-scoped requests is the practical limit per process

If the application moves to multiple processes or serverless functions, the pool size should be tuned or a connection pooler (PgBouncer, Supavisor) should be introduced.