Table Conventions
Column types, naming patterns, shared column helpers, indexing strategies, and JSONB typing used across all Trovella tables.
Every table in the schema follows a consistent set of conventions for column types, naming, timestamps, and indexing. These conventions exist to keep the schema predictable for both human developers and AI agents.
Primary Keys
All primary keys are text("id").primaryKey(). This matches Better Auth's ID format (text, not UUID) and avoids ::uuid casting at every join boundary.
export const researchPlan = pgTable("research_plan", {
id: text("id").primaryKey(),
// ...
});
IDs are generated by the application layer (Better Auth for auth tables, nanoid or similar for feature tables). The database does not generate IDs.
Shared Column Helpers
Two column spread objects in packages/db/src/schema/base.ts provide standard timestamp and tenant columns.
tenantColumns
Used by all tenant-scoped tables. Adds organization_id, created_at, and updated_at:
export const tenantColumns = {
organizationId: text("organization_id").notNull(),
createdAt: timestamp("created_at", { withTimezone: true }).defaultNow().notNull(),
updatedAt: timestamp("updated_at", { withTimezone: true }).defaultNow().notNull(),
} as const;
Usage in a table definition:
import { tenantColumns, tenantPolicies } from "./base";
export const myTable = pgTable(
"my_table",
{
id: text("id").primaryKey(),
...tenantColumns,
name: text("name").notNull(),
},
(table) => [...tenantPolicies("my_table", "organization_id")],
).enableRLS();
baseColumns
Used by global (non-tenant) tables. Adds only created_at and updated_at:
export const baseColumns = {
createdAt: timestamp("created_at", { withTimezone: true }).defaultNow().notNull(),
updatedAt: timestamp("updated_at", { withTimezone: true }).defaultNow().notNull(),
} as const;
When to Use Which
| Table type | Spread | RLS | Example |
|---|---|---|---|
| Tenant-scoped feature table | tenantColumns | Yes | researchPlan, aiUsage |
| Global reference table | baseColumns | No | aiModel |
| Better Auth table | Neither (manual) | Varies | user, member |
Better Auth tables (user, session, account, verification, organization) define their own timestamp columns without withTimezone: true because Better Auth's plugin generates the schema. The member and invitation tables are an exception -- they are tenant-scoped and use tenantPolicies() but define their columns manually to match Better Auth's expectations.
Timestamp Conventions
All timestamps use timestamp("field", { withTimezone: true }) with two standard columns:
created_at-- set once at row creation via.defaultNow().notNull()updated_at-- set at creation via.defaultNow().notNull(), updated by application code (not a database trigger)
Optional temporal columns follow the same { withTimezone: true } pattern:
completedAt: timestamp("completed_at", { withTimezone: true }),
startedAt: timestamp("started_at", { withTimezone: true }),
expiresAt: timestamp("expires_at", { withTimezone: true }),
revokedAt: timestamp("revoked_at", { withTimezone: true }),
lastUsedAt: timestamp("last_used_at", { withTimezone: true }),
There is no automatic updated_at trigger. The application is responsible for updating this column. This is a deliberate choice -- it avoids hidden side effects in transactions and makes update timing explicit in the code.
Column Types by Use Case
| Use case | Drizzle type | PostgreSQL type | Notes |
|---|---|---|---|
| Primary key | text("id") | TEXT | Application-generated |
| Foreign key | text("field_id") | TEXT | Always with .references() |
| Short text | text("name") | TEXT | No varchar -- PostgreSQL treats them identically |
| Long text | text("content") | TEXT | No length limits |
| Boolean | boolean("active") | BOOLEAN | |
| Integer | integer("count") | INTEGER | |
| Decimal (money) | numeric("cost", { precision: 10, scale: 6 }) | NUMERIC(10,6) | Returns string in JS -- use for financial precision |
| Decimal (score) | real("confidence") | REAL | 4-byte float, returns native JS number |
| Date (no time) | date("effective_from") | DATE | Used for pricing effective dates |
| Enum | myEnum("field") | Custom enum | See Enums |
| Semi-structured data | jsonb("data").$type<T>() | JSONB | See below |
| Embedding vector | halfvec("embedding") | halfvec(1536) | Custom type, see below |
| UUID (rare) | uuid("correlation_id") | UUID | Only for correlation IDs with .defaultRandom() |
numeric vs real for Decimal Values
Drizzle's numeric() always returns JavaScript strings because PostgreSQL NUMERIC can exceed Number.MAX_SAFE_INTEGER. This means code working with numeric columns must parse values explicitly.
For financial data (cost tracking), numeric is correct -- precision matters.
For confidence scores and other approximate values, real (4-byte float) returns native JavaScript numbers. The schema switched from numeric to real for confidence columns during early development to avoid unnecessary string parsing.
JSONB Column Typing
JSONB columns use Drizzle's .$type<T>() to provide TypeScript type safety without database-level schema enforcement:
// Typed object
supportedFeatures: jsonb("supported_features")
.$type<{
thinking?: boolean;
vision?: boolean;
streaming?: boolean;
batch?: boolean;
toolUse?: boolean;
structuredOutputs?: boolean;
}>()
.notNull()
.default({}),
// Typed array
thinkingBlocks: jsonb("thinking_blocks")
.$type<Array<{ thinking: string; tokens?: number }>>()
.default([]),
// Flexible record
resultSummary: jsonb("result_summary").$type<Record<string, unknown>>(),
The convention is:
- Use a specific type when the shape is well-known and stable (e.g.,
supportedFeatures) - Use
Record<string, unknown>when the shape varies by context (e.g.,resultSummary,eventData,metadata) - Use
string | unknown[]when the data can be either format (e.g.,systemPromptinaiCallDetails) - Validate JSONB data at the application layer with Zod, not in the database
Vector Columns (pgvector)
The document_chunk table stores embedding vectors using a custom Drizzle type for pgvector's halfvec:
const halfvec = customType<{ data: number[]; driverParam: string }>({
dataType() {
return "halfvec(1536)";
},
toDriver(value: number[]): string {
return `[${value.join(",")}]`;
},
fromDriver(value: unknown): number[] {
const str = String(value);
return str.slice(1, -1).split(",").map(Number);
},
});
Key design choices:
halfvecinstead ofvector: Half-precision (2 bytes/dimension) saves ~50% storage and memory with negligible recall loss on normalized embeddings- Dimension 1536: Matches Gemini Embedding 2's output at the MRL-truncated size, balancing cost and quality over the full 3072 dimensions
- HNSW indexes (added via raw SQL in migrations): Better recall-latency than IVFFlat without periodic retraining
Naming Conventions
Table names: snake_case, singular noun or noun phrase. Examples: research_plan, plan_step, ai_usage, document_chunk.
Column names: snake_case in SQL, camelCase in TypeScript (Drizzle maps automatically). Examples: organization_id in SQL becomes organizationId in TypeScript.
Index names: tableName_columnName_idx for regular indexes, tableName_columnName_uidx for unique indexes.
index("ai_usage_organizationId_idx").on(table.organizationId),
uniqueIndex("ai_model_apiModelId_uidx").on(table.apiModelId),
Foreign key column names: Match the referenced table name with _id suffix. Examples: user_id, plan_id, model_id, source_artifact_id.
Indexing Strategy
Every tenant-scoped table indexes organization_id as a standalone column. This is critical for RLS performance -- the policy evaluates organization_id = (SELECT current_setting('app.tenant_id')) on every row access, and the index makes that comparison use an index scan instead of a sequential scan.
Beyond the tenant index, each table indexes columns used in:
- Foreign key lookups: Every FK column gets an index (e.g.,
plan_step_planId_idx) - Status filtering: Status/enum columns used in queries (e.g.,
research_plan_status_idx) - Time-range queries:
created_atis indexed on tables queried by time range (e.g.,ai_usage_createdAt_idx) - Unique lookups: Unique constraints with unique indexes (e.g.,
ai_model_apiModelId_uidx)
Composite indexes are used sparingly. The document_chunk table has one for the polymorphic source reference:
index("document_chunk_source_idx").on(table.sourceTable, table.sourceId),