Docker Containers
Container architecture, health checks, the multi-stage Docker build, and how production and local development stacks differ.
Production Stack
The production stack is defined in infra/docker-compose.prod.yml and runs on the Compute Engine VM at /opt/trovella/.
Container Dependency Graph
caddy
  depends_on:
    web (service_healthy)
    typesense (service_started)
web
  depends_on:
    cloud-sql-proxy (service_started)
    inngest (service_started)
Compose does not start Caddy until the web container passes its health check, so Caddy never accepts traffic before the app is ready. The web container waits only for the Cloud SQL Proxy and Inngest containers to start, not for them to be healthy (the condition is service_started, not service_healthy).
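In Compose file syntax, the graph above corresponds to the long form of depends_on. The fragment below is a minimal sketch that shows only the dependency keys; the real service definitions in infra/docker-compose.prod.yml carry more settings than this.

```yaml
services:
  caddy:
    depends_on:
      web:
        condition: service_healthy    # wait for the web health check to pass
      typesense:
        condition: service_started    # only wait for the container to be running
  web:
    depends_on:
      cloud-sql-proxy:
        condition: service_started
      inngest:
        condition: service_started
```

Because service_started only guarantees that a container is running, the web app may briefly see a proxy or Inngest endpoint that is not yet accepting connections. The start_period on the web health check gives it time to settle.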
Container Details
caddy (caddy:2-alpine)
Off-the-shelf Caddy image. Mounts the Caddyfile read-only and persists TLS certificates in named volumes (caddy-data, caddy-config). Exposes ports 80, 443, and 443/udp (HTTP/3) to the host.
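A service definition matching this description might look like the sketch below. The host-side Caddyfile path is an assumption; /etc/caddy/Caddyfile, /data, and /config are the locations the official Caddy image uses for its configuration file, certificate storage, and runtime configuration.

```yaml
services:
  caddy:
    image: caddy:2-alpine
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"                 # HTTP/3 (QUIC)
    volumes:
      # Host path ./Caddyfile is an assumption for illustration.
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy-data:/data              # TLS certificates and OCSP staples
      - caddy-config:/config          # Caddy runtime configuration

volumes:
  caddy-data:
  caddy-config:
```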
web (us-central1-docker.pkg.dev/trovella-shared/trovella/web:latest)
Custom-built Next.js standalone image. Reads all configuration from /opt/trovella/.env via env_file. Exposes port 3000 only to the Docker network (not the host). Includes a health check:
healthcheck:
  test:
    [
      "CMD",
      "node",
      "-e",
      "fetch('http://localhost:3000/api/health').then(r => { if (!r.ok) process.exit(1) }).catch(() => process.exit(1))",
    ]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 15s
The health check calls /api/health, which verifies database, Redis, and Typesense connectivity.
cloud-sql-proxy (gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.21.2)
Google's official Cloud SQL Auth Proxy. Authenticates to Cloud SQL using the VM service account's IAM credentials. The connection name is passed via the CLOUD_SQL_CONNECTION_NAME environment variable from .env. Listens on port 5432 on the Docker network.
command:
  - ${CLOUD_SQL_CONNECTION_NAME}
  - --address=0.0.0.0
  - --port=5432
typesense (typesense/typesense:27.1)
Off-the-shelf Typesense image. Data persists in a named volume (typesense-data). The API key is set via the TYPESENSE_API_KEY environment variable. Only exposed on the Docker network (port 8108).
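As a rough sketch of such a service, assuming the data directory is /data inside the container (the description above does not name the path) and that the directory is supplied through Typesense's TYPESENSE_-prefixed environment variables:

```yaml
services:
  typesense:
    image: typesense/typesense:27.1
    restart: unless-stopped
    environment:
      TYPESENSE_API_KEY: ${TYPESENSE_API_KEY}   # interpolated from .env
      TYPESENSE_DATA_DIR: /data                 # assumed path
    volumes:
      - typesense-data:/data
    expose:
      - "8108"                                  # Docker network only, no host port

volumes:
  typesense-data:
```

Using expose rather than ports keeps 8108 reachable from other containers without publishing it on the host.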
inngest (inngest/inngest:latest)
Self-hosted Inngest server (a single Go binary), run in production mode (inngest start, not inngest dev). The event and signing keys are passed as CLI flags, and the SDK URL points at the web container:
command: >
  inngest start
  --host 0.0.0.0
  --port 8288
  --event-key ${INNGEST_EVENT_KEY}
  --signing-key ${INNGEST_SIGNING_KEY}
  --sdk-url http://web:3000/api/inngest
For details on Inngest function definitions and job patterns, see Data & Storage -- Background Jobs.
Restart Policy
All containers use restart: unless-stopped. When the VM reboots (for maintenance or a machine type upgrade), the Docker daemon restarts every container that was not explicitly stopped beforehand. This is how the system recovers after a reboot without manual intervention.
Named Volumes
| Volume | Container | Purpose |
|---|---|---|
| caddy-data | caddy | TLS certificates and OCSP staples |
| caddy-config | caddy | Caddy runtime configuration |
| typesense-data | typesense | Search index data |
The web container is stateless. The Cloud SQL Proxy and Inngest containers store no local data.
Docker Build Pipeline
The Next.js app image is built by the build-push CI job using a multi-stage Dockerfile at apps/web/Dockerfile.
Build Stages
base (node:22-alpine + pnpm 10.32.1)
|
v
deps (pnpm install --frozen-lockfile, monorepo package.json files only)
|
v
builder (pnpm turbo build --filter=@repo/web)
|
v
runner (standalone server.js, ~150 MB final image)
base -- Enables corepack and activates pnpm. Shared by deps and builder.
deps -- Copies only pnpm-workspace.yaml, pnpm-lock.yaml, and every package's package.json for cache-efficient dependency installation. No source code is copied in this stage.
builder -- Copies the full source tree and runs the Turbo build filtered to @repo/web. Build-time arguments for NEXT_PUBLIC_* variables are injected here:
ARG NEXT_PUBLIC_BETTER_AUTH_URL=https://trovella.ai
ARG NEXT_PUBLIC_SENTRY_DSN
ARG SENTRY_AUTH_TOKEN
runner -- Minimal production image. Copies only the Next.js standalone output (server.js), static assets, and public files. Runs as a non-root nextjs user (UID 1001). Final image size is approximately 150 MB.
Image Registry
Images are pushed to Google Artifact Registry:
us-central1-docker.pkg.dev/trovella-shared/trovella/web
Each build is tagged with both the Git commit SHA and latest. The VM always pulls latest. For rollback, the commit SHA tag provides a stable reference to any previous build.
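One way to use the SHA tag for a rollback is to make the image tag configurable. The sketch below assumes a hypothetical WEB_IMAGE_TAG variable that is not part of the documented setup; Compose falls back to latest when it is unset or empty.

```yaml
services:
  web:
    # WEB_IMAGE_TAG is a hypothetical variable used for illustration.
    # Set it to a commit SHA to pin a previous build; unset means "latest".
    image: us-central1-docker.pkg.dev/trovella-shared/trovella/web:${WEB_IMAGE_TAG:-latest}
```

With this in place, setting WEB_IMAGE_TAG to a commit SHA and re-running docker compose up -d would recreate the web container from that build.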
Build-Time vs Runtime Configuration
| Variable | When Set | How |
|---|---|---|
| NEXT_PUBLIC_BETTER_AUTH_URL | Build time | Docker ARG in builder stage |
| NEXT_PUBLIC_SENTRY_DSN | Build time | Docker ARG, read from Secret Manager in CI |
| DATABASE_URL | Runtime | .env file on VM, synced from Secret Manager |
| ANTHROPIC_API_KEY | Runtime | .env file on VM |
| All other secrets | Runtime | .env file on VM |
NEXT_PUBLIC_* variables must be set at build time because Next.js inlines them into the client bundle. All other configuration is runtime, read from the .env file that sync-secrets-vm.sh generates.
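The split can be made concrete with a Compose build definition. This is a sketch only: the file name, the local image tag, the localhost URL, and the assumption that the build context is the monorepo root are all illustrative, and the documented pipeline builds the image in the build-push CI job rather than through Compose.

```yaml
# docker-compose.build.yml (hypothetical file, not part of the documented setup)
services:
  web:
    image: us-central1-docker.pkg.dev/trovella-shared/trovella/web:local   # hypothetical tag
    build:
      context: .                        # assumed: monorepo root
      dockerfile: apps/web/Dockerfile
      target: runner
      args:
        # Build time: inlined into the client bundle by Next.js.
        NEXT_PUBLIC_BETTER_AUTH_URL: http://localhost:3000
        NEXT_PUBLIC_SENTRY_DSN: ${NEXT_PUBLIC_SENTRY_DSN:-}
    env_file:
      # Runtime: read when the container starts (DATABASE_URL, API keys, ...).
      - .env
```

Changing a value under args requires rebuilding the image, while changing a value in .env only requires recreating the container.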
Local Development Stack
The local stack (docker-compose.yml at the monorepo root) provides the same services with development-friendly defaults:
| Difference | Production | Local |
|---|---|---|
| Database | Cloud SQL via Auth Proxy | Local pgvector/pgvector:pg18 container |
| Redis | Upstash (managed) | Local redis:8-alpine container |
| Email | Resend (deferred) | Mailpit (localhost:8025 for UI) |
| Inngest mode | inngest start (production) | inngest dev (development, no auth) |
| Next.js | Docker container with standalone build | pnpm dev on the host (Turbopack) |
| Ports | Only 80/443 via Caddy | All services on host ports |
Start with pnpm docker:up and stop with pnpm docker:down. The Next.js dev server runs on the host machine (not in a container), connecting to the containerized services.
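The services that differ from production could be declared roughly as follows. This is a sketch, not the contents of the actual docker-compose.yml: the Mailpit image name, the credentials, the published ports other than 8025, and the host.docker.internal mapping are assumptions.

```yaml
services:
  postgres:
    image: pgvector/pgvector:pg18
    environment:
      POSTGRES_PASSWORD: postgres       # assumed dev-only credential
    ports:
      - "5432:5432"
  redis:
    image: redis:8-alpine
    ports:
      - "6379:6379"
  mailpit:
    image: axllent/mailpit              # assumed image name
    ports:
      - "8025:8025"                     # web UI
      - "1025:1025"                     # SMTP (assumed port mapping)
  inngest:
    image: inngest/inngest:latest
    # Dev mode, pointed at the Next.js dev server running on the host.
    command: inngest dev -u http://host.docker.internal:3000/api/inngest
    extra_hosts:
      - "host.docker.internal:host-gateway"   # needed on Linux hosts
    ports:
      - "8288:8288"
```

Publishing every port on the host is what lets pnpm dev, which runs outside Docker, reach the services on localhost.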
Adding a New Service Container
- Add the service to infra/docker-compose.prod.yml with restart: unless-stopped
- Add a corresponding entry to docker-compose.yml for local development
- If the service needs secrets, add them to infra/sync-secrets-vm.sh and infra/environments/prod/main.tf
- If the service needs a persistent volume, add a named volume
- If the web container needs to reach the service, add a depends_on relationship
- Update the Compute Overview memory budget table
- Deploy: the CI pipeline SCPs docker-compose.prod.yml to the VM and runs docker compose up -d --remove-orphans
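Applied to infra/docker-compose.prod.yml, the Compose-related steps of this checklist might produce a fragment like the one below. The service name, image, port, and volume are hypothetical placeholders.

```yaml
services:
  example-service:                      # hypothetical service name
    image: example/image:1.0.0          # hypothetical image, pinned to a version
    restart: unless-stopped
    env_file:
      - .env                            # only if the service needs secrets
    volumes:
      - example-data:/data              # only if the service persists data
    expose:
      - "9000"                          # hypothetical port, Docker network only

  web:
    depends_on:
      example-service:
        condition: service_started      # only if web needs to reach the service

volumes:
  example-data:
```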