ADR-012: CI/CD Pipeline -- Build, Test, Deploy
Decision record for the CI/CD pipeline structure, parallel jobs, local CI parity, and dependency automation.
Status: Accepted
Date: 2026-03-25 (initial pipeline in TRO-7), parallelized 2026-04-01, ci:check local mirror added 2026-04-03
Deciders: Kyle Olson (Solo Founder)
Decision
- CI platform: GitHub Actions with cancel-in-progress concurrency
- Pipeline structure: Five jobs -- quality (10 checks), docs (3 checks), build-push (Docker image), migrate-prod (Cloud SQL migrations), deploy-prod (VM deployment)
- Deploy gate: deploy-prod requires quality, build-push, and migrate-prod to all pass; docs runs independently and does not gate deployment
- Local parity: pnpm ci:check runs the same quality checks locally that CI runs remotely
- Docker builds: BuildKit with GitHub Actions cache layer; images pushed to Google Artifact Registry tagged with commit SHA + latest
- Database migrations: Cloud SQL Auth Proxy in CI (not IP allowlisting)
- VM deployment: SCP files + SSH via IAP tunnel, ~1 minute
- Dependency automation: Renovate with weekend schedule, dev dependencies automerge, runtime dependencies require manual merge
- Pre-commit: Husky + lint-staged for Prettier and ESLint auto-fix on staged files
- Branch protection: CI status check required on main, force pushes blocked, no direct pushes
Context
The CI/CD pipeline was one of the first things built (TRO-7, Week 0) because the founder wanted automated quality gates from day one. The initial version was minimal -- format, lint, typecheck, test, build -- running sequentially in a single job. It took about 1 minute.
Over Phase 0, the pipeline grew as quality checks were added: dependency-cruiser for package boundary validation, Knip for dead code detection, jscpd for code duplication, RLS integration tests requiring a live Postgres database, Typesense service containers for search tests, and a documentation quality job. By Phase 0's end, the monolithic pipeline took 11 minutes.
The restructuring split the pipeline into parallel jobs. quality and build-push run concurrently (since build-push does not depend on quality checks for the Docker build itself -- only deploy-prod gates on both). The docs job was separated because prose linting should not block production deploys.
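The gating relationship described above can be sketched as a small predicate. This is an illustrative model only, not the workflow itself: the type and function names (JobName, JobResult, canDeploy) are hypothetical, and the sketch assumes every required job reports either success or failure.

```typescript
type JobName = "quality" | "docs" | "build-push" | "migrate-prod" | "deploy-prod";
type JobResult = "success" | "failure";

// Mirrors the needs list recorded in the Validation table.
// docs is deliberately absent: it runs, but never gates deployment.
const DEPLOY_NEEDS: ReadonlyArray<JobName> = ["quality", "build-push", "migrate-prod"];

// deploy-prod may start only when every prerequisite job succeeded.
// A job with no recorded result counts as not passed.
function canDeploy(results: Partial<Record<JobName, JobResult>>): boolean {
  return DEPLOY_NEEDS.every((job) => results[job] === "success");
}
```

With this model, a failing docs job leaves canDeploy true as long as the three prerequisites succeeded, while a single failing prerequisite blocks the deploy.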
A separate problem emerged: the local development workflow did not match CI. The founder found that changes with Knip violations consistently passed locally and then failed in CI. Root cause: the local pre-commit workflow only ran lint + typecheck + test, missing 5 of the 10 CI checks. The pnpm ci:check script was created to mirror CI, and CLAUDE.md was updated to require all AI agents to run it before every commit.
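A minimal sketch of what a sequential, fail-fast local mirror could look like. The check names follow the ci:check list in the Validation table; the pnpm script names and the runChecks helper are assumptions for illustration, not the project's actual script.

```typescript
// Check names follow the ci:check list in this ADR.
// The pnpm script names on the right are hypothetical.
const CHECKS: ReadonlyArray<{ name: string; command: string }> = [
  { name: "format", command: "pnpm format:check" },
  { name: "lint", command: "pnpm lint" },
  { name: "dep-cruise", command: "pnpm dep-cruise" },
  { name: "dead-code", command: "pnpm dead-code" },
  { name: "duplication", command: "pnpm duplication" },
  { name: "typecheck", command: "pnpm typecheck" },
  { name: "test", command: "pnpm test" },
];

// A runner executes one command and returns its exit code.
// Injecting it keeps the sketch independent of any real shell.
type Runner = (command: string) => number;

// Runs the checks in order and stops at the first non-zero exit code.
function runChecks(run: Runner): { passed: string[]; failed: string | null } {
  const passed: string[] = [];
  for (const check of CHECKS) {
    if (run(check.command) !== 0) {
      return { passed, failed: check.name };
    }
    passed.push(check.name);
  }
  return { passed, failed: null };
}
```

In a real script the runner could wrap Node's child_process.spawnSync and return its status field. Keeping one ordered list as the single source of truth is what prevents the local and CI check sets from drifting apart.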
Decision Drivers
- Fast feedback on PRs -- developers and AI agents need results within a few minutes, not the 11 minutes the sequential pipeline took
- No path to production without quality gates -- deployment must be impossible when any quality check fails
- Local/CI parity -- what passes locally must pass in CI; divergence wastes time and erodes trust
- Cost-conscious -- GitHub Actions free tier (2,000 minutes/month) must be sufficient for a solo developer
- Security posture preserved -- deployment must not require exposing SSH ports or database IPs
Alternatives Considered
Monolithic single-job pipeline vs parallel jobs
The original pipeline ran all checks sequentially in one job (simpler YAML, one failure point). At 11 minutes, it was the longest wait in the development loop.
Why parallel won: Splitting into parallel jobs reduced wall time from 11 minutes to ~3.5 minutes. The quality job still runs checks sequentially (later checks depend on earlier ones), but build-push runs in parallel because the Docker build is independent. A 3x speedup justified the additional YAML complexity.
Cloud SQL IP allowlisting vs Auth Proxy for migrations
GitHub Actions runners have ephemeral IPs from a large shared pool. Allowlisting the full range would negate the security posture of restricting Cloud SQL to the VM's static IP.
Why Auth Proxy won: It keeps Cloud SQL reachable only from known, authenticated sources. The proxy adds ~15 seconds of setup (download binary, start, health check), negligible in a CI pipeline. Authentication uses Workload Identity Federation -- keyless, no stored credentials.
Full Renovate automerge vs devDeps-only automerge
The initial Renovate configuration automerged all patch/minor updates. The founder pushed back: runtime dependencies ship to production, and a minor version bump could change behavior in ways tests do not cover.
Revised policy (three tiers):
- DevDependency patch/minor -- automerge when CI passes (affects dev environment only)
- Runtime dependency (any version) -- labeled dependency-runtime, requires manual merge (ships to production)
- Major updates (any dep type) -- labeled dependency-major, requires manual review (often contains breaking changes)
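The three tiers can be expressed as a small decision function. This is a sketch of the policy as written, not the Renovate configuration: classifyUpdate and its types are hypothetical names, and giving a runtime major update both labels is an assumption the ADR does not state explicitly.

```typescript
type DepType = "dependencies" | "devDependencies";
type UpdateType = "patch" | "minor" | "major";

interface MergePolicy {
  automerge: boolean; // true means merge automatically once CI passes
  labels: string[];
}

function classifyUpdate(depType: DepType, updateType: UpdateType): MergePolicy {
  const labels: string[] = [];
  // Runtime dependencies ship to production, so every update is labeled.
  if (depType === "dependencies") labels.push("dependency-runtime");
  // Major updates get their own label regardless of dependency type.
  if (updateType === "major") labels.push("dependency-major");
  // Only dev-only patch/minor updates are eligible for automerge.
  const automerge = depType === "devDependencies" && updateType !== "major";
  return { automerge, labels };
}
```

For example, classifyUpdate("devDependencies", "minor") yields automerge with no labels, while any update to a runtime dependency yields a manual merge.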
Key Implementation Decisions
Docs job does not gate deployment
Documentation quality failures (prose lint violations, broken links, stale docs) do not block production deploys. A typo in a guide is not worth holding back a security fix. The docs job still runs on every PR so failures are visible, just not blocking.
Expiring TODO comments
The unicorn/expiring-todo-comments ESLint rule requires every TODO, FIXME, or HACK comment to include an expiration date [YYYY-MM-DD] or a Linear ticket [TRO-NNN]. When the date passes, ESLint promotes the comment to an error and CI fails. ignoreDatesOnPullRequests: true means expiration is only checked on main, not on PR branches.
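To make the policy concrete, here is a standalone sketch of the check as this ADR describes it. It is not the ESLint rule's implementation: checkTodo and TodoStatus are hypothetical names, and the sketch assumes a TODO expires once today's UTC date is strictly after the tagged date. It does not validate that the tagged date is a real calendar date.

```typescript
type TodoStatus = "ok" | "expired" | "missing-condition";

const DATE_TAG = /\[(\d{4}-\d{2}-\d{2})\]/; // e.g. [2026-04-01]
const TICKET_TAG = /\[TRO-\d+\]/; // e.g. [TRO-123]

// Classifies a single TODO/FIXME/HACK comment against the policy.
function checkTodo(comment: string, today: Date): TodoStatus {
  const match = DATE_TAG.exec(comment);
  const expires = match ? match[1] : undefined;
  if (expires !== undefined) {
    // ISO dates compare correctly as strings, so no date arithmetic is needed.
    const todayIso = today.toISOString().slice(0, 10);
    return todayIso > expires ? "expired" : "ok";
  }
  // A ticket reference satisfies the policy without an expiry date.
  return TICKET_TAG.test(comment) ? "ok" : "missing-condition";
}
```

A comment such as `// TODO [2026-04-01]: remove shim` is "ok" through 1 April 2026 (UTC) and "expired" from 2 April onward; a bare `// HACK: temp` is flagged as "missing-condition".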
Branch protection
- Required status check: quality must pass before merge
- No force pushes to main
- No direct pushes to main (all changes must go through a PR)
- No "skip CI" escape hatch -- the founder's position is that if something is urgent enough to skip CI, the CI check should be fixed
CODEOWNERS (planned, deferred to Phase 2)
CODEOWNERS would require founder approval for changes to CI workflow files, enforcement configs, auth code, schema, and CLAUDE.md. Deferred because the founder reviews all PRs manually during Phase 1 (solo developer). The motivation: a Claude Code session once removed the dep-cruise CI step to unblock a failing deployment instead of fixing the violation.
Consequences
Positive
- Fast CI feedback -- ~3.5 minutes instead of 11 minutes
- No path around quality gates -- deploy-prod requires all three prerequisite jobs; branch protection prevents direct pushes
- Local/CI parity -- pnpm ci:check eliminates the "works locally, fails in CI" class of issues
- Secure deployment -- no public SSH, no exposed database IPs, no long-lived service account keys
- Self-cleaning technical debt -- expiring TODOs ensure every shortcut has a timer
Negative
- Service container fragility -- Docker Hub rate limiting causes transient CI failures. No resolution yet beyond re-running the failed job.
- No staging environment -- the pipeline deploys directly to production. Staging is planned for Month 2.
- Migration job complexity -- Cloud SQL Auth Proxy setup is the most complex part of the pipeline and difficult to debug when it fails.
- Pre-commit gaps -- the hook only runs Prettier and ESLint --fix. Typecheck, dead code, dependency violations, and test failures are not caught at commit time.
- Docs job is advisory-only -- documentation degradation requires manual attention.
Risks
- Free tier exhaustion -- mitigated by cancel-in-progress and --affected filtering (~570 runs/month capacity)
- DevDependency automerge -- could break the build in ways that pass CI but cause local issues; mitigated by weekend schedule
- Docker Hub rate limiting escalation -- planned migration to pre-pull images from Artifact Registry
- Single deploy target -- no blue-green or canary; bad deploys affect all users immediately; mitigated by ~1 minute rollback time
- CODEOWNERS deferred -- AI agents can modify CI/enforcement files without mandatory review until Phase 2
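The ~570 runs/month capacity figure is consistent with dividing the 2,000-minute free tier by the ~3.5-minute run time. The sketch below reproduces that arithmetic under the assumption that each run consumes about 3.5 billed minutes; GitHub bills each job's minutes separately, so parallel jobs may consume more than wall time suggests. The constant and function names are illustrative.

```typescript
const FREE_TIER_MINUTES_PER_MONTH = 2000; // free-tier budget cited in Decision Drivers
const MINUTES_PER_RUN = 3.5; // assumption: billed minutes per run equal the ~3.5 min wall time

// Number of complete pipeline runs that fit into a monthly minute budget.
function runsPerMonth(budgetMinutes: number, minutesPerRun: number): number {
  return Math.floor(budgetMinutes / minutesPerRun);
}

const capacity = runsPerMonth(FREE_TIER_MINUTES_PER_MONTH, MINUTES_PER_RUN); // 571
```

If billed minutes per run turn out to be higher than the wall time, the same function gives the reduced capacity; at 7 billed minutes per run, for instance, the budget covers 285 runs.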
Validation
| Rule | Enforcement |
|---|---|
| All quality checks pass before deploy | deploy-prod job needs: [quality, build-push, migrate-prod] |
| No direct pushes to main | GitHub branch protection |
| Local checks mirror CI | pnpm ci:check (format, lint, dep-cruise, dead-code, duplication, typecheck, test) |
| Migrations skip when unnecessary | dorny/paths-filter on schema/migration/seed paths |
| Every TODO has an expiration | unicorn/expiring-todo-comments ESLint rule |
| Runtime dependency updates require review | Renovate dependency-runtime label + automerge: false |
References
- Source ADR: docs/architecture/decisions/012-cicd-pipeline-build-deploy.md
- Architecture diagram: docs/architecture/07-cicd-pipeline.md
- Related: Job Definitions, Quality Checks, Concurrency & Caching, Local CI Parity
- Related ADRs: ADR-008 (Compute -- VM deployment), ADR-011 (Monorepo -- Turborepo caching), ADR-013 (Architecture Enforcement), ADR-014 (Testing), ADR-015 (Documentation Quality)
- Related wiki: Infrastructure -- Deploy Pipeline, Data & Storage -- CI Deployment