ADR-006: Background Jobs -- Inngest (Self-Hosted)
Decision record for choosing Inngest as the durable execution engine over Temporal, BullMQ, and DIY approaches.
Status: Accepted
Date: 2026-03-21 (Week 0 Decision Sprint), deferred 2026-03-27 (TRO-12), reinstated 2026-03-30 (TRO-56)
Deciders: Kyle Olson (Solo Founder)
Decision
Use Inngest open-source Dev Server (Go binary), self-hosted on the Compute Engine VM, with state stored in the existing Cloud SQL PostgreSQL instance. The dashboard is accessible only via SSH tunnel. The SDK (v4) communicates with the Next.js app through a /api/inngest route using signing key authentication.
Context
The research engine architecture requires durable multi-step workflows: document chunking, contextual retrieval via Claude Haiku, Gemini embedding, pgvector insert, and Typesense sync. Each step can fail independently and needs retry capability without re-executing prior steps.
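The requirement above (retry a failed step without re-running earlier ones) can be illustrated with a small, library-free sketch. It does not use the Inngest SDK: `runStep`, `indexDocument`, and `runWithRetries` are hypothetical names, and the in-memory map stands in for the workflow state that a durable engine would persist.

```typescript
// Illustrative only: an in-memory stand-in for persisted step state.
type StepState = Map<string, unknown>;

// Run a named step once; on later attempts, return the saved result instead.
async function runStep<T>(state: StepState, id: string, fn: () => Promise<T>): Promise<T> {
  if (state.has(id)) {
    return state.get(id) as T;
  }
  const result = await fn();
  state.set(id, result);
  return result;
}

// Counts how often each step body actually executes.
const executions = { chunk: 0, embed: 0 };
let embedShouldFail = true;

// Hypothetical two-step slice of the indexing pipeline.
async function indexDocument(state: StepState, text: string): Promise<number[]> {
  const chunks = await runStep(state, "chunk", async () => {
    executions.chunk += 1;
    return text.split(". ");
  });
  return runStep(state, "embed", async () => {
    executions.embed += 1;
    if (embedShouldFail) {
      embedShouldFail = false; // simulate a failure on the first attempt only
      throw new Error("embedding API unavailable");
    }
    return chunks.map((c) => c.length); // placeholder for real embeddings
  });
}

// Re-invoke the whole function after a failure; completed steps are skipped.
async function runWithRetries(text: string, maxAttempts: number): Promise<number[]> {
  const state: StepState = new Map();
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await indexDocument(state, text);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

In this sketch the second attempt reuses the saved chunking result and re-executes only the embedding step, which is the behavior the pipeline needs from its execution engine.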
Inngest was initially scaffolded in TRO-12 with only a placeholder welcome-email function. The founder challenged this: "I don't like bringing in this many external platforms before we hit the scaling stage of the company." The system was nearly removed, then reinstated when TRO-56 (self-hosted Inngest) and TRO-58 (hybrid search pipeline) provided concrete workloads. This follows the "progressive complexity" principle: infrastructure earns its place through real use cases.
Alternatives Considered
Temporal
Most mature durable execution platform with a strong TypeScript SDK. Rejected because Temporal Cloud costs $200+/month and self-hosting requires a multi-service cluster. Inngest provides equivalent step-level durability at $0 with a single binary.
BullMQ + Redis
Well-established queue with simple semantics. Rejected because it provides only job-level retries (not step-level), so partial failure recovery would need to be built manually. It would also have required adding Redis as a dependency for the queue alone. BullMQ remains the accepted upgrade path at roughly 500K-1M executions/month if Inngest's limits are hit.
Google Cloud Workflows
GCP-native and managed. Rejected because workflow definitions are YAML-based (poor for AI agent code generation) and there is no waitForEvent equivalent for human-in-the-loop research checkpoints.
DIY (fire-and-forget + custom retry)
Zero dependencies. This was the interim approach after TRO-12 deferred Inngest. It worked for trivial tasks but was insufficient for the multi-step search indexing pipeline where each step can fail independently.
Implementation Details
Self-hosted on VM, not Inngest Cloud
The Go binary runs as a Docker container on the shared Compute Engine VM. It stores workflow state in the existing Cloud SQL PostgreSQL instance, so there is no separate database or Redis dependency. Cost is $0 incremental.
Dashboard via SSH tunnel only
The Inngest dashboard is not exposed publicly. Access it with:
gcloud compute ssh trovella-vm -- -L 8288:localhost:8288
# Then open http://localhost:8288
SDK v4 signing key requirement
Inngest SDK v4 silently returns 500 errors if neither INNGEST_DEV=1 nor a valid INNGEST_SIGNING_KEY is set. In production, signing, event, and API keys are generated and stored in GCP Secret Manager, passed to the binary via CLI flags (inngest start --event-key <key> --signing-key <key>). The binary does not read these from environment variables.
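Because the failure mode described above is silent, one option is a startup guard on the app side that fails loudly instead. The following is a minimal sketch under that assumption; `assertInngestConfigured` and its error text are hypothetical and not part of the Inngest SDK. Only the two variable names come from this ADR.

```typescript
// Hypothetical startup guard; the function name and message are illustrative.
type Env = Record<string, string | undefined>;

function assertInngestConfigured(env: Env): "dev" | "signed" {
  // Local development: the SDK runs without request signing.
  if (env["INNGEST_DEV"] === "1") {
    return "dev";
  }
  // Production: a non-empty signing key must be present.
  const key = env["INNGEST_SIGNING_KEY"];
  if (key !== undefined && key.trim() !== "") {
    return "signed";
  }
  throw new Error(
    "Inngest misconfigured: set INNGEST_DEV=1 (local) or INNGEST_SIGNING_KEY (production).",
  );
}
```

Calling this once at boot with the process environment would turn a missing key into an immediate, readable error rather than 500 responses from /api/inngest.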
Middleware bypass for /api/inngest
Inngest sends HTTP requests to the app for function registration and step callbacks. These requests carry no user session (they are verified with the signing key instead), so the auth middleware must not block them. /api/inngest is in the public routes list in proxy.ts, alongside /api/auth and /api/mcp.
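A public-route check of this kind might look like the sketch below. The three paths come from this ADR; the constant name, the function name, and the prefix-matching rule are assumptions, since the actual contents of proxy.ts are not shown here.

```typescript
// Route prefixes that skip session auth. Names here are hypothetical.
const PUBLIC_ROUTE_PREFIXES = ["/api/auth", "/api/inngest", "/api/mcp"] as const;

// True when the path is a public prefix itself or a sub-path of one.
function isPublicRoute(pathname: string): boolean {
  return PUBLIC_ROUTE_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}
```

Matching on the prefix plus a trailing slash, rather than a bare startsWith, keeps a lookalike path such as /api/inngest-admin behind the auth middleware.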
Consequences
Positive
- Step-level durability: a failure during embedding does not re-execute chunking
- $0 incremental cost: self-hosted on existing VM, state stored in existing Cloud SQL
- Event-driven architecture: MCP tools are decoupled from processing pipelines
- Built-in dashboard for workflow monitoring, step inspection, and retry controls
- Concurrency control prevents overwhelming external APIs
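The concurrency point above can be illustrated with a generic limiter. This is a sketch of the idea only, not how Inngest implements it: Inngest applies its limits through function configuration, and `createLimiter` is a hypothetical helper.

```typescript
// Minimal concurrency limiter: at most `limit` tasks run at the same time.
function createLimiter(limit: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= limit) {
      // Wait for a finishing task to hand its slot over directly.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active += 1;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // pass the slot on; `active` stays unchanged
      } else {
        active -= 1; // nobody waiting, so free the slot
      }
    }
  };
}
```

Wrapping each call to an external API (for example, the embedding request) in a shared limiter caps the number of in-flight requests regardless of how many documents are queued.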
Negative
- Additional Docker container on the VM (lightweight Go binary)
- SDK v4 quirks: signing key requirement is not well-documented; step.run() returns any in strict TypeScript
- Cognitive overhead of an additional platform, even when self-hosted and free
Risks
- Inngest project health -- venture-funded startup; migration to BullMQ + Redis would require rewriting workflow definitions (mitigated by small function surface area)
- Cloud SQL connection pooling -- the Go binary maintains its own pool alongside the web app (mitigated by Cloud SQL Enterprise Plus upgrade path)
- Single-VM bottleneck -- CPU-heavy workflows could impact web performance (mitigated by per-function concurrency limits)
References
- Linear: TRO-12 (initial scaffold), TRO-56 (self-hosted Inngest on VM), TRO-58 (hybrid search -- first real workload)
- Related: ADR-008 (Compute -- VM + Docker Compose + Caddy), ADR-009 (Search -- Typesense + pgvector hybrid)