Health and Observability

Redis health checks, the /api/health endpoint, connection lifecycle, and debugging cache behavior.

Health Check

The checkRedis function verifies Redis connectivity by sending a PING command and measuring round-trip latency:

import { checkRedis } from "@repo/cache";

const result = await checkRedis();
// On success: { ok: true, latencyMs: 2 }
// On failure: { ok: false, latencyMs: 5003 }

Return Type

interface RedisHealthCheck {
  ok: boolean;
  latencyMs: number;
}

ok: true — Redis responded to PING successfully.
ok: false — the PING failed (connection refused, timeout, auth error, etc.). The latencyMs still reflects the time spent waiting.

The function catches all errors internally — it never throws. This makes it safe to call from health endpoints without wrapping in try/catch.
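
As an illustration of the never-throws contract, a check of this kind could be sketched as follows. This is not the actual @repo/cache implementation: the PingClient interface and the checkRedisSketch name are hypothetical, introduced only so the example runs without a Redis server.

```typescript
// Minimal sketch, not the real @repo/cache code.
interface RedisHealthCheck {
  ok: boolean;
  latencyMs: number;
}

// Hypothetical: the smallest client surface the check needs.
interface PingClient {
  ping(): Promise<string>;
}

async function checkRedisSketch(client: PingClient): Promise<RedisHealthCheck> {
  const start = Date.now();
  try {
    await client.ping();
    return { ok: true, latencyMs: Date.now() - start };
  } catch {
    // Any failure (refused connection, timeout, auth error) becomes ok: false.
    // latencyMs still records how long the caller waited.
    return { ok: false, latencyMs: Date.now() - start };
  }
}
```

Because every failure path is converted into a return value, a caller can pass the result straight into a response body.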

The `/api/health` Endpoint

The application health endpoint at apps/web/src/app/api/health/route.ts checks three infrastructure services in parallel:

const [database, redis, typesense] = await Promise.all([
  checkDatabase(),
  checkRedis(),
  checkTypesense(),
]);

The response includes individual check results and an aggregate status:

Condition	Status	HTTP Code
All checks pass	`healthy`	200
Some checks pass	`degraded`	200
All checks fail	`unhealthy`	503

A degraded status still returns HTTP 200. This is intentional — Docker healthchecks and uptime monitors need a 2xx response to consider the container alive. Only a total infrastructure failure returns 503.
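
The mapping in the table above can be written as a small pure function. The following is a sketch under assumptions: the aggregateHealth name and its return shape are hypothetical and may not match the code in route.ts.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface CheckResult {
  ok: boolean;
  latencyMs: number;
}

// Hypothetical helper: derives the aggregate status and HTTP code
// from the individual check results.
function aggregateHealth(
  checks: Record<string, CheckResult>,
): { status: HealthStatus; httpCode: number } {
  const results = Object.values(checks);
  const passed = results.filter((check) => check.ok).length;

  // Note: an empty set of checks counts as healthy in this sketch.
  if (passed === results.length) return { status: "healthy", httpCode: 200 };
  // At least one check passed, so the container is still considered alive.
  if (passed > 0) return { status: "degraded", httpCode: 200 };
  return { status: "unhealthy", httpCode: 503 };
}
```

Keeping the mapping separate from the route handler makes the 200-on-degraded rule easy to unit test.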

When the health check is degraded or unhealthy, a warning is logged via @repo/logger:

{
  "level": "warn",
  "msg": "Health check degraded or unhealthy",
  "status": "degraded",
  "checks": {
    "database": { "ok": true, "latencyMs": 3 },
    "redis": { "ok": false, "latencyMs": 5002 },
    "typesense": { "ok": true, "latencyMs": 8 }
  }
}

Connection Lifecycle

Startup

The Redis connection is lazy — no connection is made until the first getRedis() call. This means the application starts without waiting for Redis, and connection errors only surface when cache functions are first invoked.

Graceful Shutdown

Call disconnectRedis() during application shutdown to cleanly close the connection:

import { disconnectRedis } from "@repo/cache";

// In your shutdown handler
await disconnectRedis();

This sends a QUIT command to Redis and nulls the internal singleton. If the application continues running after shutdown (e.g., in tests), the next getRedis() call creates a fresh connection automatically.
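
The lazy startup and the reset on shutdown follow a common singleton pattern, which might look roughly like this. The createLazyRedis factory and the RedisLike interface are hypothetical and exist only to keep the sketch self-contained; the real package presumably holds a single module-level client.

```typescript
// Hypothetical client surface: only the call needed for shutdown.
interface RedisLike {
  quit(): Promise<unknown>;
}

function createLazyRedis<T extends RedisLike>(connect: () => T) {
  let client: T | null = null;

  return {
    // No connection is made until the first call.
    getRedis(): T {
      if (client === null) client = connect();
      return client;
    },
    // Clears the singleton first so a later getRedis() builds a fresh
    // client, then closes the old connection with QUIT.
    async disconnectRedis(): Promise<void> {
      if (client === null) return;
      const current = client;
      client = null;
      await current.quit();
    },
  };
}
```

Clearing the reference before awaiting quit() means a getRedis() call that races with shutdown never receives a client that is already closing.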

Reconnection

ioredis reconnects automatically after transient disconnects. With maxRetriesPerRequest: 3, a command that is waiting on a dropped connection is rejected once three reconnection attempts have failed, instead of waiting indefinitely. This covers brief network blips without application-level retry logic.

If Redis is down for an extended period, commands fail once their retries are exhausted. The application degrades rather than crashes: cache reads return errors to their callers, and the health endpoint reports a degraded status.
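
For reference, a client configuration along these lines might look like the sketch below. Only maxRetriesPerRequest: 3 comes from the text above. ioredis also accepts a retryStrategy function that returns the delay in milliseconds before the next reconnection attempt, but the backoff values here are illustrative assumptions, not the project's settings.

```typescript
// Options in the shape ioredis accepts. The backoff numbers are assumptions.
const redisOptionsSketch = {
  // Reject a pending command after three failed reconnection attempts.
  maxRetriesPerRequest: 3,
  // Linear backoff capped at two seconds between reconnection attempts.
  retryStrategy: (times: number): number => Math.min(times * 200, 2000),
};
```

A capped backoff keeps reconnection attempts frequent enough to recover quickly from a blip without hammering a Redis instance that is still starting up.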

Debugging

Inspecting Cache State Locally

Connect to the local Redis container to inspect keys:

# Connect to the local Redis CLI
docker exec -it trovella-redis redis-cli

# List all keys (local dev only — never use KEYS in production)
KEYS *

# Check a specific key
GET tenant:org_abc123:settings

# Check TTL remaining
TTL tenant:org_abc123:settings

# Scan for keys matching a pattern (production-safe)
SCAN 0 MATCH tenant:org_abc123:* COUNT 100

# Flush all keys (local dev only)
FLUSHALL
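
A single SCAN call returns only one batch of keys plus a cursor, so a full listing requires looping until the cursor returns to 0. A minimal sketch of that loop from application code follows. The ScanClient interface and the scanKeys helper are hypothetical; the scan signature is modeled on the one ioredis exposes.

```typescript
// Hypothetical minimal surface, modeled on the ioredis scan method.
interface ScanClient {
  scan(cursor: string, ...args: (string | number)[]): Promise<[string, string[]]>;
}

async function scanKeys(
  client: ScanClient,
  pattern: string,
  count = 100,
): Promise<string[]> {
  // SCAN may return the same key more than once, so collect into a Set.
  const keys = new Set<string>();
  let cursor = "0";
  do {
    const [next, batch] = await client.scan(cursor, "MATCH", pattern, "COUNT", count);
    for (const key of batch) keys.add(key);
    cursor = next;
  } while (cursor !== "0"); // A cursor of "0" marks the end of the iteration.
  return Array.from(keys);
}
```

COUNT is only a hint for how much work each call does, so individual batches can be smaller or larger than the requested size.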

Inspecting Cache State in Production

Upstash provides a web dashboard for browsing keys, running commands, and monitoring throughput. Access it through the Upstash console. See Infrastructure — Cloud Resources for access details.

For CLI access to production Redis, use the Upstash connection URL:

# Requires redis-cli installed locally
redis-cli -u "$REDIS_URL"

Common Issues

Symptom	Likely cause	Resolution
`REDIS_URL environment variable is not set`	Missing env var	Add `REDIS_URL` to `.env`; run `pnpm docker:up` for local Redis
Health check shows `ok: false`	Redis container not running	Run `pnpm docker:up` to start local services
Stale data after a write	Missing `cacheInvalidate` call	Add invalidation after the write operation
High latency on `checkRedis`	Network issue to Upstash	Check the Upstash status page; verify `REDIS_URL` points to the nearest region
`ECONNREFUSED` on first request	Redis not yet ready on container start	Docker healthcheck should prevent this; verify `docker-compose.yml` healthcheck config