Trovella Wiki

Rollback Procedures

Three options for reverting a bad deploy, from fastest image swap to full code revert, plus database rollback links.

When a deploy breaks production, there are three rollback options. Options 1 and 2 are ordered by speed; Option 3 applies when a database migration caused the failure. Choose based on severity and on whether database changes are involved.

Decision Tree

Deploy broke production
  |
  |-- Database schema changed?
  |     |
  |     Yes --> Is the old app compatible with the new schema?
  |     |         |
  |     |         Yes --> Option 1 (image rollback) + plan a corrective migration
  |     |         No  --> Option 3 (corrective migration) is required
  |     |
  |     No --> Option 1 (image rollback) for immediate fix
  |            then Option 2 (code revert) for a clean deploy
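The tree above can also be written as a small shell function, which may be handy in an incident checklist or runbook script. This is only a sketch: the function name choose_rollback and its yes/no arguments are made up for illustration and are not part of any existing Trovella tooling.

```shell
# choose_rollback SCHEMA_CHANGED [OLD_APP_COMPATIBLE]
# Both arguments are "yes" or "no". Prints the option(s) the decision tree recommends.
# Hypothetical helper: the name and argument format are illustrative only.
choose_rollback() {
  schema_changed="$1"
  old_app_compatible="${2:-no}"   # only consulted when the schema changed

  if [ "$schema_changed" = "no" ]; then
    echo "Option 1 (image rollback), then Option 2 (code revert)"
  elif [ "$old_app_compatible" = "yes" ]; then
    echo "Option 1 (image rollback) + plan a corrective migration"
  else
    echo "Option 3 (corrective migration)"
  fi
}

# Walk each branch of the tree.
choose_rollback no
choose_rollback yes yes
choose_rollback yes no
```

Anything other than an explicit "no" for the first argument is treated as a schema change, so an unsure answer falls through to the more cautious branches.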

Option 1: Image Rollback (fastest, ~2 minutes)

Roll back the web container to a previous Docker image. Use this when the application code is broken but the database schema has not changed (or the schema change is backward-compatible).

Procedure

# 1. Find the last known good commit SHA
git log --oneline -5

# 2. SSH into the VM and swap the image tag
gcloud compute ssh trovella-prod-vm \
  --zone=us-central1-a --project=trovella-prod \
  --tunnel-through-iap --command="
    cd /opt/trovella && \
    sudo sed -i 's|web:latest|web:<good-commit-sha>|' docker-compose.prod.yml && \
    sudo docker compose -f docker-compose.prod.yml pull web && \
    sudo docker compose -f docker-compose.prod.yml up -d web && \
    echo 'Rollback complete'
  "

Every build pushes a commit-SHA-tagged image to Artifact Registry alongside the latest tag. You can target any previous build by its SHA.
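The sed in step 2 only matches while the compose file still reads web:latest, so it changes nothing if the file was already pinned by an earlier rollback. A minimal sketch of a tag swap that works in both cases follows. It assumes the image reference in docker-compose.prod.yml ends in /web:<tag>; pin_web_tag is a hypothetical helper, not an existing script on the VM.

```shell
# pin_web_tag FILE TAG
# Rewrites whatever tag currently follows ".../web:" in FILE to TAG.
# Assumption: the image line looks like "image: <registry path>/web:<tag>"
# and tags contain only letters, digits, ".", "_" and "-".
pin_web_tag() {
  file="$1"
  tag="$2"
  tmp="$(mktemp)" || return 1

  # The leading "/" keeps the pattern from touching the "web:" service key.
  if sed -E "s|(/web):[A-Za-z0-9._-]+|\1:${tag}|" "$file" > "$tmp"; then
    # Write back with cat so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}
```

On the VM this would be called as pin_web_tag docker-compose.prod.yml <good-commit-sha>, run with whatever privileges are needed to write the file, followed by the same pull and up -d steps as in the procedure.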

After Stabilizing

The docker-compose.prod.yml on the VM now has a pinned SHA tag. The next CI deploy will overwrite this file with web:latest again. To prevent the broken code from redeploying, either:

  • Fix the issue and push to main (the normal path)
  • Revert the merge commit (Option 2) so the next deploy uses the previous code
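Before relying on the next CI deploy, it can help to confirm what the compose file on the VM currently points at. The sketch below reads the tag back out of the file. current_web_tag and is_pinned are hypothetical helper names, and the /web:<tag> pattern is an assumption about how the image line is written.

```shell
# current_web_tag FILE
# Prints the tag that follows ".../web:" in FILE, e.g. "latest" or a commit SHA.
# Prints nothing if no such image reference is found.
current_web_tag() {
  sed -n -E 's|.*/web:([A-Za-z0-9._-]+).*|\1|p' "$1"
}

# is_pinned FILE
# Succeeds when FILE references a web image tag other than "latest".
is_pinned() {
  tag="$(current_web_tag "$1")"
  [ -n "$tag" ] && [ "$tag" != "latest" ]
}
```

For example, is_pinned docker-compose.prod.yml && echo "still pinned to $(current_web_tag docker-compose.prod.yml)" shows whether a rollback pin is still in place.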

Option 2: Code Revert (triggers full pipeline, ~10 minutes)

Revert the merge commit on main. This triggers a new CI run that builds and deploys the previous code.

Procedure

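# 1. Identify the merge commit
git log --oneline -5

# 2. Revert it
# A true merge commit (two parents) needs -m 1 so Git keeps main's side as the mainline;
# drop -m 1 if the change landed as a squash or a single ordinary commit.
git revert -m 1 <merge-commit-sha>
git push origin main

# 3. Monitor the new pipeline
gh run watch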

This is the cleanest rollback because it produces a new commit with a clear audit trail. The full pipeline runs (quality, build, deploy), so expect ~10 minutes before the revert is live.
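Note that git revert refuses to revert a true merge commit unless a mainline parent is given with -m. The sketch below picks the right form by counting the commit's parents. revert_commit is a hypothetical helper, and treating parent 1 as main's side assumes the feature branch was merged into main rather than the other way around.

```shell
# revert_commit SHA
# Reverts SHA on the current branch, adding "-m 1" only when SHA is a merge commit.
# Hypothetical helper: not part of the existing pipeline or repository.
revert_commit() {
  sha="$1"

  # "git rev-list --parents -n 1" prints: <sha> <parent1> [<parent2> ...]
  # Load those words into the function's positional parameters and count them.
  set -- $(git rev-list --parents -n 1 "$sha")

  if [ "$#" -gt 2 ]; then
    # Two or more parents: a merge commit, so name parent 1 as the mainline.
    git revert --no-edit -m 1 "$sha"
  else
    # Ordinary (or squash-merged) commit: plain revert.
    git revert --no-edit "$sha"
  fi
}
```

After revert_commit <merge-commit-sha> succeeds, the push and pipeline monitoring steps are the same as in the procedure.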

When to Choose This Over Option 1

  • When you want the rollback to persist through future deploys (Option 1 is overwritten by the next deploy)
  • When you need a clean Git history showing the revert
  • When you have time -- the site is degraded but not fully down

Option 3: Database Migration Rollback

Migrations are forward-only. There is no pnpm db:rollback command. If a migration caused the issue, you must write a new migration that undoes the problematic change.

Procedure

  1. Write a new migration that reverses the schema change
  2. Test locally: edit the schema, run pnpm db:generate, then pnpm db:migrate
  3. Commit and push to main -- the migrate-prod job applies it automatically

For the full details on corrective migrations, backup restoration, and point-in-time recovery, see Data & Storage -- Migration Rollback.

When Migration + Deploy Interact

The most complex failure case is when both the migration and the application code need to be rolled back. The sequence matters:

  1. First: Apply the corrective migration (push a commit with the new migration file)
  2. Then: Revert the application code commit (or fix the code in a new commit)

Reverting the application code first could leave the app running against a schema it doesn't expect. Always fix the schema first.

Emergency: Manual Deploy of a Previous Image

If CI is broken and you cannot trigger a pipeline, you can manually deploy any image from Artifact Registry:

# Replace <commit-sha> in both places below with the tag of the image to deploy
gcloud compute ssh trovella-prod-vm \
  --zone=us-central1-a --project=trovella-prod \
  --tunnel-through-iap --command="
    cd /opt/trovella && \
    sudo docker pull us-central1-docker.pkg.dev/trovella-shared/trovella/web:<commit-sha> && \
    sudo sed -i 's|web:latest|web:<commit-sha>|' docker-compose.prod.yml && \
    sudo docker compose -f docker-compose.prod.yml up -d web && \
    echo 'Manual deploy complete'
  "

This bypasses CI entirely. Use only when CI itself is down and production is broken.
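Because this path skips every CI check, a typo in the tag goes straight to production. One possible guard, sketched below, is to build the image reference through a function that rejects anything that does not look like a commit SHA. The 7 to 40 hex character range is an assumption (this page does not say whether tags use short or full SHAs), and image_ref is a hypothetical helper.

```shell
# Registry path as used in the manual deploy command.
REPO="us-central1-docker.pkg.dev/trovella-shared/trovella/web"

# image_ref SHA
# Prints "REPO:SHA", or fails if SHA is not 7-40 lowercase hex characters.
# Assumption: image tags are Git commit SHAs, short or full.
image_ref() {
  case "$1" in
    ""|*[!0-9a-f]*)
      echo "not a commit SHA: $1" >&2
      return 1
      ;;
  esac

  len=${#1}
  if [ "$len" -lt 7 ] || [ "$len" -gt 40 ]; then
    echo "unexpected SHA length: $len" >&2
    return 1
  fi

  echo "${REPO}:$1"
}
```

A left-in placeholder such as <commit-sha> fails the check, so ref="$(image_ref "$sha")" || exit 1 stops a script before it touches the VM.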

Rollback Summary

Option                 Speed     Scope             Persists?                     Needs CI?
Image rollback         ~2 min    App code only     No (next deploy overwrites)   No
Code revert            ~10 min   App code only     Yes (new commit)              Yes
Corrective migration   ~10 min   Database schema   Yes (new migration)           Yes
Manual image deploy    ~3 min    App code only     No                            No
