Rollback Procedures
Three options for reverting a bad deploy, from fastest image swap to full code revert, plus database rollback links.
When a deploy breaks production, there are three rollback options ordered by speed. Choose based on the severity and whether database changes are involved.
Decision Tree
Deploy broke production
  |
  |-- Database schema changed?
        |
        Yes --> Is the old app compatible with the new schema?
        |         |
        |         Yes --> Option 1 (image rollback) + plan a corrective migration
        |         No  --> Option 3 (corrective migration) is required
        |
        No  --> Option 1 (image rollback) for immediate fix,
                then Option 2 (code revert) for a clean deploy
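The tree reduces to two yes/no questions, so it can be captured in a small shell helper for an incident checklist. This is only a sketch: the function name `choose_rollback` and its `yes`/`no` arguments are hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper: map the two decision-tree questions to a rollback option.
# $1: did the database schema change? (yes/no)
# $2: is the old app compatible with the new schema? (yes/no; ignored when $1 is no)
choose_rollback() {
  if [ "$1" != "yes" ]; then
    echo "Option 1 (image rollback), then Option 2 (code revert)"
  elif [ "${2:-no}" = "yes" ]; then
    echo "Option 1 (image rollback) + plan a corrective migration"
  else
    # Unknown compatibility is treated as incompatible, the safer default.
    echo "Option 3 (corrective migration)"
  fi
}
```

For example, `choose_rollback yes no` prints `Option 3 (corrective migration)`.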
Option 1: Image Rollback (fastest, ~2 minutes)
Roll back the web container to a previous Docker image. Use this when the application code is broken but the database schema has not changed (or the schema change is backward-compatible).
Procedure
# 1. Find the last known good commit SHA
git log --oneline -5
# 2. SSH into the VM and swap the image tag
gcloud compute ssh trovella-prod-vm \
--zone=us-central1-a --project=trovella-prod \
--tunnel-through-iap --command="
cd /opt/trovella && \
sudo sed -i 's|web:latest|web:<good-commit-sha>|' docker-compose.prod.yml && \
sudo docker compose -f docker-compose.prod.yml pull web && \
sudo docker compose -f docker-compose.prod.yml up -d web && \
echo 'Rollback complete'
"
Every build pushes a commit-SHA-tagged image to Artifact Registry alongside the latest tag. You can target any previous build by its SHA.
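Because the tag is the commit SHA, the full image reference can be derived from `git log` output. The sketch below assumes the registry path shown in the emergency procedure on this page and that tags are hexadecimal SHAs of 7 to 40 characters. The helper name `image_ref` is hypothetical. This page does not state whether builds are tagged with the short or the full SHA, so check an existing tag before relying on either form.

```shell
# Hypothetical helper: build the Artifact Registry reference for a commit SHA.
# The registry path is copied from the emergency procedure on this page.
REGISTRY_IMAGE="us-central1-docker.pkg.dev/trovella-shared/trovella/web"

image_ref() {
  # Reject empty input and anything containing a non-hex character, so a typo
  # or a leftover <placeholder> never reaches docker pull.
  case "$1" in
    *[!0-9a-f]*|"") echo "not a commit SHA: $1" >&2; return 1 ;;
  esac
  # Assumed tag length: between an abbreviated SHA (7) and a full SHA (40).
  if [ "${#1}" -lt 7 ] || [ "${#1}" -gt 40 ]; then
    echo "unexpected SHA length: ${#1}" >&2
    return 1
  fi
  echo "${REGISTRY_IMAGE}:$1"
}
```

The result can be passed to `docker pull` on the VM, for example `sudo docker pull "$(image_ref abc1234)"`.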
After Stabilizing
The docker-compose.prod.yml on the VM now has a pinned SHA tag. The next CI deploy will overwrite this file with web:latest again. To prevent the broken code from redeploying, either:
- Fix the issue and push to main (the normal path)
- Revert the merge commit (Option 2) so the next deploy uses the previous code
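One consequence of the pinned tag: the `web:latest` substitution in step 2 only matches while the file still references `web:latest`, so a second rollback before the next CI deploy would change nothing. A minimal sketch of a substitution that matches whatever tag is currently set, assuming the compose file references the image as `.../web:<tag>` (the helper name `pin_web_tag` is hypothetical):

```shell
# Hypothetical helper: point the web image at a given tag, whatever tag is set now.
# $1: path to the compose file, $2: new tag (a commit SHA or "latest")
pin_web_tag() {
  # Match "/web:" followed by any tag characters, not only "latest".
  # Write to a temp file and move it into place rather than using sed -i,
  # whose syntax differs between GNU and BSD sed.
  sed -E "s#(/web):[A-Za-z0-9._-]+#\1:$2#" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

Calling it again with `latest` restores the original reference. On the VM the file is edited with `sudo`, so the helper would need the same privileges.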
Option 2: Code Revert (triggers full pipeline, ~10 minutes)
Revert the merge commit on main. This triggers a new CI run that builds and deploys the previous code.
Procedure
# 1. Identify the merge commit
git log --oneline -5
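# 2. Revert it (-m 1 keeps the main side of a true merge commit;
#    omit -m 1 if the change landed as a squash or single commit)
git revert -m 1 <merge-commit-sha>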
git push origin main
# 3. Monitor the new pipeline
gh run watch
This is the cleanest rollback because it produces a new commit with a clear audit trail. The full pipeline runs (quality, build, deploy), so expect ~10 minutes before the revert is live.
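`git revert` refuses to revert a true merge commit unless it is told which parent to treat as the mainline (`-m 1` selects the branch that was merged into). A squash or rebase merge produces an ordinary single-parent commit, and passing `-m` to that kind of commit is an error. This page does not say which merge strategy the repository uses, so the sketch below checks the parent count first. `revert_commit` is a hypothetical helper name.

```shell
# Hypothetical helper: revert a commit, adding -m 1 only for a true merge commit.
# $1: SHA of the commit to revert
revert_commit() {
  # rev-list --parents prints the commit followed by each of its parents,
  # so the number of fields minus one is the parent count.
  parent_count=$(git rev-list --parents -n 1 "$1" | awk '{print NF - 1}')
  if [ "$parent_count" -gt 1 ]; then
    git revert --no-edit -m 1 "$1"
  else
    git revert --no-edit "$1"
  fi
}
```

After it succeeds, `git push origin main` triggers the pipeline as in step 2.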
When to Choose This Over Option 1
- When you want the rollback to persist through future deploys (Option 1 is overwritten by the next deploy)
- When you need a clean Git history showing the revert
- When you have time -- the site is degraded but not fully down
Option 3: Database Migration Rollback
Migrations are forward-only. There is no pnpm db:rollback command. If a migration caused the issue, you must write a new migration that undoes the problematic change.
Procedure
- Write a new migration that reverses the schema change
- Test locally: edit the schema, run pnpm db:generate, then pnpm db:migrate
- Commit and push to main -- the migrate-prod job applies it automatically
For the full details on corrective migrations, backup restoration, and point-in-time recovery, see Data & Storage -- Migration Rollback.
When Migration + Deploy Interact
The most complex failure case is when both the migration and the application code need to be rolled back. The sequence matters:
- First: Apply the corrective migration (push a commit with the new migration file)
- Then: Revert the application code commit (or fix the code in a new commit)
Reverting the application code first could leave the app running against a schema it doesn't expect. Always fix the schema first.
Emergency: Manual Deploy of a Previous Image
If CI is broken and you cannot trigger a pipeline, you can manually deploy any image from Artifact Registry:
gcloud compute ssh trovella-prod-vm \
--zone=us-central1-a --project=trovella-prod \
--tunnel-through-iap --command="
cd /opt/trovella && \
sudo docker pull us-central1-docker.pkg.dev/trovella-shared/trovella/web:<commit-sha> && \
sudo sed -i 's|web:latest|web:<commit-sha>|' docker-compose.prod.yml && \
sudo docker compose -f docker-compose.prod.yml up -d web && \
echo 'Manual deploy complete'
"
This bypasses CI entirely. Use only when CI itself is down and production is broken.
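Since this path has no pipeline to report what was deployed, it is worth confirming which tag the compose file references once the command finishes. A sketch, assuming the image line has the form `image: .../web:<tag>`; the helper name `current_web_tag` is hypothetical.

```shell
# Hypothetical helper: print the tag the compose file currently sets for web.
# $1: path to the compose file
current_web_tag() {
  # Print only the text after "/web:" on an image line; print nothing otherwise.
  sed -n -E 's#^[[:space:]]*image:.*/web:([A-Za-z0-9._-]+).*$#\1#p' "$1"
}
```

On the VM this would be run as `current_web_tag /opt/trovella/docker-compose.prod.yml`. If it still prints `latest`, the `sed` step did not match and the container was not switched to the intended image.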
Rollback Summary
| Option | Speed | Scope | Persists? | Needs CI? |
|---|---|---|---|---|
| Image rollback | ~2 min | App code only | No (next deploy overwrites) | No |
| Code revert | ~10 min | App code only | Yes (new commit) | Yes |
| Corrective migration | ~10 min | Database schema | Yes (new migration) | Yes |
| Manual image deploy | ~3 min | App code only | No | No |