Database Migrations with Alembic
A database migration is a version-controlled change to your schema: adding a column, renaming a table, creating an index. As your SaaS grows, you'll change the schema dozens of times. Without migrations, developers apply changes manually on staging, then forget to apply them on production—resulting in runtime errors. Alembic is the standard Python tool for schema migrations; it tracks every change, allows rollbacks, and detects schema conflicts. This guide covers initialization, auto-generating migrations, writing custom migrations, and deploying safely.
Why Migrations Matter in Multi-Tenant SaaS
In a multi-tenant system, all customers share the same database. A schema change must be applied to all data at once without downtime. Alembic lets you test migrations locally, verify they work with millions of rows, and deploy via CI/CD with zero downtime. A failed migration during deploy affects all customers simultaneously, so migration safety is critical to SaaS reliability.
Initializing Alembic
Install Alembic and initialize a repository:
pip install alembic
alembic init migrations
This creates a migrations/ folder with config and version directories. Configure Alembic to auto-detect schema changes:
# migrations/env.py (auto-generated, with edits)
from alembic import context
from sqlalchemy import engine_from_config, pool
from logging.config import fileConfig
from app.models import Base # Import your SQLAlchemy Base
config = context.config
fileConfig(config.config_file_name)
target_metadata = Base.metadata # Auto-detect models
def run_migrations_offline() -> None:
"""Run migrations 'offline' (not connecting to database)."""
context.configure(
url=config.get_main_option("sqlalchemy.url"),
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
)
def run_migrations_online() -> None:
"""Run migrations 'online' (connecting to database)."""
config_section = config.get_section(config.config_prefix)
config_section["sqlalchemy.url"] = os.getenv(
"DATABASE_URL",
"postgresql://localhost/saas"
)
connectable = engine_from_config(
config_section,
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)
with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata
)
with context.begin_transaction():
context.run_migrations()
Configure the database URL in alembic.ini:
sqlalchemy.url = postgresql://user:pass@localhost/saas
Or use an environment variable:
import os
sqlalchemy_url = os.getenv("DATABASE_URL", "sqlite:///./saas.db")
Auto-Generating Migrations
When you add a column to a model, Alembic detects the change and generates a migration:
# Create a new migration based on model changes
alembic revision --autogenerate -m "Add email_verified column to users"
This generates a file like migrations/versions/2024_06_02_123456_add_email_verified_column_to_users.py. Review it:
"""Add email_verified column to users."""
from alembic import op
import sqlalchemy as sa
revision = "abc1234567"
down_revision = "xyz7654321"
branch_labels = None
depends_on = None
def upgrade() -> None:
"""Apply the migration (forward direction)."""
op.add_column(
"users",
sa.Column("email_verified", sa.Boolean(), nullable=False, server_default="false")
)
op.create_index(
op.f("idx_users_email_verified"),
"users",
["email_verified"]
)
def downgrade() -> None:
"""Revert the migration (backward direction)."""
op.drop_index(op.f("idx_users_email_verified"), table_name="users")
op.drop_column("users", "email_verified")
Key observations:
upgrade()applies the migration;downgrade()reverts it.- Always set
server_defaultwhen adding a non-nullable column to existing tables (avoids errors on existing rows). - Create indexes for columns used in WHERE clauses (
email_verifiedmay be filtered frequently).
Writing Custom Migrations
Auto-generation misses complex changes. For custom migrations, edit the generated file:
"""Backfill payment_status for existing subscriptions."""
from alembic import op
import sqlalchemy as sa
revision = "def1234567"
down_revision = "abc1234567"
def upgrade() -> None:
"""Set payment_status = 'active' for all existing subscriptions."""
connection = op.get_bind()
# Add the column
op.add_column(
"subscriptions",
sa.Column("payment_status", sa.String(50), nullable=True)
)
# Backfill existing data
connection.execute(
sa.text(
"UPDATE subscriptions SET payment_status = 'active' WHERE payment_status IS NULL"
)
)
# Make the column not-null after data is filled
op.alter_column("subscriptions", "payment_status", nullable=False)
def downgrade() -> None:
"""Revert to previous state."""
op.drop_column("subscriptions", "payment_status")
For data transformation across tenants, use:
def upgrade() -> None:
"""Migrate users to have a profile_picture column."""
connection = op.get_bind()
op.add_column(
"users",
sa.Column("profile_picture_url", sa.String(500), nullable=True)
)
# For each tenant, copy a default avatar per user
connection.execute(
sa.text(
"""
UPDATE users
SET profile_picture_url = CONCAT(
'https://avatars.example.com/user/', id, '.png'
)
WHERE profile_picture_url IS NULL
"""
)
)
Running Migrations
Apply all pending migrations:
alembic upgrade head
Upgrade to a specific revision:
alembic upgrade <revision-id>
Downgrade (revert) the last migration:
alembic downgrade -1
View migration history:
alembic history
# Output:
# 2024-06-01 10:00:00,123 @ abc1234567 -> def4567890 (head), Add email_verified column to users
# 2024-05-31 15:30:00,456 @ xyz7654321 -> abc1234567, Create users table
Check the current database version:
alembic current
# Output: def4567890
Zero-Downtime Migration Strategy
Some schema changes require temporary columns and careful sequencing to avoid downtime. For example, renaming a column:
# Migration 1: Create new column, copy data
def upgrade_v1() -> None:
op.add_column("users", sa.Column("first_name_new", sa.String(100), nullable=True))
connection = op.get_bind()
connection.execute(
sa.text("UPDATE users SET first_name_new = first_name")
)
op.alter_column("users", "first_name_new", nullable=False)
# Migration 2 (after code deployed): Drop old column
def upgrade_v2() -> None:
op.drop_column("users", "first_name")
op.rename_table("users_first_name_new", "users_first_name")
Deploy the code change (reading from first_name_new) between migrations. This avoids the "table locked" state that would happen if you renamed a column directly on a live database.
Handling Failed Migrations
If a migration fails mid-way, check the alembic_version table:
SELECT * FROM alembic_version;
-- Output: (version_num,)
-- (abc1234567,)
If the upgrade failed partway, the table may be inconsistent. Manually verify the schema, then update the version table:
UPDATE alembic_version SET version_num = 'abc1234567' WHERE version_num = 'def4567890';
Test the migration on a production data snapshot (AWS RDS read replica) before deploying to production.
Continuous Integration for Migrations
Add migration checks to CI:
# Verify all migrations can run on a fresh database
alembic upgrade head
# Check for syntax errors
python -m py_compile migrations/versions/*.py
In your GitHub Actions or GitLab CI:
test-migrations:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: saas_test
steps:
- uses: actions/checkout@v3
- name: Run migrations
env:
DATABASE_URL: postgresql://postgres:postgres@localhost/saas_test
run: alembic upgrade head
Key Takeaways
- Migrations are version-controlled SQL changes; Alembic auto-generates them from SQLAlchemy models.
- Always write custom migrations for data transformations (backfills, aggregations).
- Test migrations on production-size datasets (RDS read replica) before deploying.
- Zero-downtime migrations use temporary columns; apply code and schema changes in separate steps.
- Store migration state in
alembic_versiontable; verify before manual edits.
Frequently Asked Questions
Can I auto-generate migrations for a production database?
Yes, but review them carefully. Alembic may not detect all changes (e.g., changes to indexes, constraints). Always review the generated SQL, especially on production-bound changes.
What if I need to skip a migration?
Rarely necessary. If you need to, manually update alembic_version to the desired revision, then apply the next one. This is dangerous; prefer creating a new migration that achieves the same end state.
How do I handle migrations for different database vendors (PostgreSQL, MySQL)?
Alembic supports conditional migrations using the context.get_context().dialect.name to branch logic. Write vendor-specific migrations in separate files or use conditional blocks in a single migration.
Can I run migrations from Python code, not just the CLI?
Yes. Use alembic.config.Config() and alembic.command.upgrade():
from alembic.config import Config
from alembic import command
alembic_cfg = Config("alembic.ini")
alembic_cfg.set_main_option("sqlalchemy.url", DATABASE_URL)
command.upgrade(alembic_cfg, "head")
What if a migration takes too long and locks the table?
Use PostgreSQL's CONCURRENTLY option for indexes:
op.create_index(op.f("idx_users_email"), "users", ["email"], postgresql_concurrently=True)
Or manually run it outside the transaction window during maintenance.