Skip to main content

Database Migrations with Alembic

A database migration is a version-controlled change to your schema: adding a column, renaming a table, creating an index. As your SaaS grows, you'll change the schema dozens of times. Without migrations, developers apply changes manually on staging, then forget to apply them on production—resulting in runtime errors. Alembic is the standard Python tool for schema migrations; it tracks every change, allows rollbacks, and detects schema conflicts. This guide covers initialization, auto-generating migrations, writing custom migrations, and deploying safely.

Why Migrations Matter in Multi-Tenant SaaS

In a multi-tenant system, all customers share the same database. A schema change must be applied to all data at once without downtime. Alembic lets you test migrations locally, verify they work with millions of rows, and deploy via CI/CD with zero downtime. A failed migration during deploy affects all customers simultaneously, so migration safety is critical to SaaS reliability.

Initializing Alembic

Install Alembic and initialize a repository:

pip install alembic
alembic init migrations

This creates a migrations/ folder with config and version directories. Configure Alembic to auto-detect schema changes:

# migrations/env.py (auto-generated, with edits)

from alembic import context
from sqlalchemy import engine_from_config, pool
from logging.config import fileConfig
from app.models import Base # Import your SQLAlchemy Base

config = context.config
fileConfig(config.config_file_name)
target_metadata = Base.metadata # Auto-detect models

def run_migrations_offline() -> None:
"""Run migrations 'offline' (not connecting to database)."""
context.configure(
url=config.get_main_option("sqlalchemy.url"),
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
)

def run_migrations_online() -> None:
"""Run migrations 'online' (connecting to database)."""
config_section = config.get_section(config.config_prefix)
config_section["sqlalchemy.url"] = os.getenv(
"DATABASE_URL",
"postgresql://localhost/saas"
)

connectable = engine_from_config(
config_section,
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)

with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata
)
with context.begin_transaction():
context.run_migrations()

Configure the database URL in alembic.ini:

sqlalchemy.url = postgresql://user:pass@localhost/saas

Or use an environment variable:

import os
sqlalchemy_url = os.getenv("DATABASE_URL", "sqlite:///./saas.db")

Auto-Generating Migrations

When you add a column to a model, Alembic detects the change and generates a migration:

# Create a new migration based on model changes
alembic revision --autogenerate -m "Add email_verified column to users"

This generates a file like migrations/versions/2024_06_02_123456_add_email_verified_column_to_users.py. Review it:

"""Add email_verified column to users."""
from alembic import op
import sqlalchemy as sa

revision = "abc1234567"
down_revision = "xyz7654321"
branch_labels = None
depends_on = None

def upgrade() -> None:
"""Apply the migration (forward direction)."""
op.add_column(
"users",
sa.Column("email_verified", sa.Boolean(), nullable=False, server_default="false")
)
op.create_index(
op.f("idx_users_email_verified"),
"users",
["email_verified"]
)

def downgrade() -> None:
"""Revert the migration (backward direction)."""
op.drop_index(op.f("idx_users_email_verified"), table_name="users")
op.drop_column("users", "email_verified")

Key observations:

  • upgrade() applies the migration; downgrade() reverts it.
  • Always set server_default when adding a non-nullable column to existing tables (avoids errors on existing rows).
  • Create indexes for columns used in WHERE clauses (email_verified may be filtered frequently).

Writing Custom Migrations

Auto-generation misses complex changes. For custom migrations, edit the generated file:

"""Backfill payment_status for existing subscriptions."""
from alembic import op
import sqlalchemy as sa

revision = "def1234567"
down_revision = "abc1234567"

def upgrade() -> None:
"""Set payment_status = 'active' for all existing subscriptions."""
connection = op.get_bind()

# Add the column
op.add_column(
"subscriptions",
sa.Column("payment_status", sa.String(50), nullable=True)
)

# Backfill existing data
connection.execute(
sa.text(
"UPDATE subscriptions SET payment_status = 'active' WHERE payment_status IS NULL"
)
)

# Make the column not-null after data is filled
op.alter_column("subscriptions", "payment_status", nullable=False)

def downgrade() -> None:
"""Revert to previous state."""
op.drop_column("subscriptions", "payment_status")

For data transformation across tenants, use:

def upgrade() -> None:
"""Migrate users to have a profile_picture column."""
connection = op.get_bind()

op.add_column(
"users",
sa.Column("profile_picture_url", sa.String(500), nullable=True)
)

# For each tenant, copy a default avatar per user
connection.execute(
sa.text(
"""
UPDATE users
SET profile_picture_url = CONCAT(
'https://avatars.example.com/user/', id, '.png'
)
WHERE profile_picture_url IS NULL
"""
)
)

Running Migrations

Apply all pending migrations:

alembic upgrade head

Upgrade to a specific revision:

alembic upgrade <revision-id>

Downgrade (revert) the last migration:

alembic downgrade -1

View migration history:

alembic history
# Output:
# 2024-06-01 10:00:00,123 @ abc1234567 -> def4567890 (head), Add email_verified column to users
# 2024-05-31 15:30:00,456 @ xyz7654321 -> abc1234567, Create users table

Check the current database version:

alembic current
# Output: def4567890

Zero-Downtime Migration Strategy

Some schema changes require temporary columns and careful sequencing to avoid downtime. For example, renaming a column:

# Migration 1: Create new column, copy data
def upgrade_v1() -> None:
op.add_column("users", sa.Column("first_name_new", sa.String(100), nullable=True))
connection = op.get_bind()
connection.execute(
sa.text("UPDATE users SET first_name_new = first_name")
)
op.alter_column("users", "first_name_new", nullable=False)

# Migration 2 (after code deployed): Drop old column
def upgrade_v2() -> None:
op.drop_column("users", "first_name")
op.rename_table("users_first_name_new", "users_first_name")

Deploy the code change (reading from first_name_new) between migrations. This avoids the "table locked" state that would happen if you renamed a column directly on a live database.

Handling Failed Migrations

If a migration fails mid-way, check the alembic_version table:

SELECT * FROM alembic_version;
-- Output: (version_num,)
-- (abc1234567,)

If the upgrade failed partway, the table may be inconsistent. Manually verify the schema, then update the version table:

UPDATE alembic_version SET version_num = 'abc1234567' WHERE version_num = 'def4567890';

Test the migration on a production data snapshot (AWS RDS read replica) before deploying to production.

Continuous Integration for Migrations

Add migration checks to CI:

# Verify all migrations can run on a fresh database
alembic upgrade head
# Check for syntax errors
python -m py_compile migrations/versions/*.py

In your GitHub Actions or GitLab CI:

test-migrations:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: saas_test
steps:
- uses: actions/checkout@v3
- name: Run migrations
env:
DATABASE_URL: postgresql://postgres:postgres@localhost/saas_test
run: alembic upgrade head

Key Takeaways

  • Migrations are version-controlled SQL changes; Alembic auto-generates them from SQLAlchemy models.
  • Always write custom migrations for data transformations (backfills, aggregations).
  • Test migrations on production-size datasets (RDS read replica) before deploying.
  • Zero-downtime migrations use temporary columns; apply code and schema changes in separate steps.
  • Store migration state in alembic_version table; verify before manual edits.

Frequently Asked Questions

Can I auto-generate migrations for a production database?

Yes, but review them carefully. Alembic may not detect all changes (e.g., changes to indexes, constraints). Always review the generated SQL, especially on production-bound changes.

What if I need to skip a migration?

Rarely necessary. If you need to, manually update alembic_version to the desired revision, then apply the next one. This is dangerous; prefer creating a new migration that achieves the same end state.

How do I handle migrations for different database vendors (PostgreSQL, MySQL)?

Alembic supports conditional migrations using the context.get_context().dialect.name to branch logic. Write vendor-specific migrations in separate files or use conditional blocks in a single migration.

Can I run migrations from Python code, not just the CLI?

Yes. Use alembic.config.Config() and alembic.command.upgrade():

from alembic.config import Config
from alembic import command

alembic_cfg = Config("alembic.ini")
alembic_cfg.set_main_option("sqlalchemy.url", DATABASE_URL)
command.upgrade(alembic_cfg, "head")

What if a migration takes too long and locks the table?

Use PostgreSQL's CONCURRENTLY option for indexes:

op.create_index(op.f("idx_users_email"), "users", ["email"], postgresql_concurrently=True)

Or manually run it outside the transaction window during maintenance.

Further Reading