Alembic with SQLAlchemy ORM: Syncing Models and Migrations
Alembic's power comes from its tight integration with SQLAlchemy's ORM. When you define a model in SQLAlchemy—a Python class representing a database table—Alembic can read your model definitions and auto-generate migrations that match. This keeps your code and database schema in perfect synchronization, preventing the "schema drift" where your models and database diverge.
The SQLAlchemy Declarative Base
All SQLAlchemy models inherit from a common base class:
# app/models.py
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, relationship
from datetime import datetime
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
username = Column(String(50), unique=True, nullable=False)
email = Column(String(100), unique=True, nullable=False)
created_at = Column(DateTime, default=datetime.utcnow)
class Post(Base):
__tablename__ = 'posts'
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
title = Column(String(200), nullable=False)
body = Column(String, nullable=False)
created_at = Column(DateTime, default=datetime.utcnow)
user = relationship(User, backref='posts')
The Base class is the registry: it holds metadata about all tables and columns. Alembic needs access to Base.metadata to see your schema.
Linking Alembic to Your Models
In migrations/env.py, import your Base and assign it to target_metadata:
# migrations/env.py
import os
from sqlalchemy import engine_from_config, pool
from alembic import context
# Import your models
from app.models import Base
# This is the critical line: tells Alembic where your models are
target_metadata = Base.metadata
# ... rest of env.py ...
Without this line, Alembic can't see your models and autogenerate won't work.
Autogenerate Workflow with Models
With models and target_metadata set up, the workflow becomes:
- Define or modify a model:
# app/models.py - add a new field to User
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
username = Column(String(50), unique=True, nullable=False)
email = Column(String(100), unique=True, nullable=False)
created_at = Column(DateTime, default=datetime.utcnow)
phone = Column(String(20)) # NEW FIELD
- Run autogenerate:
alembic revision --autogenerate -m "Add phone to users"
- Review the generated migration:
# migrations/versions/004_add_phone_to_users.py
def upgrade():
op.add_column('users', sa.Column('phone', sa.String(20), nullable=True))
def downgrade():
op.drop_column('users', 'phone')
- Apply it:
alembic upgrade head
Now your database schema matches your models exactly.
Handling Relationships
SQLAlchemy relationships (like backref, foreign_key) are typically not persisted to the database as columns. Alembic ignores them and focuses on the actual column definitions. So relationships don't trigger autogenerate changes.
However, the ForeignKey constraint does:
class Post(Base):
__tablename__ = 'posts'
user_id = Column(Integer, ForeignKey('users.id')) # This DOES trigger autogenerate
user = relationship(User) # This does NOT trigger autogenerate
When you add a ForeignKey, autogenerate produces:
op.create_foreign_key('fk_posts_user_id', 'posts', 'users', ['user_id'], ['id'])
Common Model-to-Migration Patterns
| Model Change | Autogenerate Produces | Review For |
|---|---|---|
Add Column(String(50)) | op.add_column() | Correct type, nullable |
Add ForeignKey() | op.create_foreign_key() | Correct references, cascade options |
Add unique=True | op.create_unique_constraint() | Constraint name |
Add index=True | op.create_index() | Index name and columns |
| Rename a column | op.drop_column() + op.add_column() | MANUAL: use op.alter_column() instead |
Change String(50) to String(100) | op.alter_column() | Type change correctness |
Add nullable=False to existing column | op.alter_column() | Data must exist first! |
The Nullable Gotcha
A common mistake: you add a column to your model with nullable=False, but your database already has existing rows. Autogenerate produces:
op.add_column('users', sa.Column('status', sa.String(20), nullable=False))
This fails in production because existing rows have no value for status. Fix it by:
- Adding the column as
nullable=Truefirst - Populating values (data migration)
- Making it
nullable=Falsein a follow-up migration
Or in one migration:
def upgrade():
# Add as nullable
op.add_column('users', sa.Column('status', sa.String(20)))
# Populate with a default
op.execute("UPDATE users SET status = 'active'")
# Now make it non-nullable
op.alter_column('users', 'status', nullable=False, existing_type=sa.String(20))
Preventing Schema Drift
Schema drift happens when your models and database disagree. For example:
- You add a column to a model but forget to run migrations
- Someone runs a manual ALTER TABLE on production
- You revert a migration but keep the model change
To detect drift early:
# In your app startup (e.g., app.py)
from sqlalchemy import inspect
from app.models import Base, User, Post
def check_schema_drift():
"""Compare model definitions to database schema."""
inspector = inspect(engine)
for model in [User, Post]:
table_name = model.__tablename__
db_columns = {col['name'] for col in inspector.get_columns(table_name)}
model_columns = {col.name for col in model.__table__.columns}
if db_columns != model_columns:
print(f"SCHEMA DRIFT in {table_name}!")
print(f" Database has: {db_columns}")
print(f" Model expects: {model_columns}")
raise RuntimeError("Run migrations to sync schema")
# Call during startup
check_schema_drift()
Using Sessions to Test Migrations
After applying a migration, test your models against the new schema:
# In your migration test file
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from app.models import Base, User
def test_migration_adds_phone_column():
engine = create_engine('sqlite:///test.db')
Base.metadata.create_all(engine) # Create fresh schema
# Try to create a User with the new phone field
with Session(engine) as session:
user = User(username='alice', email='[email protected]', phone='555-1234')
session.add(user)
session.commit()
assert user.phone == '555-1234'
Organizing Models for Large Projects
For projects with many models, organize them in separate files:
app/
models/
__init__.py
base.py # Define Base
users.py # User model
posts.py # Post model
comments.py # Comment model
db.py
app.py
# app/models/base.py
from sqlalchemy.orm import DeclarativeBase
class Base(DeclarativeBase):
pass
# app/models/__init__.py
from app.models.base import Base
from app.models.users import User
from app.models.posts import Post
from app.models.comments import Comment
__all__ = ['Base', 'User', 'Post', 'Comment']
# migrations/env.py
from app.models import Base # Imports all models transitively
target_metadata = Base.metadata
Key Takeaways
- Define all models using a common
DeclarativeBase(Base) class - Set
target_metadata = Base.metadatainenv.pyso Alembic can read your models - Use
alembic revision --autogenerateto generate migrations that match your models - Always review generated migrations; autogenerate has limitations
- Handle nullable columns carefully: add nullable, populate, then constrain
- Test migrations against your models to ensure they work together
- Check for schema drift in development to catch out-of-sync models early
Frequently Asked Questions
Can I have multiple Base classes?
Technically yes, but don't. Use a single Base for all models in your project. If you have multiple inheritance hierarchies, they should still inherit from the same root Base class.
What if I import a model inside env.py and get circular imports?
Reorganize your imports. Move model definitions into their own module that doesn't import your app logic. Then env.py can safely import models without triggering app initialization.
Why does autogenerate sometimes produce an empty migration?
If your models exactly match your database, autogenerate finds no differences and produces an empty migration. This is expected. You can delete the empty file from versions/ or keep it for documentation. In CI, delete empty migrations automatically to avoid clutter.
Can I use alembic without defining models?
Yes. For raw SQL databases or when you don't use SQLAlchemy ORM, write migrations by hand. Don't set target_metadata to model metadata; instead, let Alembic read your database schema directly (slower and less reliable).