Skip to main content

SQLAlchemy ORM Setup & Configuration Guide

SQLAlchemy ORM is a Python library that maps database tables to Python classes, letting you query and manipulate data using objects instead of raw SQL. Installation and configuration take under five minutes and unlock access to Pythonic database interactions for any relational database.

Installing SQLAlchemy 2.0

SQLAlchemy 2.0 (released in 2023) standardized the ORM API and deprecates legacy patterns. Begin by installing SQLAlchemy and a database driver. Open your terminal and run:

pip install sqlalchemy[asyncio] psycopg2-binary

This installs SQLAlchemy 2.0+ with async support and the PostgreSQL driver. For SQLite (no extra driver needed) or MySQL, adjust:

# SQLite (built into Python, no driver required)
pip install sqlalchemy

# MySQL
pip install sqlalchemy[asyncio] pymysql

Verify the installation:

import sqlalchemy
print(sqlalchemy.__version__) # Should output 2.0.0 or higher

Creating Your First Database Engine

A database engine is SQLAlchemy's connection pool—it manages how your application connects to the database. Create a new file db.py:

from sqlalchemy import create_engine

# SQLite (file-based, perfect for learning)
engine = create_engine("sqlite:///mydatabase.db", echo=True)

# PostgreSQL
engine = create_engine(
"postgresql+psycopg2://user:password@localhost:5432/mydb",
echo=True
)

# MySQL
engine = create_engine(
"mysql+pymysql://user:password@localhost:3306/mydb",
echo=True
)

The echo=True parameter logs every SQL statement to the console—invaluable for learning. In production, set echo=False to reduce overhead. The connection string (database URL) follows the format: dialect+driver://user:password@host:port/database.

Configuring Connection Pooling

For multi-threaded applications, configure a connection pool to reuse database connections and avoid exhausting system resources:

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

engine = create_engine(
"postgresql+psycopg2://user:password@localhost:5432/mydb",
poolclass=QueuePool,
pool_size=10, # Number of connections to maintain
max_overflow=20, # Additional connections when pool is exhausted
pool_recycle=3600, # Recycle connections after 1 hour (prevents timeout)
echo=False
)

This configuration maintains 10 active connections and creates up to 20 more if demand spikes. The pool_recycle parameter is critical for cloud databases (RDS, Cloud SQL) that time out idle connections.

Setting Up the Declarative Base

The declarative base is a Python class from which all your models inherit. It ties your model definitions to the engine. In db.py, add:

from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()
engine = create_engine("sqlite:///mydatabase.db", echo=True)

def init_db():
"""Create all tables defined in models."""
Base.metadata.create_all(engine)

def get_session():
"""Return a new database session."""
return Session(engine)

The declarative_base() function returns a class that acts as a registry for all your models. When you define a model and inherit from Base, SQLAlchemy automatically tracks the class and maps it to a database table. Calling Base.metadata.create_all(engine) uses the engine to create all tables in the database if they don't exist.

Verifying Your Setup

Create a simple script test_connection.py to verify everything works:

from db import engine, init_db, get_session

# Create all tables (none exist yet, but this verifies the connection)
init_db()

# Test the connection
with get_session() as session:
result = session.execute("SELECT 1 AS test")
print(result.fetchone()) # Should print (1,)

print("Connection successful!")

Run the script:

python test_connection.py

If you see "Connection successful!" printed and (with echo=True) SQL statements logged, your setup is complete.

Key Takeaways

  • SQLAlchemy ORM maps database tables to Python classes, eliminating raw SQL boilerplate.
  • Install via pip: pip install sqlalchemy[asyncio] psycopg2-binary (adjust driver for your database).
  • Create an engine using create_engine() with a database URL (connection string).
  • Configure pooling for production to reuse connections and avoid resource exhaustion.
  • Declarative base (declarative_base()) is the registry for all your models and connects them to the engine.
  • Verify setup by calling Base.metadata.create_all(engine) and testing a simple query.

Frequently Asked Questions

What is the difference between echo=True and echo=False?

With echo=True, SQLAlchemy logs every SQL statement to console—useful for learning and debugging. In production, set echo=False to eliminate the logging overhead. You can also pass a Python logger to echo for more control.

Why do I need a connection pool?

A connection pool reuses database connections across requests instead of creating a new one each time. Without pooling, opening 1,000 concurrent requests would open 1,000 connections, overwhelming your database. A pool with pool_size=10 and max_overflow=20 handles spikes efficiently.

Can I change my database after setup?

Yes, but it's unusual. The connection string is isolated in the engine creation, so you can change it before initializing Base.metadata.create_all(engine). Once tables are created, migrate using tools like Alembic.

What is pool_recycle and why is it important?

Some cloud databases (AWS RDS, Google Cloud SQL) close idle connections after a timeout (typically 15–30 minutes). pool_recycle=3600 tells SQLAlchemy to discard and recreate connections older than 1 hour, preventing "connection lost" errors on long-running applications.

Is SQLite suitable for production?

SQLite is excellent for single-process applications and development but is not ideal for high-concurrency production systems. Use PostgreSQL or MySQL for production web services. SQLite is perfect for learning, testing, and desktop applications.

Further Reading