Connection Pooling in Python Databases
Creating a new database connection is expensive: it involves a TLS handshake, authentication, and initialization—taking 10–100 milliseconds on average. For a web application serving thousands of users, creating a fresh connection for every request quickly bottlenecks the server. Connection pooling solves this by maintaining a set of reusable connections, allowing your code to borrow from the pool, use it, and return it without the overhead of creation and teardown.
This tutorial shows you how to implement connection pooling with psycopg-pool, SQLAlchemy, and why it matters for production systems.
Why Connection Pooling Matters
Consider a web application serving 1000 requests per second. Without pooling, each request creates a connection (10 ms) and closes it (5 ms). That's 15 ms overhead per request—significant. With pooling, each request borrows a connection from a pool of 10–50 idle connections. The overhead drops to microseconds.
Beyond performance, pooling provides:
- Resource limits: The pool caps the number of connections (e.g., 20 max), preventing your database server from being overwhelmed.
- Recovery: If a connection fails silently, the pool detects it and creates a replacement.
- Fairness: All requests wait fairly for available connections, with a queue.
A typical pool has min_size idle connections and grows to max_size under load:
Idle: [conn1] [conn2] [conn3] (min_size=3)
Under load: [conn1(active)] [conn2(active)] [conn3] [conn4] [conn5] (max_size=5)
Connection Pooling with psycopg-pool
The psycopg library (version 3+) includes a built-in pool via psycopg-pool:
pip install "psycopg[pool]"
Here's a simple pool:
from psycopg_pool import ConnectionPool
# Create a pool with 5–10 connections
pool = ConnectionPool(
'postgresql://user:password@localhost:5432/myapp',
min_size=5,
max_size=10
)
# Borrow a connection, use it, return it
with pool.connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT COUNT(*) FROM users')
count = cursor.fetchone()[0]
print(f"Total users: {count}")
# Connection is returned to the pool when exiting the with block
# Clean up when done
pool.close()
The with pool.connection() statement is the key: it borrows a connection, executes queries, and automatically returns it to the pool. If the connection fails or is stale, the pool detects it and creates a replacement.
For a production web application, create the pool once at startup:
from fastapi import FastAPI, Depends
from psycopg_pool import ConnectionPool
app = FastAPI()
# Create pool at startup
pool = None
@app.on_event("startup")
async def startup():
global pool
pool = ConnectionPool(
'postgresql://user:password@localhost:5432/myapp',
min_size=10,
max_size=20
)
@app.on_event("shutdown")
async def shutdown():
pool.close()
@app.get("/users")
def list_users():
with pool.connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT id, name, email FROM users LIMIT 10')
rows = cursor.fetchall()
return rows
@app.get("/users/{user_id}")
def get_user(user_id: int):
with pool.connection() as conn:
cursor = conn.cursor()
cursor.execute('SELECT id, name, email FROM users WHERE id = %s', (user_id,))
user = cursor.fetchone()
return user
Each request borrows a connection from the pool. If all connections are in use, the request waits up to timeout seconds (default 30) for one to become available. If the wait times out, an exception is raised.
Advanced Pool Configuration
The pool supports several tuning options:
from psycopg_pool import ConnectionPool
pool = ConnectionPool(
'postgresql://user:password@localhost:5432/myapp',
min_size=5,
max_size=20,
max_waiting=10, # Queue up to 10 requests waiting for a connection
timeout=30, # Raise exception if no connection available after 30 seconds
max_idle=600, # Close idle connections after 10 minutes
max_lifetime=3600, # Replace connections after 1 hour
check=psycopg.extensions.check_lost_connection, # Detect lost connections
)
print(f"Pool has {pool.nconn} total connections")
print(f"Pool has {pool.nfree} free connections")
Key parameters:
min_size,max_size: Bounds for connection count.max_waiting: Queue limit for requests. If exceeded, raise exception.timeout: Seconds to wait for a connection. Prevents indefinite hangs.max_idle: Idle connections are closed after this many seconds (default: infinity, meaning idle connections persist).max_lifetime: Connections are replaced after this age in seconds, preventing stale connections.check: Function to detect connection health (e.g., detect lost connections).
For most applications, the defaults work well. Tune max_size based on your database server's limits and max_idle to save resources.
Connection Pooling with SQLAlchemy
SQLAlchemy is a Python ORM that abstracts database interactions. It includes built-in pooling via sqlalchemy.pool:
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool
# Create an engine with a connection pool
engine = create_engine(
'postgresql://user:password@localhost:5432/myapp',
poolclass=QueuePool,
pool_size=10, # Number of idle connections to maintain
max_overflow=20, # Allow up to 20 additional connections under load
pool_recycle=3600, # Recycle connections after 1 hour
)
# Use a connection from the pool
with engine.connect() as conn:
result = conn.execute(text('SELECT COUNT(*) FROM users'))
count = result.scalar()
print(f"Total users: {count}")
# Close the pool when done
engine.dispose()
SQLAlchemy's QueuePool is similar to psycopg-pool:
pool_size=10: Maintain 10 idle connections.max_overflow=20: Grow up to 30 total (10 + 20).pool_recycle=3600: Recycle connections after 1 hour (prevents stale connections from idle databases).
For Django applications, the database configuration in settings.py includes pooling:
# settings.py
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql',
'NAME': 'myapp',
'USER': 'user',
'PASSWORD': 'password',
'HOST': 'localhost',
'PORT': '5432',
'CONN_MAX_AGE': 600, # Reuse connections for 10 minutes
}
}
Django uses a simple connection pool internally: CONN_MAX_AGE specifies how long to reuse a connection before closing it.
Monitoring Pool Health
Track pool usage to ensure it's sized correctly:
from psycopg_pool import ConnectionPool
import time
pool = ConnectionPool(
'postgresql://user:password@localhost:5432/myapp',
min_size=5,
max_size=20,
)
# Monitor the pool
for i in range(10):
print(f"Total: {pool.nconn}, Free: {pool.nfree}, Used: {pool.nconn - pool.nfree}")
time.sleep(1)
pool.close()
If nfree is always 0 (all connections busy), increase max_size. If nconn rarely exceeds min_size, you're over-provisioned—reduce max_size to save memory.
Comparison: Pooling Strategies
| Strategy | Connections | Overhead | Complexity | Best For |
|---|---|---|---|---|
| No pooling | 1 per request | High (TLS per request) | Low | Prototypes, occasional queries |
| Simple pool (size=1) | 1 reused | Low | Very low | Single-threaded scripts |
| Medium pool (10–20) | 10–20 total | Very low | Medium | Web apps (100–10k req/s) |
| Large pool (50+) | 50+ total | Very low | Medium | High-throughput systems (10k+ req/s) |
Key Takeaways
- Connection pooling reuses database connections, reducing the 10–100 ms overhead of creating a new connection per request.
- Use
ConnectionPool(psycopg) orcreate_engine(SQLAlchemy) to manage pools in production. - Configure
min_sizeandmax_sizebased on expected load and database limits. - Always use
with pool.connection()or equivalent to return connections to the pool. - Monitor pool health (
nconn,nfree) and adjust sizing if needed. - Without pooling, high-concurrency applications bottleneck quickly.
Frequently Asked Questions
How many connections should my pool have?
Use this formula: pool_size = (core_count * 2) + effective_spindle_count. For a 4-core machine with one SSD, try min_size=8, max_size=12. Start conservative and increase if the pool runs out under load.
What happens if the pool runs out of connections?
By default, the request waits in a queue for a connection to become available. After timeout seconds (default 30), an exception is raised. Set max_waiting to reject requests immediately if the queue is full.
Why does my pool stay at min_size and never grow?
The load might not be high enough to trigger growth, or all requests are fast enough that connections are idle by the time the next request arrives. Monitor with pool.nconn to see actual usage.
Can I share a pool between multiple threads?
Yes, psycopg pools and SQLAlchemy engines are thread-safe. However, don't share a connection itself between threads—borrow a new connection for each thread.
Should I set max_idle or max_lifetime?
max_idle closes connections that haven't been used in a while (saves memory). max_lifetime replaces all connections after an age (prevents stale connections after server restarts). Both are optional; the defaults work for most applications.
How do I monitor connection pool exhaustion in production?
Log the pool state periodically:
import logging
from psycopg_pool import ConnectionPool
pool = ConnectionPool(...)
logging.info(f"Pool: {pool.nconn} total, {pool.nfree} free")
Alternatively, use application performance monitoring (APM) tools like Sentry or New Relic to track connection counts.