SQLAlchemy Querying: SELECT Statements & Filtering
SQLAlchemy queries retrieve data from the database using Python syntax instead of raw SQL. A query builds a SELECT statement incrementally by chaining methods like .filter(), .order_by(), and .limit(). Understanding how to construct queries efficiently is essential for any ORM application.
Basic Query Structure
All SQLAlchemy queries follow the same pattern: create a query object, chain filtering/ordering/limiting methods, then execute. The most common execution methods are .all() (returns a list), .first() (returns one or None), and .scalar() (returns a single value):
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from models import Base, Author
engine = create_engine("sqlite:///library.db")
Base.metadata.create_all(engine)
with Session(engine) as session:
# Fetch all authors
all_authors = session.query(Author).all()
# Fetch the first author
first_author = session.query(Author).first()
# Get total count
author_count = session.query(Author).count()
print(f"Total authors: {author_count}")
The query() method starts a query chain. You pass the model class to query (e.g., Author). Then chain methods to refine the query, and finally call .all(), .first(), or .count() to execute and get results.
Filtering with .filter() and .filter_by()
The .filter() method adds WHERE clauses using comparison operators. .filter_by() is a shorthand for equality comparisons on model attributes:
# Using .filter() with operators
young_authors = session.query(Author).filter(Author.age < 40).all()
# Using .filter_by() (equality shorthand)
austen = session.query(Author).filter_by(name="Jane Austen").first()
# Chaining multiple filters (AND logic)
books = session.query(Book).filter(
Book.author_id == 1
).filter(
Book.published_year > 2020
).all()
# OR logic using .or_()
from sqlalchemy import or_
recent_or_old = session.query(Book).filter(
or_(Book.published_year > 2020, Book.published_year < 1950)
).all()
# NOT logic
not_austen = session.query(Author).filter(
~(Author.name == "Jane Austen")
).all()
Comparison operators include ==, !=, <, >, <=, >=, .ilike() (case-insensitive), .contains(), .startswith(), and .endswith(). Multiple .filter() calls are combined with AND logic.
String Searches and Pattern Matching
For flexible text searches, use string methods:
# Case-insensitive search
books_with_pride = session.query(Book).filter(
Book.title.ilike("%pride%")
).all()
# Starts with
austen_books = session.query(Book).filter(
Book.title.startswith("Pride")
).all()
# Ends with
shakespeares = session.query(Author).filter(
Author.name.endswith("Shakespeare")
).all()
# Exact match (case-sensitive)
exact = session.query(Book).filter(
Book.title == "Pride and Prejudice"
).all()
The % wildcard matches any substring. ilike() is case-insensitive (useful for user input); like() is case-sensitive. Avoid wildcards at the start of patterns when possible, as they prevent index usage on some databases.
Ordering and Limiting Results
Order results by one or more columns and limit the result set:
from sqlalchemy import desc
# Order by ascending (default)
authors_alphabetical = session.query(Author).order_by(Author.name).all()
# Order by descending
recent_books = session.query(Book).order_by(desc(Book.published_year)).all()
# Order by multiple columns
books = session.query(Book).order_by(
Author.name,
desc(Book.published_year)
).all()
# Limit the number of results
first_10_books = session.query(Book).limit(10).all()
# Offset (skip first N results)
books_11_to_20 = session.query(Book).offset(10).limit(10).all()
# Pagination helper
page_num = 2
per_page = 10
books_page_2 = session.query(Book).order_by(Book.id).offset(
(page_num - 1) * per_page
).limit(per_page).all()
The order_by() method accepts model attributes (ascending) or desc(attribute) for descending. Multiple order_by() calls set a priority order. limit() restricts results, and offset() skips the first N results (useful for pagination).
Filtering on Relationships
Queries can filter on related objects using .join():
# Books by Jane Austen (join through author)
austen_books = session.query(Book).join(Author).filter(
Author.name == "Jane Austen"
).all()
# Authors with more than 5 books
from sqlalchemy import func
prolific_authors = session.query(Author).join(Author.books).group_by(
Author.id
).having(func.count(Book.id) > 5).all()
The .join() method connects the query through a foreign key relationship. Without .join(), filtering on related attributes raises an error.
Checking Existence and Counting
Test whether records exist without fetching them:
# Count records matching a condition
book_count = session.query(Book).filter(
Book.author_id == 1
).count()
# Check existence
has_pride = session.query(Book).filter(
Book.title == "Pride and Prejudice"
).first() is not None
# More efficient: use exists()
from sqlalchemy import exists
pride_exists = session.query(
exists().where(Book.title == "Pride and Prejudice")
).scalar()
The .count() method is simpler but less flexible than querying with aggregates. For existence checks, .first() is sufficient; use exists() for complex subqueries.
Distinct Results
Remove duplicates from results:
# Get unique author IDs of all books
author_ids = session.query(Book.author_id).distinct().all()
# Get authors with any book (no duplicates when joining)
authors_with_books = session.query(Author).join(Author.books).distinct().all()
The .distinct() method (available after .query() with a specific column or after .join()) removes duplicate rows.
Common Query Patterns
| Goal | Code |
|---|---|
| Get one record by ID | session.query(Author).filter_by(id=1).first() |
| Get records newer than a date | session.query(Book).filter(Book.created_at > date(2020, 1, 1)).all() |
| Get records between two values | session.query(Book).filter(Book.price.between(10, 50)).all() |
| Get records in a list | session.query(Author).filter(Author.id.in_([1, 2, 3])).all() |
| Count total records | session.query(Author).count() |
Key Takeaways
- Queries chain methods: start with
.query(Model), add.filter(),.order_by(),.limit(), then execute with.all()or.first(). .filter()uses operators (==,<,>,.ilike(),.contains()); multiple filters = AND logic..filter_by()is shorthand for equality:.filter_by(name="Austen")=.filter(Author.name == "Austen").- Order and limit: use
.order_by(),.limit(), and.offset()for sorting and pagination. .join()queries relationships to filter on related objects or join results.
Frequently Asked Questions
What's the difference between .all(), .first(), and .scalar()?
.all() returns a list of all results. .first() returns one result or None. .scalar() returns a single value (one column of one row), used for counts or subqueries.
Why does filtering on a related attribute without .join() fail?
SQLAlchemy needs to know how to connect the tables. .join() tells it the relationship path. Without a join, SQLAlchemy can't translate your filter to SQL.
How do I count records matching a filter?
Use .count(): session.query(Book).filter(Book.author_id == 1).count(). Or use func.count() in a grouped query for aggregates.
Should I use .filter() or .filter_by()?
Use .filter_by() for simple equality checks (more readable). Use .filter() with operators for anything else. Both work the same under the hood.
How do I avoid the N+1 query problem?
This is common: looping over results and accessing a related attribute triggers one query per iteration. See the eager loading article for solutions using joinedload().