LRU Cache in Python: Memoization for Speed
Memoization—caching function results—transforms exponential algorithms into linear ones. Python's functools.lru_cache decorator caches results automatically, turning O(2ⁿ) recursive fibonacci into O(n) with a single line. Understanding when and how to cache results is a powerful optimization technique.
The Problem: Redundant Computation
Recursive algorithms often recalculate the same subproblems. Naive fibonacci illustrates this:
# O(2ⁿ) — redundant calculations
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
# fibonacci(5) calls:
# fibonacci(4) + fibonacci(3)
# = (fibonacci(3) + fibonacci(2)) + (fibonacci(2) + fibonacci(1))
# = fibonacci(3) called twice, fibonacci(2) called three times
To compute fibonacci(40), Python recalculates fibonacci(5) billions of times. Each recalculation is wasted work.
The Solution: LRU Cache
The functools.lru_cache decorator caches results. The first call computes; subsequent calls with the same arguments return the cached result:
from functools import lru_cache
# O(n) — each subproblem computed once
@lru_cache(maxsize=None)
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
# Benchmark
import timeit
time_uncached = timeit.timeit(lambda: fibonacci.__wrapped__(35), number=1)
time_cached = timeit.timeit(lambda: fibonacci(35), number=1)
print(f"Uncached: {time_uncached:.2f}s")
print(f"Cached: {time_cached:.4f}s")
Output (typical):
Uncached: 3.52s
Cached: 0.0001s
Cached computation is 35,000 times faster because each fibonacci(k) is computed once and reused.
How LRU Cache Works
LRU stands for "Least Recently Used." The decorator maintains a fixed-size dictionary of cached results:
- First call with arguments
(5,): compute, store in cache{(5,): 5} - Second call with arguments
(5,): lookup in cache, return immediately - Call with different arguments
(6,): compute if not cached, store new entry - Cache full: evict the least recently used entry to make room
@lru_cache(maxsize=128)
def expensive_function(x):
return x ** 2
# First call — computes
result = expensive_function(5)
# Second call with same argument — returns cached result
result = expensive_function(5)
# Check cache statistics
print(expensive_function.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)
The cache_info() method reveals cache performance. "Hits" are cached returns; "misses" are computations.
Setting maxsize
The maxsize parameter controls cache capacity:
# Unlimited cache — use with caution
@lru_cache(maxsize=None)
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
# Fixed-size cache — bounded memory
@lru_cache(maxsize=128)
def translate_word(word, language):
# Expensive API call
return api.translate(word, language)
# No cache — maxsize=0
@lru_cache(maxsize=0)
def this_function_is_not_cached():
return "never cached"
For production, always set a reasonable maxsize. maxsize=None means unbounded, risking memory leaks if your function is called with many unique arguments.
Practical Example: Expensive Computation
Caching saves real time in practical scenarios:
import requests
from functools import lru_cache
import time
# Simulated slow API call
@lru_cache(maxsize=1000)
def get_user_data(user_id):
"""Fetch user data — caches results."""
time.sleep(1) # simulate network latency
return {'id': user_id, 'name': f'User {user_id}'}
# First call — 1 second
start = time.time()
data = get_user_data(1)
print(f"First call: {time.time() - start:.2f}s")
# Second call — near instant
start = time.time()
data = get_user_data(1)
print(f"Second call: {time.time() - start:.4f}s")
# Check cache
print(get_user_data.cache_info())
Output:
First call: 1.00s
Second call: 0.0001s
CacheInfo(hits=1, misses=1, maxsize=1000, currsize=1)
In real applications (database queries, API calls, complex calculations), cached calls run 100-1000 times faster.
Caching with Multiple Arguments
LRU cache works with functions of any arity:
@lru_cache(maxsize=256)
def multiply(a, b):
return a * b
result1 = multiply(3, 4) # computes 12
result2 = multiply(3, 4) # returns cached 12
result3 = multiply(4, 3) # computes 12 (different argument order)
print(multiply.cache_info())
# CacheInfo(hits=1, misses=2, maxsize=256, currsize=2)
The cache key is the tuple of all arguments. multiply(3, 4) and multiply(4, 3) are different cache entries.
Important: Arguments Must Be Hashable
LRU cache uses arguments as dictionary keys, so they must be hashable (immutable):
@lru_cache(maxsize=128)
def process_numbers(numbers):
# TypeError: unhashable type: 'list'
return sum(numbers)
# Solution: convert to tuple (immutable, hashable)
@lru_cache(maxsize=128)
def process_numbers(numbers):
return sum(numbers)
process_numbers((1, 2, 3)) # OK — tuple is hashable
process_numbers([1, 2, 3]) # TypeError — list is not hashable
For non-hashable arguments, convert to equivalent hashable types (tuple instead of list, frozenset instead of set).
Clearing and Resetting Cache
Use cache_clear() to reset cached results:
@lru_cache(maxsize=128)
def expensive_query(query):
return database.query(query)
# Cache grows as queries are made
expensive_query("SELECT * FROM users")
expensive_query("SELECT * FROM posts")
# Clear all cached results (useful after database updates)
expensive_query.cache_clear()
# Cache is now empty
print(expensive_query.cache_info()) # currsize=0
This is useful when the cached data is no longer valid (after a database update, for example).
Using Cache in Web Applications
Web servers often cache expensive computations:
from flask import Flask
from functools import lru_cache
app = Flask(__name__)
@lru_cache(maxsize=500)
def get_product_details(product_id):
"""Expensive database/API call."""
return database.get_product(product_id)
@app.route('/product/<int:product_id>')
def product_page(product_id):
details = get_product_details(product_id)
return render_template('product.html', product=details)
# Clear cache when product is updated
@app.route('/admin/product/<int:product_id>/update', methods=['POST'])
def update_product(product_id):
# Update database
database.update_product(product_id, request.data)
# Invalidate cache
get_product_details.cache_clear()
return {'status': 'updated'}
The first request to /product/123 computes and caches the details. Subsequent requests return instantly. After an update, cache_clear() forces recalculation.
Monitoring Cache Performance
Track cache effectiveness to optimize maxsize:
@lru_cache(maxsize=100)
def expensive_operation(x):
return x ** 2
# Simulate workload
for i in range(10000):
expensive_operation(i % 50) # 50 unique arguments
info = expensive_operation.cache_info()
hit_rate = info.hits / (info.hits + info.misses) * 100
print(f"Cache hit rate: {hit_rate:.1f}%")
print(f"Current size: {info.currsize}/{info.maxsize}")
If hit rate is low, increase maxsize. If currsize never exceeds 20 but maxsize is 100, you can lower maxsize to save memory.
Cache Decorator from Scratch
Understanding what @lru_cache does clarifies the concept:
from functools import wraps
def simple_cache(func):
"""Minimal caching decorator — not LRU, just a simple dict."""
cache = {}
@wraps(func)
def wrapper(*args, **kwargs):
# Create a cache key from arguments
key = (args, tuple(kwargs.items()))
if key in cache:
return cache[key]
# Compute and cache result
result = func(*args, **kwargs)
cache[key] = result
return result
wrapper.cache = cache
return wrapper
@simple_cache
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
# Works like lru_cache
print(fibonacci(30))
print(fibonacci.cache)
This simplified version shows the pattern. Real lru_cache adds eviction policy and more sophisticated caching.
Key Takeaways
- LRU cache transforms exponential algorithms like recursive fibonacci from O(2ⁿ) to O(n) with one decorator
- Cache results of expensive operations (API calls, database queries, computations) for 10-1000x speedup
- Set
maxsizeto a reasonable bound (128-1000) for production to prevent memory growth - Cache arguments must be hashable; convert lists to tuples if needed
- Monitor cache hit rate with
cache_info()to tunemaxsize - Clear cache with
cache_clear()when underlying data changes
Frequently Asked Questions
What's the memory overhead of LRU cache?
Each cached result uses O(1) memory (a dictionary entry) plus the space of the result value. If you cache 1000 large objects, expect to use that much extra memory. maxsize bounds this. Monitor currsize to ensure you're not caching more than expected.
Can I use LRU cache on class methods?
Yes, but arguments must be hashable. self is not hashable for bound methods. Use the decorator on functions, not instance methods, or make it work with @cached_property for single values per instance.
What if I need to cache based on some arguments but not others?
Create a wrapper that caches only relevant arguments:
@lru_cache(maxsize=128)
def cached_operation(x, y, _metadata=None):
# Only x and y are cache keys; _metadata is ignored
return x + y
# These return the same cached result
cached_operation(1, 2, _metadata='info1')
cached_operation(1, 2, _metadata='info2')
Is LRU cache thread-safe?
In CPython, due to the Global Interpreter Lock (GIL), basic operations are atomic. However, for multi-threaded applications, explicit locking is safer. Use threading.Lock if your cache modifies shared state.
How do I invalidate specific cache entries, not the whole cache?
@lru_cache doesn't support selective invalidation. Use a custom cache implementation or third-party libraries like cachetools for more control.