Redis Strings and Hashes: Caching Patterns
Redis provides a small set of data types, including strings (single values), hashes (field-value maps), sets (unordered collections of unique members), and sorted sets (ranked collections). Strings and hashes are the foundation of caching. A string holds a single value (a user's feed, a user's theme); a hash holds an object with multiple fields (a user profile with name, email, role).
I have cached millions of user sessions in Redis strings and product data in hashes. The performance difference between fetching from Redis (0.5 ms) and MongoDB (50 ms) is dramatic on APIs serving 10,000 requests per second. This guide teaches the patterns that work in production.
Getting Started: Installing redis-py
Install the Redis Python client and connect to a Redis server.
pip install redis
import redis
# Connect to local Redis (default: localhost:6379)
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Test connection
try:
    r.ping()
    print("Connected to Redis")
except redis.ConnectionError:
    print("Could not connect to Redis. Is it running?")
# For production, use a connection pool and URL
r = redis.Redis.from_url('redis://localhost:6379/0', decode_responses=True)
Set decode_responses=True to automatically convert Redis bytes to Python strings. Without it, every value is bytes and requires .decode().
String Caching: Sessions, Counters, and Feeds
Strings are the simplest Redis data type. Use them for atomic values like user sessions, feature flags, or JSON-serialized objects.
import json
from datetime import datetime, timedelta
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Store a simple counter
r.set('user:123:login_count', '42')
count = r.get('user:123:login_count')
print(f"Login count: {int(count)}")
# Increment a counter atomically (no race conditions)
r.incr('user:123:login_count') # Increments by 1
r.incrby('user:123:login_count', 5) # Increments by 5
r.decr('user:123:login_count') # Decrements by 1
# Store a session with expiration (1 hour)
session = {
    'user_id': 123,
    'email': '[email protected]',
    'role': 'admin',
    'login_time': datetime.now().isoformat()
}
token = 'example-session-token'  # Placeholder value; issue a random token per session in practice
session_key = f"session:{token}"
r.setex(session_key, 3600, json.dumps(session)) # Expires after 1 hour
# Retrieve and parse the session
session_json = r.get(session_key)
if session_json:
    session = json.loads(session_json)
    print(f"User: {session['email']}, Role: {session['role']}")
else:
    print("Session expired")
# Store multiple keys at once
r.mset({
    'user:1:theme': 'dark',
    'user:1:language': 'python',
    'user:1:notifications_enabled': 'true'
})
# Retrieve multiple keys
values = r.mget('user:1:theme', 'user:1:language')
print(f"Theme: {values[0]}, Language: {values[1]}")
# Check if a key exists
if r.exists('user:123:preferences'):
    print("Preferences cached")
# Delete a key
r.delete('user:123:preferences')
# Set a key only if it does not exist (useful for locks)
r.setnx('lock:critical_section', '1') # Returns True if set, False if key exists
The setex() method combines set() and expire() in a single atomic command, so a key is never left behind without a TTL. For counters, always use incr() rather than reading the value, adding to it in Python, and writing it back: that read-modify-write sequence loses updates when several processes touch the same counter.
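The setnx() lock shown above never expires, so a process that crashes while holding it blocks everyone else. A minimal sketch of a safer variant follows, assuming a redis-py style client is passed in as `client`; the helper names `acquire_lock` and `release_lock` are illustrative, not part of redis-py. It relies on `set()` with `nx=True` and `ex=...`, which sets the key only if it is absent and attaches a TTL in the same command.

```python
import secrets

def acquire_lock(client, name, ttl_seconds=30):
    """Try to take the lock; return an owner token on success, else None."""
    token = secrets.token_hex(16)
    # NX: only set if the key does not exist. EX: expire after ttl_seconds.
    acquired = client.set(f'lock:{name}', token, nx=True, ex=ttl_seconds)
    return token if acquired else None

def release_lock(client, name, token):
    """Release the lock only if we still own it.

    Note: this check-then-delete is not atomic. It assumes the client
    returns strings (decode_responses=True) so the comparison works.
    """
    key = f'lock:{name}'
    if client.get(key) == token:
        client.delete(key)
        return True
    return False

# Usage (assumes `r` is a connected client):
# token = acquire_lock(r, 'critical_section')
# if token:
#     try:
#         ...  # do the protected work
#     finally:
#         release_lock(r, 'critical_section', token)
```

Storing a random token as the value lets the owner verify that the lock is still its own before deleting it, which matters if the TTL elapsed and another process took the lock in the meantime.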
Hash Caching: Storing Objects with Fields
Hashes are ideal for caching objects with multiple fields. Instead of serializing an object to JSON, store each field separately in Redis.
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Store a user profile as a hash
user_id = 123
r.hset(f'user:{user_id}', mapping={
    'name': 'Alice Chen',
    'email': '[email protected]',
    'role': 'admin',
    'created_at': datetime.now().isoformat(),
    'subscription_level': 'premium'
})
# Retrieve the entire hash
user_hash = r.hgetall(f'user:{user_id}')
print(f"User: {user_hash['name']}, Email: {user_hash['email']}")
# Retrieve a single field
email = r.hget(f'user:{user_id}', 'email')
print(f"Email: {email}")
# Retrieve multiple fields
fields = r.hmget(f'user:{user_id}', 'name', 'email', 'role')
print(f"Name: {fields[0]}, Email: {fields[1]}, Role: {fields[2]}")
# Check if a field exists
if r.hexists(f'user:{user_id}', 'email'):
    print("Email field exists")
# Update a single field
r.hset(f'user:{user_id}', 'subscription_level', 'premium_plus')
# Increment a numeric field
r.hincrby(f'user:{user_id}', 'post_count', 1) # Increment by 1
r.hincrbyfloat(f'user:{user_id}', 'avg_rating', 0.5) # Increment by 0.5
# Get all keys in a hash
keys = r.hkeys(f'user:{user_id}')
print(f"Hash keys: {keys}")
# Get all values in a hash
values = r.hvals(f'user:{user_id}')
# Get the number of fields
field_count = r.hlen(f'user:{user_id}')
print(f"Hash has {field_count} fields")
# Delete a field from a hash
r.hdel(f'user:{user_id}', 'subscription_level')
Hashes are more flexible than strings for multi-field data. Instead of serializing a whole object (one JSON string) and deserializing it to change one field, you update a single hash field atomically. This is faster and supports partial fetches (e.g., hmget() for just name and email without loading the whole object).
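One practical consequence of storing fields separately is that every hash value comes back as a string, and redis-py rejects values such as None or bool when writing. The sketch below shows one way to convert between a Python dict and a hash mapping; the helpers `to_hash_mapping`, `from_hash`, and `parse_bool`, and the example fields `is_active` and `nickname`, are hypothetical names chosen for illustration.

```python
def to_hash_mapping(obj):
    """Flatten a dict into string values suitable for hset(..., mapping=...)."""
    mapping = {}
    for field, value in obj.items():
        if value is None:
            continue  # Skip missing values instead of storing the string 'None'
        if isinstance(value, bool):
            mapping[field] = 'true' if value else 'false'
        else:
            mapping[field] = str(value)
    return mapping

def parse_bool(raw):
    """Inverse of the bool encoding used in to_hash_mapping."""
    return raw == 'true'

def from_hash(mapping, casts):
    """Convert hgetall() output back to Python types.

    `casts` maps field name -> callable; fields without an entry stay strings.
    """
    return {field: casts.get(field, str)(raw) for field, raw in mapping.items()}

profile = {'name': 'Alice Chen', 'post_count': 7, 'is_active': True, 'nickname': None}
mapping = to_hash_mapping(profile)
print(mapping)  # {'name': 'Alice Chen', 'post_count': '7', 'is_active': 'true'}

restored = from_hash(mapping, {'post_count': int, 'is_active': parse_bool})
print(restored)  # {'name': 'Alice Chen', 'post_count': 7, 'is_active': True}
```

With a live client, `mapping` would be passed to `r.hset(key, mapping=mapping)` and `from_hash` applied to the result of `r.hgetall(key)`.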
Pattern: Rate Limiting with Strings
Use Redis strings with atomic increments to rate-limit API requests.
from time import time
def is_rate_limited(user_id, limit=100, window_seconds=60):
    """Check if user has exceeded rate limit. Returns True if limited."""
    key = f'rate_limit:{user_id}:{int(time()) // window_seconds}'
    # Increment request count
    count = r.incr(key)
    # Set expiration on first request in this window
    if count == 1:
        r.expire(key, window_seconds)
    return count > limit
# In your API handler (handle_request is a placeholder name):
def handle_request(user_id):
    if is_rate_limited(user_id):
        return {"error": "Rate limit exceeded"}, 429
    # Process the request
    return {"status": "ok"}, 200
The key includes the current time window (int(time()) // 60 gives a new key every 60 seconds). When the window changes, a new key is created automatically. When the old key expires, Redis deletes it.
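The window arithmetic is easy to check by hand. The sketch below isolates it in two small functions that take the timestamp as an argument; `window_key` and `seconds_until_reset` are illustrative names, and the timestamps are arbitrary example values.

```python
def window_key(user_id, now, window_seconds=60):
    """Build the rate-limit key for the fixed window containing `now`."""
    return f'rate_limit:{user_id}:{int(now) // window_seconds}'

def seconds_until_reset(now, window_seconds=60):
    """Seconds remaining before the current window rolls over."""
    return window_seconds - (int(now) % window_seconds)

# 1_700_000_000 // 60 == 28_333_333, with 20 seconds already elapsed in that window
print(window_key(123, 1_700_000_000))      # rate_limit:123:28333333
print(window_key(123, 1_700_000_039))      # rate_limit:123:28333333 (same window)
print(window_key(123, 1_700_000_040))      # rate_limit:123:28333334 (next window)
print(seconds_until_reset(1_700_000_000))  # 40
```

The remaining-seconds value could be returned to clients in a Retry-After header alongside the 429 response, if the API exposes one.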
Pattern: JSON Caching in Strings
For complex objects, serialize to JSON and store in a string with a TTL.
import json
# Cache a user's feed (expensive MongoDB query result)
user_id = 123
feed = fetch_feed_from_mongodb(user_id) # Expensive operation
# Cache for 5 minutes
cache_key = f'feed:{user_id}'
r.setex(cache_key, 300, json.dumps(feed))
# On next request, retrieve from cache
cached_feed = r.get(cache_key)
if cached_feed:
    feed = json.loads(cached_feed)
    print("Served from cache")
else:
    feed = fetch_feed_from_mongodb(user_id)
    r.setex(cache_key, 300, json.dumps(feed))
    print("Fetched from database")
# Invalidate cache when feed is updated
r.delete(cache_key)
This is the standard cache-aside pattern: check Redis first, fall back to the database, update Redis on miss. For write-heavy data, use cache invalidation (delete the key on writes) rather than relying on TTLs.
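The check, fall back, and repopulate steps can be wrapped in one reusable function so every call site follows the same path. This is a minimal sketch, assuming a redis-py style client passed in as `client` and a caller-supplied `loader` callable that performs the expensive query; `get_or_load_json` and `invalidate` are hypothetical helper names.

```python
import json

def get_or_load_json(client, key, ttl_seconds, loader):
    """Cache-aside read: return the cached value, or load, cache, and return it."""
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)  # Cache hit
    value = loader()               # Cache miss: run the expensive query
    client.setex(key, ttl_seconds, json.dumps(value))
    return value

def invalidate(client, key):
    """Drop the cached entry after the underlying data changes."""
    client.delete(key)

# Usage (assumes `r` is a connected client and fetch_feed_from_mongodb exists):
# feed = get_or_load_json(r, f'feed:{user_id}', 300,
#                         lambda: fetch_feed_from_mongodb(user_id))
```

Comparing against None rather than relying on truthiness keeps a cached empty result from being treated as a miss.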
Pattern: Leaderboard with Hashes
Keep a simple leaderboard (scoreboard) by storing each player's score as a hash field.
# Store scores in a hash (game_id -> {user_id -> score})
game_id = 'chess_tournament_2024'
# Record scores
r.hset(f'leaderboard:{game_id}', mapping={
    'alice': 2500,
    'bob': 2400,
    'carol': 2300,
    'david': 2200
})
# Get a player's score
score = r.hget(f'leaderboard:{game_id}', 'alice')
print(f"Alice's score: {score}")
# Sort (retrieve all and sort in Python, or use sorted sets—see next article)
scores = r.hgetall(f'leaderboard:{game_id}')
sorted_scores = sorted(scores.items(), key=lambda x: int(x[1]), reverse=True)
print("Top players:")
for rank, (player, score) in enumerate(sorted_scores, start=1):
    print(f"{rank}. {player}: {score}")
For true leaderboards with ranks and range queries, use Redis sorted sets (covered in the next article). Hashes are simpler if you only need individual scores.
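When only the top few entries are needed from a hash-based leaderboard, the client-side sort can be limited to those entries. The sketch below works on a plain dict shaped like the output of hgetall() with decode_responses=True (string scores); `top_n` is an illustrative helper name, and the whole hash is still transferred from Redis before ranking.

```python
import heapq

def top_n(scores, n):
    """Return the n highest (player, score) pairs from an hgetall()-style dict."""
    parsed = ((player, int(raw)) for player, raw in scores.items())
    return heapq.nlargest(n, parsed, key=lambda item: item[1])

scores = {'alice': '2500', 'bob': '2400', 'carol': '2300', 'david': '2200'}
print(top_n(scores, 2))  # [('alice', 2500), ('bob', 2400)]
```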
Key Takeaways
- Store atomic values (sessions, counters) in Redis strings; use setex() for automatic expiration and incr()/decr() for atomic increments
- Use hashes to cache objects with multiple fields (user profiles, product data); update individual fields without reserializing
- Serialize complex objects to JSON for string caching; always use setex() to avoid unbounded cache growth
- Rate limit with string keys including time windows; Redis's atomic increment prevents race conditions
- Use mset() and mget() for bulk operations to reduce network round-trips
Frequently Asked Questions
Should I cache in Redis or MongoDB?
Cache in Redis if you need sub-millisecond latency. By the figures quoted earlier (50 ms versus 0.5 ms), MongoDB is roughly 100x slower, but it is durable and survives restarts. Typical pattern: cache hot data in Redis (user feeds, sessions), persist to MongoDB for durability and analysis.
What happens to my data if Redis restarts?
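It depends on the persistence settings. With persistence disabled, all data is lost. With RDB (snapshots), which the stock configuration typically enables, you lose the writes made since the last snapshot; with AOF (append-only file), you lose at most the writes since the last fsync. For sessions and short-lived caches where losing data is acceptable, disable persistence to save I/O. For caches that are expensive to rebuild, enable AOF.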
Is JSON caching in strings as fast as hashes?
For single-field lookups, hashes are faster (no JSON parsing). For whole-object retrieval, the difference is negligible. Use hashes for frequently updated fields (user roles, scores); use strings with JSON for read-heavy, infrequently updated objects.
How do I cache with multiple app servers?
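All servers connect to the same Redis deployment. Redis executes commands one at a time, so single commands such as incr() are safe under concurrent access from many servers. For high availability, use Sentinel (automatic failover) or Cluster (data sharded across multiple Redis nodes).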
Can I set expiration on individual hash fields?
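Not on older servers: before Redis 7.4, expiration is per key, not per field. Redis 7.4 added per-field commands such as HEXPIRE, so check your server and client versions before relying on them. Otherwise, if you need field-level expiration, store each field as a separate string key with its own TTL.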
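The separate-key approach can be sketched as follows, assuming a redis-py style client passed in as `client`. The key layout `user:{id}:{field}` and the helpers `field_key`, `set_field`, and `get_fields` are illustrative choices, not an established convention.

```python
def field_key(user_id, field):
    """One string key per field, so each field can carry its own TTL."""
    return f'user:{user_id}:{field}'

def set_field(client, user_id, field, value, ttl_seconds):
    """Write a single field with its own expiration."""
    client.setex(field_key(user_id, field), ttl_seconds, value)

def get_fields(client, user_id, fields):
    """Fetch several fields in one round-trip; expired or missing fields are omitted."""
    keys = [field_key(user_id, f) for f in fields]
    values = client.mget(keys)
    return {f: v for f, v in zip(fields, values) if v is not None}

# Usage (assumes `r` is a connected client):
# set_field(r, 123, 'theme', 'dark', ttl_seconds=86400)  # keep for a day
# set_field(r, 123, 'otp_code', '481516', ttl_seconds=120)  # keep for two minutes
# get_fields(r, 123, ['theme', 'otp_code'])
```

The trade-off is more keys and no single-command equivalent of hgetall(); mget() recovers the bulk read as long as the caller knows the field names.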