Memory Profiling Python: Track RAM Usage Patterns
memory_profiler measures memory allocation and deallocation per line of code, revealing which operations consume RAM and where memory peaks. While CPU profiling asks "where is time spent?", memory profiling asks "where is memory allocated?" and "is memory ever released?" This is essential for finding memory leaks (memory that's allocated but never freed) and optimizing memory-intensive applications like data processing pipelines or machine learning workloads.
I discovered the power of memory profiling while debugging a long-running Python service that crashed after 8 hours with an out-of-memory error. CPU profiling showed nothing wrong; the code looked efficient. Memory profiling revealed a single list in a callback function that accumulated 50 MB per hour and was never cleared. The line-by-line memory view made the leak obvious, and clearing the list on each iteration fixed it. Without memory profiling, I would have resorted to restarting the service nightly as a band-aid.
Installing memory_profiler
memory_profiler is a third-party package:
pip install memory-profiler
memory_profiler reads process memory through psutil. Recent releases install it automatically as a dependency; if yours did not, install it explicitly:
pip install psutil
Basic Usage: Decorator and Measuring Memory
Mark functions with @profile and run the script with python -m memory_profiler, which injects the profile decorator so the script needs no import:
@profile
def allocate_memory():
    """Allocate increasing amounts of memory."""
    data = []
    for i in range(100000):
        data.append([i] * 1000)
    return data

allocate_memory()
Run:
python -m memory_profiler script.py
Output (illustrative; the exact layout and figures depend on the memory_profiler version, Python build, and platform):
Filename: script.py
Function: allocate_memory at line 1
Total allocated: 312.5 MiB
Peak memory: 312.6 MiB
Line #    Mem usage    Increment   Line Contents
=========================================================
     1     43.2 MiB      0.0 MiB   @profile
     2     43.2 MiB      0.0 MiB   def allocate_memory():
     3     43.2 MiB      0.0 MiB       data = []
     4     46.3 MiB      3.1 MiB       for i in range(100000):
     5    312.6 MiB    266.3 MiB           data.append([i] * 1000)
     6    312.6 MiB      0.0 MiB       return data
Each column:
| Column | Meaning |
|---|---|
| Mem usage | Memory used by the Python process after the line runs (MiB) |
| Increment | Change in memory relative to the previous line (can be negative) |
| Line Contents | The source code |
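Line 5 accounts for 266.3 MiB in this report, which makes it the culprit. The operation [i] * 1000 creates a list of 1,000 references to i, and the loop does this 100,000 times. That's 100,000 list objects holding 100 million references in total. At 8 bytes per reference on 64-bit CPython, the pointer arrays alone come to roughly 763 MiB, so expect real figures well above the illustrative ones shown here.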
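To sanity-check a per-line figure like this, you can estimate the size of the lists directly. The following is a minimal sketch, not part of memory_profiler: the helper name estimate_list_bytes is hypothetical, and it counts only each list's own header and pointer array (what sys.getsizeof reports), not the integer objects the lists refer to. The result need not match a profiler's report, which measures the memory of the whole process.

```python
import sys


def estimate_list_bytes(n_lists: int, length: int) -> int:
    """Estimate the bytes held by n_lists lists of `length` references each.

    sys.getsizeof reports the list object itself (header plus pointer
    array); it does not include the objects the list refers to.
    """
    per_list = sys.getsizeof([0] * length)
    return n_lists * per_list


total = estimate_list_bytes(100_000, 1000)
print(f"estimated list storage: {total / 2**20:.1f} MiB")
```

If the estimate and the profiler disagree by a wide margin, that is usually a hint to re-check the run (Python build, loop bounds) rather than a sign that one of the tools is broken.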
Finding Memory Leaks: Detecting Unfreed Memory
A memory leak is memory that's allocated but never deallocated. Here's an example:
cache = {}  # Global dictionary acts as a cache

@profile
def process_data(n):
    """Simulate processing, but cache results indefinitely."""
    for i in range(n):
        data = [x * x for x in range(1000)]
        # BUG: entries are stored forever and never evicted
        cache[i] = data
    return len(cache)

result = process_data(10000)
Run with python -m memory_profiler:
Total allocated: 155.2 MiB
Peak memory: 155.3 MiB
Line #    Mem usage    Increment   Line Contents
=========================================================
     1     43.2 MiB      0.0 MiB   @profile
     2     43.2 MiB      0.0 MiB   def process_data(n):
     3     43.2 MiB      0.0 MiB       for i in range(n):
     4     43.4 MiB      0.2 MiB           data = [x * x for x in range(1000)]
     5    155.3 MiB    112.1 MiB           cache[i] = data
     6    155.3 MiB      0.0 MiB       return len(cache)
At line 5, memory usage climbs from 43.2 MiB to 155.3 MiB: every iteration adds an entry that is never removed. The unbounded cache is the memory leak. The fix:
cache = {}
MAX_CACHE_SIZE = 100  # Keep cache bounded

@profile
def process_data_fixed(n):
    """Fixed version with bounded cache."""
    for i in range(n):
        data = [x * x for x in range(1000)]
        cache[i] = data
        # Evict the oldest entry to keep memory bounded.
        # Dicts preserve insertion order, so the first key is the oldest.
        if len(cache) > MAX_CACHE_SIZE:
            oldest = next(iter(cache))
            del cache[oldest]
    return len(cache)
Now memory grows initially, then stabilizes once the cache is full. Memory profiling caught the leak.
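The same eviction idea can be packaged so that callers cannot forget the size check. Below is a minimal sketch, assuming first-in-first-out eviction is acceptable; the BoundedCache class and its put/get methods are hypothetical names for illustration, built on collections.OrderedDict from the standard library.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical FIFO cache that never holds more than max_size entries."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self._items = OrderedDict()

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)  # a re-inserted key counts as newest
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the oldest entry

    def get(self, key, default=None):
        return self._items.get(key, default)

    def __len__(self):
        return len(self._items)


cache = BoundedCache(max_size=100)
for i in range(10_000):
    cache.put(i, [x * x for x in range(1000)])
print(len(cache))  # 100
```

For caching the results of a pure function, functools.lru_cache with a maxsize argument gives a similar bound without any custom class.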
Programmatic Usage Without Decorators
You can profile a function without decorating it by wrapping it at call time:
from memory_profiler import LineProfiler, show_results

def allocate_data(n):
    """Function to profile."""
    data = []
    for i in range(n):
        data.append([i] * 100)
    return data

profiler = LineProfiler()
profiled = profiler(allocate_data)  # traced wrapper; the original function is untouched
profiled(10000)
show_results(profiler)              # print the line-by-line report
Run normally: python script.py. No decorator needed.
Example: Memory Analysis of a Data Processing Pipeline
Here's a realistic scenario: reading a CSV, filtering, and aggregating:
import csv

@profile
def process_csv(filename):
    """Read CSV, filter large rows, aggregate."""
    records = []
    # Read all rows into memory
    with open(filename) as f:
        reader = csv.DictReader(f)
        for row in reader:
            records.append(row)  # Allocates memory for every row
    # Filter: keep only large numbers
    filtered = [r for r in records if int(r.get("value", 0)) > 1000]
    # Aggregate by category
    agg = {}
    for record in filtered:
        key = record["category"]
        if key not in agg:
            agg[key] = 0
        agg[key] += 1
    return agg

process_csv("data.csv")
If your CSV is 1 GB and you read it all into records, you'll allocate at least 1 GB for records (usually more, since every row becomes a dict of Python strings), then more again for filtered. Memory profiling reveals this. The fix: stream the rows instead of loading everything:
@profile
def process_csv_streaming(filename):
    """Streaming version: no intermediate storage."""
    agg = {}
    with open(filename) as f:
        reader = csv.DictReader(f)
        for row in reader:
            # Process one row at a time, don't store
            value = int(row.get("value", 0))
            if value > 1000:
                key = row["category"]
                if key not in agg:
                    agg[key] = 0
                agg[key] += 1
    return agg

process_csv_streaming("data.csv")
Now memory is bounded by the size of agg (one counter per category), not by the input file. Memory profiling shows usage staying roughly flat regardless of file size.
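The streaming version can be written more compactly and tested without a file on disk. This is a sketch under the same assumptions as the example above (columns named category and value); the function name count_large_by_category and the sample data are hypothetical. It accepts any iterable of CSV lines, so an open file or an in-memory io.StringIO both work.

```python
import csv
import io
from collections import Counter


def count_large_by_category(lines, threshold=1000):
    """Count rows per category whose value exceeds threshold.

    `lines` is any iterable of CSV text lines (an open file works),
    so each row is parsed, inspected, and discarded in turn.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        # Treat a missing or empty value as 0.
        if int(row.get("value") or 0) > threshold:
            counts[row["category"]] += 1
    return counts


sample = io.StringIO("category,value\na,1500\nb,10\na,2000\nb,5000\n")
print(count_large_by_category(sample))
```

Only the Counter survives the loop, so memory tracks the number of distinct categories rather than the number of rows.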
Understanding Peak Memory and Increment Columns
Peak memory is the maximum RAM the process reaches at any point while your function runs. If Python allocates 500 MiB then releases 300 MiB, peak memory is still 500 MiB. In the line-by-line view, the peak is the largest value in the Mem usage column; watch this figure if you're near your system's memory limit.
Increment shows the change in memory attributed to each line. A single line with a large increment (e.g., reading a file) allocates a lot of memory. A zero increment means the process did not grow measurably on that line: either nothing was allocated, or the allocation was served from memory Python already held.
Example:
@profile
def analyze_peaks():
    x = [0] * 10_000_000  # Line 1: allocates ~76 MiB (10 million 8-byte references)
    y = [i for i in x]    # Line 2: allocates roughly another 80 MiB (peak ~155 MiB)
    del x                 # Line 3: releases ~76 MiB (but the peak already reached stays)
    z = sum(y)            # Line 4: sums y, no significant allocation
    return z

analyze_peaks()
Memory profiling shows (approximate figures for 64-bit CPython):
- Line 1: Increment about +76 MiB
- Line 2: Increment about +80 MiB (peak about 155 MiB)
- Line 3: Increment typically about -76 MiB (memory released, but the peak doesn't decrease)
- Line 4: Increment about 0 MiB
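The difference between current and peak usage can also be observed with the standard library. The sketch below uses tracemalloc instead of memory_profiler (a swapped-in tool, so the numbers count Python-level allocations rather than process memory), and the function name measure_current_and_peak is hypothetical. It uses smaller lists than the example above so it runs quickly.

```python
import tracemalloc


def measure_current_and_peak():
    """Allocate two lists, free one, and return (current, peak, len(y)).

    current and peak are byte counts as traced by tracemalloc.
    """
    tracemalloc.start()
    x = [0] * 1_000_000  # about 8 MB of references on 64-bit CPython
    y = list(x)          # a second list of the same size
    del x                # current usage drops; the peak is already recorded
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current, peak, len(y)


current, peak, n = measure_current_and_peak()
print(f"current: {current / 2**20:.1f} MiB, peak: {peak / 2**20:.1f} MiB")
```

After del x, the current figure falls back to roughly one list's worth of memory while the peak still reflects the moment both lists were alive.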
Combining CPU and Memory Profiling
For a comprehensive performance view, profile both CPU and memory:
from memory_profiler import profile

@profile
def slow_and_memory_intensive():
    # CPU-intensive
    result = sum(i * i for i in range(10_000_000))
    # Memory-intensive
    data = [[i] * 1000 for i in range(10000)]
    return result, len(data)

slow_and_memory_intensive()
Then run each profiler against the script from the shell:
# CPU profiling
python -m cProfile -o cpu.prof script.py
# Memory profiling
python -m memory_profiler script.py
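Compare the two profiles: CPU profiling might show the sum() taking 50% of the runtime, while memory profiling shows the list comprehension accounting for 80% of the memory growth. Both matter, and memory profiling catches issues that CPU profiling misses. Because the @profile decorator traces every line, it slows the function considerably; remove it, or discount the timings, when collecting the CPU profile.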
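If you prefer a single run from inside Python, both measurements can be collected together with standard-library tools. This is a rough sketch, not a memory_profiler feature: it substitutes tracemalloc for the memory side, and the names profile_cpu_and_memory and workload are hypothetical. Both tracers add overhead, so treat the timings as relative rather than absolute.

```python
import cProfile
import io
import pstats
import tracemalloc


def profile_cpu_and_memory(func, *args, **kwargs):
    """Run func once; return (result, peak_bytes, cpu_report_text)."""
    tracemalloc.start()
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    # Render the five entries with the highest cumulative time as text.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, peak, stream.getvalue()


def workload():
    total = sum(i * i for i in range(100_000))  # CPU-bound part
    data = [[i] * 100 for i in range(1_000)]    # memory-bound part
    return total, len(data)


result, peak, report = profile_cpu_and_memory(workload)
print(f"peak traced memory: {peak / 2**20:.2f} MiB")
print(report)
```

The CPU report ranks functions by time, while the peak figure summarizes memory for the whole call; drilling into which line allocated the memory still needs a line-level tool.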
Key Takeaways
- memory_profiler reveals line-by-line memory allocation and deallocation, essential for finding leaks and optimizing memory usage.
- Watch the Increment column to find which lines allocate the most; watch Peak memory to ensure you don't exceed system limits.
- Common memory mistakes: reading entire files into memory instead of streaming, caches that grow unbounded, retaining references to large objects unintentionally.
- Combine memory profiling with CPU profiling for complete performance understanding.
- Memory leaks in long-running services are subtle but deadly; memory profiling catches them in minutes.
Frequently Asked Questions
Does memory_profiler measure peak memory across the entire program or per function?
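The line-by-line report covers only the decorated functions, but the figures themselves are process-wide: Mem usage is the memory of the whole Python process at that line, not memory owned by the function. To track a whole program over time, use mprof run (shipped with memory_profiler), tracemalloc (standard library), or Pympler.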
What if my function deallocates memory—will profiling show the memory freed?
memory_profiler reports the process's memory as seen by the operating system, not what Python's allocator tracks internally. If you release a large object (e.g., del my_list), the Increment column can go negative. However, Python may not return memory to the OS immediately; it often keeps freed memory for reuse, in which case the drop does not show up.
Can I profile asynchronous code?
Memory profiling works with async code if you use asyncio.run(). Decorate your async function and call it with asyncio.run().
How accurate is memory_profiler?
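Accurate enough for finding leaks and bottlenecks, but not perfect. memory_profiler reads the process's memory from the operating system as each line runs, so small allocations served from memory Python already holds may not register, and unrelated activity (other threads, garbage collection) can be attributed to a line. For exact byte-level accounting of Python allocations, use tracemalloc.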
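As a starting point for the tracemalloc route, the sketch below takes a snapshot and lists the source lines responsible for the most allocated bytes. The helper names top_allocations and build_squares are hypothetical; the tracemalloc calls (start, take_snapshot, Snapshot.statistics) are standard library.

```python
import tracemalloc


def top_allocations(func, limit=3):
    """Run func under tracemalloc and return its largest allocation sites."""
    tracemalloc.start()
    try:
        result = func()  # keep the result alive until the snapshot is taken
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Statistics grouped by source line, largest first.
    return snapshot.statistics("lineno")[:limit]


def build_squares():
    return [x * x for x in range(200_000)]


for stat in top_allocations(build_squares):
    frame = stat.traceback[0]
    print(f"{frame.filename}:{frame.lineno}  "
          f"{stat.size / 1024:.1f} KiB in {stat.count} blocks")
```

Unlike the process-level figures above, these sizes count only memory allocated through Python's allocators while tracing was active.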