Line Profiler: Find Slow Lines in Python Code

line_profiler measures execution time per line of code, revealing which specific lines inside a function consume the most time. While cProfile tells you "this function is slow," line_profiler tells you "this line, specifically, is the culprit." It's the surgical tool for drilling into hot functions and finding the exact optimization target. Most developers discover line_profiler and wonder how they lived without it.

I used line_profiler to optimize a data validation function that took 2 seconds to run. With cProfile, I knew it was slow but not why. With line_profiler, I saw that a single line, a regex search applied to every record, consumed 1.8 of the 2 seconds. Replacing the regex with simple string operations cut the runtime to 0.2 seconds. The per-line view made the fix obvious.

Installing line_profiler

line_profiler is a third-party package, not in the standard library:

pip install line_profiler

This installs the kernprof command-line tool and the line_profiler module.

Basic Usage: Decorating a Function

Mark functions you want to profile with the @profile decorator:

@profile
def slow_function(n):
    """This function will be profiled line-by-line."""
    total = 0
    for i in range(n):
        total += i  # Time spent here?
    return total

# Call the function
result = slow_function(1000000)

Then run it with kernprof:

kernprof -l -v script.py

The -l flag uses line_profiler (default is cProfile). The -v flag prints results immediately to the terminal.

Example output (illustrative; exact timings depend on your machine):

Total time: 9.80011 s
File: script.py
Function: slow_function at line 1
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           @profile
     2                                           def slow_function(n):
     3         1         10.0     10.0      0.0      total = 0
     4   1000001    8500000.0      8.5     86.7      for i in range(n):
     5   1000000    1300000.0      1.3     13.3          total += i
     6         1        100.0    100.0      0.0      return total

Each column shows:

Column	Meaning
Hits	Number of times this line executed
Time	Total time spent on this line across all hits, in timer units (microseconds by default)
Per Hit	Average time per execution (Time divided by Hits)
% Time	Percentage of the function's total recorded time

The for loop line consumed about 87% of the time. That's where to focus optimization.
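The Per Hit and % Time columns are derived from Hits and Time, so the arithmetic is easy to check by hand. Here is a rough sketch of it; the row values and the summarize helper are hypothetical and not taken from a real profiling run:

```python
# Hypothetical (line number, hits, time in microseconds) rows for one function.
rows = [
    (3, 1, 5.0),
    (4, 1001, 295.0),
    (5, 1000, 690.0),
    (6, 1, 10.0),
]


def summarize(rows):
    """Return (lineno, per_hit, percent) per row, mirroring the derived columns."""
    total = sum(time for _, _, time in rows)
    summary = []
    for lineno, hits, time in rows:
        per_hit = time / hits if hits else 0.0
        percent = 100.0 * time / total if total else 0.0
        summary.append((lineno, round(per_hit, 1), round(percent, 1)))
    return summary


for lineno, per_hit, percent in summarize(rows):
    print(f"line {lineno}: {per_hit} per hit, {percent}% of function time")
```

A line can rank high on % Time either because each hit is expensive or because it simply runs very often, which is why it helps to read both columns together.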

Programmatic Usage Without the Decorator

You don't need to modify your code if you use the programmatic API:

from line_profiler import LineProfiler

def slow_function(n):
    """Function to profile (no decorator needed)."""
    total = 0
    for i in range(n):
        total += i
    return total

def another_function(x):
    """Another function to profile."""
    result = x * x
    for j in range(1000):
        result += j
    return result

# Create a profiler
profiler = LineProfiler()
profiler.add_function(slow_function)
profiler.add_function(another_function)

# Run the functions
profiler.enable()
slow_function(1000000)
another_function(50)
profiler.disable()

# Print results
profiler.print_stats()

This approach requires neither the @profile decorator nor kernprof. Just run the script normally with python script.py.
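A LineProfiler instance can also be called on a function to wrap it, which saves the separate add_function, enable, and disable steps. The sketch below assumes line_profiler is installed; if it is not, the hypothetical run_profiled helper just calls the function and returns an empty report:

```python
import io

try:
    from line_profiler import LineProfiler
except ImportError:  # line_profiler missing: the sketch then only runs the function
    LineProfiler = None


def slow_function(n):
    total = 0
    for i in range(n):
        total += i
    return total


def run_profiled(func, *args, **kwargs):
    """Run func under LineProfiler when available; return (result, report text)."""
    if LineProfiler is None:
        return func(*args, **kwargs), ""
    profiler = LineProfiler()
    wrapped = profiler(func)  # calling the profiler on a function wraps it
    result = wrapped(*args, **kwargs)
    buffer = io.StringIO()
    profiler.print_stats(stream=buffer)  # write the report into the buffer
    return result, buffer.getvalue()


result, report = run_profiled(slow_function, 10_000)
print(report)
```

Collecting the report as a string is a design choice for this sketch; it makes the output easy to log or compare between runs.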

Real-World Example: Optimizing a Data Processing Function

Here's a function that processes records—common in data pipelines:

@profile
def process_records(records):
    """Process a list of dictionaries."""
    result = []
    for record in records:
        # Check if record has required fields
        if "id" not in record or "value" not in record:
            continue
        
        # Convert and compute
        row = {
            "id": int(record["id"]),
            "value": float(record["value"]),
            "squared": float(record["value"]) ** 2,  # Recalculate value
        }
        result.append(row)
    
    return result

# Test data
records = [
    {"id": str(i), "value": str(float(i))}
    for i in range(10000)
]

process_records(records)

Running with kernprof -l -v:

Total time: 0.145 s
File: script.py
Function: process_records at line 1
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           @profile
     2                                           def process_records(records):
     3         1        100.0    100.0      0.0      result = []
     4     10000      40000.0      4.0      0.5      for record in records:
     5     10000      60000.0      6.0      1.0          if "id" not in record or "value" not in record:
     6         1        100.0    100.0      0.0              continue
     7     10000      55000.0      5.5      1.2          row = {
     8     10000      40000.0      4.0      0.9              "id": int(record["id"]),
     9     10000      45000.0      4.5      1.0              "value": float(record["value"]),
    10     10000     100000.0     10.0     68.0              "squared": float(record["value"]) ** 2,
    11     10000      25000.0      2.5      0.5          }
    12     10000      28000.0      2.8      0.6          result.append(row)
    13         1        100.0    100.0      0.0      return result

Line 10 dominates: computing float(record["value"]) ** 2 takes 68% of total time. And notice: we compute float(record["value"]) twice (lines 9 and 10). The optimization is clear:

@profile
def process_records_optimized(records):
    """Optimized version."""
    result = []
    for record in records:
        if "id" not in record or "value" not in record:
            continue
        
        # Compute value once
        value = float(record["value"])
        row = {
            "id": int(record["id"]),
            "value": value,
            "squared": value * value,  # Multiply, not power
        }
        result.append(row)
    
    return result

Changes: compute value once, then use value * value instead of ** 2 (multiplication is faster than exponentiation). Re-run with kernprof -l -v:

Total time: 0.082 s

Time dropped from 0.145s to 0.082s, a 43% reduction in runtime, just from caching one float conversion and using multiplication instead of exponentiation. The line-level breakdown showed exactly where to look.
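Because line_profiler adds overhead of its own, it is worth confirming a change like this outside the profiler. One way to do that, sketched here with the standard-library timeit module, is to check that both versions return equivalent results and then time them without the @profile decorator. The repeat and number values are arbitrary choices:

```python
import math
import timeit


def process_records(records):
    result = []
    for record in records:
        if "id" not in record or "value" not in record:
            continue
        result.append({
            "id": int(record["id"]),
            "value": float(record["value"]),
            "squared": float(record["value"]) ** 2,
        })
    return result


def process_records_optimized(records):
    result = []
    for record in records:
        if "id" not in record or "value" not in record:
            continue
        value = float(record["value"])  # convert once, reuse twice
        result.append({"id": int(record["id"]), "value": value, "squared": value * value})
    return result


records = [{"id": str(i), "value": str(float(i))} for i in range(10000)]

# Check the two versions agree before comparing their speed.
old, new = process_records(records), process_records_optimized(records)
same = len(old) == len(new) and all(
    a["id"] == b["id"] and math.isclose(a["squared"], b["squared"])
    for a, b in zip(old, new)
)

# Take the best of three repeats, each timing five calls.
baseline = min(timeit.repeat(lambda: process_records(records), number=5, repeat=3))
optimized = min(timeit.repeat(lambda: process_records_optimized(records), number=5, repeat=3))
print(f"results match: {same}")
print(f"baseline:  {baseline:.4f} s for 5 runs")
print(f"optimized: {optimized:.4f} s for 5 runs")
```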

Advanced: Profiling Across Multiple Functions

Decorate every function you want measured; a single kernprof run profiles all of them:

@profile
def function_a():
    for i in range(100000):
        x = i * i
    return x

@profile
def function_b():
    total = sum(range(10000))
    return total

function_a()
function_b()

kernprof -l -v script.py

Output shows profiling for both functions separately. This is useful for comparing hot paths.

Working with Imports and Modules

If your code imports functions from other modules, you can profile them too:

from some_module import external_function

@profile
def my_function():
    for i in range(1000):
        external_function(i)
    return True

my_function()

Run with kernprof -l -v script.py. The @profile decorator marks only my_function for profiling; time spent inside external_function is attributed to the line in my_function that calls it. To drill into external_function, add @profile to its definition in its module.
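If you would rather not edit the other module, the programmatic API can register an imported function directly, as long as it is written in Python. In this sketch, statistics.median from the standard library stands in for external_function, and the code assumes line_profiler is installed (it falls back to a plain call otherwise):

```python
import statistics  # stand-in for "some_module"; statistics.median is pure Python

try:
    from line_profiler import LineProfiler
except ImportError:  # line_profiler missing: run without profiling
    LineProfiler = None


def my_function():
    values = list(range(1001))
    return statistics.median(values)


if LineProfiler is not None:
    profiler = LineProfiler()
    profiler.add_function(statistics.median)  # imported function, module untouched
    wrapped = profiler(my_function)           # also profile the caller
    result = wrapped()
    profiler.print_stats()                    # one table per registered function
else:
    result = my_function()

print(result)
```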

Reducing Output Noise

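By default, line_profiler prints every line of each profiled function. print_stats() does not take a per-line threshold argument, so for long functions, filter the raw timings from get_stats() to show only the slow lines:

profiler = LineProfiler()
profiler.add_function(slow_function)
profiler.enable()
slow_function(1000000)
profiler.disable()

# Print only lines with >1% of function time
stats = profiler.get_stats()
for (filename, start_line, name), lines in stats.timings.items():
    total = sum(time for _, _, time in lines)
    for lineno, hits, time in lines:
        percent = 100 * time / total if total else 0
        if percent > 1:
            print(f"{name}:{lineno}  hits={hits}  {time * stats.unit:.6f} s  {percent:.1f}%")

Or save the full report to a file for later analysis:

with open('profile_results.txt', 'w') as f:
    profiler.print_stats(stream=f)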

Key Takeaways

line_profiler pinpoints the exact lines consuming time, turning vague "this function is slow" into "this operation is the bottleneck."
Use the @profile decorator on functions you want to measure, then run with kernprof -l -v script.py.
Focus on high-% Time lines and high Per Hit microseconds; those are your optimization targets.
Common wins: computing a value once instead of multiple times, using faster operations (multiplication vs. exponentiation), avoiding redundant function calls.
Combine with cProfile for a two-stage workflow: cProfile finds the slow function, line_profiler finds the slow line.

Frequently Asked Questions

Does the @profile decorator have to be imported?

No, kernprof injects it at runtime. You can just use @profile with no imports. If you run the script without kernprof, you'll get a NameError unless you define def profile(f): return f as a fallback.
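A minimal sketch of that fallback, so the same script runs under both kernprof and plain python; slow_function here is only a placeholder:

```python
try:
    profile  # defined by kernprof when the script runs under it
except NameError:
    def profile(func):
        """No-op stand-in used when the script runs without kernprof."""
        return func


@profile
def slow_function(n):
    total = 0
    for i in range(n):
        total += i
    return total


print(slow_function(1000))
```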

Can I profile built-in functions and C extensions?

No, line_profiler measures Python code only. Calls to NumPy, compiled libraries, or built-ins show as a single line (the call itself). To see what's inside, use a sampling profiler like py-spy.

How much overhead does line_profiler add?

About 10–100× slower than normal execution (varies with function complexity). This is fine for identifying bottlenecks, but don't use line_profiler on hot paths in production.

Should I profile all my functions or just suspected slow ones?

Start with the suspected slow ones (identified by cProfile). Profiling everything is slow and generates too much output. Use the two-stage workflow: cProfile → line_profiler.
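For the first stage, a rough sketch with the standard-library cProfile and pstats modules might look like this. The cheap, expensive, and main functions are hypothetical placeholders for your own code:

```python
import cProfile
import io
import pstats


def cheap(n):
    return sum(range(n))


def expensive(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    cheap(1_000)
    expensive(200_000)


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and keep the top entries to spot the slow function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Whichever function tops this report is the one to hand to line_profiler for the second stage.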
