
Performance optimization in Python — best practices and tools

Automated pipelines in article 158 keep regressions honest, yet latency issues often originate from asymptotic inefficiency (such as O(n²) merges), I/O stalls, or thrashing C extensions. Optimization starts with measurement, not instinct.


📚 Prerequisites

  • Ability to reason about asymptotic complexity at a conversational level.

🎯 What you'll learn

  • Use time.perf_counter for microbench discipline.
  • Reach for cProfile / pyinstrument when hotspots span functions.
  • Know when rewriting in Cython or Rust extensions is folly and when it is necessary.

Measure first

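import time

payload = range(500_000)

# perf_counter is a high-resolution clock intended for measuring short durations.
start = time.perf_counter()
total = sum(value * value for value in payload)  # the work being measured
elapsed = time.perf_counter() - start

print(f"sum of squares took {elapsed:.4f} s")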

Wrap experiments in repeatable scripts or Jupyter cells with fixed seeds and fixed input sizes, and repeat each measurement several times; a single reading is otherwise dominated by noise.
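One way to make a timing repeatable is to let the standard-library timeit module run the same callable several times and keep the best result. The sketch below assumes the workload fits in a zero-argument function; sum_of_squares and best_time are hypothetical names introduced for illustration, and the repeat counts are arbitrary.

```python
import timeit


def sum_of_squares(n: int = 500_000) -> int:
    # Same workload as the perf_counter snippet, wrapped in a function.
    return sum(value * value for value in range(n))


def best_time(func, repeats: int = 5, number: int = 3) -> float:
    """Return the best per-call time in seconds across several repeats."""
    # timeit.repeat returns one total per repeat, each covering `number` calls.
    timings = timeit.repeat(func, repeat=repeats, number=number)
    # The minimum is the reading least disturbed by other system activity.
    return min(timings) / number


if __name__ == "__main__":
    print(f"best per-call time: {best_time(sum_of_squares):.4f} s")
```

Taking the minimum rather than the mean is a common choice here: slower runs mostly reflect interference from the rest of the system, not the code under test.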


Profiling toolchain sketch

python -m cProfile -o stats.prof scripts/heavy_job.py

Inspect stats.prof visually with snakeviz or as text with pstats. Look for cumulative time dominated by unexpectedly low-level helpers.
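For the textual route, a minimal sketch of reading profile data with pstats might look like the following. To stay self-contained it profiles in-process instead of loading stats.prof; heavy_job and slow_helper are hypothetical stand-ins for the real script, and profile_report is an illustrative helper, not part of any library.

```python
import cProfile
import io
import pstats


def slow_helper(n: int) -> int:
    # Hypothetical hotspot: a deliberately plain Python-level loop.
    total = 0
    for value in range(n):
        total += value * value
    return total


def heavy_job() -> int:
    # Hypothetical stand-in for the work done by scripts/heavy_job.py.
    return sum(slow_helper(20_000) for _ in range(20))


def profile_report(func, top: int = 5) -> str:
    """Profile func and return the top entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # strip_dirs shortens file paths; "cumulative" sorts by time including callees.
    stats.strip_dirs().sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(heavy_job))
```

A file written by python -m cProfile -o can be loaded the same way by passing its path to pstats.Stats, for example pstats.Stats("stats.prof"), and then sorting and printing as above.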

Common wins before exotic extensions:

  • Replace Python-level tight loops manipulating lists with NumPy vector ops.
  • Cache redundant pure-function calls (functools.lru_cache) when inputs repeat.
  • Stream large files rather than .read() everything.
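Two of the wins above can be sketched with the standard library alone. The example assumes a pure function whose inputs repeat and a text file that is processed line by line; expensive_square and count_nonempty_lines are hypothetical names, and the squaring is only a placeholder for genuinely costly work.

```python
import functools
import tempfile
from pathlib import Path


@functools.lru_cache(maxsize=None)
def expensive_square(value: int) -> int:
    # Hypothetical pure function; real caching candidates cost far more than this.
    return value * value


def count_nonempty_lines(path: Path) -> int:
    """Iterate over the file line by line instead of calling .read() on it."""
    with path.open("r", encoding="utf-8") as handle:
        # The file object yields one line at a time, so memory use stays small.
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    for value in [3, 3, 4, 3]:
        expensive_square(value)
    # cache_info reports how many calls were answered from the cache.
    print(expensive_square.cache_info())

    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_text("a\n\nb\nc\n", encoding="utf-8")
        print(count_nonempty_lines(sample))
```

An unbounded cache (maxsize=None) is only safe when the set of distinct inputs is small; otherwise a finite maxsize keeps memory in check.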

💡 Key takeaways

  • “Fast enough” hinges on SLA and hardware budget—profile user-visible paths first.

➡️ Next steps

Close the book mindset loop with idiomatic Python — PEP 8 recap.