Debugging PyO3 Extensions: Tools and Techniques
Bugs in PyO3 extensions are insidious: they can cause Python to crash without a traceback, corrupt memory, or deadlock on the GIL. Traditional Python debugging tools (pdb, logging) work partially; you need Rust tooling to see what is really happening in native code. This article teaches you to debug PyO3 extensions using print statements, Rust debuggers (gdb, lldb), Python profilers, and careful reasoning about the GIL. By the end, you will have a reproducible workflow for finding and fixing the most elusive bugs.
Debugging PyO3 is a skill that separates competent developers from experts. The key insight is that bugs live in two worlds—Python and Rust—so you need tools from both to investigate.
Debugging Strategy: Print Statements First
The simplest technique is still often the most effective. Add debug output at key points to narrow the problem:
use pyo3::prelude::*;
#[pyfunction]
fn compute_result(a: i32, b: i32) -> i32 {
    println!("compute_result called with a={}, b={}", a, b);
    let result = a + b;
    println!("result={}", result);
    result
}
#[pymodule]
fn debug_ext(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(compute_result, m)?)?;
    Ok(())
}
From Python:
from debug_ext import compute_result
compute_result(10, 20)
Output:
compute_result called with a=10, b=20
result=30
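The println! macro writes to the process's stdout, so the output appears in the terminal that launched Python. It bypasses Python's sys.stdout, however, so it may not show up in environments that replace sys.stdout (Jupyter notebooks, contextlib.redirect_stdout). This works for quick debugging, but leftover prints should not ship in production code. For cleaner logging, use the log crate. The log crate is only a facade: to deliver its records to Python's logging module you also need a bridge such as the pyo3-log crate, initialized once when the module loads:
use log::info;
use pyo3::prelude::*;
#[pyfunction]
fn compute_result(a: i32, b: i32) -> i32 {
    info!("compute_result called with a={}, b={}", a, b);
    let result = a + b;
    info!("result={}", result);
    result
}
#[pymodule]
fn debug_ext(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // Forward `log` records to Python's logging module (requires the pyo3-log crate)
    pyo3_log::init();
    m.add_function(wrap_pyfunction!(compute_result, m)?)?;
    Ok(())
}
Enable logging in Python with: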
import logging
logging.basicConfig(level=logging.INFO)
from debug_ext import compute_result
compute_result(10, 20)
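On the Python side, the level can also be set per logger rather than globally. The sketch below assumes that the Rust log records reach Python's logging module at all, which the log crate does not do on its own (a bridge such as the pyo3-log crate is one option), and that they arrive under a logger named debug_ext. Both the logger name and the helper configure_extension_logging are illustrative, not part of any library.

```python
import logging


def configure_extension_logging(logger_name="debug_ext", level=logging.DEBUG):
    """Keep the root logger quiet but let one extension logger be verbose.

    `logger_name` is an assumption: it must match whatever name the
    Rust-to-Python logging bridge uses for your crate.
    """
    # Root logger: warnings and above only. basicConfig is a no-op if the
    # root logger already has handlers.
    logging.basicConfig(
        level=logging.WARNING,
        format="%(levelname)s %(name)s: %(message)s",
    )
    ext_logger = logging.getLogger(logger_name)
    ext_logger.setLevel(level)
    return ext_logger


ext_logger = configure_extension_logging()
# Stand-in for a record forwarded from Rust; a real bridge would emit this.
ext_logger.info("compute_result called with a=%d, b=%d", 10, 20)
```

It is safest to configure levels before the extension logs for the first time, since a bridge may cache them.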
Debugging Panics and Crashes
PyO3 catches Rust panics at the function boundary and raises them in Python as pyo3_runtime.PanicException, which derives from BaseException (so a plain except Exception will not catch it). Crashes caused by undefined behavior or memory corruption, and any panic when the crate is built with panic = "abort", take down the Python process itself and leave no traceback. To debug:
- Build with debug symbols:
maturin develop # Debug build; includes debug info
# or
maturin build --release # Release build; set debug = true under [profile.release] in Cargo.toml to keep debug info
- Run under a debugger (gdb on Linux, lldb on macOS, cdb on Windows):
gdb python
(gdb) run -c "from debug_ext import compute_result; compute_result(10, 20)"
If a segfault occurs, GDB shows the stack trace:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a0c123 in compute_result () from /path/to/debug_ext.so
(gdb) bt # backtrace
#0 0x00007ffff7a0c123 in compute_result () from /path/to/debug_ext.so
#1 0x00007ffff7a0c456 in compute_result_wrapper () from /path/to/debug_ext.so
#2 0x00007ffff50d1234 in PyObject_Call () from python
(gdb) frame 0
(gdb) info locals # Show local variables
(gdb) print variable_name # Inspect a variable
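With debug info available, the backtrace points to the source line that crashed, and info locals shows the values involved. In optimized builds, some variables may be reported as optimized out.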
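When a debugger is not at hand, Python's standard faulthandler module can at least show which Python line was executing when the process died. One way to use it, sketched below, is to run the suspect call in a child interpreter started with -X faulthandler and inspect the exit status. The helper names run_isolated and describe_exit are hypothetical, and the snippet passed in is a placeholder for a call into your own extension.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=30.0):
    """Run `snippet` in a fresh interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by a signal.
        try:
            return f"killed by signal {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


# Placeholder snippet; replace with e.g.
# "from debug_ext import compute_result; compute_result(10, 20)"
proc = run_isolated("print(10 + 20)")
print(describe_exit(proc.returncode))
print(proc.stderr)  # faulthandler writes its traceback here on a crash
```

Running the call in a child process also keeps a crash from taking down the test runner or the interactive session that launched it.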
Diagnosing GIL Deadlocks
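GIL problems come in two forms. A true deadlock occurs when a thread holding the GIL waits on a lock, channel, or thread join whose owner is itself waiting for the GIL. (Simply calling Python::with_gil while already holding the GIL is safe, because it is re-entrant.) The more common mistake is holding the GIL across a blocking operation, which stalls every other Python thread. PyO3 helps, but mistakes still happen:
use pyo3::prelude::*;
#[pyfunction]
fn bad_function() -> PyResult<()> {
    // A #[pyfunction] is already called with the GIL held; with_gil simply re-enters it
    Python::with_gil(|_py| {
        // Dangerous: the GIL stays held for the whole sleep
        std::thread::sleep(std::time::Duration::from_secs(5));
        Ok(())
    })
}
This prevents every other Python thread from running for 5 seconds. The fix: release the GIL for blocking work: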
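#[pyfunction]
fn good_function(py: Python<'_>) -> PyResult<i32> {
    // The GIL token is passed in by PyO3; no need to call with_gil here
    let result = py.allow_threads(|| {
        // GIL is released here; safe to block
        std::thread::sleep(std::time::Duration::from_secs(5));
        42
    });
    // Return type is PyResult<i32> so the value can be handed back to Python
    Ok(result)
}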
The py.allow_threads(|| ...) block explicitly releases the GIL, allowing other Python threads to run while Rust blocks.
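Whether a function really releases the GIL can be checked from Python with a heartbeat thread: a background thread increments a counter while the call runs, and the counter only advances if the GIL is available. The following is a minimal sketch; heartbeat_ticks is a hypothetical helper, and time.sleep stands in for an extension function that releases the GIL, as good_function does.

```python
import threading
import time


def heartbeat_ticks(func, *args, interval=0.01):
    """Call func(*args) and count how often a background thread got to run."""
    ticks = 0
    stop = threading.Event()

    def beat():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1  # Needs the GIL, so it stalls while the GIL is held
            time.sleep(interval)

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        result = func(*args)
    finally:
        stop.set()
        worker.join()
    return result, ticks


# time.sleep releases the GIL, so the heartbeat keeps ticking during the call.
result, ticks = heartbeat_ticks(time.sleep, 0.2)
print(f"heartbeat ticks during call: {ticks}")
```

With a function that holds the GIL for its whole duration, such as bad_function, the count would be expected to barely advance no matter how long the call takes.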
If Python hangs (appears to freeze), a GIL issue is a likely cause. Run it under a debugger, interrupt it while it is frozen, and inspect the stacks:
gdb python
(gdb) run -c "from debug_ext import bad_function; bad_function()"
# ... waits 5 seconds ...
(Ctrl+C to interrupt)
(gdb) thread apply all bt # Native backtrace of every thread
(gdb) py-bt # Python-level backtrace (requires CPython's gdb extensions, python-gdb.py)
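For hangs that are hard to reproduce under gdb, the standard faulthandler module offers a lighter option: faulthandler.dump_traceback_later arms a watchdog that writes the Python traceback of every thread if the timeout expires. The sketch below wraps a call with such a watchdog. call_with_watchdog is a hypothetical helper, and the assumption is that the file passed in has a real file descriptor, as sys.stderr normally does.

```python
import faulthandler
import sys


def call_with_watchdog(func, *args, timeout=10.0, file=None):
    """Call func(*args); dump all thread tracebacks if it runs past `timeout`."""
    # The dump is produced by a separate watchdog thread, so it does not
    # rely on the hung thread cooperating.
    faulthandler.dump_traceback_later(
        timeout, exit=False, file=file if file is not None else sys.stderr
    )
    try:
        return func(*args)
    finally:
        # Disarm the watchdog once the call has returned.
        faulthandler.cancel_dump_traceback_later()


# Usage sketch with the extension from this article:
#     call_with_watchdog(bad_function, timeout=2.0)
```

The dump shows where each Python thread was when the timeout fired. That is often enough to identify which extension call never returned.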
Profiling Extensions with Python Profilers
Each call into a PyO3 extension shows up as a single opaque built-in entry in Python profilers, so you cannot see inside it. Profilers can still tell you whether your extension is the bottleneck:
import cProfile
from debug_ext import compute_result
cProfile.run("for _ in range(1_000_000): compute_result(10, 20)")
If the output shows the extension function dominating the total time, optimize the Rust code or reduce how often Python calls into it; there is little to gain on the Python side. If it does not, profile your Python code instead.
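The raw cProfile output is easier to read when sorted and truncated with the standard pstats module. The sketch below profiles a workload and returns the top entries by cumulative time. profile_top and workload are hypothetical names, and the built-in sorted() plays the role of the native extension function, since both appear in the report as an entry of the form {built-in method ...}.

```python
import cProfile
import io
import pstats


def profile_top(func, *args, limit=10):
    """Profile func(*args) and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


def workload():
    # Stand-in for a loop that calls into the extension, e.g.
    # compute_result(10, 20); sorted() plays the role of the native function.
    for _ in range(1000):
        sorted([3, 1, 2])


print(profile_top(workload))
```

Comparing the cumulative time of the native entry with that of the surrounding Python function shows how much of the loop is spent inside the extension versus in Python overhead.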
For more granular profiling, measure elapsed time inside Rust and return it to Python:
use pyo3::prelude::*;
use std::time::Instant;
#[pyfunction]
fn timed_computation(n: i64) -> PyResult<(i64, f64)> {
    let start = Instant::now();
    // Explicit type tells sum() what to produce; i64 postpones overflow for large n
    let result: i64 = (0..n).sum();
    let elapsed_ms = start.elapsed().as_secs_f64() * 1000.0;
    println!("Computation took {:.2} ms", elapsed_ms);
    Ok((result, elapsed_ms))
}
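The time measured inside Rust excludes the cost of crossing the Python/Rust boundary (argument conversion, building the result tuple). Measuring the same call from Python and comparing the two numbers gives a rough estimate of that overhead. A minimal sketch using the standard timeit module follows. per_call_microseconds is a hypothetical helper, and max() stands in for an extension function such as timed_computation.

```python
import timeit


def per_call_microseconds(func, *args, number=100_000):
    """Average wall-clock time of func(*args), in microseconds per call."""
    total_seconds = timeit.timeit(lambda: func(*args), number=number)
    return total_seconds / number * 1e6


# max() stands in for an extension function such as timed_computation.
print(f"{per_call_microseconds(max, 1, 2):.3f} us per call")
```

The lambda adds a small constant cost of its own, so treat the result as an upper bound on the per-call time rather than an exact figure.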
Testing Error Handling
Ensure your error handling paths work correctly:
use pyo3::exceptions::PyZeroDivisionError;
use pyo3::prelude::*;
#[pyfunction]
fn divide(a: i32, b: i32) -> PyResult<f64> {
    if b == 0 {
        return Err(PyZeroDivisionError::new_err("cannot divide by zero"));
    }
    Ok(a as f64 / b as f64)
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_divide_by_zero() {
        // Start an interpreter for the test process (not needed with the auto-initialize feature)
        pyo3::prepare_freethreaded_python();
        Python::with_gil(|py| {
            let err = divide(10, 0).unwrap_err();
            // Checking the exception type requires the GIL
            assert!(err.is_instance_of::<PyZeroDivisionError>(py));
        });
    }
    #[test]
    fn test_divide_success() {
        // The success path touches no Python objects, so no GIL is needed
        let result = divide(10, 2).unwrap();
        assert_eq!(result, 5.0);
    }
}
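If the crate enables PyO3's extension-module feature unconditionally, the test binary may fail to link, because that feature stops the crate from linking against libpython; making the feature optional is a common workaround. Run the tests with: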
cargo test
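Rust unit tests call divide directly, so they do not exercise argument conversion or the mapping from PyErr to a Python exception. A Python-side test covers that path. The sketch below uses the standard unittest module. The divide function defined here is a Python stand-in that mirrors the Rust function, so the example runs without the compiled extension. In a real project it would be replaced by an import from the built module.

```python
import unittest


def divide(a, b):
    """Python stand-in mirroring the Rust divide; replace with the real import."""
    if b == 0:
        raise ZeroDivisionError("cannot divide by zero")
    return a / b


class DivideTests(unittest.TestCase):
    def test_divide_by_zero_raises(self):
        with self.assertRaises(ZeroDivisionError) as ctx:
            divide(10, 0)
        self.assertIn("cannot divide by zero", str(ctx.exception))

    def test_divide_success(self):
        self.assertEqual(divide(10, 2), 5.0)

    def test_wrong_argument_type(self):
        # The real extension rejects non-integers during argument conversion;
        # the stand-in raises TypeError from the division itself.
        with self.assertRaises(TypeError):
            divide("10", 2)
```

Saved as a test module, this can be run with python -m unittest after the extension has been built with maturin develop.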
Common Bugs and Fixes
| Bug | Symptom | Fix |
|---|---|---|
| Memory not initialized (unsafe code) | Garbage values or crashes | Initialize all fields in constructors; use the Default trait; audit unsafe blocks. |
| GIL held too long | Other Python threads stall; Python becomes unresponsive | Wrap blocking work in py.allow_threads. |
| Panic in Rust | PanicException in Python, or process abort when built with panic = "abort" | Return PyResult<T> and use Err(...) instead of panicking. |
| Python object used after its reference was dropped (raw pointers, unsafe code) | Segfault or crash | Use Py<T> to hold references to Python objects. |
| Type mismatch in arguments | TypeError at runtime | Check argument types; use #[pyo3(signature = ...)] for defaults. |
Debugging Checklist
- Build with debug symbols: maturin develop
- Start with print statements to narrow the problem
- Use a Rust debugger (gdb/lldb) for crashes and segfaults
- Check for GIL deadlocks: does Python freeze?
- Write unit tests for error paths
- Profile with Python tools to identify bottlenecks
- Use the log crate for persistent debugging output
- Enable Rust compiler warnings: check cargo output for warning:
Key Takeaways
- Print statements and the log crate are your first debugging tools.
- Rust debuggers (gdb/lldb) reveal stack traces and local variables in crashes.
- The GIL is a common pitfall; use py.allow_threads() for blocking work.
- Python profilers identify extensions as bottlenecks but not internal details.
- Unit tests validate error handling before code reaches users.
- Crashes in PyO3 are rare but severe; attach a debugger and get a full stack trace.
Frequently Asked Questions
How do I print from a PyO3 function without polluting my code with debug statements?
Use the log crate and forward its records to Python's logging module (for example with the pyo3-log crate). You can then control the log level at runtime from Python and silence debug output in production without editing the Rust code.
What if my extension intermittently crashes with no clear cause?
Intermittent crashes often indicate a memory safety bug (unlikely in safe Rust, so review any unsafe blocks first) or a race condition with Python threads. Run the test suite repeatedly, including with cargo test --release, since optimized builds can expose bugs that debug builds hide.
Can I step through Rust code in an IDE like VS Code?
Yes, with the Rust Analyzer extension plus a debugger extension such as CodeLLDB. Set breakpoints in src/lib.rs, then launch Python under the debugger or attach to a running Python process. See the Rust Analyzer documentation for setup.
How do I handle exceptions raised by Python code I call from Rust?
Catch PyErr with pattern matching or PyErr::is_instance_of(). See Article 5 (error handling) for examples.
What if the Python profiler shows my extension taking 90% of time, but it does not feel slow?
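A high percentage only means that most of the measured time is spent inside the extension, not that the extension is slow in absolute terms; native code can dominate a profile and still be fast. The profiler correctly identifies where the time goes. If you need more speed, optimize the Rust algorithm, not the calling convention.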