Numba nopython Mode: Maximum Performance
Numba's nopython mode (@njit is shorthand for @jit(nopython=True)) compiles a Python function to machine code through LLVM, with no fallback to the Python interpreter. Any operation Numba cannot type or lower raises an error at compile time, which forces you to rewrite the code in a form that compiles. This strictness is a feature: it rules out silent slowdowns in which part of the function quietly runs as ordinary Python objects. For tight numeric loops, the compiled code runs at speeds comparable to C, often one to two orders of magnitude faster than the interpreted equivalent, depending on the workload. Understanding the mode's limitations and workarounds is essential for production Numba code.
Understanding Nopython Mode
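When you use @njit (or @jit(nopython=True)), you are telling Numba: "Compile this to machine code. If you encounter anything you can't compile, fail immediately." This differs from object mode, in which Numba treats values as generic Python objects and calls back into the interpreter. In older Numba releases, @jit(nopython=False) attempted nopython compilation and silently fell back to object mode when that failed. The fallback was deprecated and then removed (Numba 0.59 made nopython the default for @jit), so on current releases object mode has to be requested explicitly with forceobj=True.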
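Nopython mode rejects or restricts:
- Lists, dicts, and sets with mixed element types (homogeneous containers compile; numba.typed.List and numba.typed.Dict are the explicitly typed variants)
- String operations outside the subset Numba implements (basic methods such as "x".split() are supported in current releases, but general string formatting is limited)
- Class definitions inside the function, and instances of ordinary Python classes
- File I/O with open() (print() works, but only for numbers and strings)
- Import statements inside the function body (import at module level, before the function)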
Nopython mode accepts:
- NumPy arrays and ufuncs
- Tuples (immutable; fixed size)
- range() and prange()
- Arithmetic, comparisons, loops
- Simple function calls (to other @njit functions)
Catching Nopython Violations at Compile Time
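Here's a function that looks NumPy-compatible but fails in nopython mode:
from numba import njit
import numpy as np

@njit
def wrong_approach(data):
    """This will fail: open() is not supported in nopython mode."""
    log = open("progress.log", "a")  # file I/O not allowed!
    log.write("processing\n")
    return np.sum(data)

Compilation happens on the first call, so that is when the failure surfaces. Numba raises numba.core.errors.TypingError, with a message along these lines (exact wording varies by version):
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'open': Cannot determine Numba type of <class 'builtin_function_or_method'>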
Fix it by moving I/O and strings outside:
@njit
def right_approach(data):
    """No I/O or string ops inside."""
    return np.sum(data)

# Outside the function
data = np.array([1, 2, 3, 4, 5])
msg = f"Processing {len(data)} items"
result = right_approach(data)
print(msg, result)
Nopython Strategies for Common Patterns
Pattern 1: Pre-Allocate Outputs
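Avoid growing lists element by element; allocate the output array upfront. Homogeneous lists do compile in nopython mode, but appending and then converting the result back to a Python list is typically slower than writing into a pre-allocated array:
from numba import njit
import numpy as np

# Discouraged: list.append compiles, but the list grows step by step
# and every element is converted to a Python object on return
@njit
def wrong_accumulate(data):
    result = []  # element type is inferred from the append below
    for x in data:
        result.append(x ** 2)
    return result

# Right: pre-allocate NumPy array
@njit
def right_accumulate(data):
    n = data.shape[0]
    result = np.empty(n, dtype=np.float64)
    for i in range(n):
        result[i] = data[i] ** 2
    return result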
Pattern 2: Use Tuples for Multiple Returns
Tuples are fixed-size and allowed in nopython:
from numba import njit
import numpy as np

@njit
def compute_stats(data):
    """Return mean and std as a tuple."""
    mean = np.mean(data)
    std = np.std(data)
    return mean, std  # Tuple; OK in nopython

result = compute_stats(np.array([1, 2, 3, 4, 5]))
print(result)  # (3.0, 1.414...)
Pattern 3: Handle Heterogeneous Data with Structured Arrays
For record-like data, use NumPy structured arrays:
from numba import njit
import numpy as np

@njit
def process_records(records):
    """Sum the 'value' field from a structured array."""
    total = 0.0
    for i in range(records.shape[0]):
        total += records['value'][i]
    return total

# Create a structured array
dtype = np.dtype([('id', np.int32), ('value', np.float64)])
records = np.array([(1, 10.0), (2, 20.0), (3, 30.0)], dtype=dtype)
result = process_records(records)
print(result)  # 60.0
Nopython Performance: Real Benchmarks
Let's compare pure Python, Numba's object mode (the path the fallback takes), and @njit:
import numpy as np
from numba import jit, njit
import timeit

# Pure Python
def python_mandelbrot(height, width, max_iter):
    """Mandelbrot set computation in pure Python."""
    pixels = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            c = (x - width / 2) / (width / 4) + \
                1j * (y - height / 2) / (height / 4)
            z = 0j
            for i in range(max_iter):
                if abs(z) > 2.0:
                    break
                z = z * z + c
            pixels[y, x] = i
    return pixels

# Numba object mode (the fallback path); forceobj=True requests it explicitly
@jit(forceobj=True)
def numba_fallback_mandelbrot(height, width, max_iter):
    return python_mandelbrot(height, width, max_iter)

# Numba nopython (strict)
@njit
def numba_nopython_mandelbrot(height, width, max_iter):
    pixels = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            c = (x - width / 2) / (width / 4) + \
                1j * (y - height / 2) / (height / 4)
            z = 0j
            for i in range(max_iter):
                if abs(z) > 2.0:
                    break
                z = z * z + c
            pixels[y, x] = i
    return pixels

# Warm up both compiled versions so compile time is excluded from the timings
numba_fallback_mandelbrot(100, 100, 256)
numba_nopython_mandelbrot(100, 100, 256)

# Benchmark
h, w, iters = 512, 512, 256
t_py = timeit.timeit(lambda: python_mandelbrot(h, w, iters), number=3)
t_fb = timeit.timeit(lambda: numba_fallback_mandelbrot(h, w, iters), number=3)
t_np = timeit.timeit(lambda: numba_nopython_mandelbrot(h, w, iters), number=3)
print(f"Python: {t_py:.3f}s")
print(f"Numba (FB): {t_fb:.3f}s")
print(f"Numba (@njit): {t_np:.3f}s")
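Sample output (timings vary with hardware and Numba version):
Python: 12.456s
Numba (FB): 11.234s
Numba (@njit): 0.089s
In this run, @njit is about 140× faster than pure Python. The fallback version is almost as slow as Python because it simply calls the interpreted python_mandelbrot, so every operation still goes through the interpreter.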
Debugging Nopython Failures
When @njit compilation fails, Numba prints a detailed error that points to the offending line. Example:
from numba import njit

@njit
def broken_function(x):
    d = {}
    d['key'] = 'value'
    d['count'] = x  # for numeric x: value type differs from the first entry
    return d

Error (abridged; the full message names the failing expression and the types involved):
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
The fix is to remove the unsupported feature and rewrite using NumPy arrays or tuples:
@njit
def fixed_function(x):
    # No mixed-type dict; use NumPy arrays or tuples instead
    return x

Always test your @njit functions in isolation to catch compilation errors early.
Nopython with Type Signatures
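You can optionally pass an explicit signature. Numba then compiles the function eagerly, when the decorator runs, instead of waiting for the first call to infer the argument types:
from numba import njit, types
import numpy as np

@njit(types.float64(types.float64[:]))
def sum_floats(arr):
    """Explicitly typed: takes float64 array, returns float64."""
    return np.sum(arr)

This is rarely necessary, but it is useful for disambiguating edge cases and for moving the compilation cost to import time.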
Key Takeaways
- @njit enforces strict nopython mode: any operation Numba cannot compile raises an error instead of silently running as Python objects.
- Pre-allocate NumPy arrays instead of growing lists; use tuples for multiple returns.
- Nopython mode avoids interpreter overhead, often delivering speedups of one to two orders of magnitude for tight loops (about 140× in the Mandelbrot benchmark above).
- Move unsupported string operations, file I/O, and configuration outside @njit functions.
- Test @njit functions independently to catch compilation errors early.
Frequently Asked Questions
Can I call a non-@njit function from inside @njit?
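Not directly. Every callee must be something Numba can compile: another jitted function, or a builtin or NumPy function that Numba supports. Calling a plain Python function from @njit code raises a TypingError at compile time rather than falling back. If one step genuinely needs the interpreter, wrap that step in a numba.objmode context block and accept the overhead; otherwise keep call stacks nopython-only.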
Why is my @njit function slower on the first call?
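That's JIT compilation overhead: the first call with a new combination of argument types compiles the function, which typically takes from a fraction of a second to a few seconds depending on its size. Subsequent calls with the same types run at native speed. When benchmarking, call the function once before timing; in production, use cache=True to persist compiled code to disk between runs.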
Can I use NumPy's fancy indexing in @njit?
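Basic indexing (integers and slices) is fully supported. Advanced indexing is supported only in part: the Numba documentation describes a subset in which an index may include a single one-dimensional integer or boolean array, optionally combined with basic indices, and more elaborate combinations may be rejected depending on the version. Stick to simple loop-based indexing for maximum compatibility.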
What's the performance cost of array allocation inside @njit?
np.zeros() and np.empty() inside @njit compile to native allocations, which are fast. Allocating inside a tight loop is still expensive, though, so allocate once outside the loop and reuse the buffer when possible.
Can I use @njit with complex numbers?
Yes. Complex literals such as z = 1j and complex arithmetic work in nopython mode, as the Mandelbrot benchmark above shows. Complex ufuncs are supported.