Numba nopython Mode: Maximum Performance

Numba's nopython mode (@njit is shorthand for @jit(nopython=True)) compiles Python functions to machine code through LLVM, with no fallback to the Python interpreter. Any unsupported operation causes a compile-time error, forcing you to rewrite the code in a form that compiles. This strictness is a feature: it prevents silent slowdowns where interpreter fallback hides a slow path. Mastering nopython mode means writing loops that run at close to native C speed, often one to two orders of magnitude faster than the interpreted equivalent. Understanding its limitations and workarounds is essential for production Numba code.

Understanding Nopython Mode

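When you use @njit (or @jit(nopython=True)), Numba says: "Compile this to machine code. If you encounter anything I can't compile, fail immediately." This is different from object mode (@jit(forceobj=True)), which compiles the function but routes unsupported operations through the Python interpreter. Older Numba releases let plain @jit fall back to object mode automatically; since Numba 0.59, @jit defaults to nopython mode and the automatic fallback is gone.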

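Nopython mode rejects or restricts:

Lists, dicts, and sets with mixed element types (homogeneous ones compile, but are slower than arrays)
Much string formatting (basic str methods such as "x".split() compile, but str.format() and format specifiers generally do not)
Class definitions inside the function (ordinary Python classes need numba.experimental.jitclass)
File I/O such as open() (print() works, but only for numbers and strings)
Import statements inside the function body (import at module level before the function)
Calls to plain Python functions that are not compiled

Nopython mode accepts:

NumPy arrays and a large subset of NumPy functions and ufuncs
Tuples (immutable; fixed size)
range() and prange() (prange needs parallel=True)
Arithmetic, comparisons, loops
Simple function calls (to other @njit functions)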

Catching Nopython Violations at Compile Time

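Here's a function that looks harmless but fails in nopython mode because it opens a file:

from numba import njit
import numpy as np

@njit
def wrong_approach(data):
    """This will fail: open() is not supported in nopython mode."""
    log = open("progress.log", "a")  # file I/O not allowed!
    log.write("processing\n")
    return np.sum(data)

Calling it triggers compilation, which fails with an error along these lines (exact wording varies by Numba version):

numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'open': Cannot determine Numba type of <class 'builtin_function_or_method'>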

Fix it by moving I/O and strings outside:

@njit
def right_approach(data):
    """No I/O or string ops inside."""
    return np.sum(data)

# Outside the function
data = np.array([1, 2, 3, 4, 5])
msg = f"Processing {len(data)} items"
result = right_approach(data)
print(msg, result)

Nopython Strategies for Common Patterns

Pattern 1: Pre-Allocate Outputs

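Homogeneous lists do compile in nopython mode, but growing one element by element is slower than writing into an array. Allocate arrays upfront:

from numba import njit
import numpy as np

# Slower: list.append in nopython
@njit
def wrong_accumulate(data):
    result = []  # Compiles for a single element type, but grows dynamically
    for x in data:
        result.append(x ** 2)
    return result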

# Right: pre-allocate NumPy array
@njit
def right_accumulate(data):
    n = data.shape[0]
    result = np.empty(n, dtype=np.float64)
    for i in range(n):
        result[i] = data[i] ** 2
    return result

Pattern 2: Use Tuples for Multiple Returns

Tuples are fixed-size and allowed in nopython:

from numba import njit
import numpy as np

@njit
def compute_stats(data):
    """Return mean and std as a tuple."""
    mean = np.mean(data)
    std = np.std(data)
    return mean, std  # Tuple; OK in nopython

result = compute_stats(np.array([1, 2, 3, 4, 5]))
print(result)  # (3.0, 1.414...)

Pattern 3: Handle Heterogeneous Data with Structured Arrays

For record-like data, use NumPy structured arrays:

from numba import njit
import numpy as np

@njit
def process_records(records):
    """Sum the 'value' field from a structured array."""
    total = 0.0
    for i in range(records.shape[0]):
        total += records['value'][i]
    return total

# Create a structured array
dtype = np.dtype([('id', np.int32), ('value', np.float64)])
records = np.array([(1, 10.0), (2, 20.0), (3, 30.0)], dtype=dtype)
result = process_records(records)
print(result)  # 60.0

Nopython Performance: Real Benchmarks

Let's compare pure Python, object mode (@jit(forceobj=True)), and @njit:

import numpy as np
from numba import jit, njit
import timeit

# Pure Python
def python_mandelbrot(height, width, max_iter):
    """Mandelbrot set computation in pure Python."""
    pixels = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            c = (x - width / 2) / (width / 4) + \
                1j * (y - height / 2) / (height / 4)
            z = 0j
            for i in range(max_iter):
                if abs(z) > 2.0:
                    break
                z = z * z + c
            pixels[y, x] = i
    return pixels

# Numba object mode: the wrapper is compiled, but the call below still runs in the interpreter
@jit(forceobj=True)
def numba_fallback_mandelbrot(height, width, max_iter):
    return python_mandelbrot(height, width, max_iter)

# Numba nopython (strict)
@njit
def numba_nopython_mandelbrot(height, width, max_iter):
    pixels = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            c = (x - width / 2) / (width / 4) + \
                1j * (y - height / 2) / (height / 4)
            z = 0j
            for i in range(max_iter):
                if abs(z) > 2.0:
                    break
                z = z * z + c
            pixels[y, x] = i
    return pixels

# Warm up both compiled versions so compilation time is not measured
numba_fallback_mandelbrot(100, 100, 256)
numba_nopython_mandelbrot(100, 100, 256)

# Benchmark
h, w, iters = 512, 512, 256

t_py = timeit.timeit(lambda: python_mandelbrot(h, w, iters), number=3)
t_fb = timeit.timeit(lambda: numba_fallback_mandelbrot(h, w, iters), number=3)
t_np = timeit.timeit(lambda: numba_nopython_mandelbrot(h, w, iters), number=3)

print(f"Python:      {t_py:.3f}s")
print(f"Numba (FB):  {t_fb:.3f}s")
print(f"Numba (@njit): {t_np:.3f}s")

Output:

Python:       12.456s
Numba (FB):   11.234s
Numba (@njit): 0.089s

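In this run, @njit is about 140× faster than pure Python; exact timings depend on hardware and Numba version. The fallback (object mode) version is almost as slow as Python because its body simply calls the pure-Python function, which still runs in the interpreter.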

Debugging Nopython Failures

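When @njit compilation fails, Numba raises a detailed error. Compilation is lazy by default, so the error appears on the first call rather than at decoration time. Example:

from numba import njit

def scale(x):
    """Plain Python helper; not compiled."""
    return 2 * x

@njit
def broken_function(x):
    return scale(x)  # Calls into uncompiled Python

broken_function(3.0)  # Compilation is triggered here

Error (exact wording varies by Numba version):

numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'scale': Cannot determine Numba type of <class 'function'>

The message names the offending object. The fix is to remove the unsupported feature or, as here, to compile the helper as well:

@njit
def scale_jit(x):
    return 2 * x

@njit
def fixed_function(x):
    return scale_jit(x)  # Both functions are nopython now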

Always test your @njit functions in isolation to catch compilation errors early.

Nopython with Type Signatures

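You can optionally pass a type signature. Numba then compiles eagerly, when the decorator runs rather than on the first call, and accepts only the declared argument types: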

from numba import njit, types
import numpy as np

@njit(types.float64(types.float64[:]))
def sum_floats(arr):
    """Explicitly typed: takes float64 array, returns float64."""
    return np.sum(arr)

This is rarely necessary but useful for disambiguating edge cases.

Key Takeaways

@njit enforces strict nopython mode: any operation Numba cannot compile causes a compile-time error.
Pre-allocate NumPy arrays instead of growing lists; use tuples for multiple returns.
Nopython avoids interpreter overhead, often delivering speedups of one to two orders of magnitude for tight loops (about 140× in the benchmark above).
Move string operations, I/O, and configuration outside @njit functions.
Test @njit functions independently to catch compilation errors early.

Frequently Asked Questions

Can I call a non-`@njit` function from inside `@njit`?

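Not directly. The callee must itself be compiled (for example, decorated with @njit); calling a plain Python function from nopython code fails with a typing error. If you truly need the interpreter, Numba's objmode context manager lets a block run in object mode, at interpreter speed. Keep call stacks nopython-only wherever performance matters.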

Why is my `@njit` function slower on the first call?

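That's JIT compilation overhead: the first call with a given set of argument types compiles the function, which can take from a fraction of a second to several seconds depending on its size. Subsequent calls run at native speed. In production, warm up the function before timing or use cache=True to persist compiled code.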

Can I use NumPy's fancy indexing in `@njit`?

Basic indexing (integers and slices) works, and so does indexing with a single one-dimensional integer or boolean array. Combining several index arrays in one expression is limited and depends on the Numba version. Stick to simple loop-based indexing for maximum compatibility.

What's the performance cost of array allocation inside `@njit`?

np.zeros() and np.empty() inside @njit are compiled to native allocations (fast). But allocating in a tight loop is still expensive. Pre-allocate outside the loop when possible.

Can I use `@njit` with floating-point complex numbers?

Yes. z = 1j and complex arithmetic work fully in nopython mode. Complex ufuncs are supported.
