Understanding the Interpreters Module: Python's New API

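The interpreters module, proposed in PEP 554, provides a public API for creating and managing subinterpreters. The examples in this article use the PEP 554 draft spelling (import interpreters, run_string(), channels). The module that actually shipped in the standard library is concurrent.interpreters in Python 3.14 (PEP 734, which superseded PEP 554), where the same ideas are spelled Interpreter.exec(), Interpreter.close(), and create_queue(). A subinterpreter is an isolated Python environment within the same process: it has its own GIL (since Python 3.12), its own imported modules and global namespace, and its own objects. Data passes between subinterpreters through message channels, not shared memory. I've used this in production to scale CPU-bound workloads without multiprocessing's overhead; this article teaches you the API end-to-end.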

The key insight: each subinterpreter is independent. Code running in Interpreter A cannot directly access Interpreter B's variables or objects. This isolation is a feature—it prevents accidental data races and makes parallelism straightforward.
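Whether any of this is importable depends on the Python version in use. The following is a minimal sketch of a feature check, assuming the standard-library spelling concurrent.interpreters (added in Python 3.14 by PEP 734). The helper name is made up for illustration.

```python
import importlib.util
import sys


def has_stdlib_interpreters() -> bool:
    """Return True if the concurrent.interpreters package is present."""
    # find_spec() only locates the submodule; None means it was not found.
    return importlib.util.find_spec("concurrent.interpreters") is not None


print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
      f"concurrent.interpreters present = {has_stdlib_interpreters()}")
```

On interpreters where the check returns False, the subinterpreter examples cannot be run with the standard library alone.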

Creating and Destroying Subinterpreters

Create a subinterpreter with interpreters.create(), which returns an opaque InterpreterID. Run code in it using interpreters.run_string() or interpreters.run_file(). Destroy it with interpreters.destroy().

import interpreters

# Create a new subinterpreter
interp = interpreters.create()
print(f"Created interpreter: {interp}")

# Run code in it
interpreters.run_string(interp, """
print("Hello from subinterpreter!")
x = 42
print(f"x = {x}")
""")

# Destroy it when done
interpreters.destroy(interp)
print("Interpreter destroyed")
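For comparison, here is a hedged sketch of the same create/run/destroy cycle using the module that ships in Python 3.14, concurrent.interpreters. It assumes interp.exec() plays the role of run_string() and interp.close() the role of destroy(). The function name lifecycle_demo is illustrative, and on older Pythons the demo simply returns None.

```python
try:
    from concurrent import interpreters  # Python 3.14+
except ImportError:
    interpreters = None  # older Python: the demo below is skipped


def lifecycle_demo():
    """Create an interpreter, run two snippets in it, then close it."""
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        interp.exec("x = 42")
        # A second exec() sees x because both run in the interpreter's __main__.
        interp.exec("assert x == 42")
        return True
    finally:
        interp.close()  # counterpart of destroy() in the article's spelling


print(lifecycle_demo())
```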

Each interpreter is completely isolated. Variables defined in one are invisible to others:

import interpreters

# Create two interpreters
interp1 = interpreters.create()
interp2 = interpreters.create()

# Set a variable in interp1
interpreters.run_string(interp1, "x = 100")

# Try to access x in interp2 (it doesn't exist)
try:
    interpreters.run_string(interp2, "print(x)")
except RuntimeError:
    # The NameError is raised inside interp2 and reaches the parent as RuntimeError
    print("x is not visible in interp2 (expected)")

# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)

Passing Data via Channels

Data between interpreters flows through Channel objects. Create a channel with interpreters.create_channel(), which returns a (send_id, recv_id) pair. One interpreter holds the send end, another holds the receive end. Data is pickled when sent and unpickled when received.

import interpreters
import threading

# Create a channel
send_id, recv_id = interpreters.create_channel()

# Subinterpreter 1: producer
code_producer = f"""
import interpreters
send_id = {send_id}
for i in range(5):
    interpreters.channel_send(send_id, f"Message {{i}}")
    print(f"Sent: Message {{i}}")
"""

# Subinterpreter 2: consumer
code_consumer = f"""
import interpreters
recv_id = {recv_id}
for i in range(5):
    msg = interpreters.channel_recv(recv_id)
    print(f"Received: {{msg}}")
"""

# Create interpreters
interp1 = interpreters.create()
interp2 = interpreters.create()

# Run in parallel
def run_producer():
    interpreters.run_string(interp1, code_producer)

def run_consumer():
    interpreters.run_string(interp2, code_consumer)

t1 = threading.Thread(target=run_producer)
t2 = threading.Thread(target=run_consumer)

t1.start()
t2.start()
t1.join()
t2.join()

# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)

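Example output (the producer and consumer run concurrently, so the exact interleaving varies between runs):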

Sent: Message 0
Received: Message 0
Sent: Message 1
Received: Message 1
...

The channel is FIFO and thread-safe. channel_send() blocks if the buffer is full; channel_recv() blocks until data arrives. You can pass any pickleable object: strings, dicts, lists, NumPy arrays, etc.
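Because values are pickled on the way through, the receiver gets a copy rather than the sender's object. The sketch below uses only the pickle module as a stand-in for a channel round trip, assuming channels pickle as described above. The helper is_picklable is a hypothetical name, not part of any interpreters API.

```python
import pickle
import threading


def is_picklable(obj) -> bool:
    """Return True if obj survives pickle.dumps(), the stated requirement."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


original = {"scores": [1, 2, 3]}
# Stand-in for channel_send() followed by channel_recv().
received = pickle.loads(pickle.dumps(original))
received["scores"].append(4)

print(original["scores"])              # [1, 2, 3]  (the sender's copy is unchanged)
print(received["scores"])              # [1, 2, 3, 4]
print(is_picklable(original))          # True
print(is_picklable(threading.Lock()))  # False: locks cannot be pickled
```

Checking picklability before sending is a cheap way to fail early in the parent instead of inside a worker.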

Running Code with Input/Output

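run_string() does not return anything the subinterpreter prints. Each subinterpreter has its own sys module, so redirecting sys.stdout in the parent has no effect on the child; by default its print() output goes straight to the process's standard output.

import interpreters

# Create interpreter
interp = interpreters.create()

# Code that prints from inside the subinterpreter
code = """
result = sum(range(1, 11))
print(f"Sum: {result}")
"""

# The output appears on the process's stdout; nothing is returned to the caller
interpreters.run_string(interp, code)

interpreters.destroy(interp)

For production use, do the redirection inside the subinterpreter and send the captured text back over a channel. The wrapper below also records the traceback of any error on the captured stderr:

import interpreters
import threading

# Runs inside the subinterpreter: redirect its own stdout/stderr, execute the
# user code, then send the captured text back over the channel.
WRAPPER = """
import contextlib, io, traceback
import interpreters
_out, _err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(_out), contextlib.redirect_stderr(_err):
    try:
        exec({code!r}, globals())
    except Exception:
        traceback.print_exc()
interpreters.channel_send({send_id}, (_out.getvalue(), _err.getvalue()))
"""

def run_with_capture(interp, code):
    """Run code in a subinterpreter and capture its stdout/stderr."""
    send_id, recv_id = interpreters.create_channel()
    wrapped = WRAPPER.format(code=code, send_id=send_id)
    # Run in a thread so the parent can receive while the child sends
    worker = threading.Thread(target=interpreters.run_string, args=(interp, wrapped))
    worker.start()
    stdout, stderr = interpreters.channel_recv(recv_id)
    worker.join()
    return stdout, stderr

# Use it
interp = interpreters.create()
out, err = run_with_capture(interp, "print('Hello')")
print(f"Captured: {out}")
interpreters.destroy(interp)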

Interpreter State and Namespaces

Each subinterpreter maintains its own __main__ module and global namespace. State persists across multiple run_string() calls on the same interpreter.

import interpreters

interp = interpreters.create()

# First call: define a function
interpreters.run_string(interp, """
def greet(name):
    return f"Hello, {name}!"
""")

# Second call: use the function (state persists)
interpreters.run_string(interp, """
msg = greet("Alice")
print(msg)
""")

# Third call: modify state
interpreters.run_string(interp, """
data = [1, 2, 3]
""")

interpreters.destroy(interp)

Use this to pool interpreters and reuse them across multiple tasks. Pre-load common modules and helper functions once:

import interpreters

def create_worker_pool(num_workers, init_code):
    """Create a pool of pre-initialized subinterpreters."""
    pool = []
    for _ in range(num_workers):
        interp = interpreters.create()
        interpreters.run_string(interp, init_code)
        pool.append(interp)
    return pool

# Initialize pool with common imports
init = """
import json
import hashlib

def hash_data(data):
    return hashlib.sha256(data.encode()).hexdigest()
"""

pool = create_worker_pool(4, init)

# Use interpreters from pool
for i, interp in enumerate(pool):
    interpreters.run_string(interp, f"print('Worker {i} ready')")

# Clean up
for interp in pool:
    interpreters.destroy(interp)
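If you would rather not manage the pool by hand, Python 3.14 also provides concurrent.futures.InterpreterPoolExecutor, which runs submitted callables in worker subinterpreters. The sketch below assumes that class is available and returns None otherwise. The function name squares_in_pool is illustrative only.

```python
try:
    from concurrent.futures import InterpreterPoolExecutor  # Python 3.14+
except ImportError:
    InterpreterPoolExecutor = None


def squares_in_pool(values, workers=2):
    """Square each value in a pool of subinterpreters; None if unsupported."""
    if InterpreterPoolExecutor is None:
        return None
    with InterpreterPoolExecutor(max_workers=workers) as pool:
        # pow is a builtin, so it can be looked up by name in every worker.
        return list(pool.map(pow, values, [2] * len(values)))


print(squares_in_pool([1, 2, 3]))
```

The executor handles worker creation and teardown, at the cost of the per-worker initialization control that the hand-written pool above gives you.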

Exception Handling Across Interpreters

Exceptions in a subinterpreter terminate that interpreter's current run_string() call. Catch them in the parent:

import interpreters

interp = interpreters.create()

try:
    interpreters.run_string(interp, """
x = 1 / 0  # ZeroDivisionError
""")
except RuntimeError as e:
    # Errors in subinterpreters raise RuntimeError in the parent
    print(f"Subinterpreter error: {e}")

# Interpreter still exists (but may have partial state)
interpreters.run_string(interp, "print('Still running')")

interpreters.destroy(interp)
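In the Python 3.14 module (concurrent.interpreters) the same pattern applies, with one assumption to note: the exception raised in the parent is interpreters.ExecutionFailed, not a bare RuntimeError. A minimal sketch, with failure_message as a made-up helper name:

```python
try:
    from concurrent import interpreters  # Python 3.14+
except ImportError:
    interpreters = None


def failure_message(source):
    """Run source in a fresh interpreter; return the error text, or 'ok'."""
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        interp.exec(source)
    except interpreters.ExecutionFailed as exc:
        # The original exception object stays in the child; the parent gets a summary.
        return str(exc)
    finally:
        interp.close()
    return "ok"


print(failure_message("x = 1 / 0"))
print(failure_message("x = 1"))
```

Returning the message as a string keeps the parent free of any reference to objects that live in the child.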

Listing and Querying Interpreters

Use interpreters.list_all() to enumerate active interpreters, and interpreters.get_main() to get the main interpreter's ID.

import interpreters

print(f"Main interpreter: {interpreters.get_main()}")

# Create some subinterpreters
interp1 = interpreters.create()
interp2 = interpreters.create()

# List all
all_interps = interpreters.list_all()
print(f"Active interpreters: {all_interps}")

# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)

Key Takeaways

Create subinterpreters with interpreters.create(); each has its own GIL, global namespace, and objects.
Pass data via channels: create_channel() returns (send_id, recv_id); use channel_send() and channel_recv() to exchange pickleable objects.
State persists within an interpreter across multiple run_string() calls; use this to pool pre-initialized interpreters.
Exceptions in subinterpreters raise RuntimeError in the parent; always wrap calls in try-except.
List active interpreters with list_all(); destroy them with destroy() when done.

Frequently Asked Questions

How much memory does each subinterpreter use?

~1-5 MB for a minimal interpreter (just the namespace and GIL structure). Pre-loading modules increases this; a subinterpreter with NumPy, Pandas, and other libraries loaded costs ~50-100 MB. Plan accordingly when pooling.

Can I share mutable objects between subinterpreters?

No. Objects cannot be shared directly; they must be pickled and sent through channels. This is by design: it prevents data races. If you need efficient sharing of large arrays, use memory-mapped files (for example, numpy.memmap).

What happens if I call destroy() on a running interpreter?

Destroying an interpreter while run_string() is still executing in it is unsafe, and the resulting behavior is undefined; avoid it. Always join() threads running subinterpreter code before destroying.

Can subinterpreters create their own subinterpreters?

Yes, but nested subinterpreters are rarely useful. Each subinterpreter can call interpreters.create() to spawn children, but isolation and communication become complex. Stick to a two-level hierarchy (main + workers) for clarity.

Is there a limit to how many subinterpreters I can create?

No hard limit, but practical limits exist. Memory (~1 MB per interpreter minimum) and lock contention (each interpreter adds lock structures) mean creating thousands is impractical. Typical pools are 2-32 interpreters; adjust based on your hardware and workload.
