Understanding the Interpreters Module: Python's New API
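The interpreters module gives Python code an API for creating and managing subinterpreters. It was first proposed in PEP 554, which was later superseded by PEP 734; the public module shipped in Python 3.14 as concurrent.interpreters, while earlier releases exposed only private, unstable modules. The examples in this article use a function-style spelling (import interpreters, run_string(), channel_send()) close to those earlier drafts, so treat the exact names as illustrative and check them against the Python version you run. A subinterpreter is an isolated Python environment within the same process: since Python 3.12 (PEP 684) it can have its own GIL, and it always has its own imported modules, its own __main__ namespace, and its own objects. Data moves between subinterpreters through channels rather than shared objects. I've used this in production to scale CPU-bound workloads without multiprocessing's overhead; this article teaches you the API end-to-end.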
The key insight: each subinterpreter is independent. Code running in Interpreter A cannot directly access Interpreter B's variables or objects. This isolation is a feature—it prevents accidental data races and makes parallelism straightforward.
Creating and Destroying Subinterpreters
Create a subinterpreter with interpreters.create(), which returns an opaque InterpreterID. Run code in it using interpreters.run_string() or interpreters.run_file(). Destroy it with interpreters.destroy().
import interpreters
# Create a new subinterpreter
interp = interpreters.create()
print(f"Created interpreter: {interp}")
# Run code in it
interpreters.run_string(interp, """
print("Hello from subinterpreter!")
x = 42
print(f"x = {x}")
""")
# Destroy it when done
interpreters.destroy(interp)
print("Interpreter destroyed")
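On Python 3.14 and later the public module is concurrent.interpreters, and the operations shown above are methods on an Interpreter object rather than module-level functions. The following is a minimal sketch of the same create/run/destroy cycle under that assumption; lifecycle_demo is a hypothetical helper name, and on older Python versions it simply returns None instead of running anything.

```python
try:
    from concurrent import interpreters  # public module, Python 3.14+
except ImportError:
    interpreters = None  # older Python: nothing to demonstrate


def lifecycle_demo():
    """Create an interpreter, run code in it, then close it.

    Returns how many interpreters were alive while the new one existed,
    or None when concurrent.interpreters is not available.
    """
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # exec() runs the source in the subinterpreter's __main__ namespace.
        interp.exec("x = 42\nprint(f'x = {x}')")
        return len(interpreters.list_all())  # main interpreter + the new one
    finally:
        interp.close()  # counterpart of destroy() in the article's spelling


alive = lifecycle_demo()
print(f"interpreters alive during the demo: {alive}")
```

The try/finally mirrors the article's advice to always destroy what you create, even when the code inside the subinterpreter fails.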
Each interpreter is completely isolated. Variables defined in one are invisible to others:
import interpreters
# Create two interpreters
interp1 = interpreters.create()
interp2 = interpreters.create()
# Set a variable in interp1
interpreters.run_string(interp1, "x = 100")
# Try to read x in interp2; the failure surfaces in the parent as RuntimeError
try:
    interpreters.run_string(interp2, "print(x)")
except RuntimeError:
    print("x is not visible in interp2 (expected)")
# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)
Passing Data via Channels
Data between interpreters flows through channels. Create one with interpreters.create_channel(), which returns a (send_id, recv_id) pair. One interpreter holds the send end, another holds the receive end. Data is pickled when sent and unpickled when received, so the receiver always works on its own copy, never on the sender's object.
import interpreters
import threading
# Create a channel
send_id, recv_id = interpreters.create_channel()
# Subinterpreter 1: producer
code_producer = f"""
import interpreters
send_id = {send_id}
for i in range(5):
    interpreters.channel_send(send_id, f"Message {{i}}")
    print(f"Sent: Message {{i}}")
"""
# Subinterpreter 2: consumer
code_consumer = f"""
import interpreters
recv_id = {recv_id}
for i in range(5):
    msg = interpreters.channel_recv(recv_id)
    print(f"Received: {{msg}}")
"""
# Create interpreters
interp1 = interpreters.create()
interp2 = interpreters.create()
# Run each interpreter in its own thread so they execute in parallel
def run_producer():
    interpreters.run_string(interp1, code_producer)
def run_consumer():
    interpreters.run_string(interp2, code_consumer)
t1 = threading.Thread(target=run_producer)
t2 = threading.Thread(target=run_consumer)
t1.start()
t2.start()
t1.join()
t2.join()
# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)
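Output (the interleaving of Sent and Received lines can differ between runs, because the two threads are scheduled independently):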
Sent: Message 0
Received: Message 0
Sent: Message 1
Received: Message 1
...
The channel is FIFO and thread-safe. channel_send() blocks if the buffer is full; channel_recv() blocks until data arrives. You can pass any picklable object: strings, dicts, lists, NumPy arrays, etc. Because every send copies the payload, large objects cost time and memory on each trip.
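The copy semantics described above can be illustrated without any subinterpreter at all. The sketch below assumes, as the article states, that a channel pickles on send and unpickles on receive; send_copy is a hypothetical stand-in for that round trip, not part of any interpreters API.

```python
import pickle
import threading


def send_copy(obj):
    """Simulate a channel round trip: serialize on send, rebuild on receive."""
    return pickle.loads(pickle.dumps(obj))


original = {"job": 7, "items": [1, 2, 3]}
received = send_copy(original)

# Mutating the received copy leaves the sender's object untouched.
received["items"].append(4)
print(original["items"])
print(received["items"])

# Objects that cannot be pickled cannot travel this way at all.
try:
    send_copy(threading.Lock())
    lock_sent = True
except TypeError as exc:
    lock_sent = False
    print(f"cannot send a lock: {exc}")
```

This is why locks, open files, and similar process-local resources should be created inside the interpreter that uses them rather than passed in.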
Running Code with Input/Output
Each subinterpreter has its own sys module, so anything it prints goes through its own sys.stdout to the process's standard output, regardless of what the parent has done to its own sys.stdout. A plain run_string() call therefore prints directly:
import interpreters
# Create interpreter
interp = interpreters.create()
# Code that prints from inside the subinterpreter
code = """
result = sum(range(1, 11))
print(f"Sum: {result}")
"""
# Runs and prints "Sum: 55" to the process's stdout; nothing is captured here
interpreters.run_string(interp, code)
interpreters.destroy(interp)
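To capture output, redirect sys.stdout and sys.stderr inside the subinterpreter (swapping the parent's sys.stdout has no effect on the child) and send the captured text back over a channel:
import interpreters
import textwrap
def run_with_capture(interp, code):
    """Run code in a subinterpreter and return its captured (stdout, stderr)."""
    send_id, recv_id = interpreters.create_channel()
    # The wrapper runs entirely inside the subinterpreter: it redirects that
    # interpreter's own streams, executes the user code, and sends the text back.
    wrapper = textwrap.dedent(f"""
        import contextlib, io, traceback
        import interpreters
        _out, _err = io.StringIO(), io.StringIO()
        with contextlib.redirect_stdout(_out), contextlib.redirect_stderr(_err):
            try:
                exec({code!r}, globals())
            except Exception:
                traceback.print_exc()  # goes to the redirected stderr
        interpreters.channel_send({send_id}, (_out.getvalue(), _err.getvalue()))
    """)
    interpreters.run_string(interp, wrapper)
    return interpreters.channel_recv(recv_id)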
# Use it
interp = interpreters.create()
out, err = run_with_capture(interp, "print('Hello')")
print(f"Captured: {out}")
interpreters.destroy(interp)
Interpreter State and Namespaces
Each subinterpreter maintains its own __main__ module and global namespace. State persists across multiple run_string() calls on the same interpreter.
import interpreters
interp = interpreters.create()
# First call: define a function
interpreters.run_string(interp, """
def greet(name):
    return f"Hello, {name}!"
""")
# Second call: use the function (state persists)
interpreters.run_string(interp, """
msg = greet("Alice")
print(msg)
""")
# Third call: modify state
interpreters.run_string(interp, """
data = [1, 2, 3]
""")
interpreters.destroy(interp)
Use this to pool interpreters and reuse them across multiple tasks. Pre-load common modules and helper functions once:
import interpreters
def create_worker_pool(num_workers, init_code):
    """Create a pool of pre-initialized subinterpreters."""
    pool = []
    for _ in range(num_workers):
        interp = interpreters.create()
        interpreters.run_string(interp, init_code)
        pool.append(interp)
    return pool
# Initialize pool with common imports
init = """
import json
import hashlib
def hash_data(data):
    return hashlib.sha256(data.encode()).hexdigest()
"""
pool = create_worker_pool(4, init)
# Use interpreters from pool
for i, interp in enumerate(pool):
    interpreters.run_string(interp, f"print('Worker {i} ready')")
# Clean up
for interp in pool:
    interpreters.destroy(interp)
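Once a pool exists, something has to decide which worker takes the next task and make sure no interpreter runs two tasks at once. One possible approach, sketched here with a hypothetical WorkerPool class, is to keep idle workers in a queue.Queue and borrow them through a context manager. The handles are plain strings standing in for the interpreters returned by create_worker_pool().

```python
import queue
from contextlib import contextmanager


class WorkerPool:
    """Hand out pooled worker handles one at a time (hypothetical helper)."""

    def __init__(self, workers):
        self._idle = queue.Queue()
        for worker in workers:
            self._idle.put(worker)

    @contextmanager
    def checkout(self, timeout=None):
        # Blocks until a worker is free; raises queue.Empty on timeout.
        worker = self._idle.get(timeout=timeout)
        try:
            yield worker
        finally:
            # Always return the worker to the pool, even if the task failed.
            self._idle.put(worker)


# Stand-in handles; real code would pass interpreters here.
pool = WorkerPool(["interp-0", "interp-1"])
with pool.checkout() as first:
    with pool.checkout() as second:
        in_use = {first, second}

print(sorted(in_use))
print(f"idle after use: {pool._idle.qsize()}")
```

Because queue.Queue is thread-safe, several threads can call checkout() concurrently, and each gets a different worker.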
Exception Handling Across Interpreters
Exceptions in a subinterpreter terminate that interpreter's current run_string() call. Catch them in the parent:
import interpreters
interp = interpreters.create()
try:
    interpreters.run_string(interp, """
x = 1 / 0 # ZeroDivisionError
""")
except RuntimeError as e:
    # Errors in subinterpreters raise RuntimeError in the parent
    print(f"Subinterpreter error: {e}")
# Interpreter still exists (but may have partial state)
interpreters.run_string(interp, "print('Still running')")
interpreters.destroy(interp)
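The exception class you catch depends on which module you are using. As a hedged sketch for Python 3.14+, where concurrent.interpreters reports a failure inside the subinterpreter as ExecutionFailed, the same pattern could look like this; run_safely is a hypothetical helper, and it returns None on versions where the module is missing.

```python
try:
    from concurrent import interpreters  # public module, Python 3.14+
except ImportError:
    interpreters = None


def run_safely(source):
    """Run source in a fresh interpreter and report (ok, message).

    Returns None when concurrent.interpreters is not available.
    """
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        try:
            interp.exec(source)
        except interpreters.ExecutionFailed as exc:
            # The original exception stays in the subinterpreter; the parent
            # only receives a summary of it.
            return (False, str(exc))
        return (True, "")
    finally:
        interp.close()


print(run_safely("x = 1 / 0"))
print(run_safely("x = 1"))
```

Returning a result tuple instead of re-raising keeps one failing task from taking down the loop that feeds work to the pool.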
Listing and Querying Interpreters
Use interpreters.list_all() to enumerate active interpreters, and interpreters.get_main() to get the main interpreter's ID.
import interpreters
print(f"Main interpreter: {interpreters.get_main()}")
# Create some subinterpreters
interp1 = interpreters.create()
interp2 = interpreters.create()
# List all
all_interps = interpreters.list_all()
print(f"Active interpreters: {all_interps}")
# Clean up
interpreters.destroy(interp1)
interpreters.destroy(interp2)
Key Takeaways
- Create subinterpreters with interpreters.create(); each has its own GIL, namespace, and memory context.
- Pass data via channels: create_channel() returns (send_id, recv_id); use channel_send() and channel_recv() to exchange picklable objects.
- State persists within an interpreter across multiple run_string() calls; use this to pool pre-initialized interpreters.
- Exceptions in subinterpreters raise RuntimeError in the parent; always wrap calls in try-except.
- List active interpreters with list_all(); destroy them with destroy() when done.
Frequently Asked Questions
How much memory does each subinterpreter use?
~1-5 MB for a minimal interpreter (just the namespace and GIL structure). Pre-loading modules increases this; a subinterpreter with NumPy, Pandas, and other libraries loaded costs ~50-100 MB. Plan accordingly when pooling.
Can I share mutable objects between subinterpreters?
No. Objects cannot be shared directly; they must be pickled and sent through channels. This is by design: it prevents data races. If you need efficient sharing of large arrays, back them with a memory-mapped file (for example via numpy.memmap) so that each interpreter maps the same bytes instead of receiving a copy.
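A minimal sketch of the memory-mapped approach, using only the standard library's mmap module: two independent mappings of the same file stand in for two interpreters here, and the only thing that would need to travel through a channel is the file path. The 16-byte buffer size is an arbitrary choice for illustration.

```python
import mmap
import os
import tempfile

# Create a small zero-filled file to back the shared buffer.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, bytes(16))
    os.close(fd)

    # Each "side" opens and maps the file independently, as two interpreters would.
    with open(path, "r+b") as writer_file, open(path, "r+b") as reader_file:
        writer = mmap.mmap(writer_file.fileno(), 0)
        reader = mmap.mmap(reader_file.fileno(), 0)
        try:
            writer[0:5] = b"hello"      # write through one mapping
            seen = bytes(reader[0:5])   # read the same bytes through the other
        finally:
            writer.close()
            reader.close()
finally:
    os.remove(path)

print(seen)
```

Nothing here synchronizes access, so real code still needs a protocol (for example, a channel message saying "region N is ready") before a reader touches the data.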
What happens if I call destroy() on a running interpreter?
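Avoid it. Do not count on an interpreter being torn down cleanly in the middle of a run_string() call; CPython's implementation refuses to destroy an interpreter that is still running and raises an error instead. Always join() threads running subinterpreter code before destroying.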
Can subinterpreters create their own subinterpreters?
Yes, but nested subinterpreters are rarely useful. Each subinterpreter can call interpreters.create() to spawn children, but isolation and communication become complex. Stick to a two-level hierarchy (main + workers) for clarity.
Is there a limit to how many subinterpreters I can create?
No hard limit, but practical limits exist. Memory (~1 MB per interpreter minimum) and lock contention (each interpreter adds lock structures) mean creating thousands is impractical. Typical pools are 2-32 interpreters; adjust based on your hardware and workload.