Per-Interpreter GIL: Isolating State and Avoiding Deadlocks

The per-interpreter GIL gives each subinterpreter its own lock, so threads running in Interpreter A don't block threads in Interpreter B. It was introduced in CPython 3.12 (PEP 684) and is distinct from the free-threaded build (PEP 703, CPython 3.13), which can run without a GIL at all. This isolation is powerful but introduces new deadlock patterns. I've debugged production deadlocks where one thread held Interpreter A's GIL while waiting for a channel message from Interpreter B, which was blocked trying to send. This article teaches the mental model and the patterns that avoid such traps.

A subinterpreter's GIL protects its object heap and reference counts. Threads running code in that interpreter must hold the GIL. Unlike the process-wide GIL, there's no global serialization point. Code in Interpreter A and Interpreter B truly runs in parallel.

The Per-Interpreter GIL: Isolation Semantics

Each subinterpreter is a Python execution context with:

  • Its own __main__ module and global namespace.
  • Its own object heap (objects created in A are invisible to B).
  • Its own GIL (a lock protecting A's reference counts).

Threads running code in A acquire A's GIL. Threads running code in B acquire B's GIL. The two locks are independent; no global serialization.

import interpreters
import threading
import time

# Create two interpreters
interp_a = interpreters.create()
interp_b = interpreters.create()

# Code that holds the GIL for a long time
code_cpu_bound = """
import time
start = time.time()
iterations = 0
while time.time() - start < 2:
    x = 1 + 1
    iterations += 1
print(f"Done computing: {iterations} iterations")
"""

# Run CPU-bound code in both interpreters concurrently
t_a = threading.Thread(target=lambda: interpreters.run_string(interp_a, code_cpu_bound))
t_b = threading.Thread(target=lambda: interpreters.run_string(interp_b, code_cpu_bound))

start = time.time()
t_a.start()
t_b.start()
t_a.join()
t_b.join()
elapsed = time.time() - start

print(f"Both completed in {elapsed:.1f}s")
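# Wall time is ~2 seconds either way, because each loop is time-bounded.
# With independent GILs, each interpreter gets a full core for those 2 seconds.
# With one shared GIL, the loops would take turns and each would do about half the work.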

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)

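When each interpreter owns its GIL, the two threads run in parallel and both finish in about 2 seconds without slowing each other down. If the interpreters shared one GIL, the loops would take turns instead, so each would get through roughly half as many iterations in the same 2 seconds. Per-interpreter GILs were introduced in CPython 3.12 (PEP 684) and do not require the free-threaded build. The interpreters module imported in these examples is not a standard-library name; its function names resemble CPython's older private _xxsubinterpreters module, and the standard library exposes subinterpreters as concurrent.interpreters from Python 3.14.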
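Because availability differs by version and build, it can help to check what the running interpreter offers before relying on any of this. The sketch below uses only the standard library. `describe_runtime` is a hypothetical helper name, and `sys._is_gil_enabled()` is a private function that exists from CPython 3.13, so it is looked up defensively.

```python
import importlib.util
import sys


def describe_runtime():
    """Report which interpreter-related features this Python build exposes."""
    # sys._is_gil_enabled() exists on CPython 3.13+; older versions always have a GIL.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True

    # concurrent.interpreters is the standard-library module added in Python 3.14.
    has_stdlib_interpreters = (
        importlib.util.find_spec("concurrent.interpreters") is not None
    )

    return {
        "version": tuple(sys.version_info[:3]),
        "gil_enabled": gil_enabled,
        "has_concurrent_interpreters": has_stdlib_interpreters,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

The output varies by installation, so no expected values are shown here.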

Isolation Guarantees

Within an interpreter, Python's semantics are unchanged:

  • A function's local variables are isolated (thread-local by nature).
  • Global variables are shared (multiple threads in the same interpreter access the same __main__ namespace).
  • Bytecode execution is serialized by the GIL (only one thread holds it at a time), but compound updates such as x += 1 can still interleave and need a lock.

Across interpreters:

  • Direct object sharing is impossible (objects are tied to their interpreter's heap).
  • Data must be serialized (for example, pickled) or shared via low-level mechanisms (ctypes, mmap).
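To see why serialization gives isolation rather than sharing, consider a pickle round trip inside a single interpreter. This is only an analogy for what happens at an interpreter boundary (an assumption for illustration, not the channel implementation): the receiver gets an equal but distinct object.

```python
import pickle

original = {"data": [1, 2, 3]}

# Serialize to bytes, as one would before handing data to another interpreter.
payload = pickle.dumps(original)

# Deserialize: this builds a brand-new object graph from the bytes.
received = pickle.loads(payload)

# Mutating the copy leaves the sender's object untouched.
received["data"].append(4)

print(original)              # {'data': [1, 2, 3]}
print(received)              # {'data': [1, 2, 3, 4]}
print(received is original)  # False
```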

Example: a name defined in one interpreter is simply not visible in the other:

import interpreters

# Create two interpreters
interp_a = interpreters.create()
interp_b = interpreters.create()

code_a = """
x = {"data": [1, 2, 3]}
"""

code_b = """
# Trying to access x from interp_a would fail
# (but there's no syntax for it; each interpreter is isolated)
print("x is not visible here")
"""

interpreters.run_string(interp_a, code_a)
interpreters.run_string(interp_b, code_b)

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)

Data must flow through channels:

import interpreters
import threading

send_id, recv_id = interpreters.create_channel()

code_a = f"""
import interpreters
send_id = {send_id}
x = {{"data": [1, 2, 3]}}
interpreters.channel_send(send_id, x)
"""

code_b = f"""
import interpreters
recv_id = {recv_id}
x = interpreters.channel_recv(recv_id)
print(f"Received: {{x}}")
"""

interp_a = interpreters.create()
interp_b = interpreters.create()

t_a = threading.Thread(target=lambda: interpreters.run_string(interp_a, code_a))
t_b = threading.Thread(target=lambda: interpreters.run_string(interp_b, code_b))

t_a.start()
t_b.start()
t_a.join()
t_b.join()

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)

Deadlock Scenario 1: Channel Blocking with GIL

A classic deadlock: Thread A, running in Interpreter A, blocks while receiving from a channel. Thread B, running in Interpreter B, is supposed to send but is itself blocked, for example on another lock that Thread A holds (unlikely, but possible with nested locks). Neither side can make progress.

More commonly, Thread A and Thread B both wait for channel data without sending:

import interpreters
import threading

send_id, recv_id = interpreters.create_channel()

# DEADLOCK: both sides try to receive, no one sends
code_a = f"""
import interpreters
recv_id = {recv_id}
msg = interpreters.channel_recv(recv_id) # Waits forever
print(msg)
"""

code_b = f"""
import interpreters
send_id = {send_id}
msg = interpreters.channel_recv(send_id) # Can't receive on send side!
"""

interp_a = interpreters.create()
interp_b = interpreters.create()

t_a = threading.Thread(target=lambda: interpreters.run_string(interp_a, code_a))
t_b = threading.Thread(target=lambda: interpreters.run_string(interp_b, code_b))

t_a.start()
t_b.start()

# This will hang forever
# t_a.join()
# t_b.join()

print("(Commented out to avoid hanging)")

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)

Lesson: Ensure channel producers and consumers are paired correctly. Use a timeout to detect deadlocks:

import interpreters
import threading

send_id, recv_id = interpreters.create_channel()

code_a = f"""
import interpreters
send_id = {send_id}
interpreters.channel_send(send_id, "hello", timeout=2)
"""

code_b = f"""
import interpreters
recv_id = {recv_id}
msg = interpreters.channel_recv(recv_id, timeout=2)
print(f"Received: {{msg}}")
"""

interp_a = interpreters.create()
interp_b = interpreters.create()

t_a = threading.Thread(target=lambda: interpreters.run_string(interp_a, code_a))
t_b = threading.Thread(target=lambda: interpreters.run_string(interp_b, code_b))

t_a.start()
t_b.start()

t_a.join(timeout=5)
t_b.join(timeout=5)

# join() does not raise on timeout; check is_alive() instead
if t_a.is_alive() or t_b.is_alive():
    print("Timeout detected")

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)
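The same timeout idea can be tried without subinterpreters. In the sketch below a `queue.Queue` stands in for the channel (an assumption for illustration; it is not the channel API), and `recv_with_timeout` is a hypothetical helper that turns a missing sender into a reportable condition instead of a hang.

```python
import queue
import threading

channel = queue.Queue()  # stand-in for a channel between two workers


def recv_with_timeout(q, timeout):
    """Return (True, message) if a message arrives in time, else (False, None)."""
    try:
        return True, q.get(timeout=timeout)
    except queue.Empty:
        return False, None


results = {}


def consumer(name, timeout):
    results[name] = recv_with_timeout(channel, timeout)


# Case 1: nobody sends, so the receiver times out instead of blocking forever.
t = threading.Thread(target=consumer, args=("unpaired", 0.2))
t.start()
t.join()

# Case 2: a producer is paired with the consumer, so the message arrives.
t = threading.Thread(target=consumer, args=("paired", 2.0))
t.start()
channel.put("hello")
t.join()

print(results["unpaired"])  # (False, None)
print(results["paired"])    # (True, 'hello')
```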

Deadlock Scenario 2: Nested Locks Across Interpreters

If Thread A (running in Interpreter A) holds an external lock while it waits to receive from a channel, and Thread B (running in Interpreter B) must acquire that same lock before it sends, neither can proceed: A waits for B's message, and B waits for A's lock.

Example (avoid this pattern):

import interpreters
import threading
import time

send_id, recv_id = interpreters.create_channel()
external_lock = threading.Lock()

# Thread A: holds GIL_A, acquires external_lock, tries to recv
code_a = f"""
import interpreters
import threading
import time

external_lock = None # Passed differently in real code
recv_id = {recv_id}

# Simulate other work before receiving (time.sleep releases the GIL while it waits)
time.sleep(1)

# Try to receive (in the deadlock scenario, external_lock is still held here)
msg = interpreters.channel_recv(recv_id, timeout=2)
print(f"Received: {{msg}}")
"""

# Thread B: holds GIL_B, tries to acquire external_lock and send
code_b = f"""
import interpreters
import threading
import time

external_lock = None # Passed differently in real code
send_id = {send_id}

# Try to acquire external_lock (if Thread A holds it, we block)
# with external_lock:
#     interpreters.channel_send(send_id, "hello")
# DEADLOCK: Thread A waiting on recv, Thread B waiting on lock
"""

# This example shows the pattern (avoid nesting GIL + external locks)

Pattern to avoid: Never hold an external lock while waiting on a channel. Keep locks and channel I/O separate.

Safe pattern:

import interpreters
import threading

send_id, recv_id = interpreters.create_channel()

# Separate channel I/O from lock-protected sections
code_a = f"""
import interpreters

recv_id = {recv_id}

# Wait for message (not holding any external lock)
msg = interpreters.channel_recv(recv_id, timeout=2)

# Process message (acquire locks if needed)
print(f"Received: {{msg}}")
"""

code_b = f"""
import interpreters

send_id = {send_id}

# Send message
interpreters.channel_send(send_id, "hello")
"""

interp_a = interpreters.create()
interp_b = interpreters.create()

t_a = threading.Thread(target=lambda: interpreters.run_string(interp_a, code_a))
t_b = threading.Thread(target=lambda: interpreters.run_string(interp_b, code_b))

t_a.start()
t_b.start()
t_a.join()
t_b.join()

interpreters.destroy(interp_a)
interpreters.destroy(interp_b)

print("Safe, no deadlock")
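The separation can also be exercised with standard-library pieces alone. In this sketch a `queue.Queue` stands in for the channel and `state_lock` for the external lock (both are assumptions for illustration): the consumer blocks on the queue while holding no lock, then takes the lock only for the brief shared-state update.

```python
import queue
import threading

channel = queue.Queue()        # stand-in for a channel
state_lock = threading.Lock()  # stand-in for an external lock
shared_state = []


def consumer():
    # 1. Block on the channel while holding no lock.
    msg = channel.get(timeout=2)
    # 2. Take the lock only for the short critical section.
    with state_lock:
        shared_state.append(msg)


def producer():
    # The producer also uses the lock, but releases it before sending.
    # Because nobody waits on the channel while holding it, no deadlock can form.
    with state_lock:
        shared_state.append("producer ready")
    channel.put("hello")


t_consumer = threading.Thread(target=consumer)
t_producer = threading.Thread(target=producer)
t_consumer.start()
t_producer.start()
t_consumer.join()
t_producer.join()

print(shared_state)  # ['producer ready', 'hello']
```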

Deadlock Prevention Checklist

  1. Channel pairing: Ensure every channel_send() has a corresponding channel_recv(). Use timeouts to detect mismatches.
  2. Lock ordering: If using external locks (not per-interpreter GILs), acquire them in a consistent global order across threads to prevent circular waits.
  3. Separate concerns: Don't hold external locks while blocking on channel I/O. Decouple synchronization.
  4. Watchdog threads: In critical services, spawn a watchdog thread that detects hung interpreters and logs/restarts them.
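Item 2 can be enforced mechanically rather than by convention. The sketch below is one possible approach, assuming the locks are plain `threading.Lock` objects within a single interpreter. `acquire_in_order` is a hypothetical helper that sorts locks by `id()` so every thread acquires them in the same order, whatever order the caller passes them in.

```python
import threading
from contextlib import ExitStack, contextmanager


@contextmanager
def acquire_in_order(*locks):
    """Acquire all locks in one global order (by id) and release them on exit."""
    with ExitStack() as stack:
        for lock in sorted(locks, key=id):
            stack.enter_context(lock)
        yield


lock_a = threading.Lock()
lock_b = threading.Lock()
completed = []


def worker(name, first, second):
    # The two workers pass the locks in opposite orders on purpose;
    # the helper normalizes the order, so no circular wait can form.
    for _ in range(1000):
        with acquire_in_order(first, second):
            pass
    completed.append(name)


t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
t1.start()
t2.start()
t1.join(timeout=10)
t2.join(timeout=10)

print(sorted(completed))  # ['t1', 't2']
```

Sorting by `id()` is only stable for the lifetime of the lock objects, which is sufficient here because both locks outlive the workers.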

Example watchdog:

import interpreters
import threading
import time

def create_monitored_interpreter():
    """Create an interpreter and a watchdog thread."""
    interp = interpreters.create()
    last_ping = time.time()
    running = False
    lock = threading.Lock()

    def watchdog():
        """Monitor for deadlocks; warn if code has been running for over 10 seconds."""
        while True:
            time.sleep(10)
            with lock:
                # Only warn while code is running, so an idle interpreter is not flagged
                if running and time.time() - last_ping > 10:
                    print(f"Warning: Interpreter {interp} might be deadlocked")

    def run_code(code):
        nonlocal last_ping, running
        with lock:
            last_ping = time.time()
            running = True
        try:
            interpreters.run_string(interp, code)
        finally:
            with lock:
                last_ping = time.time()
                running = False

    watchdog_thread = threading.Thread(target=watchdog, daemon=True)
    watchdog_thread.start()

    return interp, run_code

# Use monitored interpreter
interp, run = create_monitored_interpreter()

code = """
import time
time.sleep(2)
print("Done")
"""

run(code)

interpreters.destroy(interp)

Key Takeaways

  • Per-interpreter GILs allow true parallelism; each interpreter's GIL is independent.
  • Objects don't cross interpreter boundaries; data flows via channels (serialization) or low-level shared memory.
  • Deadlock patterns differ from single-GIL Python: avoid holding external locks while blocking on channel I/O.
  • Use timeouts on channel operations to detect hangs.
  • Separate lock-protected sections from channel I/O to maintain deadlock freedom.

Frequently Asked Questions

Can a thread acquire two interpreters' GILs simultaneously?

No. A thread runs code in one interpreter at a time (holding one GIL). Switching interpreters requires releasing the current GIL. This is by design to prevent deadlocks.

What if I need to run code that accesses multiple interpreters?

You can't directly access objects from Interpreter A while running in Interpreter B. Instead, use channels: send data from A, receive in B, process, send results back. This enforces safe serialization.

How do I debug a deadlock involving subinterpreters?

Use py-spy or gdb to inspect thread stacks. Look for threads blocking on channel_recv() or locks. Verify that channel send/recv sides match. Add logging to pinpoint the exact point where threads block.
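When external tools are not available, a process can report its own thread stacks. The sketch below uses `sys._current_frames()` and `traceback` from the standard library. `dump_thread_stacks` and `blocked_worker` are hypothetical names, and the blocked thread waits on a `threading.Event` as a stand-in for a stuck channel_recv().

```python
import sys
import threading
import time
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every known thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(thread_id, '?')} ({thread_id}):\n")
        lines.extend(traceback.format_stack(frame))
    return "".join(lines)


def blocked_worker(event):
    # Simulates a thread stuck waiting for a message that never comes.
    event.wait(timeout=5)


stop = threading.Event()
t = threading.Thread(target=blocked_worker, args=(stop,), name="stuck-recv")
t.start()
time.sleep(0.2)  # give the worker time to block

report = dump_thread_stacks()
print(report)

stop.set()  # unblock the worker so the example exits cleanly
t.join()
```

The report lists each thread by name with the frames it is currently in, which makes a thread parked in a wait call easy to spot. The standard library's `faulthandler.dump_traceback_later(timeout)` offers a similar dump on a timer.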

Is the per-interpreter GIL visible to my code?

No. The GIL is acquired and released automatically; you can't explicitly call acquire() or release(). If you need explicit synchronization between threads of the same interpreter, use threading.Lock().

What's the overhead of per-interpreter GILs vs a global GIL?

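Per-interpreter GILs add a small fixed cost per interpreter (~100 bytes per lock), but the lack of global contention more than compensates for CPU-bound work spread across cores. Benchmarks show 2-4x improvement on multi-core workloads, though the figure depends on core count and on how much of the work is CPU-bound.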

Further Reading