ProcessPoolExecutor vs multiprocessing.Pool: Which to Use?
concurrent.futures.ProcessPoolExecutor and multiprocessing.Pool both distribute work across multiple processes, but they differ in API design, result handling, and integration with async code. ProcessPoolExecutor returns Future objects and plays well with async/await and map-reduce patterns, making it the recommended choice for new code. Pool is older and lower-level, suitable for simple batch workloads where you don't need Futures. This article breaks down their architectures, APIs, and decision tree for choosing one.
Core Difference: Futures vs. Direct Results
The fundamental design difference is how they return results.
multiprocessing.Pool returns values directly:
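import multiprocessing

def square(x):  # module-level so it can be pickled (a lambda cannot)
    return x ** 2

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        results = pool.map(square, [1, 2, 3])
    print(results)  # [1, 4, 9], returned as a plain list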
ProcessPoolExecutor.submit() returns Future objects (promise-like handles to results that may not be ready yet; the task itself is scheduled right away):
from concurrent.futures import ProcessPoolExecutor

def square(x):  # module-level so it can be pickled (a lambda cannot)
    return x ** 2

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(square, x) for x in [1, 2, 3]]
        results = [f.result() for f in futures]
    print(results)  # [1, 4, 9], same result, but via Futures
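This difference cascades: Futures give every task a uniform interface for non-blocking status checks (done()), timeouts, completion callbacks, and cancellation of tasks that have not started. Pool's AsyncResult covers part of this with ready() and get(timeout=...), but only for the *_async methods, and it cannot cancel an individual task.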
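To make the non-blocking checks concrete, here is a minimal sketch. The helper names `square_slowly` and `describe` are made up for illustration; the Future methods (`cancelled()`, `running()`, `done()`, `add_done_callback()`) are part of concurrent.futures.

```python
from concurrent.futures import ProcessPoolExecutor
import time

def square_slowly(x):
    time.sleep(0.2)
    return x ** 2

def describe(future):
    """Report a Future's state without blocking (hypothetical helper)."""
    if future.cancelled():
        return "cancelled"
    if future.running():
        return "running"
    if future.done():
        return "done"
    return "pending"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=1) as executor:
        future = executor.submit(square_slowly, 3)
        # The callback runs in the parent process once the Future finishes
        future.add_done_callback(lambda f: print("callback got", f.result()))
        print(describe(future))  # state right after submission
        print(future.result())   # blocks until the value is ready
        print(describe(future))  # state after the result has arrived
```

The callback may be a lambda because it never crosses a process boundary; only the submitted function and its arguments are pickled.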
API Comparison
| Feature | Pool | ProcessPoolExecutor |
|---|---|---|
| Module | multiprocessing | concurrent.futures |
| Task submission | map(), apply_async(), imap() | submit(), map() |
| Returns | Values (list, iterator, AsyncResult) | Future objects from submit(); iterator of results from map() |
| Blocking get | AsyncResult.get() | Future.result() |
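| Timeout support | AsyncResult.get(timeout=...) and wait(timeout=...) | Built-in on result(), wait(), and as_completed() |
| Task cancellation | Not supported (only terminate() for the whole pool) | Future.cancel(), for tasks that have not started |
| Async/await integration | Manual wrapper needed | loop.run_in_executor() |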
| Context manager | Yes | Yes |
| Startup overhead | Same | Same |
Code Comparison: Submitting Tasks
multiprocessing.Pool
import multiprocessing

def process_item(item):
    return item ** 2

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        # map: submits everything, blocks until all results are ready
        results = pool.map(process_item, [1, 2, 3, 4, 5])

        # imap: lazy iterator, yields results in input order
        # (imap_unordered yields them in completion order instead)
        for result in pool.imap(process_item, [1, 2, 3, 4, 5]):
            print(result)

        # apply_async: one task, returns a Future-like AsyncResult
        async_result = pool.apply_async(process_item, (10,))
        result = async_result.get(timeout=5)
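The ordering difference between imap() and imap_unordered() is easy to miss, so here is a small sketch of it. `sleep_then_return` is a hypothetical worker function, and the delays are arbitrary.

```python
import multiprocessing
import time

def sleep_then_return(seconds):
    time.sleep(seconds)
    return seconds

if __name__ == "__main__":
    delays = [0.3, 0.1, 0.2]
    with multiprocessing.Pool(processes=3) as pool:
        # imap keeps the input order, even if later items finish first
        in_order = list(pool.imap(sleep_then_return, delays))
        # imap_unordered yields each result as soon as it is available
        as_ready = list(pool.imap_unordered(sleep_then_return, delays))
    print(in_order)  # same order as delays
    print(as_ready)  # completion order, which may differ from the input order
```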
ProcessPoolExecutor
from concurrent.futures import ProcessPoolExecutor, as_completed

def process_item(item):
    return item ** 2

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        # submit: one task at a time, returns a Future
        future = executor.submit(process_item, 10)
        result = future.result(timeout=5)

        # map: like Pool.map, but returns an iterator of results
        # in input order (results, not Futures)
        results = list(executor.map(process_item, [1, 2, 3, 4, 5]))

        # as_completed: process results in completion order
        futures = [executor.submit(process_item, x) for x in [1, 2, 3, 4, 5]]
        for future in as_completed(futures):
            print(f"Result: {future.result()}")
Advanced: Future-Based Patterns
Cancellation
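ProcessPoolExecutor allows you to cancel a task before it starts. Once a task has been handed to a worker, cancel() returns False, so the example queues more tasks than the pool can pick up and cancels the last one:
from concurrent.futures import ProcessPoolExecutor
import time

def slow_task(n):
    time.sleep(n)
    return n ** 2

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        # Queue more tasks than workers so the last one is still pending
        futures = [executor.submit(slow_task, 1) for _ in range(8)]
        if futures[-1].cancel():
            print("Task cancelled successfully")
        else:
            print("Task already running; cannot cancel")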
Pool has no cancellation support. You must use terminate() on the entire pool or wrap tasks with your own cancellation logic.
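A minimal sketch of the terminate() route, assuming it is acceptable to throw away the whole pool when one task overruns. `run_with_deadline` is a hypothetical helper; it relies on the documented behavior that leaving a Pool's with block calls terminate().

```python
import multiprocessing
import time

def slow_task(n):
    time.sleep(n)
    return n ** 2

def run_with_deadline(seconds, deadline):
    """Run slow_task in a one-off pool; give up after `deadline` seconds."""
    with multiprocessing.Pool(processes=1) as pool:
        async_result = pool.apply_async(slow_task, (seconds,))
        try:
            return async_result.get(timeout=deadline)
        except multiprocessing.TimeoutError:
            # Leaving the with block calls pool.terminate(), killing the worker
            return None

if __name__ == "__main__":
    print(run_with_deadline(0.1, 2))  # finishes in time, prints its result
    print(run_with_deadline(5, 0.5))  # overruns, prints None
```

This kills every worker in the pool, not just the slow one, which is why a dedicated pool per deadline-bound task is used here.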
Timeout with Exception Handling
Future.result(timeout=...) raises TimeoutError if the result is not ready in time. The timeout only ends the wait; the task itself keeps running in its worker:
from concurrent.futures import ProcessPoolExecutor, TimeoutError
import time

def task(duration):
    time.sleep(duration)
    return "Done"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        future = executor.submit(task, 10)
        try:
            result = future.result(timeout=2)
        except TimeoutError:
            print("Task exceeded 2-second timeout")
            # The task is already running, so cancel() returns False here
            # and the with block still waits for it to finish on exit.
            future.cancel()
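For a batch of tasks, concurrent.futures.wait() applies one deadline to all of them. The sketch below uses a hypothetical helper, `split_at_deadline`, to separate finished from unfinished Futures and cancel whatever has not started; the durations are arbitrary.

```python
from concurrent.futures import ProcessPoolExecutor, wait
import time

def task(duration):
    time.sleep(duration)
    return duration

def split_at_deadline(futures, timeout):
    """Wait up to `timeout` seconds, then try to cancel unfinished Futures."""
    done, not_done = wait(futures, timeout=timeout)
    # cancel() only succeeds for Futures that have not started running
    cancelled = [f for f in not_done if f.cancel()]
    return done, not_done, cancelled

if __name__ == "__main__":
    durations = [0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
    with ProcessPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(task, d) for d in durations]
        done, not_done, cancelled = split_at_deadline(futures, timeout=0.3)
        print(f"{len(done)} done, {len(not_done)} unfinished, "
              f"{len(cancelled)} cancelled")
```

Tasks that were already running at the deadline are not interrupted; the with block waits for them on exit.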
as_completed(): Process Results in Finish Order
With ProcessPoolExecutor, you can iterate futures as they complete:
from concurrent.futures import ProcessPoolExecutor, as_completed
import time

def task(x):
    time.sleep(x)  # Simulate varying durations
    return x ** 2

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        # Map each Future back to the input that produced it
        futures = {executor.submit(task, x): x for x in [3, 1, 4, 1, 5]}
        for future in as_completed(futures, timeout=10):
            x = futures[future]
            result = future.result()
            print(f"Task {x} completed: {result}")
This is powerful for pipelines where you want to process results as soon as they arrive, without waiting for slower tasks.
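In a pipeline, one failing task should not usually stop the rest. Here is a sketch of that idea, where `parse` and `collect` are hypothetical names and the failure condition is invented for the example. It checks Future.exception() before reading the result, so the loop keeps going.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def parse(x):
    if x < 0:
        raise ValueError(f"negative input: {x}")
    return x ** 2

def collect(executor, inputs):
    """Gather results and errors separately, in completion order."""
    futures = {executor.submit(parse, x): x for x in inputs}
    results, errors = {}, {}
    for future in as_completed(futures):
        x = futures[future]
        exc = future.exception()  # None if the task succeeded
        if exc is None:
            results[x] = future.result()
        else:
            errors[x] = exc
    return results, errors

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        results, errors = collect(executor, [1, -1, 2])
    print(results)
    print(errors)
```

Because `collect` only relies on the Executor interface, the same function also works with a ThreadPoolExecutor.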
Integration with asyncio
ProcessPoolExecutor integrates naturally with asyncio for mixed async-parallel workloads:
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_bound_task(n):
    """CPU-intensive work in a process pool."""
    total = sum(i ** 2 for i in range(n))
    return total

async def fetch_and_process(executor, n):
    """Fetch data (async) and process it (parallel)."""
    loop = asyncio.get_running_loop()
    # Run CPU-bound work in background process pool
    result = await loop.run_in_executor(executor, cpu_bound_task, n)
    return result

async def main():
    with ProcessPoolExecutor(max_workers=4) as executor:
        result = await fetch_and_process(executor, 10_000_000)
        print(f"Result: {result}")

if __name__ == "__main__":
    asyncio.run(main())
Pool has no equivalent hook: to await a Pool task, you have to bridge its callbacks to the event loop yourself.
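For comparison, here is roughly what such a manual wrapper for Pool could look like. `run_in_pool` is a hypothetical helper, not part of either library. It relies on apply_async()'s callback and error_callback parameters, which fire in a helper thread, and on loop.call_soon_threadsafe() to hand the outcome back to the event loop.

```python
import asyncio
import multiprocessing

def cpu_bound_task(n):
    return sum(i ** 2 for i in range(n))

async def run_in_pool(pool, func, *args):
    """Await a Pool task by bridging its callbacks to an asyncio Future."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Pool callbacks run in a helper thread, so hop back onto the loop thread
    pool.apply_async(
        func,
        args,
        callback=lambda value: loop.call_soon_threadsafe(
            future.set_result, value),
        error_callback=lambda exc: loop.call_soon_threadsafe(
            future.set_exception, exc),
    )
    return await future

async def main():
    with multiprocessing.Pool(processes=2) as pool:
        return await run_in_pool(pool, cpu_bound_task, 1_000)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

This sketch does not handle the awaiting coroutine being cancelled, which is one of the details run_in_executor() takes care of for you.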
Performance and Overhead
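Both Pool and ProcessPoolExecutor pay the same underlying costs: starting worker processes and pickling arguments and results for IPC. The choice is mostly architectural, not performance-based. In benchmarks with thousands of short tasks:
- Pool.map(): Slightly faster (no Future wrapper objects), on the order of 5–10%. The gap can be larger if executor.map() is left at its default chunksize=1, because Pool.map() batches items into chunks automatically.
- ProcessPoolExecutor: Slightly more memory (one Future object per task), but negligible for typical workloads.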
Choose based on API fit, not performance.
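If you want to check the overhead on your own workload, a rough sketch like the following compares executor.map() with and without batching. The `squares` helper, the task count, and the chunk size of 500 are arbitrary choices for illustration; timings vary by machine, so no expected numbers are shown.

```python
from concurrent.futures import ProcessPoolExecutor
import time

def square(x):
    return x * x

def squares(executor, values, chunksize):
    """Map square over values, sending `chunksize` items per IPC message."""
    return list(executor.map(square, values, chunksize=chunksize))

if __name__ == "__main__":
    values = range(5_000)
    with ProcessPoolExecutor(max_workers=4) as executor:
        for chunksize in (1, 500):
            start = time.perf_counter()
            squares(executor, values, chunksize)
            elapsed = time.perf_counter() - start
            print(f"chunksize={chunksize}: {elapsed:.2f}s")
```

For very short tasks, batching usually matters more than the choice between the two pool classes.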
Decision Tree: Pool vs. ProcessPoolExecutor
Choose ProcessPoolExecutor if:
- You need to cancel tasks before execution.
- You want to handle timeouts with exceptions.
- You're integrating with asyncio code.
- You're processing results as they complete (as_completed).
- You're building a new project and want modern APIs.
Choose multiprocessing.Pool if:
- You're maintaining legacy code using Pool.
- Your workload is simple batch processing with no advanced features.
- You prefer lower-level control and direct result access.
- Performance is critical and you've measured a meaningful difference on your own workload.
For new code in 2026, ProcessPoolExecutor is the recommended default.
Code Example: Real-World Scenario
Here's a hybrid example: fetch URLs concurrently with asyncio, then process responses in parallel with ProcessPoolExecutor.
import asyncio
from concurrent.futures import ProcessPoolExecutor
import time

import aiohttp  # third-party package

def process_response(html):
    """CPU-bound: parse HTML, extract data."""
    time.sleep(1)  # Simulate parsing
    return f"Processed: {len(html)} characters"

async def fetch_and_process(url, session, executor, loop):
    """Fetch URL (async I/O) and process response (parallel CPU)."""
    async with session.get(url) as response:
        html = await response.text()
    # Offload processing to process pool
    result = await loop.run_in_executor(executor, process_response, html)
    return result

async def main():
    urls = ["https://example.com"] * 5  # 5 URLs
    with ProcessPoolExecutor(max_workers=4) as executor:
        # Inside a coroutine, get_running_loop() is the supported way to get the loop
        loop = asyncio.get_running_loop()
        async with aiohttp.ClientSession() as session:
            tasks = [
                fetch_and_process(url, session, executor, loop)
                for url in urls
            ]
            results = await asyncio.gather(*tasks)
            print(results)

if __name__ == "__main__":
    asyncio.run(main())
This pattern combines asyncio (for I/O-bound network work) with ProcessPoolExecutor (for CPU-bound parsing), achieving high throughput.
Key Takeaways
- ProcessPoolExecutor (concurrent.futures) is the modern, recommended choice for new code; it returns Future objects enabling cancellation, timeout handling, and async integration.
- multiprocessing.Pool is simpler for straightforward batch workloads but lacks Future-based controls.
- Both have comparable performance overhead; the choice is mostly architectural.
- ProcessPoolExecutor integrates seamlessly with asyncio via run_in_executor().
- Use as_completed() to process results as they finish, not in submission order.
Frequently Asked Questions
Can I mix ProcessPoolExecutor with multiprocessing.Queue?
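It works, with caveats. A plain multiprocessing.Queue cannot be pickled as a task argument (the same restriction applies to Pool); pass it to the workers through initializer and initargs, or use a multiprocessing.Manager().Queue() proxy, which can be sent as an ordinary argument. If you only need results back from the workers, regular function return values delivered through Futures are simpler.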
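A minimal sketch of the Manager route, assuming a side channel is really needed alongside the return values. `produce` is a hypothetical worker function; Manager().Queue() returns a proxy object that can be pickled and passed to tasks.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def produce(queue, n):
    """Put a side-channel message on the queue and return n as usual."""
    queue.put(n * 2)
    return n

if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        queue = manager.Queue()  # proxy object, safe to pass as an argument
        with ProcessPoolExecutor(max_workers=2) as executor:
            returned = list(executor.map(produce, [queue] * 3, [1, 2, 3]))
        messages = sorted(queue.get() for _ in range(3))
    print(returned)   # [1, 2, 3]
    print(messages)   # [2, 4, 6]
```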
What's the default max_workers for ProcessPoolExecutor?
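It defaults to the number of processors on the machine: os.cpu_count(), or os.process_cpu_count() from Python 3.13 onward. On Windows, max_workers cannot exceed 61. (The min(32, os.cpu_count() + 4) formula is the ThreadPoolExecutor default, not the ProcessPoolExecutor one.) Check the docs for your Python version.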
Can I shutdown ProcessPoolExecutor gracefully?
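Yes. executor.shutdown(wait=True) blocks until all submitted tasks finish. wait=False returns immediately, but tasks that were already submitted still run to completion. To drop tasks that have not started, pass cancel_futures=True (Python 3.9+). Context managers (with statement) call shutdown(wait=True) automatically.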
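A short sketch of shutdown with cancel_futures, assuming Python 3.9 or later. `drain` is a hypothetical helper that shuts the executor down and counts how many tasks were dropped versus completed; the exact split depends on timing.

```python
from concurrent.futures import ProcessPoolExecutor
import time

def task(n):
    time.sleep(0.2)
    return n

def drain(executor, futures):
    """Shut down, cancelling tasks that have not started (Python 3.9+)."""
    executor.shutdown(wait=True, cancel_futures=True)
    cancelled = sum(f.cancelled() for f in futures)
    finished = sum(f.done() and not f.cancelled() for f in futures)
    return cancelled, finished

if __name__ == "__main__":
    executor = ProcessPoolExecutor(max_workers=1)
    futures = [executor.submit(task, n) for n in range(8)]
    cancelled, finished = drain(executor, futures)
    print(f"{cancelled} cancelled, {finished} finished")
```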
Is ProcessPoolExecutor more memory-hungry than Pool?
Negligibly. The overhead is one Future object per submitted task, not extra processes. Even at a rough estimate of ~1 KB per Future, 10,000 futures add about 10 MB, which is usually not a concern.
Can I use ProcessPoolExecutor with custom Process subclasses?
Not directly; ProcessPoolExecutor creates its own worker processes. Per-worker setup is supported, though: like multiprocessing.Pool, ProcessPoolExecutor accepts initializer and initargs (Python 3.7+), and its mp_context parameter lets you choose the start method.
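Here is a minimal sketch of per-worker setup with ProcessPoolExecutor. The names `init_worker`, `add_offset`, and `_state` are hypothetical; `initializer` and `initargs` are real constructor parameters.

```python
from concurrent.futures import ProcessPoolExecutor

_state = None  # per-worker state, filled in by init_worker

def init_worker(offset):
    """Runs once in each worker process before it accepts tasks."""
    global _state
    _state = {"offset": offset}

def add_offset(x):
    return x + _state["offset"]

if __name__ == "__main__":
    with ProcessPoolExecutor(
        max_workers=2, initializer=init_worker, initargs=(100,)
    ) as executor:
        print(list(executor.map(add_offset, [1, 2, 3])))  # [101, 102, 103]
```

This is the usual place to open database connections or load models once per worker instead of once per task.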
Further Reading
- concurrent.futures.ProcessPoolExecutor documentation — official API reference.
- multiprocessing.Pool documentation — older Pool reference.
- Real Python: concurrent.futures — detailed comparison and examples.
- asyncio integration guide — using ProcessPoolExecutor with async code.