Creating and Starting Python Processes: Step-by-Step
Creating a Python process is straightforward: instantiate a multiprocessing.Process object with a target function, call start() to launch it, and call join() to wait for completion. Each process runs in its own interpreter with its own memory, so you manage its lifecycle explicitly, and you must guard the entry point with if __name__ == "__main__" so that child processes do not re-run your launch code under the spawn start method (the default on Windows and macOS). This article walks through process creation, argument passing, daemon processes, and the platform-specific quirks you need to know.
The Multiprocessing Process Lifecycle
A Process object has four main states: created, started, running, and terminated.
- Created: You instantiate the Process object but have not called start().
- Started: You call start(), and the OS spawns a new process.
- Running: The process executes the target function.
- Terminated: The target function returns, or you call terminate()/kill().
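Always call join() after starting a process to wait for it to finish; once it returns, the exit code is available as p.exitcode. On Unix, a child that has finished but has not yet been joined lingers as a zombie process until it is reaped.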
Basic Process Creation and Execution
Here's the minimal example:
import multiprocessing
import time

def worker(name, duration):
    """A simple worker function that sleeps and prints."""
    print(f"[{name}] Starting")
    time.sleep(duration)
    print(f"[{name}] Done after {duration}s")

if __name__ == "__main__":
    # Create a process targeting the worker function
    p = multiprocessing.Process(target=worker, args=("Worker-1", 2))
    # Start the process (does not block)
    p.start()
    # Wait for the process to complete
    p.join()
    print("Main process finished")
Output:
[Worker-1] Starting
[Worker-1] Done after 2s
Main process finished
Key points:
- target is the callable (function, method, or callable object) to run in the new process.
- args is a tuple of positional arguments to pass to the target.
- kwargs passes keyword arguments: Process(target=func, kwargs={"key": "value"}).
- start() returns immediately; the process runs in the background.
- join() blocks the main process until the worker finishes.
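To make the last two points concrete, the sketch below times start() and join() separately. The names nap and time_start_and_join are illustrative and not part of the example above, and the measured times will differ by machine and start method.

```python
import multiprocessing
import time


def nap(duration, label="worker"):
    """Sleep for `duration` seconds; `label` arrives via kwargs."""
    time.sleep(duration)
    print(f"[{label}] woke up after {duration}s")


def time_start_and_join(duration=0.5):
    """Return (alive_after_start, seconds_in_start, seconds_total)."""
    p = multiprocessing.Process(
        target=nap, args=(duration,), kwargs={"label": "napper"}
    )
    t0 = time.perf_counter()
    p.start()  # returns as soon as the child has been launched
    seconds_in_start = time.perf_counter() - t0
    alive_after_start = p.is_alive()
    p.join()  # blocks until the child exits
    seconds_total = time.perf_counter() - t0
    return alive_after_start, seconds_in_start, seconds_total


if __name__ == "__main__":
    alive, in_start, total = time_start_and_join()
    print(f"alive after start(): {alive}")
    print(f"start() took {in_start:.3f}s; start() + join() took {total:.3f}s")
```

The time spent in start() should be a small fraction of the total, because the sleep happens in the child while the parent is free to continue.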
The if __name__ == "__main__" Guard
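Windows has no fork(); it uses the "spawn" start method, which launches a fresh Python interpreter that imports your main module again before running the target. Any unguarded module-level code therefore runs in the child as well, including the lines that create and start processes. Python detects this and raises a RuntimeError in the child instead of spawning without end. macOS has also defaulted to spawn since Python 3.8.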
WRONG (fails under spawn, the default on Windows and macOS):
import multiprocessing

def worker():
    print("Worker running")

p = multiprocessing.Process(target=worker)
p.start()  # BUG: the spawned child re-imports this module, reaches this line again, and raises RuntimeError
p.join()
CORRECT (works everywhere):
import multiprocessing

def worker():
    print("Worker running")

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
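Under the fork start method (long the default on Linux), the child does not re-import the main module, so the unguarded version happens to work. That is not something to rely on: macOS uses spawn, Python 3.14 switched the Linux default to forkserver, and both of those import the main module in the child. Always use the guard.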
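One way to check that a script is guard-safe without leaving Linux is to request the spawn start method explicitly through multiprocessing.get_context(). The following is a minimal sketch; the helper name run_under_spawn is illustrative.

```python
import multiprocessing


def worker():
    print(f"Worker running in {multiprocessing.current_process().name}")


def run_under_spawn():
    """Start `worker` with the spawn start method and return its exit code."""
    ctx = multiprocessing.get_context("spawn")  # the method Windows uses
    p = ctx.Process(target=worker, name="SpawnedWorker")
    p.start()
    p.join()
    return p.exitcode


if __name__ == "__main__":
    # Without this guard, the spawned child would reach the call below
    # while importing this module, and Python would raise RuntimeError.
    print(f"exit code: {run_under_spawn()}")
```

Using a context object keeps the choice local to this code, whereas multiprocessing.set_start_method() changes the default for the whole program.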
Passing Arguments to Worker Functions
Positional Arguments with args
import multiprocessing

def add(a, b):
    result = a + b
    print(f"{a} + {b} = {result}")

if __name__ == "__main__":
    p = multiprocessing.Process(target=add, args=(10, 20))
    p.start()
    p.join()
Keyword Arguments with kwargs
import multiprocessing

def greet(name, greeting="Hello"):
    print(f"{greeting}, {name}!")

if __name__ == "__main__":
    p = multiprocessing.Process(
        target=greet,
        args=("Alice",),
        kwargs={"greeting": "Hi"}
    )
    p.start()
    p.join()
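Note the trailing comma in args=("Alice",). A common slip is to drop it, which passes a plain string instead of a one-element tuple. The sketch below shows the effect by comparing exit codes; exit_code_for is a hypothetical helper introduced only for this illustration.

```python
import multiprocessing


def greet(name, greeting="Hello"):
    print(f"{greeting}, {name}!")


def exit_code_for(args):
    """Run greet() in a child process with `args` and return its exit code."""
    p = multiprocessing.Process(target=greet, args=args)
    p.start()
    p.join()
    return p.exitcode


if __name__ == "__main__":
    # ("Alice",) is a one-element tuple, so the child calls greet("Alice").
    print(exit_code_for(("Alice",)))
    # ("Alice") is just the string "Alice". It unpacks into five
    # one-character arguments, so the child raises TypeError and exits with 1.
    print(exit_code_for("Alice"))
```

The error surfaces only in the child's traceback and exit code; the parent does not see an exception, which is why the mistake can be easy to miss.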
Daemon vs. Non-Daemon Processes
By default, processes are non-daemon: when the main process reaches the end of the program, it waits for all non-daemon children to finish before exiting. Daemon processes behave the opposite way. The main process does not wait for them, and any daemon children still running when it exits are terminated automatically.
import multiprocessing
import time

def long_running():
    for i in range(10):
        print(f"Iteration {i}")
        time.sleep(1)

if __name__ == "__main__":
    # Non-daemon (the default): if started, main would wait for it at exit.
    # Created here for comparison only; it is never started.
    p_normal = multiprocessing.Process(target=long_running, daemon=False)

    # Daemon: terminated automatically when the main process exits
    p_daemon = multiprocessing.Process(target=long_running, daemon=True)
    p_daemon.start()

    # Main exits after 0.1s; the daemon process is terminated with it
    time.sleep(0.1)
    print("Main exiting (daemon is terminated)")
    # Output: main exits after ~0.1s; the daemon prints at most "Iteration 0"
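Use daemon processes for:
- Background helper tasks (periodic logging, cache refreshing) that can safely be cut off, since daemon children are terminated without running their cleanup code.
- Worker processes that should not prevent program shutdown.
Use non-daemon for:
- Long-running jobs where you need guaranteed completion.
- Processes that must start children of their own; a daemonic process is not allowed to create child processes.
Worker pools manage this flag for you. multiprocessing.Pool, for example, marks its workers as daemonic so they do not outlive the main process.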
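The daemon flag can also be set as an attribute, as long as that happens before start(). It only affects what happens when the main process exits; an explicit join() still waits for a daemon like any other process. A minimal sketch, with tick and run_daemon_to_completion as illustrative names:

```python
import multiprocessing
import time


def tick(count, interval):
    for i in range(count):
        print(f"tick {i}")
        time.sleep(interval)


def run_daemon_to_completion():
    """Start a daemon process, wait for it, and return (daemon, exitcode)."""
    p = multiprocessing.Process(target=tick, args=(3, 0.05))
    p.daemon = True  # must be set before start()
    p.start()
    p.join()  # an explicit join waits, daemon or not
    return p.daemon, p.exitcode


if __name__ == "__main__":
    print(run_daemon_to_completion())
```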
Checking Process State
The Process object exposes one method and several attributes for inspecting status:
import multiprocessing
import time

def worker():
    time.sleep(2)

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    print(f"is_alive() before start: {p.is_alive()}")  # False
    p.start()
    print(f"is_alive() after start: {p.is_alive()}")  # True
    print(f"pid: {p.pid}")  # OS process ID
    p.join()
    print(f"is_alive() after join: {p.is_alive()}")  # False
    print(f"exitcode: {p.exitcode}")  # 0 (success)
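Method and attributes:
- is_alive() – True if the process has been started and has not yet exited.
- pid – Operating system process ID (None before start()).
- exitcode – None (not yet terminated), 0 (success), positive (the process exited with an error status, for example 1 after an uncaught exception), negative (killed by signal N, reported as -N).
- name – Human-readable name; defaults to "Process-1", "Process-2", etc.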
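The exitcode values are easiest to see side by side. In this sketch (function names are illustrative), one child returns normally, one calls sys.exit(3), and one raises an uncaught exception, which multiprocessing reports as exit code 1 after printing the traceback.

```python
import multiprocessing
import sys


def succeed():
    pass


def exit_with_status():
    sys.exit(3)


def crash():
    raise ValueError("boom")


def collect_exit_codes():
    """Run each target in its own process and map its name to its exit code."""
    codes = {}
    for target in (succeed, exit_with_status, crash):
        p = multiprocessing.Process(target=target)
        p.start()
        p.join()
        codes[target.__name__] = p.exitcode
    return codes


if __name__ == "__main__":
    print(collect_exit_codes())
```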
Spawning Multiple Processes
For simple loops, manage a list of Process objects:
import multiprocessing
import time

def task(task_id, duration):
    print(f"Task {task_id} starting")
    time.sleep(duration)
    print(f"Task {task_id} done")

if __name__ == "__main__":
    processes = []
    # Spawn 5 processes
    for i in range(5):
        p = multiprocessing.Process(target=task, args=(i, 1))
        p.start()
        processes.append(p)
    # Wait for all to finish
    for p in processes:
        p.join()
    print("All processes finished")
For large numbers of workers, use multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor (see Articles 3 and 4).
Terminating a Process
You can forcefully terminate a process, though it's not graceful:
import multiprocessing
import time

def long_task():
    for i in range(100):
        print(f"Step {i}")
        time.sleep(0.1)

if __name__ == "__main__":
    p = multiprocessing.Process(target=long_task)
    p.start()
    time.sleep(1)  # Let it run for ~1 second
    # Forcefully terminate
    p.terminate()
    p.join()
    print(f"Process exitcode: {p.exitcode}")  # -15 (SIGTERM)
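Caution: terminate() stops the process abruptly (SIGTERM on Unix, TerminateProcess() on Windows), so finally blocks, context manager exits, and exit handlers in the child do not run, and a queue or pipe the child was using at that moment may be left corrupted. Use it only when the process is unresponsive. Prefer graceful shutdown via Queues (see Article 5).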
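As a rough illustration of what graceful shutdown can look like, the sketch below uses a multiprocessing.Event as a stop flag. This is a substitute for the Queue-based approach that Article 5 covers, chosen here only because it is short; polite_task and run_then_stop are hypothetical names.

```python
import multiprocessing
import time


def polite_task(stop_event):
    """Work in small steps and exit cleanly once `stop_event` is set."""
    try:
        while not stop_event.is_set():
            time.sleep(0.05)  # stand-in for one unit of real work
    finally:
        # Reached because the loop ends normally, not via terminate()
        print("cleanup ran")


def run_then_stop(run_for=0.3):
    """Let the worker run briefly, ask it to stop, and return its exit code."""
    stop_event = multiprocessing.Event()
    p = multiprocessing.Process(target=polite_task, args=(stop_event,))
    p.start()
    time.sleep(run_for)
    stop_event.set()  # ask the worker to finish
    p.join(timeout=5)
    if p.is_alive():  # fall back to force only if the request is ignored
        p.terminate()
        p.join()
    return p.exitcode


if __name__ == "__main__":
    print(f"exit code: {run_then_stop()}")
```

Because the worker leaves its loop on its own, the finally block runs and the exit code is 0 rather than -15.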
Process Name and Identification
Set a process name for debugging:
import multiprocessing

def worker():
    print(f"Running in process: {multiprocessing.current_process().name}")

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker, name="MyWorker")
    p.start()
    p.join()
Within a process, get your own info:
import multiprocessing

def worker():
    current = multiprocessing.current_process()
    print(f"PID: {current.pid}, Name: {current.name}")

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
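The same PID is visible from both sides: the parent reads it from p.pid once start() has returned, and the child can read it with os.getpid(). A small sketch, with report_identity and compare_pids as illustrative names:

```python
import multiprocessing
import os


def report_identity():
    me = multiprocessing.current_process()
    print(f"child: name={me.name} pid={os.getpid()}")


def compare_pids():
    """Return (parent_pid, child_pid) as seen from the parent."""
    p = multiprocessing.Process(target=report_identity, name="IdentityDemo")
    p.start()
    child_pid = p.pid  # available once start() has returned
    p.join()
    return os.getpid(), child_pid


if __name__ == "__main__":
    parent_pid, child_pid = compare_pids()
    print(f"parent: pid={parent_pid}, started child pid={child_pid}")
```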
Key Takeaways
- Create a Process object with target (callable) and args/kwargs (arguments).
- Call start() to spawn the process; it returns immediately (non-blocking).
- Call join() to wait for the process and collect its exit code.
- Always guard process spawning with if __name__ == "__main__" for cross-platform compatibility.
- Use daemon=True for background tasks that should not prevent program exit; daemon=False for long-running jobs.
- Check is_alive() and exitcode to inspect process status.
- Use terminate() only as a last resort; prefer graceful shutdown via IPC.
Frequently Asked Questions
Why do I need if __name__ == "__main__"?
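On Windows and macOS, processes are created via "spawn" rather than fork: the child starts a fresh interpreter and imports your main module again. Without the guard, that import re-runs your module-level code, including the lines that start processes, and Python aborts the child with a RuntimeError. Under fork the child does not re-import the module, so the problem stays hidden, but the guard is still a best practice for portability.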
Can I pass complex objects (e.g., class instances) as arguments?
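Yes, if they are picklable. Under the spawn and forkserver start methods, multiprocessing pickles the arguments to send them to the child (under fork the child inherits them instead, but writing picklable code keeps you portable). Instances of ordinary classes defined at module level pickle fine without any special methods; objects that wrap OS resources or live interpreter state, such as open file handles, sockets, locks, and lambdas, do not. See Article 8 for pickling pitfalls.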
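A quick way to find out whether an object can cross the process boundary is to try pickling it yourself before handing it to Process. The helper below is a sketch under that assumption; is_picklable and Job are hypothetical names, not multiprocessing APIs.

```python
import pickle
import threading
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    payload: str


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps(), else False."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


if __name__ == "__main__":
    print(is_picklable(Job(1, "resize image")))  # plain data class instance
    print(is_picklable(threading.Lock()))  # wraps an OS-level lock
    print(is_picklable(lambda x: x + 1))  # has no importable name
```

Run as a script, the first check should succeed and the other two should fail, which matches the rule of thumb that plain data travels well and live resources do not.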
How do I capture the return value of a process?
Process does not return values directly. Use multiprocessing.Queue to send results back to the main process (see Article 5).
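A minimal sketch of that pattern follows: the child puts its result on a multiprocessing.Queue and the parent reads it. The names square and compute_square are illustrative, and the 10-second timeout is an arbitrary safety margin.

```python
import multiprocessing


def square(n, results):
    """Compute n squared and send it back through the queue."""
    results.put(n * n)


def compute_square(n):
    """Run square() in a child process and return the value it produced."""
    results = multiprocessing.Queue()
    p = multiprocessing.Process(target=square, args=(n, results))
    p.start()
    value = results.get(timeout=10)  # read the result before joining
    p.join()
    return value


if __name__ == "__main__":
    print(compute_square(7))
```

Reading from the queue before calling join() is deliberate: a child that has put a large item on a queue may not exit until that data has been consumed, so joining first can deadlock.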
What's the overhead of spawning a process?
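It depends heavily on the start method and on how much your modules import. fork copies the parent and is usually the cheapest; spawn must boot a new interpreter and re-import your modules, which is considerably slower. Treat a figure like 50 to 200 ms as a rough rule of thumb for the expensive case rather than a guarantee, and measure on your own system. Either way, this cost is why you should not spawn a new process for every tiny task; use pools instead.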
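To get a number for your own machine, you can time a process that does nothing. This is only a rough sketch (noop and average_spawn_seconds are illustrative names), and the result includes join() as well as start().

```python
import multiprocessing
import time


def noop():
    pass


def average_spawn_seconds(runs=5):
    """Average wall-clock time to start and join a process that does nothing."""
    t0 = time.perf_counter()
    for _ in range(runs):
        p = multiprocessing.Process(target=noop)
        p.start()
        p.join()
    return (time.perf_counter() - t0) / runs


if __name__ == "__main__":
    method = multiprocessing.get_start_method()
    print(f"{method}: {average_spawn_seconds() * 1000:.1f} ms per process")
```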
Can I set a timeout for join()?
Yes: p.join(timeout=5) waits up to 5 seconds. join() always returns None, so if the process has not finished by then, it simply keeps running after join() returns. Check is_alive() or exitcode afterward to find out whether it actually finished.
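Put together, a deadline check might look like the sketch below. The choice to call terminate() once the deadline passes is an assumption made for the example; slow and join_with_deadline are hypothetical names.

```python
import multiprocessing
import time


def slow():
    time.sleep(5)


def join_with_deadline(timeout=0.2):
    """Return (alive_after_timeout, exitcode_after_cleanup)."""
    p = multiprocessing.Process(target=slow)
    p.start()
    p.join(timeout=timeout)  # returns None whether or not the child finished
    alive_after_timeout = p.is_alive()
    if alive_after_timeout:
        p.terminate()  # the deadline passed, so stop the child ourselves
        p.join()
    return alive_after_timeout, p.exitcode


if __name__ == "__main__":
    print(join_with_deadline())
```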