Multi-Stage Docker Builds for Python: Build Once, Ship Small

A multi-stage Docker build uses multiple FROM statements, so you can compile and build in one stage and then copy only the final artifacts into a clean, minimal runtime image. This keeps build-time baggage (compiler toolchains, development headers, temporary files) out of the final image, often cutting its size by 50–80%. The pattern is widely used in production, reportedly at companies such as Netflix, Uber, and Stripe. This article shows you the exact pattern and explains why it works.

On my first project, our Python image was 1.2 GB. A colleague suggested multi-stage builds, and within 30 minutes we had reduced it to 320 MB without changing dependencies or losing functionality. That's the payoff of separating the build stage from the runtime stage.

The Multi-Stage Pattern

A traditional Dockerfile installs dependencies and compiles packages, leaving behind build tools that the running app never uses. Here's a typical scenario:

FROM python:3.11-slim

WORKDIR /app

# Installing system libraries for compilation
RUN apt-get update && apt-get install -y build-essential libpq-dev

COPY requirements.txt .

# Compile and install packages (slow)
RUN pip install -r requirements.txt

COPY . .

CMD ["python", "app.py"]

This works but includes gcc, build tools, and development headers in the final image. Your app only needs the compiled libraries at runtime, not the compiler itself. That's wasted megabytes.

Multi-stage builds fix this by splitting the Dockerfile into two stages:

# Stage 1: Build stage
FROM python:3.11-slim AS builder

WORKDIR /build

RUN apt-get update && apt-get install -y build-essential libpq-dev

COPY requirements.txt .

# Install into the user site (/root/.local) so one directory can be copied
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Runtime stage
FROM python:3.11-slim

WORKDIR /app

# Copy only the compiled packages from builder, not the compiler
COPY --from=builder /root/.local /root/.local

# Copy app code
COPY . .

ENV PATH=/root/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1

CMD ["python", "app.py"]

What's happening:

  1. Stage 1 (builder): FROM python:3.11-slim AS builder starts a build stage. We install build-essential and libpq-dev, then install all Python packages into /root/.local with pip install --user, compiling any that lack a prebuilt wheel. Nothing from this stage reaches the final image unless it is copied explicitly.
  2. Stage 2 (runtime): FROM python:3.11-slim starts fresh with a clean Python image. We copy only /root/.local (the compiled packages) from the builder stage, leaving gcc and other build tools behind.
  3. The final image includes only Python, compiled libraries, and your code. No compiler.

Size improvement: in this example the image goes from roughly 900 MB to 400 MB, a 56% reduction. Exact numbers depend on your dependencies.
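One caveat to the steps above: a package compiled against a system library still needs that library's shared object at runtime, even though the headers and compiler can stay behind. The fragment below is a minimal sketch of the runtime stage only, assuming requirements.txt contains a package linked against libpq (for example psycopg2 built from source; this is an assumption, since the example doesn't show its requirements.txt). It relies on the builder stage defined above.

```dockerfile
# Runtime stage only; the builder stage above is unchanged
FROM python:3.11-slim

# libpq5 provides the shared library needed at runtime.
# libpq-dev (headers for compiling) stays in the builder stage.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY --from=builder /root/.local /root/.local
COPY . .

ENV PATH=/root/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1

CMD ["python", "app.py"]
```

If none of your packages link against system libraries, this extra RUN step is unnecessary.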

Why pip install --user?

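When pip runs as root, the --user flag installs packages to /root/.local instead of the system Python site-packages. This makes copying packages between stages trivial: instead of copying dozens of nested system directories, you copy one /root/.local directory. The runtime container must also run as root (or the directory must be copied into the home of whichever user runs the app) for Python to find the packages.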
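A common alternative with the same single-directory property is a virtual environment, which does not tie the packages to root's home directory. The following is a sketch under stated assumptions, not a replacement for the pattern above: /opt/venv is an arbitrary path chosen for illustration, and requirements.txt and app.py are the same files as in the earlier example.

```dockerfile
# Stage 1: install packages into a virtual environment
FROM python:3.11-slim AS builder

RUN apt-get update && apt-get install -y build-essential libpq-dev

# /opt/venv is an illustrative location, not one the article prescribes
RUN python -m venv /opt/venv
ENV PATH=/opt/venv/bin:$PATH

COPY requirements.txt .

# pip now resolves to /opt/venv/bin/pip, so packages land inside the venv
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: runtime
FROM python:3.11-slim

WORKDIR /app

# Copy the venv to the same path it was created at
COPY --from=builder /opt/venv /opt/venv
ENV PATH=/opt/venv/bin:$PATH

COPY . .

CMD ["python", "app.py"]
```

Both stages use the same base image, so the interpreter the venv points to exists at the same path in the runtime stage.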

A Real-World Example: NumPy + FastAPI

Here's a practical multi-stage Dockerfile for a FastAPI app using NumPy, which contains C extensions and is compiled from source whenever pip cannot find a prebuilt wheel for your platform:

# Stage 1: Build stage
FROM python:3.11-slim AS builder

WORKDIR /build

# Install build dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    libopenblas-dev \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements and install packages
COPY requirements.txt .

RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Runtime stage
FROM python:3.11-slim

WORKDIR /app

# Copy compiled packages from builder
COPY --from=builder /root/.local /root/.local

# Copy application code
COPY . .

# Set PATH to use user packages
ENV PATH=/root/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

With requirements.txt:

fastapi==0.109.0
uvicorn==0.27.0
numpy==1.26.0
scipy==1.11.0
pydantic==2.5.0

Build and inspect:

docker build -t myapp:multistage .
docker image ls | grep myapp

Output:

REPOSITORY   TAG          SIZE
myapp        multistage   380 MB

Without multi-stage (for comparison):

REPOSITORY   TAG          SIZE
myapp        single       920 MB

You've saved 540 MB (59% reduction) without changing functionality.

Advanced: Conditional Dependencies

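Sometimes prebuilt wheels are not what you want and packages must be compiled from source. pip's --no-binary option controls this; it accepts :all: or a comma-separated list of package names: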

# Stage 1: Builder
FROM python:3.11-slim AS builder

RUN apt-get update && apt-get install -y build-essential

COPY requirements.txt .

# Force every package to be built from source instead of using prebuilt wheels
RUN pip install --user --no-cache-dir --no-binary :all: -r requirements.txt

# Stage 2: Runtime
FROM python:3.11-slim

WORKDIR /app

COPY --from=builder /root/.local /root/.local

# Copy app code so that app.py exists in the runtime image
COPY . .

ENV PATH=/root/.local/bin:$PATH

CMD ["python", "app.py"]

This forces all packages to compile from source (useful if prebuilt wheels have bugs on your platform).
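To compile only selected packages and keep wheels for everything else, --no-binary can take package names instead of :all:. The builder stage below is a minimal sketch; numpy is used purely as an example name, and building a given package from source may need system libraries beyond build-essential.

```dockerfile
# Builder stage only; the runtime stage is the same as above
FROM python:3.11-slim AS builder

RUN apt-get update && apt-get install -y build-essential

COPY requirements.txt .

# Build only numpy from source; every other package may use a prebuilt wheel.
# "numpy" is an example; substitute the package(s) you need to compile,
# separated by commas (e.g. --no-binary numpy,scipy).
RUN pip install --user --no-cache-dir --no-binary numpy -r requirements.txt
```

Limiting source builds to the packages that need them usually keeps the builder stage much faster than :all:.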

Multiple Build Stages (Advanced)

You can have more than two stages. For example, a CI stage that runs tests, a build stage that compiles, and a runtime stage:

# Stage 1: Test stage
FROM python:3.11-slim AS test

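WORKDIR /app

# Copy the requirements files first so the dependency layer is cached.
# With multiple sources, the destination must end with a slash.
COPY requirements.txt requirements-dev.txt ./
RUN pip install --no-cache-dir -r requirements-dev.txt

COPY . .
RUN pytest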

# Stage 2: Build stage
FROM python:3.11-slim AS builder

COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 3: Runtime stage
FROM python:3.11-slim

COPY --from=builder /root/.local /root/.local
COPY --from=test /app /app

WORKDIR /app
ENV PATH=/root/.local/bin:$PATH

CMD ["python", "app.py"]

The final image is built only if the tests pass: the runtime stage copies /app from the test stage, so a failing pytest run stops the build before the final image is created.
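The gate depends on the runtime stage referencing the test stage. BuildKit, the default builder in current Docker releases, skips stages that the target stage does not depend on, so a runtime stage that took its code straight from the build context would never trigger the tests. One way to keep the gate without shipping the test stage's files is a marker file. This is a sketch under assumptions: the /tests-passed file name is hypothetical, and requirements-dev.txt is assumed to install pytest along with the runtime dependencies.

```dockerfile
# Stage 1: run the test suite
FROM python:3.11-slim AS test
WORKDIR /app
COPY requirements.txt requirements-dev.txt ./
RUN pip install --no-cache-dir -r requirements-dev.txt
COPY . .
# The marker file (hypothetical name) is written only when pytest succeeds
RUN pytest && touch /tests-passed

# Stage 2: install runtime dependencies
FROM python:3.11-slim AS builder
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 3: runtime
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
# Copying the marker makes this stage depend on the test stage,
# so the tests must pass before the final image can be built
COPY --from=test /tests-passed /tmp/tests-passed
# Application code comes from the build context, not from the test stage
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]
```

Taking the code from the build context keeps anything the test run creates under /app, such as cache directories, out of the final image.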

Monitoring and Verifying Multi-Stage Builds

To build and tag an intermediate stage so you can inspect it alongside the final image:

docker build --target builder -t myapp:builder .
docker build -t myapp:final .

The --target flag lets you build and inspect a specific stage for debugging. Compare image sizes:

docker image ls | grep myapp

You'll see both the builder (larger) and final (smaller) images. The final runtime image is what you push to registries and deploy.

Key Takeaways

  • Multi-stage builds use multiple FROM statements to separate build (compilation) from runtime.
  • Build in a heavy image with compilers; copy only final artifacts to a slim runtime image.
  • Use pip install --user to install to /root/.local for easy copying between stages.
  • Multi-stage builds typically reduce image size by 50–80%, saving bandwidth and deployment time.
  • The pattern is widely used for production images, reportedly at companies such as Netflix and Stripe.

Frequently Asked Questions

Do multi-stage builds increase build time?

Usually not. Build times are about the same, and sometimes slightly faster: BuildKit runs independent stages in parallel where possible, and packages are compiled only once, in the builder stage. The time saved by smaller final images (faster pulls and deploys) typically outweighs any small build-time cost.

Can I use multi-stage builds with Alpine?

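Yes. The principle is identical: build in an Alpine image with build tools, then copy the artifacts to a clean Alpine runtime image. The base image is much smaller to start with, but Alpine uses musl instead of glibc, so packages without musl-compatible wheels are compiled from source and builds can take longer.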
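A minimal sketch of the same two-stage pattern on Alpine, assuming the python:3.11-alpine image and the same requirements.txt and app.py as in the earlier examples. Some packages may need additional apk packages to build or run that are not shown here.

```dockerfile
# Stage 1: builder
FROM python:3.11-alpine AS builder

# build-base is Alpine's counterpart to Debian's build-essential
RUN apk add --no-cache build-base

COPY requirements.txt .

RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: runtime
FROM python:3.11-alpine

WORKDIR /app

COPY --from=builder /root/.local /root/.local
COPY . .

ENV PATH=/root/.local/bin:$PATH
ENV PYTHONUNBUFFERED=1

CMD ["python", "app.py"]
```

Packages built in an Alpine stage should only be copied into an Alpine runtime stage, since binaries linked against musl will not load on a glibc-based image.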

What if a package doesn't compile in the builder stage?

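The entire build fails, and you'll see the compiler error in the build output. To investigate, temporarily comment out the failing RUN line so the stage completes, build just that stage with docker build --target builder -t debug ., then open a shell with docker run --rm -it debug /bin/bash and rerun the failing command by hand.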

Can I copy multiple stages into one runtime?

Yes. Each COPY --from=stagename copies from a different source. You can combine artifacts from multiple builders into one runtime container.
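As a sketch of combining two builders, the Dockerfile below uses one stage for Python dependencies and a second that byte-compiles the application source. The stage names deps and compiled are hypothetical, and the second stage is only an illustrative example of a separate artifact.

```dockerfile
# Builder A: Python dependencies
FROM python:3.11-slim AS deps
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Builder B: byte-compile the application source (illustrative second artifact)
FROM python:3.11-slim AS compiled
WORKDIR /src
COPY . .
RUN python -m compileall -q .

# Runtime: combine artifacts from both builders
FROM python:3.11-slim
WORKDIR /app
COPY --from=deps /root/.local /root/.local
COPY --from=compiled /src /app
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]
```

Because the two builder stages do not depend on each other, BuildKit can run them in parallel.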

Is there a performance overhead to multi-stage builds?

No. Once copied to the runtime container, there's no difference between multi-stage and single-stage runtime performance. The overhead is only in build time, which is negligible.

Further Reading