Docker Python Best Practices: Layer Caching and Optimizations

Docker builds images layer by layer and caches each layer. If an instruction and its inputs haven't changed since the last build, Docker reuses the cached layer instead of rebuilding it, which can save minutes on every iteration. The key to fast development is understanding cache invalidation: what causes a layer to rebuild, and how to order your Dockerfile so that frequently changed layers come last. I've seen teams spend 20 minutes waiting for builds when the same app could build in 2 minutes with proper layer ordering.

In this article, you'll learn the caching strategies that production teams use to keep build times under 10 seconds for local development and under 2 minutes for CI/CD pipelines. The improvements can be dramatic: on a typical Python project, reordering a few instructions often cuts rebuild time by 60–80%.

How Docker Layer Caching Works

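Every instruction in a Dockerfile produces a build step whose result can be cached. RUN, COPY, and ADD create filesystem layers; instructions such as ENV, EXPOSE, and CMD only add metadata. When you run docker build, Docker: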

  1. Reads the Dockerfile line by line.
  2. For each instruction, checks whether an identical instruction has already been executed on top of the same parent layer with the same input.
  3. If yes, uses the cached layer (instant).
  4. If no, executes the instruction and stores a new layer.

The key is "with the same input." For COPY requirements.txt ., the input is the contents of requirements.txt, which Docker checksums. For RUN pip install -r requirements.txt, the input is the command text plus the layer it runs on top of. If requirements.txt hasn't changed, Docker reuses both cached layers. But if you change even one package version in requirements.txt, the COPY layer is invalidated, and Docker rebuilds from that point forward, including the pip install.
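The chain of cache checks can be sketched in Python. This is a simplified model under stated assumptions, not Docker's actual implementation: each step's key hashes the previous key, the instruction text, and (for COPY only) the contents of the copied files. The function names, file contents, and the 0.110.0 version bump are illustrative.

```python
import hashlib


def layer_keys(instructions, files):
    """Return one cache key per instruction (simplified model).

    instructions: list of (instruction_text, copied_file_names)
    files: mapping of file name -> file contents
    """
    keys = []
    parent = ""
    for text, copied in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())      # depends on every earlier step
        h.update(text.encode())        # depends on the instruction itself
        for name in sorted(copied):    # COPY also depends on file contents
            h.update(name.encode())
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def cache_hits(old_keys, new_keys):
    """True where a step's key is unchanged, i.e. the layer is reused."""
    return [old == new for old, new in zip(old_keys, new_keys)]


DOCKERFILE = [
    ("FROM python:3.11-slim", []),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ["requirements.txt", "main.py"]),
]

before = {"requirements.txt": "fastapi==0.109.0\n", "main.py": "print('v1')\n"}
code_edit = dict(before, **{"main.py": "print('v2')\n"})
deps_edit = dict(before, **{"requirements.txt": "fastapi==0.110.0\n"})

base = layer_keys(DOCKERFILE, before)
print(cache_hits(base, layer_keys(DOCKERFILE, code_edit)))  # [True, True, True, False]
print(cache_hits(base, layer_keys(DOCKERFILE, deps_edit)))  # [True, False, False, False]
```

Editing main.py misses only the final COPY, while editing requirements.txt misses everything after the first COPY, because each key includes its parent's key.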

Here's where most teams go wrong: they put COPY . . (copy all source code) early in the Dockerfile. Now every time you change a single Python file, all subsequent layers rebuild, including RUN pip install. Since installing packages is usually the slowest step (often 30–120 seconds), you end up paying for it on every build.

The solution: reorder your Dockerfile so that stable layers come early and frequently-changing layers come late.

The Optimal Dockerfile Layer Order

Use this proven pattern:

# 1. Base image (stable, rarely changes)
FROM python:3.11-slim

# 2. System-level setup (stable)
RUN apt-get update && apt-get install -y --no-install-recommends build-essential libpq-dev && rm -rf /var/lib/apt/lists/*

# 3. Working directory (stable)
WORKDIR /app

# 4. Dependencies (changes only when requirements.txt changes)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 5. Application code (changes frequently)
COPY . .

# 6. Metadata (stable)
EXPOSE 8000
ENV PYTHONUNBUFFERED=1

# 7. Startup command (stable)
CMD ["uvicorn", "main:app", "--host", "0.0.0.0"]

Why this order?

  • FROM, system packages, and WORKDIR rarely change, so they stay in cache.
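  • COPY requirements.txt and RUN pip install are next. They're slow, but they only rebuild when requirements.txt changes (maybe once a week). On every other build, you get a cache hit.
  • COPY . . comes last because you change your code every few minutes. Since it sits after the pip layer, a code change never forces pip install to run again.
  • EXPOSE, ENV, and CMD only write metadata and take almost no time, so rerunning them after COPY . . costs nothing noticeable. Keep in mind that editing an ENV placed earlier in the file would still invalidate every layer after it.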

The result: after you change your code, the next build runs in 1–3 seconds (copying files) instead of 90 seconds (installing packages).
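To make the difference concrete, here is a rough cost model in Python. It assumes the step timings mentioned above (about 90 seconds for pip install, a second or two for copying files); the numbers and the rebuild_seconds helper are illustrative placeholders, not measurements.

```python
def touches(inputs, changed):
    """Does a step read any changed file? "*" stands for the whole context."""
    if "*" in inputs:
        return bool(changed)
    return bool(inputs & changed)


def rebuild_seconds(steps, changed):
    """Sum the cost of the first invalidated step and every step after it.

    steps: list of (instruction, seconds, set_of_input_files)
    changed: set of file names edited since the last build
    """
    invalidated = False
    total = 0
    for _instruction, seconds, inputs in steps:
        if not invalidated and touches(inputs, changed):
            invalidated = True
        if invalidated:
            total += seconds
    return total


# Hypothetical timings in seconds.
copy_first = [
    ("COPY . .", 2, {"*"}),
    ("RUN pip install -r requirements.txt", 90, set()),
]
deps_first = [
    ("COPY requirements.txt .", 1, {"requirements.txt"}),
    ("RUN pip install -r requirements.txt", 90, set()),
    ("COPY . .", 2, {"*"}),
]

print(rebuild_seconds(copy_first, {"main.py"}))           # 92
print(rebuild_seconds(deps_first, {"main.py"}))           # 2
print(rebuild_seconds(deps_first, {"requirements.txt"}))  # 93
```

With dependencies copied first, a code edit costs only the final copy; the full price is paid only when requirements.txt itself changes.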

Using .dockerignore to Reduce Copy Context

When you run docker build, Docker sends your working directory (the "build context") to the builder. For projects with node_modules, __pycache__, .git, or .venv directories, this can be hundreds of megabytes, which slows down the build. Everything in the context also counts as input to COPY . ., so a change to an irrelevant file invalidates that layer.

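Create a .dockerignore file in your project root to exclude unnecessary files. Patterns are matched relative to the context root, so prefix a pattern with **/ when it should match at any depth: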

.git
.gitignore
**/__pycache__
**/*.pyc
**/*.pyo
**/*.pyd
.Python
env/
venv/
.venv/
node_modules/
.npm/
.env
.env.local
.DS_Store
*.db
*.sqlite3
.coverage
.pytest_cache/
build/
dist/
*.egg-info/
.idea/
.vscode/
*.log

Now docker build sends only the relevant files. When a virtual environment or .git directory was being included, this often cuts context size by 50–90%, and it keeps unrelated file changes from invalidating COPY . .
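Before writing the ignore list, it helps to see which entries actually dominate the build context. The script below is a small sketch that sums file sizes per top-level entry of a directory; context_sizes is a hypothetical helper name, and the script does not read .dockerignore or talk to Docker at all.

```python
import os
from collections import Counter


def context_sizes(root="."):
    """Return total bytes per top-level file or directory under root."""
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files directly in root are keyed by their own name.
        top = None if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks to avoid counting targets
            sizes[top or name] += os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    # Print the ten largest entries of the current directory.
    for entry, size in context_sizes().most_common(10):
        print(f"{size / 1_000_000:8.1f} MB  {entry}")
```

Run it from the project root; anything large that the image doesn't need at runtime is a candidate for .dockerignore.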

Multi-Line RUN Statements (Layer Consolidation)

Each RUN instruction creates a new layer, and a file deleted in a later layer still takes up space in the layer that created it. Consolidate related commands so that setup and cleanup happen in the same layer:

Instead of:

RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y libpq-dev
RUN rm -rf /var/lib/apt/lists/*

Write:

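RUN apt-get update && apt-get install -y --no-install-recommends build-essential libpq-dev && rm -rf /var/lib/apt/lists/*

This produces a single layer, and unlike the split version it actually shrinks the image: the cleanup only helps when it runs in the same instruction that downloaded the files. Removing /var/lib/apt/lists/* drops the package index that apt-get update fetched, often saving tens of megabytes. Keeping apt-get update and apt-get install in one RUN also prevents a stale, cached package index from being reused when you later add a package.

For readability, use backslashes:

RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential libpq-dev && \
    rm -rf /var/lib/apt/lists/*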
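Why the cleanup has to share a layer with the download can be shown with a little arithmetic. The model below assumes an image's size is the sum of its layers and that a layer records only what exists when its instruction finishes; the 20 MB and 200 MB figures are hypothetical placeholders, not measurements.

```python
def image_size_mb(layers):
    """Sum layer sizes; a later layer can never shrink an earlier one.

    layers: list of (instruction, created_mb, deleted_in_same_step_mb)
    """
    return sum(created - deleted for _instruction, created, deleted in layers)


INDEX_MB = 20      # hypothetical size of the apt package index
PACKAGES_MB = 200  # hypothetical size of the installed packages

split = [
    ("RUN apt-get update", INDEX_MB, 0),
    ("RUN apt-get install ...", PACKAGES_MB, 0),
    # Too late: the index is already stored in the first layer.
    ("RUN rm -rf /var/lib/apt/lists/*", 0, 0),
]
combined = [
    ("RUN apt-get update && apt-get install ... && rm -rf ...",
     INDEX_MB + PACKAGES_MB, INDEX_MB),
]

print(image_size_mb(split))     # 220
print(image_size_mb(combined))  # 200
```

In the split version the final rm hides the index files from the running container but does not remove them from the image.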

Leveraging BuildKit for Parallel Builds

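Modern Docker uses BuildKit, which can build independent stages of a multi-stage build in parallel and skips stages the final image doesn't need. It has been the default builder since Docker Engine 23.0; on older versions, enable it explicitly: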

export DOCKER_BUILDKIT=1
docker build -t myapp:1.0 .

BuildKit also supports advanced features like inline caching and cache mounts, which we'll explore in advanced guides. For now, switching from the legacy builder to BuildKit can cut overall build time by 10–20%, though the gain depends on how your Dockerfile is structured.

Avoiding Common Cache-Breaking Mistakes

Mistake 1: Using latest tag for base images.

FROM python:latest

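A moving tag such as latest can point to a different image from one day to the next. When it does (for example after docker build --pull, or on a fresh CI runner), every layer built on top of it is invalidated, and you may jump to a new Python version without noticing. Use explicit versions: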

FROM python:3.11.7-slim

Mistake 2: Pinning pip packages loosely.

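If your requirements.txt lists fastapi with no version, the result of pip install depends on when it runs. Docker keeps reusing the cached layer with the old version because requirements.txt hasn't changed, while a build without cache (a new machine, a CI runner) installs whatever is newest. Pin versions so that the file changes exactly when your dependencies do: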

fastapi==0.109.0
uvicorn==0.27.0
pydantic==2.5.0
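A quick check for loosely pinned entries can be scripted. The sketch below flags any requirement line that lacks an exact == pin; it is deliberately simplified (it skips option lines such as -r or -e and does not handle URLs or environment markers), and the unpinned helper and sample data are illustrative.

```python
import re

# name, optional [extras], then an exact "==" version pin
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned(lines):
    """Return requirement lines that are not pinned with ==."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and options such as -r or -e
        if not PINNED.match(line):
            result.append(line)
    return result


sample = ["fastapi==0.109.0", "uvicorn", "pydantic>=2.5", "# comment", ""]
print(unpinned(sample))  # ['uvicorn', 'pydantic>=2.5']
```

To check a real file, pass it the lines of requirements.txt, for example unpinned(open("requirements.txt")).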

Mistake 3: Putting code changes in a RUN instruction.

RUN git clone https://github.com/me/myapp /app

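Docker caches a RUN instruction by its command text, so it has no way to know that the remote repository changed: it keeps reusing the layer with the old clone. Use COPY instead: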

COPY . /app

Now Docker checksums your local files and rebuilds this layer only when they actually change.

A Real-World Example: Before and After

Here's a typical beginner Dockerfile and why it's slow:

# Bad: code changes trigger pip rebuild
FROM python:3.11
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

Every time you change app.py, step 2 (COPY . /app) invalidates the cache, so step 4 (RUN pip install) runs again, taking about 40 seconds just as it did on the first install. At 40 seconds per build, 90 builds over a day of development add up to an hour of waiting.

Optimized version:

# Good: stable layers first, changing layers last
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app
ENV PYTHONUNBUFFERED=1
CMD ["python", "app.py"]

Now changing app.py only invalidates step 5 (about 1 second) and the metadata steps after it, which are effectively instant. The pip install layer doesn't rebuild unless you touch requirements.txt. Build time drops from 40–50 seconds to 1–2 seconds.

Key Takeaways

  • Docker caches layers; unchanged layers are reused instead of rebuilt.
  • Order Dockerfile layers from most-stable to least-stable: base image, system packages, dependencies, code.
  • Separate COPY requirements.txt and RUN pip install from COPY . (source code) so code changes don't invalidate the pip install layer.
  • Use .dockerignore to exclude __pycache__, .git, venv, and other large directories.
  • Consolidate RUN statements with && so that cleanup happens in the same layer, reducing layer count and image size.
  • Pin base image versions (python:3.11-slim, not python:latest) and package versions so the cache changes only when you decide it should.

Frequently Asked Questions

How do I know if Docker is using cache or rebuilding?

Run docker build --progress=plain -t myapp:1.0 . and watch the output. With BuildKit, a reused step is followed by a line reading CACHED; the legacy builder prints "---> Using cache" instead. If the RUN pip install step is marked CACHED, the pip layer was reused from a previous build.

Can I force a rebuild without cache?

Yes: run docker build --no-cache -t myapp:1.0 . to rebuild every layer. This is useful when you suspect a cached layer is stale, for example after an unpinned dependency released a new version.

What if I need to update the base image but keep other caches?

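You can't: changing the FROM line invalidates every layer built on top of it. What pinning gives you is control over when that happens. Keep FROM python:3.11.7-slim until you're ready to upgrade, then change it to python:3.11.8-slim and accept one full rebuild.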

Does --no-cache-dir in pip actually matter?

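Yes, though the saving depends on your dependencies; for projects with large packages it can reach 100–200 MB. pip install --no-cache-dir stops pip from keeping downloaded packages in its cache, which you don't need inside an image because you won't reinstall them there. The result is a smaller image that is faster to push and pull.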

Can I view Docker layer sizes?

Yes: docker history myapp:1.0 shows each layer and its size. Useful for identifying bloat.

Further Reading