Docker Python Volumes and Mounts: Persist Data Safely

Docker containers are ephemeral: anything written to a container's own writable layer is lost when the container is removed or recreated. To persist data (database files, logs, caches), you use volumes or bind mounts, storage that lives outside the container's filesystem and survives its removal. A volume is a Docker-managed directory on the host; a bind mount maps a host directory of your choosing into the container. Which one fits depends on your use case, and mistakes can lead to data loss. This article explains both patterns with examples so you can safely persist data in containerized Python applications.

In my second year managing Docker containers, a colleague recreated the database container without realizing the data wasn't in a volume. Months of accumulated data vanished. That's when I learned: volumes are not optional in production. They're also simple once you understand the distinction between volumes and bind mounts.

Volumes vs. Bind Mounts: The Key Difference

Volumes are Docker-managed directories. You create a volume with docker volume create mydata, and Docker stores it on the host (usually /var/lib/docker/volumes/ on Linux). Your application doesn't know where the volume lives; Docker handles it.

Bind mounts are host directories mapped directly into containers. You specify a host path (/home/user/data) and a container path (/app/data), and files appear in both places simultaneously.

| Feature | Volume | Bind mount |
| --- | --- | --- |
| Management | Docker-managed | User-managed |
| Location | /var/lib/docker/volumes/ (Linux default) | Any host path |
| Performance | Optimized by Docker | Depends on host filesystem |
| Portability | No dependency on host paths | Host-specific |
| Ease of backup | Via Docker commands and a helper container | Manual (copy the host directory) |
| Use case | Production data (databases, caches) | Development (live code reload) |

For production: use volumes. Docker manages where the data lives, so the setup doesn't depend on a particular host directory.

For development: use bind mounts. You edit code on your machine and see changes instantly in the container.
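To make the distinction concrete, here is a minimal Python sketch of how a Compose short-syntax volume entry can be told apart by looking at its left-hand side. It is a simplification for illustration only: classify_mount is a hypothetical helper, it assumes POSIX-style paths (no Windows drive letters), and it is not how Docker itself parses the entry. An entry with only a container path creates an anonymous volume, which Docker names for you.

```python
def classify_mount(spec: str) -> str:
    """Classify a Compose short-syntax volume entry (POSIX paths only)."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path was given: Docker creates an anonymous volume.
        return "anonymous volume"
    source = parts[0]
    if source.startswith(("/", ".", "~")):
        # A host path on the left-hand side means a bind mount.
        return "bind mount"
    # Anything else on the left-hand side is treated as a volume name.
    return "named volume"


for spec in ("pgdata:/var/lib/postgresql/data", ".:/app", "/app/.venv"):
    print(f"{spec!r} -> {classify_mount(spec)}")
```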

Volumes: Persistent Storage in Production

Create a named volume:

docker volume create pgdata

Use it in a container:

docker run -e POSTGRES_PASSWORD=mypassword -v pgdata:/var/lib/postgresql/data postgres:15-alpine

The -v pgdata:/var/lib/postgresql/data flag mounts the volume at /var/lib/postgresql/data inside the container. Postgres writes to that directory, and the data persists even after the container is stopped and removed.

Start a new container with the same volume, and the data is still there:

docker run -v pgdata:/var/lib/postgresql/data postgres:15-alpine
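The same idea applies to your own Python code: write state under the directory where the volume is mounted, and it outlives any single container. The sketch below is an illustration under assumptions, not part of the Postgres setup above: bump_counter and the runs.txt file name are hypothetical, and the temporary directory stands in for a volume so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def bump_counter(data_dir: Path) -> int:
    """Increment a run counter stored in data_dir and return the new value."""
    data_dir.mkdir(parents=True, exist_ok=True)
    counter_file = data_dir / "runs.txt"
    previous = int(counter_file.read_text()) if counter_file.exists() else 0
    counter_file.write_text(str(previous + 1))
    return previous + 1


# Two calls against the same directory simulate two containers
# started one after the other with the same volume mounted.
with tempfile.TemporaryDirectory() as volume_dir:
    print(bump_counter(Path(volume_dir)))  # first "container" run: 1
    print(bump_counter(Path(volume_dir)))  # second "container" run: 2
```

Inside a real container you would pass the mount path instead, for example Path("/data") when the container is started with -v appdata:/data (both names are placeholders).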

In Docker Compose, declare volumes at the top level and reference them in services:

version: '3.9'

services:
  postgres:
    image: postgres:15-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=mypassword

  app:
    build: .
    depends_on:
      - postgres

volumes:
  pgdata:
    driver: local

When you run docker compose down, the containers are removed but the volume persists. To destroy it (and all its data), use docker compose down -v.

Bind Mounts: Development with Live Reload

Bind mounts are perfect for local development. Mount your source code into the container so that changes on your machine are immediately visible inside the container.

version: '3.9'

services:
  app:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app  # Bind mount: current directory to /app inside container
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload

When you edit main.py locally, the change appears in /app/main.py inside the container, and uvicorn --reload restarts the server automatically. No rebuild needed.

This works because:

  1. .:/app maps your local directory to /app inside the container.
  2. Both paths point to the same files (no copying).
  3. Changes propagate instantly.
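Because the container sees the very same files, a reloader running inside it can notice an edit simply by watching modification times. The sketch below illustrates that idea with a polling comparison of two snapshots. It is a simplified illustration, not uvicorn's actual implementation, and snapshot and changed_files are hypothetical helper names.

```python
import os
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map each .py file under root to its modification time in nanoseconds."""
    return {str(p): p.stat().st_mtime_ns for p in root.rglob("*.py")}


def changed_files(before: dict[str, int], after: dict[str, int]) -> set[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }


with tempfile.TemporaryDirectory() as project_dir:
    main_py = Path(project_dir) / "main.py"
    main_py.write_text("print('v1')\n")
    before = snapshot(Path(project_dir))

    # Simulate an edit made on the host side of the bind mount.
    main_py.write_text("print('v2')\n")
    # Set the mtime explicitly so the demo doesn't depend on clock resolution.
    new_mtime = before[str(main_py)] + 1_000_000_000
    os.utime(main_py, ns=(new_mtime, new_mtime))

    after = snapshot(Path(project_dir))
    print(changed_files(before, after))
```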

Excluding Files from Bind Mounts

When using bind mounts in development, you might want to exclude large or platform-specific directories like node_modules, __pycache__, or .venv. Declare an anonymous volume on top of the bind mount for each path you want to mask:

services:
  app:
    build: .
    volumes:
      - .:/app            # Bind mount the entire directory
      - /app/__pycache__  # Exclude __pycache__ (use container's version)
      - /app/.venv        # Exclude virtual environment (use container's version)

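Now __pycache__ and .venv aren't shared with your host; the container uses its own copies. This avoids file-sync overhead and problems with compiled extensions built for a different OS. Note that /app/__pycache__ covers only the top-level directory; packages in subdirectories have their own __pycache__ folders.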
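Python creates a __pycache__ directory next to every package it imports, so a project with subpackages may need more than one masking entry. As a rough sketch, the hypothetical helper below walks a project tree and lists the container paths you would have to mask, assuming the project is mounted at /app as in the example above.

```python
import tempfile
from pathlib import Path, PurePosixPath


def pycache_mask_entries(project_root: Path, container_root: str = "/app") -> list[str]:
    """Return one container path per __pycache__ directory under project_root."""
    entries = []
    for cache_dir in project_root.rglob("__pycache__"):
        if cache_dir.is_dir():
            relative = cache_dir.relative_to(project_root)
            # Container paths are always POSIX-style, whatever the host OS is.
            entries.append(str(PurePosixPath(container_root, *relative.parts)))
    return sorted(entries)


with tempfile.TemporaryDirectory() as project_dir:
    root = Path(project_dir)
    (root / "__pycache__").mkdir()
    (root / "pkg" / "__pycache__").mkdir(parents=True)
    for entry in pycache_mask_entries(root):
        print(f"- {entry}")
```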

A Complete Example: Development + Production

Development (docker-compose.yml):

version: '3.9'

services:
  app:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app            # Live code reload
      - /app/__pycache__  # Exclude pycache
    environment:
      - DEBUG=true
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload

  postgres:
    image: postgres:15-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=devpass

volumes:
  pgdata:

Run: docker compose up

Every code change reloads instantly. Postgres data persists across restarts.

Production (docker-compose.prod.yml):

version: '3.9'

services:
  app:
    image: myapp:1.0.0  # Built, tagged, pushed to registry
    restart: always
    environment:
      - DEBUG=false
    # No volumes for code (image is immutable)
    # No healthcheck needed here (orchestrator handles it)

  postgres:
    image: postgres:15-alpine
    restart: always
    volumes:
      - pgdata:/var/lib/postgresql/data  # Persistent volume
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}  # From secrets

volumes:
  pgdata:
    driver: local

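Run: docker compose -f docker-compose.prod.yml up -d

Pass only the production file. Combining it with docker-compose.yml would merge the development bind mounts, published ports, and --reload command into the app service.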

No bind mounts (the code is baked into the image), a persistent volume for the database, and a database password read from the DB_PASSWORD environment variable instead of being hard-coded.
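On the Python side, the application can read these variables once at startup. The sketch below shows one way to do that. It is an assumption-laden illustration: Settings and load_settings are hypothetical names, and it assumes DB_PASSWORD is also passed to the app service, which the Compose files above do not do.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    debug: bool
    db_password: str


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build settings from environment variables such as those set by Compose."""
    debug = env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"}
    # Assumption: DB_PASSWORD is passed to the app container as well.
    password = env.get("DB_PASSWORD", "")
    if not debug and not password:
        # Fail fast in production rather than connecting with an empty password.
        raise RuntimeError("DB_PASSWORD must be set when DEBUG is false")
    return Settings(debug=debug, db_password=password)


print(load_settings({"DEBUG": "true"}))
```

In the application you would call load_settings(os.environ) once at startup and pass the result to the code that opens the database connection.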

Volume Inspection and Cleanup

List volumes:

docker volume ls

Inspect a volume:

docker volume inspect pgdata

Output:

[
  {
    "Name": "pgdata",
    "Driver": "local",
    "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
    "Labels": {},
    "Scope": "local"
  }
]

The Mountpoint is where Docker stores the volume data on the host.
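Since the output is JSON, it is easy to consume from a script. The sketch below parses the sample output shown above and extracts the mountpoint. The mountpoints function is a hypothetical helper, and the embedded string is just the article's example, not live output.

```python
import json

# Sample output of `docker volume inspect pgdata`, copied from above.
INSPECT_OUTPUT = """
[
  {
    "Name": "pgdata",
    "Driver": "local",
    "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
    "Labels": {},
    "Scope": "local"
  }
]
"""


def mountpoints(inspect_json: str) -> dict[str, str]:
    """Map each inspected volume name to its mountpoint on the host."""
    return {vol["Name"]: vol["Mountpoint"] for vol in json.loads(inspect_json)}


print(mountpoints(INSPECT_OUTPUT))
```

To use it against a real daemon, you could feed it the stdout of subprocess.run(["docker", "volume", "inspect", "pgdata"], capture_output=True, text=True, check=True).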

Remove unused volumes:

docker volume prune

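This deletes volumes that no container references. On Docker 23.0 and later it removes only unused anonymous volumes by default; add --all to include unused named volumes such as pgdata. Be careful: it's not reversible.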

Backup and Restore Volumes

Backup a volume to a tar file:

docker run --rm \
  -v pgdata:/data \
  -v /backup:/backup \
  alpine tar czf /backup/pgdata.tar.gz -C /data .

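This runs a temporary Alpine container, mounts the pgdata volume at /data and the host directory /backup at /backup, and writes the archive to /backup/pgdata.tar.gz. Stop the database container first so the files aren't changing while they are copied. Restore: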

docker run --rm \
  -v pgdata:/data \
  -v /backup:/backup \
  alpine tar xzf /backup/pgdata.tar.gz -C /data

For production databases, use native backup tools (pg_dump for PostgreSQL, mongodump for MongoDB) in addition to volume backups.
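If you would rather script file-level backups in Python than shell out to tar, the standard library's tarfile module can produce the same kind of archive. The sketch below is a minimal illustration: backup_dir and restore_dir are hypothetical helpers, and in practice the script would run in a container with the volume mounted at a path such as /data. The temporary directories only stand in for that mount.

```python
import tarfile
import tempfile
from pathlib import Path


def backup_dir(data_dir: Path, archive: Path) -> None:
    """Write the contents of data_dir to a gzip-compressed tar archive."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to data_dir, like `tar -C /data .`
        tar.add(data_dir, arcname=".")


def restore_dir(archive: Path, data_dir: Path) -> None:
    """Unpack an archive created by backup_dir into data_dir."""
    data_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; use the safe one.
            tar.extractall(data_dir, filter="data")
        else:
            tar.extractall(data_dir)


with tempfile.TemporaryDirectory() as work_dir:
    source = Path(work_dir) / "data"
    source.mkdir()
    (source / "example.txt").write_text("hello")
    archive = Path(work_dir) / "pgdata.tar.gz"  # kept outside the data directory
    backup_dir(source, archive)
    restored = Path(work_dir) / "restored"
    restore_dir(archive, restored)
    print((restored / "example.txt").read_text())
```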

Key Takeaways

  • Volumes are Docker-managed storage for persistent data; use them in production.
  • Bind mounts map host directories into containers; use them in development for live code reload.
  • Volumes survive container restarts, removal, and recreation; bind mounts reflect host changes instantly but are host-dependent.
  • Exclude large or platform-specific directories (__pycache__, .venv) from bind mounts by overlaying anonymous volumes on them.
  • Always use volumes for databases, logs, and cached data.

Frequently Asked Questions

Can I use bind mounts in production?

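Technically yes, but it's riskier. A bind mount depends on a specific host path: if that directory is deleted, moved, or has the wrong permissions, the container loses its data or fails to start. Named volumes also live on the host, but Docker manages their location and lifecycle, so they're less exposed to accidental changes.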

What happens if I delete a volume while a container is using it?

Docker refuses. docker volume rm fails with a "volume is in use" error as long as any container, running or stopped, references the volume. You have to remove those containers first, and once the volume is removed its data is gone. Check docker ps -a before deleting volumes.

Can I share a volume between multiple containers?

Yes. Multiple containers can mount the same volume. But ensure only one writes at a time to avoid corruption (e.g., multiple database instances shouldn't write to the same pgdata volume).

How do I know how much space a volume uses?

Use docker system df to see total space used by images, containers, and volumes, and docker system df -v for a per-volume breakdown. On a Linux host you can also run du -sh /var/lib/docker/volumes/volumename/_data (root access is usually required).
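If you want the same number from Python, for example in a monitoring script, a directory walk is enough. This is a rough sketch with a hypothetical dir_size_bytes helper. It sums apparent file sizes, so the result can differ slightly from du, which reports disk blocks used.

```python
import os
import tempfile
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


with tempfile.TemporaryDirectory() as volume_dir:
    (Path(volume_dir) / "a.bin").write_bytes(b"x" * 10)
    (Path(volume_dir) / "sub").mkdir()
    (Path(volume_dir) / "sub" / "b.bin").write_bytes(b"y" * 5)
    print(dir_size_bytes(Path(volume_dir)))  # 15
```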

Can I mount a volume in Windows Docker?

Yes. With Docker Desktop on Windows, volumes are stored inside the Linux VM that Docker runs (the WSL 2 or Hyper-V backend). Bind mounts work for Windows paths (e.g., C:\Users\data:/app/data), but file access is slower than on Linux.
