Docker Python Production Ready: Networking and Performance Tuning
Moving from local development to production requires more than just containerization. Production Python applications need reverse proxies (Nginx), health checks, resource limits, proper logging, and careful networking configuration. These elements prevent crashes, slow restarts, and mysterious disconnects. This article covers the setup patterns that production teams use to ship resilient Python containers that handle real-world traffic and scaling.
I learned these practices the hard way: a production Flask app crashed at 2 AM because it had no memory limits and consumed all available RAM. A colleague's Celery worker hung silently because we weren't monitoring its health. These failures shaped how I now approach production containerization.
Putting a Reverse Proxy in Front of Your App
Your Python app shouldn't listen directly on port 80 or 443. Instead, use a reverse proxy like Nginx to handle HTTP routing, TLS termination, request buffering, and compression. This protects your app from slow clients and simplifies deployment.
Here's a production docker-compose.yml with Nginx:
version: '3.9'
services:
  # Nginx reverse proxy
  nginx:
    image: nginx:1.25-alpine
    container_name: nginx-proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro  # TLS certificates
    depends_on:
      - app
    networks:
      - web
  # Python FastAPI application
  app:
    build: .
    container_name: fastapi-app
    environment:
      # User, password, and database must match the POSTGRES_* values below
      - DATABASE_URL=postgresql://user:securepass@postgres:5432/myapp
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    networks:
      - web
    # Not published to the host; expose only documents the port Nginx proxies to
    expose:
      - "8000"
  # PostgreSQL database
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=securepass
      - POSTGRES_DB=myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - web
volumes:
  pgdata:
networks:
  web:
    driver: bridge
And your nginx.conf:
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
    worker_connections 1024;
}
http {
    upstream app {
        # Load balance across app containers
        server app:8000;
    }
    server {
        listen 80;
        server_name _;
        # Request size limits
        client_max_body_size 10M;
        # Proxy to Python app
        location / {
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }
        # Health check endpoint (Nginx internal)
        location /nginx-health {
            access_log off;
            return 200 "OK";
        }
    }
}
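Your FastAPI app stays on port 8000, reachable only on the internal Docker network, and Nginx handles all external traffic. Clients connect to Nginx on 80/443, never directly to your Python app. Note that the nginx.conf above only defines the port 80 listener; serving 443 also needs a server block with listen 443 ssl and the certificate paths from the mounted ./ssl directory.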
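With Nginx in between, the app sees Nginx's container address as the peer, and the real client address arrives in the X-Real-IP and X-Forwarded-For headers set in the config above. Here is a minimal sketch of reading them, assuming Nginx is the only proxy in front of the app and that header names have already been lower-cased. The helper name client_ip is hypothetical.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], peer_addr: str) -> str:
    """Best-effort client address behind a single trusted proxy.

    Assumption: Nginx is the only proxy, so the values it sets can be trusted.
    """
    real_ip = headers.get("x-real-ip", "").strip()
    if real_ip:
        return real_ip
    forwarded = headers.get("x-forwarded-for", "")
    # $proxy_add_x_forwarded_for appends $remote_addr, so the right-most
    # entry is the address Nginx itself saw.
    parts = [p.strip() for p in forwarded.split(",") if p.strip()]
    if parts:
        return parts[-1]
    # No proxy headers at all: fall back to the socket peer.
    return peer_addr
```

In practice you may not need to write this yourself: Uvicorn's --proxy-headers and --forwarded-allow-ips options cover the same ground.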
Health Checks for Reliable Restarts
A health check tells Docker whether a container can actually serve requests, not just whether its process is still running. Orchestrators like Kubernetes use their own probes for the same purpose. Without health checks, a hung Python app keeps running silently while requests time out.
Add a health check to your Dockerfile:
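FROM python:3.11-slim
WORKDIR /app
# curl is a system package, not a pip package, and the slim image does not ship it
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir fastapi uvicorn
COPY . .
HEALTHCHECK --interval=30s --timeout=5s --retries=3 --start-period=10s \
    CMD curl -f http://localhost:8000/health || exit 1
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]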
And expose a health endpoint in your app:
import os

from fastapi import FastAPI, HTTPException
from sqlalchemy import create_engine, text

app = FastAPI()
# Assumption: SQLAlchemy with a PostgreSQL driver installed; swap in your own database client
engine = create_engine(os.environ["DATABASE_URL"])

# Simple health check
@app.get("/health")
def health_check():
    try:
        # Verify database connection
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        return {"status": "healthy"}
    except Exception:
        raise HTTPException(status_code=503, detail="Database unreachable")

@app.get("/readiness")
def readiness():
    """Readiness probe for orchestrators like Kubernetes."""
    # Startup code is expected to set app.state.initialized = True when ready
    if not getattr(app.state, "initialized", False):
        raise HTTPException(status_code=503, detail="Initializing")
    return {"ready": True}
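Docker runs the health check on the configured interval. After 3 consecutive failures it marks the container as unhealthy. On its own, Docker only reports that status; restarting or replacing the container is up to an orchestrator such as Docker Swarm or an external watchdog. Check the status on the container (fastapi-app here), not the image:
docker inspect fastapi-app | grep -A 5 '"Health"'
Output:
"Health": {
    "Status": "healthy",
    "FailingStreak": 0,
    "Log": [...]
}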
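If you would rather not install curl in the image just for the probe, the same check can be done with the Python standard library. This is a sketch under assumptions: the function names are hypothetical, and the URL is the /health endpoint shown earlier.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, malformed URL,
        # or a 4xx/5xx response (urllib raises HTTPError for those).
        return False


def main(url: str = "http://localhost:8000/health") -> int:
    """Map the probe result to the exit code Docker expects (0 = healthy)."""
    return 0 if is_healthy(url) else 1
```

Saved as, say, healthcheck.py with a final line calling sys.exit(main()), this could replace curl in the health check: test: ["CMD", "python", "healthcheck.py"].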
Resource Limits: Prevent Runaway Containers
Without limits, a Python app can consume all available CPU and memory, crashing the host. Always set limits in Compose:
services:
  app:
    build: .
    deploy:
      resources:
        limits:
          cpus: '1.0'     # Max 1 CPU core
          memory: 512M    # Max 512 MB RAM
        reservations:
          cpus: '0.5'     # Request 0.5 CPU
          memory: 256M    # Request 256 MB RAM
Limits are hard caps. A container that exceeds its memory limit is killed by the kernel's OOM killer, while CPU use above the limit is throttled rather than terminated. Reservations are requests (hints to the orchestrator, not hard limits). Together they guard against:
- Memory leaks from eating all RAM
- Runaway loops from consuming CPU indefinitely
- One container from starving others
Monitor actual usage:
docker stats fastapi-app
Output:
CONTAINER     CPU %   MEM USAGE / LIMIT
fastapi-app   0.5%    85M / 512M
If your app consistently uses 400M of 512M, it's close to the limit. Either optimize the app or increase the limit.
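The app can also check its own headroom from inside the container. The sketch below assumes a cgroup v2 host, where the limit and current usage appear as memory.max and memory.current under /sys/fs/cgroup; on cgroup v1 hosts the file names differ. The function names are hypothetical.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_memory(cgroup_dir: Path = Path("/sys/fs/cgroup")) -> Tuple[int, Optional[int]]:
    """Return (bytes in use, limit in bytes or None if unlimited) from cgroup v2 files."""
    current = int((cgroup_dir / "memory.current").read_text().strip())
    raw_limit = (cgroup_dir / "memory.max").read_text().strip()
    # cgroup v2 writes the literal string "max" when no limit is set
    limit = None if raw_limit == "max" else int(raw_limit)
    return current, limit


def usage_ratio(current: int, limit: Optional[int]) -> Optional[float]:
    """Fraction of the limit in use, or None when no limit is set."""
    if not limit:
        return None
    return current / limit
```

For the 400M of 512M case above, usage_ratio returns roughly 0.78, which could be logged or exposed as a metric so you see the trend before the OOM killer does.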
Logging: Send Container Logs to a Centralized Service
By default, Docker's json-file logging driver writes container logs under /var/lib/docker/containers/ on the host. For production, send logs to a centralized service (ELK, Datadog, CloudWatch) so you can search and aggregate across containers.
In docker-compose.yml:
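services:
  app:
    build: .
    logging:
      driver: "awslogs"  # AWS CloudWatch Logs
      options:
        awslogs-group: "/ecs/myapp"
        awslogs-region: "us-east-1"
        # awslogs-stream-prefix is an ECS task-definition option;
        # the Docker engine driver takes awslogs-stream instead
        awslogs-stream: "app"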
Or with Splunk:
    logging:
      driver: "splunk"
      options:
        splunk-token: "your-hec-token"
        splunk-url: "https://your-splunk-instance.com"
        splunk-source: "docker:app"
In your Python app, log structured data (JSON) for easy parsing:
import json
import logging
import sys
from datetime import datetime, timezone

from fastapi import FastAPI

app = FastAPI()

# Structured logging: one JSON object per line on stdout
logging.basicConfig(
    level=logging.INFO,
    format="%(message)s",
    stream=sys.stdout,
)
logger = logging.getLogger(__name__)

@app.get("/api/items/{item_id}")
def get_item(item_id: int):
    logger.info(json.dumps({
        "event": "item_retrieved",
        "item_id": item_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }))
    return {"id": item_id, "name": "Example"}
Structured JSON logs are easier to search and alert on in centralized logging systems.
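Calling json.dumps at every log site gets repetitive. One alternative is to move the JSON encoding into a logging.Formatter so that ordinary logger calls come out structured. This is a sketch, not the only way to do it; passing extra data under a "fields" key is a hypothetical convention chosen for this example.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def configure_logging(level: int = logging.INFO) -> None:
    """Replace the root logger's handlers with one JSON handler on stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers[:] = [handler]
    root.setLevel(level)
```

After calling configure_logging() once at startup, a call such as logger.info("item_retrieved", extra={"fields": {"item_id": item_id}}) emits one JSON line with the timestamp and level filled in automatically.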
Network Configuration: Custom Networks vs. Bridge
Compose creates a default network for each project, and services on it discover each other by service name. For finer control over which services can talk to each other, define custom networks:
version: '3.9'
services:
  app:
    build: .
    networks:
      - web       # Frontend network
      - internal  # Backend network (app can reach db, but external can't)
  postgres:
    image: postgres:15
    networks:
      - internal  # Only app can reach postgres
networks:
  web:
    driver: bridge
  internal:
    driver: bridge
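This limits database access. Postgres is attached only to the internal network, so only services that also join internal (here, the app) can reach it; anything attached only to web, such as a reverse proxy, cannot. Setting internal: true on that network additionally blocks outbound access from it.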
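It's worth verifying the isolation rather than assuming it. A small TCP reachability check, run from inside different containers, shows which services can actually reach the database. The helper name can_connect is hypothetical, and the host and port in the usage note are the ones from the Compose file above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure (socket.gaierror), connection refused, and timeouts
        return False
```

Run inside the app container, can_connect("postgres", 5432) should succeed. From a container attached only to the web network, the same call should fail, because the postgres name is not resolvable there.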
A Complete Production Example
Combining all best practices:
version: '3.9'
services:
  nginx:
    image: nginx:1.25-alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - app
    networks:
      - web
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
  app:
    build: .
    expose:
      - "8000"
    environment:
      # User, password, and database must match the POSTGRES_* values below
      - DATABASE_URL=postgresql://user:securepass@postgres:5432/myapp
      - LOG_LEVEL=info
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    networks:
      - web
      - internal
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "/ecs/myapp"
        awslogs-region: "us-east-1"
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G
        reservations:
          cpus: '1.0'
          memory: 512M
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=securepass
      - POSTGRES_DB=myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - internal
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
volumes:
  pgdata:
networks:
  web:
    driver: bridge
  internal:
    driver: bridge
This setup provides:
- A reverse proxy in front of the app
- Health checks for monitoring
- Resource limits to prevent runaway containers
- Logs shipped to a centralized service
- Separate networks for security
- Persistent volumes for data
Swap the hard-coded credentials for secrets and it's production-ready.
Key Takeaways
- Use a reverse proxy (Nginx) in front of your Python app to handle TLS termination, request buffering, and slow clients.
- Health checks let Docker flag hung containers so that an orchestrator (Swarm, or Kubernetes via its own probes) can restart them automatically.
- Resource limits prevent one container from consuming all CPU/memory and crashing the host.
- Send logs to a centralized service (CloudWatch, Datadog) for searchability and alerting.
- Use custom networks to restrict inter-service communication; only expose what's necessary.
Frequently Asked Questions
Do I need Nginx if I'm deploying to Kubernetes?
Kubernetes has Ingress and Service resources that handle routing. You might still use Nginx as an Ingress controller, but it's not required the way it is for standalone Docker.
How often should health checks run?
Every 30–60 seconds is typical. More frequent checks consume more resources and can turn transient blips into false alarms; less frequent checks mean failures take longer to detect.
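The trade-off is easier to judge with a number. As a rough model (an approximation, not Docker's exact scheduling), assume each failing probe waits a full interval and then runs until its timeout, and that start_period has already passed. The function name is hypothetical.

```python
def worst_case_detection_seconds(interval: float, timeout: float, retries: int) -> float:
    """Approximate upper bound on the time from failure to 'unhealthy'.

    Assumes every one of the `retries` failing probes waits a full
    interval and then hangs until its timeout expires.
    """
    return retries * (interval + timeout)
```

With the values used in this article (30s interval, 5s timeout, 3 retries) the estimate is 105 seconds. Dropping the interval to 10s brings it down to 45 seconds at the cost of three times as many probes.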
What resource limits should I use?
Start with 1 CPU and 512 MB for a typical Flask/FastAPI app. Monitor actual usage, then adjust. Memory leaks and CPU spikes indicate a need to optimize the app or increase limits.
Can I change resource limits on a running container?
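Not through Compose itself. Update the Compose file and run docker compose up -d, which recreates the container with the new limits. For a one-off change, docker update with --memory and --cpus adjusts a running container in place, but the change is not recorded in the Compose file. In Kubernetes, changing the limits in a Deployment typically rolls out replacement pods.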
How do I handle graceful shutdowns?
Set stop_grace_period in Compose, for example stop_grace_period: 30s. On stop, Docker sends SIGTERM, waits up to that long (10 seconds by default) for the app to finish in-flight requests, then sends SIGKILL.