Docker Python Production Ready: Networking and Performance Tuning

Moving from local development to production requires more than just containerization. Production Python applications need reverse proxies (Nginx), health checks, resource limits, proper logging, and careful networking configuration. These elements prevent crashes, slow restarts, and mysterious disconnects. This article covers the setup patterns that production teams use to ship resilient Python containers that handle real-world traffic and scaling.

I learned these practices the hard way: a production Flask app crashed at 2 AM because it had no memory limits and consumed all available RAM. A colleague's Celery worker hung silently because we weren't monitoring its health. These failures shaped how I now approach production containerization.

Putting a Reverse Proxy in Front of Your App

Your Python app shouldn't listen directly on port 80 or 443. Instead, use a reverse proxy like Nginx to handle HTTP routing, TLS termination, request buffering, and compression. This protects your app from slow clients and simplifies deployment.

Here's a production docker-compose.yml with Nginx:

version: '3.9'

services:
  # Nginx reverse proxy
  nginx:
    image: nginx:1.25-alpine
    container_name: nginx-proxy
    ports:
      - "80:80"
      - "443:443"  # Published for TLS; the nginx.conf below only configures port 80
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro  # TLS certificates
    depends_on:
      - app
    networks:
      - web

  # Python FastAPI application
  app:
    build: .
    container_name: fastapi-app
    environment:
      # Credentials and database name must match the postgres service below
      - DATABASE_URL=postgresql://postgres:securepass@postgres:5432/myapp
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
    networks:
      - web
    # No "ports:" entry, so 8000 is never published on the host.
    # "expose" documents the port that Nginx proxies to over the web network.
    expose:
      - "8000"

  # PostgreSQL database
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=securepass
      - POSTGRES_DB=myapp  # Creates the database named in DATABASE_URL
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - web

volumes:
  pgdata:

networks:
  web:
    driver: bridge

And your nginx.conf:

worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    upstream app {
        # Load balance across app containers
        server app:8000;
    }

    server {
        listen 80;
        server_name _;

        # Request size limits
        client_max_body_size 10M;

        # Proxy to Python app
        location / {
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }

        # Health check endpoint (Nginx internal)
        location /nginx-health {
            access_log off;
            return 200 "OK";
        }
    }
}

Your FastAPI app stays on port 8000 (only exposed to the internal network), and Nginx handles all external traffic. Clients connect to Nginx on 80/443, not directly to your Python app.
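
With Nginx in between, the app sees the proxy's address as its peer, and the original client address arrives only in the headers set by the proxy_set_header lines. The following is a minimal sketch of reading that address, assuming exactly one trusted proxy (the Nginx container) in front of the app. The function name client_ip is hypothetical, not part of FastAPI or Nginx.

```python
from typing import Mapping


def client_ip(headers: Mapping[str, str], peer_ip: str) -> str:
    """Return the client address as reported by a single trusted proxy.

    Assumption: exactly one trusted proxy (Nginx) sits in front of the app.
    Nginx's $proxy_add_x_forwarded_for appends the address it saw to any
    X-Forwarded-For value the client sent, so only the LAST entry is
    trustworthy; earlier entries can be forged by the client.
    """
    # Header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}

    forwarded = lowered.get("x-forwarded-for", "")
    entries = [part.strip() for part in forwarded.split(",") if part.strip()]
    if entries:
        return entries[-1]

    # Fall back to X-Real-IP, then to the direct peer (no proxy in between).
    return lowered.get("x-real-ip", "").strip() or peer_ip
```

In a FastAPI handler this could be called with request.headers and the peer address from request.client. Uvicorn can also apply forwarded headers itself through its --proxy-headers and --forwarded-allow-ips options, which may be simpler if you do not need custom logic.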

Health Checks for Reliable Restarts

A health check tells Docker (or an orchestrator like Kubernetes) whether a container is actually healthy. Without health checks, a hung Python app keeps running silently while requests time out.

Add a health check to your Dockerfile:

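FROM python:3.11-slim

WORKDIR /app

# curl is a system package, not a pip package, and the slim image does not ship it
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir fastapi uvicorn

COPY . .

HEALTHCHECK --interval=30s --timeout=5s --retries=3 --start-period=10s \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]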

And expose a health endpoint in your app:

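import os

from fastapi import FastAPI, HTTPException
from sqlalchemy import create_engine, text

app = FastAPI()

# Assumption: the app uses SQLAlchemy with the DATABASE_URL set in Compose.
# The image then also needs sqlalchemy and a PostgreSQL driver installed.
engine = create_engine(os.environ["DATABASE_URL"])

# Simple health check
@app.get("/health")
def health_check():
    try:
        # Verify database connection
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        return {"status": "healthy"}
    except Exception:
        raise HTTPException(status_code=503, detail="Database unreachable")

@app.get("/readiness")
def readiness():
    """Readiness probe for orchestrators like Kubernetes."""
    # More intensive check: ensure app is ready to serve.
    # app.state.initialized is expected to be set by your startup code.
    if not getattr(app.state, "initialized", False):
        raise HTTPException(status_code=503, detail="Initializing")
    return {"ready": True}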

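Docker runs the health check at the configured interval. After 3 consecutive failures it marks the container as unhealthy. Plain Docker and Compose only report that status; actually restarting an unhealthy container is the job of an orchestrator (Docker Swarm, or Kubernetes with its own probes) or a watchdog process. Inspect the current status by container name:

docker inspect fastapi-app | grep -A 5 '"Health"'

Output:

"Health": {
    "Status": "healthy",
    "FailingStreak": 0,
    "Log": [...]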
}
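
If you would rather not install curl in the image, the same probe can be written with the Python standard library. This is a minimal sketch, assuming the /health endpoint and port 8000 used in this article. The file name healthcheck.py and the function names are hypothetical.

```python
import urllib.request


def is_healthy(url: str = "http://localhost:8000/health", timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except Exception:
        # HTTP errors (4xx/5xx), refused connections, and timeouts all count
        # as unhealthy; a probe should never crash with a traceback.
        return False


def main() -> int:
    """Map the result to the exit codes Docker expects: 0 healthy, 1 unhealthy."""
    return 0 if is_healthy() else 1
```

Saved as healthcheck.py in the image's working directory, it could be wired in with HEALTHCHECK CMD python -c "import sys, healthcheck; sys.exit(healthcheck.main())" in place of the curl command.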

Resource Limits: Prevent Runaway Containers

Without limits, a Python app can consume all available CPU and memory, crashing the host. Always set limits in Compose:

services:
  app:
    build: .
    deploy:
      resources:
        limits:
          cpus: '1.0'     # Max 1 CPU core
          memory: 512M    # Max 512 MB RAM
        reservations:
          cpus: '0.5'     # Request 0.5 CPU
          memory: 256M    # Request 256 MB RAM

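The limits are hard caps. A container that exceeds its memory limit is killed by the kernel's OOM killer; one that hits its CPU limit is throttled rather than terminated. The reservations are requests (hints to the orchestrator, not hard limits). Note that docker compose (v2) applies deploy.resources directly, while the legacy docker-compose v1 binary ignored it outside Swarm unless run with --compatibility. Limits prevent: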

  • Memory leaks from eating all RAM
  • Runaway loops from consuming CPU indefinitely
  • One container from starving others

Monitor actual usage:

docker stats fastapi-app

Output:

NAME          CPU %   MEM USAGE / LIMIT
fastapi-app   0.5%    85MiB / 512MiB

If your app consistently uses 400M of 512M, it's close to the limit. Either optimize the app or increase the limit.
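
The same ratio can be read from inside the container, which is useful for logging a warning before the limit is hit. This is a minimal sketch, assuming a host that uses cgroup v2, where the limit and current usage appear as the files shown below. Hosts on cgroup v1 use different paths, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# cgroup v2 paths as seen from inside a container (assumption: cgroup v2 host).
CGROUP_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_USAGE = Path("/sys/fs/cgroup/memory.current")


def read_cgroup_bytes(path: Path) -> Optional[int]:
    """Read a cgroup file holding a byte count.

    Returns None when the file is missing or contains "max" (no limit set).
    """
    try:
        raw = path.read_text().strip()
    except OSError:
        return None
    return None if raw == "max" else int(raw)


def memory_usage_ratio(usage_path: Path = CGROUP_USAGE,
                       limit_path: Path = CGROUP_LIMIT) -> Optional[float]:
    """Return usage / limit as a fraction, or None when no limit applies."""
    usage = read_cgroup_bytes(usage_path)
    limit = read_cgroup_bytes(limit_path)
    if usage is None or limit is None or limit == 0:
        return None
    return usage / limit
```

For the 400M of 512M case above, the ratio is about 0.78, so a warning threshold somewhere around 0.75 to 0.8 would flag it; the exact threshold is a judgment call for your workload.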

Logging: Send Container Logs to a Centralized Service

By default, Docker's json-file driver writes container logs to files under /var/lib/docker/containers/ on the host. For production, send logs to a centralized service (ELK, Datadog, CloudWatch) so you can search and aggregate across containers.

In docker-compose.yml:

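services:
  app:
    build: .
    logging:
      driver: "awslogs"  # AWS CloudWatch Logs
      options:
        awslogs-group: "/ecs/myapp"
        awslogs-region: "us-east-1"
        # awslogs-stream-prefix is an ECS task-definition option, not a Docker one.
        # Docker names the stream after the container ID unless awslogs-stream is set.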

Or with Splunk:

logging:
  driver: "splunk"
  options:
    splunk-token: "your-hec-token"
    splunk-url: "https://your-splunk-instance.com"
    splunk-source: "docker:app"

In your Python app, log structured data (JSON) for easy parsing:

import json
import logging
import sys
from datetime import datetime, timezone

from fastapi import FastAPI

app = FastAPI()

# Structured logging: emit the raw message so each line is valid JSON
logging.basicConfig(
    level=logging.INFO,
    format="%(message)s",
    stream=sys.stdout,
)
logger = logging.getLogger(__name__)

@app.get("/api/items/{item_id}")
def get_item(item_id: int):
    logger.info(json.dumps({
        "event": "item_retrieved",
        "item_id": item_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }))
    return {"id": item_id, "name": "Example"}

Structured JSON logs are easier to search and alert on in centralized logging systems.
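
Calling json.dumps in every handler gets repetitive, and log lines from libraries stay unstructured. One alternative is a custom logging.Formatter that turns every record into JSON. The sketch below uses only the standard library. The class name JsonFormatter and the convention of passing extra={"fields": {...}} are assumptions made for illustration, not an established API.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: logger.info("...", extra={"fields": {...}})
        # attaches a dict to the record, which is merged into the output.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def configure_json_logging(level: int = logging.INFO) -> None:
    """Send JSON-formatted records from the root logger to stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

After calling configure_json_logging() once at startup, a handler could log with logger.info("item_retrieved", extra={"fields": {"item_id": item_id}}) and the timestamp and level are added automatically.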

Network Configuration: Custom Networks vs. Bridge

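Compose creates a default bridge network for each project, and services on it discover each other by service name. (Docker's own built-in bridge network does not provide name resolution.) For better control, define custom networks: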

version: '3.9'

services:
  app:
    build: .
    networks:
      - web       # Frontend network
      - internal  # Backend network (app can reach db, but external can't)

  postgres:
    image: postgres:15
    networks:
      - internal  # Only app can reach postgres

networks:
  web:
    driver: bridge
  internal:
    driver: bridge

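This limits database access: postgres is attached only to the internal network, so only containers that share that network (here, the app) can reach it. A proxy attached only to web cannot connect to the database directly. To also stop containers on the internal network from reaching the outside world, set internal: true on that network.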
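
One way to confirm the segmentation behaves as intended is to attempt a TCP connection from inside each container. This is a minimal sketch using only the standard library; the function name can_connect is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (unknown service name), refusals, and timeouts.
        return False
```

Run inside the app container, can_connect("postgres", 5432) should succeed once the database is up. Run inside a container attached only to the web network, the same call should fail, because the postgres service name is not resolvable from a network it is not attached to.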

A Complete Production Example

Combining all best practices:

version: '3.9'

services:
  nginx:
    image: nginx:1.25-alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - app
    networks:
      - web
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M

  app:
    build: .
    expose:
      - "8000"
    environment:
      # Credentials and database name must match the postgres service below
      - DATABASE_URL=postgresql://postgres:securepass@postgres:5432/myapp
      - LOG_LEVEL=info
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    networks:
      - web
      - internal
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "/ecs/myapp"
        awslogs-region: "us-east-1"
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G
        reservations:
          cpus: '1.0'
          memory: 512M

  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=securepass
      - POSTGRES_DB=myapp  # Creates the database named in DATABASE_URL
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - internal
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M

volumes:
  pgdata:

networks:
  web:
    driver: bridge
  internal:
    driver: bridge

This setup includes:

  • A reverse proxy in front of the app
  • Health checks for monitoring
  • Resource limits to contain runaway containers
  • Log shipping to a centralized service
  • Separate networks for security
  • A persistent volume for database data

It's a production-ready baseline.

Key Takeaways

  • Use a reverse proxy (Nginx) in front of your Python app to handle TLS termination, request buffering, and slow clients.
  • Health checks let Docker flag hung containers as unhealthy, so an orchestrator (Swarm, Kubernetes) can restart them automatically.
  • Resource limits prevent one container from consuming all CPU/memory and crashing the host.
  • Send logs to a centralized service (CloudWatch, Datadog) for searchability and alerting.
  • Use custom networks to restrict inter-service communication; only expose what's necessary.

Frequently Asked Questions

Do I need Nginx if I'm deploying to Kubernetes?

Kubernetes has Ingress and Service resources that handle routing. You might still use Nginx as an Ingress controller, but it's not required the way it is for standalone Docker.

How often should health checks run?

Every 30–60 seconds is typical. More frequent checks consume more resources and can turn brief, transient slowdowns into false alarms. Less frequent checks mean a longer time to detect failures.
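
The trade-off can be made concrete with a rough upper bound on detection time. The sketch below assumes the app breaks right after a passing check, every later check runs until it times out, and each check starts one interval after the previous one finishes. It is a simplified model rather than an exact description of Docker's scheduler, and the function name is hypothetical.

```python
def worst_case_detection_seconds(interval: float, timeout: float, retries: int) -> float:
    """Upper bound on the time from a failure to the 'unhealthy' status.

    Each failed attempt costs one wait of `interval` plus up to `timeout`
    seconds for the check itself, and `retries` consecutive failures are
    needed before the status flips.
    """
    return retries * (interval + timeout)


# Settings used in this article: interval=30s, timeout=5s, retries=3
article_settings = worst_case_detection_seconds(interval=30, timeout=5, retries=3)
```

With the article's settings the bound is 105 seconds. Halving the interval to 15 seconds brings it down to 60 seconds at the cost of twice as many probe requests.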

What resource limits should I use?

Start with 1 CPU and 512 MB for a typical Flask/FastAPI app. Monitor actual usage, then adjust. Memory leaks and CPU spikes indicate a need to optimize the app or increase limits.

Can I change resource limits on a running container?

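Not through the Compose file alone. After changing it, run docker compose up -d, which recreates the affected containers. For a one-off change without a restart, docker update (for example, docker update --cpus 2 fastapi-app) adjusts a running container, but the change is lost the next time Compose recreates it. In Kubernetes, changing limits has traditionally replaced the pod as well; in-place resizing is only available in newer releases.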

How do I handle graceful shutdowns?

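Set stop_grace_period in Compose, for example stop_grace_period: 30s (the default is 10s). When the container is stopped, Docker sends SIGTERM, waits that long for the app to finish in-flight requests, and then kills it with SIGKILL. The app has to handle SIGTERM for the grace period to help.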
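
Uvicorn handles SIGTERM for in-flight web requests itself, but a background worker or a plain script needs its own handler. The following is a minimal sketch using only the standard library. The names ShutdownFlag and run_jobs are hypothetical, and the loop stands in for whatever unit of work your worker processes.

```python
import signal
import threading
from typing import Callable, Iterable, List


class ShutdownFlag:
    """Records that SIGTERM or SIGINT arrived so work can stop at a safe point."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def install(self) -> None:
        # signal.signal may only be called from the main thread.
        signal.signal(signal.SIGTERM, self._handle)
        signal.signal(signal.SIGINT, self._handle)

    def _handle(self, signum, frame) -> None:
        self._event.set()

    def request(self) -> None:
        """Request shutdown without a signal (handy in tests)."""
        self._event.set()

    @property
    def requested(self) -> bool:
        return self._event.is_set()


def run_jobs(jobs: Iterable[Callable[[], object]], flag: ShutdownFlag) -> List[object]:
    """Run jobs in order; finish the current one, start no new one after shutdown."""
    results = []
    for job in jobs:
        if flag.requested:
            break
        results.append(job())
    return results
```

A worker would create one ShutdownFlag, call install() at startup, and pass the flag to its processing loop. The grace period then needs to be at least as long as the slowest single job.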

Further Reading