Deploying FastAPI to Production

Deploying to production means your code runs on remote servers, handles real traffic, stores customer data, and processes payments. A poorly deployed app crashes at scale, loses data, or leaks secrets. This guide covers containerization (Docker), orchestration (Kubernetes or cloud-managed services), database setup, environment configuration, monitoring, and zero-downtime deployments. You'll go from localhost to a production-grade SaaS backend serving thousands of users.

Docker: Containerizing Your FastAPI App

Docker packages your app with dependencies, ensuring it runs identically on any machine. Create a Dockerfile:

# Dockerfile
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port
EXPOSE 8000

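# Health check (urllib ships with Python and raises on HTTP error responses,
# so the command exits non-zero when /health fails)
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=5)"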

# Run the app
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Build and test locally:

docker build -t saas-backend:latest .
docker run -p 8000:8000 -e DATABASE_URL=postgresql://... saas-backend:latest
# Visit http://localhost:8000/docs
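
The inline health-check command in the Dockerfile can also live in a small script, which is easier to read and extend. A minimal sketch using only the standard library, so the image needs no extra dependency; the `/health` path comes from the Dockerfile above, while the function name `is_healthy` and the 5-second default timeout are illustrative assumptions:

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL, or an HTTP error status.
        return False
```

A health-check script would call `is_healthy("http://localhost:8000/health")` and exit with status 0 when it returns True and 1 otherwise, since Docker treats a non-zero exit code as unhealthy.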

Multi-Stage Builds for Efficiency

Reduce image size with multi-stage builds:

# Dockerfile
FROM python:3.11-slim AS builder

WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.11-slim

WORKDIR /app

# Copy only the installed packages from builder
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH

# Copy application code
COPY . .

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

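Build-time files stay in the builder stage, so the final image is smaller: roughly ~900 MB down to ~300 MB in this example, though the exact numbers depend on your base image and dependencies.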

Deploying to AWS ECS (Elastic Container Service)

AWS ECS is a managed container orchestration service (simpler than Kubernetes for small teams):

# 1. Create an ECR repository
aws ecr create-repository --repository-name saas-backend

# 2. Build and push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker build -t saas-backend:latest .
docker tag saas-backend:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/saas-backend:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/saas-backend:latest

# 3. Create ECS task definition (define CPU, memory, environment variables)
# (Use AWS Console or CLI)

# 4. Create ECS service with load balancer
aws ecs create-service \
  --cluster saas-prod \
  --service-name saas-backend \
  --task-definition saas-backend \
  --desired-count 3 \
  --load-balancers targetGroupArn=arn:aws:...,containerName=saas-backend,containerPort=8000

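ECS keeps the desired number of tasks running, replaces tasks that fail health checks, and registers them with the load balancer, which distributes traffic. Scaling the task count automatically requires an auto-scaling policy, covered in the load balancing section.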

Environment Configuration in Production

Never hardcode secrets. Use environment variables or AWS Secrets Manager:

# config.py
import os
from typing import Optional


def _require(name: str) -> str:
    """Return a required environment variable, failing fast if it is unset."""
    value = os.getenv(name)
    if not value:
        raise ValueError(f"{name} environment variable is required")
    return value


class Settings:
    def __init__(self) -> None:
        # Required secrets (fail if not set)
        self.DATABASE_URL: str = _require("DATABASE_URL")
        self.STRIPE_SECRET_KEY: str = _require("STRIPE_SECRET_KEY")
        self.SECRET_KEY: str = _require("SECRET_KEY")

        # Optional settings with defaults
        self.REDIS_URL: str = os.getenv("REDIS_URL", "redis://localhost:6379/0")
        self.SENTRY_DSN: Optional[str] = os.getenv("SENTRY_DSN")
        self.ENVIRONMENT: str = os.getenv("ENVIRONMENT", "production")
        self.DEBUG: bool = os.getenv("DEBUG", "false").lower() == "true"

        # Validate production settings. Use explicit raises: assert
        # statements are skipped when Python runs with -O.
        if self.ENVIRONMENT == "production":
            if self.DEBUG:
                raise ValueError("DEBUG must be false in production")
            if not self.SENTRY_DSN:
                raise ValueError("SENTRY_DSN is required in production")


settings = Settings()
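
The `DEBUG` line treats anything other than the string `true` as false, so `DEBUG=1` silently leaves debug off and a typo goes unnoticed. A stricter parser is one option. This is a sketch, not part of the original config: the helper name `parse_bool` and the accepted spellings are assumptions, not a standard.

```python
from typing import Optional

# Accepted spellings are an illustrative choice.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def parse_bool(raw: Optional[str], default: bool = False) -> bool:
    """Parse an environment flag, rejecting unrecognized spellings."""
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean flag: {raw!r}")
```

With this helper, the setting would read `parse_bool(os.getenv("DEBUG"))`, and a value such as `DEBUG=ture` stops the app at startup instead of being read as false.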

In the ECS task definition, put plain settings under environment and references to secrets under secrets:

{
  "containerDefinitions": [
    {
      "name": "saas-backend",
      "image": "account-id.dkr.ecr.region.amazonaws.com/saas-backend:latest",
      "environment": [
        {"name": "ENVIRONMENT", "value": "production"},
        {"name": "DEBUG", "value": "false"}
      ],
      "secrets": [
        {"name": "DATABASE_URL", "valueFrom": "arn:aws:secretsmanager:..."},
        {"name": "STRIPE_SECRET_KEY", "valueFrom": "arn:aws:secretsmanager:..."},
        {"name": "SECRET_KEY", "valueFrom": "arn:aws:secretsmanager:..."}
      ]
    }
  ]
}

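If rotation is enabled, AWS Secrets Manager rotates secrets automatically. ECS injects the secret values when a task starts, so running tasks keep the old value until they are replaced; redeploy the service after a rotation.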
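
If the secret behind `DATABASE_URL` holds JSON rather than a ready-made URL, the app has to assemble the connection string itself at startup. A minimal sketch, assuming the secret uses the keys `username`, `password`, `host`, `port`, and `dbname` (a layout often used for database credentials; verify it against your own secret). The function name `build_database_url` is hypothetical.

```python
import json
from urllib.parse import quote


def build_database_url(secret_json: str) -> str:
    """Build a PostgreSQL URL from a JSON secret string.

    Assumed keys: username, password, host, dbname, and optionally port.
    """
    secret = json.loads(secret_json)
    missing = [key for key in ("username", "password", "host", "dbname") if key not in secret]
    if missing:
        raise ValueError(f"secret is missing keys: {', '.join(missing)}")
    # Percent-encode credentials so characters like '@' or '/' cannot break the URL.
    user = quote(str(secret["username"]), safe="")
    password = quote(str(secret["password"]), safe="")
    port = secret.get("port", 5432)
    return f"postgresql://{user}:{password}@{secret['host']}:{port}/{secret['dbname']}"
```

Encoding the credentials matters because generated passwords often contain URL-reserved characters.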

Database Setup for Production

Use a managed database (RDS, Cloud SQL) to avoid ops overhead:

# AWS RDS PostgreSQL
aws rds create-db-instance \
  --db-instance-identifier saas-prod-db \
  --db-instance-class db.t3.micro \
  --engine postgres \
  --master-username postgres \
  --master-user-password <strong-password> \
  --allocated-storage 100 \
  --vpc-security-group-ids sg-xxxxxxxx \
  --backup-retention-period 30 \
  --multi-az  # For high availability

RDS handles backups, failover, and patching automatically. Set backup retention to 30 days for recovery options.

Running Database Migrations in Production

Never run migrations during traffic hours. Use a pre-deployment script:

#!/bin/bash
# scripts/migrate.sh
set -euo pipefail

echo "Connecting to production database..."
# Credentials containing characters such as @ or / must be URL-encoded.
export DATABASE_URL="postgresql://${PROD_DB_USER}:${PROD_DB_PASS}@${PROD_DB_HOST}/saas"

echo "Running migrations..."
alembic upgrade head

echo "Verifying schema..."
# Fail unless the database now reports the latest revision.
alembic current | grep -q "(head)"
echo "Schema OK"

echo "Migrations complete."

Add to CI/CD pipeline (before ECS deployment):

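# .github/workflows/deploy.yml
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      # Alembic and the app's models must be importable on the runner
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run database migrations
        env:
          PROD_DB_HOST: ${{ secrets.PROD_DB_HOST }}
          PROD_DB_USER: ${{ secrets.PROD_DB_USER }}
          PROD_DB_PASS: ${{ secrets.PROD_DB_PASS }}
        run: bash scripts/migrate.sh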

Load Balancing and Horizontal Scaling

Run multiple instances behind a load balancer:

          ┌─────────────────────┐
          │ Load Balancer (ALB) │  <- Receives all traffic
          └──────────┬──────────┘
        ┌────────────┼────────────┐
        ↓            ↓            ↓
   ┌────────┐   ┌────────┐   ┌────────┐
   │Instance│   │Instance│   │Instance│  <- ECS tasks
   │  Port  │   │  Port  │   │  Port  │
   │  8000  │   │  8000  │   │  8000  │
   └────────┘   └────────┘   └────────┘
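
To illustrate how the load balancer spreads requests across tasks, here is a toy round-robin sketch. Round robin is, to my knowledge, the default routing algorithm for ALB target groups, but treat that as an assumption; the task names and the helper `distribute_round_robin` are hypothetical.

```python
from collections import Counter
from itertools import cycle
from typing import Iterable


def distribute_round_robin(request_count: int, targets: Iterable[str]) -> Counter:
    """Assign requests to targets in turn and count how many each receives."""
    target_list = list(targets)
    if not target_list:
        raise ValueError("at least one target is required")
    rotation = cycle(target_list)
    return Counter(next(rotation) for _ in range(request_count))
```

Because any task may receive any request, the app should keep session state in a shared store such as the database or Redis rather than in process memory.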

Configure auto-scaling based on CPU or request count:

# AWS Application Auto Scaling
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/saas-prod/saas-backend \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 3 \
  --max-capacity 20

aws application-autoscaling put-scaling-policy \
  --policy-name cpu-scaling \
  --service-namespace ecs \
  --resource-id service/saas-prod/saas-backend \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
  "TargetValue=70,PredefinedMetricSpecification={PredefinedMetricType=ECSServiceAverageCPUUtilization}"

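With target tracking, Application Auto Scaling tries to hold average CPU near the 70% target: it adds tasks when CPU rises above the target and removes them, more conservatively, when CPU stays below it for a sustained period. This configuration sets no separate scale-in threshold, and the task count always stays between 3 and 20.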
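
To get a feel for how many tasks a 70% CPU target implies, a simplified proportional model helps. This is an illustration only, not the algorithm Application Auto Scaling actually runs: the rule `desired = ceil(current * observed / target)` and the function name are assumptions, while the bounds 3 and 20 come from the `register-scalable-target` call above.

```python
import math


def estimate_desired_count(
    current_tasks: int,
    observed_cpu: float,
    target_cpu: float = 70.0,
    min_tasks: int = 3,
    max_tasks: int = 20,
) -> int:
    """Estimate the task count that would bring average CPU back to the target."""
    if current_tasks < 1 or target_cpu <= 0:
        raise ValueError("current_tasks must be >= 1 and target_cpu must be > 0")
    # Proportional rule: total CPU demand divided by the per-task target.
    raw = math.ceil(current_tasks * observed_cpu / target_cpu)
    # Clamp to the registered minimum and maximum capacity.
    return max(min_tasks, min(max_tasks, raw))
```

Under this model, 3 tasks at 95% CPU suggest 5 tasks, while 3 tasks at 20% CPU stay at 3 because of the minimum capacity.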

Zero-Downtime Deployments

Deploy new versions without downtime using rolling updates:

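# .github/workflows/deploy.yml (continues the jobs: section)
  deploy:
    runs-on: ubuntu-latest
    needs: migrate  # run migrations before deploying new code
    env:
      REGISTRY: account-id.dkr.ecr.region.amazonaws.com
    steps:
      - uses: actions/checkout@v3
      # Assumes AWS credentials are already available to the runner.
      - name: Log in to ECR
        run: |
          aws ecr get-login-password --region us-east-1 \
            | docker login --username AWS --password-stdin "$REGISTRY"

      - name: Push to ECR
        run: |
          # Tag with the commit SHA for traceability and with latest,
          # which the task definition shown earlier references.
          docker build -t "$REGISTRY/saas-backend:${{ github.sha }}" \
            -t "$REGISTRY/saas-backend:latest" .
          docker push "$REGISTRY/saas-backend:${{ github.sha }}"
          docker push "$REGISTRY/saas-backend:latest"

      - name: Redeploy ECS service
        run: |
          aws ecs update-service \
            --cluster saas-prod \
            --service saas-backend \
            --force-new-deployment \
            --region us-east-1

      - name: Wait for deployment
        run: |
          aws ecs wait services-stable \
            --cluster saas-prod \
            --services saas-backend \
            --region us-east-1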

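ECS performs a rolling update: new tasks start and must pass health checks, the load balancer shifts traffic to them, and old tasks are then drained (they stop receiving new requests and finish in-flight ones) before being stopped. Users see no downtime as long as the old and new versions can run side by side.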
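
How many tasks ECS may start or stop at once during a rolling update is governed by the service's `minimumHealthyPercent` and `maximumPercent` deployment settings. A minimal sketch of that arithmetic, assuming the lower bound is rounded up and the upper bound rounded down, and using 100 and 200 as example values; `rolling_update_bounds` is a hypothetical helper, not an AWS API.

```python
from typing import Tuple


def rolling_update_bounds(
    desired_count: int,
    minimum_healthy_percent: int = 100,
    maximum_percent: int = 200,
) -> Tuple[int, int]:
    """Return (min_running, max_running) task counts allowed during a deployment."""
    if desired_count < 0:
        raise ValueError("desired_count must be non-negative")
    # Ceiling division for the lower bound, floor division for the upper bound.
    min_running = -(-(desired_count * minimum_healthy_percent) // 100)
    max_running = (desired_count * maximum_percent) // 100
    return min_running, max_running
```

With 3 desired tasks and the 100/200 values, the service never drops below 3 running tasks and may run up to 6 while old and new versions overlap, which is what makes the update zero-downtime.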

Monitoring and Alerting

Set up CloudWatch alarms for critical metrics:

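# Alert if targets return more than 10 5xx responses within 5 minutes
# (a true percentage needs metric math over RequestCount)
aws cloudwatch put-metric-alarm \
  --alarm-name saas-high-error-rate \
  --alarm-description "Alert if target 5xx count > 10 in 5 minutes" \
  --metric-name HTTPCode_Target_5XX_Count \
  --namespace AWS/ApplicationELB \
  --dimensions Name=LoadBalancer,Value=app/<lb-name>/<lb-id> \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --treat-missing-data notBreaching \
  --alarm-actions arn:aws:sns:us-east-1:account-id:pagerduty-topic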
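
A fixed count threshold and a percentage are different things: 10 errors is 10% of 100 requests but only 0.01% of 100,000. A minimal sketch of the rate calculation behind a "1% error rate" alert; in CloudWatch this would typically be expressed with metric math over the 5xx count and `RequestCount`, and the helper names here are hypothetical.

```python
def error_rate_percent(error_count: int, request_count: int) -> float:
    """Return the share of failed requests as a percentage."""
    if request_count == 0:
        # No traffic means no measurable error rate; treat it as healthy.
        return 0.0
    return 100.0 * error_count / request_count


def should_alert(error_count: int, request_count: int, threshold_percent: float = 1.0) -> bool:
    """Return True when the error rate is strictly above the threshold."""
    return error_rate_percent(error_count, request_count) > threshold_percent
```

A rate-based alert scales with traffic, whereas a count-based alert becomes noisier as request volume grows.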

Configure CloudWatch Dashboard:

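import json

import boto3

cloudwatch = boto3.client("cloudwatch")

# ALB metrics are published per load balancer. Replace this placeholder with
# the part of your load balancer's ARN that follows "loadbalancer/".
LB = "app/<lb-name>/<lb-id>"

cloudwatch.put_dashboard(
    DashboardName="SaaS-Backend-Dashboard",
    DashboardBody=json.dumps({
        "widgets": [
            {
                "type": "metric",
                "properties": {
                    "metrics": [
                        ["AWS/ApplicationELB", "RequestCount", "LoadBalancer", LB, {"stat": "Sum"}],
                        ["AWS/ApplicationELB", "TargetResponseTime", "LoadBalancer", LB],
                        ["AWS/ApplicationELB", "HTTPCode_Target_5XX_Count", "LoadBalancer", LB, {"stat": "Sum"}]
                    ],
                    "region": "us-east-1",
                    "period": 60,
                    "stat": "Average"
                }
            }
        ]
    })
)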

HTTPS and TLS Certificates

Use AWS Certificate Manager for free TLS certificates:

aws acm request-certificate \
  --domain-name saasapp.com \
  --subject-alternative-names "*.saasapp.com" \
  --validation-method DNS

Configure the load balancer to use the certificate:

aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:... \
  --protocol HTTPS \
  --port 443 \
  --default-actions Type=forward,TargetGroupArn=arn:aws:... \
  --certificates CertificateArn=arn:aws:acm:...

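Traffic between clients and the load balancer is encrypted, and ACM renews DNS-validated certificates automatically as long as the validation record stays in place. Traffic from the load balancer to the tasks remains plain HTTP on port 8000 unless you configure TLS there as well.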

Deployment Checklist

Before going live:

  • Database backups configured (30-day retention)
  • SSL/TLS certificate installed
  • Secrets stored in AWS Secrets Manager (not in code)
  • Environment variables set correctly (DEBUG=false in prod)
  • Health checks passing
  • Load balancer distributing traffic across instances
  • Auto-scaling configured (min 3 instances)
  • Sentry error tracking enabled
  • Prometheus metrics endpoint responding
  • CloudWatch alarms configured
  • Database migrations tested and applied
  • Monitoring dashboard accessible
  • Runbook (incident response guide) written

Key Takeaways

  • Docker containerizes your app; push to ECR and deploy via ECS or Kubernetes.
  • Use managed databases (RDS) for backups, failover, and patching.
  • Run database migrations before deploying new code.
  • Scale horizontally: run 3+ instances behind a load balancer with auto-scaling.
  • Use rolling updates for zero-downtime deployments.
  • Monitor with CloudWatch and Sentry; alert on high error rate and latency.

Frequently Asked Questions

Should I use Kubernetes or AWS ECS?

Kubernetes is complex but very flexible; use it if you're already experienced or need multi-cloud. ECS is AWS-native, simpler, and cheaper for small teams. Start with ECS; migrate to Kubernetes if you outgrow it.

How do I handle database connections in a load-balanced setup?

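Use connection pooling: PgBouncer or RDS Proxy. Each app process keeps its own pool of connections (SQLAlchemy's default is 5, plus up to 10 overflow), so the total grows with instances × workers × pool size and can exceed PostgreSQL's max_connections as you scale out. PgBouncer or RDS Proxy multiplexes many client connections onto a smaller number of database connections.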
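
The arithmetic behind connection exhaustion can be sketched as follows. The figures in the usage note are assumptions for illustration: 2 worker processes per task, pool limits of 5 pooled plus 10 overflow connections per process, and a `max_connections` of 100 on the database. Check the real values for your driver and instance class; the helper names are hypothetical.

```python
def peak_connections(tasks: int, workers_per_task: int, pool_size: int, max_overflow: int) -> int:
    """Worst-case number of database connections opened by the whole fleet."""
    return tasks * workers_per_task * (pool_size + max_overflow)


def fits_database_limit(peak: int, max_connections: int, reserved: int = 0) -> bool:
    """Check the fleet's peak against the database limit, minus reserved slots."""
    return peak <= max_connections - reserved
```

With 10 tasks, 2 workers each, and a 5 + 10 pool, the fleet can open up to 300 connections, three times the assumed limit of 100. This is the gap a pooler such as PgBouncer or RDS Proxy closes.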

What if a deployment fails?

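ECS detects unhealthy tasks (failed health checks) and replaces them. If the deployment circuit breaker is enabled with rollback, a failed deployment is rolled back automatically to the last successful deployment; without it, ECS keeps retrying the new tasks. For manual rollback: aws ecs update-service --cluster ... --service ... --task-definition saas-backend:5 (previous version).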

How do I backup the database?

Use RDS automated backups (enabled by default; the create-db-instance command above sets retention to 30 days, and the maximum is 35). For compliance (e.g., GDPR) or disaster recovery, also copy snapshots to another region. Test restore procedures quarterly.

Can I deploy from my laptop or must I use CI/CD?

Always use CI/CD (GitHub Actions, GitLab CI). Manual deployments risk mistakes and inconsistency, and they leave no audit trail. A CI/CD pipeline runs migrations, tests, and security scans, then deploys, in the same order every time.

Further Reading