Secrets Management in Python: API Keys and Credentials

Secrets are sensitive credentials that grant access to external systems: API keys, database passwords, encryption keys, and authentication tokens. Secrets hardcoded in source code are visible to anyone with repository access, can end up in logs, and become public the moment the repository is pushed to a public host such as GitHub. Attackers continuously scan public repositories for hardcoded secrets, using the same pattern-matching techniques as scanners like truffleHog; defensive tools such as git-secrets exist to block such commits before they happen. Managing secrets outside of source code and restricting access to them is a core responsibility of every application developer. The Python ecosystem offers several patterns for secure secret management, from simple environment variables to dedicated vault systems.

The Dangers of Hardcoded Secrets

The single worst mistake in secrets management is hardcoding credentials in source code. This anti-pattern is common and has compromised countless production systems:

# INSECURE — never hardcode secrets
import psycopg2

conn = psycopg2.connect(
    host='prod-db.example.com',
    database='users',
    user='admin',
    password='SuperSecret123!',  # EXPOSED on GitHub!
)

When this code is pushed to a public GitHub repository, automated bots can scrape it and extract the password within minutes. The attacker can now connect to your production database. Even if you delete the line in a later commit, the password remains in the Git history. To contain the leak, you must change the password in the database and audit access logs to see whether an attacker connected.

Using Environment Variables with `python-dotenv`

The standard practice is to read secrets from environment variables, which are set by the deployment environment (Docker, Kubernetes, CI/CD platform) and not stored in source code. The python-dotenv library loads environment variables from a .env file during development:

# Secure approach using environment variables
import os
import psycopg2
from dotenv import load_dotenv

# Load variables from .env file (development only)
load_dotenv()

# Read secrets from environment
db_host = os.getenv('DB_HOST')
db_name = os.getenv('DB_NAME')
db_user = os.getenv('DB_USER')
db_password = os.getenv('DB_PASSWORD')

conn = psycopg2.connect(
    host=db_host,
    database=db_name,
    user=db_user,
    password=db_password,
)

The .env file is stored locally (not in git) and contains development credentials:

# .env (local file, added to .gitignore)
DB_HOST=localhost
DB_NAME=dev_users
DB_USER=dev_user
DB_PASSWORD=DevPassword123!
API_KEY=sk_test_12345678

In production, environment variables are injected by the deployment platform (Docker, Kubernetes, AWS Lambda) without needing a .env file. Python reads them directly from the environment.
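
One gap in the example above is that os.getenv returns None when a variable is unset, so a missing secret may only surface later as a confusing connection error. The following is a minimal sketch of a fail-fast helper using only the standard library; the names require_env, load_db_config, and MissingSecretError are hypothetical and not part of python-dotenv or psycopg2.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail loudly."""
    value = os.environ.get(name)
    if not value:
        # Name the variable in the error, but never echo any value
        raise MissingSecretError(f"Required environment variable {name!r} is not set")
    return value


def load_db_config() -> dict:
    """Collect the four database settings used in the example above."""
    return {
        "host": require_env("DB_HOST"),
        "database": require_env("DB_NAME"),
        "user": require_env("DB_USER"),
        "password": require_env("DB_PASSWORD"),
    }
```

The resulting dictionary can be passed straight to the connection call, for example psycopg2.connect(**load_db_config()), so a missing variable stops the application at startup rather than at the first query.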

Set up Git to prevent accidental commits of .env files:

# Add to .gitignore
echo ".env" >> .gitignore
echo ".env.local" >> .gitignore
echo "*.key" >> .gitignore

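# Check for hardcoded secrets before committing
pip install detect-secrets
detect-secrets scan > .secrets.baseline             # create the initial baseline
detect-secrets scan --baseline .secrets.baseline    # update an existing baseline later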

The detect-secrets tool scans your repository for patterns that look like secrets (long random strings, common key names) and can be integrated into a Git pre-commit hook.
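
To make the "long random strings" idea concrete, here is a rough sketch of an entropy check of the kind secret scanners rely on. It is an illustration only, not how detect-secrets is implemented; the function names, the minimum length of 20, and the threshold of 3.5 bits per character are all assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20, threshold: float = 3.5) -> bool:
    """Flag long tokens whose characters are close to uniformly distributed."""
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

Ordinary words and repeated characters score low, while randomly generated keys score high. A real scanner combines this kind of signal with known key formats and an allowlist to keep false positives manageable.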

Validation and Type Safety for Secrets

Use typed configuration classes to validate that all required secrets are present and valid:

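# Type-safe configuration with validation
# Requires pydantic v2 and the pydantic-settings package
from typing import Optional

import psycopg2
from pydantic import ValidationError, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Application configuration loaded from environment."""

    # Read .env in development; ignore variables that are not declared below
    model_config = SettingsConfigDict(
        env_file='.env',
        env_file_encoding='utf-8',
        extra='ignore',
    )

    # Database secrets
    db_host: str
    db_name: str
    db_user: str
    db_password: str

    # API keys
    stripe_api_key: str
    sendgrid_api_key: Optional[str] = None

    # Non-secret settings
    debug: bool = False

    @field_validator('stripe_api_key')
    @classmethod
    def validate_stripe_key(cls, v: str) -> str:
        if not v.startswith('sk_'):
            raise ValueError("Stripe API key must start with 'sk_'")
        return v

# Load and validate settings
try:
    settings = Settings()
except ValidationError as e:
    # Report field names and messages only; the full error text can echo input values
    for err in e.errors():
        field = '.'.join(str(part) for part in err['loc'])
        print(f"Configuration error in {field}: {err['msg']}")
    raise SystemExit(1)

# Use settings safely
conn = psycopg2.connect(
    host=settings.db_host,
    database=settings.db_name,
    user=settings.db_user,
    password=settings.db_password,
)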

By using a configuration class, you ensure all required secrets are present at startup and fail fast if any are missing or invalid.

Using Vault Systems for Production

For production systems, especially in large organizations, use a dedicated vault system like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault. These systems provide:

Centralized secret storage.
Access logging and auditing.
Encryption at rest and in transit.
Automated secret rotation.
Permission management (who can read which secrets).

Here is an example using AWS Secrets Manager:

# Fetch secrets from AWS Secrets Manager
import json

import boto3
import psycopg2

def get_secret(secret_name: str) -> dict:
    """Retrieve a JSON secret from AWS Secrets Manager."""
    client = boto3.client('secretsmanager', region_name='us-east-1')
    try:
        response = client.get_secret_value(SecretId=secret_name)
    except client.exceptions.ResourceNotFoundException as exc:
        raise ValueError(f"Secret '{secret_name}' not found") from exc
    if 'SecretString' not in response:
        # Binary secrets arrive under 'SecretBinary'; this helper handles JSON text only
        raise ValueError(f"Secret '{secret_name}' has no SecretString value")
    return json.loads(response['SecretString'])

# Usage
db_secret = get_secret('prod/database')
api_secret = get_secret('prod/stripe-key')

# Connect using retrieved secrets
conn = psycopg2.connect(
    host=db_secret['host'],
    database=db_secret['name'],
    user=db_secret['user'],
    password=db_secret['password'],
)

Vault systems are significantly more secure than environment variables for production because they provide encryption, rotation, and access control. They are especially important for teams that need compliance (SOC 2, HIPAA, PCI DSS).
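
Fetching from a vault on every request adds latency and cost, while caching a secret forever defeats rotation. One possible compromise is a short-lived in-memory cache, sketched below. The class name CachedSecretStore, its parameters, and the 300-second default are assumptions for illustration; the fetch argument can be any function that takes a secret name, such as the get_secret helper above.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedSecretStore:
    """Cache secrets in memory for a limited time to reduce vault calls."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, name: str) -> Any:
        """Return a cached value if it is still fresh, otherwise refetch it."""
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = self._fetch(name)
        self._cache[name] = (now, value)
        return value

    def invalidate(self, name: str) -> None:
        """Drop a cached value, for example after an authentication failure."""
        self._cache.pop(name, None)
```

The time-to-live bounds how long the application keeps using a credential after it has been rotated, and calling invalidate when a connection is rejected forces an immediate refetch.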

Preventing Secrets in Logs

Even with proper secret management, secrets can leak through logs. Never log secrets:

# INSECURE — logs the password
import logging

logger = logging.getLogger(__name__)
logger.info(f"Connecting to {db_host} with password {db_password}")

# SECURE — never log secrets
logger.info(f"Connecting to {db_host}")
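
Another way to make the insecure line above harmless is to wrap secrets in a type that masks itself whenever it is formatted. The sketch below uses a hypothetical Secret class written for this example; pydantic's SecretStr offers a similar, more complete facility.

```python
class Secret:
    """Hold a sensitive string and mask it in repr(), str(), and f-strings."""

    __slots__ = ("_value",)

    def __init__(self, value: str) -> None:
        self._value = value

    def reveal(self) -> str:
        """Return the real value; call this only where it is actually needed."""
        return self._value

    def __repr__(self) -> str:
        return "Secret('********')"

    def __str__(self) -> str:
        return "********"
```

With db_password = Secret(os.getenv('DB_PASSWORD')), an accidental f-string or log call prints asterisks, and the real value is only passed where it is explicitly requested, as in password=db_password.reveal().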

Use a logging filter to automatically redact secrets:

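import logging
import re

class SecretRedactingFilter(logging.Filter):
    """Redact key=value style secrets from log messages."""

    def __init__(self):
        super().__init__()
        # Patterns to redact: passwords, API keys, and tokens written as
        # key=value or key: value (note \s, not \\s, inside the raw strings)
        self.patterns = [
            re.compile(r'(password["\']?\s*[:=]\s*)["\']?[^"\'\s]+["\']?', re.IGNORECASE),
            re.compile(r'(api[_-]?key["\']?\s*[:=]\s*)["\']?[^"\'\s]+["\']?', re.IGNORECASE),
            re.compile(r'(token["\']?\s*[:=]\s*)["\']?[^"\'\s]+["\']?', re.IGNORECASE),
        ]

    def filter(self, record: logging.LogRecord) -> bool:
        # Merge the arguments into the message first, then redact the result
        message = record.getMessage()
        for pattern in self.patterns:
            message = pattern.sub(r'\1***REDACTED***', message)
        record.msg = message
        record.args = None  # already merged into msg; prevents re-formatting
        return True

# Attach the filter to the handlers: a filter added to a logger is not
# applied to records propagated from child loggers
logging.basicConfig(level=logging.INFO)
for handler in logging.getLogger().handlers:
    handler.addFilter(SecretRedactingFilter())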

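With this filter, secrets written in key=value or key: value form are redacted automatically. Treat it as a safety net rather than a guarantee: a message like the insecure example above, which has no separator between the word "password" and the value, does not match these patterns, so the first rule is still to keep secrets out of log calls.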

Key Takeaways

Never hardcode secrets in source code; use environment variables or vault systems instead.
Use python-dotenv to load environment variables from a .env file during development; .env should never be committed to Git.
Validate that all required secrets are present at application startup using typed configuration classes.
In production, use a dedicated vault system (AWS Secrets Manager, HashiCorp Vault) for encryption, rotation, and access control.
Scan source code with detect-secrets and integrate it into pre-commit hooks to prevent accidental commits of secrets.

Frequently Asked Questions

Should I commit `.env.example` to show what environment variables are needed?

Yes. Create a .env.example file with placeholder values (not real credentials) and commit it to Git. This shows what secrets the application needs without exposing real values.
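
A committed .env.example also makes it possible to check a local .env for missing entries. The following is a small sketch using only the standard library; the function names are hypothetical, and the parser handles only simple KEY=VALUE lines, comments, and an optional export prefix, not the full syntax python-dotenv accepts.

```python
from pathlib import Path
from typing import Set


def env_keys(path: Path) -> Set[str]:
    """Return the variable names defined in a KEY=VALUE style file."""
    keys = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        keys.add(key)
    return keys


def missing_keys(example: Path, actual: Path) -> Set[str]:
    """Names listed in the example file but absent from the actual file."""
    return env_keys(example) - env_keys(actual)
```

Running missing_keys(Path('.env.example'), Path('.env')) after pulling new changes shows which variables a teammate has added since your .env was created.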

Can I use GitHub Secrets to store API keys?

GitHub Secrets are suitable for CI/CD workflows, such as storing credentials for automated deploys, but they are only exposed to workflow runs, not to your running application. Application secrets must be injected at runtime from a vault or environment variables.

How often should I rotate secrets?

Rotate secrets immediately when a compromise is suspected. For routine rotation, every 30–90 days is a common policy. Vault systems can automate this.

What if I accidentally commit a secret?

Immediately invalidate the credential in the remote system (change the password, revoke the API key). Then remove it from Git history using git filter-repo or BFG Repo-Cleaner. Deleting the secret in a new commit is not enough, because the old value persists in Git history and in any existing clones.

Can I use encrypted secrets in environment variables?

You can encrypt secrets and store the encrypted version in the environment, then decrypt at runtime. However, you must store the decryption key somewhere, which creates a circular dependency. Vault systems handle this more elegantly.
