
Schema Validation for Configs: Using Pydantic

Pydantic is a Python library for data validation and parsing that automatically catches configuration errors at startup before they cause runtime failures. Instead of reading environment variables into an untyped dictionary and hoping you did not misspell a key, you define a Pydantic model that declares the exact structure, types, and constraints of your configuration. Pydantic validates the data, raises errors for missing required fields or type mismatches, and gives you typed attributes your IDE can autocomplete.

A Pydantic Settings model integrates seamlessly with environment variables and .env files: it reads them automatically, validates them against your schema, and raises clear errors if validation fails. This pattern has become the standard in modern Python frameworks (FastAPI, Django Ninja, Litestar) and is used in production by companies like Slack and Spotify.

Installing Pydantic and Creating a Settings Model

Install Pydantic v2 together with pydantic-settings, the separate package that provides BaseSettings in v2:

pip install pydantic pydantic-settings

Create a config.py file with a Pydantic Settings model:

import sys

from pydantic import Field, ValidationError, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Application settings with validation."""

    # Pydantic v2 configuration (replaces the deprecated v1 inner `class Config`)
    model_config = SettingsConfigDict(
        env_file=".env",
        case_sensitive=False,  # DATABASE_PASSWORD and database_password both work
    )

    database_host: str = "localhost"
    database_port: int = 5432
    database_user: str  # required: no default
    database_password: str  # required: no default

    api_key: str = Field(description="API key for external service")
    api_url: str = "https://api.example.com"

    debug: bool = False
    log_level: str = "INFO"
    max_workers: int = 4

    @field_validator("database_port")
    @classmethod
    def validate_port(cls, v):
        if not (1 <= v <= 65535):
            raise ValueError("Port must be between 1 and 65535")
        return v

    @field_validator("log_level")
    @classmethod
    def validate_log_level(cls, v):
        valid = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
        if v.upper() not in valid:
            raise ValueError(f"Log level must be one of {sorted(valid)}")
        return v.upper()  # normalize to upper case


# At application startup: fail fast on a broken configuration
try:
    settings = Settings()
except ValidationError as e:
    print(f"Configuration error: {e}", file=sys.stderr)
    sys.exit(1)

# Use typed settings throughout
print(f"Connecting to {settings.database_host}:{settings.database_port}")

When you instantiate Settings(), Pydantic:

  1. Reads environment variables and the .env file (environment variables take precedence)
  2. Maps them to model fields (case-insensitive by default)
  3. Validates types (e.g., database_port must be an integer)
  4. Runs custom validators
  5. Raises a clear error if any field is invalid or a required field is missing

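If DATABASE_PASSWORD is missing, you get an error like this (the contents of input_value and the trailing documentation link depend on your environment and Pydantic version):

pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
database_password
  Field required [type=missing, input_value={...}, input_type=dict]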

This error happens at startup, so you know immediately if your configuration is broken.

Using Validators to Enforce Business Rules

Validators let you add custom logic beyond type checking. Common examples:

from typing import Annotated

from pydantic import Field, field_validator
from pydantic_settings import BaseSettings, NoDecode


class Settings(BaseSettings):
    database_url: str
    cache_ttl: int = Field(default=3600, gt=0)  # gt=0 means must be positive
    # The field holds a list. NoDecode (available in recent pydantic-settings
    # releases) stops the environment source from JSON-decoding the raw
    # comma-separated string before the validator below sees it.
    allowed_hosts: Annotated[list[str], NoDecode] = ["localhost", "127.0.0.1"]

    @field_validator("database_url")
    @classmethod
    def validate_database_url(cls, v):
        # Ensure database_url uses a supported scheme
        if not v.startswith(("postgres://", "mysql://", "sqlite://")):
            raise ValueError("Database URL must start with a valid scheme")
        return v

    @field_validator("cache_ttl")
    @classmethod
    def validate_cache_ttl(cls, v):
        # Redundant with gt=0 above; kept to show a hand-written range check
        if v < 1:
            raise ValueError("Cache TTL must be at least 1 second")
        return v

    @field_validator("allowed_hosts", mode="before")
    @classmethod
    def parse_allowed_hosts(cls, v):
        # Convert a comma-separated string to a list before type validation
        if isinstance(v, str):
            return [host.strip() for host in v.split(",")]
        return v


settings = Settings(
    database_url="postgres://localhost/mydb",
    cache_ttl=7200,
    allowed_hosts="example.com,staging.example.com",
)

print(settings.allowed_hosts)  # ['example.com', 'staging.example.com']

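Validators let you normalize data (convert strings to lists, parse URLs, decrypt secrets) or enforce constraints (port ranges, URL schemes, positive numbers). A mode="before" validator receives the raw input ahead of Pydantic's type conversion; a validator in the default "after" mode runs once the value already has the declared type.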
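The difference between the two modes is easiest to see side by side. This is a small sketch, assuming a hypothetical HostConfig model: the "before" validator turns a raw string into a list, and the "after" validator then works on the already-typed list.

```python
from pydantic import BaseModel, field_validator


class HostConfig(BaseModel):
    # Hypothetical model used only for illustration
    allowed_hosts: list[str]

    @field_validator("allowed_hosts", mode="before")
    @classmethod
    def split_commas(cls, value):
        # Runs on the raw input, before Pydantic checks it against list[str]
        if isinstance(value, str):
            return [part.strip() for part in value.split(",") if part.strip()]
        return value

    @field_validator("allowed_hosts")
    @classmethod
    def lowercase_hosts(cls, value: list[str]) -> list[str]:
        # Runs after type validation, so value is guaranteed to be list[str]
        return [host.lower() for host in value]


cfg = HostConfig(allowed_hosts="Example.com, Staging.Example.com")
print(cfg.allowed_hosts)  # ['example.com', 'staging.example.com']
```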

Multi-Environment Configurations with Pydantic

Handle different environments cleanly:

import os

from dotenv import load_dotenv  # python-dotenv, a dependency of pydantic-settings
from pydantic import field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    environment: str = "development"
    database_host: str
    database_name: str = "myapp"
    log_level: str = "INFO"

    @field_validator("environment")
    @classmethod
    def validate_environment(cls, v):
        valid = {"development", "staging", "production"}
        if v not in valid:
            raise ValueError(f"Environment must be one of {sorted(valid)}")
        return v


# Pick the environment; make sure ENVIRONMENT is set for Settings() to read
env = os.getenv("ENVIRONMENT", "development")
os.environ.setdefault("ENVIRONMENT", env)

# For staging or production, load .env.<environment> into os.environ.
# override=True lets the file replace variables that are already set, and
# real environment variables outrank the base .env file read by Settings().
if env != "development":
    load_dotenv(f".env.{env}", override=True)

settings = Settings()

# Environment-specific overrides. Plain assignment replaces the loaded value
# and is not re-validated unless validate_assignment=True is configured.
if settings.environment == "production":
    settings.log_level = "WARNING"
    settings.database_host = "prod-db.internal"
elif settings.environment == "staging":
    settings.database_host = "staging-db.internal"

Automatic Documentation and Type Safety

Pydantic models provide introspection and type hints your IDE understands:

import json

from pydantic import BaseModel, Field


class DatabaseSettings(BaseModel):
    host: str = Field(description="Database server hostname")
    port: int = Field(default=5432, description="Database port (1-65535)")
    user: str = Field(description="Database user")
    password: str = Field(description="Database password (from env var)")


# IDE autocomplete works
db = DatabaseSettings(host="localhost", port=5432, user="admin", password="secret")
print(db.host)  # IDE knows this is a string

# Generate JSON schema (model_json_schema() returns a dict)
print(json.dumps(DatabaseSettings.model_json_schema(), indent=2))

Output (abridged; the generated "title" and top-level "type" keys are omitted):

{
  "properties": {
    "host": {"type": "string", "description": "Database server hostname"},
    "port": {"type": "integer", "default": 5432, "description": "Database port (1-65535)"},
    "user": {"type": "string", "description": "Database user"},
    "password": {"type": "string", "description": "Database password (from env var)"}
  },
  "required": ["host", "user", "password"]
}

This schema can be used for documentation, API specs, or configuration generation.
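As one illustration of configuration generation, the same field metadata can drive a .env template. This is a sketch only: env_template is a hypothetical helper, not part of Pydantic, and it relies on the model_fields mapping that Pydantic v2 exposes on model classes.

```python
from pydantic import BaseModel, Field


class DatabaseSettings(BaseModel):
    host: str = Field(description="Database server hostname")
    port: int = Field(default=5432, description="Database port (1-65535)")


def env_template(model: type[BaseModel]) -> str:
    """Hypothetical helper: render a .env skeleton from field metadata."""
    lines = []
    for name, info in model.model_fields.items():
        if info.description:
            lines.append(f"# {info.description}")
        # Required fields have no default, so leave the value blank
        value = "" if info.is_required() else info.default
        lines.append(f"{name.upper()}={value}")
    return "\n".join(lines)


template = env_template(DatabaseSettings)
print(template)
```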

Comparing Configuration Approaches

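Approach                 | Type Safety | Validation | Ease of Use | For Production
-------------------------|-------------|------------|-------------|------------------------------
Plain dict + os.getenv() | None        | Manual     | Easy        | Poor: errors at runtime
dataclass + validation   | Good        | Manual     | Good        | Fair
Pydantic Settings        | Excellent   | Automatic  | Excellent   | Excellent: errors at startup
ConfigParser (INI files) | None        | Manual     | Moderate    | Fair: no type checking
YAML + validation        | Good        | Manual     | Good        | Good if encrypted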

Pydantic is the clear winner for type safety and developer experience.
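For contrast, here is roughly what the "dataclass + validation" row involves. It is a standard-library sketch with hypothetical names; every conversion and range check that Pydantic derives from the type hints has to be written by hand.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class ManualSettings:
    database_host: str
    database_port: int

    def __post_init__(self):
        # Manual constraint check; nothing enforces the type hints themselves
        if not 1 <= self.database_port <= 65535:
            raise ValueError("database_port must be between 1 and 65535")


def load_manual_settings(env: Mapping[str, str] = os.environ) -> ManualSettings:
    """Read, convert, and validate each value explicitly."""
    return ManualSettings(
        database_host=env.get("DATABASE_HOST", "localhost"),
        database_port=int(env.get("DATABASE_PORT", "5432")),  # manual cast
    )


manual = load_manual_settings({"DATABASE_PORT": "6000"})
print(manual)
```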

Key Takeaways

  • Pydantic Settings automatically reads environment variables and .env files, validates them against a schema, and raises errors at startup if validation fails.
  • Define a Settings class with typed fields, defaults, and Field descriptions; Pydantic enforces types and constraints automatically.
  • Use @field_validator decorators to add custom validation logic (URL parsing, range checks, enum validation) without boilerplate.
  • With case_sensitive=False, which is the pydantic-settings default, environment variable names match in any case (DATABASE_PASSWORD and database_password both work).
  • Typed fields give you IDE autocomplete, and model_json_schema() generates a JSON schema you can use for documentation, which improves productivity and reduces bugs.

Frequently Asked Questions

How do I keep secrets out of Pydantic config logs?

Declare sensitive fields as SecretStr (from pydantic), which masks the value in repr(), str(), and JSON output until you call get_secret_value(). Field(repr=False) hides a field from repr only, and Field(exclude=True) leaves it out of model_dump(). As a second line of defense, configure your logger to redact keys matching *password, *key, *token.
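One way to keep a secret out of logs is Pydantic's SecretStr type. The sketch below uses a hypothetical Credentials model to show that the raw value appears only when it is requested explicitly:

```python
from pydantic import BaseModel, SecretStr


class Credentials(BaseModel):
    # Hypothetical model used only for illustration
    user: str
    password: SecretStr


creds = Credentials(user="admin", password="s3cret")

print(creds)                              # the password is masked
print(creds.password.get_secret_value())  # explicit opt-in to the raw value
```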

Can Pydantic load configuration from multiple sources (file + env + secrets manager)?

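Yes. pydantic-settings already merges several sources by priority: arguments passed to Settings(), environment variables, the .env file, a secrets directory (secrets_dir), and field defaults. For anything else, you can resolve a placeholder inside a validator, for example a @field_validator("database_password") that returns fetch_from_vault() when the value is "FETCH_FROM_VAULT" and returns the value unchanged otherwise. Or load secrets before instantiating Settings: os.environ["API_KEY"] = fetch_from_aws_secrets_manager(). Overriding the settings_customise_sources classmethod lets you add a custom source or reorder the built-in ones.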

What is the difference between Field() and field_validator()?

Field() declares a model attribute with type, default, constraints (like gt=0), and descriptions. field_validator() is a function decorator that adds custom validation logic. Use both together: Field(gt=0) ensures the value is positive; a validator can add domain-specific rules like ensure_power_of_two().
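A short sketch of the two working together, assuming a hypothetical WorkerConfig model: Field(gt=0) rejects non-positive values, and the validator adds the power-of-two rule.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class WorkerConfig(BaseModel):
    # Declarative constraint: must be greater than zero
    batch_size: int = Field(default=8, gt=0, description="Items per batch")

    @field_validator("batch_size")
    @classmethod
    def ensure_power_of_two(cls, value: int) -> int:
        # Domain rule: a positive power of two has exactly one bit set
        if value & (value - 1) != 0:
            raise ValueError("batch_size must be a power of two")
        return value


def is_valid(size: int) -> bool:
    """Return True if WorkerConfig accepts the given batch size."""
    try:
        WorkerConfig(batch_size=size)
        return True
    except ValidationError:
        return False


print(is_valid(16), is_valid(12), is_valid(0))
```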

How do I handle optional configuration (fields that might not be set)?

Annotate the field as Optional[T] = None, or as T | None = None on Python 3.10 and later. Reserve None for values that may genuinely be absent; when a sensible fallback exists, give the field a plain default instead: api_timeout: int = 30 (defaults to 30 seconds) versus api_timeout: int | None = None (may be unset).
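The distinction can be sketched with a hypothetical HttpClientConfig model. Optional is used here so the example also runs on Python versions older than 3.10:

```python
from typing import Optional

from pydantic import BaseModel


class HttpClientConfig(BaseModel):
    api_timeout: int = 30            # always has a value; falls back to 30
    proxy_url: Optional[str] = None  # may legitimately be absent


default_cfg = HttpClientConfig()
custom_cfg = HttpClientConfig(api_timeout=5, proxy_url="http://proxy.internal:3128")

# model_fields_set records which fields were supplied explicitly
print(default_cfg.model_fields_set, custom_cfg.model_fields_set)
```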

Can I reload configuration without restarting the application?

Yes, but carefully. Store settings in a variable and rebuild it: new_settings = Settings(); app.settings = new_settings. For thread-safe reloads, use a lock. For long-running services (web servers), provide a /reload-config endpoint that re-instantiates Settings and applies changes to live connections.

Further Reading