Pydantic Serialization: Export and Transform Data
Validation brings data in; serialization takes it out. After you validate an API request into a Pydantic model, you usually need to export it: as JSON, as a row for a database, or as input to another system. Pydantic provides two primary serialization methods, model_dump() (to a Python dict) and model_dump_json() (to a JSON string), plus options for customizing the output: excluding fields, renaming keys, controlling datetime format, and adding computed properties. This article covers serialization from basic to advanced use cases.
Basic Serialization: Dict and JSON
Once a model is instantiated and validated, exporting it takes a single method call:
from datetime import datetime

# EmailStr needs the optional email-validator package
from pydantic import BaseModel, EmailStr

class User(BaseModel):
    id: int
    name: str
    email: EmailStr
    created_at: datetime

# Create and validate
user = User(
    id=1,
    name="Alice",
    email="alice@example.com",
    created_at="2026-06-01T10:00:00Z"
)

# Export to dict
user_dict = user.model_dump()
print(user_dict)
# Output: {
#     'id': 1,
#     'name': 'Alice',
#     'email': 'alice@example.com',
#     'created_at': datetime.datetime(2026, 6, 1, 10, 0, tzinfo=...)
# }

# Export to JSON string
user_json = user.model_dump_json()
print(user_json)
# Output: {"id":1,"name":"Alice","email":"alice@example.com","created_at":"2026-06-01T10:00:00Z"}

# With pretty-printing
user_json_pretty = user.model_dump_json(indent=2)
print(user_json_pretty)
# Output:
# {
#   "id": 1,
#   "name": "Alice",
#   "email": "alice@example.com",
#   "created_at": "2026-06-01T10:00:00Z"
# }
Note: model_dump() returns Python types (datetime as datetime object), while model_dump_json() serializes to JSON-compatible strings (datetime as ISO 8601 string).
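To make the type difference concrete, here is a minimal round-trip sketch. The Reading model and its fields are hypothetical, chosen only for illustration; model_validate_json() is the Pydantic v2 counterpart for parsing JSON back into a model.

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class Reading(BaseModel):  # hypothetical model for illustration
    sensor_id: int
    taken_at: datetime


reading = Reading(sensor_id=7, taken_at=datetime(2026, 6, 1, 10, 0, tzinfo=timezone.utc))

# Serialize to a JSON string, then parse and validate it back into a model.
payload = reading.model_dump_json()
restored = Reading.model_validate_json(payload)

# The datetime travels as an ISO 8601 string and comes back as a datetime.
print(type(restored.taken_at).__name__)  # datetime
print(restored == reading)  # True
```

The JSON string is the form to send over the wire; the dict from model_dump() is the form to keep working with in Python.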
Excluding Fields
API responses often exclude sensitive data (passwords, tokens) or internal fields:
from pydantic import BaseModel

class UserAccount(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    api_token: str

user = UserAccount(
    id=1,
    name="Alice",
    email="alice@example.com",
    password_hash="hashed_secret_123",
    api_token="sk-abc123..."
)

# Exclude sensitive fields
safe_user = user.model_dump(exclude={"password_hash", "api_token"})
print(safe_user)
# Output: {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# For JSON response
safe_json = user.model_dump_json(exclude={"password_hash", "api_token"})
Use exclude to prevent leaking internal fields to clients.
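Passing exclude on every call is easy to forget. One possible alternative, sketched here with a trimmed-down version of the model above, is to mark the field with Field(exclude=True) so it is left out of every dump by default.

```python
from pydantic import BaseModel, Field


class UserAccount(BaseModel):
    id: int
    name: str
    # exclude=True keeps the field on the model but omits it from every dump.
    password_hash: str = Field(exclude=True)


account = UserAccount(id=1, name="Alice", password_hash="hashed_secret_123")

print(account.password_hash)      # still available in Python code
print(account.model_dump())       # {'id': 1, 'name': 'Alice'}
print(account.model_dump_json())  # {"id":1,"name":"Alice"}
```

The trade-off is that the field is then excluded everywhere, including internal dumps, so this suits values that should never leave the process.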
Including Only Specific Fields
For API endpoints that return only a subset of fields (list views vs. detail views):
from pydantic import BaseModel

class Product(BaseModel):
    id: int
    name: str
    description: str
    price: float
    in_stock: bool

product = Product(
    id=1,
    name="Laptop",
    description="High-performance laptop for developers",
    price=999.99,
    in_stock=True
)

# List view: only essential fields
list_view = product.model_dump(include={"id", "name", "price"})
print(list_view)
# Output: {'id': 1, 'name': 'Laptop', 'price': 999.99}

# Detail view: all fields
detail_view = product.model_dump()
Use include to return only the fields needed for a given response type.
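When several endpoints share a model, it may help to name the include sets once and reuse them. The sketch below assumes two hypothetical views, LIST_VIEW and STOCK_VIEW; neither name comes from Pydantic.

```python
from pydantic import BaseModel


class Product(BaseModel):
    id: int
    name: str
    description: str
    price: float
    in_stock: bool


# Hypothetical view definitions: one include set per response type.
LIST_VIEW = {"id", "name", "price"}
STOCK_VIEW = {"id", "in_stock"}

products = [
    Product(id=1, name="Laptop", description="For developers", price=999.99, in_stock=True),
    Product(id=2, name="Mouse", description="Wireless", price=24.5, in_stock=False),
]

list_rows = [p.model_dump(include=LIST_VIEW) for p in products]
stock_rows = [p.model_dump(include=STOCK_VIEW) for p in products]

print(list_rows[1])   # {'id': 2, 'name': 'Mouse', 'price': 24.5}
print(stock_rows[0])  # {'id': 1, 'in_stock': True}
```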
Field Renaming and Custom Keys
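Sometimes your Python field names don't match the API contract. Use Field() with alias, which applies to both input and output, or with validation_alias and serialization_alias to set each direction separately: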
from pydantic import BaseModel, Field

class Order(BaseModel):
    order_id: int = Field(alias="id")
    total_amount: float = Field(alias="total")
    created_timestamp: str = Field(alias="createdAt")

# Input uses aliases (from API)
data = {
    "id": 123,
    "total": 99.99,
    "createdAt": "2026-06-01T10:00:00Z"
}
order = Order(**data)
print(order.order_id)  # 123

# Output: use by_alias=True to match API contract
api_response = order.model_dump(by_alias=True)
print(api_response)
# Output: {'id': 123, 'total': 99.99, 'createdAt': '2026-06-01T10:00:00Z'}

# Without by_alias, Python names are used
python_view = order.model_dump(by_alias=False)
print(python_view)
# Output: {'order_id': 123, 'total_amount': 99.99, 'created_timestamp': '2026-06-01T10:00:00Z'}
Aliases decouple your Python naming conventions (snake_case) from API contracts (camelCase).
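If every field follows the same snake_case to camelCase rule, writing each alias by hand becomes repetitive. A sketch of one option, assuming Pydantic v2's alias_generator setting and the to_camel helper from pydantic.alias_generators (the field names here are illustrative):

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class Order(BaseModel):
    # Generate a camelCase alias for every snake_case field.
    model_config = ConfigDict(alias_generator=to_camel)

    order_id: int
    total_amount: float
    created_at: str


order = Order.model_validate(
    {"orderId": 123, "totalAmount": 99.99, "createdAt": "2026-06-01T10:00:00Z"}
)

print(order.model_dump(by_alias=True))
# {'orderId': 123, 'totalAmount': 99.99, 'createdAt': '2026-06-01T10:00:00Z'}
```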
Serialization Modes
Control how complex types are serialized:
from datetime import datetime

from pydantic import BaseModel

class Event(BaseModel):
    id: int
    timestamp: datetime
    data: dict

event = Event(
    id=1,
    timestamp=datetime(2026, 6, 1, 10, 0, 0),
    data={"key": "value"}
)

# Mode "python" (default): native Python types
python_dump = event.model_dump(mode="python")
print(type(python_dump["timestamp"]))  # <class 'datetime.datetime'>

# Mode "json": JSON-serializable types (strings, numbers, lists, dicts)
json_dump = event.model_dump(mode="json")
print(type(json_dump["timestamp"]))  # <class 'str'>
print(json_dump["timestamp"])  # "2026-06-01T10:00:00"
Use mode="python" for internal processing, mode="json" for API responses.
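The same distinction applies to other non-JSON types. The following sketch uses a hypothetical Invoice model to show how mode="json" is expected to turn a UUID, a Decimal, and a set into values that the standard json module accepts.

```python
import json
from decimal import Decimal
from uuid import UUID

from pydantic import BaseModel


class Invoice(BaseModel):  # hypothetical model for illustration
    id: UUID
    amount: Decimal
    tags: set[str]


invoice = Invoice(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    amount=Decimal("19.99"),
    tags={"paid"},
)

python_dump = invoice.model_dump()           # keeps UUID, Decimal, set
json_dump = invoice.model_dump(mode="json")  # plain str and list values

# json.dumps() would raise TypeError on python_dump, but accepts json_dump.
encoded = json.dumps(json_dump)
print(json_dump["id"])    # 12345678-1234-5678-1234-567812345678
print(json_dump["tags"])  # ['paid']
```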
Custom Serialization with Computed Fields
For derived/computed fields, use @computed_field:
from pydantic import BaseModel, computed_field

class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height

    @computed_field
    @property
    def perimeter(self) -> float:
        return 2 * (self.width + self.height)

# Integer inputs are coerced to float because the fields are typed as float
rect = Rectangle(width=10, height=5)
print(rect.model_dump())
# Output: {
#     'width': 10.0,
#     'height': 5.0,
#     'area': 50.0,
#     'perimeter': 30.0
# }
Computed fields are calculated at serialization time and included in dumps automatically.
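Because the property runs on each dump, the output follows later changes to the underlying fields. A short sketch of that behavior, reusing a reduced Rectangle and assuming default model settings (no frozen models):

```python
from pydantic import BaseModel, computed_field


class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height


rect = Rectangle(width=10, height=5)
print(rect.model_dump()["area"])  # 50.0

# The property is re-evaluated on every dump, so it tracks field changes.
rect.width = 20.0
print(rect.model_dump()["area"])  # 100.0

# Computed fields can be excluded like regular fields.
print(rect.model_dump(exclude={"area"}))  # {'width': 20.0, 'height': 5.0}
```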
Nested Model Serialization
When a model contains nested models, include and exclude can reach into them:
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    zip_code: str
    country: str

class User(BaseModel):
    name: str
    email: str
    address: Address
    is_admin: bool = False

user_data = {
    "name": "Alice",
    "email": "alice@example.com",
    "address": {
        "street": "123 Main",
        "city": "NYC",
        "zip_code": "10001",
        "country": "USA"
    },
    "is_admin": False
}
user = User(**user_data)

# Exclude nested fields with nested dict syntax
safe_user = user.model_dump(exclude={
    "is_admin": True,
    "address": {"zip_code", "country"}
})
print(safe_user)
# Output: {
#     'name': 'Alice',
#     'email': 'alice@example.com',
#     'address': {
#         'street': '123 Main',
#         'city': 'NYC'
#     }
# }
Nested exclusions use a dict structure: keys are field names, values are exclusion sets or True (exclude entire field).
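The same dict structure can be applied to lists of nested models. As a sketch, assuming the "__all__" key that Pydantic accepts in include and exclude specifications, the hypothetical Cart below drops one field from every item:

```python
from pydantic import BaseModel


class LineItem(BaseModel):  # hypothetical models for illustration
    sku: str
    quantity: int
    cost_price: float


class Cart(BaseModel):
    customer: str
    items: list[LineItem]


cart = Cart(
    customer="Alice",
    items=[
        LineItem(sku="A1", quantity=2, cost_price=3.5),
        LineItem(sku="B2", quantity=1, cost_price=8.0),
    ],
)

# "__all__" applies the inner exclusion set to every element of the list.
public_cart = cart.model_dump(exclude={"items": {"__all__": {"cost_price"}}})
print(public_cart)
# {'customer': 'Alice', 'items': [{'sku': 'A1', 'quantity': 2}, {'sku': 'B2', 'quantity': 1}]}
```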
Serializing Lists and Collections
Collections of models serialize element-by-element:
import json

from pydantic import BaseModel

class Task(BaseModel):
    id: int
    title: str
    completed: bool

tasks = [
    Task(id=1, title="Setup project", completed=True),
    Task(id=2, title="Write tests", completed=False),
    Task(id=3, title="Deploy", completed=False)
]

# Convert list to JSON by joining each model's JSON string
tasks_json = "[" + ", ".join(t.model_dump_json() for t in tasks) + "]"
print(tasks_json)

# Or dump each model to a dict and encode with the standard json module
tasks_dict = [t.model_dump() for t in tasks]
tasks_json = json.dumps(tasks_dict)
For better performance, create a wrapper model:
from pydantic import BaseModel

# The built-in list type is used directly; Task and tasks come from the previous example
class TaskList(BaseModel):
    items: list[Task]

task_list = TaskList(items=tasks)
print(task_list.model_dump_json())
# Serializes the whole list in one call
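If a wrapper model would change the shape of the payload (an object with an items key rather than a bare array), Pydantic v2's TypeAdapter is another option. A minimal sketch, with the task list shortened for brevity:

```python
import json

from pydantic import BaseModel, TypeAdapter


class Task(BaseModel):
    id: int
    title: str
    completed: bool


tasks = [
    Task(id=1, title="Setup project", completed=True),
    Task(id=2, title="Write tests", completed=False),
]

# A TypeAdapter serializes a bare list[Task] without a wrapper model.
task_list_adapter = TypeAdapter(list[Task])

tasks_json = task_list_adapter.dump_json(tasks)     # bytes, not str
tasks_dicts = task_list_adapter.dump_python(tasks)  # list of dicts

print(tasks_json.decode())
# [{"id":1,"title":"Setup project","completed":true},{"id":2,"title":"Write tests","completed":false}]
```

Note that dump_json() returns bytes, so decode it if a str is needed.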
Custom Serializers
For fine-grained control, use field_serializer:
from datetime import datetime

from pydantic import BaseModel, field_serializer

class Event(BaseModel):
    name: str
    timestamp: datetime

    @field_serializer("timestamp")
    def serialize_timestamp(self, value: datetime, _info):
        # Custom format: "2026-06-01 10:00:00" instead of ISO 8601
        return value.strftime("%Y-%m-%d %H:%M:%S")

event = Event(
    name="Conference",
    timestamp=datetime(2026, 6, 1, 10, 0, 0)
)
print(event.model_dump())
# Output: {
#     'name': 'Conference',
#     'timestamp': '2026-06-01 10:00:00'
# }
Field serializers run during model_dump() and model_dump_json(), letting you customize output per field.
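Sometimes the custom format is wanted only in JSON output, while Python-mode dumps should keep the native object. A sketch of that variation, assuming the when_used argument of field_serializer:

```python
from datetime import datetime

from pydantic import BaseModel, field_serializer


class Event(BaseModel):
    name: str
    timestamp: datetime

    # when_used="json" limits the serializer to JSON-mode output.
    @field_serializer("timestamp", when_used="json")
    def serialize_timestamp(self, value: datetime) -> str:
        return value.strftime("%Y-%m-%d %H:%M:%S")


event = Event(name="Conference", timestamp=datetime(2026, 6, 1, 10, 0, 0))

print(type(event.model_dump()["timestamp"]).__name__)  # datetime
print(event.model_dump(mode="json")["timestamp"])      # 2026-06-01 10:00:00
print(event.model_dump_json())
# {"name":"Conference","timestamp":"2026-06-01 10:00:00"}
```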
Key Takeaways
- Use model_dump() to export to dict; model_dump_json() to export to JSON string.
- Use exclude to remove sensitive fields from output; include to return only specific fields.
- Use Field(alias="...") to decouple Python naming from API contracts; set by_alias=True in dumps.
- Use mode="json" to get JSON-serializable types in dicts (strings instead of datetime objects).
- Use @computed_field to include derived fields in serialization automatically.
- Use @field_serializer for custom field-level output formatting.
Frequently Asked Questions
What's the performance difference between model_dump and model_dump_json?
model_dump() skips JSON encoding, but model_dump_json() is implemented in Rust (pydantic-core) in Pydantic v2, so the gap between them is small. The exact figure depends on the model, so benchmark your own if it matters. For API endpoints, the difference is negligible compared to network latency.
Can I serialize to formats other than JSON (XML, YAML)?
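Pydantic itself exports only dicts and JSON; use third-party libraries for other formats. Export with model_dump(mode="json") so every value is a plain JSON-compatible type, then convert the dict with a library such as xmltodict or pyyaml.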
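As an illustration of the dict-then-convert approach, the sketch below swaps in the standard library's xml.etree.ElementTree instead of xmltodict so that it runs without extra packages. The to_xml helper and the Product model are hypothetical, and the helper handles only flat models.

```python
import xml.etree.ElementTree as ET

from pydantic import BaseModel


class Product(BaseModel):  # hypothetical model for illustration
    id: int
    name: str
    price: float


def to_xml(model: BaseModel, root_tag: str) -> str:
    """Render a flat model as XML, one child element per field."""
    root = ET.Element(root_tag)
    # mode="json" guarantees plain values (str, int, float, bool, None).
    for key, value in model.model_dump(mode="json").items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_xml(Product(id=1, name="Laptop", price=999.99), "product"))
# <product><id>1</id><name>Laptop</name><price>999.99</price></product>
```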
How do I handle circular references in serialization?
Avoid circular references in your model design. If necessary, use exclude during serialization or SerializerFunctionWrapHandler to break the cycle.
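A sketch of the exclude approach, using a hypothetical self-referencing Node model. It assumes the cycle is only one level deep; deeper trees would need a longer exclusion spec or a custom serializer.

```python
from typing import Optional

from pydantic import BaseModel


class Node(BaseModel):  # hypothetical tree node for illustration
    name: str
    parent: "Optional[Node]" = None
    children: "list[Node]" = []


root = Node(name="root")
leaf = Node(name="leaf", parent=root)
root.children.append(leaf)  # root -> leaf -> root is now a cycle

# Skip each child's back-reference so the serializer never revisits root.
tree = root.model_dump(exclude={"children": {"__all__": {"parent"}}})
print(tree)
# {'name': 'root', 'parent': None, 'children': [{'name': 'leaf', 'children': []}]}
```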
Can I serialize to database row format?
Yes. Design your model fields to match your database schema, then use model_dump() to create a dict for INSERT statements.
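For instance, with the standard library's sqlite3 module, the dumped dict can feed named placeholders directly. This is a minimal sketch; the users table and its columns are assumptions made for the example.

```python
import sqlite3

from pydantic import BaseModel


class User(BaseModel):  # field names match the hypothetical "users" table
    id: int
    name: str
    email: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

user = User(id=1, name="Alice", email="alice@example.com")

# Named placeholders (:id, :name, :email) are filled from the dumped dict.
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (:id, :name, :email)",
    user.model_dump(),
)

row = conn.execute("SELECT id, name, email FROM users").fetchone()
print(row)  # (1, 'Alice', 'alice@example.com')
```

Using placeholders rather than string formatting keeps the values parameterized, which also guards against SQL injection.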
What if a field has both a validator and a serializer?
Validators run during instantiation (input); serializers run during export (output). They operate independently.
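A small sketch of both on the same field, using a hypothetical Contact model: the validator normalizes the stored value on the way in, and the serializer masks it on the way out.

```python
from pydantic import BaseModel, field_serializer, field_validator


class Contact(BaseModel):  # hypothetical model for illustration
    phone: str

    @field_validator("phone")
    @classmethod
    def keep_digits(cls, value: str) -> str:
        # Input side: normalize to digits only before storing.
        return "".join(ch for ch in value if ch.isdigit())

    @field_serializer("phone")
    def mask_phone(self, value: str) -> str:
        # Output side: reveal only the last four digits.
        return "*" * (len(value) - 4) + value[-4:]


contact = Contact(phone="(555) 123-4567")
print(contact.phone)         # 5551234567
print(contact.model_dump())  # {'phone': '******4567'}
```

The stored attribute holds the validated value; the masked form exists only in the dumped output.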