Pydantic Serialization: Export and Transform Data

Validation brings data in; serialization takes it out. After you validate an API request into a Pydantic model, you usually need to export it as JSON, insert it into a database, or pass it to another system. Pydantic provides two primary serialization methods, model_dump() (to a Python dict) and model_dump_json() (to a JSON string), plus options for customizing the output: excluding fields, renaming keys, controlling datetime format, and adding computed properties. This article covers serialization from basic to advanced use cases.

Basic Serialization: Dict and JSON

Once a model is instantiated and validated, exporting it takes a single method call:

from pydantic import BaseModel, EmailStr  # EmailStr needs the email-validator package
from datetime import datetime

class User(BaseModel):
    id: int
    name: str
    email: EmailStr
    created_at: datetime

# Create and validate
user = User(
    id=1,
    name="Alice",
    email="alice@example.com",
    created_at="2026-06-01T10:00:00Z"
)

# Export to dict
user_dict = user.model_dump()
print(user_dict)
# Output: {
#     'id': 1,
#     'name': 'Alice',
#     'email': 'alice@example.com',
#     'created_at': datetime(2026, 6, 1, 10, 0, 0, tzinfo=...)
# }

# Export to JSON string
user_json = user.model_dump_json()
print(user_json)
# Output: {"id":1,"name":"Alice","email":"alice@example.com","created_at":"2026-06-01T10:00:00Z"}

# With pretty-printing
user_json_pretty = user.model_dump_json(indent=2)
print(user_json_pretty)
# Output:
# {
#   "id": 1,
#   "name": "Alice",
#   "email": "alice@example.com",
#   "created_at": "2026-06-01T10:00:00Z"
# }

Note: model_dump() returns Python types (datetime as datetime object), while model_dump_json() serializes to JSON-compatible strings (datetime as ISO 8601 string).
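The difference between the two outputs is easiest to see in a round trip. The sketch below uses a hypothetical Ping model (not part of the examples above) and assumes Pydantic v2; it shows that both the dict and the JSON string can be validated back into an equal model.

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class Ping(BaseModel):  # hypothetical model, used only for illustration
    id: int
    sent_at: datetime


ping = Ping(id=1, sent_at=datetime(2026, 6, 1, 10, 0, tzinfo=timezone.utc))

as_dict = ping.model_dump()       # sent_at stays a datetime object
as_json = ping.model_dump_json()  # sent_at becomes an ISO 8601 string

# Either output can be fed back through validation.
from_dict = Ping.model_validate(as_dict)
from_json = Ping.model_validate_json(as_json)

print(type(as_dict["sent_at"]).__name__, type(as_json).__name__)
print(from_dict == ping, from_json == ping)
```

model_validate() accepts the dict form and model_validate_json() accepts the string form, so the choice of dump method usually follows from what the receiving side expects.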

Excluding Fields

API responses often exclude sensitive data (passwords, tokens) or internal fields:

from pydantic import BaseModel

class UserAccount(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    api_token: str

user = UserAccount(
    id=1,
    name="Alice",
    email="alice@example.com",
    password_hash="hashed_secret_123",
    api_token="sk-abc123..."
)

# Exclude sensitive fields
safe_user = user.model_dump(exclude={"password_hash", "api_token"})
print(safe_user)
# Output: {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# For JSON response
safe_json = user.model_dump_json(exclude={"password_hash", "api_token"})

Use exclude to prevent leaking internal fields to clients.
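Passing exclude on every call is easy to forget. As a minimal sketch, assuming Pydantic v2, a field can instead be marked with Field(exclude=True) so that it never appears in any dump. The StoredAccount model and its values are hypothetical.

```python
from pydantic import BaseModel, Field


class StoredAccount(BaseModel):  # hypothetical model, used only for illustration
    id: int
    name: str
    # Marked at the field level, so every dump omits these by default.
    password_hash: str = Field(exclude=True)
    api_token: str = Field(exclude=True)


account = StoredAccount(
    id=1,
    name="Alice",
    password_hash="hashed_secret_123",
    api_token="token-placeholder",
)

public = account.model_dump()
public_json = account.model_dump_json()

print(public)
print(account.password_hash)  # still available on the instance itself
```

The trade-off is that a field excluded this way is also missing from dumps meant for internal use, so per-call exclude remains the better fit when only some outputs should hide the field.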

Including Only Specific Fields

For API endpoints that return only a subset of fields (list views vs. detail views):

from pydantic import BaseModel

class Product(BaseModel):
    id: int
    name: str
    description: str
    price: float
    in_stock: bool

product = Product(
    id=1,
    name="Laptop",
    description="High-performance laptop for developers",
    price=999.99,
    in_stock=True
)

# List view: only essential fields
list_view = product.model_dump(include={"id", "name", "price"})
print(list_view)
# Output: {'id': 1, 'name': 'Laptop', 'price': 999.99}

# Detail view: all fields
detail_view = product.model_dump()

Use include to return only the fields needed for a given response type.
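Field names are not the only way to narrow a dump. The sketch below, assuming Pydantic v2 and a hypothetical ProductUpdate model, uses the exclude_unset, exclude_none and exclude_defaults flags to filter by value state instead, which can be handy for partial-update payloads.

```python
from typing import Optional

from pydantic import BaseModel


class ProductUpdate(BaseModel):  # hypothetical partial-update payload
    name: Optional[str] = None
    price: Optional[float] = None
    in_stock: bool = True


# Only price was supplied by the caller.
patch = ProductUpdate(price=899.99)

only_sent = patch.model_dump(exclude_unset=True)        # fields the caller set
no_nulls = patch.model_dump(exclude_none=True)          # drops None values
no_defaults = patch.model_dump(exclude_defaults=True)   # drops values equal to the default

print(only_sent)
print(no_nulls)
print(no_defaults)
```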

Field Renaming and Custom Keys

Sometimes your Python model fields don't match the API contract. Use Field() with alias, or with validation_alias and serialization_alias when the input and output names differ:

from pydantic import BaseModel, Field

class Order(BaseModel):
    order_id: int = Field(alias="id")
    total_amount: float = Field(alias="total")
    created_timestamp: str = Field(alias="createdAt")

# Input uses aliases (from API)
data = {
    "id": 123,
    "total": 99.99,
    "createdAt": "2026-06-01T10:00:00Z"
}
order = Order(**data)
print(order.order_id)  # 123

# Output: use by_alias=True to match API contract
api_response = order.model_dump(by_alias=True)
print(api_response)
# Output: {'id': 123, 'total': 99.99, 'createdAt': '2026-06-01T10:00:00Z'}

# Without by_alias, Python names are used
python_view = order.model_dump(by_alias=False)
print(python_view)
# Output: {'order_id': 123, 'total_amount': 99.99, 'created_timestamp': '2026-06-01T10:00:00Z'}

Aliases decouple your Python naming conventions (snake_case) from API contracts (camelCase).
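Writing an alias for every field gets repetitive. As a sketch, assuming Pydantic v2, an alias_generator can derive camelCase names for the whole model, and serialization_alias can be combined with validation_alias when input and output keys differ. The Invoice and Payment models are hypothetical.

```python
from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_camel


class Invoice(BaseModel):  # hypothetical model: aliases generated for every field
    model_config = ConfigDict(alias_generator=to_camel)

    invoice_id: int
    total_amount: float


class Payment(BaseModel):  # hypothetical model: different names on the way in and out
    amount_cents: int = Field(validation_alias="amt", serialization_alias="amountCents")


invoice = Invoice(invoiceId=7, totalAmount=42.5)
invoice_camel = invoice.model_dump(by_alias=True)
invoice_snake = invoice.model_dump()

payment = Payment(amt=1250)
payment_out = payment.model_dump(by_alias=True)

print(invoice_camel)
print(invoice_snake)
print(payment_out)
```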

Serialization Modes

Control how complex types are serialized:

from pydantic import BaseModel
from datetime import datetime

class Event(BaseModel):
    id: int
    timestamp: datetime
    data: dict

event = Event(
    id=1,
    timestamp=datetime(2026, 6, 1, 10, 0, 0),
    data={"key": "value"}
)

# Mode "python" (default): native Python types
python_dump = event.model_dump(mode="python")
print(type(python_dump["timestamp"]))  # <class 'datetime.datetime'>

# Mode "json": JSON-serializable types (strings, numbers, lists, dicts)
json_dump = event.model_dump(mode="json")
print(type(json_dump["timestamp"]))  # <class 'str'>
print(json_dump["timestamp"])  # 2026-06-01T10:00:00

Use mode="python" for internal processing, mode="json" for API responses.
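The mode matters for more than datetime. The following sketch, assuming Pydantic v2 and Python 3.9+, uses a hypothetical Charge model with UUID, Decimal and set fields to show that only the mode="json" dict can be passed straight to the standard json module.

```python
import json
from decimal import Decimal
from uuid import UUID

from pydantic import BaseModel


class Charge(BaseModel):  # hypothetical model, used only for illustration
    id: UUID
    amount: Decimal
    tags: set[str]


charge = Charge(
    id="12345678-1234-5678-1234-567812345678",
    amount="19.99",
    tags={"priority"},
)

python_dump = charge.model_dump()           # UUID, Decimal and set objects
json_dump = charge.model_dump(mode="json")  # str, str and list

# json.dumps cannot encode the Python-mode values.
try:
    json.dumps(python_dump)
    python_mode_is_json_safe = True
except TypeError:
    python_mode_is_json_safe = False

encoded = json.dumps(json_dump)
print(python_mode_is_json_safe)
print(encoded)
```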

Custom Serialization with Computed Fields

For derived/computed fields, use @computed_field:

from pydantic import BaseModel, computed_field

class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height

    @computed_field
    @property
    def perimeter(self) -> float:
        return 2 * (self.width + self.height)

rect = Rectangle(width=10, height=5)
print(rect.model_dump())
# Output (ints are coerced to float by the field types): {
#     'width': 10.0,
#     'height': 5.0,
#     'area': 50.0,
#     'perimeter': 30.0
# }

Computed fields are calculated at serialization time and included in dumps automatically.
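Because the property runs on each dump, a computed field tracks later changes to the model. A minimal sketch, assuming Pydantic v2 and a hypothetical Cart model, which also shows that exclude can drop a computed field like any other:

```python
from pydantic import BaseModel, computed_field


class Cart(BaseModel):  # hypothetical model, used only for illustration
    unit_price: float
    quantity: int

    @computed_field
    @property
    def total(self) -> float:
        return self.unit_price * self.quantity


cart = Cart(unit_price=2.5, quantity=4)
first = cart.model_dump()

# Mutating an input changes the next dump; nothing is cached.
cart.quantity = 10
second = cart.model_dump()

# Computed fields can be excluded by name.
without_total = cart.model_dump(exclude={"total"})

print(first)
print(second)
print(without_total)
```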

Nested Model Serialization

When a model contains other models, model_dump() serializes them recursively, and include and exclude can target fields inside the nested models:

from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    zip_code: str
    country: str

class User(BaseModel):
    name: str
    email: str
    address: Address
    is_admin: bool = False

user_data = {
    "name": "Alice",
    "email": "alice@example.com",
    "address": {
        "street": "123 Main",
        "city": "NYC",
        "zip_code": "10001",
        "country": "USA"
    },
    "is_admin": False
}
user = User(**user_data)

# Exclude nested fields with nested dict syntax
safe_user = user.model_dump(exclude={
    "is_admin": True,
    "address": {"zip_code", "country"}
})
print(safe_user)
# Output: {
#     'name': 'Alice',
#     'email': 'alice@example.com',
#     'address': {
#         'street': '123 Main',
#         'city': 'NYC'
#     }
# }

Nested exclusions use a dict structure: keys are field names, values are exclusion sets or True (exclude entire field).
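The same dict structure extends to lists of nested models. The sketch below assumes Pydantic v2 and uses the special "__all__" key to apply an exclusion to every item of a list; the LineItem and Order models are hypothetical.

```python
from pydantic import BaseModel


class LineItem(BaseModel):  # hypothetical nested model
    sku: str
    quantity: int
    internal_cost: float


class Order(BaseModel):  # hypothetical parent model
    order_id: int
    items: list[LineItem]


order = Order(
    order_id=1,
    items=[
        LineItem(sku="A-1", quantity=2, internal_cost=3.5),
        LineItem(sku="B-2", quantity=1, internal_cost=9.0),
    ],
)

# "__all__" applies the inner exclusion set to each element of the list.
public = order.model_dump(exclude={"items": {"__all__": {"internal_cost"}}})
print(public)
```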

Serializing Lists and Collections

Collections of models serialize element-by-element:

from pydantic import BaseModel
import json

class Task(BaseModel):
    id: int
    title: str
    completed: bool

tasks = [
    Task(id=1, title="Setup project", completed=True),
    Task(id=2, title="Write tests", completed=False),
    Task(id=3, title="Deploy", completed=False)
]

# Convert list to JSON by joining each model's JSON output
tasks_json = "[" + ", ".join(t.model_dump_json() for t in tasks) + "]"
print(tasks_json)

# Or dump each model to a dict and encode the list with the standard json module
# (use model_dump(mode="json") if the models hold non-JSON types such as datetime)
tasks_dict = [t.model_dump() for t in tasks]
tasks_json = json.dumps(tasks_dict)

For better performance, create a wrapper model:

from pydantic import BaseModel

# Task and tasks are defined in the previous example
class TaskList(BaseModel):
    items: list[Task]

task_list = TaskList(items=tasks)
print(task_list.model_dump_json())
# Serializes the entire list in a single call
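If the response should be a bare JSON array rather than an object with an items key, a TypeAdapter is one alternative to a wrapper model. This is a sketch assuming Pydantic v2; it repeats the Task model so the block runs on its own.

```python
from pydantic import BaseModel, TypeAdapter


class Task(BaseModel):
    id: int
    title: str
    completed: bool


tasks = [
    Task(id=1, title="Setup project", completed=True),
    Task(id=2, title="Write tests", completed=False),
]

# Build the adapter once and reuse it; it serializes the whole list in one call.
task_list_adapter = TypeAdapter(list[Task])

as_bytes = task_list_adapter.dump_json(tasks)                  # JSON as bytes
as_dicts = task_list_adapter.dump_python(tasks, mode="json")   # list of plain dicts

print(as_bytes.decode())
print(as_dicts)
```

Note that dump_json() returns bytes rather than str, so decode it when a string is needed.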

Custom Serializers

For fine-grained control, use field_serializer:

from pydantic import BaseModel, field_serializer
from datetime import datetime

class Event(BaseModel):
    name: str
    timestamp: datetime

    @field_serializer("timestamp")
    def serialize_timestamp(self, value: datetime, _info):
        # Custom format: "2026-06-01 10:00:00" instead of ISO 8601
        return value.strftime("%Y-%m-%d %H:%M:%S")

event = Event(
    name="Conference",
    timestamp=datetime(2026, 6, 1, 10, 0, 0)
)
print(event.model_dump())
# Output: {
#     'name': 'Conference',
#     'timestamp': '2026-06-01 10:00:00'
# }

Field serializers run during model_dump() and model_dump_json(), letting you customize output per field.
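Sometimes the custom format should apply only to JSON output, leaving Python-mode dumps with the native object. One way to sketch this, assuming Pydantic v2, is the when_used argument of field_serializer; the Reading model is hypothetical.

```python
from datetime import datetime

from pydantic import BaseModel, field_serializer


class Reading(BaseModel):  # hypothetical model, used only for illustration
    sensor: str
    taken_at: datetime

    # when_used="json" limits the serializer to JSON-mode output.
    @field_serializer("taken_at", when_used="json")
    def serialize_taken_at(self, value: datetime) -> str:
        return value.strftime("%Y-%m-%d %H:%M:%S")


reading = Reading(sensor="s1", taken_at=datetime(2026, 6, 1, 10, 0, 0))

python_dump = reading.model_dump()           # taken_at is still a datetime
json_dump = reading.model_dump(mode="json")  # taken_at uses the custom format
json_text = reading.model_dump_json()

print(python_dump)
print(json_dump)
print(json_text)
```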

Key Takeaways

  • Use model_dump() to export to dict; model_dump_json() to export to JSON string.
  • Use exclude to remove sensitive fields from output; include to return only specific fields.
  • Use Field(alias="...") to decouple Python naming from API contracts; set by_alias=True in dumps.
  • Use mode="json" to get JSON-serializable types in dicts (strings instead of datetime objects).
  • Use @computed_field to include derived fields in serialization automatically.
  • Use @field_serializer for custom field-level output formatting.

Frequently Asked Questions

What's the performance difference between model_dump and model_dump_json?

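In Pydantic v2 both methods run in the Rust-based pydantic-core. model_dump() skips JSON encoding, while model_dump_json() writes the JSON string directly and is generally faster than calling model_dump() followed by json.dumps(). The exact gap depends on the model, so benchmark your own workload; for API endpoints, the difference is usually negligible compared to network latency.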

Can I serialize to formats other than JSON (XML, YAML)?

Pydantic itself exports only to dict and JSON; use third-party libraries for other formats. Export with model_dump(mode="json") so every value is a plain JSON-compatible type, then pass the dict to xmltodict (XML) or PyYAML (YAML).

How do I handle circular references in serialization?

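Avoid circular references in your model design where possible; Pydantic raises a ValueError when it detects a cycle during serialization. If a cycle is unavoidable, use exclude to drop the back-reference, or a wrap serializer (SerializerFunctionWrapHandler) to break the cycle.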

Can I serialize to database row format?

Yes. Design your model fields to match your database schema, then use model_dump() to create a dict for INSERT statements.
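As a rough sketch of that approach, the example below inserts a dumped model into an in-memory SQLite database using the standard sqlite3 module. The Customer model and the customers table are hypothetical, and the field names are assumed to match the column names exactly.

```python
import sqlite3

from pydantic import BaseModel


class Customer(BaseModel):  # hypothetical model whose fields mirror the table columns
    id: int
    name: str
    active: bool


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)"
)

customer = Customer(id=1, name="Alice", active=True)

# The dict keys line up with the named placeholders in the statement.
row = customer.model_dump()
conn.execute(
    "INSERT INTO customers (id, name, active) VALUES (:id, :name, :active)",
    row,
)
conn.commit()

stored = conn.execute("SELECT id, name, active FROM customers").fetchone()
print(stored)
```

Named placeholders keep the values parameterized, so the dict is never interpolated into the SQL string.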

What if a field has both a validator and a serializer?

Validators run during instantiation (input); serializers run during export (output). They operate independently.
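A small sketch of the two working side by side, assuming Pydantic v2 and a hypothetical Tagged model: the validator turns a comma-separated string into a list on input, and the serializer joins the list back into a string on output.

```python
from pydantic import BaseModel, field_serializer, field_validator


class Tagged(BaseModel):  # hypothetical model, used only for illustration
    tags: list[str]

    # Input side: accept "a, b" as well as a real list.
    @field_validator("tags", mode="before")
    @classmethod
    def split_tags(cls, value):
        if isinstance(value, str):
            return [part.strip() for part in value.split(",") if part.strip()]
        return value

    # Output side: emit a single comma-separated string.
    @field_serializer("tags")
    def join_tags(self, value: list[str]) -> str:
        return ",".join(value)


item = Tagged(tags="python, pydantic")
dumped = item.model_dump()
round_tripped = Tagged.model_validate(dumped)

print(item.tags)
print(dumped)
print(round_tripped.tags)
```

Inside the model the value is always a list; only the boundaries see the string form.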

Further Reading