Save ML Models in Python: Pickle vs Joblib
Saving a trained machine learning model to disk is the foundation of deployment. Before you write a single API endpoint, you must choose a serialization format that balances portability, size, speed, and security. Python offers multiple options: pickle (built-in, simple), joblib (optimized for NumPy arrays), ONNX (cross-platform), and framework-specific formats like TensorFlow SavedModel.
Each format has different trade-offs, and the wrong choice can cost you weeks later. Pickle files are unsafe to load from untrusted sources and Python-only; joblib is fast but also Python-only; ONNX is portable but requires conversion; SavedModel is framework-specific but well-optimized. This article walks you through each option and gives you a decision tree.
How Model Serialization Works
Model serialization is the process of converting your trained model object (weights, tree structure, hyperparameters) from memory into bytes that can be written to a file or transmitted over a network. When you load the file later, you reconstruct the original object exactly (or approximately, depending on the format).
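As a minimal sketch of that round trip, the snippet below converts an in-memory object to bytes and rebuilds it with the standard library's pickle module. The `model_state` dictionary is a hypothetical stand-in for a trained model's weights and hyperparameters, not a real estimator.

```python
import pickle

# Hypothetical stand-in for a trained model's state
model_state = {
    "weights": [0.4, -1.2, 3.0],
    "bias": 0.5,
    "hyperparameters": {"learning_rate": 0.01, "epochs": 20},
}

# Serialize: in-memory object -> bytes that can be written to disk or sent over a network
payload = pickle.dumps(model_state)

# Deserialize: bytes -> a new object with the same contents
restored = pickle.loads(payload)

print(type(payload).__name__)   # bytes
print(restored == model_state)  # True: same contents
print(restored is model_state)  # False: a separate, reconstructed object
```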
Python's pickle module encodes objects as a stream of opcodes for a small stack-based virtual machine, which lets it handle most Python objects: scikit-learn classifiers, PyTorch models, and custom classes. It cannot serialize everything, though; lambdas and nested functions (closures), for example, fail with the standard-library pickler. Joblib builds on pickle and stores large NumPy arrays efficiently, with optional memory-mapped loading. ONNX (Open Neural Network Exchange) is a standardized format designed for interoperability: a model trained in PyTorch can be converted to ONNX and then loaded in C++, C#, JavaScript, or Java. SavedModel is TensorFlow-specific but includes metadata and execution graphs.
The fundamental choice is: do you need cross-platform serving, or is Python-only acceptable? Cross-platform means using ONNX or a framework-specific format; Python-only means pickle or joblib.
Pickle: The Python Standard
Pickle is Python's built-in serialization format. It works on most Python objects and is simple to use:
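import pickle
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
# Example data (Iris, 4 features); substitute your own training set
X_train, y_train = load_iris(return_X_y=True)
# Train a model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Save it
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
# Load it (only load files you created or otherwise trust)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)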
Advantages:
- No configuration; works out-of-the-box.
- Fast serialization and deserialization.
- Supports most Python objects (scikit-learn, custom classes, XGBoost).
Disadvantages:
- Security risk: pickle can execute arbitrary code when loading untrusted files. Never unpickle data from untrusted sources.
- Python-only: You cannot load a pickle file in Java, C++, or JavaScript.
- Version fragility: Pickles may break when library versions change significantly.
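To make the security risk concrete, here is a small, deliberately harmless sketch of how a pickle can run a callable at load time. The `Payload` class is hypothetical and exists only for this illustration; it uses `os.getcwd` as a benign stand-in for what an attacker would replace with something like `os.system`.

```python
import os
import pickle

class Payload:
    # Illustrative only: __reduce__ tells pickle which callable to invoke on load
    def __reduce__(self):
        # A harmless call here; a malicious file could name any importable callable
        return (os.getcwd, ())

blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls os.getcwd() and returns the result
result = pickle.loads(blob)

print(type(result).__name__)  # str
print(result == os.getcwd())  # True
```

The function call happens inside `pickle.loads` itself, before your code has any chance to inspect the object, which is why validating a pickle after loading it is not a defence.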
When to use: Internal-only Python services where you control all input. Not recommended for public APIs or cross-language systems.
Joblib: Optimized for Scikit-Learn and NumPy
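Joblib is the persistence library most commonly used with scikit-learn, and the scikit-learn documentation covers it alongside pickle. It builds on pickle but stores large NumPy arrays efficiently and can memory-map them when loading (the mmap_mode argument), which reduces memory use for large arrays: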
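import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
# Example data (Iris, 4 features); substitute your own training set
X_train, y_train = load_iris(return_X_y=True)
# Train a model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Save it
joblib.dump(model, "model.joblib")
# Load it (only load files you created or otherwise trust)
model = joblib.load("model.joblib")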
Advantages:
- Faster and more memory-efficient than pickle for scikit-learn models with large array data.
- Optional compression (the compress argument) to reduce file size.
- Widely used for scikit-learn deployment.
Disadvantages:
- Still Python-only.
- Inherits pickle's security and version-compatibility concerns (joblib itself adds minimal extra risk).
- Files are often larger than ONNX for the same model, depending on compression settings.
When to use: Scikit-learn, XGBoost, and LightGBM models in Python-only infrastructure. This is the most common choice for traditional ML.
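The memory-mapping and compression behaviour described in this section can be sketched as follows. `mmap_mode` and `compress` are real `joblib` arguments; the array and file names are made up for illustration, and a bare NumPy array stands in for the large arrays inside a fitted model.

```python
import os
import tempfile

import joblib
import numpy as np

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "weights.joblib")
compressed_path = os.path.join(tmp_dir, "weights_compressed.joblib")

# Stand-in for a large array held by a trained model
weights = np.arange(1_000_000, dtype=np.float64)

# Uncompressed dump: required if the file is to be memory-mapped later
joblib.dump(weights, plain_path)

# mmap_mode="r" maps the array from disk instead of reading it all into RAM
mapped = joblib.load(plain_path, mmap_mode="r")
print(type(mapped).__name__, float(mapped[10]))

# Compressed dump: usually a smaller file, but it cannot be memory-mapped
joblib.dump(weights, compressed_path, compress=3)
print(os.path.getsize(plain_path), os.path.getsize(compressed_path))
```

The trade-off is between disk size (compressed) and load-time memory use (uncompressed plus memory mapping); which one matters more depends on the deployment.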
Comparison Table: Serialization Formats
| Format | Framework | Portability | File Size | Load Speed | Security | Notes |
|---|---|---|---|---|---|---|
| Pickle | Any Python | Python only | Large | Fast | Unsafe for untrusted files | Simplest, handles most Python objects |
| Joblib | Scikit-Learn, XGBoost | Python only | Medium (optional compression) | Fast for large arrays | Unsafe for untrusted files | Optimized for NumPy, common for scikit-learn |
| ONNX | Any (via conversion) | Cross-language | Small | Fast | No arbitrary code on load | Portable and compact; single-file models are limited to 2 GB unless weights are stored as external data |
| SavedModel | TensorFlow | Cross-language | Medium | Medium | Safer than pickle | TensorFlow ecosystem, includes metadata |
| PyTorch TorchScript | PyTorch | C++, Python | Medium | Fast | Safer than pickle | PyTorch-native, production optimized |
ONNX: Cross-Platform Portability
ONNX (Open Neural Network Exchange) is an open standard that represents models as a computation graph, independent of the framework. A scikit-learn logistic regression, a PyTorch neural network, and an XGBoost classifier can all be converted to ONNX and then served from any language that has an ONNX runtime.
Converting a scikit-learn model to ONNX:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
# Example data (Iris, 4 features); substitute your own training set
X_train, y_train = load_iris(return_X_y=True)
# Train the model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# Define input shape: any batch size, 4 float features
initial_type = [("float_input", FloatTensorType([None, 4]))]
# Convert to ONNX
onnx_model = convert_sklearn(model, initial_types=initial_type)
# Save it
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
Loading and running inference with ONNX Runtime (shown here in Python; bindings also exist for C++, C#, Java, and JavaScript):
import onnxruntime as rt
import numpy as np
# Load the ONNX model
sess = rt.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
# Prepare input
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
X_test = np.array([[5.1, 3.5, 1.4, 0.2]], dtype=np.float32)
# Infer
pred = sess.run([output_name], {input_name: X_test})
print("Prediction:", pred[0])
Advantages:
- Framework-agnostic: run the same model in Python, C++, JavaScript, or C#.
- Often smaller files, since only the computation graph and weights are stored.
- Safer than pickle (no arbitrary code execution).
- Mature ecosystem (ONNX Runtime, versioned operator sets).
Disadvantages:
- Conversion step required; not all scikit-learn transformers are supported.
- Some custom Python logic cannot be converted; you must rewrite in ONNX operators.
- Debugging converted models is harder than debugging Python.
When to use: Production systems that need to serve across multiple languages (Python API, C++ inference, JavaScript client). Also when you deploy to edge devices or browsers via ONNX Runtime Web (the successor to ONNX.js).
SavedModel: TensorFlow's Native Format
TensorFlow uses SavedModel, a directory-based format that includes the graph, weights, and metadata:
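import tensorflow as tf
# Assumes X_train has shape (n_samples, 4) and y_train is one-hot with shape (n_samples, 3)
# Train a model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax")
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X_train, y_train, epochs=10)
# Save it as a SavedModel directory (TensorFlow 2.15 and earlier).
# With Keras 3 (TensorFlow 2.16+), model.save() needs a .keras or .h5 filename;
# use model.export("my_model") to write a SavedModel instead.
model.save("my_model")
# Load it (with Keras 3, load_model cannot read a SavedModel directory;
# use tf.saved_model.load("my_model") there)
loaded_model = tf.keras.models.load_model("my_model")
# Infer
pred = loaded_model.predict(X_test)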
Advantages:
- Native to TensorFlow; includes graph, weights, and serving metadata.
- Can be served via TensorFlow Serving (a battle-tested production system).
- Can be converted for TensorFlow Lite (edge) and TensorFlow.js (browser).
Disadvantages:
- Tightly coupled to TensorFlow; not portable to other frameworks without conversion.
- Often heavier on disk than ONNX for the same model.
When to use: TensorFlow-based models in production, especially if you use TensorFlow Serving for orchestration.
Decision Tree
Work through these questions in order:
- Do you need to serve in languages other than Python? → Yes: ONNX (or SavedModel if TensorFlow). No: continue.
- Is this a scikit-learn or XGBoost model? → Yes: joblib. No: continue.
- Do you have a custom Python class that cannot be converted to ONNX? → Yes: pickle. No: ONNX.
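The questions above can also be written as a small function. This is only a sketch of the article's decision logic; the function name, parameter names, and framework strings are hypothetical choices made for illustration.

```python
def choose_format(needs_other_languages, framework, has_unconvertible_custom_code=False):
    """Return a suggested serialization format, following the questions in order."""
    # 1. Serving outside Python: ONNX, or SavedModel for TensorFlow models
    if needs_other_languages:
        return "SavedModel" if framework == "tensorflow" else "ONNX"
    # 2. Python-only scikit-learn or XGBoost models: joblib
    if framework in {"scikit-learn", "xgboost"}:
        return "joblib"
    # 3. Custom Python classes that cannot be converted to ONNX: pickle
    if has_unconvertible_custom_code:
        return "pickle"
    return "ONNX"

print(choose_format(True, "pytorch"))         # ONNX
print(choose_format(True, "tensorflow"))      # SavedModel
print(choose_format(False, "scikit-learn"))   # joblib
print(choose_format(False, "pytorch", True))  # pickle
```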
Key Takeaways
- Pickle is simple but Python-only and insecure; avoid for untrusted input.
- Joblib is the common choice for scikit-learn models; it handles large NumPy arrays more efficiently than plain pickle but shares pickle's security and versioning caveats.
- ONNX is the strongest option for portable, cross-language inference; use it for production systems that may expand beyond Python.
- SavedModel is TensorFlow's native format; use it if you leverage TensorFlow Serving.
- Always test deserialization after serialization; version mismatches can cause load failures or subtly different predictions.
Frequently Asked Questions
Can I unpickle a file created with an older version of scikit-learn?
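Not reliably. Scikit-learn does not guarantee that a model saved with one version will load, or behave identically, in another, and recent releases emit a warning when the versions differ. Complex pipelines with custom transformers are especially likely to break. Pin the library versions used for training and always test deserialization in your target environment.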
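One way to catch version drift early is to store the training environment's versions next to the model file and compare them before loading. The sketch below uses only the standard library; the helper names, the package list, and the sidecar filename `model.meta.json` are hypothetical.

```python
import json
import os
import platform
import tempfile
from importlib import metadata

def environment_snapshot(packages):
    """Record the Python version and the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions

def version_mismatches(saved, current):
    """Return {name: (saved_version, current_version)} for entries that differ."""
    return {
        name: (saved[name], current.get(name))
        for name in saved
        if saved[name] != current.get(name)
    }

packages = ["scikit-learn", "numpy", "joblib"]
sidecar_path = os.path.join(tempfile.mkdtemp(), "model.meta.json")

# At save time: write the snapshot next to the model file
with open(sidecar_path, "w") as f:
    json.dump(environment_snapshot(packages), f)

# At load time: compare before deserializing the model
with open(sidecar_path) as f:
    saved = json.load(f)
print(version_mismatches(saved, environment_snapshot(packages)))  # {} in the same environment
```

A non-empty result is a signal to rebuild the environment or retrain, rather than to load the file and hope it behaves the same.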
What is the security risk of pickle?
Pickle can execute arbitrary Python code during deserialization. If you load a pickle from an untrusted user, they can achieve remote code execution. Never unpickle data from the public internet; use ONNX instead.
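If you must pass pickles between your own services, the Python documentation suggests signing the data with `hmac` so tampering can be detected before anything is unpickled. The following is a minimal sketch of that idea; the key value and the helper names are placeholders, and a real key would come from secure configuration.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key for illustration
SIGNATURE_SIZE = hashlib.sha256().digest_size  # 32 bytes

def sign(payload: bytes) -> bytes:
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()

def dump_signed(obj) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 signature."""
    payload = pickle.dumps(obj)
    return sign(payload) + payload

def load_signed(blob: bytes):
    """Verify the signature first; unpickle only if it matches."""
    signature, payload = blob[:SIGNATURE_SIZE], blob[SIGNATURE_SIZE:]
    if not hmac.compare_digest(signature, sign(payload)):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)

blob = dump_signed({"weights": [0.1, 0.2]})
print(load_signed(blob))

# Flip one bit in the payload to simulate tampering
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
try:
    load_signed(tampered)
except ValueError as exc:
    print("rejected:", exc)
```

Signing only proves the file came from someone holding the key and was not modified afterwards; it does not make a pickle from an untrusted author safe to load.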
Can I convert any PyTorch model to ONNX?
Most PyTorch models can be exported to ONNX, but custom Python operations, dynamic control flow, and plugin code may fail. Test carefully and validate accuracy after conversion.
How much smaller is ONNX than joblib?
It depends on the model and on whether joblib compression is enabled. ONNX files omit Python-specific metadata and are often smaller, but there is no fixed ratio; measure both files for your own model before relying on a size estimate.
Further Reading
- scikit-learn: Model Persistence — official guidance on joblib vs pickle.
- ONNX Official Documentation — format specification and tools.
- ONNX Runtime — high-performance inference engine.
- TensorFlow SavedModel Guide — TensorFlow-specific serialization.