Seaborn (Part 2): Statistical plots — histograms and box plots
Continuing from Seaborn Part 1, we focus on plots that quantify distribution and spread rather than pairwise relationships alone.
📚 Prerequisites
- DataFrame slicing and Seaborn fundamentals.
🎯 What you'll master
- Compare histograms and KDE overlays for modality and skewness.
- Interpret whisker conventions in Seaborn box plots responsibly.
Histogram with KDE overlay
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
rng = np.random.default_rng(7)
scores = rng.normal(loc=72, scale=11, size=1800)
fig, ax = plt.subplots()
sns.histplot(scores, kde=True, stat="density", bins=35, ax=ax)
ax.set_title("Exam score distribution")
plt.show()
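stat="density" normalizes the histogram so that the bar areas sum to 1, which puts the y-axis on a probability-density scale. The KDE curve is drawn on that same scale, so the two shapes can be compared directly.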
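To make the normalization concrete, here is a minimal sketch that checks the "area sums to 1" property with NumPy alone. It assumes that np.histogram with density=True is a fair stand-in for the normalization behind stat="density"; it reuses the same seed and bin count as the plot above but is not taken from Seaborn's internals.

```python
import numpy as np

# Same synthetic scores as in the histogram example
rng = np.random.default_rng(7)
scores = rng.normal(loc=72, scale=11, size=1800)

# density=True divides each bin count by (n * bin width),
# so the bar heights are densities rather than counts
density, edges = np.histogram(scores, bins=35, density=True)
widths = np.diff(edges)

# Bar area = height * width; the areas should sum to 1
total_area = float(np.sum(density * widths))
print(f"total bar area: {total_area:.6f}")
```

Because the heights are densities, they depend on the bin width: changing bins changes the bar heights but not the total area.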
Box plots grouped by category
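import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Simulated latency samples for two regions, 96 observations each
rng = np.random.default_rng(42)
lat_east = rng.normal(loc=140, scale=22, size=96)
lat_west = rng.normal(loc=128, scale=18, size=96)
# Long-format frame: one row per observation, with its region label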
df = pd.DataFrame({
"latency": np.concatenate([lat_east, lat_west]),
"region": (["east"] * len(lat_east)) + (["west"] * len(lat_west)),
})
fig, ax = plt.subplots()
sns.boxplot(data=df, x="region", y="latency", ax=ax)
plt.show()
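The box edges mark the first and third quartiles, and the line inside the box marks the median. By default (whis=1.5), each whisker extends to the most extreme data point that lies within 1.5 times the interquartile range of the box edge; points beyond that are drawn individually as outliers.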
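The 1.5 × IQR whisker convention can be reproduced by hand. The sketch below is an illustration under stated assumptions: it uses np.percentile with its default linear interpolation, and the variable names (lower_fence, upper_fence, whisker_low, whisker_high) are chosen here for clarity rather than taken from Seaborn. Exact values can differ slightly if a plotting library uses another quantile method.

```python
import numpy as np

# Same synthetic "east" latencies as in the box plot example
rng = np.random.default_rng(42)
lat_east = rng.normal(loc=140, scale=22, size=96)

# Box: first quartile, median, third quartile
q1, median, q3 = np.percentile(lat_east, [25, 50, 75])
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme observations inside the fences
inside = lat_east[(lat_east >= lower_fence) & (lat_east <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences would be drawn as an individual outlier point
outliers = lat_east[(lat_east < lower_fence) | (lat_east > upper_fence)]

print(f"box: {q1:.1f} to {q3:.1f}, median {median:.1f}")
print(f"whiskers: {whisker_low:.1f} to {whisker_high:.1f}")
print(f"outliers flagged: {len(outliers)}")
```

Note that the whiskers end at actual data points, not at the fences themselves, which is why whisker lengths are usually asymmetric.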
💡 Key takeaways
- Layer a KDE only when the sample is large enough to justify a smooth curve; with thin data, the smoothed curve can show bumps that are artifacts rather than real modes.
- Report sample sizes whenever readers compare categorical boxes.
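One way to act on the sample-size takeaway is to compute per-group counts and fold them into the category labels. This is a minimal sketch, assuming a long-format frame like the one in the box plot example; the label format "(n=...)" is an illustrative choice, not a Seaborn convention.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the box plot example
rng = np.random.default_rng(42)
lat_east = rng.normal(loc=140, scale=22, size=96)
lat_west = rng.normal(loc=128, scale=18, size=96)
df = pd.DataFrame({
    "latency": np.concatenate([lat_east, lat_west]),
    "region": (["east"] * len(lat_east)) + (["west"] * len(lat_west)),
})

# Number of observations behind each box (groups come back sorted by key)
counts = df.groupby("region")["latency"].size()

# Labels that carry the sample size alongside the category name
labels = [f"{region}\n(n={n})" for region, n in counts.items()]
print(labels)
```

These labels could then be used as tick labels on the categorical axis, provided the order of the boxes matches the sorted group order used here.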
➡️ Next steps
Explore relational and categorical plots in Seaborn Part 3.