Seaborn (Part 2): Statistical plots — histograms and box plots

Continuing from Seaborn Part 1, we focus on plots that quantify distribution and spread rather than pairwise relationships alone.


📚 Prerequisites

  • DataFrame slicing and Seaborn fundamentals.

🎯 What you'll master

  • Compare histograms and KDE overlays for modality and skewness.
  • Interpret whisker conventions in Seaborn box plots responsibly.

Histogram with KDE overlay

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
scores = rng.normal(loc=72, scale=11, size=1800)

fig, ax = plt.subplots()
sns.histplot(scores, kde=True, stat="density", bins=35, ax=ax)
ax.set_title("Exam score distribution")
plt.show()

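stat="density" rescales the bars so that their total area is 1, which puts the histogram on the same scale as a probability density curve and makes its shape directly comparable to the KDE overlay.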
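To make the normalization concrete, the following sketch checks the unit-area property numerically. It uses NumPy's np.histogram with density=True as a stand-in for stat="density" (an assumption made for illustration; Seaborn's internals are not inspected here) and reuses the same synthetic scores.

```python
import numpy as np

# Same synthetic scores as in the histogram example.
rng = np.random.default_rng(7)
scores = rng.normal(loc=72, scale=11, size=1800)

# density=True scales bar heights so that the bar areas
# (height * width) sum to 1, like a probability density.
heights, edges = np.histogram(scores, bins=35, density=True)
widths = np.diff(edges)
total_area = float(np.sum(heights * widths))

print(f"number of bins: {len(heights)}")
print(f"total bar area: {total_area:.6f}")
```

Because the bar areas sum to 1, the bar heights depend on the bin width, so read them as densities rather than counts or percentages.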


Box plots grouped by category

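import numpy as np  # needed for np.random and np.concatenate
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic latency samples for two regions, 96 observations each.
rng = np.random.default_rng(42)
lat_east = rng.normal(loc=140, scale=22, size=96)
lat_west = rng.normal(loc=128, scale=18, size=96)

# Long-format frame: one row per observation, labeled by region.
df = pd.DataFrame({
    "latency": np.concatenate([lat_east, lat_west]),
    "region": (["east"] * len(lat_east)) + (["west"] * len(lat_west)),
})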

fig, ax = plt.subplots()
sns.boxplot(data=df, x="region", y="latency", ax=ax)
plt.show()

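The box edges mark the first and third quartiles, and the line inside the box marks the median. By default (whis=1.5), each whisker extends to the most extreme observation that lies within 1.5 times the interquartile range (IQR) of the box edge; points beyond that range are drawn individually as outliers.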
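The whisker convention can be reproduced by hand. The sketch below assumes the default 1.5 × IQR rule and NumPy's default percentile interpolation, and it recomputes the east-region sample from the same seed. The variable names (lower_fence, whisker_low, and so on) are illustrative and are not part of Seaborn's API.

```python
import numpy as np

# Same seed and parameters as the east-region sample.
rng = np.random.default_rng(42)
latency = rng.normal(loc=140, scale=22, size=96)

# Box: first quartile, median, third quartile.
q1, median, q3 = np.percentile(latency, [25, 50, 75])
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach (assumes whis=1.5).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme observations inside the fences.
inside = latency[(latency >= lower_fence) & (latency <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences would be drawn as an individual point.
outliers = latency[(latency < lower_fence) | (latency > upper_fence)]

print(f"box: {q1:.1f} to {q3:.1f}, median {median:.1f}")
print(f"whiskers: {whisker_low:.1f} to {whisker_high:.1f}")
print(f"points flagged as outliers: {len(outliers)}")
```

Note that the whiskers end at actual data values, not at the fences themselves, so two groups with the same quartiles can still show whiskers of different lengths.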


💡 Key takeaways

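  • Overlay a KDE only when the sample is large enough to support a smooth curve; with sparse data, the curve can show bumps that are artifacts of the smoothing rather than features of the data.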
  • Report sample sizes whenever readers compare categorical boxes.
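One way to act on the sample-size takeaway is to compute per-group counts alongside the medians and fold them into the category labels. This is a minimal sketch that rebuilds the latency frame from the box plot example; the summary and labels names are hypothetical helpers introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the latency frame from the box plot example.
rng = np.random.default_rng(42)
lat_east = rng.normal(loc=140, scale=22, size=96)
lat_west = rng.normal(loc=128, scale=18, size=96)
df = pd.DataFrame({
    "latency": np.concatenate([lat_east, lat_west]),
    "region": (["east"] * len(lat_east)) + (["west"] * len(lat_west)),
})

# Per-region sample size and median, one row per region.
summary = df.groupby("region")["latency"].agg(n="count", median="median")

# Labels such as "east\n(n=96)" that carry the sample size with the category.
labels = [f"{region}\n(n={int(row['n'])})" for region, row in summary.iterrows()]

print(summary)
print(labels)
```

These labels could then replace the default tick labels on the box plot's x-axis, provided their order matches the order in which the boxes are drawn.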

➡️ Next steps

Explore relational and categorical plots in Seaborn Part 3.