[Seaborn](https://seaborn.pydata.org/) is a **high-level data visualization library** built on top of Matplotlib.
Chapter 13: Data Visualization — Seaborn for Statistical Visualization
🌊 Seaborn for Statistical Visualization
Seaborn is designed for statistical graphics and produces polished plots with minimal code.
It integrates tightly with pandas DataFrames, which makes it well suited to exploring and analyzing datasets visually.
⚙️ 1. Installing and Setting Up Seaborn
pip install seaborn
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme(style="whitegrid", palette="muted")
🎨 Seaborn automatically handles color palettes, themes, and statistical aggregation.
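To see what `set_theme` actually changes, the settings can be read back after applying it. This is a minimal sketch: the inspection calls (`sns.axes_style()` and `sns.color_palette()` with no arguments) are not part of the setup above and are used here only for illustration.

```python
import seaborn as sns

# Apply the same theme as in the setup above.
sns.set_theme(style="whitegrid", palette="muted")

# Called with no arguments, these return the currently active settings.
active_style = sns.axes_style()
active_palette = sns.color_palette()

print(active_style["axes.grid"])      # whether the grid is switched on
print(len(active_palette))            # number of colors in the color cycle
print(active_palette.as_hex()[:3])    # first three colors as hex strings
```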
📊 2. Distribution Plots — Understanding Data Spread
Distribution plots show how values are spread across their range: where they cluster, how widely they vary, and whether the shape is skewed or has several peaks.
Example — Histogram with Density Curve
import seaborn as sns
import matplotlib.pyplot as plt
data = [35, 45, 50, 55, 60, 62, 65, 68, 70, 72, 75, 80, 85]
sns.histplot(data, bins=6, kde=True, color="royalblue")
plt.title("Distribution of Values", fontsize=14)
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
💡 `kde=True` overlays a kernel density estimate, a smooth curve that approximates the probability density function (PDF).
🟢 3. Scatter Plots — Visualizing Relationships
Scatter plots show relationships between two continuous variables and optionally group data by categories.
data = sns.load_dataset("iris")
sns.scatterplot(data=data, x="sepal_length", y="sepal_width", hue="species", style="species", s=100)
plt.title("Sepal Dimensions by Species", fontsize=14)
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Sepal Width (cm)")
plt.show()
🎯 Use `hue` for color grouping and `style` or `size` for additional dimensions.
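The sketch below combines all three encodings in one call. The `plants` DataFrame and its column names are hypothetical, made up so the example runs without downloading a dataset.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical measurements, made up for this sketch.
plants = pd.DataFrame({
    "height_cm": [12.0, 15.5, 14.2, 20.1, 22.3, 19.8, 30.4, 28.9, 31.7],
    "leaf_count": [4, 5, 5, 8, 9, 7, 12, 11, 13],
    "variety": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "age_weeks": [2, 3, 3, 5, 6, 5, 9, 8, 9],
})

fig, ax = plt.subplots()
sns.scatterplot(
    data=plants,
    x="height_cm",
    y="leaf_count",
    hue="variety",     # color encodes the category
    style="variety",   # marker shape repeats it, which helps in grayscale
    size="age_weeks",  # marker size encodes a numeric column
    sizes=(40, 200),   # smallest and largest marker size
    ax=ax,
)
ax.set_title("Leaf count vs. height")
# Call plt.show() to display the figure when running as a script.
```

Mapping the same column to both `hue` and `style` is redundant by design: the plot stays readable when printed without color.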
🧩 4. Box and Violin Plots — Comparing Distributions
These plots are ideal for comparing spread, median, and outliers across categories.
Example — Boxplot
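# Since seaborn 0.13, a palette should be paired with an explicit hue variable.
sns.boxplot(data=data, x="species", y="sepal_length", hue="species", palette="Set2", legend=False)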
plt.title("Sepal Length by Species (Boxplot)")
plt.show()
Example — Violin Plot
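# Since seaborn 0.13, a palette should be paired with an explicit hue variable.
sns.violinplot(data=data, x="species", y="sepal_length", hue="species", inner="quartile", palette="pastel", legend=False)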
plt.title("Sepal Length by Species (Violin Plot)")
plt.show()
📦 Boxplots emphasize median & quartiles, while 🎻 violin plots show full distribution density.
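To make the boxplot's summary concrete, the quartiles and the default outlier cutoff (1.5 times the interquartile range beyond the box) can be computed with pandas and compared with the drawing. The `scores` data below is hypothetical, and the fence calculation is a sketch of the usual convention rather than seaborn's internal code.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical scores; group A contains one extreme value (30).
scores = pd.DataFrame({
    "group": ["A"] * 10 + ["B"] * 10,
    "value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 30,
              4, 5, 5, 6, 6, 6, 7, 7, 8, 9],
})

# Quartiles per group: the numbers a boxplot draws as the box and median line.
quartiles = scores.groupby("group")["value"].quantile([0.25, 0.5, 0.75]).unstack()
iqr = quartiles[0.75] - quartiles[0.25]
upper_fence = quartiles[0.75] + 1.5 * iqr  # usual upper whisker limit

# Points beyond the upper fence are drawn as individual outlier markers.
outliers = scores[scores["value"] > scores["group"].map(upper_fence)]
print(quartiles)
print(outliers)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
sns.boxplot(data=scores, x="group", y="value", ax=axes[0])
sns.violinplot(data=scores, x="group", y="value", inner="quartile", ax=axes[1])
axes[0].set_title("Boxplot")
axes[1].set_title("Violin plot")
# Call plt.show() to display the figure when running as a script.
```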
📈 5. Regression and Trend Analysis
Seaborn’s regression functions help visualize linear or non-linear trends in data.
tips = sns.load_dataset("tips")
sns.regplot(data=tips, x="total_bill", y="tip", scatter_kws={"alpha":0.6}, line_kws={"color":"red"})
plt.title("Relationship between Bill and Tip")
plt.show()
`sns.regplot()` fits a linear regression by default and draws the fitted line together with a shaded confidence interval.
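`regplot` draws the fit but does not return its coefficients. If the numbers are needed, one option is to estimate them separately. The sketch below assumes synthetic data with a known trend and uses `np.polyfit` for the least-squares line; that pairing is a choice made for this example, not something seaborn does on your behalf.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data with a known trend: y = 2x + 1 plus noise (assumed for this sketch).
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=200)
df = pd.DataFrame({"x": x, "y": y})

# Estimate slope and intercept with an ordinary least-squares line.
slope, intercept = np.polyfit(df["x"], df["y"], deg=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

fig, ax = plt.subplots()
sns.regplot(data=df, x="x", y="y", ci=95,
            scatter_kws={"alpha": 0.5}, line_kws={"color": "red"}, ax=ax)
ax.set_title(f"Fitted line: y = {slope:.2f}x + {intercept:.2f}")
# Call plt.show() to display the figure when running as a script.
```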
🔗 6. Pair Plots — Exploring Multivariate Relationships
Pair plots visualize all pairwise relationships among the numeric columns of a dataset.
sns.pairplot(data=data, hue="species", diag_kind="kde", palette="husl")
plt.suptitle("Pairwise Relationships — Iris Dataset", y=1.02)
plt.show()
Each diagonal plot shows a univariate distribution, while off-diagonal plots show bivariate relationships.
🔥 7. Heatmaps — Correlation and Matrix Data
Heatmaps are great for visualizing matrix-like data or correlation coefficients.
corr = data.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap="coolwarm", linewidths=0.5)
plt.title("Correlation Heatmap")
plt.show()
Use heatmaps to quickly detect relationships or multicollinearity between variables.
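A correlation matrix is symmetric, so half of a full heatmap is redundant. The following sketch masks the upper triangle and lists the strongly correlated pairs. The data is synthetic and the 0.8 threshold is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic columns: "b" is built from "a", "c" is independent noise.
rng = np.random.default_rng(7)
a = rng.normal(size=300)
table = pd.DataFrame({
    "a": a,
    "b": 2.0 * a + rng.normal(scale=0.3, size=300),
    "c": rng.normal(size=300),
})

corr = table.corr()

# True marks cells to hide: the upper triangle and the diagonal.
mask = np.triu(np.ones_like(corr, dtype=bool))

# List column pairs whose absolute correlation exceeds the threshold.
threshold = 0.8  # arbitrary cutoff for this example
pairs = corr.where(~mask).stack()
strong_pairs = pairs[pairs.abs() > threshold]
print(strong_pairs)

fig, ax = plt.subplots()
sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
            cmap="coolwarm", vmin=-1, vmax=1, linewidths=0.5, ax=ax)
ax.set_title("Correlation heatmap (lower triangle)")
# Call plt.show() to display the figure when running as a script.
```

Fixing `vmin=-1` and `vmax=1` keeps the color scale centered on zero, so a diverging colormap such as "coolwarm" reads correctly.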
🧮 8. Categorical Plots and Facet Grids
Seaborn provides versatile functions for categorical data visualization.
sns.catplot(data=tips, x="day", y="total_bill", kind="box", hue="sex", height=5, aspect=1.2)
plt.title("Boxplot by Day and Gender")
plt.show()
You can also split data into subplots using FacetGrid for comparisons:
g = sns.FacetGrid(tips, col="sex", row="time", hue="smoker", margin_titles=True)
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")
# The legend needs a hue variable to have entries to show.
g.add_legend()
plt.show()
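Figure-level functions such as `catplot` return a `FacetGrid`, so titles and axis labels are usually easier to set through the grid than through `plt`. A minimal sketch, assuming a small hypothetical `bills` table in place of the tips dataset:

```python
import pandas as pd
import seaborn as sns

# Hypothetical bills, made up for this sketch.
bills = pd.DataFrame({
    "day": ["Thu", "Thu", "Fri", "Fri", "Sat", "Sat"] * 4,
    "time": ["Lunch"] * 12 + ["Dinner"] * 12,
    "total_bill": [12.5, 15.0, 18.2, 20.1, 25.3, 30.0,
                   11.0, 14.2, 17.5, 22.4, 27.1, 33.5,
                   16.0, 19.5, 21.0, 24.8, 29.9, 35.2,
                   15.1, 18.3, 23.6, 26.0, 31.4, 38.7],
})

# catplot is figure-level: it returns a FacetGrid rather than a single Axes.
g = sns.catplot(data=bills, x="day", y="total_bill", col="time",
                kind="box", height=4, aspect=1.0)

# Label through the grid so every facet is handled consistently.
g.set_axis_labels("Day", "Total bill")
g.set_titles("{col_name}")
g.figure.suptitle("Total bill by day, split by time", y=1.03)
print(type(g).__name__, g.axes.shape)
```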
🎨 9. Customization and Themes
Change aesthetics globally with:
sns.set_theme(style="darkgrid", context="talk", palette="deep")
| Parameter | Description |
|---|---|
| `style` | Axes background (e.g., "whitegrid", "dark") |
| `context` | Scaling for presentation ("notebook", "talk", "poster") |
| `palette` | Color scheme (e.g., "muted", "deep", "coolwarm") |
Try `sns.color_palette("flare")` or `sns.set_palette("pastel")` for soft, publication-ready colors.
🧭 10. Summary — Common Seaborn Plot Types
| Function | Type | Purpose |
|---|---|---|
| `sns.histplot()` | Distribution | Visualize histogram/density |
| `sns.boxplot()` | Distribution | Compare groups by median & spread |
| `sns.violinplot()` | Distribution | Show data density & quartiles |
| `sns.scatterplot()` | Relationship | Plot two continuous variables |
| `sns.regplot()` | Relationship | Add regression trendline |
| `sns.heatmap()` | Correlation | Visualize matrix or correlations |
| `sns.pairplot()` | Multivariate | Explore variable relationships |
| `sns.catplot()` | Categorical | Multi-faceted group plots |
🧭 Conclusion
Seaborn brings beauty, simplicity, and statistical intelligence to data visualization.
It reduces boilerplate code and automatically handles themes, grouping, and color mapping — allowing you to focus on insights instead of formatting.
“With Seaborn, your data tells its story — elegantly, statistically, and beautifully.”