Introduction to Machine Learning — Linear Regression Example

Published: November 12, 2025 • Language: python • Chapter: 14 • Sub: 3 • Level: beginner


📈 Linear Regression: Predicting Continuous Values

Linear Regression is one of the most fundamental algorithms in machine learning.
It models the relationship between input variables (features) and a continuous output (target) using a linear equation.

It’s the foundation for many advanced algorithms and serves as the perfect starting point for understanding supervised learning.


🧩 1. Concept Overview

The Linear Regression Equation

y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ε

Where:

  • y → Target variable (e.g., house price)
  • x₁ … xₙ → Input features (e.g., rooms, location, income)
  • β₀ → Intercept (baseline value)
  • β₁ … βₙ → Coefficients (effect of each feature)
  • ε → Random error or noise
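To make the equation concrete, here is a minimal sketch with a single invented feature: ordinary least squares recovers the intercept and coefficient from noisy points. The data and the "true" values β₀ = 2.0 and β₁ = 0.5 are made up purely for illustration.

import numpy as np

# Toy data: one feature x and a noisy linear target y (true β₀ = 2.0, β₁ = 0.5)
rng = np.random.default_rng(42)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 0.5 * x + rng.normal(0, 0.1, size=x.shape)

# Ordinary least squares fit of degree 1: polyfit returns [slope, intercept]
beta1, beta0 = np.polyfit(x, y, deg=1)
print(f"Intercept β₀ ≈ {beta0:.2f}, Coefficient β₁ ≈ {beta1:.2f}")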

🏠 2. Example — Predicting California Housing Prices

We’ll use Scikit‑Learn’s California Housing dataset, a modern replacement for the deprecated Boston dataset.

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target, name="MedianHouseValue")

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate performance
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.3f}")
print(f"Mean Absolute Error (MAE): {mae:.3f}")
print(f"R² Score: {r2:.3f}")

📊 3. Visualizing Predictions

Predicted vs Actual Values

plt.figure(figsize=(6, 6))
sns.scatterplot(x=y_test, y=y_pred, alpha=0.6, color='royalblue')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
plt.xlabel("Actual Values")
plt.ylabel("Predicted Values")
plt.title("Predicted vs Actual House Prices")
plt.grid(True, linestyle='--', alpha=0.4)
plt.show()

The closer the points are to the red diagonal line, the better the model’s predictions.

Residual Plot (Error Distribution)

residuals = y_test - y_pred
sns.histplot(residuals, bins=30, kde=True, color="orange")
plt.title("Residual Distribution")
plt.xlabel("Prediction Error (Residual)")
plt.show()

Residuals centered around zero indicate a well‑fitted model.
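As a quick numeric companion to the histogram, you can check that the errors are roughly centered on zero (this reuses the residuals variable computed above):

# A mean near zero and a modest spread back up the visual impression
print(f"Residual mean: {residuals.mean():.3f}")
print(f"Residual std:  {residuals.std():.3f}")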


🧠 4. Understanding Model Coefficients

Each coefficient shows how much the target changes for a one-unit increase in the corresponding feature, holding the other features constant.

coef_df = pd.DataFrame({
    "Feature": X.columns,
    "Coefficient": model.coef_
}).sort_values(by="Coefficient", ascending=False)

print(coef_df)

# Optional: visualize
plt.figure(figsize=(8, 5))
sns.barplot(data=coef_df, x="Coefficient", y="Feature", palette="viridis")
plt.title("Feature Importance (Linear Coefficients)")
plt.show()

Positive coefficients increase the predicted price; negative ones decrease it.


📏 5. Regression Metrics Summary

| Metric | Description | Lower = Better | Function |
| --- | --- | --- | --- |
| MSE (Mean Squared Error) | Average squared difference between predicted and actual values. | Yes | mean_squared_error() |
| RMSE (Root MSE) | Square root of MSE (in same units as target). | Yes | sqrt(mse) |
| MAE (Mean Absolute Error) | Average absolute difference. | Yes | mean_absolute_error() |
| R² (Coefficient of Determination) | Proportion of variance explained by the model (1 = perfect). | No (higher is better) | r2_score() |

Example output:

Mean Squared Error (MSE): 0.54
Mean Absolute Error (MAE): 0.48
R² Score: 0.74

Interpretation:

  • R² = 0.74 → 74% of housing price variation is explained by the model.
  • Low MSE / MAE → model predictions are close to actual values.
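The RMSE row in the table above follows directly from the MSE already computed in Section 2; a minimal sketch:

import numpy as np

# RMSE is the square root of MSE, expressed in the same units as the target
rmse = np.sqrt(mse)
print(f"Root Mean Squared Error (RMSE): {rmse:.3f}")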

🔍 6. Common Pitfalls and Improvements

| Issue | Description | Fix |
| --- | --- | --- |
| Outliers | Extreme data points distort the regression line. | Use robust scalers or remove outliers. |
| Non‑linear relationships | A linear model can't capture curves. | Use polynomial regression or tree‑based models. |
| Feature scaling | Unscaled features affect coefficient magnitude. | Apply StandardScaler. |
| Multicollinearity | Highly correlated features distort coefficients. | Use PCA or remove redundant features. |
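One way to act on the feature-scaling row is a scikit-learn Pipeline that standardizes the inputs before fitting. This is a sketch rather than part of the original walkthrough; it reuses X, X_train, and y_train from Section 2, and the names scaled_model and scaled_coefs are illustrative.

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardize features, then fit linear regression in one pipeline
scaled_model = make_pipeline(StandardScaler(), LinearRegression())
scaled_model.fit(X_train, y_train)

# Each coefficient now reflects the effect of a one-standard-deviation change
# in its feature, making magnitudes comparable across features
scaled_coefs = pd.Series(scaled_model.named_steps["linearregression"].coef_, index=X.columns)
print(scaled_coefs.sort_values(ascending=False))

Note that plain OLS predictions and R² are unchanged by scaling; the benefit here is that coefficient magnitudes become directly comparable.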

🚀 7. Takeaways

  • Linear Regression assumes linearity, independence, and constant variance.
  • It’s interpretable, efficient, and forms the basis of many advanced algorithms.
  • Always visualize results to understand where the model succeeds or fails.
  • Real‑world performance often improves with regularization (Ridge, Lasso), as sketched below.
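As a rough illustration of the regularization point, Ridge and Lasso are drop-in replacements for LinearRegression in scikit-learn; the alpha values below are arbitrary starting points, not tuned recommendations, and the snippet reuses the train/test split from Section 2.

from sklearn.linear_model import Ridge, Lasso

# Regularized variants of linear regression; alpha controls the penalty strength
for name, reg in [("Ridge", Ridge(alpha=1.0)), ("Lasso", Lasso(alpha=0.001))]:
    reg.fit(X_train, y_train)
    print(f"{name} R² on test set: {reg.score(X_test, y_test):.3f}")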

🧭 Conclusion

Linear Regression is not just a basic algorithm — it’s the foundation of predictive modeling.
By applying it to real data (like the California housing dataset), you learn how to train, evaluate, visualize, and interpret ML models end‑to‑end.

“All models are wrong, but some are useful.” — George Box