Guide 09 — Advanced

Multi-Class Outage Cause Prediction with XGBoost & SHAP

Prefer not to install anything? Click the badge above to open this guide as a runnable notebook in Google Colab. Sign in with any Google account, then use Runtime → Run all to execute every cell, or step through them one at a time.

Prerequisite: Complete Guide 01: Outage Prediction first. This guide extends the binary classifier into a multi-class model that predicts the cause of an outage and explains why.

What You Will Learn

In Guide 01, you built a binary classifier: outage or no outage. But utility reliability teams need more—they need to know why an outage is likely. Is it vegetation encroachment during the growing season? Weather-driven during storm season? Equipment failure on aging infrastructure? And what about the significant share of outages with undetermined causes? In this guide you will:

  • Build a multi-class classifier that predicts outage cause codes (vegetation, weather, equipment, animal, overload, and unknown)
  • Use XGBoost instead of Random Forest for improved performance on imbalanced classes
  • Implement a time-aware train/test split that simulates real-world deployment
  • Add asset features (transformer age, kVA ratings) alongside weather
  • Use SHAP values to explain individual predictions—not just global feature importance
  • Benchmark your model against SP&L's annual SAIFI reliability metrics

SP&L Data You Will Use

  • outage_history.csv (load_outage_history()) — feeder-level outage events with cause_code, duration, affected_customers
  • weather_data.csv (load_weather_data()) — 52,608 hourly records with temperature, humidity, wind, storm flags
  • transformers.csv (load_transformers()) — ~21,000 transformers with age, kVA ratings
  • network_edges.csv (load_network_edges()) — conductor segments with impedance data

Additional Libraries

pip install xgboost shap
Step 0: Verify Your Setup

Before starting, verify that your environment is configured correctly. Run this cell first to confirm all dependencies are installed and data files are accessible.

# Step 0: Verify your setup
try:
    import pandas as pd
    import numpy as np
    from demo_data.load_demo_data import load_outage_history
    outages = load_outage_history()
    print(f"Setup OK! Loaded {len(outages):,} outage records.")
except ModuleNotFoundError as e:
    print(f"Missing library: {e}")
    print("Run: pip install -r requirements.txt")
except FileNotFoundError:
    print("Data files not found. Run from the repo root:")
    print("  cd Dynamic-Network-Model && jupyter lab")
Setup OK! Loaded N records.

Working directory: All guides assume your working directory is the repository root (Dynamic-Network-Model/). Start Jupyter Lab from there: cd Dynamic-Network-Model && jupyter lab

Extra dependencies: pip install xgboost shap

Having trouble? Check our Troubleshooting Guide for solutions to common setup and data loading issues.

Step 1: Load and Merge All Data Sources

Unlike Guide 01 where we used only weather data, here we merge weather and asset data to give the model a richer picture of outage drivers.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import xgboost as xgb
import shap
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder
from demo_data.load_demo_data import (
    load_outage_history, load_weather_data,
    load_transformers, load_network_edges
)

# Load all datasets
outages = load_outage_history()
weather = load_weather_data()
transformers = load_transformers()
edges = load_network_edges()

print(f"Outages: {len(outages):,} events")
print(f"Cause codes: {outages['cause_code'].unique()}")
print(f"\nCause code distribution:")
print(outages["cause_code"].value_counts())
Outages: 3,247 events
Cause codes: ['vegetation' 'weather' 'equipment_failure' 'animal_contact' 'overload' 'unknown']

Cause code distribution:
vegetation 987
weather 842
equipment_failure 614
animal_contact 389
overload 247
unknown 168

Why multi-class? A binary model tells you "an outage might happen." A multi-class model tells you "an outage might happen due to vegetation." This lets crews prepare the right response: tree trimming crews for vegetation, line patrol for equipment, or storm staging for weather-driven events.

Step 2: Build Enriched Feature Set

We combine daily weather summaries with asset condition data for each outage's feeder. This gives the model both environmental and infrastructure context. Note that we keep all cause codes including “unknown”—in real utility data, a significant portion of outages have undetermined causes, and the model should learn to recognize these patterns rather than ignoring them.

# Create daily weather features (same as Guide 01)
weather["date"] = weather["timestamp"].dt.date
daily_weather = weather.groupby("date").agg({
    "temperature": ["max", "min", "mean"],
    "wind_speed": ["max", "mean"],
    "is_storm": "max",
    "humidity": "mean",
}).reset_index()
daily_weather.columns = ["date", "temp_max", "temp_min", "temp_mean",
                         "wind_max", "wind_mean", "is_storm", "humidity_mean"]

# Add asset features per feeder
feeder_assets = transformers.groupby("feeder_id").agg({
    "age_years": "mean",
    "kva_rating": "sum",
}).reset_index()
feeder_assets.columns = ["feeder_id", "avg_asset_age", "total_kva"]

# Build the training table: one row per outage event
# (convert to datetime first, before any .dt accessor use)
outages["fault_detected"] = pd.to_datetime(outages["fault_detected"])
outages["date"] = outages["fault_detected"].dt.date
df = outages.merge(daily_weather, on="date", how="left")
df = df.merge(feeder_assets, on="feeder_id", how="left")

# Add time features
df["month"] = df["fault_detected"].dt.month
df["hour"] = df["fault_detected"].dt.hour
df["day_of_week"] = df["fault_detected"].dt.dayofweek
df["is_summer"] = df["month"].isin([6, 7, 8]).astype(int)
df["is_storm_season"] = df["month"].isin([3, 4, 5, 6]).astype(int)

# Drop rows with missing weather data (keep all cause codes including unknown)
df = df.dropna(subset=["temp_max"])
print(f"Training samples: {len(df):,}")
print(f"\nFeatures built per event:")
print(f"  Weather: 7 | Asset: 2 | Calendar: 5")
Step 3: Time-Aware Train/Test Split

In Guide 01, we used a random split. But in production, your model always predicts the future based on the past. A time-aware split is more honest: train on 2020–2023 data, test on 2024–2025.

# Define features and target
feature_cols = [
    "temp_max", "temp_min", "temp_mean", "wind_max", "wind_mean",
    "is_storm", "humidity_mean", "avg_asset_age", "total_kva",
    "month", "hour", "day_of_week", "is_summer", "is_storm_season"
]

# Encode cause codes as integers
le = LabelEncoder()
df["cause_label"] = le.fit_transform(df["cause_code"])
cause_names = le.classes_

# Time-aware split: train on 2020-2023, test on 2024-2025
train_mask = df["fault_detected"].dt.year <= 2023
test_mask = df["fault_detected"].dt.year >= 2024

X_train = df.loc[train_mask, feature_cols]
y_train = df.loc[train_mask, "cause_label"]
X_test = df.loc[test_mask, feature_cols]
y_test = df.loc[test_mask, "cause_label"]

print(f"Training set: {len(X_train):,} events (2020-2023)")
print(f"Test set: {len(X_test):,} events (2024-2025)")
print(f"Classes: {list(cause_names)}")

Why not random split? Random splitting lets future data "leak" into the training set. If your model trains on a July 2024 storm and tests on a June 2024 event, it has an unfair advantage. Time-aware splitting gives you honest performance estimates that reflect how the model will actually perform when deployed.
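The same principle generalizes from a single split to cross-validation. As a minimal illustration (using scikit-learn's TimeSeriesSplit on a toy array, not part of this guide's pipeline), every fold trains only on indices that come before its test indices:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Ten events ordered oldest (index 0) to newest (index 9)
X = np.arange(10).reshape(-1, 1)

# Each fold trains strictly on the past and tests on the future,
# so no future information leaks into training
tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    print(f"train={train_idx.tolist()}  test={test_idx.tolist()}")
```

Each successive fold extends the training window forward in time, mirroring how a deployed model is retrained as new outage seasons accumulate.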

Step 4: Train XGBoost Multi-Class Classifier

XGBoost (Extreme Gradient Boosting) builds trees sequentially, where each new tree corrects the mistakes of the previous ones. It typically outperforms Random Forest on structured data, especially with class imbalance.

# Calculate class weights for imbalanced data
class_counts = y_train.value_counts().sort_index()
total = len(y_train)
n_classes = len(class_counts)
class_weights = {i: total / (n_classes * count) for i, count in class_counts.items()}

# Assign sample weights
sample_weights = y_train.map(class_weights)

# Train XGBoost
model = xgb.XGBClassifier(
    n_estimators=300,
    max_depth=6,
    learning_rate=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    objective="multi:softprob",
    num_class=n_classes,
    random_state=42,
    eval_metric="mlogloss",
)
model.fit(
    X_train, y_train,
    sample_weight=sample_weights,
    eval_set=[(X_test, y_test)],
    verbose=50
)
print("Training complete.")

A note on eval_set: We pass (X_test, y_test) as the evaluation set so you can watch the model’s loss decrease during training. Be aware that this means the training procedure can “see” test set performance—if you add early_stopping_rounds, the model will use the test set to decide when to stop, which is a mild form of information leakage. In production pipelines, best practice is to create a separate validation set (e.g., a 2023-only holdout) for monitoring and early stopping, and reserve the true test set (2024–2025) for final evaluation only. For this tutorial, the impact is minimal since we are not tuning hyperparameters against the eval set, but it is an important distinction to understand.
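A sketch of that recommended three-way split, assuming the same df and fault_detected column as above: carve 2023 out of the training years as a validation set, and pass that slice, not the test set, to eval_set.

```python
import pandas as pd

# Hypothetical stand-in for the guide's df: one event per week, 2020-2025
df = pd.DataFrame({"fault_detected": pd.date_range("2020-01-01", "2025-06-30", freq="7D")})

year = df["fault_detected"].dt.year
train_mask = year <= 2022   # fit on 2020-2022
val_mask   = year == 2023   # monitor loss / early stopping here only
test_mask  = year >= 2024   # final evaluation, never seen during training

# With XGBoost you would then fit with the validation slice, e.g.
# model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
print(train_mask.sum(), val_mask.sum(), test_mask.sum())
```

The three masks are disjoint and cover every event, so no row influences both model selection and final evaluation.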

XGBoost vs Random Forest: Random Forest builds trees independently (in parallel). XGBoost builds trees sequentially, with each tree focusing on the mistakes of the ensemble so far. The learning_rate controls how aggressively each tree corrects errors. Lower rates (0.01–0.1) with more trees generally give better results but take longer to train.

Step 5: Evaluate Multi-Class Performance

# Predict on test set
y_pred = model.predict(X_test)

# Classification report with cause code names
print(classification_report(y_test, y_pred, target_names=cause_names))

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(8, 7))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=cause_names, yticklabels=cause_names, ax=ax)
ax.set_xlabel("Predicted Cause")
ax.set_ylabel("Actual Cause")
ax.set_title("Multi-Class Confusion Matrix: Outage Cause Prediction")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()

Look at the confusion matrix to understand where the model gets confused. Weather and vegetation outages are often the hardest to distinguish because storms can cause both tree-contact and direct wind/lightning damage.
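To quantify that, you can rank the off-diagonal cells of the confusion matrix. A minimal sketch with a made-up 3x3 matrix (the counts here are illustrative, not SP&L results):

```python
import numpy as np

# Illustrative confusion matrix (rows = actual, cols = predicted)
causes = ["vegetation", "weather", "equipment_failure"]
cm = np.array([[80, 15, 5],
               [20, 70, 10],
               [4, 6, 90]])

# Zero the diagonal so only misclassifications remain, then rank them
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
pairs = [(causes[i], causes[j], int(off_diag[i, j]))
         for i in range(len(causes)) for j in range(len(causes)) if i != j]
for actual, pred, n in sorted(pairs, key=lambda t: -t[2])[:3]:
    print(f"{actual} predicted as {pred}: {n} events")
```

Run against your own cm from the cell above (with all six cause names), this surfaces the worst mix-ups, such as weather events labeled as vegetation, as a ranked list.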

Step 6: Explain Predictions with SHAP

Feature importance tells you which features matter globally. SHAP (SHapley Additive exPlanations) goes further: it tells you how much each feature contributed to a specific prediction and in which direction.

# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Older SHAP versions return a list with one array per class; newer versions
# return a single (samples, features, classes) array. Normalize to a list.
if not isinstance(shap_values, list):
    shap_values = [shap_values[:, :, i] for i in range(len(cause_names))]

# Summary plot: how each feature affects each class
fig, axes = plt.subplots(1, len(cause_names), figsize=(20, 6))
for i, cause in enumerate(cause_names):
    plt.sca(axes[i])
    shap.summary_plot(shap_values[i], X_test, feature_names=feature_cols,
                      show=False, max_display=8)
    axes[i].set_title(f"{cause}")
plt.tight_layout()
plt.show()

Reading SHAP plots: Each dot is one prediction. The x-axis shows the SHAP value (positive = pushes toward this class, negative = pushes away). The color shows the feature value (red = high, blue = low). For example, if you see high wind_max values (red dots) pushed to the right for the "weather" class, it means high wind makes the model more confident the outage is weather-driven.

Step 7: Explain Individual Predictions

SHAP's real power is explaining individual events. Pick a specific outage and see exactly why the model predicted its cause.

# Pick a specific outage event from the test set
event_idx = 0
event = X_test.iloc[event_idx]
actual = cause_names[y_test.iloc[event_idx]]
predicted = cause_names[y_pred[event_idx]]

print(f"Event details:")
print(f"  Actual cause: {actual}")
print(f"  Predicted cause: {predicted}")
print(f"  Wind max: {event['wind_max']:.1f} mph")
print(f"  Storm flag: {event['is_storm']}")
print(f"  Avg asset age: {event['avg_asset_age']:.0f} years")

# Waterfall plot for this prediction
predicted_class = y_pred[event_idx]
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[predicted_class][event_idx],
        base_values=explainer.expected_value[predicted_class],
        data=event.values,
        feature_names=feature_cols
    )
)

The waterfall plot shows how the model built its prediction step by step. Each bar shows one feature's contribution. Red bars push toward the predicted class; blue bars push against it. The final value is the model's log-odds for that class.
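Because the objective is multi:softprob, XGBoost turns those per-class log-odds (margins) into probabilities with a softmax. A quick sketch with made-up margins for the six classes:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical per-class margins for one event (one value per cause code)
margins = np.array([1.8, 0.4, -0.3, -1.1, -0.9, -1.5])
probs = softmax(margins)
print(probs.round(3), "sum =", probs.sum().round(3))
```

The waterfall plot's final value is the margin for the predicted class; the larger that margin is relative to the other classes, the higher the softmax probability the model assigns to it.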

Step 8: Benchmark Against SAIFI Metrics

How does your model's predicted outage distribution compare to SP&L's actual reliability metrics? This bridges the gap between ML model accuracy and real utility KPIs.

# Compute SAIFI from outage_history (no separate reliability file needed)
total_customers = 48_000  # SP&L total customers served
annual_saifi = (outages.groupby(outages["fault_detected"].dt.year)["affected_customers"]
                .sum() / total_customers)
print("Annual SAIFI (computed from outage_history):")
print(annual_saifi)

# Compare predicted vs actual cause distribution for test years
test_outages = df[test_mask]
actual_dist = test_outages["cause_code"].value_counts(normalize=True)
pred_causes = pd.Series(cause_names[y_pred])
pred_dist = pred_causes.value_counts(normalize=True)

comparison = pd.DataFrame({
    "Actual %": (actual_dist * 100).round(1),
    "Predicted %": (pred_dist * 100).round(1)
})
print("\nCause distribution comparison (test set):")
print(comparison)

# Plot side-by-side
fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(comparison))
width = 0.35
ax.bar(x - width/2, comparison["Actual %"], width, label="Actual", color="#2D6A7A")
ax.bar(x + width/2, comparison["Predicted %"], width, label="Predicted", color="#5FCCDB")
ax.set_xticks(x)
ax.set_xticklabels(comparison.index, rotation=45, ha="right")
ax.set_ylabel("Percentage of Outages")
ax.set_title("Predicted vs Actual Outage Cause Distribution (2024-2025)")
ax.legend()
plt.tight_layout()
plt.show()
Step 9: Seasonal Cause Analysis

Outage causes vary by season. Vegetation peaks in spring/summer during the growing season. Weather outages spike during storm season. Equipment failures may increase in extreme heat. Let's validate that the model captures these patterns.

# Predicted causes by month
test_outages_with_pred = test_outages.copy()
test_outages_with_pred["predicted_cause"] = cause_names[y_pred]

monthly_causes = test_outages_with_pred.groupby(
    ["month", "predicted_cause"]
).size().unstack(fill_value=0)

fig, ax = plt.subplots(figsize=(12, 6))
monthly_causes.plot(kind="bar", stacked=True, ax=ax, colormap="Set2")
ax.set_xlabel("Month")
ax.set_ylabel("Predicted Outage Count")
ax.set_title("Predicted Outage Causes by Month")
ax.legend(title="Cause", bbox_to_anchor=(1.05, 1))
plt.tight_layout()
plt.show()

Reproducibility and Model Persistence

# For reproducible results, set random seeds at the top of your notebook
import joblib

np.random.seed(42)

# Save the trained model for later use
joblib.dump(model, "outage_cause_model.pkl")

# Load it back
model = joblib.load("outage_cause_model.pkl")

Why these hyperparameters? n_estimators=300, max_depth=6, learning_rate=0.1 are standard starting points for tree-based models. More trees (300 vs the default 100) compensate for the low learning rate, while max_depth=6 limits tree complexity to prevent overfitting on our modest dataset. Setting np.random.seed(42) at the top of your notebook ensures that data shuffling, train/test splits, and model initialization produce identical results every run.

What You Built and Next Steps

  1. Merged weather and asset data into a rich feature set
  2. Built a multi-class XGBoost classifier that predicts outage cause codes
  3. Used time-aware splitting to honestly evaluate forward-looking performance
  4. Applied SHAP to explain both global patterns and individual predictions
  5. Benchmarked model predictions against SP&L's reliability metrics
  6. Analyzed seasonal variation in predicted outage causes

Ideas to Try Next

  • Temporal Convolutional Networks: Replace XGBoost with a TCN to capture sequence patterns in time-ordered outage data
  • Hyperparameter tuning: Use optuna or sklearn.model_selection.RandomizedSearchCV to optimize XGBoost parameters
  • Storm features: Engineer additional storm-related features from weather["is_storm"] such as consecutive storm hours or storm intensity windows
  • Spatial features: Add feeder topology features (length, number of taps, rural vs urban)
  • Probability calibration: Use CalibratedClassifierCV to ensure predicted probabilities reflect true likelihoods
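As a starting point for the hyperparameter-tuning idea, here is a minimal RandomizedSearchCV sketch. To keep it dependency-free it uses scikit-learn's GradientBoostingClassifier on toy data as a stand-in; you would swap in xgb.XGBClassifier and your own X_train/y_train, and the parameter ranges shown are illustrative, not tuned recommendations.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy multi-class data standing in for the guide's X_train / y_train
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=42)

# Illustrative search ranges; the same parameter names apply to XGBoost
param_dist = {
    "n_estimators": randint(100, 400),
    "max_depth": randint(3, 8),
    "learning_rate": uniform(0.01, 0.19),   # samples from [0.01, 0.20]
    "subsample": uniform(0.6, 0.4),         # samples from [0.6, 1.0]
}
search = RandomizedSearchCV(GradientBoostingClassifier(random_state=42),
                            param_dist, n_iter=5, cv=3, random_state=42)
search.fit(X, y)
print(search.best_params_)
```

For a time-aware search on the outage data, pass cv=TimeSeriesSplit(...) instead of the default shuffled folds so tuning respects chronology too.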

Key Terms Glossary

  • Multi-class classification — predicting one of several categories (not just binary yes/no)
  • XGBoost — Extreme Gradient Boosting; builds trees sequentially to correct errors
  • SHAP values — SHapley Additive exPlanations; a game-theory approach to explain individual predictions
  • Time-aware split — using past data for training and future data for testing to simulate deployment
  • Class imbalance — when some categories have far fewer examples than others
  • SAIFI — System Average Interruption Frequency Index; average number of interruptions per customer per year
  • Feature engineering — creating informative input variables from raw data