Guide 09 — Advanced

Multi-Class Outage Cause Prediction with XGBoost & SHAP

Prefer not to install anything? Click the badge above to open this guide as a runnable notebook in Google Colab. Sign in with any Google account, then use Runtime → Run all to execute every cell, or step through them one at a time.

Prerequisite: Complete Guide 01: Outage Prediction first. This guide extends the binary classifier into a multi-class model that predicts the cause of an outage and explains why.

What You Will Learn

In Guide 01, you built a binary classifier: outage or no outage. But utility reliability teams need more—they need to know why an outage is likely. Is it vegetation encroachment during the growing season? Weather-driven during storm season? Equipment failure on aging infrastructure? And what about the significant share of outages with undetermined causes? In this guide you will:

  • Build a multi-class classifier that predicts outage cause codes (vegetation, weather, equipment, animal, overload, and unknown)
  • Use XGBoost instead of Random Forest for improved performance on imbalanced classes
  • Implement a time-aware train/test split that simulates real-world deployment
  • Add asset features (transformer age, kVA ratings) alongside weather
  • Use SHAP values to explain individual predictions—not just global feature importance
  • Benchmark your model against SP&L's annual SAIFI reliability metrics

SP&L Data You Will Use

  • outage_history.csv (load_outage_history()) — feeder-level outage events with cause_code, duration, affected_customers
  • weather_data.csv (load_weather_data()) — 52,608 hourly records with temperature, humidity, wind, storm flags
  • transformers.csv (load_transformers()) — ~21,000 transformers with age, kVA ratings
  • network_edges.csv (load_network_edges()) — conductor segments with impedance data

Additional Libraries

pip install xgboost shap
Step 0: Verify Your Setup

Before starting, verify that your environment is configured correctly. Run this cell first to confirm all dependencies are installed and data files are accessible.

# Step 0: Verify your setup
try:
    import pandas as pd
    import numpy as np
    from demo_data.load_demo_data import load_outage_history
    outages = load_outage_history()
    print(f"Setup OK! Loaded {len(outages):,} outage records.")
except ModuleNotFoundError as e:
    print(f"Missing library: {e}")
    print("Run: pip install -r requirements.txt")
except FileNotFoundError:
    print("Data files not found. Run from the repo root:")
    print("  cd Dynamic-Network-Model && jupyter lab")
Setup OK! Loaded N records.

Working directory: All guides assume your working directory is the repository root (Dynamic-Network-Model/). Start Jupyter Lab from there: cd Dynamic-Network-Model && jupyter lab

Extra dependencies: pip install xgboost shap

Having trouble? Check our Troubleshooting Guide for solutions to common setup and data loading issues.

Step 1: Load and Merge All Data Sources

Unlike Guide 01 where we used only weather data, here we merge weather and asset data to give the model a richer picture of outage drivers.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import xgboost as xgb
import shap
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder
from demo_data.load_demo_data import (
    load_outage_history, load_weather_data,
    load_transformers, load_network_edges
)

# Load all datasets
outages = load_outage_history()
weather = load_weather_data()
transformers = load_transformers()
edges = load_network_edges()

print(f"Outages: {len(outages):,} events")
print(f"Cause codes: {outages['cause_code'].unique()}")
print(f"\nCause code distribution:")
print(outages["cause_code"].value_counts())
Outages: 3,247 events
Cause codes: ['vegetation' 'weather' 'equipment_failure' 'animal_contact' 'overload' 'unknown']

Cause code distribution:
vegetation 987
weather 842
equipment_failure 614
animal_contact 389
overload 247
unknown 168

Why multi-class? A binary model tells you "an outage might happen." A multi-class model tells you "an outage might happen due to vegetation." This lets crews prepare the right response: tree trimming crews for vegetation, line patrol for equipment, or storm staging for weather-driven events.

Step 2: Build Enriched Feature Set

We combine daily weather summaries with asset condition data for each outage's feeder. This gives the model both environmental and infrastructure context. Note that we keep all cause codes including “unknown”—in real utility data, a significant portion of outages have undetermined causes, and the model should learn to recognize these patterns rather than ignoring them.

# Create daily weather features (same as Guide 01)
weather["date"] = weather["timestamp"].dt.date
daily_weather = weather.groupby("date").agg({
    "temperature": ["max", "min", "mean"],
    "wind_speed": ["max", "mean"],
    "is_storm": "max",
    "humidity": "mean",
}).reset_index()
daily_weather.columns = ["date", "temp_max", "temp_min", "temp_mean",
                         "wind_max", "wind_mean", "is_storm", "humidity_mean"]

# Add asset features per feeder
feeder_assets = transformers.groupby("feeder_id").agg({
    "age_years": "mean",
    "kva_rating": "sum",
}).reset_index()
feeder_assets.columns = ["feeder_id", "avg_asset_age", "total_kva"]

# Build the training table: one row per outage event
# (convert to datetime first, before any .dt accessor use)
outages["fault_detected"] = pd.to_datetime(outages["fault_detected"])
outages["date"] = outages["fault_detected"].dt.date
df = outages.merge(daily_weather, on="date", how="left")
df = df.merge(feeder_assets, on="feeder_id", how="left")

# Add time features
df["month"] = df["fault_detected"].dt.month
df["hour"] = df["fault_detected"].dt.hour
df["day_of_week"] = df["fault_detected"].dt.dayofweek
df["is_summer"] = df["month"].isin([6, 7, 8]).astype(int)
df["is_storm_season"] = df["month"].isin([3, 4, 5, 6]).astype(int)

# Drop rows with missing weather data (keep all cause codes including unknown)
df = df.dropna(subset=["temp_max"])
print(f"Training samples: {len(df):,}")
print(f"\nFeatures built per event:")
print(f"  Weather: 7 | Asset: 2 | Calendar: 5")
Step 3: Time-Aware Train/Test Split

In Guide 01, we used a random split. But in production, your model always predicts the future based on the past. A time-aware split is more honest: train on 2020–2023 data, test on 2024–2025.

# Define features and target
feature_cols = [
    "temp_max", "temp_min", "temp_mean", "wind_max", "wind_mean",
    "is_storm", "humidity_mean", "avg_asset_age", "total_kva",
    "month", "hour", "day_of_week", "is_summer", "is_storm_season"
]

# Encode cause codes as integers
le = LabelEncoder()
df["cause_label"] = le.fit_transform(df["cause_code"])
cause_names = le.classes_

# Time-aware split: train on 2020-2023, test on 2024-2025
train_mask = df["fault_detected"].dt.year <= 2023
test_mask = df["fault_detected"].dt.year >= 2024

X_train = df.loc[train_mask, feature_cols]
y_train = df.loc[train_mask, "cause_label"]
X_test = df.loc[test_mask, feature_cols]
y_test = df.loc[test_mask, "cause_label"]

print(f"Training set: {len(X_train):,} events (2020-2023)")
print(f"Test set: {len(X_test):,} events (2024-2025)")
print(f"Classes: {list(cause_names)}")

Why not random split? Random splitting lets future data "leak" into the training set. If your model trains on a July 2024 storm and tests on a June 2024 event, it has an unfair advantage. Time-aware splitting gives you honest performance estimates that reflect how the model will actually perform when deployed.
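The same principle generalizes from a single split to cross-validation. As a minimal illustration (using scikit-learn's TimeSeriesSplit on a toy array, not part of this guide's pipeline), every fold trains only on indices that come before its test indices:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Ten events ordered oldest (index 0) to newest (index 9)
X = np.arange(10).reshape(-1, 1)

# Each fold trains strictly on the past and tests on the future,
# so no future information leaks into training
tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    print(f"train={train_idx.tolist()}  test={test_idx.tolist()}")
```

Each successive fold extends the training window forward in time, mirroring how a deployed model is retrained as new outage seasons accumulate.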

Step 4: Train XGBoost Multi-Class Classifier

XGBoost (Extreme Gradient Boosting) builds trees sequentially, where each new tree corrects the mistakes of the previous ones. It typically outperforms Random Forest on structured data, especially with class imbalance.

# Calculate class weights for imbalanced data
class_counts = y_train.value_counts().sort_index()
total = len(y_train)
n_classes = len(class_counts)
class_weights = {i: total / (n_classes * count) for i, count in class_counts.items()}

# Assign sample weights
sample_weights = y_train.map(class_weights)

# Train XGBoost
model = xgb.XGBClassifier(
    n_estimators=300,
    max_depth=6,
    learning_rate=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    objective="multi:softprob",
    num_class=n_classes,
    random_state=42,
    eval_metric="mlogloss",
)
model.fit(
    X_train, y_train,
    sample_weight=sample_weights,
    eval_set=[(X_test, y_test)],
    verbose=50
)
print("Training complete.")

A note on eval_set: We pass (X_test, y_test) as the evaluation set so you can watch the model’s loss decrease during training. Be aware that this means the training procedure can “see” test set performance—if you add early_stopping_rounds, the model will use the test set to decide when to stop, which is a mild form of information leakage. In production pipelines, best practice is to create a separate validation set (e.g., a 2023-only holdout) for monitoring and early stopping, and reserve the true test set (2024–2025) for final evaluation only. For this tutorial, the impact is minimal since we are not tuning hyperparameters against the eval set, but it is an important distinction to understand.
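A sketch of that recommended three-way split, assuming the same df and fault_detected column as above: carve 2023 out of the training years as a validation set, and pass that slice, not the test set, to eval_set.

```python
import pandas as pd

# Hypothetical stand-in for the guide's df: one event per week, 2020-2025
df = pd.DataFrame({"fault_detected": pd.date_range("2020-01-01", "2025-06-30", freq="7D")})

year = df["fault_detected"].dt.year
train_mask = year <= 2022   # fit on 2020-2022
val_mask   = year == 2023   # monitor loss / early stopping here only
test_mask  = year >= 2024   # final evaluation, never seen during training

# With XGBoost you would then fit with the validation slice, e.g.
# model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
print(train_mask.sum(), val_mask.sum(), test_mask.sum())
```

The three masks are disjoint and cover every event, so no row influences both model selection and final evaluation.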

XGBoost vs Random Forest: Random Forest builds trees independently (in parallel). XGBoost builds trees sequentially, with each tree focusing on the mistakes of the ensemble so far. The learning_rate controls how aggressively each tree corrects errors. Lower rates (0.01–0.1) with more trees generally give better results but take longer to train.

Step 5: Evaluate Multi-Class Performance

# Predict on test set
y_pred = model.predict(X_test)

# Classification report with cause code names
print(classification_report(y_test, y_pred, target_names=cause_names))

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(8, 7))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=cause_names, yticklabels=cause_names, ax=ax)
ax.set_xlabel("Predicted Cause")
ax.set_ylabel("Actual Cause")
ax.set_title("Multi-Class Confusion Matrix: Outage Cause Prediction")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()

Look at the confusion matrix to understand where the model gets confused. Weather and vegetation outages are often the hardest to distinguish because storms can cause both tree-contact and direct wind/lightning damage.
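To quantify that, you can rank the off-diagonal cells of the confusion matrix. A minimal sketch with a made-up 3x3 matrix (the counts here are illustrative, not SP&L results):

```python
import numpy as np

# Illustrative confusion matrix (rows = actual, cols = predicted)
causes = ["vegetation", "weather", "equipment_failure"]
cm = np.array([[80, 15, 5],
               [20, 70, 10],
               [4, 6, 90]])

# Zero the diagonal so only misclassifications remain, then rank them
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
pairs = [(causes[i], causes[j], int(off_diag[i, j]))
         for i in range(len(causes)) for j in range(len(causes)) if i != j]
for actual, pred, n in sorted(pairs, key=lambda t: -t[2])[:3]:
    print(f"{actual} predicted as {pred}: {n} events")
```

Run against your own cm from the cell above (with all six cause names), this surfaces the worst mix-ups, such as weather events labeled as vegetation, as a ranked list.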

Step 6: Explain Predictions with SHAP

Feature importance tells you which features matter globally. SHAP (SHapley Additive exPlanations) goes further: it tells you how much each feature contributed to a specific prediction and in which direction.

# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Older SHAP versions return a list with one array per class; newer versions
# return a single (samples, features, classes) array. Normalize to a list.
if not isinstance(shap_values, list):
    shap_values = [shap_values[:, :, i] for i in range(len(cause_names))]

# Summary plot: how each feature affects each class
fig, axes = plt.subplots(1, len(cause_names), figsize=(20, 6))
for i, cause in enumerate(cause_names):
    plt.sca(axes[i])
    shap.summary_plot(shap_values[i], X_test, feature_names=feature_cols,
                      show=False, max_display=8)
    axes[i].set_title(f"{cause}")
plt.tight_layout()
plt.show()

Reading SHAP plots: Each dot is one prediction. The x-axis shows the SHAP value (positive = pushes toward this class, negative = pushes away). The color shows the feature value (red = high, blue = low). For example, if you see high wind_max values (red dots) pushed to the right for the "weather" class, it means high wind makes the model more confident the outage is weather-driven.

Step 7: Explain Individual Predictions

SHAP's real power is explaining individual events. Pick a specific outage and see exactly why the model predicted its cause.

# Pick a specific outage event from the test set
event_idx = 0
event = X_test.iloc[event_idx]
actual = cause_names[y_test.iloc[event_idx]]
predicted = cause_names[y_pred[event_idx]]

print(f"Event details:")
print(f"  Actual cause: {actual}")
print(f"  Predicted cause: {predicted}")
print(f"  Wind max: {event['wind_max']:.1f} mph")
print(f"  Storm flag: {event['is_storm']}")
print(f"  Avg asset age: {event['avg_asset_age']:.0f} years")

# Waterfall plot for this prediction
predicted_class = y_pred[event_idx]
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[predicted_class][event_idx],
        base_values=explainer.expected_value[predicted_class],
        data=event.values,
        feature_names=feature_cols
    )
)

The waterfall plot shows how the model built its prediction step by step. Each bar shows one feature's contribution. Red bars push toward the predicted class; blue bars push against it. The final value is the model's log-odds for that class.
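Because the objective is multi:softprob, XGBoost turns those per-class log-odds (margins) into probabilities with a softmax. A quick sketch with made-up margins for the six classes:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical per-class margins for one event (one value per cause code)
margins = np.array([1.8, 0.4, -0.3, -1.1, -0.9, -1.5])
probs = softmax(margins)
print(probs.round(3), "sum =", probs.sum().round(3))
```

The waterfall plot's final value is the margin for the predicted class; the larger that margin is relative to the other classes, the higher the softmax probability the model assigns to it.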

Step 8: Benchmark Against SAIFI Metrics

How does your model's predicted outage distribution compare to SP&L's actual reliability metrics? This bridges the gap between ML model accuracy and real utility KPIs.

# Compute SAIFI from outage_history (no separate reliability file needed)
total_customers = 48_000  # SP&L total customers served
annual_saifi = (outages.groupby(outages["fault_detected"].dt.year)["affected_customers"]
                .sum() / total_customers)
print("Annual SAIFI (computed from outage_history):")
print(annual_saifi)

# Compare predicted vs actual cause distribution for test years
test_outages = df[test_mask]
actual_dist = test_outages["cause_code"].value_counts(normalize=True)
pred_causes = pd.Series(cause_names[y_pred])
pred_dist = pred_causes.value_counts(normalize=True)

comparison = pd.DataFrame({
    "Actual %": (actual_dist * 100).round(1),
    "Predicted %": (pred_dist * 100).round(1)
})
print("\nCause distribution comparison (test set):")
print(comparison)

# Plot side-by-side
fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(comparison))
width = 0.35
ax.bar(x - width/2, comparison["Actual %"], width, label="Actual", color="#2D6A7A")
ax.bar(x + width/2, comparison["Predicted %"], width, label="Predicted", color="#5FCCDB")
ax.set_xticks(x)
ax.set_xticklabels(comparison.index, rotation=45, ha="right")
ax.set_ylabel("Percentage of Outages")
ax.set_title("Predicted vs Actual Outage Cause Distribution (2024-2025)")
ax.legend()
plt.tight_layout()
plt.show()
Step 9: Seasonal Cause Analysis

Outage causes vary by season. Vegetation peaks in spring/summer during the growing season. Weather outages spike during storm season. Equipment failures may increase in extreme heat. Let's validate that the model captures these patterns.

# Predicted causes by month
test_outages_with_pred = test_outages.copy()
test_outages_with_pred["predicted_cause"] = cause_names[y_pred]

monthly_causes = test_outages_with_pred.groupby(
    ["month", "predicted_cause"]
).size().unstack(fill_value=0)

fig, ax = plt.subplots(figsize=(12, 6))
monthly_causes.plot(kind="bar", stacked=True, ax=ax, colormap="Set2")
ax.set_xlabel("Month")
ax.set_ylabel("Predicted Outage Count")
ax.set_title("Predicted Outage Causes by Month")
ax.legend(title="Cause", bbox_to_anchor=(1.05, 1))
plt.tight_layout()
plt.show()

Reproducibility and Model Persistence

# For reproducible results, set random seeds at the top of your notebook
import joblib

np.random.seed(42)

# Save the trained model for later use
joblib.dump(model, "outage_cause_model.pkl")

# Load it back
model = joblib.load("outage_cause_model.pkl")

Why these hyperparameters? n_estimators=300, max_depth=6, learning_rate=0.1 are standard starting points for tree-based models. More trees (300 vs the default 100) compensate for the low learning rate, while max_depth=6 limits tree complexity to prevent overfitting on our modest dataset. Setting np.random.seed(42) at the top of your notebook ensures that data shuffling, train/test splits, and model initialization produce identical results every run.

What You Built and Next Steps

  1. Merged weather and asset data into a rich feature set
  2. Built a multi-class XGBoost classifier that predicts outage cause codes
  3. Used time-aware splitting to honestly evaluate forward-looking performance
  4. Applied SHAP to explain both global patterns and individual predictions
  5. Benchmarked model predictions against SP&L's reliability metrics
  6. Analyzed seasonal variation in predicted outage causes

Ideas to Try Next

  • Temporal Convolutional Networks: Replace XGBoost with a TCN to capture sequence patterns in time-ordered outage data
  • Hyperparameter tuning: Use optuna or sklearn.model_selection.RandomizedSearchCV to optimize XGBoost parameters
  • Storm features: Engineer additional storm-related features from weather["is_storm"] such as consecutive storm hours or storm intensity windows
  • Spatial features: Add feeder topology features (length, number of taps, rural vs urban)
  • Probability calibration: Use CalibratedClassifierCV to ensure predicted probabilities reflect true likelihoods
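As a starting point for the hyperparameter-tuning idea, here is a minimal RandomizedSearchCV sketch. To keep it dependency-free it uses scikit-learn's GradientBoostingClassifier on toy data as a stand-in; you would swap in xgb.XGBClassifier and your own X_train/y_train, and the parameter ranges shown are illustrative, not tuned recommendations.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy multi-class data standing in for the guide's X_train / y_train
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=42)

# Illustrative search ranges; the same parameter names apply to XGBoost
param_dist = {
    "n_estimators": randint(100, 400),
    "max_depth": randint(3, 8),
    "learning_rate": uniform(0.01, 0.19),   # samples from [0.01, 0.20]
    "subsample": uniform(0.6, 0.4),         # samples from [0.6, 1.0]
}
search = RandomizedSearchCV(GradientBoostingClassifier(random_state=42),
                            param_dist, n_iter=5, cv=3, random_state=42)
search.fit(X, y)
print(search.best_params_)
```

For a time-aware search on the outage data, pass cv=TimeSeriesSplit(...) instead of the default shuffled folds so tuning respects chronology too.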

Key Terms Glossary

  • Multi-class classification — predicting one of several categories (not just binary yes/no)
  • XGBoost — Extreme Gradient Boosting; builds trees sequentially to correct errors
  • SHAP values — SHapley Additive exPlanations; a game-theory approach to explain individual predictions
  • Time-aware split — using past data for training and future data for testing to simulate deployment
  • Class imbalance — when some categories have far fewer examples than others
  • SAIFI — System Average Interruption Frequency Index; average number of interruptions per customer per year
  • Feature engineering — creating informative input variables from raw data