What You Will Learn
In this guide, you will build a machine learning model that predicts whether an outage is likely to occur on a given day, based on weather conditions and calendar patterns, with asset data loaded so you can extend the model later. By the end, you will have:
- Loaded and explored the SP&L outage and weather datasets
- Combined multiple data sources into a single training table
- Trained a Random Forest classifier to predict outages
- Evaluated your model's accuracy on held-out test data
- Identified which features matter most for outage prediction
What is a Random Forest? A Random Forest is a collection of decision trees. Each tree looks at a random subset of your data and features, then "votes" on the answer. The final prediction is whichever answer gets the most votes. It works well for classification tasks and handles messy, real-world data gracefully.
SP&L Data You Will Use
This guide uses three files from the SP&L repository:
- outages/outage_events.csv — 3,200+ historical outage records with cause codes, timestamps, affected customers, and feeder IDs
- weather/hourly_observations.csv — hourly temperature, wind speed, precipitation, and humidity for 2020–2025
- assets/transformers.csv — transformer age, condition scores, and kVA ratings
Additional Libraries
Beyond the base prerequisites, this guide needs nothing extra. You already have everything: pandas, numpy, scikit-learn, matplotlib, and seaborn.
Which terminal should I use? On Windows, open Anaconda Prompt from the Start Menu (or PowerShell/Command Prompt if Python is in your PATH). On macOS, open Terminal from Applications → Utilities. On Linux, open your default terminal. All pip install commands work the same across platforms.
Load the Data
Open a new Jupyter notebook and run the following cell to import your libraries and load the three SP&L files. Update the DATA_DIR path to wherever you cloned the repository.
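A loading cell along these lines should work. The date column names passed to parse_dates (start_time and timestamp here) are assumptions, so check the CSV headers in your clone and adjust if they differ.

```python
from pathlib import Path
import pandas as pd

# Update this to wherever you cloned the SP&L repository
DATA_DIR = Path("spl-data")

# Date column names are assumptions; adjust to match the actual CSV headers
outages = pd.read_csv(DATA_DIR / "outages" / "outage_events.csv",
                      parse_dates=["start_time"])
weather = pd.read_csv(DATA_DIR / "weather" / "hourly_observations.csv",
                      parse_dates=["timestamp"])
transformers = pd.read_csv(DATA_DIR / "assets" / "transformers.csv")

print(f"Weather rows loaded: {len(weather):,}")
print(f"Transformers loaded: {len(transformers):,}")
```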
Weather rows loaded: 43,824
Transformers loaded: 86
What just happened? You used pandas.read_csv() to load each CSV file into a DataFrame—think of it as a spreadsheet inside Python. The parse_dates argument tells pandas to interpret certain columns as dates rather than plain text.
Explore the Data
Before building a model, look at what you have. Run each line below in its own cell so you can see the output.
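A few quick inspection commands, assuming the cause-code column is named cause_code (adjust if yours differs):

```python
# Run each of these in its own notebook cell so each output displays
outages.head()
outages["cause_code"].value_counts()   # assumed column name for the cause code
weather.describe()
transformers.info()
```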
You should see that outages have cause codes such as vegetation, equipment_failure, animal_contact, weather, and overload. The weather table includes temperature, wind speed, humidity, and precipitation measured every hour.
Build Daily Features
Outages happen on a specific day. Weather is recorded every hour. To combine them, we need to summarize weather into daily statistics (max wind, max temperature, total rainfall, etc.).
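One way to do this is a pandas groupby, assuming the hourly columns are named temperature, wind_speed, precipitation, and humidity (rename to match your files):

```python
# Collapse 24 hourly rows per day into one row of daily summary statistics
weather["date"] = weather["timestamp"].dt.date

daily_weather = (
    weather.groupby("date")
    .agg(
        temp_max=("temperature", "max"),
        temp_min=("temperature", "min"),
        temp_mean=("temperature", "mean"),
        wind_max=("wind_speed", "max"),
        wind_mean=("wind_speed", "mean"),
        precip_total=("precipitation", "sum"),
        humidity_mean=("humidity", "mean"),
    )
    .reset_index()
)
daily_weather.head()
```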
What is feature engineering? Raw data rarely comes in the shape a model needs. Feature engineering is the process of transforming raw columns into useful inputs. Here, we turned 24 hourly weather readings per day into 7 summary numbers (max temp, min temp, mean temp, etc.).
Create the Target Variable
A classification model needs a target: the thing you are predicting. Our target is "Did at least one outage happen on this day?" (yes = 1, no = 0).
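A sketch of one way to build the flag, reusing the daily_weather table from the previous step and assuming the outage timestamp column is start_time:

```python
# Count outage events per calendar day
outages["date"] = outages["start_time"].dt.date
outage_counts = outages.groupby("date").size().rename("outage_count").reset_index()

# Join onto the daily weather table; days with no recorded outage get a count of 0
daily = daily_weather.merge(outage_counts, on="date", how="left")
daily["outage_count"] = daily["outage_count"].fillna(0)

# Binary target: 1 if at least one outage happened that day, 0 otherwise
daily["outage_flag"] = (daily["outage_count"] > 0).astype(int)

print(f"Days with outages: {daily['outage_flag'].sum():,}")
print(f"Days without outages: {(daily['outage_flag'] == 0).sum():,}")
```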
Days with outages: 1,412
Days without outages: 414
Add Time-Based Features
Outages follow seasonal patterns. Let's add month-of-year and day-of-week as features so the model can learn these cycles.
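With the pandas .dt accessor this only takes a couple of lines:

```python
# Make sure the date column is a datetime so we can extract calendar attributes
daily["date"] = pd.to_datetime(daily["date"])

daily["month"] = daily["date"].dt.month            # 1–12, captures seasonality
daily["day_of_week"] = daily["date"].dt.dayofweek  # 0 = Monday ... 6 = Sunday
```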
Split into Training and Test Sets
We need to hold back some data the model has never seen, so we can honestly evaluate it later. The standard practice is an 80/20 split: 80% for training, 20% for testing.
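A typical split cell, using the daily table built above:

```python
from sklearn.model_selection import train_test_split

# Everything except the date, the target, and the raw outage count is a model input
feature_cols = [c for c in daily.columns
                if c not in ("date", "outage_flag", "outage_count")]
X = daily[feature_cols]
y = daily["outage_flag"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```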
Why stratify? The stratify=y argument ensures the train and test sets have the same proportion of outage/no-outage days. Without this, random chance could put most of the no-outage days in one set, giving you misleading results.
Train the Random Forest
Now the exciting part. We create a Random Forest classifier and fit it on the training data. "Fitting" means the model examines all the training rows and learns patterns that connect weather features to outage outcomes.
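A training cell along these lines matches the settings discussed below; the printed feature count will depend on how many daily aggregates you created earlier.

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=200,         # number of trees in the forest
    class_weight="balanced",  # pay more attention to the rarer class
    random_state=42,
    n_jobs=-1,                # use all available CPU cores
)
model.fit(X_train, y_train)

print(f"Number of trees: {model.n_estimators}")
print(f"Features used: {model.n_features_in_}")
```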
Number of trees: 200
Features used: 10
What does class_weight="balanced" do? Since we have more outage days than non-outage days, the model could cheat by always predicting "outage" and still get high accuracy. The balanced setting tells the model to pay more attention to the rarer class so it actually learns to distinguish the two.
Test the Model
Now we use the held-out test data—data the model has never seen—to see how well it performs in the real world.
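For example, using scikit-learn's classification_report:

```python
from sklearn.metrics import classification_report

# Predict on the test set and summarize precision, recall, and F1 per class
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred,
                            target_names=["No Outage", "Outage"]))
```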
| | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| No Outage | 0.45 | 0.52 | 0.48 | 83 |
| Outage | 0.84 | 0.80 | 0.82 | 283 |
| accuracy | | | 0.73 | 366 |
| macro avg | 0.65 | 0.66 | 0.65 | 366 |
| weighted avg | 0.75 | 0.73 | 0.74 | 366 |
Let's also visualize the confusion matrix to see exactly where the model gets things right and wrong.
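One way to draw it with seaborn:

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=["No Outage", "Outage"],
            yticklabels=["No Outage", "Outage"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix: Daily Outage Prediction")
plt.show()
```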
Reading the results: Precision answers "When the model predicted an outage, how often was it right?" Recall answers "Of all the actual outages, how many did the model catch?" The F1-score is the balance between the two. For utility reliability teams, recall is often more important—you'd rather have a false alarm than miss a real outage.
Understand Feature Importance
One of the best things about Random Forests: they tell you which features matter most. This is valuable for utility engineers because it shows which weather variables drive outage risk.
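A quick way to inspect and plot the importances, reusing feature_cols from the split step:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Pair each feature name with its importance score and sort descending
importances = (
    pd.Series(model.feature_importances_, index=feature_cols)
    .sort_values(ascending=False)
)
print(importances)

importances.plot(kind="barh")
plt.gca().invert_yaxis()   # most important feature at the top
plt.xlabel("Importance")
plt.title("Random Forest feature importances")
plt.show()
```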
You will likely see that wind_max, precip_total, and temp_max rank highest—which aligns with utility operational experience. Storms with high wind and heavy rainfall are the primary outage drivers.
What You Built and Next Steps
Congratulations. You just:
- Loaded real-world-style utility data from the SP&L repository
- Engineered daily features from hourly weather records
- Created a binary classification target (outage yes/no)
- Trained a Random Forest classifier on 80% of the data
- Tested it on the remaining 20% and evaluated performance
- Identified which weather features drive outage risk
Ideas to Try Next
- Add asset features: Merge transformer age and condition scores by feeder to improve predictions
- Predict outage cause: Instead of binary (outage/no outage), predict the cause code (vegetation, weather, equipment) using a multi-class classifier
- Try XGBoost: Replace RandomForestClassifier with XGBClassifier from the xgboost library for potentially better results
- Benchmark against SAIFI: Compare your model's predictions to SP&L's annual SAIFI metrics in outages/reliability_metrics.csv
- Time-aware split: Instead of random splitting, train on 2020–2023 and test on 2024–2025 to simulate how the model would perform on future data (see the sketch after this list)
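For the time-aware split idea, a minimal sketch, assuming the daily table and feature_cols built earlier in this guide:

```python
# Train on 2020–2023, test on 2024–2025, so the test set is strictly "in the future"
train_mask = daily["date"] < "2024-01-01"

X_train_t = daily.loc[train_mask, feature_cols]
y_train_t = daily.loc[train_mask, "outage_flag"]
X_test_t = daily.loc[~train_mask, feature_cols]
y_test_t = daily.loc[~train_mask, "outage_flag"]
```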
Key Terms Glossary
- Random Forest — an ensemble of decision trees that vote on the prediction
- Classification — predicting a category (outage / no outage)
- Feature — an input variable the model uses (e.g., wind speed)
- Target — the variable you are trying to predict (outage_flag)
- Training set — data the model learns from
- Test set — data held back to evaluate the model honestly
- Precision — of all positive predictions, how many were correct
- Recall — of all actual positives, how many were detected
- SAIFI — System Average Interruption Frequency Index, a standard reliability metric
Ready to Level Up?
In the advanced guide, you'll build a multi-class XGBoost classifier with SHAP explainability and time-aware validation.
Go to Advanced Outage Prediction →