What You Will Learn
Utilities spend billions of dollars maintaining and replacing aging infrastructure. Instead of replacing equipment on a fixed schedule (time-based maintenance), predictive maintenance uses data to identify which assets are most likely to fail soon—so crews can prioritize the right work. In this guide you will:
- Load transformer and outage data from the SP&L dataset
- Engineer features from asset age, kVA rating, and outage history
- Train an XGBoost classifier to predict transformer failure risk
- Evaluate your model and generate a risk-ranked asset list
- Visualize which factors contribute most to failure risk
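What is XGBoost? XGBoost (eXtreme Gradient Boosting) is an optimized, open-source implementation of gradient-boosted decision trees. It adds regularization and efficient tree construction, so it usually trains faster than a basic gradient boosting implementation and often produces more accurate results. It is one of the most widely used ML algorithms in industry and a frequent winner of tabular-data competitions on Kaggle.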
SP&L Data You Will Use
- transformers.csv (load_transformers()): ~21,000 transformers with kVA rating, age_years, manufacturer, phase, and status
- outage_history.csv (load_outage_history()): outage events linked to equipment failures with cause, duration, and customers affected
- weather_data.csv (load_weather_data()): weather exposure data for environmental stress analysis
Additional Libraries
Which terminal should I use? On Windows, open Anaconda Prompt from the Start Menu (or PowerShell / Command Prompt if Python is already in your PATH). On macOS, open Terminal from Applications → Utilities. On Linux, open your default terminal. All pip install commands work the same across platforms.
Verify Your Setup
Before starting, verify that your environment is configured correctly. Run this cell first to confirm all dependencies are installed and data files are accessible.
Working directory: All guides assume your working directory is the repository root (Dynamic-Network-Model/). Start Jupyter Lab from there: cd Dynamic-Network-Model && jupyter lab
Extra dependency: pip install xgboost
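A setup check along these lines can confirm the environment before you continue. This is a minimal sketch, not the guide's original cell: the package list and the default `data` folder are assumptions, so adjust both to match your checkout. The file names come from the data list above.

```python
import importlib.util
from pathlib import Path

# Import names of the packages this sketch assumes the guide needs.
# The list is an assumption; edit it to match your environment.
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "xgboost"]

# File names taken from the "SP&L Data You Will Use" list.
DATA_FILES = ["transformers.csv", "outage_history.csv", "weather_data.csv"]


def check_setup(data_dir="data"):
    """Return a dict mapping each check to True/False. Never raises.

    `data_dir` is a hypothetical location; pass the folder that holds
    the CSV files in your copy of the repository.
    """
    results = {}
    for name in REQUIRED:
        # find_spec returns None when a top-level package is not installed
        results[f"package:{name}"] = importlib.util.find_spec(name) is not None
    for fname in DATA_FILES:
        results[f"file:{fname}"] = (Path(data_dir) / fname).is_file()
    return results


if __name__ == "__main__":
    for check, ok in check_setup().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {check}")
```

Any line marked MISSING points to either a package to install with pip or a data path to correct.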
Having trouble? Check our Troubleshooting Guide for solutions to common setup and data loading issues.
Load the Data
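The guide's own loaders (`load_transformers()`, `load_outage_history()`, `load_weather_data()`) are the intended way to read the data, but their import path is not shown here. As a fallback, the sketch below reads the same three CSV files directly with pandas. The `load_tables` helper and the `data_dir` argument are hypothetical names introduced for illustration.

```python
from pathlib import Path

import pandas as pd

# Logical table name -> file name (file names from the data list above)
TABLES = {
    "transformers": "transformers.csv",
    "outages": "outage_history.csv",
    "weather": "weather_data.csv",
}


def load_tables(data_dir):
    """Read the three CSV files from `data_dir` into a dict of DataFrames.

    Raises FileNotFoundError with the resolved path if a file is missing,
    which makes a wrong working directory easy to spot.
    """
    data_dir = Path(data_dir)
    tables = {}
    for name, filename in TABLES.items():
        path = data_dir / filename
        if not path.is_file():
            raise FileNotFoundError(
                f"Expected {filename} in {data_dir.resolve()}"
            )
        tables[name] = pd.read_csv(path)
    return tables


# Example usage (the folder name is an assumption):
# tables = load_tables("data")
# transformers, outages = tables["transformers"], tables["outages"]
```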
Explore the Transformer Data
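A first pass over the transformer table usually covers summary statistics, missing values, and the distribution of age and status. The sketch below runs on a small synthetic stand-in so it works without the real files. Apart from `age_years`, the column names (`transformer_id`, `kva_rating`, and so on) are assumptions and may differ in the real table.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the transformer table (column names are assumed)
transformers = pd.DataFrame({
    "transformer_id": [f"T{i:04d}" for i in range(n)],
    "kva_rating": rng.choice([25, 50, 75, 100, 167], size=n),
    "age_years": rng.integers(1, 50, size=n),
    "manufacturer": rng.choice(["A", "B", "C"], size=n),
    "phase": rng.choice(["single", "three"], size=n),
    "status": rng.choice(["active", "retired"], size=n, p=[0.9, 0.1]),
})

# Summary statistics for the numeric columns
numeric_summary = transformers[["kva_rating", "age_years"]].describe()

# Missing values per column (all zero for this synthetic table)
missing_counts = transformers.isna().sum()

# How many assets fall in each status and each age band
status_counts = transformers["status"].value_counts()
age_bands = pd.cut(transformers["age_years"], bins=[0, 10, 20, 30, 40, 60])
age_distribution = age_bands.value_counts().sort_index()

print(numeric_summary)
print(status_counts)
print(age_distribution)
```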
Create the Failure Target
We need to label each transformer: has it experienced an equipment-failure outage? We'll use the outage history to identify feeders linked to "equipment failure" causes and flag the transformers on those feeders.
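The labeling step described above can be sketched as follows. The example uses tiny hand-made tables, and the column names `feeder_id` and `cause` are assumptions about how transformers and outages are linked in the real data.

```python
import pandas as pd

# Tiny stand-ins for the real tables (column names are assumed)
transformers = pd.DataFrame({
    "transformer_id": ["T1", "T2", "T3", "T4"],
    "feeder_id": ["F1", "F1", "F2", "F3"],
})
outages = pd.DataFrame({
    "feeder_id": ["F1", "F2", "F2", "F3"],
    "cause": ["equipment failure", "vegetation", "weather", "Equipment Failure"],
})

# Case-insensitive match on the cause text; missing causes count as no match
is_equipment = (
    outages["cause"].str.lower().str.contains("equipment failure", na=False)
)

# Feeders that have at least one equipment-failure outage
failed_feeders = set(outages.loc[is_equipment, "feeder_id"])

# Binary target: 1 if the transformer sits on a flagged feeder, else 0
transformers["failed"] = (
    transformers["feeder_id"].isin(failed_feeders).astype(int)
)

print(transformers)
print(transformers["failed"].value_counts())
```

Because the label is assigned per feeder, every transformer on a flagged feeder receives the same label. Keep that in mind when interpreting the model: it learns feeder-level risk as much as individual asset risk.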
Engineer Maintenance Features
Outage history can serve as a proxy for maintenance exposure. Feeders with frequent or long-duration outages suggest areas where equipment is under greater stress.
Why use outage statistics as features? Even without a dedicated maintenance log, the number and average duration of outages on a feeder capture real-world stress. Feeders with many long outages are likely serving equipment under greater strain.
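One way to build these feeder-level features is a grouped aggregation joined back to the transformer table. The sketch below assumes outage columns named `duration_hours` and `customers_affected` and a shared `feeder_id` key. All three names are illustrative.

```python
import pandas as pd

outages = pd.DataFrame({
    "feeder_id": ["F1", "F1", "F2", "F2", "F2"],
    "duration_hours": [2.0, 4.0, 1.0, 1.0, 4.0],
    "customers_affected": [100, 300, 50, 50, 200],
})
transformers = pd.DataFrame({
    "transformer_id": ["T1", "T2", "T3"],
    "feeder_id": ["F1", "F2", "F9"],  # F9 has no recorded outages
})

# One row per feeder: outage count, mean duration, mean customers affected
feeder_stats = (
    outages.groupby("feeder_id")
    .agg(
        total_outages=("duration_hours", "size"),
        avg_duration=("duration_hours", "mean"),
        avg_customers=("customers_affected", "mean"),
    )
    .reset_index()
)

# Left join keeps every transformer, including those on quiet feeders
features = transformers.merge(feeder_stats, on="feeder_id", how="left")

# Feeders with no outage records get zeros instead of NaN
stat_cols = ["total_outages", "avg_duration", "avg_customers"]
features[stat_cols] = features[stat_cols].fillna(0)

print(features)
```

The left join matters here: an inner join would silently drop transformers on feeders with no outage history, which are exactly the low-risk examples the model needs.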
Prepare Features and Split
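A typical preparation step one-hot encodes the categorical columns and holds out a stratified test set so both splits keep the same failure rate. The sketch below uses synthetic data. The column names, the 80/20 split, and the random seed are illustrative assumptions rather than the guide's exact settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Synthetic stand-in for the engineered feature table
data = pd.DataFrame({
    "age_years": rng.integers(1, 50, n),
    "kva_rating": rng.choice([25, 50, 100], n),
    "total_outages": rng.poisson(3, n),
    "manufacturer": rng.choice(["A", "B", "C"], n),
    "failed": (rng.random(n) < 0.15).astype(int),
})

# One-hot encode the categorical column; numeric columns pass through
X = pd.get_dummies(
    data.drop(columns="failed"), columns=["manufacturer"], dtype=int
)
y = data["failed"]

# stratify=y keeps the failure rate nearly identical in both splits,
# which matters when failures are the minority class
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
print(f"Train failure rate: {y_train.mean():.3f}")
print(f"Test failure rate:  {y_test.mean():.3f}")
```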
Train the XGBoost Model
Test and Evaluate
What is AUC-ROC? AUC (Area Under the ROC Curve) measures how well the model distinguishes between positive and negative classes across all probability thresholds. A score of 1.0 is perfect and 0.5 is random guessing. Equivalently, AUC is the probability that the model scores a randomly chosen failed asset higher than a randomly chosen healthy one. For maintenance prioritization, a score above roughly 0.7 is usually useful, because the goal is to rank assets by risk rather than to classify each one perfectly.
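To make the ranking interpretation concrete, the sketch below computes AUC two ways on a small hand-made example: by counting how many (failed, healthy) pairs the scores order correctly, and with scikit-learn's `roc_auc_score`. The labels and scores are invented for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# 1 = failed, 0 = healthy; scores are predicted failure probabilities
y_true = np.array([0, 0, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.10, 0.35, 0.40, 0.20, 0.80, 0.50, 0.30, 0.05])

pos = y_score[y_true == 1]  # scores of failed assets
neg = y_score[y_true == 0]  # scores of healthy assets

# Compare every failed asset against every healthy asset.
# A pair counts 1 if the failed asset scores higher, 0.5 on a tie.
diff = pos[:, None] - neg[None, :]
pairwise_auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

sklearn_auc = roc_auc_score(y_true, y_score)

# 12 of the 15 pairs are ordered correctly, so both values are about 0.8
print(f"Pairwise AUC: {pairwise_auc:.3f}")
print(f"sklearn AUC:  {sklearn_auc:.3f}")
```

The two numbers agree because AUC depends only on the ordering of the scores, not on their absolute values. That is why a model with imperfect probabilities can still produce a useful priority list.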
Plot the ROC Curve
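A ROC plot can be drawn with scikit-learn's `roc_curve` and matplotlib, as sketched below. `plot_roc` is a hypothetical helper name, and the labels and scores are synthetic. In the guide you would pass the test-set labels and the model's predicted failure probabilities instead.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve


def plot_roc(y_true, y_score, ax=None):
    """Draw the ROC curve with a random-guess diagonal; return (ax, auc)."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc = roc_auc_score(y_true, y_score)
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 5))
    ax.plot(fpr, tpr, label=f"Model (AUC = {auc:.3f})")
    # Diagonal reference line: a model with no ranking ability
    ax.plot([0, 1], [0, 1], linestyle="--", color="gray",
            label="Random guess (AUC = 0.5)")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_title("ROC Curve")
    ax.legend(loc="lower right")
    return ax, auc


# Synthetic scores: positives are shifted upward so the curve bows left
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 300)
y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 300), 0, 1)

ax, auc = plot_roc(y_true, y_score)
# In a notebook the figure renders automatically; in a script, call plt.show()
```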
Generate a Risk-Ranked Asset List
The real value of this model is not just accuracy—it's the ability to produce a prioritized list of assets that maintenance crews can act on.
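Producing the list amounts to attaching each asset's predicted failure probability and sorting in descending order. In the sketch below, `rank_assets` is a hypothetical helper, and the scores are invented numbers standing in for the model's `predict_proba` output.

```python
import numpy as np
import pandas as pd


def rank_assets(assets, risk_scores, top_n=10):
    """Return the top_n assets sorted by risk score, highest first."""
    ranked = assets.copy()
    ranked["risk_score"] = np.asarray(risk_scores)
    ranked = ranked.sort_values("risk_score", ascending=False)
    ranked = ranked.reset_index(drop=True)
    ranked["risk_rank"] = ranked.index + 1  # 1 = highest risk
    return ranked.head(top_n)


assets = pd.DataFrame({
    "transformer_id": ["T1", "T2", "T3", "T4"],
    "age_years": [35, 12, 41, 8],
    "kva_rating": [50, 25, 100, 50],
})

# Stand-in for model.predict_proba(X)[:, 1]
scores = [0.62, 0.08, 0.91, 0.15]

top = rank_assets(assets, scores, top_n=3)
print(top)
# To hand the list to crews, it could be saved with top.to_csv(...)
```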
Feature Importance
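You will typically see age_years and total_outages at the top. Older transformers on feeders with frequent outages come out as the highest-risk assets, which is consistent with engineering intuition and serves as a sanity check on the model. Because the failure label is itself derived from feeder outage records, total_outages is correlated with the label partly by construction, so read its importance with that in mind.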
What You Built and Next Steps
- Loaded transformer and outage data from the SP&L data loader API
- Created a binary failure target from equipment-failure outage records
- Engineered features from asset age, kVA rating, and outage history
- Trained an XGBoost classifier with class-imbalance handling
- Evaluated performance with classification report and ROC curve
- Generated a risk-ranked asset list for maintenance prioritization
Ideas to Try Next
- Add weather exposure: Calculate cumulative storm exposure per transformer from the weather data
- Survival analysis: Use the lifelines library to model time-to-failure instead of binary failure
- Include loading history: Use peak loading percentages from feeder load data to measure stress over time
- Extend to network edges: Apply the same approach using load_network_edges() data for conductor failure analysis
- Cost-benefit analysis: Combine failure probability with replacement cost and outage impact to optimize capital spending
Key Terms Glossary
- XGBoost — an optimized gradient boosting library for high-performance ML
- Predictive maintenance — using data to predict failures before they occur, replacing time-based schedules
- AUC-ROC — measures how well the model distinguishes between classes; 1.0 = perfect, 0.5 = random
- Class imbalance — when one category (e.g., "no failure") is much more common than the other
- scale_pos_weight — XGBoost parameter that compensates for class imbalance
- Risk score — the model's predicted probability of failure, used to rank assets
Ready to Level Up?
In the advanced guide, you'll use survival analysis to predict when transformers will fail and build risk-prioritized replacement schedules.
Go to Advanced Predictive Maintenance →