You already understand distribution system problems worth solving. AI-assisted development tools let you go from that understanding to a working model. These guides cover every use case on the SP&L dataset—from beginner fundamentals to advanced techniques like deep learning, reinforcement learning, and production deployment.
You already know how distribution systems work, and that domain expertise is the hard part. Every guide below follows the same pattern: load the data, explore it, build features, train a model, and test your results. No prior ML experience required. If you get stuck on any step, an AI coding assistant can explain what the code does and help you adapt it to your own problems.
Each guide assumes you have the following tools ready. If you are brand new to Python, install Anaconda—it bundles Python, Jupyter, and most of the libraries listed below. Anaconda is available for Windows, macOS, and Linux.
pip install commands in that prompt
jupyter lab
Tip: You can also use PowerShell or Command Prompt if you add Python to your PATH during installation.
pip install commands (use pip3 if not using Anaconda)
jupyter lab
Tip: On macOS you may need pip3 instead of pip if you installed Python via Homebrew or python.org.
Run this single command to install all core libraries at once (works on Windows, macOS, and Linux):
Windows note on file paths: All code in these guides uses forward slashes in file paths (e.g., "sisyphean-power-and-light/outages/outage_events.csv"). Python on Windows handles forward slashes correctly, so you do not need to change them to backslashes. Just update the DATA_DIR variable to match where you cloned the repository.
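The path note above can be sketched as follows. This is a minimal illustration, not the guides' own setup code: the clone location (your home directory) is an assumption, while DATA_DIR and the outage file path come from the note.

```python
from pathlib import Path

# Assumption: the repository was cloned into your home directory.
# Change DATA_DIR to wherever your copy actually lives.
DATA_DIR = Path.home() / "sisyphean-power-and-light"

# Forward slashes in the relative part work on Windows, macOS, and Linux.
outage_csv = DATA_DIR / "outages/outage_events.csv"

print(outage_csv.as_posix())
print("found" if outage_csv.exists() else "not found: check DATA_DIR")
```

Building paths with pathlib keeps the same code working on every operating system, so only DATA_DIR needs to change between machines.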
Tip: Each guide also lists any additional libraries it needs at the top. Install them as you go.
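Before opening a guide, it can help to confirm which libraries Python can already find. The sketch below is one way to do that. The list of import names is an assumption inferred from tools the guides mention (for example, scikit-learn is imported as sklearn and PyTorch as torch); it is not the official core list.

```python
from importlib.util import find_spec

# Assumption: import names inferred from tools mentioned in the guides.
# Edit this list to match the libraries your guide actually asks for.
ASSUMED_LIBRARIES = ["pandas", "sklearn", "xgboost", "lightgbm", "torch", "shap"]


def check_installed(import_names):
    """Map each top-level import name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in import_names}


for name, ok in check_installed(ASSUMED_LIBRARIES).items():
    print(f"{name:10s} {'ok' if ok else 'MISSING'}")
```

Anything reported as MISSING can be installed with pip before you start the guide that needs it.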
These guides are designed to work alongside AI coding assistants. You don't need to memorize pandas syntax or know how scikit-learn's API works before you start—your domain expertise is the irreplaceable ingredient. AI tools bridge the ML knowledge gap in real time: ask them to explain a code block, debug an error, or adapt a model to a different dataset. The combination of your engineering judgment and AI-assisted coding is what makes this approach work.
Pick the distribution engineering problem that matters to you. Each guide is self-contained—start with Guide 01 or jump straight to your area of expertise.
Train a Random Forest classifier to predict whether weather and asset conditions will cause an outage. Uses the 3,200+ outage events and weather data from SP&L.
Build a day-ahead load forecast using 5 years of hourly substation data. Start with a simple baseline, then train a gradient-boosted model that accounts for weather and seasonality.
Run power flow simulations on the SP&L network to determine how much solar each feeder can handle. Learn to identify thermal and voltage violations using OpenDSS.
Predict which transformers are most likely to fail using asset age, condition scores, loading history, and weather exposure. Build a risk-scoring model with XGBoost.
Model the SP&L distribution network as a graph and simulate automated fault isolation and service restoration. Calculate how much customer downtime FLISR could have avoided.
Analyze voltage profiles across the SP&L network and build a rule-based Volt-VAR controller. Then introduce a simple reinforcement learning agent to learn optimal control policies.
Stress-test the grid against high-solar and high-EV futures. Use Monte Carlo simulation to model uncertain adoption rates and identify which feeders hit capacity limits first.
Build an anomaly detector for AMI voltage data using Isolation Forest. Then construct a simple autoencoder in PyTorch to flag unusual grid behavior in real time.
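To give a flavor of what these guides involve, here is a minimal sketch of the graph idea behind the FLISR guide: treat line sections as edges, open the faulted section, then close a tie switch and count who is still out. The six-node feeder, customer counts, and function names are hypothetical and are not SP&L data or the guide's own code.

```python
from collections import deque

# Hypothetical feeder (not SP&L data). "SUB" is the substation.
closed_sections = {("SUB", "A"), ("A", "B"), ("B", "C"), ("C", "D"),
                   ("SUB", "E"), ("E", "F")}
tie_switches = {("D", "F")}  # normally open tie between the two branches
customers = {"A": 120, "B": 80, "C": 200, "D": 150, "E": 90, "F": 60}


def energized(sections, source="SUB"):
    """Return the set of nodes reachable from the source via breadth-first search."""
    neighbors = {}
    for u, v in sections:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def customers_out(sections):
    """Count customers on nodes that are not connected to the substation."""
    live = energized(sections)
    return sum(count for node, count in customers.items() if node not in live)


faulted = ("B", "C")
isolated = closed_sections - {faulted}   # step 1: open the faulted section
restored = isolated | tie_switches       # step 2: close the tie switch

out_before = customers_out(isolated)
out_after = customers_out(restored)
print(out_before, out_after)  # prints: 350 0
```

A real study would also check that the backfeed path has enough capacity, which this connectivity-only sketch ignores.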
The advanced guides cover techniques that used to require a dedicated data science team: deep learning, reinforcement learning, survival analysis, and production deployment patterns. With AI-assisted development, an experienced power engineer can build and understand these models. Each guide builds on its beginner counterpart.
Upgrade from binary classification to multi-class outage cause prediction using XGBoost. Add asset features, implement time-aware validation, and use SHAP to explain individual predictions.
Build an LSTM neural network in PyTorch for multi-step-ahead load forecasting. Learn sequence modeling and sliding-window preparation, then compare deep learning against gradient-boosting baselines.
Train a surrogate ML model to predict hosting capacity without running full power flow simulations. Achieve a 100x+ speedup with LightGBM, then add quantile regression for uncertainty estimates and spatial mapping.
Move beyond "will it fail?" to "when will it fail?" using survival analysis. Build Kaplan-Meier curves, Cox Proportional Hazards models, and risk-prioritized replacement schedules.
Apply reinforcement learning to optimize post-fault switching sequences. Simulate microgrid islanding during emergencies and model cold load pickup for realistic restoration planning.
Scale from tabular Q-learning to Deep Q-Networks (DQN) with neural networks. Control multiple devices simultaneously with experience replay, target networks, and multi-objective rewards.
Optimize grid upgrade investments under uncertain DER adoption. Build cost-benefit models, evaluate non-wires alternatives, and create a 5-year investment roadmap with stochastic programming.
Replace basic autoencoders with Variational Autoencoders for probabilistic anomaly scoring. Implement a real-time sliding window detection pipeline with adaptive thresholds.
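As one concrete example of an advanced technique, the Kaplan-Meier curve from the survival analysis guide can be sketched in plain Python. The transformer ages and failure flags below are made up for illustration, and the function is a from-scratch sketch rather than the guide's implementation, which may well use a dedicated library instead.

```python
def kaplan_meier(durations, failed):
    """Return [(time, survival_probability)] at each observed failure time.

    Implements S(t) = product over failure times t_i <= t of (1 - d_i / n_i),
    where d_i is the number of failures at t_i and n_i the number still at risk.
    """
    pairs = sorted(zip(durations, failed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group every unit observed at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Failed and censored units both leave the risk set after time t.
        at_risk -= removed
    return curve


# Made-up data: age in years when each transformer was last observed.
# failed = 1 means it failed at that age; 0 means still in service (censored).
ages = [12, 20, 20, 25, 31, 35, 40, 40]
failed = [1, 1, 0, 1, 0, 1, 1, 0]

for age, s in kaplan_meier(ages, failed):
    print(f"age {age}: estimated survival {s:.3f}")
```

Censored units still count toward the at-risk total until the age at which they were last observed, which is what separates this estimate from a simple failure percentage.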
The SP&L dataset, these guides, and an AI coding assistant—that's the fastest path from understanding a distribution system problem to having a working model. Request early access to clone the repository and start building.