A realistic synthetic utility you can clone, query, model, and break. It covers both the distribution grid and the generation plant, from feeders and transformers to boiler feed pumps and vibration data. Domain experts have always understood the problems worth solving. With this dataset and modern AI-assisted dev tools, you can build the ML applications that used to require a dedicated data science team.
SP&L is a fictional municipal utility serving approximately 238,000 customers across 23 substations in a mixed urban, suburban, and rural service territory, powered in part by a 300 MW 2×1 combined-cycle generating station. The Dynamic Network Model is a fully realized open-source representation of SP&L's distribution system and generation assets, complete with time-series load data, DER installations, historical outage records, asset metadata, weather correlations, protection device configurations, and plant-side rotating equipment instrumentation.
A distribution system large enough to be realistic, small enough to fit on a laptop.
The dataset is organized into logical layers, from raw topology to pre-built analysis notebooks. Each layer is documented, versioned, and usable independently.
OpenDSS-native circuit definitions for all 104 feeders. Substation configurations, transformer ratings, line impedances, switch and protection device locations. The raw electrical infrastructure, in a format every distribution engineer can run.
Five years of hourly customer load, DER generation profiles, weather correlations, and feeder-level voltage/current measurements. Realistic seasonality, weather sensitivity, growth trends, and the day-to-day noise that real grid data carries.
3,200+ historical outage events with root cause classification (equipment, vegetation, weather, animal, vehicle, unknown), customer impact, restoration time, and crew dispatch records. Computed SAIDI, SAIFI, and CAIDI by feeder, by year, and by major-event-day classification.
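As a rough illustration of how the reliability indices relate to the outage records, here is a minimal sketch using the standard definitions: SAIFI is customer interruptions per customer served, SAIDI is customer-minutes interrupted per customer served, and CAIDI is SAIDI divided by SAIFI. The `Outage` record and its field names are hypothetical, not the dataset's actual schema, and the sample events are made up.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    # Hypothetical record layout; the real outage table may use different names.
    feeder_id: str
    customers_interrupted: int
    duration_minutes: float


def reliability_indices(outages, customers_served):
    """Return (SAIDI, SAIFI, CAIDI) for a collection of sustained outages."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    # Total customer interruptions and total customer-minutes of interruption.
    interruptions = sum(o.customers_interrupted for o in outages)
    customer_minutes = sum(
        o.customers_interrupted * o.duration_minutes for o in outages
    )
    saifi = interruptions / customers_served      # interruptions per customer
    saidi = customer_minutes / customers_served   # minutes per customer
    caidi = saidi / saifi if saifi > 0 else 0.0   # minutes per interruption
    return saidi, saifi, caidi


# Made-up events on one feeder serving 2,000 customers.
events = [
    Outage("FDR-001", 500, 90.0),
    Outage("FDR-001", 200, 30.0),
]
saidi, saifi, caidi = reliability_indices(events, customers_served=2000)
print(f"SAIDI={saidi:.1f} min  SAIFI={saifi:.2f}  CAIDI={caidi:.1f} min")
```

Grouping the events by feeder or by year before calling the function would give the per-feeder and per-year breakdowns; excluding major-event days first is a separate filtering step that this sketch does not attempt.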
Solar PV, battery storage, and EV charging installations with location, size, vintage, and inverter characteristics. Hosting capacity baselines per circuit and the queue of pending interconnections that would push some circuits over the edge.
Plant-side data from the 300 MW 2×1 combined-cycle station. Boiler feed pump vibration histories, condenser performance, heat-rate trends, forced-outage records, valve cycle counts. Useful for digital-twin work and rotating-equipment ML.
Twenty-three Jupyter-style guides walk through specific ML use cases against this data: outage prediction, load forecasting, hosting capacity, predictive maintenance, FLISR optimization, volt-var, DER scenarios, anomaly detection, rotating-equipment health, and OpenDSS power-flow integration.
Two things have kept power engineers from building their own ML tools: data locked behind NDAs and CEII restrictions, and the gap between understanding a grid problem and implementing an ML solution. SP&L removes the first barrier. AI-assisted dev tools remove the second.
Predict which feeders or asset classes are most likely to experience an outage in the next 24 to 72 hours. Features: weather, vegetation, age, recent fault rates, load patterns.
Hourly and day-ahead load forecasts at the feeder level. Train against five years of weather-correlated load history; validate on the last twelve months.
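One way to set up the hold-out described above is a strictly chronological split, so that no validation hour precedes a training hour. The sketch below assumes the load history is a list of `(timestamp, kw)` pairs, which is an illustrative layout rather than the dataset's actual format. It also scores a seasonal-naive baseline (repeat the value from 24 hours earlier) as a floor that any trained model should beat. The series here is synthetic.

```python
from datetime import datetime, timedelta


def split_last_year(rows, holdout_days=365):
    """Chronological split: the final `holdout_days` become the validation set."""
    if not rows:
        return [], []
    rows = sorted(rows, key=lambda r: r[0])
    cutoff = rows[-1][0] - timedelta(days=holdout_days)
    train = [r for r in rows if r[0] <= cutoff]
    valid = [r for r in rows if r[0] > cutoff]
    return train, valid


def seasonal_naive_mae(train, valid, lag_hours=24):
    """Mean absolute error of forecasting each hour with the value lag_hours earlier."""
    history = {ts: kw for ts, kw in train + valid}
    lag = timedelta(hours=lag_hours)
    errors = [abs(kw - history[ts - lag]) for ts, kw in valid if ts - lag in history]
    if not errors:
        raise ValueError("no validation rows have a lagged observation")
    return sum(errors) / len(errors)


# Synthetic two-year hourly series with a purely daily pattern.
start = datetime(2022, 1, 1)
series = [(start + timedelta(hours=h), 100.0 + h % 24) for h in range(2 * 365 * 24)]

train, valid = split_last_year(series)
mae = seasonal_naive_mae(train, valid)
print(len(train), len(valid), mae)
```

A random shuffle split would leak future weather and load patterns into training, which is why the cutoff is defined by time rather than by row count or sampling.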
Compute hosting capacity for solar interconnection per feeder using OpenDSS. Identify circuits with headroom; identify circuits where the next interconnection triggers a constraint.
Failure-probability models for transformers and switchgear using age, loading, fault history, and weather exposure. Output: prioritized inspection list.
Optimize switching schemes for fault isolation and service restoration. Compare automated FLISR sequences against the actual outage history.
Conservation voltage reduction and reactive power dispatch using time-series load and DER data. Quantify the kW savings from a CVR strategy on each feeder.
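A first-pass estimate of CVR savings often uses the CVR factor, defined as the percent change in demand per percent change in voltage. The sketch below applies that relationship to a short hourly load series. The CVR factor of 0.8, the 2.5% voltage reduction, and the load values are illustrative assumptions, not figures from the dataset; in practice the factor would be estimated per feeder from measured data or power-flow runs.

```python
def cvr_savings_kw(baseline_kw, voltage_reduction_pct, cvr_factor):
    """Estimated demand reduction in kW.

    CVR factor = (% change in demand) / (% change in voltage), so
    savings = baseline * cvr_factor * (voltage reduction as a fraction).
    """
    if voltage_reduction_pct < 0 or cvr_factor < 0:
        raise ValueError("voltage_reduction_pct and cvr_factor must be non-negative")
    return baseline_kw * cvr_factor * voltage_reduction_pct / 100.0


# Illustrative hourly feeder load in kW (assumed values).
hourly_kw = [4200.0, 5000.0, 4800.0]

# Assumed CVR factor and voltage reduction for this sketch.
savings_kw = [cvr_savings_kw(kw, voltage_reduction_pct=2.5, cvr_factor=0.8)
              for kw in hourly_kw]

# Each entry is a one-hour average, so summing kW over hours gives kWh.
energy_saved_kwh = sum(savings_kw)
print([round(s, 1) for s in savings_kw], round(energy_saved_kwh, 1))
```

This linear estimate ignores voltage limits at the end of the feeder; checking that the reduced voltage stays within the allowed band is a job for the power-flow model.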
Run scenarios across DER adoption trajectories, EV penetration rates, and storage deployments. Identify which feeders need infrastructure investment first.
Detect anomalous voltage, current, or load patterns that indicate equipment degradation, theft, or measurement errors. Reconstruct grid state from sparse measurements.
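As a simple statistical baseline for this kind of detection, a trailing rolling z-score flags any reading that sits far from the recent mean. This is a minimal sketch, not the method used in the guides; the window length, threshold, and per-unit voltage series are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_flags(values, window=24, threshold=4.0):
    """Flag points more than `threshold` standard deviations from the
    mean of the preceding `window` readings (the point itself is excluded)."""
    flags = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history to judge yet
            continue
        mu, sigma = mean(history), stdev(history)
        flags.append(sigma > 0 and abs(x - mu) / sigma > threshold)
    return flags


# Synthetic per-unit voltage readings with small regular variation...
voltages = [1.0 + 0.002 * ((i * 7) % 5 - 2) for i in range(60)]
# ...and one injected sag at index 40.
voltages[40] = 0.90

flags = rolling_zscore_flags(voltages)
anomalies = [i for i, flagged in enumerate(flags) if flagged]
print(anomalies)
```

Because the window includes past anomalies, a large excursion inflates the standard deviation for the following `window` readings and can mask smaller events; a median and MAD variant is a common way to reduce that effect.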
Vibration-based ML for boiler feed pumps and turbines. Train on the plant instrumentation data; deploy as an alerting layer over a real plant historian.
Heat-rate forecasting, condenser performance models, combined-cycle dispatch optimization. The plant data layer is built for digital-twin work.
Clone SP&L, fire up an AI coding assistant, and start building. No procurement, no NDAs, no specialized team required. Your domain expertise is the irreplaceable ingredient.