A 30-minute SAIDI improvement for a 1,000 MW utility, using a conservative $20,000/MWh value-of-lost-load, avoids $10 million per year in outage costs. A typical SRE program costs $2 million per year to run. That's a 400% return on investment before counting reduced truck rolls, automated alert triage, or deferred capital spending. The math isn't complicated. The hard part is believing it, then doing the work.
Every conversation about grid reliability eventually lands on cost. Executives want to know what they're buying. Regulators want to know what ratepayers are getting. Engineers want to know they'll actually get the tools and headcount to execute. This piece lays out the full economic case for applying site reliability engineering to utility operations, with real numbers, verifiable formulas, and direct comparisons to traditional investment patterns.
§ 01 The math
The core formula for quantifying reliability improvement value uses the Value of Lost Load (VOLL), a well-established metric in utility economics. VOLL represents the average economic cost per megawatt-hour of unserved energy during an outage.
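Written out, and consistent with the reference-case arithmetic (0.5 × 1,000 × $20,000), the formula is a product of three terms. The symbol names are shorthand chosen for this sketch, not standard notation:

$$
C_{\text{avoided}} \;[\$/\text{yr}] \;=\; \Delta\text{SAIDI} \;[\text{h/yr}] \;\times\; P_{\text{peak}} \;[\text{MW}] \;\times\; \text{VOLL} \;[\$/\text{MWh}]
$$

The units cancel to dollars per year: hours times megawatts gives megawatt-hours of unserved energy, and VOLL prices each of them.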
A note on precision: the formula uses peak load as a simplification. Outages don't always occur at peak. More precise models apply a load factor of 0.5 to 0.7 to reflect average demand during outage hours, which would reduce the avoided-cost estimate proportionally. The reference case below uses peak load to illustrate the upper bound. Utilities building an internal business case should use their own coincident-peak and average-demand data.
Walk through the reference case.
| Parameter | Value | Notes |
|---|---|---|
| Peak load | 1,000 MW | Mid-size utility service territory |
| SAIDI improvement | 30 minutes (0.5 hours) | Upper range; 15-30 min typical in first 12-18 months |
| VOLL | $20,000/MWh | Conservative for mixed residential/commercial/industrial |
| Annual avoided cost | $10,000,000 | 0.5 × 1,000 × $20,000 |
| SRE investment | $2,000,000/year | Staff, tooling, training, consulting |
| Net annual benefit | $8,000,000 | 400% ROI |
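The reference case can be reproduced in a few lines. This is a minimal sketch: the function names and the optional `load_factor` parameter are illustrative, with `load_factor=1.0` giving the peak-load upper bound used in the table.

```python
def avoided_outage_cost(peak_load_mw, saidi_improvement_min, voll_per_mwh,
                        load_factor=1.0):
    """Annual avoided outage cost in dollars.

    load_factor=1.0 reproduces the peak-load upper bound; a value of
    0.5-0.7 approximates average demand during outage hours.
    """
    hours = saidi_improvement_min / 60.0
    return hours * peak_load_mw * load_factor * voll_per_mwh


def roi(annual_benefit, annual_cost):
    """Net benefit divided by cost, as a fraction (4.0 means 400%)."""
    return (annual_benefit - annual_cost) / annual_cost


program_cost = 2_000_000
benefit = avoided_outage_cost(1_000, 30, 20_000)

print(f"Avoided cost: ${benefit:,.0f}")                 # $10,000,000
print(f"Net benefit:  ${benefit - program_cost:,.0f}")  # $8,000,000
print(f"ROI:          {roi(benefit, program_cost):.0%}")  # 400%
```

Passing `load_factor=0.6` scales the avoided cost to about $6 million, which is the proportional reduction described in the precision note.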
Because both VOLL and achievable SAIDI improvement vary by utility, the matrix below shows how the economics shift across scenarios. All figures assume a 1,000 MW peak load utility.
| VOLL ($/MWh) | ΔSAIDI 15 min | ΔSAIDI 30 min | ΔSAIDI 60 min |
|---|---|---|---|
| $10,000 (residential-heavy) | $2.5M | $5.0M | $10.0M |
| $20,000 (reference case) | $5.0M | $10.0M | $20.0M |
| $35,000 (ERCOT system-wide) | $8.75M | $17.5M | $35.0M |
Even the most conservative scenario ($10,000 VOLL, 15-minute SAIDI gain) yields $2.5 million in avoided costs, enough to cover a fully staffed SRE program. The reference case at $20,000 and 30 minutes delivers the headline $10 million. A utility using ERCOT-calibrated VOLL with aggressive SAIDI targets could justify the program several times over.
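A utility can regenerate the matrix with its own inputs. The sketch below assumes the same peak-load simplification as the reference case; the constant and function names are illustrative.

```python
PEAK_LOAD_MW = 1_000
VOLL_CASES = [10_000, 20_000, 35_000]   # $/MWh
SAIDI_GAINS_MIN = [15, 30, 60]          # minutes of SAIDI improvement


def sensitivity_matrix(peak_load_mw, voll_cases, saidi_gains_min):
    """Map each VOLL to a {minutes: avoided dollars} row."""
    return {
        voll: {m: (m / 60) * peak_load_mw * voll for m in saidi_gains_min}
        for voll in voll_cases
    }


matrix = sensitivity_matrix(PEAK_LOAD_MW, VOLL_CASES, SAIDI_GAINS_MIN)

# Print one row per VOLL case, with avoided cost in millions of dollars.
for voll, row in matrix.items():
    cells = "  ".join(f"${value / 1e6:6.2f}M" for value in row.values())
    print(f"${voll:>6,}/MWh  {cells}")
```

Swapping in a different peak load or a locally calibrated VOLL changes every cell proportionally, since the formula is a straight product.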
The formula scales linearly with system size. A 500 MW utility with 30-minute SAIDI improvement sees $5 million in avoided costs. A 2,000 MW utility sees $20 million. Even a small rural cooperative serving 200 MW of peak load generates $2 million in avoided outage costs, enough to fully fund an SRE program. The economics work at every scale because the investment is primarily operational, not capital.
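Because the formula is linear in peak load, it can be inverted to find the system size at which avoided costs equal program cost. A minimal sketch; holding the program cost fixed at the reference $2 million for every utility size is an assumption made here for illustration.

```python
def break_even_peak_load_mw(program_cost, saidi_improvement_min, voll_per_mwh):
    """Peak load (MW) at which annual avoided cost equals program cost."""
    dollars_per_mw = (saidi_improvement_min / 60) * voll_per_mwh
    return program_cost / dollars_per_mw


# Reference assumptions: $2M program, 30-minute gain, $20,000/MWh.
reference = break_even_peak_load_mw(2_000_000, 30, 20_000)
print(f"Reference case break-even: {reference:,.0f} MW")  # 200 MW

# Most conservative matrix cell: 15-minute gain, $10,000/MWh.
conservative = break_even_peak_load_mw(2_000_000, 15, 10_000)
print(f"Conservative break-even:   {conservative:,.0f} MW")  # 800 MW
```

The break-even size moves with the inputs, so a small utility should rerun this with its own VOLL, its own expected SAIDI gain, and a program budget sized to its operation.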
§ 02 SRE vs. traditional investment patterns
Utilities have spent decades solving reliability problems the same way: more poles, more wires, more transformers, more substations. These are proven approaches. They carry a specific financial profile that makes them slow to deploy and slow to pay back.
| Category | Traditional utility | SRE approach |
|---|---|---|
| Capital intensity | High (poles, wires, transformers) | Low (software, training, process) |
| Payback period | 10-30 years | 6-18 months |
| Rate base growth | Required for shareholder returns | Not dependent on rate base |
| Scalability | Linear with infrastructure build | Exponential with automation |
| Risk profile | Long-duration asset risk | Operational, reversible |
| Regulatory treatment | CAPEX, added to rate base | Primarily OPEX |
A new substation costs $10-50 million, takes 3-7 years to permit and build, and delivers its reliability benefit only to the customers it directly serves. An SRE program costs $2 million per year, delivers measurable SAIDI improvement within 12 months, and its benefits compound across the entire service territory as automation and operational discipline spread.
This isn't an argument against capital investment. Utilities need physical infrastructure. The argument is that SRE provides a fundamentally different investment profile: low upfront cost, fast payback, system-wide benefit. For a utility looking to improve reliability metrics on a compressed timeline, there's no faster path.
§ 03 Where the operational savings come from
The $10 million VOLL-based figure captures the macro benefit of reduced customer outage minutes. An SRE program also generates direct operational savings through toil reduction. Toil, as defined in Part 02, is manual, repetitive, automatable work that scales linearly with system size. Every utility control room is full of it.
The estimates below are modeled on a typical mid-size utility control room (500-1,500 MW peak load, 3-5 operators per shift). Ranges reflect variation by utility size, existing automation level, and SCADA/DMS maturity. Your numbers may differ; the point is the order of magnitude.
- Alert triage. A typical SCADA installation generates hundreds to thousands of alarms per day, most of them nuisance-level. Automating classification and prioritization reclaims 2-4 operator hours per shift across a 24/7 rotation. At $80/hour fully loaded, that's $175K-$350K annually per dispatch center.
- Routine switching. Automated switching orders, pre-validated load-transfer sequences, and remote-controlled devices cut manual dispatch time by 20-40%. For a utility running 2,000 switching operations annually at roughly one crew-hour each, that's 400-800 saved crew-hours, or $50K-$100K at typical loaded rates.
- Compliance reporting. Automating IEEE 1366 index calculation, NERC CIP evidence collection, and rate-case reliability exhibits reclaims 0.5-1.5 FTEs in most utilities. $75K-$225K annually.
- Truck roll avoidance. Sensor correlation and automated fault location reduce unnecessary crew dispatches by 15-30%. At typical truck-roll costs of $500-$1,500 each, a utility running 10,000 annual dispatches saves $750K-$4.5M.
Taken together, operational savings add roughly $1M-$5M per year on top of VOLL-based avoided costs. In the upper part of that range, the SRE program pays for itself in toil reduction alone, before the reliability benefit shows up.
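The four line items can be rolled up as follows. This is a sketch, not a costing model: the three-shift day, the one crew-hour per switching operation, the $125 crew-hour rate, and the $150K FTE cost are assumptions back-solved from the ranges quoted above, not figures stated in the text.

```python
OPERATOR_RATE = 80            # $/hour fully loaded (from the text)
SHIFTS_PER_DAY = 3            # assumption: three shifts cover 24/7
HOURS_PER_SWITCHING_OP = 1.0  # assumption: implied by 20% of 2,000 ops = 400 h
CREW_HOUR_RATE = 125          # assumption: implied by $50K / 400 crew-hours
FTE_COST = 150_000            # assumption: implied by $75K / 0.5 FTE


def alert_triage(hours_per_shift):
    return hours_per_shift * SHIFTS_PER_DAY * 365 * OPERATOR_RATE


def routine_switching(ops_per_year, time_reduction):
    return ops_per_year * HOURS_PER_SWITCHING_OP * time_reduction * CREW_HOUR_RATE


def compliance_reporting(ftes_reclaimed):
    return ftes_reclaimed * FTE_COST


def truck_rolls(dispatches_per_year, share_avoided, cost_per_roll):
    return dispatches_per_year * share_avoided * cost_per_roll


low = (alert_triage(2) + routine_switching(2_000, 0.20)
       + compliance_reporting(0.5) + truck_rolls(10_000, 0.15, 500))
high = (alert_triage(4) + routine_switching(2_000, 0.40)
        + compliance_reporting(1.5) + truck_rolls(10_000, 0.30, 1_500))

print(f"Operational savings: ${low:,.0f} to ${high:,.0f} per year")
```

Truck roll avoidance dominates both ends of the range, so dispatch volume and per-roll cost are the first inputs worth replacing with local data.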
§ 04 Deferred capital
The highest-leverage economic benefit of SRE isn't listed on the avoided-cost line. It's in the capital investment you didn't have to make. When automated fault location, isolation, and service restoration (FLISR), predictive maintenance, and optimized switching restore reliability on an aging feeder, you defer or avoid the rebuild. A typical distribution feeder rebuild costs $2-5 million. Deferring it 5-10 years through better operational practice translates to $200K-$500K per year per feeder in deferred capital charges (roughly 10% of the rebuild cost per year), multiplied across a fleet of hundreds of feeders.
This is where the 400% ROI story understates the real economics. The SRE line item is $2M. The avoided-outage savings are $10M. The deferred capital is another $10-20M across a mid-size utility. The combined benefit-to-cost ratio routinely exceeds 10x.
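The deferred-capital arithmetic and the combined ratio can be checked directly. A sketch under one stated assumption: the 10% annual carrying-charge rate is inferred from the $200K-$500K per-feeder figures and is not a number given in the text.

```python
def annual_deferral_value(rebuild_cost, carrying_charge_rate=0.10):
    """Yearly capital charge avoided while a rebuild is deferred.

    The 10% default rate is an assumption inferred from the article's
    per-feeder range, not a quoted figure.
    """
    return rebuild_cost * carrying_charge_rate


def benefit_cost_ratio(annual_benefits, annual_cost):
    """Total annual benefits divided by annual program cost."""
    return sum(annual_benefits) / annual_cost


for rebuild_cost in (2_000_000, 5_000_000):
    value = annual_deferral_value(rebuild_cost)
    print(f"${rebuild_cost:,} rebuild deferred: ${value:,.0f} per year")

sre_cost = 2_000_000
avoided_outage = 10_000_000
for deferred_capital in (10_000_000, 20_000_000):
    ratio = benefit_cost_ratio([avoided_outage, deferred_capital], sre_cost)
    print(f"Deferred capital ${deferred_capital / 1e6:.0f}M: ratio {ratio:.0f}x")
```

With the figures above, the ratio runs from 10x to 15x before any operational savings are added.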
§ 05 Rate-case narrative
Rate cases reward utilities that demonstrate measurable, prudent, scalable operations. SRE-backed reliability improvements give you all three dimensions commissioners look for:
- Measurable. SLI dashboards, burn-rate trends, and continuous SAIDI tracking give you a paper trail that holds up under cross-examination.
- Prudent. Error budget policies, chaos-validated recovery plans, and blameless postmortems demonstrate disciplined decision-making.
- Scalable. The SRE investment profile (low capital, fast payback, system-wide benefit) aligns with the kind of spending that commissioners approve quickly.
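Continuous SAIDI tracking comes down to one ratio from IEEE 1366: total customer-minutes interrupted divided by total customers served. A minimal sketch, assuming a hypothetical `Outage` record with two fields; it ignores the standard's treatment of momentary interruptions and major event days, which a rate-case exhibit would need to handle.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """Hypothetical outage record; field names are illustrative."""
    customers_interrupted: int
    duration_min: float


def saidi_minutes(outages, customers_served):
    """SAIDI in minutes: customer-minutes interrupted per customer served."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    customer_minutes = sum(
        o.customers_interrupted * o.duration_min for o in outages
    )
    return customer_minutes / customers_served


# Illustrative data only, not drawn from any real utility.
sample = [Outage(1_200, 90), Outage(300, 45)]
print(f"SAIDI: {saidi_minutes(sample, 100_000):.3f} minutes")
```

Running the same function over a rolling window of outage records gives the trend line that the dashboards and burn-rate charts are built on.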
Utilities that adopt SRE practices early will frame the next generation of rate-case narratives. The ones that wait will spend their time explaining why their reliability investments cost more and delivered less.
§ 06 The compounding effect
The economics get better over time. Year one delivers the 400% headline ROI. Year two builds on that: the automation you shipped in year one continues to prevent outages, the crews you trained in year one are now training the next cohort, and the processes you measured in year one now drive year two's targets higher. By year five, the cumulative benefit typically runs 5-10x the cumulative program spend. The discipline compounds the same way good engineering always does.
§ 07 Next in the series
Part 07 closes the series. The economics are clear, the techniques are real, the regulatory fit is there. The remaining question is organizational: how do you actually build SRE culture inside a utility? It takes about 18 months. Here's the roadmap.
— Adam · adam@sgridworks.com · Apr 8, 2026