The $10 million SAIDI improvement.

A 30-minute SAIDI improvement avoids $10 million per year in outage costs. A typical SRE program costs $2 million. The math is not complicated. The hard part is doing the work.

Author: Adam Brown
Reading time: 10 min
Published: Apr 8, 2026
Series: SRE for the Grid, Part 6 of 7

A 30-minute SAIDI improvement for a 1,000 MW utility, using a conservative $20,000/MWh value-of-lost-load, avoids $10 million per year in outage costs. A typical SRE program costs $2 million per year to run. That's a 400% return on investment before counting reduced truck rolls, automated alert triage, or deferred capital spending. The math isn't complicated. The hard part is believing it, then doing the work.

Every conversation about grid reliability eventually lands on cost. Executives want to know what they're buying. Regulators want to know what ratepayers are getting. Engineers want to know they'll actually get the tools and headcount to execute. This piece lays out the full economic case for applying site reliability engineering to utility operations, with real numbers, verifiable formulas, and direct comparisons to traditional investment patterns.

§ 01 The math

The core formula for quantifying reliability improvement value uses the Value of Lost Load (VOLL), a well-established metric in utility economics. VOLL represents the average economic cost per megawatt-hour of unserved energy during an outage.

annual_avoided_cost = ΔSAIDI (hours) × peak_load (MW) × VOLL ($/MWh)    (Eq. 01)

A note on precision: the formula uses peak load as a simplification. Outages don't always occur at peak. More precise models apply a load factor of 0.5 to 0.7 to reflect average demand during outage hours, which would reduce the avoided-cost estimate proportionally. The reference case below uses peak load to illustrate the upper bound. Utilities building an internal business case should use their own coincident-peak and average-demand data.
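As a minimal sketch of Eq. 01, the function below computes the avoided cost and optionally applies the load-factor adjustment described above. The function and parameter names are illustrative and do not come from any utility toolkit; a load factor of 1.0 reproduces the peak-load upper bound.

```python
def annual_avoided_cost(delta_saidi_minutes, peak_load_mw, voll_per_mwh, load_factor=1.0):
    """Annual avoided outage cost in dollars, per Eq. 01.

    load_factor scales peak load down toward average demand during
    outage hours; 1.0 means "assume every outage happens at peak".
    """
    if not 0.0 < load_factor <= 1.0:
        raise ValueError("load_factor must be in (0, 1]")
    delta_saidi_hours = delta_saidi_minutes / 60.0
    return delta_saidi_hours * peak_load_mw * voll_per_mwh * load_factor


# Reference case: 30-minute gain, 1,000 MW peak, $20,000/MWh VOLL.
upper_bound = annual_avoided_cost(30, 1_000, 20_000)
print(f"Peak-load upper bound: ${upper_bound:,.0f}")

# Same case with the 0.5 and 0.7 load factors mentioned in the text.
for lf in (0.5, 0.7):
    adjusted = annual_avoided_cost(30, 1_000, 20_000, load_factor=lf)
    print(f"Load factor {lf}: ${adjusted:,.0f}")
```

With these inputs, the load-factor adjustment brings the $10 million upper bound down to roughly $5-7 million, which still exceeds the $2 million program cost.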

Walk through the reference case.

| Parameter | Value | Notes |
|---|---|---|
| Peak load | 1,000 MW | Mid-size utility service territory |
| SAIDI improvement | 30 minutes (0.5 hours) | Upper range; 15-30 min typical in first 12-18 months |
| VOLL | $20,000/MWh | Conservative for mixed residential/commercial/industrial |
| Annual avoided cost | $10,000,000 | 0.5 × 1,000 × $20,000 |
| SRE investment | $2,000,000/year | Staff, tooling, training, consulting |
| Net annual benefit | $8,000,000 | 400% ROI |
Fig. 01 · Reference case. 1,000 MW utility, 30-min SAIDI gain, conservative VOLL.
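The net benefit and ROI in the reference case follow from simple arithmetic, sketched here with illustrative function names. ROI is taken as net benefit divided by program cost, which is how the 400% figure comes out; the gross benefit-to-cost ratio for the same inputs is 5x.

```python
def net_annual_benefit(avoided_cost, program_cost):
    """Avoided outage cost minus what the program costs to run."""
    return avoided_cost - program_cost


def roi_percent(avoided_cost, program_cost):
    """Net benefit expressed as a percentage of program cost."""
    return 100.0 * net_annual_benefit(avoided_cost, program_cost) / program_cost


def benefit_cost_ratio(avoided_cost, program_cost):
    """Gross avoided cost per dollar of program spend."""
    return avoided_cost / program_cost


avoided = 0.5 * 1_000 * 20_000   # hours x MW x $/MWh, per Eq. 01
program = 2_000_000              # annual SRE program cost from the reference case

print(f"Net annual benefit: ${net_annual_benefit(avoided, program):,.0f}")
print(f"ROI: {roi_percent(avoided, program):.0f}%")
print(f"Benefit-to-cost ratio: {benefit_cost_ratio(avoided, program):.1f}x")
```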

On VOLL. $20,000/MWh is conservative. The Lawrence Berkeley National Laboratory Interruption Cost Estimate (ICE) Calculator puts residential VOLL at $2,000-$5,000/MWh, commercial at $10,000-$50,000/MWh, and industrial at $25,000-$100,000/MWh depending on sector, duration, and time of day. ERCOT's 2024 value-of-lost-load analysis by The Brattle Group arrived at a system-wide weighted average of $35,685/MWh. A mixed service territory averaging $20,000/MWh is a defensible starting point for planning. Some PUC filings have used numbers two to five times higher.

Because both VOLL and achievable SAIDI improvement vary by utility, the matrix below shows how the economics shift across scenarios. All figures assume a 1,000 MW peak load utility.

| VOLL ($/MWh) | ΔSAIDI 15 min | ΔSAIDI 30 min | ΔSAIDI 60 min |
|---|---|---|---|
| $10,000 (residential-heavy) | $2.5M | $5.0M | $10.0M |
| $20,000 (reference case) | $5.0M | $10.0M | $20.0M |
| $35,000 (ERCOT system-wide) | $8.75M | $17.5M | $35.0M |
Fig. 02 · Avoided-cost matrix. Pick your VOLL and target, read the dollar figure.

Even the most conservative scenario ($10,000 VOLL, 15-minute SAIDI gain) yields $2.5 million in avoided costs, enough to cover a fully staffed SRE program. The reference case at $20,000 and 30 minutes delivers the headline $10 million. A utility using ERCOT-calibrated VOLL with aggressive SAIDI targets could justify the program several times over.
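The avoided-cost matrix can be regenerated for any peak load with a few lines. This is a sketch under the same assumptions as the text (1,000 MW peak, no load-factor adjustment); the constant and function names are illustrative.

```python
PEAK_LOAD_MW = 1_000
PROGRAM_COST = 2_000_000                 # annual SRE program cost from the text
VOLLS = (10_000, 20_000, 35_000)         # $/MWh scenarios
DELTA_SAIDI_MINUTES = (15, 30, 60)       # improvement scenarios


def avoided_cost_matrix(peak_load_mw, volls, delta_minutes):
    """Map each VOLL to a dict of {minutes: avoided cost in dollars}."""
    return {
        voll: {m: (m / 60) * peak_load_mw * voll for m in delta_minutes}
        for voll in volls
    }


matrix = avoided_cost_matrix(PEAK_LOAD_MW, VOLLS, DELTA_SAIDI_MINUTES)
for voll, row in matrix.items():
    cells = "  ".join(f"{m:>2} min: ${cost / 1e6:>5.2f}M" for m, cost in row.items())
    print(f"VOLL ${voll:>6,}/MWh  {cells}")

# The smallest cell is the most conservative scenario discussed above.
smallest = min(cost for row in matrix.values() for cost in row.values())
print(f"Smallest cell covers the program: {smallest >= PROGRAM_COST}")
```

Swapping in a utility's own peak load and VOLL estimate is a one-line change to the constants at the top.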

The formula scales linearly with system size. A 500 MW utility with a 30-minute SAIDI improvement sees $5 million in avoided costs. A 2,000 MW utility sees $20 million. Even a small rural cooperative serving 200 MW of peak load generates $2 million in avoided outage costs, enough to cover the $2 million reference program on avoided outage costs alone, before any operational savings are counted. The economics work at every scale because the investment is primarily operational, not capital.
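Because the formula is linear, it can be inverted to find the peak load at which avoided outage costs equal program cost. The sketch below rearranges Eq. 01; the function name is illustrative, and the $2 million program cost is the reference figure used throughout this piece.

```python
def break_even_peak_load_mw(program_cost, delta_saidi_minutes, voll_per_mwh):
    """Peak load (MW) at which Eq. 01 avoided cost equals program cost."""
    delta_saidi_hours = delta_saidi_minutes / 60
    return program_cost / (delta_saidi_hours * voll_per_mwh)


# Reference VOLL and program cost, across the three SAIDI scenarios.
for minutes in (15, 30, 60):
    mw = break_even_peak_load_mw(2_000_000, minutes, 20_000)
    print(f"{minutes}-minute gain: break-even at {mw:,.0f} MW peak load")
```

At the reference VOLL and a 30-minute gain, break-even lands at 200 MW, which matches the rural cooperative example. A smaller SAIDI gain or a lower VOLL raises that threshold proportionally.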

§ 02 SRE vs. traditional investment patterns

Utilities have spent decades solving reliability problems the same way: more poles, more wires, more transformers, more substations. These are proven approaches. They carry a specific financial profile that makes them slow to deploy and slow to pay back.

| Category | Traditional utility | SRE approach |
|---|---|---|
| Capital intensity | High (poles, wires, transformers) | Low (software, training, process) |
| Payback period | 10-30 years | 6-18 months |
| Rate base growth | Required for shareholder returns | Not dependent on rate base |
| Scalability | Linear with infrastructure build | Exponential with automation |
| Risk profile | Long-duration asset risk | Operational, reversible |
| Regulatory treatment | CAPEX, added to rate base | Primarily OPEX |
Fig. 03 · Two very different financial profiles. Utilities need both; they're built for different jobs.

A new substation costs $10-50 million, takes 3-7 years to permit and build, and delivers its reliability benefit only to the customers it directly serves. An SRE program costs $2 million per year, delivers measurable SAIDI improvement within 12 months, and its benefits compound across the entire service territory as automation and operational discipline spread.

This isn't an argument against capital investment. Utilities need physical infrastructure. The argument is that SRE provides a fundamentally different investment profile: low upfront cost, fast payback, system-wide benefit. For a utility looking to improve reliability metrics on a compressed timeline, there's no faster path.

§ 03 Where the operational savings come from

The $10 million VOLL-based figure captures the macro benefit of reduced customer outage minutes. An SRE program also generates direct operational savings through toil reduction. Toil, as defined in Part 02, is manual, repetitive, automatable work that scales linearly with system size. Every utility control room is full of it.

The estimates below are modeled on a typical mid-size utility control room (500-1,500 MW peak load, 3-5 operators per shift). Ranges reflect variation by utility size, existing automation level, and SCADA/DMS maturity. Your numbers may differ; the point is the order of magnitude.

  • Alert triage. A typical SCADA installation generates hundreds to thousands of alarms per day, most of them nuisance-level. Automating classification and prioritization reclaims 2-4 operator hours per shift across a 24/7 rotation. At $80/hour fully loaded, that's $175K-$350K annually per dispatch center.
  • Routine switching. Automated switching orders, pre-validated load-transfer sequences, and remote-controlled devices cut manual dispatch time by 20-40%. For a utility running 2,000 switching operations annually, that's 400-800 saved crew-hours, $50K-$100K at typical loaded rates.
  • Compliance reporting. Automating IEEE 1366 index calculation, NERC CIP evidence collection, and rate-case reliability exhibits reclaims 0.5-1.5 FTEs in most utilities. $75K-$225K annually.
  • Truck roll avoidance. Sensor correlation and automated fault location reduce unnecessary crew dispatches by 15-30%. At typical truck-roll costs of $500-$1,500 each, a utility running 10,000 annual dispatches saves $750K-$4.5M.

Taken together, operational savings add roughly $1M-$5M per year on top of VOLL-based avoided costs. From about the middle of that range upward, the $2 million SRE program pays for itself in toil reduction alone, before the reliability benefit shows up.
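The four savings ranges above can be reproduced and summed with the sketch below. Three inputs are not stated in the text and are assumptions back-solved from its dollar figures: three operator shifts per day, one crew-hour per switching operation at $125 per crew-hour, and $150,000 per FTE-year.

```python
SHIFTS_PER_YEAR = 3 * 365   # assumption: three shifts per day, every day
OPERATOR_RATE = 80          # $/hour fully loaded (from the text)
CREW_RATE = 125             # $/crew-hour; assumption back-solved from $50K / 400 hours
FTE_COST = 150_000          # $/FTE-year; assumption back-solved from $75K / 0.5 FTE

# Each entry is (low, high) annual savings in dollars.
savings = {
    # 2-4 operator hours reclaimed per shift
    "alert triage": (2 * SHIFTS_PER_YEAR * OPERATOR_RATE,
                     4 * SHIFTS_PER_YEAR * OPERATOR_RATE),
    # 2,000 operations, 20-40% time saved, assuming one crew-hour per operation
    "routine switching": (2_000 * 20 // 100 * CREW_RATE,
                          2_000 * 40 // 100 * CREW_RATE),
    # 0.5-1.5 FTEs reclaimed
    "compliance reporting": (0.5 * FTE_COST, 1.5 * FTE_COST),
    # 10,000 dispatches, 15-30% avoided, $500-$1,500 per truck roll
    "truck roll avoidance": (10_000 * 15 // 100 * 500,
                             10_000 * 30 // 100 * 1_500),
}

for name, (low, high) in savings.items():
    print(f"{name:<22} ${low:>10,.0f} - ${high:>10,.0f}")

low_total = sum(low for low, _ in savings.values())
high_total = sum(high for _, high in savings.values())
print(f"{'total':<22} ${low_total:>10,.0f} - ${high_total:>10,.0f}")
```

Truck roll avoidance dominates the high end of the range, so that line is the one most worth calibrating against a utility's own dispatch data.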

§ 04 Deferred capital

The highest-leverage economic benefit of SRE isn't listed on the avoided-cost line. It's in the capital investment you didn't have to make. When automated FLISR, predictive maintenance, and optimized switching restore reliability on an aging feeder, you defer or avoid the rebuild. A typical distribution feeder rebuild costs $2-5 million. Deferring it 5-10 years through better operational practice translates to $200K-$500K per year per feeder in deferred capital charges, compounded across a fleet of hundreds of feeders.

This is where the 400% ROI story understates the real economics. The SRE line item is $2M. The avoided-outage savings are $10M. The deferred capital is another $10-20M across a mid-size utility. The combined benefit-to-cost ratio routinely exceeds 10x.
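A short sketch of the arithmetic behind these figures. The 10% annual carrying rate is an assumption inferred from the text's numbers ($2-5 million rebuild, $200K-$500K per year), not a stated input, and the function names are illustrative.

```python
def annual_deferred_charge(rebuild_cost, carrying_rate=0.10):
    """Yearly capital charge avoided while a rebuild is deferred.

    carrying_rate is an assumed annual cost of capital, back-solved from
    the $200K-$500K per year quoted for a $2-5 million rebuild.
    """
    return rebuild_cost * carrying_rate


def combined_benefit_cost_ratio(avoided_outage, deferred_capital, program_cost):
    """Avoided outage cost plus deferred capital, per dollar of program spend."""
    return (avoided_outage + deferred_capital) / program_cost


for rebuild in (2_000_000, 5_000_000):
    charge = annual_deferred_charge(rebuild)
    print(f"${rebuild / 1e6:.0f}M rebuild deferred: ${charge:,.0f} per year")

# $2M program, $10M avoided outage cost, $10-20M deferred capital.
for deferred in (10_000_000, 20_000_000):
    ratio = combined_benefit_cost_ratio(10_000_000, deferred, 2_000_000)
    print(f"${deferred / 1e6:.0f}M deferred capital: {ratio:.0f}x benefit-to-cost")
```

With the figures from the text, the combined ratio runs from 10x to 15x.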

§ 05 Rate-case narrative

Rate cases reward utilities that demonstrate measurable, prudent, scalable operations. SRE-backed reliability improvements give you all three dimensions commissioners look for:

  • Measurable. SLI dashboards, burn-rate trends, and continuous SAIDI tracking give you a paper trail that holds up under cross-examination.
  • Prudent. Error budget policies, chaos-validated recovery plans, and blameless postmortems demonstrate disciplined decision-making.
  • Scalable. The SRE investment profile (low capital, fast payback, system-wide benefit) aligns with the kind of spending that commissioners approve quickly.

Utilities that adopt SRE practices early will frame the next generation of rate-case narratives. The ones that wait will spend their time explaining why their reliability investments cost more and delivered less.

§ 06 The compounding effect

The economics get better over time. Year one delivers the 400% headline ROI. Year two builds on that: the automation you shipped in year one continues to prevent outages, the crews you trained in year one are now training the next cohort, the processes you measured in year one now drive year two's targets higher. By year five, the cumulative benefit typically runs 5-10x the initial investment. The discipline compounds the same way good engineering always does.

§ 07 Next in the series

Part 07 closes the series. The economics are clear, the techniques are real, the regulatory fit is there. The remaining question is organizational: how do you actually build SRE culture inside a utility? It takes about 18 months. Here's the roadmap.

— Adam · adam@sgridworks.com · Apr 8, 2026
