Writing · Field notes from operations

Writing.

Two series and a handful of standalone essays on grid reliability, SRE for utilities, behind-the-meter (BTM) economics, and what it actually takes to modernize infrastructure you can't turn off. Plain text, specific numbers, no transformation decks.

§ 01

SRE for the Grid.

Seven-part series porting Google's SRE methodology to distribution reliability. Budget, don't scorecard.

01 · Mar 11, 2026

BGP, FLISR, and the shape of a failure you can afford.

Network engineers have known for two decades that reliability is a budget, not a goal. The same math works on a 12.47 kV distribution feeder, if you stop treating SAIDI as a scorecard and start treating it as a currency.

18 min read

02 · Mar 3, 2026

Why your grid is already running SRE.

FLISR is network failover. Reclosers are retry-with-backoff. Storm drills are chaos engineering. You just don't call them that, and you don't do them systematically.

12 min read

03 · Mar 18, 2026

Why N-1/N-2 planning can't keep up with the modern grid.

Contingency planning gives binary pass/fail results for a probabilistic world. The 2003 Northeast blackout showed that every component can pass its contingency check and the system can still fail.

14 min read

04 · Mar 25, 2026

Chaos engineering for the grid: breaking things safely on purpose.

Utilities already practice chaos engineering. Storm drills, black start exercises, protection relay testing. The missing piece is making it systematic, continuous, and measured.

11 min read

05 · Apr 1, 2026

SRE doesn't replace IEEE 1366. It makes it better.

How SRE practices complement IEEE 1366 reliability reporting. Per-feeder error budgets, burn-rate tracking, and real-time SAIDI forecasting layered onto existing regulatory frameworks.

13 min read

06 · Apr 8, 2026

The $10 million SAIDI improvement.

A 30-minute SAIDI improvement avoids $10 million per year in outage costs. A typical SRE program costs $2 million. The math works at every scale.

9 min read

07 · Apr 15, 2026

Building SRE culture at a utility: the 18-month transformation.

SRE adoption at a utility is an organizational change that takes about 18 months. The cultural barriers are real but solvable. Here is the roadmap, from existing frameworks to new roles.

16 min read

§ 02

AI in the Control Room.

Four-part series on deploying AI assistants at utilities without leaking CEII, breaking NERC CIP, or waking up legal counsel.

§ 03

Essays.

Standalone writing on modernization, iteration, and the operator's view of the grid.