Writing · Essay

The Agentic Epoch. Autonomy at the grid edge.

The substation is ready to think. Local MoE models, self-hosted agents, and open-source tools have closed the gap between what the grid actually needs and what cognition can deliver inside the fence.

Author · Adam Brown
Reading time · 10 min
Published · Apr 2026
Essay · Standalone

For a century the substation has been a sensor endpoint. SCADA samples it. A dispatcher somewhere else makes the decisions. Automation runs the narrow loops nobody has time to think about. The substation itself has never been asked to think. The pieces are finally in place to change that.

Three movements are pulling in the same direction. Open-weights models reached a capability threshold that closed the gap between "local" and "useful" for the reasoning tasks the grid actually needs. Mixture-of-experts architectures made big-model quality cheap enough to run on commodity industrial compute. Agent runtimes grew up fast enough that a substation-local agent can hold memory across shifts, plan multi-step work, call tools, and reach an operator on whatever channel they happen to be near. Autonomy at the grid edge has stopped being a slide deck. It's a stack you can stand up today.

§ 01 · What “autonomy at the edge” actually means

The phrase "autonomous substation" describes something specific: a reasoning locus inside the fence that can interpret local telemetry in operational context, maintain state across days and seasons, reason about rare or novel conditions using general knowledge, act on decisions within predeclared and auditable envelopes, and escalate anything outside those envelopes to a human who actually wants the page.

Autonomy runs in degrees. Every real deployment climbs one rung at a time, and the higher rungs only ship after the lower rungs have been operationally boring for months. Nothing on this ladder requires a breakthrough.

01 · Advisory. Agent drafts recommendations offline. Human makes every decision. Nothing touches the grid.

02 · Shadow. Agent runs alongside existing automation, logs its own recommendation on every decision, actuates nothing. Months of shadow output build the evidence base.

03 · Supervised. Agent proposes setpoint changes on slow control loops. A human confirms. Every action is logged, reviewable, and reversible.

04 · Bounded autonomous. Agent acts freely within declared safe envelopes. Anything outside the envelope escalates. The envelope itself is published and auditable.

05 · Coordinated autonomous. Agents at adjacent substations reason together about dynamic protection, fault isolation, and restoration within a shared envelope. The horizon the ladder points toward.

§ 02 · The stack that makes it possible

Three layers, all open, all already on the shelf.

Hardware

A substation doesn't need a data center. A fanless industrial PC with 32–64 GB of RAM and a modest NPU or GPU runs a serious model at interactive latency. A Jetson Orin NX drops into a DIN rail next to the RTU. The point isn't the specific box. Commodity compute that can do real reasoning on substation-local data is quiet and cool enough to bolt into existing racks, and the price envelope is already below what utilities routinely spend on a smart relay.

Models

Gemma 4 E4B is the model I reach for first, and honestly the best small-footprint edge model I've run. E4B is the edge-optimized variant of Google's Gemma 4 family. The larger Gemma 4 siblings lean on mixture-of-experts routing; E4B (“effective 4B”) is tuned for modest hardware, with a per-token compute footprint near a dense 4B model and quality that punches far above its weight class. It reasons over a feeder topology, drafts a switching sequence, summarizes a burst of relay events, and routes retrieval requests to a local vector store, all without a single packet leaving the perimeter.

Primer · Mixture of Experts

Dense transformer models fire every parameter for every token. A mixture-of-experts (MoE) model trains many specialist sub-networks (experts) plus a small router network that picks two or three of them per token. The math only touches the activated experts, so per-token compute stays at small-model cost even though the model's total parameter count can be much larger.

You get closer to big-model capability at small-model inference cost. The tradeoff: all experts still have to sit in memory or be streamed in, so the VRAM budget reflects the total parameter count rather than the activated slice.
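The routing idea in the primer fits in a few lines. Below is a toy sketch, not any production architecture: a single MoE step for one token vector, with a linear "expert" standing in for each feed-forward sub-network and a random router picking the top two. The dimensions and expert count are arbitrary.

```python
import numpy as np

def moe_forward(x, experts, router_w, k=2):
    """One toy mixture-of-experts step for a single token vector x.

    experts  : list of (W, b) pairs, each a linear stand-in for an expert
    router_w : routing matrix, one score column per expert
    k        : number of experts activated per token
    """
    scores = x @ router_w                     # router scores every expert
    top = np.argsort(scores)[-k:]             # keep only the k best
    gates = np.exp(scores[top])
    gates /= gates.sum()                      # softmax over the chosen k
    # Only the activated experts do any math; the rest stay idle,
    # but all of them still occupy memory.
    out = sum(g * (x @ experts[i][0] + experts[i][1])
              for g, i in zip(gates, top))
    return out, top

rng = np.random.default_rng(0)
d, n_experts = 8, 6
experts = [(rng.normal(size=(d, d)), rng.normal(size=d))
           for _ in range(n_experts)]
router_w = rng.normal(size=(d, n_experts))
y, active = moe_forward(rng.normal(size=d), experts, router_w, k=2)
```

Per-token compute scales with `k`, while memory scales with `n_experts`, which is exactly the VRAM tradeoff described above.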

Beyond Gemma 4 E4B, the open-weights ecosystem is deep. Qwen, Llama, Mistral, DeepSeek, and Phi cover most of the quality and latency envelope a utility cares about. All run on Ollama, vLLM, or llama.cpp. All are swappable. Every six months the capability bar moves up for free, in place, without a procurement cycle.
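Swappability is concrete: against Ollama's `/api/generate` endpoint, changing models is one string. A minimal sketch, assuming a locally running Ollama server and a pulled model (the `gemma4:e4b` tag below is a placeholder, not a confirmed registry name):

```python
import json
import urllib.request

def build_payload(prompt, model="gemma4:e4b"):
    """Request body for Ollama's /api/generate. The model tag is whatever
    you've pulled locally; the default here is a placeholder."""
    return {"model": model, "prompt": prompt, "stream": False}

def local_generate(prompt, model="gemma4:e4b", host="http://localhost:11434"):
    """One blocking completion from a locally served model. Requires an
    Ollama server inside the perimeter; no packet leaves the fence."""
    data = json.dumps(build_payload(prompt, model)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Swapping to a Qwen or Llama checkpoint is a change to the `model` argument, which is what "upgrade in place, no procurement cycle" looks like at the code level.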

Agent runtime

Hermes is the runtime I run in production. Persistent memory lives in a shared Obsidian vault. Sub-agents spawn per task and retire when they're done. Nightly maintenance agents curate memory and surface contradictions to recent decisions. Telegram and Discord gateways let the agent reach a human wherever they actually are. External tools plug in through MCP (Model Context Protocol), so the same agent that drafts an interconnection study can also read a historian, query an ADMS, or watch a regulatory docket.
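The tool-plugging pattern is simpler than it sounds. This is an illustration of the routing idea only, not the MCP wire protocol; the tool names are hypothetical:

```python
from typing import Callable, Dict

class ToolRegistry:
    """Minimal tool-routing sketch: named tools behind one call interface,
    so the agent layer never hard-codes a backend."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs) -> str:
        if name not in self._tools:
            # Unknown tool: escalate rather than guess.
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

tools = ToolRegistry()
tools.register("historian.read", lambda tag, hours: f"{hours}h of samples for {tag}")
tools.register("docket.watch", lambda docket: f"subscribed to {docket}")

reply = tools.call("historian.read", tag="feeder-12/voltage", hours=24)
```

The real protocol adds discovery, schemas, and transport, but the shape is the same: the historian, the ADMS, and the docket watcher are all just named tools behind one interface.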

None of this is proprietary. All of it is replaceable. Most of it is already in your organization's GitHub lineage if you've been paying attention.

§ 03 · Walking the ladder into real-time operations

The honest sequence from where most utilities are today to the substation that reasons runs through the same five rungs, and each rung is valuable on its own. The hardest work across all of them isn't the agent; it's the integration with the legacy SCADA, ADMS, and RTU stacks that sit between a recommendation and the device. That plumbing is the multi-year payoff, and it earns out regardless of how far up the ladder the agent eventually climbs.

Rung 1 · Offline reasoning. Start where nothing is safety-critical. Drafting interconnection studies, summarizing event bursts after the fact, triaging work orders, watching regulatory dockets, generating post-event reports. The agent learns operational feel, engineers get their evenings back, and nothing on the grid depends on the agent being right the first time.

Rung 2 · Shadow mode at the edge. Deploy a local agent in the substation alongside the existing automation. Let it watch Volt/VAR optimization make its decisions, score them, log its own alternative. No actuation. Months of shadow output reveal where the agent agrees with incumbent rules, where it would have caught a drift the rule missed, and where the rule is still the better bet. This rung is the cheapest and the most operationally informative.
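What shadow-mode evidence looks like on disk can be sketched in a few lines. Field names, the loop identifier, and the 0.5 V agreement tolerance below are all illustrative assumptions, not a standard:

```python
from dataclasses import dataclass

@dataclass
class ShadowRecord:
    """One shadow-mode decision: what the incumbent rule actually did
    vs. what the agent would have done. No actuation either way."""
    timestamp: str
    loop: str
    incumbent_action: float   # setpoint the rule applied (volts)
    agent_action: float       # setpoint the agent would have applied
    tolerance: float = 0.5    # agreement threshold in volts (assumed)

    @property
    def agrees(self) -> bool:
        return abs(self.incumbent_action - self.agent_action) <= self.tolerance

def agreement_rate(records):
    return sum(r.agrees for r in records) / len(records)

log = [
    ShadowRecord("2026-04-01T00:15", "vvo/feeder-12", 121.0, 120.8),
    ShadowRecord("2026-04-01T00:30", "vvo/feeder-12", 121.0, 119.5),
]
```

Months of records like these, sliced by loop and by condition, are the evidence base that earns the next rung.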

Rung 3 · Supervised setpoint adjustment. Start with the narrowest, slowest control loops. Volt/VAR setpoints on a 15-minute cadence. Conservation voltage reduction targets reviewed on change. Capacitor bank dispatch schedules. The agent proposes, a human confirms, the action actuates and logs. Audit trail follows the framework laid out in the control-room series. Every out-of-envelope recommendation escalates.

Rung 4 · Bounded autonomy. Once the supervised rung is boring, the envelope gets published. The agent may adjust Volt/VAR setpoints within ±2% of nominal voltage targets. It may dispatch DER within pre-cleared interconnection envelopes. It may adjust adaptive protection settings within coordinated ranges set by the protection engineer. It may manage microgrid islanding within declared load thresholds. Inside the envelope the agent acts. Outside the envelope it escalates. The envelope itself is published, reviewable, and amendable.
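The envelope check itself is deliberately dumb. A minimal sketch of the Volt/VAR guard, using the ±2%-of-nominal band named above (the 120 V nominal is illustrative):

```python
def vvo_guard(proposed, nominal=120.0, band=0.02):
    """Bounded-autonomy gate for a Volt/VAR setpoint. Inside the published
    envelope the agent acts; outside it, the proposal escalates to a human.
    Numbers are illustrative, not a real feeder's settings."""
    lo, hi = nominal * (1 - band), nominal * (1 + band)
    if lo <= proposed <= hi:
        return "act", proposed        # inside the published envelope
    return "escalate", proposed       # outside: a human gets the page
```

The point of keeping the guard this simple is that it is trivially auditable: the envelope in the published document and the envelope in the code are the same four numbers, and no model output can widen them.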

Rung 5 · Coordinated autonomy. Agents at adjacent substations reason together about dynamic protection coordination, fault isolation, and service restoration. A bulk-system disturbance ripples through an ensemble of local reasoners, each one holding the context of its own feeder and cooperating on a restoration sequence that respects the physics and the customer. This is the horizon. It's reachable on the same hardware and the same runtime.

Clarification

"Real-time" on the agent layer doesn't require an LLM in the microsecond control loop. Deterministic relays and protective devices keep running the physics. The agent sits on top, reasoning and authorizing on operator-scale timescales (seconds to minutes) and handing actions to the automation that knows how to execute them. Cognition at the edge layers onto the existing protection and control architecture; it doesn't replace it.

§ 04 · Use cases worth naming

Hedging aside, the near-term unlocks:

  • Volt/VAR Optimization. Agent-driven VVO that learns from load patterns, DER output forecasts, and seasonal drift. Works even when the ADMS loses its WAN link to the substation.
  • DER dispatch and curtailment. Agent arbitrates between competing interconnection envelopes, EV charging demand, and storage state-of-charge. Local reasoning, local action, auditable decision trail.
  • Adaptive protection. Agent proposes protection setting changes as topology evolves or DER penetration grows, always inside coordination envelopes the protection engineer has signed.
  • Autonomous restoration. Agent reasons about fault isolation with access to crew status, weather nowcast, and DER island potential. Full restoration sequences get human-confirmed in seconds, not minutes.
  • Feeder-scale load and generation forecasting. Local model learns the quirks of this feeder, not the average of the service territory. Forecast quality improves without ever shipping customer-interval data off-site.
  • Microgrid islanding and resynchronization. Agent holds operational state through islanded operation, coordinates DER and load, and manages resynchronization with the bulk system.

Every one of these is already possible on the stack described above. What's missing is deployments, not breakthroughs.

§ 05 · Why open source is the right foundation

A proprietary stack for this layer would be a disaster for utilities. Three reasons.

Sovereignty. CEII, CIP, customer data, operational topology. None of this should depend on a vendor's willingness to keep selling you a contract. Open weights and open runtimes keep the stack yours, your auditor's, and your regulator's, in that order.

Auditability. A regulator, an internal auditor, or your own reliability engineer has to be able to read the agent's decision trail, probe model behavior on edge cases, and confirm that what shipped matches what was declared. Closed boxes fail that test by design. Open weights and open interfaces pass it by default.

Portability. Six months from now, a better model will land. Twelve months from now, a better runtime. Twenty-four months from now, a hardware class that doubles the efficiency envelope. Open interfaces (MCP, Ollama's API, the agent-runtime protocols) let you swap components without rewriting the stack. The whole stack becomes a Ship of Theseus that never takes the grid offline to replace a plank.

The community advantage compounds. Open-weights models improve monthly. Tooling stabilizes. The utility-specific patterns that emerge get shared between utilities because no one owns the substrate they're written on. The next utility solves the problem once, and everyone benefits. That's what "open" means in practice.

§ 06 · A brief word on the gates that stay

None of this is a case for reckless autonomy. The gates that matter stay in place on every rung. Hallucinated setpoints are unacceptable in OT, so human confirmation stays mandatory until shadow-mode evidence earns the next rung. Every agent action, every model call, every file read, gets logged and attributable. Memory drifts if nobody curates it, so consolidation runs nightly. Agents are still bad at knowing what they don't know, so operators keep owning the envelope design. The governance is the point; autonomy is the payoff.

§ 07 · What this future looks like

A substation that reasons. A feeder that notices. A crew dispatched to the right place before the customer calls. A regulator who can read the agent's reasoning and sign off on it. A grid that adapts to DER penetration, storm damage, and load growth without a dispatcher having to carry every variable in their head.

I run the ancestor of this in my office closet. A shared vault, Hermes, Gemma 4 E4B, and a set of gateways that find me when it matters. Scaling the same architecture down to a substation industrial PC is an engineering problem, not a research problem. The path from where the industry is today to the autonomous substation is clear, short, and already underway in the repos that utilities haven't bothered to notice yet.

The Agentic Epoch at the grid edge is a choice utilities get to make. The ones that decide to run this stack will hit reliability, flexibility, and decarbonization outcomes the old architecture can't touch. The ones that wait for the vendor pitch deck will implement the same thing two years late and at four times the price. The stack is open. The models are good. Gemma 4 E4B is the best of them right now, and the next one will be better.

The substation is ready to think. Whoever gives it permission first gets to define what the next century of grid operations looks like.

— Adam · adam@sgridworks.com · Apr 2026