Production infrastructure notes

Spend the error budget wisely

Infrastructure notes on running systems that can fail but must recover. Production patterns for AI workloads, VMware deployments, and audit-ready operations — from an architect operating regulated fintech systems for 10+ years.

Free · Operators only · No spam

The premise

Every production system has an error budget — the amount of failure it can absorb before SLAs break. The job is not zero downtime. It is spending that budget deliberately: on planned maintenance, controlled rollouts, and calculated risk — not on surprises at 3am.
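As a rough illustration of the arithmetic behind that premise, the sketch below converts an availability target into minutes of allowed downtime and subtracts what has already been spent. The 99.9% target, the 30-day window, and the function names are assumptions chosen for the example, not figures from any particular SLA.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime allowed in the window for a target such as 0.999."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    window_minutes = window_days * 24 * 60
    # The budget is the fraction of the window the target lets you be down.
    return window_minutes * (1.0 - slo_target)


def remaining_minutes(slo_target: float, spent_minutes: float,
                      window_days: int = 30) -> float:
    """Budget left after planned and unplanned downtime; negative means overspent."""
    return error_budget_minutes(slo_target, window_days) - spent_minutes


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)          # hypothetical 99.9% target
    left = remaining_minutes(0.999, spent_minutes=30.0)  # e.g. one planned maintenance
    print(f"budget: {budget:.1f} min, remaining: {left:.1f} min")
```

Under these assumptions a 99.9% target over 30 days allows about 43.2 minutes of downtime, so a 30-minute planned maintenance leaves roughly 13 minutes for everything else in the window.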

What I write about

VMware vSphere · NVIDIA AI Enterprise · Dell VxRail · HPE Synergy · vGPU & MIG · Audit-ready ops · Reliability patterns · NCA-AIIO prep

the error budget

One deep technical note every Friday. Production patterns, audit-ready configurations, and lessons from operating mission-critical infrastructure. Written by an architect, for architects.

Free. Unsubscribe anytime. No spam, ever.

10+ years of fintech infrastructure

10 VMware clusters in production

Anonymous: no vendor influence