About Me

I'm Brandon, a DevOps engineer, consultant, and infrastructure architect who has spent the better part of 15 years building, breaking, and rebuilding production systems at scale.

I started my career as a software engineer, but got pulled into the infrastructure side of things around 2011 when I taught myself Chef to manage a fleet of 50+ Ruby on Rails servers in AWS. Back then, most people hadn't even heard the word "DevOps." I've been in the space ever since.

The Consulting Years

Most of my career has been spent as a consultant, which means I've been dropped into a lot of different environments (usually ones that are on fire) and asked to make things work. A few highlights:

I spent time as one of two architects providing cloud governance and guardrails across 100+ AWS accounts for a global industrial company you've definitely heard of. With over 800 developers across multiple continents, we were the ones making sure nobody read from Glacier multiple times per second (it happened) or spun up a Postgres server per branch and never destroyed any of them (also happened).

Right after COVID hit, a major online bank was buckling under the load of government stimulus payments. Their site was going down daily for weeks. They called us in a panic to stabilize their OpenShift clusters. The kicker? They'd shipped every laptop in their inventory to employees for remote work and didn't have any left for contractors. So we got to fix their production crisis from flaky remote VMs. We got them stabilized though.

I helped one of the world's largest cold storage logistics companies automate their warehouse operations. I've toured the inside of negative 24°F freezers and watched a fully automated facility get built from the ground up. Not every DevOps job keeps you at a desk.

I've consulted for several of the largest banks and insurance companies in the U.S., worked on a digital payments product, and briefly worked with a major global foundation and a well-known open source foundation on a payment fraud detection engine for developing countries in Africa. In some of these regions, fraud costs over half their GDP annually. That one didn't make it past the engagement planning phase for reasons outside my control, and it's still the project I wish had happened.

One of my favorites: a major cloud provider was having trouble deploying their own DevOps product efficiently. They brought us in for a discovery, and the problem was immediately obvious. They were handcuffed by internal policy. They'd deploy to one region and then wait three days before deploying to the next. I asked how often they caught issues during that waiting period versus in the first few hours. The answer was always the first few hours. So why wait three days? "Corporate policy." If you're being told to move faster but you're wearing handcuffs, at some point you've got to fight to get them taken off.

More recently, I worked with a premium sports equipment brand on an app that I can't discuss publicly, but it involved custom-trained computer vision models on Azure and a retrieval-augmented generation (RAG) solution combining vector search with a graph database. That project is where a lot of my current AI interest took root.

Beyond Client Work

My employer runs an annual internal innovation competition across the company worldwide. Two years running, I've assembled and led a team that won the North America and Americas rounds and went on to compete in the world finals. Both years they flew us to Europe to compete. My first project was a digital twin <-> simulator that you could plug your own AI agent into. The second was a platform that combined genetic algorithms with reinforcement learning (RL) to create AI agents I could hook into that simulator, using it as their training environment.

I was sent to Germany last year for a quantum computing hackathon. I'm also part of an early partner program with one of the major AI labs and attended a bootcamp at their headquarters in January. I'm currently developing AI training material and hands-on labs for the 50+ architects on my team in North America.

What This Blog Is About

Breaking Prod is where I write about the things I actually deal with. The things I'm passionate about, the things I find interesting, the hot takes I may have. It's a place for me to share my experience.

Topics I cover regularly:

Kubernetes: From homelab clusters to production workloads, with an emphasis on what happens when things go sideways
CI/CD & GitLab: Pipeline design, container builds, and automating everything that can be automated
Infrastructure as Code
AI-Assisted DevOps: How I'm integrating AI tooling into infrastructure workflows, from coding assistants to automated content pipelines
Homelab Projects: Because sometimes the best way to learn a technology is to deploy it in your basement and see what breaks
Observability: How is this not ubiquitous yet? Why do I have to keep explaining the value?

Why "Breaking Prod"?

Because everyone has. If you've shipped code to production, you've broken production. The name is a nod to that shared experience. The cold sweat of a failed deploy, the 3 AM PagerDuty alert, the quiet relief when the rollback actually works. The best engineers aren't the ones who never break things. They're the ones who break things, learn from it, and build systems that recover gracefully.

Get In Touch

The best way to reach me is through the newsletter. Subscribe (it's free) and reply to any email. I read everything.

If you find something useful here, share it with someone who'd benefit. If you find something wrong, let me know. I'd rather fix it than let bad information sit.