Spike Avoidance

Tom McLaughlin

A spike is a sudden, dramatic increase in demand, effort, or activity that far exceeds normal operating levels. I consider spikes a threat to sustainability.

Startup culture relies on a veneer of coolness and the promise of a future payoff to fortify employees against these spikes. That reliance shapes the workforce, selecting for younger people who are more naive and risk-tolerant. The effects compound over time.

When we allow spikes to occur regularly, several predictable problems emerge: resource exhaustion, quality degradation, system instability, and the inevitable crash that follows. The dramatic valleys after spikes leave systems and people less capable than before the spike began.

Crunch time is not inevitable, but avoiding it comes at a cost. Teams must be able to say no to good ideas. New features may have to be put on hold while the team focuses on operational readiness.

I’m not claiming there won’t be incidents: DNS misconfigurations, us-east-1 outages, DDoS attacks, the Slashdot effect, tornadoes, illness, politics, and war. Any of these can affect our ability to stay solvent, so we either prepare for them or we incur the costs.

Spike avoidance requires acknowledging natural work rhythms. Tax time. Holidays. Kids going back to school. Winter. Night and day. Entitlement checks. We have two choices: we can scale the systems to smooth out the load on our individual resources, or we can do nothing and watch our employees and systems strain under the load.

Near as I can tell, the options are to either fortify the system’s capacity to handle the load, or squeeze your existing resources and wait for the crash.