/Lorin Hochstein

OOPS Writeups tl;dr: An Operational Surprise (OOPS) is when something unexpected happens in operations. It presents an opportunity to discover how the observed system behavior deviated from the mental model of how the system is supposed to behave. The template shared in this post is based on the template used at Netflix.

Root Cause Of Failure, Root Cause Of Success tl;dr: “Root cause of failure” doesn’t make sense in the context of complex systems failure, because a collection of control processes keeps the system up and running. A system failure is a failure of this overall set of processes. Lorin draws an analogy to illustrate this and asks: if there's no root cause of success, why should there be one for failure?

