This is one of the most important core concepts, and it acts as the base for other core concept trainings: Engineering Luck and Resource Allocation. This article breaks down what system reliability means, what the least reliable component typically is, and how to increase the overall reliability of a system.
Listen: Apple Podcasts | Spotify
System Reliability
The reliability of a system is a product of the reliability of the components that make up the system.
Garrett Hardin, Filters Against Folly.
Product = multiplication. Nothing is 100% perfect, so we are going to be multiplying fractions or percentages. What happens when you multiply numbers that are each less than 1? Every additional component drives the product down.
90% * 90% = 81%.
The product is less than the components.
As an example:
Say you pay the electric bill 90% of the time, and the light bulb works 90% of the time. The product is that 81% of the time you get the outcome you expect (you flip the light switch and the light comes on).
90% * 90% * 90% = 72.9%.
Let’s add another component to the system. Your boss wants the lights on in the building at 5AM every day, and 90% of the time you flip the light switch at 5AM. With the system reliability at 90% * 90% * 90%, he can expect the lights on at 5AM only 72.9% of the time.
Even though all of the components are equally reliable at 90%, you can see that as the system went from two 90% components to three, the overall system reliability decreased.
What if you end up finding a better technology? Surely something with 99% reliability will solve the system’s issues.
90% * 90% * 90% * 99% ≈ 72.2%.
Even though the new technology was more reliable than all the other components, it still drove the overall system reliability down (from 72.9% to 72.2%).
More components drive down reliability.
You can replace components and potentially improve reliability, but you can’t add more components to improve reliability.
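To make the multiplication concrete, here is a minimal Python sketch (the function name and the sample numbers are mine, used only for illustration):

```python
import math

def system_reliability(components):
    # Overall reliability is the product of the component
    # reliabilities, each expressed as a fraction between 0 and 1.
    return math.prod(components)

# Two 90% components: 0.9 * 0.9
print(f"{system_reliability([0.9, 0.9]):.3f}")             # 0.810

# A third 90% component drives the product down
print(f"{system_reliability([0.9, 0.9, 0.9]):.3f}")        # 0.729

# Even adding a MORE reliable 99% component lowers it further
print(f"{system_reliability([0.9, 0.9, 0.9, 0.99]):.3f}")  # 0.722
```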
Least Reliable Component
What do you think that would be?
In every single system the least reliable component is human behavior (if it is present).
90% * 60% = 54%.
The technology is 90% reliable and the human is 60% reliable. We do what we say we are going to do 60% of the time, and paired with a technology that does its job 90% of the time, we get the outcome we expect 54% of the time.
What if we improve the tech/plan/law (anything that isn’t the human behavior)?
99% * 60% = 59.4%.
Why is it that the technology got significantly better, but the overall outcome only marginally improved?
This is because of the least reliable component, the human.
What if we took that 9-point increase in tech and instead made the human component 9 points better?
90% * 69% = 62.1%
This improves the system reliability more: 62.1% > 59.4%.
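Here is the same comparison as a short Python sketch (the variable names are mine; the percentages are from the examples above):

```python
tech, human = 0.90, 0.60

# Option A: spend a 9-point improvement on the already-strong tech
improve_tech = (tech + 0.09) * human   # 0.99 * 0.60
# Option B: spend the same 9 points on the weak human component
improve_human = tech * (human + 0.09)  # 0.90 * 0.69

print(f"improve tech:  {improve_tech:.3f}")   # 0.594
print(f"improve human: {improve_human:.3f}")  # 0.621
```

The same 9-point budget buys more overall reliability when it is spent on the weakest component.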
The takeaway?
If you are able to make an improvement to the system, make it to the LEAST reliable component. If you can remove that component entirely, even better.
Improving components with lesser reliability > Improving components with greater reliability.
Fewer components = more reliable.
Increase System Reliability
- Remove as many components as possible.
- Focus on the human behavior (or the least reliable component, but this is generally going to be the human) until it’s as reliable as the other components.
Example:
“I think I’m going to eat this way. I’m committed to salads every day for lunch.”
You make it 5 days out of the week and fall off on the weekend.
“Yeah, that’s decent. That’s a great start: 5/7 days.”
That’s 5/7, or about 71%, accomplished.
If your car started 71% of the time, your car’s brakes worked 71% of the time, or your computer booted only 71% of the time, would that be a limit to what you are trying to accomplish?
Probably.
We have high expectations that technology will be reliable and low expectations of our own behaviors.
This is one of the reasons our behaviors are so inconsistent: they were never properly framed in terms of how they fit into a system.
When we want a more reliable outcome, we have to look at the system’s reliability as a whole. Until your behavior as a human is close to the reliability of the other components, the system won’t be as reliable as it could be.
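Running the salad example through the same math (the 99% figure for the non-human parts of the system is an assumption, just to show the ceiling):

```python
behavior = 5 / 7   # salads 5 days out of 7, roughly 0.714
plan = 0.99        # assumed reliability of everything that isn't the behavior

print(f"behavior alone: {behavior:.3f}")        # 0.714
print(f"whole system:   {plan * behavior:.3f}") # 0.707
```

No matter how good the rest of the system is, the outcome stays capped near 71% until the behavior itself becomes more reliable.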
Hopefully this helped shed some light on why simply adding more “stuff” doesn’t actually make the outcome any better if the lagging component dragging the system down isn’t addressed. System reliability is an underpinning for the next trainings: Engineering Luck and Resource Allocation. After you’ve opened the loop, be sure to do your 6WU found below and reflect, then sit with the open loop for a bit before coming back to the training to go over it again and see what else you can take away.
6WU: Wisdom Comes From Multiple Perspectives
Live to Learn. Give to Earn. Share your takeaways in six words in the tweet below and then read what others have said so that you can learn from their perspectives as well.
https://twitter.com/TheGuardianAcad/status/1610638932781465600?s=20&t=O3nKhjhJDYoetfhQmj19jw