Tuesday, April 23, 2024

Facebook’s Failure Shows Why We Shouldn’t Rely on It for Everything


Key Takeaways

  • Facebook’s technical troubles were unlucky, but the issue would likely have been resolved much quicker if the company didn’t depend on so many interconnected systems.
  • There’s no way to prevent system failures completely, but there are ways to make them less likely.
  • Having backup plans for when (not if, when) a system fails can make the difference between ‘annoying’ and ‘catastrophic.’

A white thumbs down icon on a black keyboard key.

fongfong2 / Getty Images




The recent Facebook debacle demonstrates how interconnected systems are bound to fail and why we shouldn’t use them for everything.


Losing Facebook, WhatsApp, and Instagram for several hours on Monday was inconvenient, damaging to businesses, and in some cases, almost catastrophic. According to Facebook, it was all due to configuration changes on the routers that coordinate its network traffic.

It’s a reasonable explanation, but the fact that a single error like that could bring not just Facebook but other Facebook-owned systems grinding to a halt is a bit alarming.

One wrong router configuration change caused multiple services, and even VR headsets, to stop working entirely. On top of that, by Facebook’s own admission, it also had a cascading effect on how the company’s data centers communicate, bringing all their services to a halt.

“The reliance on interconnected systems does carry with it an inherent risk of system or even service failure,” said Francesco Altomare, senior technical sales engineer at GlobalDots, in an email interview with Lifewire.

“To counter this daunting risk, companies utilize the principle of SRE (System Reliability Engineering), as well as other tools, which all deal with varying levels of redundancy built into every layer of a system’s infrastructure.”

Facebook displayed on a smartphone, sitting next to a laptop computer on a glass top table.

Timothy Hales Bennett / Unsplash




What Can Go Wrong

It’s worth noting that when a system like that fails, it usually requires a perfect storm of things going wrong. It’s less like a house of cards waiting to fall and more like an exposed thermal exhaust port on a space station the size of a small moon.

Most companies take steps to try to ensure that the one thing that could throw everything into chaos never happens, but it can happen regardless.

“Unexpected failures are a part of business and could arise as a result of worker negligence, faults in internet service provider’s network, or even cloud storage services undergoing issues,” said Sally Stevens, co-founder of QuickPeopleSearch, in an email interview.

“…As long as the necessary steps to protect the system—such as backups, on-site router, and tiered access—are put in place, these failures are quite unlikely.” Though even with an army of fail-safes, it’s still possible for the linchpin to fail.

If the system that controls things like primary forms of contact, appliances, doors, and so on fails, the results can be significant: anything from mild inconvenience to full-on catastrophe, depending on how much individuals and companies rely on it all.

A group of engineers meeting around a table in an office.

Hinterhaus Productions / Getty Images



“There is also the risk of hackers getting into the system from any of the least protected devices, such as refrigerators and oven toasters,” added Stevens, “which could lead to data theft and ransomware.”


How We Can Prepare

There’s no way to guarantee that a system will never fail, but there are steps that can be taken to either make failure less likely or to handle failure more smoothly. A combination of the two approaches that marries fail-safes and countermeasures with contingency plans and backup systems would be ideal.

“For eliminating these hazards created by third-party products and services that are effectively handled, roles and duties regarding Third-Party Risk Management must be strictly outlined,” said Daniela Sawyer, founder and chief technology officer of DiscoverPeopleQuick, in an email interview. “To flourish in these new surroundings, risk managers must grasp the essential parts of such a sophisticated ecosystem.”

What happened with Facebook, WhatsApp, and Instagram was unfortunate, but also hopefully eye-opening. People who rely on interconnected systems must understand that the right thing going wrong can disrupt everything. And measures need to be put in place (or scrutinized and refined) to make such disruptions less likely and less impactful.

In Facebook’s case, the problem wasn’t the router troubles, but rather having almost its entire ecosystem connected to everything else. Thus, with Facebook (the service) down, Facebook (the company) had to spend much more time and energy simply organizing and addressing the issue. If it either didn’t use such a deep-rooted, interconnected system or had backup plans in place to deal with an outage like that, it likely would have taken far less time to fix.
