Byzantine Fault Tolerance
Found 2 free book(s)
Distributed Systems
www.cl.cam.ac.uk
If one component of a system stops working, we call that a fault, and many distributed systems strive to provide fault tolerance: that is, the system as a whole continues functioning despite the fault. Dealing with faults is what makes distributed computing fundamentally different, and often harder, compared to programming a single computer.
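The idea in the snippet above, that the system as a whole keeps working when one component stops, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the linked book: the names `ReplicaDown`, `read_with_failover`, `crashed_replica`, and `healthy_replica` are hypothetical, and the replicas are plain Python functions rather than networked servers. The sketch also covers only crash faults (a component that stops responding), not Byzantine faults, where a component may return wrong answers.

```python
class ReplicaDown(Exception):
    """Raised by a (hypothetical) replica that has stopped working."""


def read_with_failover(replicas, key):
    """Ask each replica in turn and return the first successful answer.

    A single faulty replica is tolerated as long as at least one
    other replica can still serve the request.
    """
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaDown:
            continue  # this component is faulty; try the next one
    raise RuntimeError(f"all {len(replicas)} replicas failed")


def crashed_replica(key):
    # Illustrative stand-in for a component that has stopped working.
    raise ReplicaDown("replica is not responding")


def healthy_replica(key):
    # Illustrative stand-in for a working component with some stored data.
    return {"x": 1}[key]


# The first replica has crashed, but the caller still gets an answer.
print(read_with_failover([crashed_replica, healthy_replica], "x"))
```

If every replica raises `ReplicaDown`, the function gives up with a `RuntimeError`, which reflects the limit of this simple approach: it tolerates faults only while at least one replica remains healthy.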
In Search of an Understandable Consensus Algorithm ... - Raft
raft.github.io
used to solve a variety of fault tolerance problems in distributed systems. For example, large-scale systems that have a single cluster leader, such as GFS [8], HDFS [38], and RAMCloud [33], typically use a separate replicated state machine to manage leader election and store configuration information that must survive leader crashes. Ex-