This book covers the most essential techniques for designing and building dependable distributed systems, from traditional fault tolerance to the blockchain technology. Topics include checkpointing and logging, recovery-orientated computing, replication, distributed consensus, Byzantine fault tolerance, as well as blockchain.
This book intentionally includes traditional fault tolerance techniques so that readers can appreciate better the huge benefits brought by the blockchain technology and why it has been touted as a disruptive technology, some even regard it at the same level of the Internet. This book also expresses a grave concern on using traditional consensus algorithms in blockchain because with the limited scalability of such algorithms, the primary benefits of using blockchain in the first place, such as decentralization and immutability, could be easily lost under cyberattacks.
Author(s): Wenbing Zhao
Publisher: Wiley-Scrivener
Year: 2021
Language: English
Pages: 464
City: Beverly
Cover
Half-Title Page
Series Page
Title Page
Copyright Page
Dedication
Contents
List of Figures
List of Tables
Acknowledgments
Preface
References
1 Introduction
1.1 Basic Concepts and Terminologies for Dependable Computing
1.1.1 System Models
1.1.2 Threat Models
1.1.3 Dependability Attributes and Evaluation Metrics
1.2 Means to Achieve Dependability
1.2.1 Fault Avoidance
1.2.2 Fault Detection and Diagnosis
1.2.3 Fault Removal
1.2.4 Fault Tolerance
1.3 System Security
REFERENCES
2 Logging and Checkpointing
2.1 System Model
2.1.1 Fault Model
2.1.2 Process State and Global State
2.1.3 Piecewise Deterministic Assumption
2.1.4 Output Commit
2.1.5 Stable Storage
2.2 Checkpoint-Based Protocols
2.2.1 Uncoordinated Checkpointing
2.2.2 Tamir and Sequin Global Checkpointing Protocol
2.2.3 Chandy and Lamport Distributed Snapshot Protocol
2.2.4 Discussion
2.3 Log Based Protocols
2.3.1 Pessimistic Logging
2.3.2 Sender-Based Message Logging
REFERENCES
3 Recovery-Oriented Computing
3.1 System Model
3.2 Fault Detection and Localization
3.2.1 Component Interactions Modeling and Anomaly Detection
3.2.2 Path Shapes Modeling and Root Cause Analysis
3.2.3 Inference-Based Fault Diagnosis
3.3 Microreboot
3.3.1 Microrebootable System Design Guideline
3.3.2 Automatic Recovery with Microreboot
3.3.3 Implications of the Microrebooting Technique
3.4 Overcoming Operator Errors
3.4.1 The Operator Undo Model
3.4.2 The Operator Undo Framework
REFERENCES
4 Data and Service Replication
4.1 Service Replication
4.1.1 Replication Styles
4.1.2 Implementation of Service Replication
4.2 Data Replication
4.3 Optimistic Replication
4.3.1 System Models
4.3.2 Establish Ordering among Operations
4.3.3 State Transfer Systems
4.3.4 Operation Transfer System
4.3.5 Update Commitment
4.4 CAP Theorem
4.4.1 2 out 3
4.4.2 Implications of Enabling Partition Tolerance
REFERENCES
5 Group Communication Systems
5.1 System Model
5.2 Sequencer Based Group Communication System
5.2.1 Normal Operation
5.2.2 Membership Change
5.2.3 Proof of Correctness
5.3 Sender Based Group Communication System
5.3.1 Total Ordering Protocol
5.3.2 Membership Change Protocol
5.3.3 Recovery Protocol
5.3.4 The Flow Control Mechanism
5.4 Vector Clock Based Group Communication System
REFERENCES
6 Consensus and the Paxos Algorithms
6.1 The Consensus Problem
6.2 The Paxos Algorithm
6.2.1 Algorithm for Choosing a Value
6.2.2 Algorithm for Learning a Value
6.2.3 Proof of Correctness
6.2.4 Reasoning of the Paxos Algorithm
6.3 Multi-Paxos
6.3.1 Checkpointing and Garbage Collection
6.3.2 Leader Election and View Change
6.4 Dynamic Paxos
6.4.1 Dynamic Paxos
6.4.2 Cheap Paxos
6.5 Fast Paxos
6.5.1 The Basic Steps
6.5.2 Collision Recovery, Quorum Requirement, and Value Selection Rule
6.6 Implementations of the Paxos Family Algorithms
6.6.1 Hard Drive Failures
6.6.2 Multiple Coordinators
6.6.3 Membership Changes
6.6.4 Limited Disk Space for Logging
REFERENCES
7 Byzantine Fault Tolerance
7.1 The Byzantine Generals Problem
7.1.1 System Model
7.1.2 The Oral Message Algorithms
7.1.3 Proof of Correctness for the Oral Message Algorithms
7.2 Practical Byzantine Fault Tolerance
7.2.1 System Model
7.2.2 Overview of the PBFT Algorithm
7.2.3 Normal Operation of PBFT
7.2.4 Garbage Collection
7.2.5 View Change
7.2.6 Proof of Correctness
7.2.7 Optimizations
7.3 Fast Byzantine Agreement
7.4 Speculative Byzantine Fault Tolerance
7.4.1 The Agreement Protocol
7.4.2 The View Change Protocol
7.4.3 The Checkpointing Protocol
7.4.4 Proof of Correctness
REFERENCES
8 Cryptocurrency and Blockchain
8.1 History of Cryptocurrency
8.2 Bitcoin
8.2.1 Decentralized network and architecture
8.2.2 Self-contained cryptography
8.2.3 Decentralized data structure
8.2.4 Decentralized algorithms
8.3 Ethereum
8.3.1 Ethereum Computing Model
8.3.2 Block and Consensus
8.3.3 Tokenization
8.4 Attacks on Blockchain
References
9 Consensus Algorithms for Blockchain
9.1 Model on Blockchain Consensus
9.1.1 Requirements on Puzzle Design
9.1.2 Zero-Knowledge Proof
9.2 Proof of Work
9.3 Proof of Resources
9.3.1 Using Storage as Resource
9.3.2 Using Computing as Resource
9.4 Virtual Mining
9.4.1 PeerCoin PoS
9.4.2 Fixed-Epoch Time Based PoS Schemes
9.4.3 Proof of Elapsed Time
References
10 Blockchain Applications
10.1 The Value of Blockchain
10.1.1 Non-functional benefits
10.1.2 Functional Benefits
10.2 Blockchain-Enabled Cyber-Physical Systems
10.2.1 Cyber-Physical Systems
10.2.2 Application Categories
10.2.3 Blockchain-Enabled Operations in CPS
10.3 On Blockchain Throughput
10.3.1 On-Chain Approach
10.3.2 Off-Chain Approach
10.4 A Critical Look on Blockchain from Economy Perspective
10.4.1 Blockchain Technology from the Economic View
10.4.2 Economic Functions of Blockchain
10.4.3 Blockchain as a Financial Infrastructure
References
Index
EULA