Asynchronous On-Chip Networks and Fault-Tolerant Techniques is the first comprehensive study of fault tolerance and fault-caused deadlocks in asynchronous on-chip networks. It aims to overcome these problems and ensure greater reliability of applications.
Asynchronous on-chip networks are a promising alternative to the widely used synchronous on-chip networks for multicore processors: they can deliver the same performance with much lower energy and area than their synchronous counterparts. They can nevertheless be vulnerable to faults, which can not only corrupt data transmission but also cause a unique type of deadlock. By adopting a new redundant code along with a dynamic fault detection and recovery scheme, the authors demonstrate that asynchronous on-chip networks can be efficiently hardened to tolerate both transient and permanent faults and overcome fault-caused deadlocks.
This book will serve as an essential guide for researchers and students studying interconnection networks, fault-tolerant computing, asynchronous system design, circuit design and on-chip networking, as well as for professionals interested in designing fault-tolerant and high-throughput asynchronous circuits.
Author(s): Wei Song, Guangda Zhang
Publisher: CRC Press
Year: 2022
Language: English
Pages: 380
City: Boca Raton
Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
Preface
CHAPTER 1: Introduction
1.1. ASYNCHRONOUS CIRCUITS
1.2. ASYNCHRONOUS ON-CHIP NETWORKS
1.3. FAULT-TOLERANT ASYNCHRONOUS ON-CHIP NETWORKS
1.3.1. Protection for QDI Links
1.3.2. Deadlock Detection
1.3.3. Network Recovery
CHAPTER 2: Asynchronous Circuits
2.1. CIRCUIT CLASSIFICATION
2.1.1. Delay-Insensitive
2.1.2. Quasi-Delay-Insensitive
2.1.3. Speed-Independent
2.1.4. Relaxed QDI
2.1.5. Self-Timed
2.2. HANDSHAKE PROTOCOLS
2.2.1. Return-to-Zero
2.2.2. Non-Return-to-Zero
2.3. DATA ENCODING
2.3.1. Non-Delay-Insensitive Codes
2.3.2. Delay-Insensitive Codes
2.3.2.1. 1-of-n Encoding
2.3.2.2. m-of-n Encoding
2.3.2.3. Other DI Encoding
2.3.3. Code Evaluation
2.4. ASYNCHRONOUS PIPELINES
2.4.1. Bundled-Data Pipeline
2.4.2. Multi-Rail Pipeline
2.4.3. Performance Comparison
2.4.3.1. Pipeline Delay
2.4.3.2. Pipeline Throughput
2.4.3.3. Area and Power Consumption
2.5. IMPLEMENTATION OF ASYNCHRONOUS CIRCUITS
2.5.1. Functional Analysis
2.5.2. Common Circuit Components
2.5.2.1. Basic Components
2.5.2.2. Arbiters
2.5.2.3. Allocators
2.5.3. Metastability and Synchronization
2.5.4. Optimization with Traditional EDA Tools
2.5.4.1. Loop Elimination
2.5.4.2. Speed Optimization
CHAPTER 3: Asynchronous Networks-on-Chip
3.1. CONCEPTS OF NETWORKS-ON-CHIP
3.1.1. Network Layer Model
3.1.2. Network Topology
3.1.3. Switching Techniques
3.1.3.1. Circuit Switching and Packet Switching
3.1.3.2. Virtual Channel
3.1.3.3. Other Flow Control Methods
3.1.3.4. Quality of Service
3.1.4. Routing Algorithms
3.1.4.1. Deterministic and Non-Deterministic
3.1.4.2. Deadlock and Livelock
3.2. ASYNCHRONOUS NETWORKS-ON-CHIP
3.2.1. Taxonomy of Asynchronous On-Chip Networks
3.2.2. Previous Asynchronous NoCs
3.2.2.1. SpiNNaker
3.2.2.2. ASPIN
3.2.2.3. QoS NoC
3.2.2.4. ANOC
3.2.2.5. MANGO
3.2.2.6. QNoC
CHAPTER 4: Optimizing Asynchronous On-Chip Networks
4.1. CHANNEL SLICING
4.1.1. Synchronization Overhead
4.1.2. Channel Slicing
4.1.3. Lookahead Pipeline
4.1.4. Channel Sliced Wormhole Router
4.1.4.1. Router Structure
4.1.4.2. Performance Evaluation
4.2. SPATIAL DIVISION MULTIPLEXING
4.2.1. Problems of the Virtual Channel Flow Control
4.2.1.1. Slow Switch Allocation
4.2.1.2. Large Area Overhead
4.2.1.3. Long Pipeline Synchronization Latency
4.2.2. Spatial Division Multiplexing
4.2.3. SDM Router
4.2.3.1. Router Structure
4.2.3.2. Performance Evaluation
4.2.4. Comparison between SDM and VC
4.2.4.1. Area Model
4.2.4.2. Latency Model
4.2.4.3. Model for VC Routers
4.2.4.4. Performance Analysis
4.3. AREA REDUCTION USING CLOS NETWORKS
4.3.1. Clos Switching Networks
4.3.2. Dispatching Algorithm
4.3.2.1. Concurrent Round-Robin Dispatching
4.3.2.2. Asynchronous Dispatching
4.3.2.3. Performance of CRRD and AD
4.3.3. Asynchronous Clos Scheduler
4.3.3.1. Implementation
4.3.3.2. Performance
4.3.4. SDM Router Using 2-Stage Clos Switch
4.3.4.1. Asynchronous 2-Stage Clos Switch
4.3.4.2. Router Implementation
4.3.4.3. Performance Evaluation
CHAPTER 5: Fault-Tolerant Asynchronous Circuits
5.1. FAULT CLASSIFICATION
5.1.1. Transient Faults
5.1.2. Permanent Faults
5.1.3. Intermittent Faults
5.2. FAULT-TOLERANT TECHNIQUES
5.2.1. Masking Factors
5.2.2. Redundancy Techniques
5.3. IMPACT OF TRANSIENT FAULTS ON QDI PIPELINES
5.3.1. Faults on Synchronous and QDI Pipelines
5.3.2. Impact Modeling of Transient Faults
5.3.2.1. Faults on Data with Positive Ack
5.3.2.2. Faults on Data with Negative Ack
5.3.2.3. Faults on Completion Detector and Ack
5.3.2.4. Physical-Layer Deadlock
5.4. DEADLOCK MODELING
5.4.1. Deadlock Caused by Permanent Faults
5.4.2. Deadlock Caused by Transient Faults
5.4.3. Deadlock Analysis
5.5. RELATED WORK
5.5.1. Tolerating Transient Faults
5.5.1.1. Information Redundancy
5.5.1.2. Physical and Other Redundancy
5.5.2. Management for Permanent Faults and Deadlocks
5.5.2.1. Conventional Techniques
5.5.2.2. Fault-Caused Physical-Layer Deadlocks
5.6. GENERAL DEADLOCK MANAGEMENT STRATEGY
CHAPTER 6: Fault-Tolerant Coding
6.1. COMPARISON WITH RELATED WORK
6.1.1. Non-QDI Designs
6.1.2. QDI Designs
6.1.3. Unordered and Systematic Codes
6.2. DIRC CODING SCHEME
6.2.1. Arithmetic Rules
6.2.1.1. Rules for 1-of-n Codes
6.2.1.2. Rules for m-of-n Codes
6.2.2. Delay-Insensitive Redundant Check Codes
6.2.3. Check Generation and Error Correction
6.2.4. Error Filtering
6.2.5. Code Evaluation
6.3. IMPLEMENTATION OF DIRC PIPELINES
6.3.1. 1-of-n Adders and Error Filters
6.3.2. Generation of Check Words
6.3.3. Redundant Protection of Acknowledge Wires
6.3.4. Variants of DIRC Pipelines
6.3.4.1. Latency and Area
6.3.4.2. Different Construction Patterns
6.3.4.3. DIRC in Asynchronous NoCs
6.4. LATENCY AND AREA MODELS
6.4.1. Latency Analysis
6.4.2. Area Model for One Stage
6.4.3. Models for Different Constructions
6.5. EXPERIMENTAL RESULTS
6.5.1. Performance Evaluation
6.5.2. Fault-Tolerance Evaluation
6.5.3. Comparison with Related Work
6.6. SUMMARY
CHAPTER 7: Deadlock Detection
7.1. BASELINE QDI NOC
7.1.1. Network Principles
7.1.2. Asynchronous Protocols
7.2. FAULT IMPACT ON DATA PATH
7.2.1. Fault Classifications
7.2.2. General Fault Impact
7.3. DETECTING PERMANENT FAULT ON DATA PATH
7.3.1. Data Path Partition
7.3.2. Deadlock Caused by Permanent Link Fault
7.3.3. Deadlock Patterns Due to Permanent Link Fault
7.3.4. Time-Out Detection Mechanism
7.3.5. Detection of Permanent Router Fault
7.4. HANDLING DEADLOCKS CAUSED BY DIFFERENT FAULTS
7.4.1. Fault Diagnosis
7.4.2. Modified Time-Out Mechanism
7.5. SUMMARY
CHAPTER 8: Deadlock Recovery
8.1. DEADLOCK REMOVAL BY DRAIN AND RELEASE
8.1.1. The Drain Operation
8.1.2. Buffer Controller at Router Input
8.1.3. The Release Operation
8.2. FAULTY LINK ISOLATION BY USING SDM
8.2.1. Spatial Division Multiplexing
8.2.2. Switch Allocator Reconfiguration
8.3. RECOVERY FROM INTERMITTENT AND TRANSIENT FAULTS
8.4. TECHNICAL ISSUES
8.5. SUMMARY
CHAPTER 9: Summary
9.1. OVERALL REMARKS
9.2. FUTURE WORK
Bibliography