Implementing Parallel and Distributed Systems


Parallel and distributed systems (PADS) have evolved from the early days of computational science and supercomputers to a wide range of novel computing paradigms, each exploited to tackle specific problems or application needs, including distributed systems, parallel computing, and cluster computing, generally called high-performance computing (HPC). Grid, Cloud, and Fog computing patterns are the most important of these PADS paradigms, and they share common concepts in practice. Many-core architectures, multi-core cluster-based supercomputers, and Cloud computing paradigms in this era of exascale computers have tremendously influenced the way computing is applied in science and academia (e.g., scientific computing and large-scale simulations). Implementing Parallel and Distributed Systems presents a PADS infrastructure known as Parvicursor that can facilitate the construction of scalable, high-performance parallel and distributed systems such as HPC, Grid, and Cloud computing platforms. This book covers parallel programming models, techniques, tools, development frameworks, and advanced concepts of parallel computer systems used in the construction of distributed and HPC systems. It specifies a roadmap for developing high-performance client-server applications for distributed environments and supplies step-by-step procedures for constructing a native and object-oriented C++ platform. The Parvicursor project is an effort to establish an OS-based framework for rapidly developing new high-performance distributed paradigms. The basic infrastructure is inspired by concepts mostly found in Cluster, Grid, and Cloud environments. From the developer's viewpoint, the Parvicursor.NET Framework provides a set of rich, cross-platform, object-oriented, high-performance, and low-level C++ class libraries that enable quick and solid development of next-generation networked/distributed applications and paradigms.
From the viewpoint of the Parvicursor core, it supplies critical services implemented partly in kernel mode and partly in userspace. Chapter 8 briefly introduces the .NET Framework and the existing technologies within that platform; this material makes the Parvicursor.NET Framework more tangible to the reader. A practical example then describes how Microsoft .NET Framework classes written in C# are ported into native code through standard C++ for the Parvicursor.NET Framework, and the chapter closes with many technical points for developers targeting the Parvicursor platform when porting their applications from C# to native C++.

Features:
- Hardware and software perspectives on parallelism
- Parallel programming of many-core processors, computer networks, and storage systems
- Parvicursor.NET Framework: a partial, native, and cross-platform C++ implementation of the .NET Framework
- xThread: a distributed thread programming model combining thread-level parallelism and distributed memory programming models
- xDFS: a native cross-platform framework for efficient file transfers
- Parallel programming for HPC systems and supercomputers using the Message Passing Interface (MPI)

Focusing on data transmission speeds that exploit the computing power of multi-core processors and cutting-edge system-on-chip (SoC) architectures, the book explains how to implement an energy-efficient infrastructure and examines distributing threads amongst Cloud nodes. Taking a solid approach to design and implementation, this book is a complete reference for designing, implementing, and deploying these very complicated systems.

Author(s): Alireza Poshtkohi, M.B. Ghaznavi-Ghoushchi
Publisher: CRC Press
Year: 2023

Language: English
Pages: 426

Cover
Half Title
Title Page
Copyright Page
Dedication
Table of Contents
Preface
Acknowledgement
Authors
Chapter 1: Introduction
1.1 Introduction
1.2 History of Computing
1.2.1 Analogue Computers
1.2.2 Digital Computers: Modern Hardware Advances in Computer Architectures
1.3 A Brief Introduction to Parallel and Distributed Systems
1.4 Conclusion
Notes
Reference
Chapter 2: IoT and Distributed Systems
2.1 Introduction
2.2 CPS and IoT
2.3 Internet of Things (IoT)
2.4 Distributed Systems and Distributed Computing via IoT
Reference
Chapter 3: Advanced Operating System Concepts in Distributed Systems Design
3.1 Introduction
3.2 An Introduction to Modern Operating Systems
3.2.1 Process Management
3.2.2 Memory Management
3.2.3 Storage Management (SM)
3.2.4 Userspace and Kernel Space
3.3 Memory Hierarchy Models
3.3.1 Main Memory
3.4 A Brief Review on Modern OS Kernels
3.4.1 Microkernel Operating System
3.4.2 Monolithic Operating System
3.4.3 Hybrid Operating System
3.4.4 Exokernel Operating System
3.4.5 Object-Oriented Operating System (O3S)
3.4.6 Language-Based Operating System (LOS)
3.4.7 System Calls to Request Linux and Windows OS Services
3.4.8 System Calls in the Linux Operating System
3.4.9 Costs Due to the Mode Switch of System Calls
3.4.10 Costs Due to the Footprints of System Calls
3.4.11 Effect of System Calls on the Userspace IPC
3.4.12 Critical Overheads due to Frequent Copies
3.4.13 System Calls in the Windows Operating System
3.4.14 Timeline of Operating System Evolution
Chapter 4: Parallelism for the Many-Core Era: Hardware and Software Perspectives
4.1 Introduction
4.2 Exploiting Instruction-Level Parallelism (ILP) by Hardware and Software Approaches
4.2.1 Superscalar Processors
4.2.2 The Downside of Instruction-Level Parallelism and Power Consumption Problem
4.3 Thread-Level Parallelism (TLP) and Multi-Processor and Multi-Core Parallelism
4.3.1 Introduction
4.3.2 Thread-Level Parallelism
4.3.3 Multi-Processor Parallelism
4.3.4 Multi-Core Parallelism
4.4 Heterogeneous Computing on Many Cores
4.5 Latest Optimal Approaches in Synchronisation
4.5.1 Deadlock
4.5.2 Race Condition
4.5.3 Priority Inversion
4.5.4 Starvation
4.5.5 Livelock
4.5.6 Convoying
4.6 Installation Steps of the Integrated Development Environment (IDE) Code::Blocks on Unix-Like Operating Systems Such as Linux
References
Chapter 5: Parallelisation for the Many-Core Era: A Programming Perspective
5.1 Introduction
5.2 Building Cross-Platform Concurrent Systems Utilising Multi-Threaded Programming on Top of the Parvicursor.NET Framework for Distributed Systems
5.2.1 Introduction
5.2.2 Thread Creation and Management in the Parvicursor.NET Framework
5.2.3 Implementing the System::Threading::Timer Class of the ECMA Standard Based on the Thread Class in the Parvicursor.NET Framework
5.2.4 Synchronisation in the Parvicursor.NET Framework
5.2.5 Two Concurrency Examples Relied on Synchronisation Classes in the Parvicursor.NET Framework
5.2.6 Thread Pools: Design and Implementation of the System::Threading::ThreadPool Class of the ECMA .NET Standard Based on the Parvicursor.NET Framework
5.2.7 Four Examples of Concurrency and Parallel Processing Based on the ThreadPool Class
5.2.8 Low-Level Implementation of Threads in the Linux Operating System: Userspace Fibres
5.2.9 A Practical Implementation of Synchronisation: Linux Futexes
5.3 Non-Blocking Synchronisation and Transactional Memory
5.3.1 Introduction
5.3.2 Non-Blocking Synchronisation Algorithms
5.3.3 Transactional Memory
References
Chapter 6: Storage Systems: A Parallel Programming Perspective
6.1 Introduction
6.2 Storage Systems and Disc Input/Output Mechanisms for Use in Distributed Systems
6.2.1 Introduction
6.2.2 Disc Drives from a Hardware Perspective
6.2.3 Disc Input/Output Scheduler in Operating Systems
6.2.4 Benchmarking the Performance and Throughput of Disc I/O Based on the IOzone Tool
6.3 Cross-Platform Disc I/O Programming and Manipulation of Folders Based on the Parvicursor.NET Framework for Distributed Systems
6.3.1 Storage and Retrieval of Data Files Based on the FileStream Class
6.3.2 Two Non-Concurrent and Concurrent Examples for Using the FileStream Class
6.3.3 Management of Files and Folders Based on the Two Classes Directory and File
6.3.4 Two Examples of Non-Concurrent and Concurrent Use of the Directory Class
Reference
Chapter 7: Computer Networks: A Parallel Programming Approach
7.1 Substantial Concepts of Computer Networks for Distributed Systems Design
7.1.1 Introduction
7.2 An Introduction to Modern Computer Networks
7.3 OSI Model and TCP/IP and UDP Protocol Suite to Structure Communications in Computer Networks
7.3.1 The OSI Reference Model
7.3.2 The TCP/IP Protocol Suite
7.4 Network Programming Based on TCP Sockets and Thread-Level Parallelism to Develop Distributed Client-Server Programs Atop the Parvicursor.NET Framework
7.4.1 An Introduction to the Socket Programming Model
7.4.2 A General Description of Network Programming Classes in the Parvicursor.NET Framework
7.4.3 A Short Overview of the HTTP 2.0 Protocol
7.4.4 Example 1: A Simple Client Program of the HTTP Protocol to Retrieve a Web Page
7.4.5 Example 2: A Concurrent Client/Server Program Based on Threads to Upload a File from a Client to a Server
7.5 Asynchronous Methods in Parvicursor Platform: An Optimum Computing Paradigm to Exploit the Processing Power of Multi/Many-Core Systems for Increasing the Performance of Distributed Systems
7.5.1 Introduction
7.5.2 Example: Asynchronous Translation of Domain Names to IP Addresses Based on an Asynchronous DNS Resolver
7.5.3 Implementation of an Asynchronous DNS Resolver Based on the Parvicursor.NET Framework
7.6 Addressing the Scalability Issue of Communication Systems in the Parvicursor Platform
7.6.1 Introduction
7.6.2 Design Strategies of Client-Server Applications
7.6.3 Asynchronous Sockets in Parvicursor Platform as a Proposed Standard to Develop Highly Scalable Optimum Communication Systems for Distributed Systems
7.6.4 Example 1: An Asynchronous Echo Client-Server
7.6.5 Example 2: The Design and Implementation of a Highly Concurrent and Scalable HTTP Proxy Server Supporting Tens of Thousands of Client Connections
Notes
References
Chapter 8: Parvicursor.NET Framework: A Partial, Native, and Cross-Platform C++ Implementation of the .NET Framework
8.1 Introduction
8.2 Common Language Infrastructure (CLI)
8.3 Parvicursor.NET Framework
8.4 The Compilation and Loading Process of .NET-CLI-Based Application Programs
8.4.1 AOT and JIT Compilations
8.4.2 Cross-Mode Execution Switches (C++/CLI Managed/Unmanaged Interop Transitions)
8.4.3 Platform Invocation Services (P/Invoke)
8.4.4 .NET Memory Footprint
8.5 The Compilation and Loading Process of Native Parvicursor.NET-Based Application Programs
8.6 Parvicursor.NET Socket Interface (PSI)
8.7 Parvicursor Object Passing Interface (POPI) over PSI
8.8 Cross-Process, Cross-Language and Cross-Platform Parvicursor.NET Remoting Architecture (PR)
8.9 Parvicursor.NET Framework Programming Reference Guide
8.9.1 Using Namespace System
8.9.2 Using Namespace System::IO
8.9.3 Using Namespace System::Threading
8.9.4 Using Namespace System::Collections
8.9.5 Using Namespace System::Net
8.9.6 Using Namespace System::Net::Sockets
8.9.7 Using Namespace Parvicursor::Net
8.9.8 Using Namespace Parvicursor::Serialisation
8.10 Presented Parvicursor.NET Sample Usages
References
Chapter 9: Parvicursor Infrastructure to Facilitate the Design of Grid/Cloud Computing and HPC Systems
9.1 Parvicursor: A Native and Cross-Platform Peer-to-Peer Framework to Design the Next-Generation Distributed System Paradigms
9.1.1 Introduction
9.2 Cross-Platform and High-Performance Parvicursor Platform to Develop the Next-Generation Distributed Middleware Systems
9.2.1 Network Communication
9.2.2 Heterogeneity
9.2.3 Scalability
9.2.4 Standardisation
9.2.5 Performance
9.2.6 Resource Sharing
9.2.7 Concurrency Support of Multicore Processors and Distributed Threading
9.3 Peer-to-Peer Paradigms and the Use of the Parvicursor Platform to Construct Large-Scale P2P Distributed Middleware Platforms such as Supercomputers and Traditional Distributed Systems
9.4 xThread Abstraction: The Distributed Multi-threaded Programming Model Proposed by Parvicursor Platform for Distributed Systems
9.5 Practical Examples Using the xThread Abstraction
9.5.1 Example 1: A Simple Sum of Two Numbers Based on a Distributed P2P Architecture with Two Nodes
9.5.2 Example 2: Calculating the Value of the Number π to n Decimal Places Grounded on a Distributed P2P Master/Slave Architecture with m+1 Nodes
9.6 The Proof of Concept of the Philosophy behind the Parvicursor Project as a New Standard to Build the Next-Generation Distributed P2P Middleware Systems: The Design and Implementation of a Middleware Supporting Third-Party Data Transfers in xDFS Framework
9.7 Our Future Works to Extend the Parvicursor Platform
Notes
References
Chapter 10: xDFS: A Native Cross-Platform Framework for Efficient File Transfers in Dynamic Cloud/Internet Environments
10.1 Introduction
10.2 The Next-Generation Requirements of Grid-Based File Transport Protocols
10.2.1 Towards a Low-Cost, Low-Power and Low-Overhead Data Transfer Protocol for Sensor and Ad Hoc Networks
10.2.2 Universality and Interoperability Issues and Scenario-Based Complexity Reduction
10.2.3 Towards a Service-Oriented Approach (SOA) for Secure File Transfers
10.3 High-Performance Server Design Architectures for Grid-Based File Transfer Protocols
10.3.1 Data Copies
10.3.2 Memory Allocation
10.3.3 Context Switching
10.3.4 Synchronisation Issues
10.4 Some Proposed xDFS Server Architectures in FTSM Upload Mode
10.4.1 Multi-Processed xDFS Server Architecture
10.4.2 Multi-Threaded xDFS Server Architecture
10.4.3 Multi-Threaded Event-Driven Pipelined xDFS Server Architecture
10.5 DotDFS and xDFS File Transport Protocols
10.5.1 Overall xDFS Features
10.5.1.1 Transport Independence
10.5.1.2 Flexible Connectivity
10.5.1.3 Feature Negotiation and Prerequisites
10.5.1.4 Resource Access
10.5.1.5 Unicode File Name Support
10.5.1.6 Distributed File System Mode (DFSM)
10.5.1.7 Path Mode (PathM)
10.5.1.8 Authentication, Data Integrity, and Data Confidentiality
10.5.2 xDFS xFTSM Protocol
10.6 The Native, Cross-Platform, and Cross-Language Implementation of xDFS Protocol
10.6.1 The Architecture of xDFS Implementation in Download and Upload Modes
10.6.2 A Novel Hybrid Concurrency Pattern for xDFS POSIX-AIO-Enabled Implementation (PHCP)
10.6.3 Some Important Points Regarding the Implementation of the xDFS Protocol
10.6.3.1 The Overheads of Exception Handling
10.6.3.2 Vectored I/O
10.6.3.3 Cross-Language, Cross-Runtime and Cross-Platform Parvicursor.NET Wrappers
10.6.3.4 Parvicursor.NET Inline Expansion
10.6.3.5 Parvicursor.NET Runtime Profiler
10.7 Comparison of xDFS Protocol with DotDFS, FTP, GridFTP and HTTP Protocols
10.7.1 Some Major Criticisms on FTP and GridFTP Protocols and xDFS/DotDFS Protocol Alternatives over Them
10.8 Experimental Studies
10.8.1 Single Stream Performance in Download Mode
10.8.2 Single Stream Performance in Upload Mode
10.8.3 Harnessing Parallelism in Download Mode
10.8.4 Harnessing Parallelism in Upload Mode
10.8.5 Full xDFS/DotDFS Runtime Characterisation
10.9 Conclusion and Future Works
References
Chapter 11: Parallel Programming Languages for High-Performance Computing
11.1 Introduction
11.2 A Brief History of Supercomputing
11.3 Parallel Programming Models and Languages for HPC
11.3.1 MPI
11.3.2 Charm++
11.3.3 Partitioned Global Address Space (PGAS)
11.4 A Concise Introduction to the MPI Standard in C Language
11.4.1 MPI Setup Routines
11.4.2 MPI Blocking Point-to-Point Communication Routines
11.4.3 MPI Non-Blocking Point-to-Point Communication Routines
11.4.4 MPI Collective Operation Routines
11.5 Case Studies
11.5.1 A Warm-Up MPI Example
11.5.2 Scalability of MPI Programs
11.5.3 Parallel Sparse Matrix-Vector Multiplication
11.5.4 Parallel Sparse Matrix-Matrix Multiplication
Index