Learn how to accelerate C++ programs using data parallelism and SYCL.
This book enables C++ programmers to be at the forefront of the shift toward accelerated, heterogeneous computing, a development that is pushing computing to new levels. This updated second edition is full of practical advice, detailed explanations, and code examples that illustrate key topics.
SYCL enables access to the parallel resources in modern accelerated, heterogeneous systems. A single C++ application can now use any combination of devices, including GPUs, CPUs, FPGAs, and ASICs, that suits the problem at hand.
This book teaches data-parallel programming using C++ with SYCL and walks through everything needed to program accelerated systems. The book begins by introducing data parallelism and foundational topics for effective use of SYCL. Later chapters cover advanced topics, including error handling, hardware-specific programming, communication and synchronization, and memory model considerations.
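As a taste of the style the book teaches, here is a minimal vector-addition sketch (not one of the book's own examples; it assumes a SYCL 2020 compiler such as DPC++): the runtime picks a device for the queue, and parallel_for expresses the data-parallel work.

#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  constexpr size_t N = 1024;
  std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

  sycl::queue q;  // default selection: any available device (GPU, CPU, ...)
  std::cout << "Running on "
            << q.get_device().get_info<sycl::info::device::name>() << "\n";

  {
    sycl::buffer bufA(a), bufB(b), bufC(c);
    q.submit([&](sycl::handler& h) {
      sycl::accessor A(bufA, h, sycl::read_only);
      sycl::accessor B(bufB, h, sycl::read_only);
      sycl::accessor C(bufC, h, sycl::write_only);
      h.parallel_for(sycl::range<1>(N),
                     [=](sycl::id<1> i) { C[i] = A[i] + B[i]; });
    });
  }  // buffers go out of scope here, copying results back to c

  std::cout << "c[0] = " << c[0] << "\n";  // expect 3
}

Buffers and accessors are one of the two data management styles the book covers in depth; Unified Shared Memory is the other.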
Computer hardware development is driven by our need to solve larger and more complex problems, but those hardware advances are largely useless unless programmers like you and me have languages that let us implement our ideas and exploit the available power with reasonable effort. There are numerous examples of amazing hardware, and the first solutions to use them have often been proprietary, since that saves the time of getting committees to agree on standards. Throughout the history of computing, however, such proprietary solutions have always ended up as vendor lock-in, unable to compete with open standards that let developers target any hardware and share code, because the resources of the worldwide community and ecosystem ultimately far exceed those of any individual vendor, not to mention that open software standards drive hardware competition.
If you are new to parallel programming, that is okay. If you have never heard of SYCL or the DPC++ compiler, that is also okay. Compared with programming in CUDA, C++ with SYCL offers portability beyond NVIDIA hardware and beyond GPUs, plus tight alignment with modern C++ as it evolves, and it offers these advantages without sacrificing performance. C++ with SYCL allows us to accelerate our applications by harnessing the combined capabilities of CPUs, GPUs, FPGAs, and the processing devices of the future without being tied to any one vendor.
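To illustrate that portability in code, the short sketch below (again not from the book, assuming any SYCL 2020 implementation) asks the runtime which platforms and devices are visible; the same source can then run on whichever of them is present.

#include <sycl/sycl.hpp>
#include <iostream>

int main() {
  // List every platform (vendor runtime) and every device it exposes.
  for (const auto& p : sycl::platform::get_platforms()) {
    std::cout << p.get_info<sycl::info::platform::name>() << "\n";
    for (const auto& d : p.get_devices()) {
      std::cout << "  " << d.get_info<sycl::info::device::name>() << "\n";
    }
  }
}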
SYCL is an industry-driven Khronos Group standard that adds advanced support for data parallelism to C++ in order to exploit accelerated (heterogeneous) systems. SYCL provides mechanisms for C++ compilers that fit naturally with C++ and C++ build systems. DPC++ is an open source compiler project, based on LLVM, that adds SYCL support. All examples in this book should work with any C++ compiler that supports SYCL 2020, including the DPC++ compiler. If you are a C programmer who is not well versed in C++, you are in good company: several of the authors of this book happily share that they picked up much of their C++ by reading books, like this one, that use C++. With a little patience, this book should also be approachable by C programmers with a desire to write modern C++ programs.
All source code for the examples used in this book is freely available on GitHub. The examples are written in modern SYCL and are regularly updated to ensure compatibility with multiple compilers.
What You Will Learn:
Accelerate C++ programs using data-parallel programming
Use SYCL and C++ compilers that support SYCL
Write portable code for accelerators that is vendor and device agnostic
Optimize code to improve performance for specific accelerators
Be poised to benefit as new accelerators appear from many vendors
Who This Book Is For:
Programmers new to data-parallel programming, and computer programmers interested in data-parallel programming using C++.
"This book, now in is second edition, is the premier resource to learn SYCL 2020 and is the ONLY book you need to become part of this community." Erik Lindahl, GROMACS and Stockholm University
Author(s): James Reinders; Ben Ashbaugh; James Brodman; Michael Kinsner; John Pennycook; Xinmin Tian
Publisher: Apress
Year: 2023
Language: English
Pages: 648
Cover
Front Matter
1. Introduction
2. Where Code Executes
3. Data Management
4. Expressing Parallelism
5. Error Handling
6. Unified Shared Memory
7. Buffers
8. Scheduling Kernels and Data Movement
9. Communication and Synchronization
10. Defining Kernels
11. Vectors and Math Arrays
12. Device Information and Kernel Specialization
13. Practical Tips
14. Common Parallel Patterns
15. Programming for GPUs
16. Programming for CPUs
17. Programming for FPGAs
18. Libraries
19. Memory Model and Atomics
20. Backend Interoperability
21. Migrating CUDA Code
Back Matter