If you need to learn CUDA but don't have experience with parallel computing, CUDA Programming: A Developer's Introduction offers a detailed guide to CUDA with a grounding in parallel fundamentals. It starts by introducing CUDA and bringing you up to speed on GPU parallelism and hardware, then delves into CUDA installation. Chapters on core concepts, including threads, blocks, grids, and memory, focus on both parallel and CUDA-specific issues. Later, the book demonstrates CUDA in practice for optimizing applications, adjusting to new hardware, and solving common problems.
Comprehensive introduction to parallel programming with CUDA, for readers new to both
Detailed instructions help readers optimize the CUDA software development kit
Practical techniques illustrate working with memory, threads, algorithms, resources, and more
Covers CUDA on multiple hardware platforms: Mac, Linux and Windows with several NVIDIA chipsets
Each chapter includes exercises to test reader knowledge
Author(s): Shane Cook
Series: Applications of GPU Computing
Edition: 1
Publisher: Morgan Kaufmann
Year: 2012
Language: English
Pages: 600
Tags: Library; Computer literature; CUDA / OpenCL
Front Cover......Page 1
CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs......Page 4
Copyright......Page 5
Contents......Page 6
Preface......Page 14
INTRODUCTION......Page 16
VON NEUMANN ARCHITECTURE......Page 17
CRAY......Page 20
CONNECTION MACHINE......Page 21
CELL PROCESSOR......Page 22
MULTINODE COMPUTING......Page 24
THE EARLY DAYS OF GPGPU CODING......Page 26
THE DEATH OF THE SINGLE-CORE SOLUTION......Page 27
NVIDIA AND CUDA......Page 28
GPU HARDWARE......Page 30
ALTERNATIVES TO CUDA......Page 31
CONCLUSION......Page 34
TRADITIONAL SERIAL CODE......Page 36
SERIAL/PARALLEL PROBLEMS......Page 38
CONCURRENCY......Page 39
TYPES OF PARALLELISM......Page 42
FLYNN’S TAXONOMY......Page 45
SOME COMMON PARALLEL PATTERNS......Page 46
CONCLUSION......Page 51
PC ARCHITECTURE......Page 52
GPU HARDWARE......Page 57
COMPUTE LEVELS......Page 61
INSTALLING THE SDK UNDER WINDOWS......Page 68
VISUAL STUDIO......Page 69
LINUX......Page 73
INSTALLING A DEBUGGER......Page 77
COMPILATION MODEL......Page 81
ERROR HANDLING......Page 82
CONCLUSION......Page 83
THREADS......Page 84
BLOCKS......Page 93
GRIDS......Page 98
WARPS......Page 106
BLOCK SCHEDULING......Page 110
A PRACTICAL EXAMPLE—HISTOGRAMS......Page 112
CONCLUSION......Page 118
INTRODUCTION......Page 122
CACHES......Page 123
REGISTER USAGE......Page 126
SHARED MEMORY......Page 135
CONSTANT MEMORY......Page 165
GLOBAL MEMORY......Page 182
TEXTURE MEMORY......Page 215
CONCLUSION......Page 217
SERIAL AND PARALLEL CODE......Page 218
PROCESSING DATASETS......Page 224
PROFILING......Page 234
AN EXAMPLE USING AES......Page 246
CONCLUSION......Page 280
References......Page 281
MULTI-CPU SYSTEMS......Page 282
MULTI-GPU SYSTEMS......Page 283
ALGORITHMS ON MULTIPLE GPUS......Page 284
WHICH GPU?......Page 285
SINGLE-NODE SYSTEMS......Page 289
STREAMS......Page 290
MULTIPLE-NODE SYSTEMS......Page 305
CONCLUSION......Page 316
STRATEGY 1: PARALLEL/SERIAL GPU/CPU PROBLEM BREAKDOWN......Page 320
STRATEGY 2: MEMORY CONSIDERATIONS......Page 335
STRATEGY 3: TRANSFERS......Page 349
STRATEGY 4: THREAD USAGE, CALCULATIONS, AND DIVERGENCE......Page 376
STRATEGY 5: ALGORITHMS......Page 401
STRATEGY 6: RESOURCE CONTENTIONS......Page 429
STRATEGY 7: SELF-TUNING APPLICATIONS......Page 450
CONCLUSION......Page 454
LIBRARIES......Page 456
CUDA COMPUTING SDK......Page 490
DIRECTIVE-BASED PROGRAMMING......Page 506
WRITING YOUR OWN KERNELS......Page 514
CONCLUSION......Page 517
INTRODUCTION......Page 518
CPU PROCESSOR......Page 520
GPU DEVICE......Page 522
PCI-E BUS......Page 524
CPU MEMORY......Page 525
AIR COOLING......Page 527
LIQUID COOLING......Page 528
DESKTOP CASES AND MOTHERBOARDS......Page 532
MASS STORAGE......Page 533
POWER CONSIDERATIONS......Page 537
OPERATING SYSTEMS......Page 540
CONCLUSION......Page 541
ERRORS WITH CUDA DIRECTIVES......Page 542
PARALLEL PROGRAMMING ISSUES......Page 551
ALGORITHMIC ISSUES......Page 559
FINDING AND AVOIDING ERRORS......Page 562
DEVELOPING FOR FUTURE GPUS......Page 570
FURTHER RESOURCES......Page 575
CONCLUSION......Page 577
References......Page 578
Index......Page 580