Computer Architecture: A Quantitative Approach, 3rd Edition, 2002

This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and web technologies, and high performance computing.

The book retains its highly rated features: Fallacies and Pitfalls, which share the hard-won lessons of real designers; Historical Perspectives, which provide a deeper look at computer design history; Putting It All Together, which presents a design example that illustrates the principles of the chapter; Worked Examples, which challenge the reader to apply the concepts, theories and methods in smaller scale problems; and Cross-Cutting Issues, which show how the ideas covered in one chapter interact with those presented in others. In addition, a new feature, Another View, presents brief design examples in one of the three domains other than the one chosen for Putting It All Together.

The authors present a new organization of the material as well, reducing the overlap with their other text, Computer Organization and Design: A Hardware/Software Approach 2/e, and offering more in-depth treatment of advanced topics in multithreading, instruction level parallelism, VLIW architectures, memory hierarchies, storage devices and network technologies.

Also new to this edition is the adoption of MIPS64 as the instruction set architecture. In addition to several online appendixes, two new appendixes will be printed in the book: one contains a complete review of the basic concepts of pipelining, the other provides solutions to a selection of the exercises. Both will be invaluable to the student or professional learning on her own or in the classroom. Hennessy and Patterson continue to focus on fundamental techniques for designing real machines and for maximizing their cost/performance.

* Presents state-of-the-art design examples including:
  * IA-64 architecture and its first implementation, the Itanium
  * Pipeline designs for Pentium III and Pentium IV
  * The cluster that runs the Google search engine
  * EMC storage systems and their performance
  * Sony Playstation 2
  * Infiniband, a new storage area and system area network
  * Sun Fire 6800 multiprocessor server and its processor, the UltraSPARC III
  * Trimedia TM32 media processor and the Transmeta Crusoe processor
* Examines quantitative performance analysis in the commercial server market and the embedded market, as well as the traditional desktop market.
* Updates all the examples and figures with the most recent benchmarks, such as SPEC 2000.
* Expands coverage of instruction sets to include descriptions of digital signal processors, media processors, and multimedia extensions to desktop processors.
* Analyzes capacity, cost, and performance of disks over two decades.
* Surveys the role of clusters in scientific computing and commercial computing.
* Presents a survey, taxonomy, and the benchmarks of errors and failures in computer systems.
* Presents detailed descriptions of the design of storage systems and of clusters.
* Surveys memory hierarchies in modern microprocessors and the key parameters of modern disks.
* Presents a glossary of networking terms.

Author(s): John L. Hennessy, David A. Patterson
Edition: 3
Year: 2002

Language: English
Pages: 1142

1 Fundamentals of Computer Design......Page 2
Introduction......Page 3
The Changing Face of Computing and the Task of the Computer Designer......Page 6
Technology Trends......Page 13
Cost, Price and their Trends......Page 16
Measuring and Reporting Performance......Page 27
Quantitative Principles of Computer Design......Page 42
Putting It All Together: Performance and Price-Performance......Page 51
Another View: Power Consumption and Efficiency as the Metric......Page 60
Fallacies and Pitfalls......Page 61
Concluding Remarks......Page 71
Historical Perspective and References......Page 72
2 Instruction Set Principles and Examples......Page 88
Introduction......Page 89
Classifying Instruction Set Architectures......Page 91
Memory Addressing......Page 95
Addressing Modes for Signal Processing......Page 101
Type and Size of Operands......Page 104
Operands for Media and Signal Processing......Page 106
Operations for Media and Signal Processing......Page 108
Instructions for Control Flow......Page 112
Encoding an Instruction Set......Page 117
Crosscutting Issues: The Role of Compilers......Page 120
Putting It All Together: The MIPS Architecture......Page 130
Another View: The Trimedia TM32 CPU......Page 141
Fallacies and Pitfalls......Page 142
Concluding Remarks......Page 148
Historical Perspective and References......Page 150
3 Instruction-Level Parallelism and its Dynamic Exploitation......Page 168
Instruction-Level Parallelism: Concepts and Challenges......Page 169
Overcoming Data Hazards with Dynamic Scheduling......Page 179
Dynamic Scheduling: Examples and the Algorithm......Page 187
Reducing Branch Costs with Dynamic Hardware Prediction......Page 195
High Performance Instruction Delivery......Page 209
Taking Advantage of More ILP with Multiple Issue......Page 216
Hardware-Based Speculation......Page 226
Studies of the Limitations of ILP......Page 242
Limitations on ILP for Realizable Processors......Page 257
Putting It All Together: The P6 Microarchitecture......Page 264
Another View: Thread Level Parallelism......Page 277
Fallacies and Pitfalls......Page 278
Concluding Remarks......Page 281
Historical Perspective and References......Page 285
4 Exploiting Instruction Level Parallelism with Software Approaches......Page 302
Basic Compiler Techniques for Exposing ILP......Page 303
Static Branch Prediction......Page 313
Static Multiple Issue: the VLIW Approach......Page 316
Advanced Compiler Support for Exposing and Exploiting ILP......Page 320
Hardware Support for Exposing More Parallelism at Compile-Time......Page 342
Crosscutting Issues......Page 352
Putting It All Together: The Intel IA-64 Architecture and Itanium Processor......Page 353
Another View: ILP in the Embedded and Mobile Markets......Page 365
Fallacies and Pitfalls......Page 374
Concluding Remarks......Page 375
Historical Perspective and References......Page 377
5 Memory-Hierarchy Design......Page 386
Introduction......Page 387
Review of the ABCs of Caches......Page 390
Cache Performance......Page 404
Reducing Cache Miss Penalty......Page 412
Reducing Miss Rate......Page 422
Reducing Cache Miss Penalty or Miss Rate via Parallelism......Page 435
Reducing Hit Time......Page 444
Main Memory and Organizations for Improving Performance......Page 449
Memory Technology......Page 456
Virtual Memory......Page 462
Protection and Examples of Virtual Memory......Page 471
Crosscutting Issues in the Design of Memory Hierarchies......Page 481
Putting It All Together: Alpha 21264 Memory Hierarchy......Page 485
Another View: The Emotion Engine of the Sony Playstation 2......Page 493
Another View: The Sun Fire 6800 Server......Page 497
Fallacies and Pitfalls......Page 502
Concluding Remarks......Page 509
Historical Perspective and References......Page 512
6 Multiprocessors and Thread-Level Parallelism......Page 528
Introduction......Page 529
Characteristics of Application Domains......Page 543
Symmetric Shared-Memory Architectures......Page 552
Performance of Symmetric Shared-Memory Multiprocessors......Page 564
Distributed Shared-Memory Architectures......Page 581
Performance of Distributed Shared-Memory Multiprocessors......Page 591
Synchronization......Page 599
Models of Memory Consistency: An Introduction......Page 613
Multithreading: Exploiting Thread-Level Parallelism within a Processor......Page 617
Crosscutting Issues......Page 622
Putting It All Together: Sun's Wildfire Prototype......Page 629
Another View: Multithreading in a Commercial Server......Page 644
Another View: Embedded Multiprocessors......Page 645
Fallacies and Pitfalls......Page 646
Concluding Remarks......Page 652
Historical Perspective and References......Page 659
7 Storage Systems......Page 682
Introduction......Page 683
Types of Storage Devices......Page 685
Buses: Connecting I/O Devices to CPU/Memory......Page 698
Reliability, Availability, and Dependability......Page 707
RAID: Redundant Arrays of Inexpensive Disks......Page 712
Errors and Failures in Real Systems......Page 718
I/O Performance Measures......Page 722
A Little Queuing Theory......Page 728
Benchmarks of Storage Performance and Availability......Page 739
Crosscutting Issues......Page 745
Designing an I/O System in Five Easy Pieces......Page 750
Putting It All Together: EMC Symmetrix and Celerra......Page 763
Another View: Sanyo DSC-110 Digital Camera......Page 770
Fallacies and Pitfalls......Page 773
Concluding Remarks......Page 779
Historical Perspective and References......Page 780
8 Interconnection Networks and Clusters......Page 794
Introduction......Page 795
A Simple Network......Page 802
Interconnection Network Media......Page 812
Connecting More Than Two Computers......Page 815
Network Topology......Page 824
Practical Issues for Commercial Interconnection Networks......Page 832
Examples of Interconnection Networks......Page 836
Internetworking......Page 842
Crosscutting Issues for Interconnection Networks......Page 847
Clusters......Page 851
Designing a Cluster......Page 856
Putting It All Together: The Google Cluster of PCs......Page 870
Another View: Inside a Cell Phone......Page 877
Fallacies and Pitfalls......Page 882
Concluding Remarks......Page 885
Historical Perspective and References......Page 886
C: A Survey of RISC Architectures for Desktop, Server, and Embedded Computers......Page 898
Introduction......Page 900
Addressing Modes and Instruction Formats......Page 902
Instructions: The MIPS Core Subset......Page 903
Instructions: Multimedia Extensions of the Desktop/Server RISCs......Page 914
Instructions: Digital Signal-Processing Extensions of the Embedded RISCs......Page 916
Instructions: Common Extensions to MIPS Core......Page 917
Instructions Unique to MIPS64......Page 922
Instructions Unique to Alpha......Page 924
Instructions Unique to SPARC v.9......Page 925
Instructions Unique to PowerPC......Page 929
Instructions Unique to PA-RISC 2.0......Page 930
Instructions Unique to ARM......Page 933
Instructions Unique to Thumb......Page 934
Instructions Unique to SuperH......Page 935
Instructions Unique to MIPS16......Page 936
Concluding Remarks......Page 938
References......Page 939
D: An Alternative to RISC: The Intel 80x86......Page 943
Introduction......Page 947
80x86 Registers and Data Addressing Modes......Page 948
80x86 Integer Operations......Page 951
80x86 Floating-Point Operations......Page 955
80x86 Instruction Encoding......Page 957
Putting It All Together: Measurements of Instruction Set Usage......Page 959
Concluding Remarks......Page 965
Historical Perspective and References......Page 966
E: Another Alternative to RISC: The VAX Architecture......Page 967
VAX Operands and Addressing Modes......Page 970
Encoding VAX Instructions......Page 973
VAX Operations......Page 974
An Example to Put It All Together: swap......Page 978
A Longer Example: sort......Page 981
Fallacies and Pitfalls......Page 986
Concluding Remarks......Page 987
Historical Perspective and Further Reading......Page 988
Exercises......Page 989
F: The IBM 360/370 Architecture for Mainframe Computers......Page 990
Introduction......Page 993
System/360 Instruction Set......Page 994
360 Detailed Measurements......Page 997
Historical Perspective and References......Page 991
G: Vector Processors......Page 999
Why Vector Processors?......Page 1002
Basic Vector Architecture......Page 1004
Two Real-World Issues: Vector Length and Stride......Page 1016
Enhancing Vector Performance......Page 1023
Effectiveness of Compiler Vectorization......Page 1032
Putting It All Together: Performance of Vector Processors......Page 1034
Fallacies and Pitfalls......Page 1040
Concluding Remarks......Page 1042
Historical Perspective and References......Page 1043
Exercises......Page 1049
H: Computer Arithmetic......Page 1054
Basic Techniques of Integer Arithmetic......Page 1056
Floating Point......Page 1067
Floating-Point Multiplication......Page 1071
Floating-Point Addition......Page 1075
Division and Remainder......Page 1081
More on Floating-Point Arithmetic......Page 1087
Speeding Up Integer Addition......Page 1091
Speeding Up Integer Multiplication and Division......Page 1099
Putting It All Together......Page 1112
Fallacies and Pitfalls......Page 1116
Historical Perspective and References......Page 1117
Exercises......Page 1123
I: Implementing Coherence Protocols......Page 1129
Implementation Issues for the Snooping Coherence Protocol......Page 1131
Implementation Issues in the Distributed Directory Protocol......Page 1135
Exercises......Page 1141