Author(s): Randal E. Bryant, David R. O’Hallaron. Manasa S., Mohit Tahiliani (eds.)
Edition: 3, global
Publisher: Pearson
Year: 2016
Language: English
Pages: 1122
Front Cover......Page 1
Contents......Page 8
Preface......Page 20
About the Authors......Page 36
Chapter 1: A Tour of Computer Systems......Page 38
1.1: Information Is Bits + Context......Page 40
1.2: Programs Are Translated by Other Programs into Different Forms......Page 41
1.3: It Pays to Understand How Compilation Systems Work......Page 43
1.4: Processors Read and Interpret Instructions Stored in Memory......Page 44
1.4.1: Hardware Organization of a System......Page 45
1.4.2: Running the hello Program......Page 47
1.5: Caches Matter......Page 48
1.7: The Operating System Manages the Hardware......Page 51
1.7.1: Processes......Page 52
1.7.2: Threads......Page 54
1.7.3: Virtual Memory......Page 55
1.8: Systems Communicate with Other Systems Using Networks......Page 56
1.9.1: Amdahl’s Law......Page 59
1.9.2: Concurrency and Parallelism......Page 61
1.9.3: The Importance of Abstractions in Computer Systems......Page 63
1.10: Summary......Page 64
Solutions to Practice Problems......Page 65
Part I: Program Structure and Execution......Page 66
Chapter 2: Representing and Manipulating Information......Page 68
2.1: Information Storage......Page 71
2.1.1: Hexadecimal Notation......Page 73
2.1.2: Data Sizes......Page 76
2.1.3: Addressing and Byte Ordering......Page 79
2.1.5: Representing Code......Page 86
2.1.6: Introduction to Boolean Algebra......Page 87
2.1.7: Bit-Level Operations in C......Page 91
2.1.8: Logical Operations in C......Page 93
2.1.9: Shift Operations in C......Page 94
2.2: Integer Representations......Page 96
2.2.1: Integral Data Types......Page 97
2.2.2: Unsigned Encodings......Page 99
2.2.3: Two’s-Complement Encodings......Page 101
2.2.4: Conversions between Signed and Unsigned......Page 107
2.2.5: Signed versus Unsigned in C......Page 111
2.2.6: Expanding the Bit Representation of a Number......Page 113
2.2.7: Truncating Numbers......Page 118
2.2.8: Advice on Signed versus Unsigned......Page 120
2.3.1: Unsigned Addition......Page 121
2.3.2: Two’s-Complement Addition......Page 127
2.3.3: Two’s-Complement Negation......Page 132
2.3.4: Unsigned Multiplication......Page 133
2.3.5: Two’s-Complement Multiplication......Page 134
2.3.6: Multiplying by Constant......Page 138
2.3.7: Dividing by Powers of 2......Page 140
2.3.8: Final Thoughts on Integer Arithmetic......Page 144
2.4: Floating Point......Page 145
2.4.1: Fractional Binary Numbers......Page 146
2.4.2: IEEE Floating-Point Representation......Page 149
2.4.3: Example Numbers......Page 152
2.4.4: Rounding......Page 157
2.4.5: Floating-Point Operations......Page 159
2.4.6: Floating Point in C......Page 161
2.5: Summary......Page 163
Bibliographic Notes......Page 164
Homework Problems......Page 165
Solutions to Practice Problems......Page 180
Chapter 3: Machine-Level Representation of Program......Page 200
3.1: A Historical Perspective......Page 203
3.2: Program Encodings......Page 206
3.2.1: Machine-Level Code......Page 207
3.2.2: Code Examples......Page 209
3.2.3: Notes on Formatting......Page 212
3.3: Data Formats......Page 214
3.4: Accessing Information......Page 216
3.4.1: Operand Specifiers......Page 217
3.4.2: Data Movement Instructions......Page 219
3.4.3: Data Movement Example......Page 223
3.4.4: Pushing and Popping Stack Data......Page 226
3.5.1: Load Effective Address......Page 228
3.5.3: Shift Operations......Page 231
3.5.4: Discussion......Page 233
3.5.5: Special Arithmetic Operations......Page 234
3.6: Control......Page 237
3.6.1: Condition Codes......Page 238
3.6.2: Accessing the Condition Codes......Page 239
3.6.3: Jump Instructions......Page 242
3.6.4: Jump Instruction Encodings......Page 244
3.6.5: Implementing Conditional Branches with Conditional Control......Page 246
3.6.6: Implementing Conditional Branches with Conditional Moves......Page 251
3.6.7: Loop......Page 257
3.6.8: Switch Statements......Page 269
3.7: Procedures......Page 275
3.7.1: The Run-Time Stack......Page 276
3.7.2: Control Transfer......Page 278
3.7.3: Data Transfer......Page 282
3.7.4: Local Storage on the Stack......Page 285
3.7.5: Local Storage in Registers......Page 286
3.7.6: Recursive Procedures......Page 290
3.8.1: Basic Principles......Page 292
3.8.2: Pointer Arithmetic......Page 294
3.8.3: Nested Arrays......Page 295
3.8.4: Fixed-Size Arrays......Page 297
3.8.5: Variable-Size Arrays......Page 299
3.9.1: Structures......Page 302
3.9.2: Unions......Page 306
3.9.3: Data Alignment......Page 310
3.10: Combining Control and Data in Machine-Level Programs......Page 313
3.10.1: Understanding Pointers......Page 314
3.10.3: Out-of-Bounds Memory References and Buffer Overflow......Page 316
3.10.4: Thwarting Buffer Overflow Attacks......Page 321
3.10.5: Supporting Variable-Size Stack Frames......Page 327
3.11: Floating-Point Code......Page 330
3.11.1: Floating-Point Movement and Conversion Operations......Page 333
3.11.2: Floating-Point Code in Procedures......Page 338
3.11.3: Floating-Point Arithmetic Operations......Page 339
3.11.4: Defining and Using Floating-Point Constants......Page 341
3.11.5: Using Bitwise Operations in Floating-Point Code......Page 342
3.11.6: Floating-Point Comparison Operations......Page 343
3.12: Summary......Page 346
Bibliographic Notes......Page 347
Homework Problems......Page 348
Solutions to Practice Problems......Page 362
Chapter 4: Processor Architecture......Page 388
4.1.1: Programmer-Visible State......Page 392
4.1.2: Y86-64 Instructions......Page 393
4.1.3: Instruction Encoding......Page 395
4.1.4: Y86-64 Exceptions......Page 400
4.1.5: Y86-64 Programs......Page 401
4.1.6: Some Y86-64 Instruction Details......Page 407
4.2: Logic Design and the Hardware Control Language HCL......Page 409
4.2.1: Logic Gates......Page 410
4.2.2: Combinational Circuits and HCL Boolean Expressions......Page 411
4.2.3: Word-Level Combinational Circuits and HCL Integer Expressions......Page 413
4.2.4: Set Membership......Page 417
4.2.5: Memory and Clocking......Page 418
4.3.1: Organizing Processing into Stages......Page 421
4.3.2: SEQ Hardware Structure......Page 433
4.3.3: SEQ Timing......Page 437
4.3.4: SEQ Stage Implementations......Page 441
4.4.1: Computational Pipelines......Page 449
4.4.2: A Detailed Look at Pipeline Operation......Page 451
4.4.3: Limitations of Pipelining......Page 453
4.4.4: Pipelining a System with Feedback......Page 456
4.5.1: SEQ+: Rearranging the Computation Stages......Page 458
4.5.2: Inserting Pipeline Registers......Page 459
4.5.3: Rearranging and Relabeling Signals......Page 463
4.5.4: Next PC Prediction......Page 464
4.5.5: Pipeline Hazards......Page 466
4.5.6: Exception Handling......Page 481
4.5.7: PIPE Stage Implementations......Page 484
4.5.8: Pipeline Control Logic......Page 492
4.5.9: Performance Analysis......Page 501
4.5.10: Unfinished Business......Page 505
4.6: Summary......Page 507
4.6.1: Y86-64 Simulators......Page 509
Homework Problems......Page 510
Solutions to Practice Problems......Page 517
Chapter 5: Optimizing Program Performance......Page 532
5.1: Capabilities and Limitations of Optimizing Compilers......Page 535
5.2: Expressing Program Performance......Page 539
5.3: Program Example......Page 541
5.4: Eliminating Loop Inefficiencies......Page 545
5.5: Reducing Procedure Calls......Page 549
5.6: Eliminating Unneeded Memory References......Page 551
5.7: Understanding Modern Processors......Page 554
5.7.1: Overall Operation......Page 555
5.7.2: Functional Unit Performance......Page 560
5.7.3: An Abstract Model of Processor Operation......Page 562
5.8: Loop Unrolling......Page 568
5.9.1: Multiple Accumulators......Page 573
5.9.2: Reassociation Transformation......Page 578
5.10: Summary of Results for Optimizing Combining Code......Page 584
5.11.1: Register Spilling......Page 585
5.11.2: Branch Prediction and Misprediction Penalties......Page 586
5.12: Understanding Memory Performance......Page 590
5.12.1: Load Performance......Page 591
5.12.2: Store Performance......Page 592
5.13: Life in the Real World: Performance Improvement Techniques......Page 598
5.14.1: Program Profiling......Page 599
5.14.2: Using a Profiler to Guide Optimization......Page 602
5.15: Summary......Page 605
Bibliographic Notes......Page 606
Homework Problems......Page 607
Solutions to Practice Problems......Page 610
Chapter 6: The Memory Hierarchy......Page 616
6.1.1: Random Access Memory......Page 618
6.1.2: Disk Storage......Page 626
6.1.3: Solid State Disks......Page 637
6.1.4: Storage Technology Trends......Page 639
6.2: Locality......Page 641
6.2.1: Locality of References to Program Data......Page 643
6.2.2: Locality of Instruction Fetches......Page 644
6.2.3: Summary of Locality......Page 645
6.3: The Memory Hierarchy......Page 646
6.3.1: Caching in the Memory Hierarchy......Page 647
6.4: Cache Memories......Page 651
6.4.1: Generic Cache Memory Organization......Page 652
6.4.2: Direct-Mapped Caches......Page 654
6.4.3: Set Associative Caches......Page 661
6.4.4: Fully Associative Caches......Page 663
6.4.5: Issues with Writes......Page 667
6.4.7: Performance Impact of Cache Parameters......Page 668
6.5: Writing Cache-Friendly Code......Page 670
6.6.1: The Memory Mountain......Page 676
6.6.2: Rearranging Loops to Increase Spatial Locality......Page 680
6.6.3: Exploiting Locality in Your Programs......Page 684
Bibliographic Notes......Page 685
Homework Problems......Page 686
Solutions to Practice Problems......Page 697
Part II: Running Programs on a System......Page 704
Chapter 7: Linking......Page 706
7.1: Compiler Drivers......Page 708
7.2: Static Linking......Page 709
7.3: Object Files......Page 710
7.4: Relocatable Object Files......Page 711
7.5: Symbols and Symbol Tables......Page 712
7.6: Symbol Resolution......Page 716
7.6.1: How Linkers Resolve Duplicate Symbol Names......Page 717
7.6.2: Linking with Static Libraries......Page 721
7.6.3: How Linkers Use Static Libraries to Resolve References......Page 725
7.7: Relocation......Page 726
7.7.1: Relocation Entries......Page 727
7.7.2: Relocating Symbol References......Page 728
7.8: Executable Object Files......Page 732
7.9: Loading Executable Object Files......Page 734
7.10: Dynamic Linking with Shared Libraries......Page 735
7.11: Loading and Linking Shared Libraries from Applications......Page 738
7.12: Position-Independent Code (PIC)......Page 741
7.13: Library Interpositioning......Page 744
7.13.2: Link-Time Interpositioning......Page 745
7.13.3: Run-Time Interpositioning......Page 747
7.15: Summary......Page 750
Homework Problems......Page 751
Solutions to Practice Problems......Page 754
Chapter 8: Exceptional Control Flow......Page 758
8.1: Exceptions......Page 760
8.1.1: Exception Handling......Page 761
8.1.2: Classes of Exceptions......Page 763
8.1.3: Exceptions in Linux/x86-64 Systems......Page 766
8.2.1: Logical Control Flow......Page 769
8.2.2: Concurrent Flows......Page 770
8.2.4: User and Kernel Modes......Page 771
8.2.5: Context Switches......Page 773
8.3: System Call Error Handling......Page 774
8.4: Process Control......Page 775
8.4.2: Creating and Terminating Processes......Page 776
8.4.3: Reaping Child Processes......Page 780
8.4.4: Putting Processes to Sleep......Page 786
8.4.5: Loading and Running Programs......Page 787
8.4.6: Using fork and execve to Run Programs......Page 790
8.5: Signals......Page 793
8.5.1: Signal Terminology......Page 795
8.5.2: Sending Signals......Page 796
8.5.3: Receiving Signals......Page 799
8.5.4: Blocking and Unblocking Signals......Page 801
8.5.5: Writing Signal Handlers......Page 803
8.5.6: Synchronizing Flows to Avoid Nasty Concurrency Bugs......Page 813
8.5.7: ExplicitlyWaiting for Signals......Page 815
8.6: Nonlocal Jumps......Page 818
8.7: Tools for Manipulating Processes......Page 823
Bibliographic Notes......Page 824
Homework Problems......Page 825
Solutions to Practice Problems......Page 832
Chapter 9: Virtual Memory......Page 838
9.1: Physical and Virtual Addressing......Page 840
9.2: Address Spaces......Page 841
9.3: VM as a Tool for Caching......Page 842
9.3.2: Page Tables......Page 843
9.3.4: Page Faults......Page 845
9.3.6: Locality to the Rescue Again......Page 847
9.4: VM as a Tool for Memory Management......Page 848
9.5: VM as a Tool for Memory Protection......Page 849
9.6: Address Translation......Page 850
9.6.2: Speeding Up Address Translation with a TLB......Page 854
9.6.3: Multi-Level Page Tables......Page 856
9.6.4: Putting It Together: End-to-End Address Translation......Page 858
9.7: Case Study: The Intel Core i7/Linux Memory System......Page 862
9.7.1: Core i7 Address Translation......Page 863
9.7.2: Linux Virtual Memory System......Page 865
9.8.1: Shared Objects Revisited......Page 870
9.8.3: The execve Function Revisited......Page 873
9.8.4: User-Level Memory Mapping with the mmap Function......Page 874
9.9: Dynamic Memory Allocation......Page 876
9.9.1: The malloc and free Functions......Page 877
9.9.2: Why Dynamic Memory Allocation?......Page 880
9.9.3: Allocator Requirements and Goals......Page 881
9.9.5: Implementation Issues......Page 883
9.9.6: Implicit Free Lists......Page 884
9.9.8: Splitting Free Blocks......Page 886
9.9.10: Coalescing Free Blocks......Page 887
9.9.11: Coalescing with Boundary Tags......Page 888
9.9.12: Putting It Together: Implementing a Simple Allocator......Page 891
9.9.13: Explicit Free Lists......Page 899
9.9.14: Segregated Free Lists......Page 900
9.10: Garbage Collection......Page 902
9.10.1: Garbage Collector Basics......Page 903
9.10.2: Mark&Sweep Garbage Collectors......Page 904
9.10.3: Conservative Mark&Sweep for C Programs......Page 906
9.11.1: Dereferencing Bad Pointers......Page 907
9.11.3: Allowing Stack Buffer Overflows......Page 908
9.11.5: Making Off-by-One Errors......Page 909
9.11.7: Misunderstanding Pointer Arithmetic......Page 910
9.11.9: Referencing Data in Free Heap Blocks......Page 911
9.12: Summary......Page 912
Homework Problems......Page 913
Solutions to Practice Problems......Page 917
Part III: Interaction and Communication between Programs......Page 924
Chapter 10: System-Level I/O......Page 926
10.1: Unix I/O......Page 927
10.2: Files......Page 928
10.3: Opening and Closing Files......Page 930
10.4: Reading and Writing Files......Page 932
10.5.1: Rio Unbuffered Input and Output Functions......Page 934
10.5.2: Rio Buffered Input Functions......Page 935
10.6: Reading File Metadata......Page 940
10.7: Reading Directory Contents......Page 942
10.8: Sharing Files......Page 943
10.9: I/O Redirection......Page 946
10.11: Putting It Together: Which I/O Functions Should I Use?......Page 948
10.12: Summary......Page 950
Homework Problems......Page 951
Solutions to Practice Problems......Page 952
Chapter 11: Network Programming......Page 954
11.1: The Client-Server Programming Model......Page 955
11.2: Networks......Page 956
11.3: The Global IP Internet......Page 961
11.3.1: IP Addresses......Page 962
11.3.2: Internet Domain Names......Page 964
11.3.3: Internet Connections......Page 966
11.4: The Sockets Interface......Page 969
11.4.1: Socket Address Structures......Page 970
11.4.3: The connect Function......Page 971
11.4.5: The listen Function......Page 972
11.4.6: The accept Function......Page 973
11.4.7: Host and Service Conversion......Page 974
11.4.8: Helper Functions for the Sockets Interface......Page 979
11.4.9: Example Echo Client and Server......Page 981
11.5.1: Web Basics......Page 985
11.5.2: Web Content......Page 986
11.5.3: HTTP Transactions......Page 987
11.5.4: Serving Dynamic Content......Page 990
11.6: Putting It Together: The Tiny Web Server......Page 993
11.7: Summary......Page 1001
Homework Problems......Page 1002
Solutions to Practice Problems......Page 1003
Chapter 12: Concurrent Programming......Page 1008
12.1: Concurrent Programming with Processes......Page 1010
12.1.1: A Concurrent Server Based on Processes......Page 1011
12.1.2: Pros and Cons of Processes......Page 1012
12.2: Concurrent Programming with I/O Multiplexing......Page 1014
12.2.1: A Concurrent Event-Driven Server Based on I/O Multiplexing......Page 1017
12.3: Concurrent Programming with Threads......Page 1022
12.3.1: Thread Execution Model......Page 1023
12.3.2: Posix Threads......Page 1024
12.3.4: Terminating Threads......Page 1025
12.3.6: Detaching Threads......Page 1026
12.3.7: Initializing Threads......Page 1027
12.3.8: A Concurrent Server Based on Threads......Page 1028
12.4: Shared Variables in Threaded Programs......Page 1029
12.4.1: Threads Memory Model......Page 1030
12.4.2: Mapping Variables to Memory......Page 1031
12.5: Synchronizing Threads with Semaphores......Page 1032
12.5.1: Progress Graphs......Page 1036
12.5.2: Semaphores......Page 1038
12.5.3: Using Semaphores for Mutual Exclusion......Page 1039
12.5.4: Using Semaphores to Schedule Shared Resources......Page 1041
12.5.5: Putting It Together: A Concurrent Server Based on Prethreading......Page 1045
12.6: Using Threads for Parallelism......Page 1050
12.7.1: Thread Safety......Page 1057
12.7.2: Reentrancy......Page 1060
12.7.3: Using Existing Library Functions in Threaded Programs......Page 1061
12.7.4: Races......Page 1062
12.7.5: Deadlocks......Page 1064
Bibliographic Notes......Page 1067
Homework Problems......Page 1068
Solutions to Practice Problems......Page 1073
Appendix A: Error Handling......Page 1078
A.1: Error Handling in Unix Systems......Page 1079
A.2: Error-Handling Wrappers......Page 1080
References......Page 1084
Index......Page 1090
Back Cover......Page 1122