The Art of 64-Bit Assembly, Volume 1: x86-64 Machine Organization and Programming

A new assembly language programming book from a well-loved master. Art of 64-bit Assembly Language capitalizes on the long-lived success of Hyde's seminal The Art of Assembly Language. Randall Hyde's The Art of Assembly Language has been the go-to book for learning assembly language for decades. Hyde's latest work, Art of 64-bit Assembly Language, is the 64-bit version of this popular text. This book guides you through the maze of assembly language programming by showing how to write assembly code that mimics operations in high-level languages (HLLs). This approach leverages your HLL knowledge so that you can rapidly understand x86-64 assembly language. This new work uses the Microsoft Macro Assembler (MASM), the most popular x86-64 assembler today. Hyde covers the standard integer instruction set, as well as the x87 FPU, SIMD parallel instructions, SIMD scalar instructions (including high-performance floating-point instructions), and MASM's very powerful macro facilities. You'll learn in detail: how to implement high-level language data and control structures in assembly language; how to write parallel algorithms using the SIMD (single-instruction, multiple-data) instructions on the x86-64; and how to write stand-alone assembly programs and assembly code to link with HLL code. You'll also learn how to optimize certain algorithms in assembly to produce faster code.

Author(s): Randall Hyde
Publisher: No Starch Press
Year: 2021

Language: English
Pages: 1032

Brief Contents
Contents in Detail
Foreword
Acknowledgments
Introduction
A Note About the Source Code in This Book
Part I: Machine Organization
Chapter 1: Hello, World of Assembly Language
1.1 What You’ll Need
1.2 Setting Up MASM on Your Machine
1.3 Setting Up a Text Editor on Your Machine
1.4 The Anatomy of a MASM Program
1.5 Running Your First MASM Program
1.6 Running Your First MASM/C++ Hybrid Program
1.7 An Introduction to the Intel x86-64 CPU Family
1.8 The Memory Subsystem
1.9 Declaring Memory Variables in MASM
1.9.1 Associating Memory Addresses with Variables
1.9.2 Associating Data Types with Variables
1.10 Declaring (Named) Constants in MASM
1.11 Some Basic Machine Instructions
1.11.1 The mov Instruction
1.11.2 Type Checking on Instruction Operands
1.11.3 The add and sub Instructions
1.11.4 The lea Instruction
1.11.5 The call and ret Instructions and MASM Procedures
1.12 Calling C/C++ Procedures
1.13 Hello, World!
1.14 Returning Function Results in Assembly Language
1.15 Automating the Build Process
1.16 Microsoft ABI Notes
1.16.1 Variable Size
1.16.2 Register Usage
1.16.3 Stack Alignment
1.17 For More Information
1.18 Test Yourself
Chapter 2: Computer Data Representation and Operations
2.1 Numbering Systems
2.1.1 A Review of the Decimal System
2.1.2 The Binary Numbering System
2.1.3 Binary Conventions
2.2 The Hexadecimal Numbering System
2.3 A Note About Numbers vs. Representation
2.4 Data Organization
2.4.1 Bits
2.4.2 Nibbles
2.4.3 Bytes
2.4.4 Words
2.4.5 Double Words
2.4.6 Quad Words and Octal Words
2.5 Logical Operations on Bits
2.5.1 The AND Operation
2.5.2 The OR Operation
2.5.3 The XOR Operation
2.5.4 The NOT Operation
2.6 Logical Operations on Binary Numbers and Bit Strings
2.7 Signed and Unsigned Numbers
2.8 Sign Extension and Zero Extension
2.9 Sign Contraction and Saturation
2.10 Brief Detour: An Introduction to Control Transfer Instructions
2.10.1 The jmp Instruction
2.10.2 The Conditional Jump Instructions
2.10.3 The cmp Instruction and Corresponding Conditional Jumps
2.10.4 Conditional Jump Synonyms
2.11 Shifts and Rotates
2.12 Bit Fields and Packed Data
2.13 IEEE Floating-Point Formats
2.13.1 Single-Precision Format
2.13.2 Double-Precision Format
2.13.3 Extended-Precision Format
2.13.4 Normalized Floating-Point Values
2.13.5 Non-Numeric Values
2.13.6 MASM Support for Floating-Point Values
2.14 Binary-Coded Decimal Representation
2.15 Characters
2.15.1 The ASCII Character Encoding
2.15.2 MASM Support for ASCII Characters
2.16 The Unicode Character Set
2.16.1 Unicode Code Points
2.16.2 Unicode Code Planes
2.16.3 Unicode Encodings
2.17 MASM Support for Unicode
2.18 For More Information
2.19 Test Yourself
Chapter 3: Memory Access and Organization
3.1 Runtime Memory Organization
3.1.1 The .code Section
3.1.2 The .data Section
3.1.3 The .const Section
3.1.4 The .data? Section
3.1.5 Organization of Declaration Sections Within Your Programs
3.1.6 Memory Access and 4K Memory Management Unit Pages
3.2 How MASM Allocates Memory for Variables
3.3 The Label Declaration
3.4 Little-Endian and Big-Endian Data Organization
3.5 Memory Access
3.6 MASM Support for Data Alignment
3.7 The x86-64 Addressing Modes
3.7.1 x86-64 Register Addressing Modes
3.7.2 x86-64 64-Bit Memory Addressing Modes
3.7.3 Large Address Unaware Applications
3.8 Address Expressions
3.9 The Stack Segment and the push and pop Instructions
3.9.1 The Basic push Instruction
3.9.2 The Basic pop Instruction
3.9.3 Preserving Registers with the push and pop Instructions
3.10 The Stack Is a LIFO Data Structure
3.11 Other push and pop Instructions
3.12 Removing Data from the Stack Without Popping It
3.13 Accessing Data You’ve Pushed onto the Stack Without Popping It
3.14 Microsoft ABI Notes
3.15 For More Information
3.16 Test Yourself
Chapter 4: Constants, Variables, and Data Types
4.1 The imul Instruction
4.2 The inc and dec Instructions
4.3 MASM Constant Declarations
4.3.1 Constant Expressions
4.3.2 this and $ Operators
4.3.3 Constant Expression Evaluation
4.4 The MASM typedef Statement
4.5 Type Coercion
4.6 Pointer Data Types
4.6.1 Using Pointers in Assembly Language
4.6.2 Declaring Pointers in MASM
4.6.3 Pointer Constants and Pointer Constant Expressions
4.6.4 Pointer Variables and Dynamic Memory Allocation
4.6.5 Common Pointer Problems
4.7 Composite Data Types
4.8 Character Strings
4.8.1 Zero-Terminated Strings
4.8.2 Length-Prefixed Strings
4.8.3 String Descriptors
4.8.4 Pointers to Strings
4.8.5 String Functions
4.9 Arrays
4.9.1 Declaring Arrays in Your MASM Programs
4.9.2 Accessing Elements of a Single-Dimensional Array
4.9.3 Sorting an Array of Values
4.10 Multidimensional Arrays
4.10.1 Row-Major Ordering
4.10.2 Column-Major Ordering
4.10.3 Allocating Storage for Multidimensional Arrays
4.10.4 Accessing Multidimensional Array Elements in Assembly Language
4.11 Records/Structs
4.11.1 MASM Struct Declarations
4.11.2 Accessing Record/Struct Fields
4.11.3 Nesting MASM Structs
4.11.4 Initializing Struct Fields
4.11.5 Arrays of Structs
4.11.6 Aligning Fields Within a Record
4.12 Unions
4.12.1 Anonymous Unions
4.12.2 Variant Types
4.13 Microsoft ABI Notes
4.14 For More Information
4.15 Test Yourself
Part II: Assembly Language Programming
Chapter 5: Procedures
5.1 Implementing Procedures
5.1.1 The call and ret Instructions
5.1.2 Labels in a Procedure
5.2 Saving the State of the Machine
5.3 Procedures and the Stack
5.3.1 Activation Records
5.3.2 The Assembly Language Standard Entry Sequence
5.3.3 The Assembly Language Standard Exit Sequence
5.4 Local (Automatic) Variables
5.4.1 Low-Level Implementation of Automatic (Local) Variables
5.4.2 The MASM Local Directive
5.4.3 Automatic Allocation
5.5 Parameters
5.5.1 Pass by Value
5.5.2 Pass by Reference
5.5.3 Low-Level Parameter Implementation
5.5.4 Declaring Parameters with the proc Directive
5.5.5 Accessing Reference Parameters on the Stack
5.6 Calling Conventions and the Microsoft ABI
5.7 The Microsoft ABI and Microsoft Calling Convention
5.7.1 Data Types and the Microsoft ABI
5.7.2 Parameter Locations
5.7.3 Volatile and Nonvolatile Registers
5.7.4 Stack Alignment
5.7.5 Parameter Setup and Cleanup (or “What’s with These Magic Instructions?”)
5.8 Functions and Function Results
5.9 Recursion
5.10 Procedure Pointers
5.11 Procedural Parameters
5.12 Saving the State of the Machine, Part II
5.13 Microsoft ABI Notes
5.14 For More Information
5.15 Test Yourself
Chapter 6: Arithmetic
6.1 x86-64 Integer Arithmetic Instructions
6.1.1 Sign- and Zero-Extension Instructions
6.1.2 The mul and imul Instructions
6.1.3 The div and idiv Instructions
6.1.4 The cmp Instruction, Revisited
6.1.5 The setcc Instructions
6.1.6 The test Instruction
6.2 Arithmetic Expressions
6.2.1 Simple Assignments
6.2.2 Simple Expressions
6.2.3 Complex Expressions
6.2.4 Commutative Operators
6.3 Logical (Boolean) Expressions
6.4 Machine and Arithmetic Idioms
6.4.1 Multiplying Without mul or imul
6.4.2 Dividing Without div or idiv
6.4.3 Implementing Modulo-N Counters with AND
6.5 Floating-Point Arithmetic
6.5.1 Floating-Point on the x86-64
6.5.2 FPU Registers
6.5.3 FPU Data Types
6.5.4 The FPU Instruction Set
6.5.5 FPU Data Movement Instructions
6.5.6 Conversions
6.5.7 Arithmetic Instructions
6.5.8 Comparison Instructions
6.5.9 Constant Instructions
6.5.10 Transcendental Instructions
6.5.11 Miscellaneous Instructions
6.6 Converting Floating-Point Expressions to Assembly Language
6.6.1 Converting Arithmetic Expressions to Postfix Notation
6.6.2 Converting Postfix Notation to Assembly Language
6.7 SSE Floating-Point Arithmetic
6.7.1 SSE MXCSR Register
6.7.2 SSE Floating-Point Move Instructions
6.7.3 SSE Floating-Point Arithmetic Instructions
6.7.4 SSE Floating-Point Comparisons
6.7.5 SSE Floating-Point Conversions
6.8 For More Information
6.9 Test Yourself
Chapter 7: Low-Level Control Structures
7.1 Statement Labels
7.1.1 Using Local Symbols in Procedures
7.1.2 Initializing Arrays with Label Addresses
7.2 Unconditional Transfer of Control (jmp)
7.2.1 Register-Indirect Jumps
7.2.2 Memory-Indirect Jumps
7.3 Conditional Jump Instructions
7.4 Trampolines
7.5 Conditional Move Instructions
7.6 Implementing Common Control Structures in Assembly Language
7.6.1 Decisions
7.6.2 if/then/else Sequences
7.6.3 Complex if Statements Using Complete Boolean Evaluation
7.6.4 Short-Circuit Boolean Evaluation
7.6.5 Short-Circuit vs. Complete Boolean Evaluation
7.6.6 Efficient Implementation of if Statements in Assembly Language
7.6.7 switch/case Statements
7.7 State Machines and Indirect Jumps
7.8 Loops
7.8.1 while Loops
7.8.2 repeat/until Loops
7.8.3 forever/endfor Loops
7.8.4 for Loops
7.8.5 The break and continue Statements
7.8.6 Register Usage and Loops
7.9 Loop Performance Improvements
7.9.1 Moving the Termination Condition to the End of a Loop
7.9.2 Executing the Loop Backward
7.9.3 Using Loop-Invariant Computations
7.9.4 Unraveling Loops
7.9.5 Using Induction Variables
7.10 For More Information
7.11 Test Yourself
Chapter 8: Advanced Arithmetic
8.1 Extended-Precision Operations
8.1.1 Extended-Precision Addition
8.1.2 Extended-Precision Subtraction
8.1.3 Extended-Precision Comparisons
8.1.4 Extended-Precision Multiplication
8.1.5 Extended-Precision Division
8.1.6 Extended-Precision Negation Operations
8.1.7 Extended-Precision AND Operations
8.1.8 Extended-Precision OR Operations
8.1.9 Extended-Precision XOR Operations
8.1.10 Extended-Precision NOT Operations
8.1.11 Extended-Precision Shift Operations
8.1.12 Extended-Precision Rotate Operations
8.2 Operating on Different-Size Operands
8.3 Decimal Arithmetic
8.3.1 Literal BCD Constants
8.3.2 Packed Decimal Arithmetic Using the FPU
8.4 For More Information
8.5 Test Yourself
Chapter 9: Numeric Conversion
9.1 Converting Numeric Values to Strings
9.1.1 Converting Numeric Values to Hexadecimal Strings
9.1.2 Converting Extended-Precision Hexadecimal Values to Strings
9.1.3 Converting Unsigned Decimal Values to Strings
9.1.4 Converting Signed Integer Values to Strings
9.1.5 Converting Extended-Precision Unsigned Integers to Strings
9.1.6 Converting Extended-Precision Signed Decimal Values to Strings
9.1.7 Formatted Conversions
9.1.8 Converting Floating-Point Values to Strings
9.2 String-to-Numeric Conversion Routines
9.2.1 Converting Decimal Strings to Integers
9.2.2 Converting Hexadecimal Strings to Numeric Form
9.2.3 Converting Unsigned Decimal Strings to Integers
9.2.4 Conversion of Extended-Precision String to Unsigned Integer
9.2.5 Conversion of Extended-Precision Signed Decimal String to Integer
9.2.6 Conversion of Real String to Floating-Point
9.3 For More Information
9.4 Test Yourself
Chapter 10: Table Lookups
10.1 Tables
10.1.1 Function Computation via Table Lookup
10.1.2 Generating Tables
10.1.3 Table-Lookup Performance
10.2 For More Information
10.3 Test Yourself
Chapter 11: SIMD Instructions
11.1 The SSE/AVX Architectures
11.2 Streaming Data Types
11.3 Using cpuid to Differentiate Instruction Sets
11.4 Full-Segment Syntax and Segment Alignment
11.5 SSE, AVX, and AVX2 Memory Operand Alignment
11.6 SIMD Data Movement Instructions
11.6.1 The (v)movd and (v)movq Instructions
11.6.2 The (v)movaps, (v)movapd, and (v)movdqa Instructions
11.6.3 The (v)movups, (v)movupd, and (v)movdqu Instructions
11.6.4 Performance of Aligned and Unaligned Moves
11.6.5 The (v)movlps and (v)movlpd Instructions
11.6.6 The movhps and movhpd Instructions
11.6.7 The vmovhps and vmovhpd Instructions
11.6.8 The movlhps and vmovlhps Instructions
11.6.9 The movhlps and vmovhlps Instructions
11.6.10 The (v)movshdup and (v)movsldup Instructions
11.6.11 The (v)movddup Instruction
11.6.12 The (v)lddqu Instruction
11.6.13 Performance Issues and the SIMD Move Instructions
11.6.14 Some Final Comments on the SIMD Move Instructions
11.7 The Shuffle and Unpack Instructions
11.7.1 The (v)pshufb Instructions
11.7.2 The (v)pshufd Instructions
11.7.3 The (v)pshuflw and (v)pshufhw Instructions
11.7.4 The shufps and shufpd Instructions
11.7.5 The vshufps and vshufpd Instructions
11.7.6 The (v)unpcklps, (v)unpckhps, (v)unpcklpd, and (v)unpckhpd Instructions
11.7.7 The Integer Unpack Instructions
11.7.8 The (v)pextrb, (v)pextrw, (v)pextrd, and (v)pextrq Instructions
11.7.9 The (v)pinsrb, (v)pinsrw, (v)pinsrd, and (v)pinsrq Instructions
11.7.10 The (v)extractps and (v)insertps Instructions
11.8 SIMD Arithmetic and Logical Operations
11.9 The SIMD Logical (Bitwise) Instructions
11.9.1 The (v)ptest Instructions
11.9.2 The Byte Shift Instructions
11.9.3 The Bit Shift Instructions
11.10 The SIMD Integer Arithmetic Instructions
11.10.1 SIMD Integer Addition
11.10.2 Horizontal Additions
11.10.3 Double-Word–Sized Horizontal Additions
11.10.4 SIMD Integer Subtraction
11.10.5 SIMD Integer Multiplication
11.10.6 SIMD Integer Averages
11.10.7 SIMD Integer Minimum and Maximum
11.10.8 SIMD Integer Absolute Value
11.10.9 SIMD Integer Sign Adjustment Instructions
11.10.10 SIMD Integer Comparison Instructions
11.10.11 Integer Conversions
11.11 SIMD Floating-Point Arithmetic Operations
11.12 SIMD Floating-Point Comparison Instructions
11.12.1 SSE and AVX Comparisons
11.12.2 Unordered vs. Ordered Comparisons
11.12.3 Signaling and Quiet Comparisons
11.12.4 Instruction Synonyms
11.12.5 AVX Extended Comparisons
11.12.6 Using SIMD Comparison Instructions
11.12.7 The (v)movmskps, (v)movmskpd Instructions
11.13 Floating-Point Conversion Instructions
11.14 Aligning SIMD Memory Accesses
11.15 Aligning Word, Dword, and Qword Object Addresses
11.16 Filling an XMM Register with Several Copies of the Same Value
11.17 Loading Some Common Constants Into XMM and YMM Registers
11.18 Setting, Clearing, Inverting, and Testing a Single Bit in an SSE Register
11.19 Processing Two Vectors by Using a Single Incremented Index
11.20 Aligning Two Addresses to a Boundary
11.21 Working with Blocks of Data Whose Length Is Not a Multiple of the SSE/AVX Register Size
11.22 Dynamically Testing for a CPU Feature
11.23 The MASM Include Directive
11.24 And a Whole Lot More
11.25 For More Information
11.26 Test Yourself
Chapter 12: Bit Manipulation
12.1 What Is Bit Data, Anyway?
12.2 Instructions That Manipulate Bits
12.2.1 The and Instruction
12.2.2 The or Instruction
12.2.3 The xor Instruction
12.2.4 Flag Modification by Logical Instructions
12.2.5 The Bit Test Instructions
12.2.6 Manipulating Bits with Shift and Rotate Instructions
12.3 The Carry Flag as a Bit Accumulator
12.4 Packing and Unpacking Bit Strings
12.5 BMI1 Instructions to Extract Bits and Create Bit Masks
12.6 Coalescing Bit Sets and Distributing Bit Strings
12.7 Coalescing and Distributing Bit Strings Using BMI2 Instructions
12.8 Packed Arrays of Bit Strings
12.9 Searching for a Bit
12.10 Counting Bits
12.11 Reversing a Bit String
12.12 Merging Bit Strings
12.13 Extracting Bit Strings
12.14 Searching for a Bit Pattern
12.15 For More Information
12.16 Test Yourself
Chapter 13: Macros and the MASM Compile-Time Language
13.1 Introduction to the Compile-Time Language
13.2 The echo and .err Directives
13.3 Compile-Time Constants and Variables
13.4 Compile-Time Expressions and Operators
13.4.1 The MASM Escape (!) Operator
13.4.2 The MASM Evaluation (%) Operator
13.4.3 The catstr Directive
13.4.4 The instr Directive
13.4.5 The sizestr Directive
13.4.6 The substr Directive
13.5 Conditional Assembly (Compile-Time Decisions)
13.6 Repetitive Assembly (Compile-Time Loops)
13.7 Macros (Compile-Time Procedures)
13.8 Standard Macros
13.9 Macro Parameters
13.9.1 Standard Macro Parameter Expansion
13.9.2 Optional and Required Macro Parameters
13.9.3 Default Macro Parameter Values
13.9.4 Macros with a Variable Number of Parameters
13.9.5 The Macro Expansion (&) Operator
13.10 Local Symbols in a Macro
13.11 The exitm Directive
13.12 MASM Macro Function Syntax
13.13 Macros as Compile-Time Procedures and Functions
13.14 Writing Compile-Time “Programs”
13.14.1 Constructing Data Tables at Compile Time
13.14.2 Unrolling Loops
13.15 Simulating HLL Procedure Calls
13.15.1 HLL-Like Calls with No Parameters
13.15.2 HLL-Like Calls with One Parameter
13.15.3 Using opattr to Determine Argument Types
13.15.4 HLL-Like Calls with a Fixed Number of Parameters
13.15.5 HLL-Like Calls with a Varying Parameter List
13.16 The invoke Macro
13.17 Advanced Macro Parameter Parsing
13.17.1 Checking for String Literal Constants
13.17.2 Checking for Real Constants
13.17.3 Checking for Registers
13.17.4 Compile-Time Arrays
13.18 Using Macros to Write Macros
13.19 Compile-Time Program Performance
13.20 For More Information
13.21 Test Yourself
Chapter 14: The String Instructions
14.1 The x86-64 String Instructions
14.1.1 The rep, repe, repz, and the repnz and repne Prefixes
14.1.2 The Direction Flag
14.1.3 The movs Instruction
14.1.4 The cmps Instruction
14.1.5 The scas Instruction
14.1.6 The stos Instruction
14.1.7 The lods Instruction
14.1.8 Building Complex String Functions from lods and stos
14.2 Performance of the x86-64 String Instructions
14.3 SIMD String Instructions
14.3.1 Packed Compare Operand Sizes
14.3.2 Type of Comparison
14.3.3 Result Polarity
14.3.4 Output Processing
14.3.5 Packed String Compare Lengths
14.3.6 Packed String Comparison Results
14.4 Alignment and Memory Management Unit Pages
14.5 For More Information
14.6 Test Yourself
Chapter 15: Managing Complex Projects
15.1 The include Directive
15.2 Ignoring Duplicate Include Operations
15.3 Assembly Units and External Directives
15.4 Header Files in MASM
15.5 The externdef Directive
15.6 Separate Compilation
15.7 An Introduction to Makefiles
15.7.1 Basic Makefile Syntax
15.7.2 Make Dependencies
15.7.3 Make Clean and Touch
15.8 The Microsoft Linker and Library Code
15.9 Object File and Library Impact on Program Size
15.10 For More Information
15.11 Test Yourself
Chapter 16: Stand-Alone Assembly Language Programs
16.1 Hello World, by Itself
16.2 Header Files and the Windows Interface
16.3 The Win32 API and the Windows ABI
16.4 Building a Stand-Alone Console Application
16.5 Building a Stand-Alone GUI Application
16.6 A Brief Look at the MessageBox Windows API Function
16.7 Windows File I/O
16.8 Windows Applications
16.9 For More Information
16.10 Test Yourself
Part III: Reference Material
Appendix A: ASCII Character Set
Appendix B: Glossary
Appendix C: Installing and Using Visual Studio
C.1 Installing Visual Studio Community
C.2 Creating a Command Line Prompt for MASM
C.3 Editing, Assembling, and Running a MASM Source File
Appendix D: The Windows Command Line Interpreter
D.1 Command Line Syntax
D.2 Directory Names and Drive Letters
D.3 Some Useful Built-in Commands
D.3.1 The cd and chdir Commands
D.3.2 The cls Command
D.3.3 The copy Command
D.3.4 The date Command
D.3.5 The del (erase) Command
D.3.6 The dir Command
D.3.7 The more Command
D.3.8 The move Command
D.3.9 The ren and rename Commands
D.3.10 The rd and rmdir Commands
D.3.11 The time Command
D.4 For More Information
Appendix E: Answers to Questions
E.1 Answers to Questions in Chapter 1
E.2 Answers to Questions in Chapter 2
E.3 Answers to Questions in Chapter 3
E.4 Answers to Questions in Chapter 4
E.5 Answers to Questions in Chapter 5
E.6 Answers to Questions in Chapter 6
E.7 Answers to Questions in Chapter 7
E.8 Answers to Questions in Chapter 8
E.9 Answers to Questions in Chapter 9
E.10 Answers to Questions in Chapter 10
E.11 Answers to Questions in Chapter 11
E.12 Answers to Questions in Chapter 12
E.13 Answers to Questions in Chapter 13
E.14 Answers to Questions in Chapter 14
E.15 Answers to Questions in Chapter 15
E.16 Answers to Questions in Chapter 16
Index