Despite using them every day, most software engineers know little about how programming languages are designed and implemented. For many, their only experience with that corner of computer science was a terrifying "compilers" class that they suffered through in undergrad and tried to blot from their memory as soon as they had scribbled their last NFA to DFA conversion on the final exam.
That fearsome reputation belies a field that is rich with useful techniques and not so difficult as some of its practitioners might have you believe. A better understanding of how programming languages are built will make you a stronger software engineer and teach you concepts and data structures you'll use the rest of your coding days. You might even have fun.
This book teaches you everything you need to know to implement a full-featured, efficient scripting language. You’ll learn both high-level concepts around parsing and semantics and gritty details like bytecode representation and garbage collection. Your brain will light up with new ideas, and your hands will get dirty and calloused.
Starting from main(), you will build a language that features rich syntax, dynamic typing, garbage collection, lexical scope, first-class functions, closures, classes, and inheritance. All packed into a few thousand lines of clean, fast code that you thoroughly understand because you wrote each one yourself.
Author(s): Robert Nystrom
Edition: 1
Publisher: Genever Benning
Year: 2021
Language: English
Pages: 584
Tags: Virtualization; Object-Oriented Programming; Computer Science; Compilers; Translators
COVER
CONTENTS
DEDICATION
ACKNOWLEDGEMENTS
I. WELCOME
01. Introduction
Why Learn This Stuff
Little Languages are everywhere
Languages are great exercise
One more reason
How the Book Is Organized
The code
Snippets
Asides
Challenges
Design notes
The First Interpreter
The Second Interpreter
CHALLENGES
DESIGN NOTE: WHATʼS IN A NAME?
02. A Map of the Territory
The Parts of a Language
Scanning
Parsing
Static analysis
Intermediate representations
Optimization
Code generation
Virtual machine
Runtime
Shortcuts and Alternate Routes
Single-pass compilers
Tree-walk interpreters
Tranpilers
Just-in-time compilation
Compilers and Interpreters
Our Journey
CHALLENGES
03. The Lox Language
Hello, Lox
A High-Level Language
Dynamic typing
Automatic memory management
Date Types
Expressions
Arithmetic
Comparison and equality
Logical operators
Precedence and grouping
Statements
Variables
Control Flow
Functions
Closures
Classes
Why might any language want to be object oriented?
Why is Lox object oriented?
Classes or prototypes
Classes in Lox
Instantiation and initialization
Inheritance
The Standard Library
CHALLENGES
DESIGN NOTE: EXPRESSIONS AND STATEMENTS
II. A TREE-WALK INTERPRETER
04. Scanning
The Interpreter Framework
Error handling
Lexemes and Tokens
Token type
Literal value
Location information
Regular Languages and Expressions
The Scanner Class
Recognizing Lexemes
Lexical errors
Operators
Longer Lexemes
String literals
Number literals
Reserved Word and Identifiers
CHALLENGES
DESIGN NOTE: IMPLICIT SEMICOLONS
05. Representing Code
Context-Free Grammars
Rules for grammars
Enhancing our notation
A Grammar for Lox expressions
Implementing Syntax Trees
Disoriented objects
Metaprogamming the trees
Working with Trees
The expression problem
The Visitor pattern
Visitors for expressions
A (Not Very) Pretty Printer
CHALLENGES
06. Parsing Expressions
Ambiguity and the Parsing Game
Recursive Descent Parsing
The parser class
Syntax Errors
Panic mode error recovery
Entering panic mode
Synchronizing a recursive descent parser
Wiring up the Parser
CHALLENGES
DESIGN NOTE: LOGIC VERSUS HISTORY
07. Evaluating Expressions
Representing Values
Evaluating Expressions
Evaluating literals
Evaluating parentheses
Evaluating unary expressions
Truthiness and falsiness
Evaluating binary operators
Runtime Errors
Detecting runtime errors
Hooking Up the Interpreter
Reporting runtime errors
Running the interpreter
CHALLENGES
DESIGN NOTE: STATIC AND DYNAMIC TYPING
08. Statements and State
Statements
Statement syntax trees
Parsing statements
Executing statements
Global Variables
Variable syntax
Parsing variables
Environments
Interpreting global variables
Assignment
Assignment syntax
Assignment semantics
Scope
Nesting and shadowing
Block syntax and semantics
CHALLENGES
DESIGN NOTE: IMPLICIT VARIABLE DECLARATI
09. Control Flow
Turing Machine Briefly
Conditional Execution
Logical Operators
While Loops
For Loops
Desugaring
CHALLENGES
DESIGN NOTE: SPOONFULS OF SYNTACTIC SUGA
10. Functions
Funtion Calls
Maximum argument counts
Interpreting function calls
Call type errors
Checking arity
Native Functions
Telling time
Function Declarations
Function Objects
Interpreting function declarations
Return Statements
Returning from calls
Local Functions and Closures
CHALLENGES
11. Resolving and Binding
Static Scope
Scopes and mutable environments
Persistent environments
Semantic Analysis
A variable resolution pass
A Resolver Class
Resolving blocks
Resolving variable declarations
Resolving variable expressions
Resolving assignment expressions
Resolving function declarations
Resolving the other syntax tree nodes
Interpreting Resolved Variables
Accessing a resolved variable
Assigning to a resolved variable
Running the resolver
Resolution Errors
Invalid return errors
CHALLENGES
12. Classes
OOP and Classes
Class Declarations
Creating Instances
Properties on Instances
Get expressions
Set expressions
Methods on Classes
This
Invalid uses of this
Constructors and Initializers
Invoking init() directly
Returning from init()
CHALLENGES
DESIGN NOTE: PROTOTYPES AND POWER
13. Inheritance
Superclasses and Subclasses
Inheriting Methods
Calling Superclass Methods
Syntax
Semantics
Invalid uses of super
Conclusion
CHALLENGES
III. A BYTECODE VIRTUAL MACHINE
15. Chunks of Bytecode
Bytecode?
Why not walk the AST?
Why not compile to native code?
What is bytecode?
Getting Started
Chunks of Instructions
A dynamic array of instructions
Disassembling Chunks
Constants
Representing values
Value arrays
Constant instructions
Line Information
Disassembling line information
CHALLENGES
DESIGN NOTE: TEST YOUR LANGUAGE
15. A Virtual Machine
An Instruction Execution Machine
Executing instructions
Execution tracing
A Value Stack Manipulator
The VM's Stack
Stack tracing
An Arithmetic Calculator
Binary operators
CHALLENGES
DESIGN NOTE: REGISTER-BASED BYTECODE
16. Scanning on Demand
Spinning up the Interpreter
Opening the compilation pipeline
The scanner scans
A Token at a Time
Scanning tokens
A Lexical Grammar for Lox
Whitespace
Comments
Literal tokens
Identifiers and keywords
Tries and state machines
CHALLENGES
17. Compiling Expressions
Single-Pass Compilation
Parsing Tokens
Handling syntax errors
Emitting Bytecode
Parsing Prefix Expressions
Parsers for tokens
Parentheses for grouping
Unary negation
Parsing Infix Expressions
A Pratt Parser
Parsing with precedence
Dumping Chunks
CHALLENGES
DESIGN NOTE: ITʼS JUST PARSING
18. Types of Values
Tagged Unions
Lox Values and C Values
Dynamically Typed Numbers
Unary negation and runtime errors
Binary arithmetic operators
Two New Types
Logical not and falsiness
Equality and comparison operators
CHALLENGES
19. Strings
Values and Objects
Struct Inheritance
Strings
Operations on Strings
Concatenation
Freeing Objects
CHALLENGES
DESIGN NOTE: STRING ENCODING
20. Hash Tables
An Array of Buckets
Load factor and wrapped keys
Collision Resolution
Seperate chaining
Open addressing
Hash Functions
Building a Hash Table
Hashing strings
Inserting entries
Allocating and resizing
Retrieving values
Deleting entries
Counting tombstones
String Interning
CHALLENGES
21. Global Variables
Statements
Print statements
Expression statements
Error synchronization
Variable Declarations
Reading Variables
Assignment
CHALLENGES
22. Local Variables
Representing Local Variables
Block Statements
Declaring Local Variables
Using Locals
Interpreting local variables
Another scope edge case
CHALLENGES
23. Jumping Back and Forth
If Statements
Else clauses
Logical Operators
Logical or operator
While Statements
For Statements
Initializer clause
Condition clause
Increment clause
CHALLENGES
DESIGN NOTE: CONSIDERING GOTO HARMFUL
24. Calls and Functions
Function Objects
Compiling to Function Objects
Creating function at compile time
Call Frames
Allocating local variables
Return addresses
The call stack
Function Declarations
A stack of compilers
Function parameters
Function Call
Binding arguments to parameters
Runtime error checking
Printing stack traces
Returning from functions
Return Statements
Native Functions
CHALLENGES
25. Closures
Closure Objects
Compiling to closure objects
Interpreting function declarations
Upvalues
Compiling upvalues
Flattening upvalues
Upvalue Objects
Upvalues in closures
Closed Upvalues
Values and variables
Closing upvalues
Tracking open upvalues
Closing upvalues at runtime
CHALLENGES
DESIGN NOTE: CLOSING OVER THE LOOP VARIA
26. Garbage Collection
Reachability
Mark-Sweep Garbage Collection
Collecting garbage
Debug logging
Marking the Roots
Less obvious roots
Tracing Object References
The tricolor abstraction
A worklist for gray objects
Processing gray objects
Sweeping Unused Objects
Weak references and the string pool
When to Collect
Latency and throughput
Self-adjusting heap
Garbage Collection Bugs
Adding to the constant table
Interning strings
Concatenating strings
CHALLENGES
DESIGN NOTE: GENERATIONAL COLLECTORS
27. Classes and Instances
Class Objects
Class Declarations
Instances of Classes
Get and Set Expressions
Interpreting getter and setter expressions
CHALLENGES
28. Methods and Initializers
Method Declarations
Representing methods
Compiling method declarations
Executing method declarations
Method References
Bound methods
Accessing methods
Calling methods
This
Misusing this
Instance Initializers
Involking initializers
Initializer return values
Incorrect returns in initializers
Optimized Invocations
Involking fields
CHALLENGES
DESIGN NOTE: NOVELTY BUDGET
29. Superclasses
Inheriting Methods
Executing inheritance
Invalid superclasses
Storing Superclasses
A superclass local variable
Super Calls
Executing super accesses
Faster super calls
A Complete Virtual Machine
CHALLENGES
30. Optimization
Measuring Performance
Benchmarks
Profiling
Faster Hash Table Probing
Slow key wrapping
NaN boxing
What is (and is not) a number?
Conditional support
Numbers
Nil, true and false
Objects
Value functions
Evaluating performance
Where to Next
CHALLENGES
BACKMATTER
A1. Appendix I
Syntax Grammar
Declarations
Statements
Expressions
Utility rules
Lexical Grammar
A2. Appendix II
Expressions
Assign expression
Binary expression
Call expression
Get expression
Grouping expression
Literal expression
Logical expression
Set expression
Super expression
This expression
Unary expression
Variable expression
Statements
Block statement
Class statement
Expression statement
Function statement
If statement
Print statement
Return statement
Variable statement
While statement