Learn how to build and use all parts of real-world compilers, including the frontend, optimization pipeline, and a new backend, by leveraging the power of LLVM core libraries.
Key Features
• Get to grips with using LLVM libraries effectively, step by step
• Understand LLVM compiler high-level design and apply the same principles to your own compiler
• Use compiler-based tools to improve the quality of code in C++ projects
Book Description
LLVM was built to bridge the gap between compiler textbooks and actual compiler development. It provides a modular codebase and advanced tools that help developers build compilers easily. This book provides a practical introduction to LLVM and gradually guides you through the more complex scenarios that arise when building and working with compilers.
You'll start by configuring, building, and installing LLVM libraries, tools, and external projects. Next, the book will introduce you to LLVM design and how it works in practice during each LLVM compiler stage: frontend, optimizer, and backend. Using a subset of a real programming language as an example, you will then learn how to develop a frontend and generate LLVM IR, hand it over to the optimization pipeline, and generate machine code from it. Later chapters will show you how to extend LLVM with a new pass and how instruction selection in LLVM works. You'll also focus on Just-in-Time compilation issues and the current state of JIT-compilation support that LLVM provides, before finally going on to understand how to develop a new backend for LLVM.
By the end of this LLVM book, you will have gained real-world experience in working with the LLVM compiler development framework with the help of hands-on examples and source code snippets.
What you will learn
• Configure, compile, and install the LLVM framework
• Understand how the LLVM source is organized
• Discover what you need to do to use LLVM in your own projects
• Explore how a compiler is structured, and implement a tiny compiler
• Generate LLVM IR for common source language constructs
• Set up an optimization pipeline and tailor it for your own needs
• Extend LLVM with transformation passes and clang tooling
• Add new machine instructions and a complete backend
Who this book is for
This book is for compiler developers, enthusiasts, and engineers who are new to LLVM and are interested in learning about the LLVM framework. It is also useful for C++ software engineers looking to use compiler-based tools for code analysis and improvement, as well as casual users of LLVM libraries who want to gain more knowledge of LLVM essentials. Intermediate-level experience with C++ programming is required to understand the concepts covered in this book.
Author(s): Kai Nacke
Edition: 1
Publisher: Packt Publishing
Year: 2021
Language: English
Pages: 392
City: Birmingham, UK
Tags: C++; Compilers; LLVM
Cover
Title Page
Copyright and Credits
Contributors
Table of Contents
Preface
Section 1 – The Basics of Compiler Construction with LLVM
Chapter 1: Installing LLVM
Getting the prerequisites ready
Ubuntu
Fedora and RedHat
FreeBSD
OS X
Windows
Configuring Git
Building with CMake
Cloning the repository
Creating a build directory
Generating the build system files
Customizing the build process
Variables defined by CMake
Variables defined by LLVM
Summary
Chapter 2: Touring the LLVM Source
Technical requirements
Contents of the LLVM mono repository
LLVM core libraries and additions
Compilers and tools
Runtime libraries
Layout of an LLVM project
Creating your own project using LLVM libraries
Creating the directory structure
Adding the CMake files
Adding the C++ source files
Compiling the tinylang application
Targeting a different CPU architecture
Summary
Chapter 3: Structure of a Compiler
Technical requirements
Building blocks of a compiler
An arithmetic expression language
Formalism for specifying the syntax of a programming language
How grammar helps the compiler writer
Lexical analysis
A handwritten lexer
Syntactical analysis
A handwritten parser
The abstract syntax tree
Semantic analysis
Generating code with the LLVM backend
Textual representation of the LLVM IR
Generating the IR from the AST
The missing pieces – the driver and the runtime library
Summary
Section 2 – From Source to Machine Code Generation
Chapter 4: Turning the Source File into an Abstract Syntax Tree
Technical requirements
Defining a real programming language
Creating the project layout
Managing source files and user messages
Structuring the lexer
Constructing a recursive descent parser
Generating a parser and lexer with bison and flex
Performing semantic analysis
Handling the scope of names
Using LLVM-style RTTI for the AST
Creating the semantic analyzer
Summary
Chapter 5: Basics of IR Code Generation
Technical requirements
Generating IR from the AST
Understanding the IR code
Knowing the load-and-store approach
Mapping the control flow to basic blocks
Using AST numbering to generate IR code in SSA form
Defining the data structure to hold values
Reading and writing values local to a basic block
Searching the predecessor blocks for a value
Optimizing the generated phi instructions
Sealing a block
Creating IR code for expressions
Emitting the IR code for a function
Controlling visibility with linkage and name mangling
Converting types from an AST description to LLVM types
Creating the LLVM IR function
Emitting the function body
Setting up the module and the driver
Wrapping everything in the code generator
Initializing the target machine class
Emitting assembler text and object code
Summary
Chapter 6: IR Generation for High-Level Language Constructs
Technical requirements
Working with arrays, structs, and pointers
Getting the application binary interface right
Creating IR code for classes and virtual functions
Implementing single inheritance
Extending single inheritance with interfaces
Adding support for multiple inheritance
Summary
Chapter 7: Advanced IR Generation
Technical requirements
Throwing and catching exceptions
Raising an exception
Catching an exception
Integrating the exception-handling code into the application
Generating metadata for type-based alias analysis
Understanding the need for additional metadata
Adding TBAA metadata to tinylang
Adding debug metadata
Understanding the general structure of debug metadata
Tracking variables and their values
Adding line numbers
Adding debug support to tinylang
Summary
Chapter 8: Optimizing IR
Technical requirements
Introducing the LLVM Pass manager
Implementing a Pass using the new Pass manager
Adding a Pass to the LLVM source tree
Adding a new Pass as a plugin
Adapting a Pass for use with the old Pass manager
Adding an optimization pipeline to your compiler
Creating an optimization pipeline with the new Pass manager
Extending the Pass pipeline
Summary
Section 3 – Taking LLVM to the Next Level
Chapter 9: Instruction Selection
Technical requirements
Understanding the LLVM target backend structure
Using MIR to test and debug the backend
How instruction selection works
Specifying the target description in the TableGen language
Instruction selection with the selection DAG
Fast instruction selection – FastISel
The new global instruction selection – GlobalISel
Supporting new machine instructions
Adding a new instruction to the assembler and code generation
Testing the new instruction
Summary
Chapter 10: JIT Compilation
Technical requirements
Getting an overview of LLVM's JIT implementation and use cases
Using JIT compilation for direct execution
Exploring the lli tool
Implementing our own JIT compiler with LLJIT
Building a JIT compiler class from scratch
Utilizing a JIT compiler for code evaluation
Identifying the language semantics
Summary
Chapter 11: Debugging Using LLVM Tools
Technical requirements
Instrumenting an application with sanitizers
Detecting memory access problems with the address sanitizer
Finding uninitialized memory access with the memory sanitizer
Pointing out data races with the thread sanitizer
Finding bugs with libFuzzer
Limitations and alternatives
Performance profiling with XRay
Checking the source with the Clang Static Analyzer
Adding a new checker to the Clang Static Analyzer
Creating your own Clang-based tool
Summary
Chapter 12: Create Your Own Backend
Technical requirements
Setting the stage for a new backend
Adding the new architecture to the Triple class
Extending the ELF file format definition in LLVM
Creating the target description
Implementing the top-level file of the target description
Adding the register definition
Defining the calling convention
Creating the scheduling model
Defining the instruction formats and the instruction information
Implementing the DAG instruction selection classes
Initializing the target machine
Adding the selection DAG implementation
Supporting target-specific operations
Configuring the target lowering
Generating assembler instructions
Emitting machine code
Adding support for disassembling
Piecing it all together
Summary
Other Books You May Enjoy
Index