An Introduction to Optimization with Applications in Machine Learning and Data Analytics

The primary goal of this text is a practical one: equipping students with enough knowledge to serve as a platform for independent research and preparing them for professional careers. Providing that marketable skill set requires topics from many areas of optimization, and the text is aimed at mathematics majors as well as students of engineering, computer science, economics, statistics, and business.

Optimization reaches into many different fields, and this text provides a balance where one is needed: mathematical optimization books are often heavy on theory without enough applications, while texts aimed at business students are often strong on applications but weak on mathematics. The book attempts to overcome this imbalance for all students taking such a course. It contains many practical applications but also explains the mathematics behind the techniques, including stating definitions and proving theorems.

Optimization techniques were at the heart of the first spam filters, are used in self-driving cars, play a great role in machine learning, and can be applied in such places as determining a batting order for a Major League Baseball team. Optimization also has seemingly limitless other applications in business and industry. In short, knowledge of this subject offers an individual both a very marketable skill set for a wealth of jobs and useful tools for research in many academic disciplines.

Many of the problems rely on using a computer. Microsoft Excel is used most often, as it is common in business, but Python and other languages are considered as well. This permits experienced mathematics and engineering students to use MATLAB® or Mathematica, and computer science students to write their own programs in Java or Python.
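As a taste of the kind of problem treated in Part II, here is a minimal sketch (not taken from the book; the objective and constraint values are made-up example data) showing how a small linear program can be solved in a few lines of Python with SciPy's linprog:

```python
# Minimal illustrative linear program (example data, not from the book):
#   maximize 3x + 2y  subject to  x + y <= 4,  x + 3y <= 6,  x, y >= 0.
# SciPy's linprog minimizes, so we negate the objective coefficients.
from scipy.optimize import linprog

c = [-3, -2]            # negated objective: minimize -3x - 2y
A_ub = [[1, 1],         # x +  y <= 4
        [1, 3]]         # x + 3y <= 6
b_ub = [4, 6]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("optimal (x, y):", result.x)    # -> [4. 0.]
print("maximum value:", -result.fun)  # -> 12.0
```

The same model can be set up in Excel's Solver, whose Answer and Sensitivity Reports the book examines in Chapter 7.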

Author(s): Jeffrey Paul Wheeler
Publisher: CRC Press
Year: 2023

Language: English
Pages: 475

Cover
Half Title
Series Page
Title Page
Copyright Page
Contents
Acknowledgments
List of Figures
List of Tables
List of Algorithms
List of Notation
I. Preliminary Matters
1. Preamble
1.1. Introduction
1.2. Software
1.3. About This Book
1.3.1. Presentation
1.3.2. Contents
1.4. One-Semester Course Material
1.5. Acknowledgments
2. The Language of Optimization
2.1. Basic Terms Defined
2.2. When a Max or Min Is Not in the Set
2.3. Solving an Optimization Problem
2.4. Algorithms and Heuristics
2.5. Runtime of an Algorithm or a Heuristic
2.6. For Further Study
2.7. Keywords
2.8. Exercises
3. Computational Complexity
3.1. Arithmetic Complexity
3.2. Asymptotic Notation
3.3. Intractability
3.4. Complexity Classes
3.4.1. Introduction
3.4.2. Time, Space, and Big O Notation
3.4.3. The Complexity Class P
3.4.4. The Complexity Class NP
3.4.5. Utility of Complexity Classes
3.5. For Further Study
3.6. Keywords
3.7. Exercises
4. Algebra Review
4.1. Systems of Linear Inequalities in Two Variables – Geometric Solutions
4.1.1. Examples
4.2. Solving Systems of Linear Equations Using Linear Algebra
4.2.1. Gauss-Jordan Elimination
4.2.2. Gaussian Elimination Compared with Gauss-Jordan Elimination
4.3. Linear Algebra Basics
4.3.1. Matrices and Their Multiplication
4.3.2. Identity Matrices, Inverses, and Determinants of Matrices
4.3.3. Solving Systems of Linear Equations via Cramer’s Rule
4.3.4. Vector and Matrix Norms
4.3.5. Vector Spaces
4.4. Matrix Properties Important to Optimization
4.4.1. Eigenvalues
4.4.2. Unimodular Matrices
4.5. Keywords
4.6. Exercises
5. Matrix Factorization
5.1. LU Factorization
5.2. Cholesky Decomposition
5.3. Orthogonality
5.4. Orthonormal Matrices
5.5. The Gram-Schmidt Process
5.6. QR Factorization
5.7. Keywords
5.8. For Further Study
5.9. Exercises
II. Linear Programming
6. Linear Programming
6.1. A Geometric Approach to Linear Programming in Two Dimensions
6.1.1. Example
6.1.2. Summary
6.1.3. Keywords
6.2. The Simplex Method: Max LP Problems with Constraints of the Form ≤
6.2.1. Introduction
6.2.2. Slack Variables
6.2.3. The Method
6.2.4. Summary
6.2.5. Keywords
6.3. The Dual: Minimization with Problem Constraints of the Form ≥
6.3.1. How It Works
6.3.2. Why It Works
6.3.3. Keywords
6.4. The Big M Method: Max/Min LP Problems with Varying Constraints
6.4.1. Maximization Problems with the Big M Method
6.4.2. Minimization Problems with the Big M Method
6.5. Degeneracy and Cycling in the Simplex Method
6.6. Exercises
7. Sensitivity Analysis
7.1. Motivation
7.2. An Excel Example
7.2.1. Solver’s Answer Report
7.2.2. Solver’s Sensitivity Report
7.3. Exercises
8. Integer Linear Programming
8.1. Introduction
8.2. Dakin’s Branch and Bound
8.3. Gomory Cut-Planes
8.4. For Further Study
8.5. Exercises
III. Nonlinear (Geometric) Programming
9. Calculus Review
9.1. Derivatives and Continuity
9.2. Taylor Series for Functions of a Single Variable
9.3. Newton’s Method
9.4. Vectors
9.5. Partial Derivatives
9.6. The Taylor Series of a Function of Two Variables
9.7. Exercises
10. A Calculus Approach to Nonlinear Programming
10.1. Using Derivatives to Find Extrema of Functions of a Single Variable
10.2. Calculus and Extrema of Multivariable Functions
10.3. Exercises
11. Constrained Nonlinear Programming: Lagrange Multipliers and the KKT Conditions
11.1. Lagrange Multipliers
11.2. The KKT Conditions
11.3. Exercises
12. Optimization Involving Quadratic Forms
12.1. Quadratic Forms
12.2. Definite and Semidefinite Matrices and Optimization
12.3. The Role of Eigenvalues in Optimization
12.4. Keywords
12.5. Exercises
13. Iterative Methods
13.1. Newton’s Method for Optimization
13.1.1. Single-Variable Newton’s Method for Optimization
13.1.2. Multivariable Newton’s Method for Optimization
13.2. Steepest Descent (or Gradient Descent)
13.2.1. Generalized Reduced Gradient
13.3. Additional Geometric Programming Techniques
13.4. Exercises
14. Derivative-Free Methods
14.1. The Arithmetic Mean-Geometric Mean Inequality (AGM)
14.2. Weight Finding Algorithm for the AGM
14.3. The AGM, Newton’s Method, and Reduced Gradient Compared
14.4. Exercises
15. Search Algorithms
15.1. Evolutionary Algorithms
15.2. Ant Foraging Optimization
IV. Convexity and the Fundamental Theorem of Linear Programming
16. Important Sets for Optimization
16.1. Special Linear Combinations
16.2. Special Sets
16.3. Special Properties of Sets
16.4. Special Objects
16.5. Exercises
17. The Fundamental Theorem of Linear Programming
17.1. The Finite Basis Theorem
17.2. The Fundamental Theorem of Linear Programming
17.3. For Further Study
17.4. Exercises
18. Convex Functions
18.1. Convex Functions of a Single Variable
18.2. Concave Functions
18.3. Graphs of Convex and Concave Functions
18.4. Multivariable Convex Functions
18.5. Mathematical Results from Convexity
18.6. Exercises
19. Convex Optimization
19.1. Convex Optimization and Applications
19.2. Duality
19.3. Subgradient Descent
19.4. Exercises
V. Combinatorial Optimization
20. An Introduction to Combinatorics
20.1. Introduction
20.2. The Basic Tools of Counting
20.2.1. When We Add, Subtract, Multiply, or Divide
20.2.2. Permutations and Combinations
20.3. The Binomial Theorem and Binomial Coefficients
20.3.1. Pascal’s Triangle
20.3.2. Binomial Coefficients
20.3.3. The Binomial Theorem
20.3.4. Another Counting Argument
20.3.5. The Multinomial Theorem
20.4. Counting When Objects Are Indistinguishable
20.4.1. Permutations with Indistinguishable Objects
20.4.2. Summary of Basic Counting Techniques
20.5. The Pigeonhole Principle
20.6. Exercises
21. An Introduction to Graph Theory
21.1. Basic Definitions
21.2. Special Graphs
21.2.1. Empty Graphs and the Trivial Graph
21.2.2. Walks, Trails, Paths, and Cycles
21.2.3. Trees
21.2.4. Complete and Bipartite Graphs
21.3. Vertex and Edge Cuts
21.3.1. Graph Connectivity
21.3.2. Notation for Removing a Vertex or Edge from a Graph
21.3.3. Cut Vertices and Vertex Cuts
21.3.4. Edge Cuts and Bridges
21.4. Some Useful and Interesting Results
21.5. Exercises
22. Network Flows
22.1. Basic Definitions
22.2. Maximum Flows and Cuts
22.3. The Dinitz-Edmonds-Karp-Ford-Fulkerson Algorithm
22.4. Max Flow as a Linear Programming Problem
22.5. Application to a Major League Baseball Pennant Race
22.6. Exercises
23. Minimum-Weight Spanning Trees and Shortest Paths
23.1. Weighted Graphs and Spanning Trees
23.2. Minimum-Weight Spanning Trees
23.2.1. Kruskal’s Algorithm
23.2.2. Prim’s Method
23.2.3. Kruskal’s and Prim’s Compared
23.3. Shortest Paths
23.3.1. Dijkstra’s Algorithm
23.3.2. A Linear Programming Approach to Shortest Paths
23.4. Exercises
24. Network Modeling and the Transshipment Problem
24.1. Introduction of the Problem
24.2. The Guarantee of Integer Solutions in Network Flow Problems
24.3. Exercises
25. The Traveling Salesperson Problem
25.1. History of the Traveling Salesperson Problem
25.2. The Problem
25.3. Heuristic Solutions
25.3.1. Nearest Neighbor
25.3.2. Insertion Algorithms
25.3.3. The Geometric Heuristic
25.4. For Further Study
25.5. Exercises
VI. Optimization for Data Analytics and Machine Learning
26. Probability
26.1. Introduction
26.2. Set Theory
26.2.1. The Vocabulary of Sets and Sample Spaces
26.2.2. The Algebra of Sets
26.3. Foundations of Probability
26.3.1. Borel Sets
26.3.2. The Axioms and Basic Properties of Probability
26.4. Conditional Probability
26.4.1. Naive Probability
26.4.2. Conditional Probability
26.4.3. Bayes’ Theorem
26.4.4. Independence
26.5. Random Variables and Distributions
26.5.1. Random Variables
26.5.2. Probability Mass and Probability Density Functions
26.5.3. Some Discrete Random Variable Probability Distributions
26.6. Exercises
27. Regression Analysis via Least Squares
27.1. Introduction
27.2. Formulation
27.3. Linear Least Squares
27.3.1. Pseudo-Inverse
27.3.2. Brief Discussion of Probabilistic Interpretation
27.4. Regularized Linear Least Squares
28. Forecasting
28.1. Smoothing
28.1.1. Exponential Smoothing
28.1.2. Trends
28.1.3. Seasonality
28.2. Stationary Data and Differencing
28.2.1. Autocorrelation
28.3. ARIMA Models
28.3.1. Autoregressive Models
28.3.2. Moving Average Models
28.3.3. ARIMA Model Structure
28.4. Partial Autocorrelation
28.4.1. Goodness-of-Fit Metrics
28.5. Exercises
29. Introduction to Machine Learning
29.1. Introduction
29.2. Nearest Neighbors
29.3. Support Vector Machines
29.4. Neural Networks
29.4.1. Artificial Neural Networks
29.4.2. Exercises
A. Techniques of Proof
A.1. Introduction to Propositional Logic
A.2. Direct Proofs
A.3. Indirect Proofs
A.3.1. Proving the Contrapositive
A.3.2. Proof by Contradiction
A.4. The Principle of Mathematical Induction
A.5. Exercises
B. Useful Tools from Analysis and Topology
B.1. Definitions
B.2. Useful Theorems
Bibliography
Index