Introduction to Machine Learning with Applications in Information Security

Introduction to Machine Learning with Applications in Information Security, Second Edition provides a classroom-tested introduction to a wide variety of machine learning and deep learning algorithms and techniques, reinforced via realistic applications. The book is accessible and doesn't prove theorems or dwell on mathematical theory. The goal is to present topics at an intuitive level, with just enough detail to clarify the underlying concepts.

The book covers core classic machine learning topics in depth, including Hidden Markov Models (HMM), Support Vector Machines (SVM), and clustering. Additional machine learning topics include k-Nearest Neighbors (k-NN), boosting, Random Forests, and Linear Discriminant Analysis (LDA). The fundamental deep learning topics of backpropagation, Convolutional Neural Networks (CNN), Multilayer Perceptrons (MLP), and Recurrent Neural Networks (RNN) are covered in depth. A broad range of advanced deep learning architectures is also presented, including Long Short-Term Memory (LSTM), Generative Adversarial Networks (GAN), Extreme Learning Machines (ELM), Residual Networks (ResNet), Deep Belief Networks (DBN), Bidirectional Encoder Representations from Transformers (BERT), and Word2Vec. Finally, several cutting-edge deep learning topics are discussed, including dropout regularization, attention, explainability, and adversarial attacks.

Most of the examples in the book are drawn from the field of information security, and many of the machine learning and deep learning applications focus on malware. The applications presented serve to demystify the topics by illustrating the use of various learning techniques in straightforward scenarios. Some of the exercises require programming, and elementary computing concepts are assumed in a few of the application sections, but anyone with a modest amount of computing experience should have no trouble with this aspect of the book.

Author(s): Mark Stamp
Series: Chapman & Hall/CRC Machine Learning & Pattern Recognition
Edition: 2
Publisher: CRC Press/Chapman & Hall
Year: 2022

Language: English
Pages: 497
City: Boca Raton
Tags: Information Security; Machine Learning; Deep Learning; Deep Belief Networks; Boltzmann Machines; Generative Adversarial Networks; Clustering; Principal Component Analysis; Support Vector Machines; Transfer Learning; Cryptography; Spam Detection; Malware Analysis; Long Short-Term Memory; Markov Models; Backpropagation

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
Preface
About the Author
Acknowledgments
1. What is Machine Learning?
1.1. Introduction
1.2. About This Book
1.3. Necessary Background
1.4. A Note on Terminology
1.5. A Few Too Many Notes
I. Classic Machine Learning
2. A Revealing Introduction to Hidden Markov Models
2.1. Introduction and Background
2.2. Tree Rings and Temperature
2.3. Notation
2.4. The Three Problems
2.5. The Three Solutions
2.5.1. Scoring
2.5.2. Uncovering Hidden States
2.5.3. Training
2.6. Dynamic Programming
2.7. HMM Scaling
2.8. All Together Now
2.9. English Text Example
2.10. The Bottom Line
2.11. Problems
3. Principles of Principal Component Analysis
3.1. Introduction
3.2. Background
3.2.1. A Brief Review of Linear Algebra
3.2.2. Geometric View of Eigenvectors
3.2.3. Covariance Matrix
3.3. Principal Component Analysis
3.4. SVD Basics
3.5. All Together Now
3.5.1. Training Phase
3.5.2. Scoring Phase
3.6. A Numerical Example
3.7. The Bottom Line
3.8. Problems
4. A Reassuring Introduction to Support Vector Machines
4.1. Introduction
4.2. Constrained Optimization
4.2.1. Lagrange Multipliers
4.2.2. Lagrangian Duality
4.3. A Closer Look at SVM
4.3.1. Training and Scoring
4.3.2. Scoring Revisited
4.3.3. Support Vectors
4.3.4. Training and Scoring Re-revisited
4.3.5. The Kernel Trick
4.4. All Together Now
4.5. A Note on Quadratic Programming
4.6. The Bottom Line
4.7. Problems
5. A Comprehensible Collection of Clustering Concepts
5.1. Introduction
5.2. Overview and Background
5.3. K-Means
5.4. Measuring Cluster Quality
5.4.1. Internal Validation
5.4.2. External Validation
5.4.3. Visualizing Clusters
5.5. EM Clustering
5.5.1. Maximum Likelihood Estimator
5.5.2. An Elementary EM Example
5.5.3. EM Algorithm
5.5.4. Gaussian Mixture Example
5.6. The Bottom Line
5.7. Problems
6. Many Mini Topics
6.1. Introduction
6.2. k-Nearest Neighbors
6.3. Boost Your Knowledge of Boosting
6.3.1. Football Analogy
6.3.2. AdaBoost
6.3.3. Examples
6.4. Random Forest
6.5. Linear Discriminant Analysis
6.5.1. LDA Training
6.5.2. Numerical Example
6.6. The Bottom Line
6.7. Problems
II. Deep Learning
7. Deep Thoughts on Deep Learning
7.1. Introduction
7.2. A Brief History of Neural Networks
7.2.1. McCulloch-Pitts Neuron
7.2.2. Perceptron
7.2.3. Multilayer Perceptron
7.2.4. AI Winters and AI Summers
7.3. Why Deep Learning?
7.4. Decisions, Decisions
7.5. Basic Deep Learning Architectures
7.5.1. Feedforward Neural Networks
7.5.2. Convolutional Neural Networks
7.5.3. Recurrent Neural Networks
7.6. The Bottom Line
7.7. Problems
8. Onward to Backpropagation
8.1. Introduction
8.2. Automatic Differentiation
8.3. Backpropagation Example
8.3.1. Gradient Descent
8.3.2. MLP Example
8.4. Backpropagation Through Time
8.4.1. Vanishing and Exploding Gradients
8.4.2. Mitigating Gradient Issues
8.5. The Bottom Line
8.6. Problems
9. A Deeper Dive into Deep Learning
9.1. Introduction
9.2. Long Short-Term Memory
9.3. Gated Recurrent Unit
9.4. Generative Adversarial Networks
9.4.1. Generative and Discriminative Models
9.4.2. GAN Basics
9.4.3. GAN Training
9.5. Extreme Learning Machines
9.6. Residual Networks
9.7. Boltzmann Machines
9.7.1. Restricted Boltzmann Machine
9.7.2. Deep Belief Networks
9.7.3. Contrastive Divergence
9.8. Graph Neural Networks
9.9. Transfer Learning
9.10. The Bottom Line
9.11. Problems
10. Alphabet Soup of Deep Learning Topics
10.1. Introduction
10.2. Word Embedding Techniques
10.2.1. TF-IDF
10.2.2. HMM2Vec and PCA2Vec
10.2.3. Word2Vec
10.2.4. BERT
10.3. Multipart Methods
10.3.1. Ensembles
10.3.2. Combination Architectures
10.4. Overfitting
10.4.1. Regularization
10.4.2. Dropout
10.5. Attention
10.6. Explainability
10.7. Adversarial Attacks
10.8. The Bottom Line
10.9. Problems
III. Applications
11. HMMs for Classic Cryptanalysis
11.1. Introduction
11.2. Simple Substitutions
11.2.1. Jakobsen’s Algorithm
11.2.2. HMMs and Simple Substitutions
11.3. Homophonic Substitutions
11.4. Vigenère Cipher
11.4.1. Vigenère Cipher Example
11.4.2. Friedman Test
11.4.3. Experimental Results
11.5. Conclusion and Future Work
12. Image Spam Detection
12.1. Introduction
12.2. Eigenfaces
12.3. Eigenspam
12.3.1. PCA Experiments
12.3.2. Detection Results
12.4. SVM for Image Spam Detection
12.4.1. SVM Experiments
12.4.2. Improved Dataset
12.5. Conclusion and Future Work
13. Image-Based Malware Analysis
13.1. Introduction
13.2. Background
13.2.1. Transfer Learning Architectures
13.2.2. Dataset
13.3. Deep Learning Experiments and Results
13.3.1. MLP
13.3.2. CNN
13.3.3. RNN
13.3.4. Transfer Learning
13.3.5. Discussion
13.4. Conclusions and Future Work
14. Malware Evolution Detection
14.1. Introduction
14.2. Related Work
14.3. Design and Implementation
14.3.1. Dataset
14.3.2. Feature Extraction
14.3.3. Experimental Design
14.4. SVM Experimental Results
14.4.1. Juxtaposed Malware Families
14.4.2. Zbot Experiments
14.5. Additional Experiments
14.6. Conclusions and Future Work
IV. Extras
15. Experimental Design and Analysis
15.1. Introduction
15.2. Experimental Design
15.3. Accuracy
15.4. ROC Curves
15.5. Imbalance Problem
15.6. PR Curves
15.7. Accuracy, Loss, Overfitting, and Underfitting
15.8. The Bottom Line
15.9. Problems
16. Epilogue
16.1. Introduction
16.2. Summarizing Proust
16.3. The Goldilocks Principle
16.4. Machine Learning and Science Fiction
References
Index