A Guide to Applied Machine Learning for Biologists

This textbook is an introductory guide to applied machine learning, specifically for biology students. It familiarizes biology students with the basics of modern computer science and mathematics and emphasizes the real-world applications of these subjects. The chapters give an overview of computer systems and programming languages to establish a basic understanding of the important concepts in computer systems. Readers are introduced to machine learning and artificial intelligence in the field of bioinformatics, connecting these applications to systems biology, biological data analysis and predictions, and healthcare diagnosis and treatment. 

This book offers a necessary foundation for more advanced computer-based technologies used in biology, employing case studies, real-world issues, and various examples to guide the reader from the basic prerequisites to machine learning and its applications.

Author(s): Mohammad "Sufian" Badar
Publisher: Springer
Year: 2023

Language: English
Pages: 271
City: Cham

Foreword
Preface
Acknowledgments
Contents
Basics of Modern Computer Systems
1 Computer Generations
1.1 First Generation: Vacuum Tubes
1.2 Second Generation: Transistors
1.3 Third Generation: Integrated Circuits
1.4 Fourth Generation: Microprocessors
2 Attributes of a Modern-Day Computer
2.1 Speed and Accuracy
2.2 Diligence
2.3 Storage Capability and Versatility
3 Computer Hardware Basics
3.1 Central Processing Unit
3.2 Instruction Cycle
3.3 Functional Units
3.3.1 Input Unit
3.3.2 Memory Unit
3.4 Memory Hierarchy in Computer Systems (Fig. 1)
3.4.1 Arithmetic and Logic Unit
3.4.2 Output Unit
3.4.3 Control Unit
3.5 Hardware Connections
3.6 Computer Networking
3.6.1 Local Area Network (LAN)
3.6.2 Metropolitan Area Network (MAN)
3.6.3 Campus Area Network (CAN)
3.6.4 Wide Area Network (WAN)
3.6.5 Wireless Local Area Network (WLAN)
3.6.6 Intranet: A Secure Internet-Like Network for Organizations
3.6.7 Extranet: A Secure Means for Sharing Information with Partners
3.6.8 Networking Relationship
3.7 Topologies
4 Operating Systems
4.1 Services and Facilities
4.2 Modes of CPU Operation
4.2.1 User Mode
4.2.2 Kernel Mode
4.3 Structure of an OS
4.4 Input/Output Services
4.5 Process Control Management
4.6 Memory Management
4.7 Scheduling and Dispatch
4.8 Secondary Storage Management
4.9 Network and Communications Support Services
4.10 Security and Protection Services
5 Files and Directories
5.1 File Management
5.2 File Access Method
5.2.1 Sequential Access
5.2.2 Direct/Relative Access
5.3 Directories
5.4 Operations Performed on a Directory
5.5 File Sharing
5.6 File Protection
6 Programs and Shells
6.1 Programs
6.2 Shells
7 Programming Languages
7.1 Types
7.2 Compiler vs Interpreter
8 Troubleshooting Computer Problems
8.1 Resources
8.1.1 Google
8.1.2 Webopedia
References
The Python Programming Language
1 Short History
2 The Python Interpreter
3 Basic Syntax
3.1 Python Identifiers
3.2 Variables
3.3 Keywords
3.4 Indentation
3.5 Basic Operators
3.5.1 Arithmetic Operators
3.5.2 Assignment Operators
3.5.3 Comparison/Relational Operators
3.5.4 Logical Operators
3.5.5 Identity Operators
3.5.6 Membership Operators
3.5.7 Bitwise Operators
3.6 Comments
3.6.1 Single Line Comment
3.6.2 Multiline Comments
3.7 Data Types
3.7.1 None
3.7.2 Boolean
3.7.3 Integers
3.7.4 Float
3.7.5 String
3.7.6 List
3.7.7 Tuple
3.7.8 Dictionary
3.7.9 Sets
3.8 Type Casting
3.8.1 int()
3.8.2 float()
3.8.3 str()
3.9 Escape Characters
3.9.1 \n (Prints a newline)
3.9.2 \\ (Prints the backslash character itself)
3.9.3 \" (Prints quotes, so that the quotes do not signal the end of the string)
3.9.4 \t (Prints a tab space)
3.10 Condition Statements
3.10.1 If Statement
3.10.2 For Loop
3.10.3 While Loop
3.10.4 Break
3.10.5 Continue
3.11 Try, Except
3.12 Suites
3.13 Functions
3.14 Scope of Variables
4 Working with Popular Libraries (Modules)
4.1 Modules
4.2 Packages
4.3 NumPy
4.4 Pandas
4.5 Matplotlib
4.6 Scikit-Learn
4.7 TensorFlow
4.8 Keras
4.9 PyTorch
4.10 Pattern
5 Advanced Concepts
5.1 Decorators
5.2 Lambda
5.3 Iterators and Generators
5.4 Classes
6 Optimization
6.1 SciPy
6.2 Local Search with SciPy
6.3 Global Search with SciPy
References
Basic Mathematics
1 Overview
2 Linear Algebra Basics
2.1 Scalars
2.2 Tensors
2.3 Vectors
2.3.1 Geometric Vectors
2.3.2 Polynomials
2.4 Systems of Linear Equations
2.4.1 Algebraic Method
2.4.1.1 Substitution Method
2.4.1.2 Elimination Method
2.4.1.3 Graphical Method
2.5 Matrices
2.6 Determinants
2.7 Slope-Intercept Equation
2.8 Secant and Tangent Line
2.9 Nonlinear Function
3 Calculus Basics
3.1 Exponents
3.2 Logarithms
3.3 Functions
3.3.1 Interval Notation
3.3.2 Inverse Functions
3.3.3 Composition of Functions
3.4 Trigonometry
3.4.1 Trigonometric Formulae
3.4.2 Trigonometric Angles
3.5 Limits
3.5.1 Indeterminate Forms
3.5.2 Limits of Rational Functions as x → a
3.5.3 Limits of Square Roots as x → a
3.5.4 Limits of Rational Functions as x → ∞
3.6 Differential Calculus
3.6.1 Continuity
3.6.2 Derivatives
3.6.3 Notations
3.6.4 Differentiation Rules
3.7 Critical Points
3.8 Extreme Value Theorem
3.9 Partial Derivatives
3.10 Linearity of Differentiation
3.11 Integral Calculus
3.11.1 Integration
3.11.2 Indefinite Integral
3.11.3 Definite Integral
3.11.4 Area Estimation
3.12 Mean Value Theorem
4 Probability Basics
4.1 Sample Space
4.2 Event Space
4.3 Probability
4.4 Random Variables
4.5 Probability Distributions
4.6 Probability Mass Functions
4.7 Probability Density Functions
4.8 Marginal Probability
4.9 Conditional Probability
4.9.1 Chain/Product Rule
4.10 Expectation
4.11 Variance
4.12 Covariance
4.13 Correlation
4.14 Bayes' Rule
5 General Applications of Linear Algebra, Calculus, and Probability
5.1 Linear Algebra
5.1.1 Computer Graphics
5.1.2 Network Flow
5.1.3 Recommender Systems
5.1.4 Natural Language Processing and Word Embedding
5.1.5 Error Correcting Codes
5.1.6 Facial Recognition
5.1.7 Signal Analysis
5.1.8 Quantum Computing
5.2 Calculus
5.2.1 Industrial Construction
5.2.2 Space Flight and Research
5.2.3 Bacterial Growth
5.2.4 Graphics
5.2.5 Chemistry
5.2.6 Shipbuilding
5.2.7 Price Elasticity
5.2.8 Astronomy
5.2.9 Cardiac Output: Blood Flow
5.2.10 Cancer: Monitor Tumor
5.2.11 Calculating Weather Patterns
5.3 Probability
5.3.1 Weather Planning
5.3.2 Insurance
5.3.3 Games and Game Theory
5.3.4 Elections
5.3.5 Winning Lottery
5.3.6 Rate of Accidents
5.3.7 Typing on Smart Devices
5.3.8 Traffic Signals
5.3.9 Sports Betting
5.3.10 Sales Forecasting
5.3.11 Natural Disasters
5.3.12 Investments
References
Introduction to the World of Bioinformatics
1 Laying Foundation
2 A Brief History of Bioinformatics
3 Goals of Bioinformatics
4 Genome, Genes, and Sequence
4.1 Gene
4.1.1 Gene Prediction
4.1.1.1 Similarity-Based Approaches
4.1.1.2 Ab Initio-Based Approaches
4.2 Genome
4.2.1 Functional Genomics
4.2.2 Structural Genomics
4.2.2.1 Genome Mapping
4.2.3 Comparative Genomics
4.3 Sequence
4.3.1 Sequence Homology
4.3.2 Sequence Alignment
4.3.3 Pairwise Sequence Alignment
4.3.4 Multiple Sequence Alignment
4.3.5 Algorithm for MSA
5 Protein and Structure
5.1 Linear Protein Structure
6 Secondary Structure
6.1 α-Helix
6.2 β-Sheets
6.3 Secondary Structure Prediction
7 Tertiary Structure
7.1 Prediction Methods for Tertiary Structure of Proteins
7.1.1 Comparative Modeling
7.1.2 Ab Initio
7.1.3 Fold Recognition
7.1.4 Machine Learning Approach
8 Databases
8.1 Primary Database
8.2 Secondary Database
8.3 Composite Database
9 Bioinformatics Tools
10 Conclusion
References
Introduction to Artificial Intelligence & ML
1 Machine Learning
1.1 A History of Machine Learning
1.1.1 Laying the Foundation
1.1.2 From Theory to Reality
1.2 Why Machine Learning
1.3 Machine Learning Approaches
1.3.1 Supervised Learning
1.3.2 Unsupervised Learning
1.3.3 Reinforcement Learning
1.4 Machine Learning Applications
1.4.1 Product Recommendation
1.4.2 Image Recognition
1.4.3 Sentiment Analysis
2 Artificial Intelligence
2.1 What Is Artificial Intelligence
2.2 Basic Principles
2.2.1 AI Ethics and Responsible AI
2.3 General Applications of Artificial Intelligence
2.3.1 Voice Assistants
2.3.2 Self-Driving Cars
2.3.3 Healthcare and Biology
3 Conclusion
References
Fundamentals of Machine Learning
1 Introduction
2 Supervised Learning
2.1 Linear Regression
2.2 Polynomial Regression
2.3 Non-linear Regression
2.4 Bayesian Regression
2.5 Logistic Regression
3 Unsupervised Learning
3.1 K-Means Clustering
4 Brief Introduction to Semi-Supervised and Reinforcement Learning
4.1 Semi-Supervised Learning Approach
4.2 Reinforcement Learning Approach
5 Deep Learning
5.1 Artificial Neural Networks
5.1.1 Artificial Neuron
5.1.2 Structure of a Neural Network
5.1.3 Forward Propagation
5.1.4 Backpropagation
5.1.5 Overview of Feedforward Neural Networks
5.2 Convolutional Neural Networks
5.2.1 Convolution Operation
5.2.2 Pooling Operation
5.2.3 Structure of the Network
5.2.4 Overview of Convolutional Neural Networks
6 Training and Evaluating Models
6.1 Evaluation Metrics
6.2 Train Test Split
References
Applications in the Field of Bioinformatics
1 Introduction
2 Application of AI in Systems Biology
2.1 Why Is It Important?
2.2 Intelligent Vaccine Design
2.2.1 The Collaboration of AI and Systems Biology for Vaccine Design and Development
2.3 Approaches Used by AI to Design Intelligent Vaccine
2.3.1 Knowledge Discovery Approach
2.3.2 Epitope and Agent Prediction Approach
2.3.3 Agent-Based Model Approach
3 Application of AI in Biological Data Analysis and Predictions
3.1 Biomedical Imaging
3.1.1 ML for Feature Analysis
3.1.2 Deep Learning in Biomedical Imaging
3.2 Prediction of COVID-19 Pneumonia
3.2.1 Application in Early Detection
3.2.2 Treatment Monitoring
3.2.3 Application in Medical Image Analysis
3.2.4 Cases and Mortality Prediction
4 Application of AI in Healthcare Diagnosis and Treatment
4.1 Disease Diagnostics and Prediction
4.1.1 Futuristic Biosensors with AI for Cardiac Healthcare
4.1.2 Role of ML
4.1.3 Procedure to Determine Results from the CVD Databases for Disease Diagnosis
4.1.3.1 Data Pre-processing
4.1.3.2 Feature Extraction
4.1.3.3 Feature Selection
4.1.3.4 Learning Method
References
Future Prospects
1 Automotive Industry
1.1 Predictive Maintenance
1.2 Quality Control
1.3 Root Cause Analysis
1.4 Supply Chain Optimization
2 Aviation
2.1 Recommendation Engine
2.2 Chat Bots
2.3 Baggage Screening Passenger
2.4 AI Thermal Cameras/AI-Based Video Analytics
2.5 Autonomous Taxiing, Takeoff, and Landing
2.6 Automatic Dependent Surveillance Broadcast (ADS-B)
2.7 Revenue Management
3 Maritime Logistics
3.1 Imagery
3.2 Active Acoustics and Passive Acoustics
3.3 Other Data Types
3.4 Electronic Monitoring
4 Software Engineering
4.1 Bug and Error Identification
4.2 Strategic Decision-Making
4.3 Testing Tools
4.4 Rapid Prototype
4.5 Code Review
4.6 Smart and Intelligent Assistants
4.7 Accurate and Precise Estimates
5 Marketing and Retail
5.1 Recommendation Systems
5.2 Forecast Targeting
5.3 Churn Rate Forecasting
5.4 Choice Modeling
5.5 Product Matching
5.6 Predicting Customer Behavior
5.7 Retail Stocking and Inventory
5.7.1 Predicting Inventory
5.8 e-Commerce
5.8.1 Tagging and Copywriting Automation
5.8.1.1 Image Retouching
6 Manufacturing
6.1 Predictive Maintenance
6.2 Predictive Quality and Yield
6.3 Digital Twin
6.4 Generative Design and Smart Manufacturing
6.5 Energy Consumption Forecasting
6.6 Manufacturing Ergonomics
6.6.1 Operator Model
6.6.2 Operator and Workspace Interaction Model
6.6.3 System Design and Optimization Model
6.7 Fault Detection
7 Cybersecurity
7.1 Quantum Computing
7.2 Cloud Computing
7.3 Predictive Semantics
7.4 Behavioral Identity
7.5 Dynamic Networks
8 Healthcare
8.1 Clinical Decision Support Systems
8.2 Smart Recordkeeping
8.3 Medical Imaging
8.4 Personalized Medicine
8.5 Predictive Adjustments to Treatment
8.6 Elderly and Low-Mobility Group Care
8.7 Robotic Process Automation
8.8 Drug Discovery and Production
8.9 Clinical Research
8.10 Infectious Disease Outbreak Prediction
8.11 Administration
8.12 Prescription Error
9 Agriculture
9.1 Pre-harvesting
9.1.1 Soil
9.1.2 Seeds
9.1.3 Pesticide and Disease Detection
9.1.4 Surveillance
9.2 Crop Management
9.2.1 Yield Prediction
9.2.2 Crop Quality
9.2.3 Species Recognition
9.3 Harvesting
9.4 Post-harvesting
9.5 Livestock Management
9.6 Water Management
References
Case Study 1: Human Emotion Detection
1 Introduction
2 Google Colaboratory
2.1 Click on Runtime
2.2 Click on Change runtime type
2.3 Select GPU from the drop-down menu and click Save
3 Model Implementation
3.1 Deep Learning
3.2 Dataset
3.3 Convolutional Neural Network
3.4 Data Generators
3.5 Hyper-Parameter Description
3.6 Model Definition
3.7 Model Compilation and Training
3.8 Model Evaluation
3.9 OpenCV
4 Output and Conclusion
5 Credits
Case Study 2: Brain Tumor Classification
1 Introduction
2 Kaggle Kernel
3 Dataset
4 Model Implementation
4.1 Data Generators
4.2 ResNet Model
4.3 CNN Model
4.4 Train the Model
5 Output and Conclusion
6 Credits
Further Reading
Index