Machine Learning and Deep Learning in Computational Toxicology

This book collects machine learning and deep learning algorithms, methods, architectures, and software tools that have been developed and widely applied in predictive toxicology. It compiles recent applications of state-of-the-art machine learning and deep learning techniques to the analysis of a variety of toxicological endpoint data. The chapters, contributed by leading experts, describe these algorithms, methods, and software tools and summarise their applications in predictive toxicology with informative text, figures, and tables. A major feature is the set of case studies from toxicological research, which show readers how to apply machine learning and deep learning techniques in predictive toxicology. The book is intended as a reference for practical applications of machine learning and deep learning in toxicological research and as a guide for toxicologists, chemists, drug discovery and development researchers, regulatory scientists, government reviewers, and graduate students. Readers will gain an understanding of the widely used machine learning and deep learning techniques and practical procedures for applying them in predictive toxicology.

Editor: Huixiao Hong
Series: Computational Methods in Engineering & the Sciences
Publisher: Springer
Year: 2023

Language: English
Pages: 653
City: Cham

Preface
Contents
Editor and Contributors
1 Machine Learning and Deep Learning Promote Computational Toxicology for Risk Assessment of Chemicals
1.1 Risk Assessment of Chemicals
1.2 Computational Toxicology
1.3 Machine Learning in Computational Toxicology
1.4 Deep Learning in Toxicology
1.5 Perspectives
References
Part I Machine Learning and Deep Learning Methods for Computational Toxicology
2 Assessment of the Xenobiotics Toxicity Taking into Account Their Metabolism
2.1 Introduction
2.2 Computational Methods of Studying Metabolism
2.2.1 Databases Containing Xenobiotic Metabolism Information
2.2.2 Descriptors/Notation Used for Metabolism Prediction
2.2.3 Prediction of Biotransformation Sites
2.2.4 Generation of the Structures of Probable Metabolites
2.2.5 Reactive Metabolite Formation Prediction
2.3 Integral Computational Assessment of Xenobiotic Toxicity
2.4 Future Directions in Xenobiotic Toxicity Assessment
References
3 Emerging Machine Learning Techniques in Predicting Adverse Drug Reactions
3.1 Introduction
3.2 Feature Generation for Machine Learning
3.2.1 Structure-Based Features
3.2.2 Interactions and Associations
3.2.3 Data Sources for Feature Generation
3.3 Conventional Methods for ADR Prediction
3.4 Emerging Methods for ADR Prediction
3.4.1 Molecule-Based Methods
3.4.2 Similarity-Based Methods
3.4.3 Network- and Graph-Based Methods
3.5 ADR Prediction Future Directions
References
4 Drug Effect Deep Learner Based on Graphical Convolutional Network
4.1 Introduction
4.2 Results
4.2.1 Gene Vector: Generation and Evaluation
4.2.2 Molecular Feature and Vector Generation
4.2.3 Cell Vector: Generation and Evaluation
4.2.4 Deep Drug Effect Predictor: Training and Validation
4.2.5 Application of DDEP to Predict the Effects of Anti-cancer Drugs Against Breast Adenocarcinoma
4.2.6 Insights into Drug Classification
4.3 Discussion
4.4 Methods
4.4.1 Capture Contextual Information of Genes from Their Interaction Networks
4.4.2 Generating Gene Vectors and Cell Vectors
4.4.3 GCN-Based Pre-models
4.4.4 Deep Drug Effect Predictor
References
5 AOP-Based Machine Learning for Toxicity Prediction
5.1 Introduction
5.2 Research Status and Existing Problems for ML
5.3 General Overview of AOP
5.3.1 The Generation of AOP
5.3.2 The Framework of AOP
5.3.3 Qualitative AOP and Quantitative AOP
5.4 Research Progress of Toxicity Prediction by AOP and ML
5.5 Perspectives and Future Prospects of AOP
References
6 Graph Kernel Learning for Predictive Toxicity Models
6.1 Introduction
6.2 A Brief Introduction of Graph Concepts
6.2.1 Graph Theory Definitions
6.2.2 Graph Kernels Fundamentals
6.3 Graph Kernel Learning for Molecular Representations
6.4 Applications of GKL Methods on Chemical Toxicity
6.4.1 Benchmark Data Sets and Methods About Chemical Toxicity
6.4.2 Applications of Graph Kernel-Based Methods
6.4.3 Applications of Graph Neural Networks
6.4.4 Applications of Learnable Graph Embeddings
6.4.5 Applications of Learnable Graph Kernels
6.5 Challenges and Perspectives of Graph Kernel Learning on Toxicity-Related Problems
6.6 Conclusion
References
7 Optimize and Strengthen Machine Learning Models Based on in Vitro Assays with Mechanistic Knowledge and Real-World Data
7.1 Introduction
7.2 Incorporating AOPs to Construct Parsimonious Machine Learning Models
7.2.1 AOPs and AOP Networks
7.2.2 Using AOPs to Facilitate Building Parsimonious Machine Learning Models
7.3 Utilize Spontaneous Reporting Databases to Corroborate Findings of Machine Learning Models
7.3.1 Statistical Methods for Safety Signal Mining Using Spontaneous Reporting Databases
7.3.2 Obtain Data from FAERS
7.3.3 Poisson Regression Model for Report Counts
7.3.4 Incorporating Host Factors in Testing
7.3.5 Utilize FAERS Data to Corroborate Models Based on in Vitro Assays
7.4 Conclusions
References
8 Multitask Learning for Quantitative Structure–Activity Relationships: A Tutorial
8.1 Introduction
8.2 QSAR and Multitask Learning
8.2.1 Definition of MTL Problem
8.2.2 Task Relatedness
8.2.3 Multitask Neural Networks
8.2.4 Performance Evaluation
8.3 Case Study: NURA Dataset
8.4 Hands-On Tutorial
8.4.1 Getting Started
8.5 Conclusions
References
Part II Tools and Approaches Facilitating Machine Learning and Deep Learning Methods in Computational Toxicology
9 Isalos Predictive Analytics Platform: Cheminformatics, Nanoinformatics, and Data Mining Applications
9.1 Introduction
9.2 Isalos Platform
9.3 Data Input
9.4 Data Transformation
9.4.1 Normalizers
9.4.2 Data Manipulation
9.4.3 Dataset Splitting
9.5 Analytics
9.5.1 Modelling Methodologies
9.5.2 Feature Selection
9.5.3 Existing Model Utilization
9.6 Statistics
9.6.1 Domain—APD
9.6.2 Model Metrics
9.7 Development of Predictive Models with Isalos
9.7.1 Ecotox Models
9.7.2 Molecular, Size, and Surface-Based Safe by Design (MS3bD, MSzeta) Model
9.7.3 Cell Viability Model
9.8 Conclusions
References
10 ED Profiler: Machine Learning Tool for Screening Potential Endocrine-Disrupting Chemicals
10.1 Introduction
10.2 Materials and Methods
10.2.1 Data Sets
10.2.2 Molecular Descriptor Calculation
10.2.3 (Q)SAR Modeling
10.2.4 Applicability Domain and Reliability Evaluation
10.2.5 Software Development
10.3 Development of Predictive Models
10.3.1 Proposed Predictive Model System
10.3.2 Development of SAR Models
10.3.3 Development of QSAR Models
10.4 Development of Software
10.4.1 Features and Overview of the Software
10.4.2 Examples
10.5 Conclusions
References
11 Quantitative Target-specific Toxicity Prediction Modeling (QTTPM): Coupling Machine Learning with Dynamic Protein–Ligand Interaction Descriptors (DyPLIDs) to Predict Androgen Receptor-mediated Toxicity
11.1 Introduction
11.2 Materials and Methods
11.2.1 Study Design
11.2.2 Dataset Curation, Preprocessing, and Chemical Preparation
11.2.3 Molecular Docking
11.2.4 Molecular Dynamics (MD) Simulations
11.2.5 Dynamic Protein–Ligand Interaction Descriptors (DyPLIDs) Calculation
11.2.6 Feature Selection: Down-Selection of Descriptors
11.2.7 Dataset Splitting
11.2.8 Machine Learning for Quantitative AR Activity Prediction Modeling
11.3 Results and Discussion
11.3.1 Conformational Ensemble of AR-Ligand Interactions
11.3.2 Comparison of 6 ns Versus 100 ns Simulations
11.3.3 Fingerprint Chemical Diversity
11.3.4 Predictive QSAR Model
11.3.5 Feature Importance
11.3.6 Model Limitation
11.4 Conclusion
References
12 Mold2 Descriptors Facilitate Development of Machine Learning and Deep Learning Models for Predicting Toxicity of Chemicals
12.1 Introduction
12.2 Mold2 Descriptors
12.2.1 The Descriptors
12.2.2 The Software
12.3 Information Content of Mold2 Descriptors
12.4 Applications in Machine Learning
12.4.1 Predicting Estrogenic Activity
12.4.2 Predicting Androgenic Activity
12.4.3 Predicting Kinase Inhibitors
12.5 Applications in Deep Learning
12.5.1 Predicting DILI
12.5.2 Predicting Drug-Likeness
12.6 Summary
References
13 Applicability Domain Characterization for Machine Learning QSAR Models
13.1 An Outline of Quantitative Structure–Activity Relationship (QSAR) Models
13.1.1 Core Elements of QSAR Models
13.1.2 Validity of QSAR Models
13.2 Concepts and Understandings of AD
13.2.1 Physicochemical, Structural, Mechanistic, and Metabolic Aspects
13.2.2 Interpolation, Distance/Similarity, and Boundary
13.2.3 AD Metrics Evaluating Prediction Performance on Individual Chemicals
13.3 AD Characterization Methods
13.3.1 Descriptor Domain
13.3.2 Structural (Similarity) Domain
13.3.3 Clustering-Based Methods
13.3.4 First-Class AD Metrics
13.3.5 Second-Class AD Metrics
13.3.6 Visualization of AD
13.4 Impacts on the QSAR Modeling Scenario from Machine Learning Algorithms
13.5 Toward Broader AD for Machine Learning QSAR Models
References
14 Controlling for Confounding in Complex Survey Machine Learning Models to Assess Drug Safety and Risk
14.1 Introduction and Background
14.2 The Propensity Score Method
14.3 Software and Assessment of Balance
14.4 A Drug Safety Model for Prescription NSAIDs
14.5 Propensity Score Weighting
14.5.1 Adjusting for Confounding with Propensity Score ATE Weighting
14.5.2 Adjusting for Confounding with Propensity Score ATT Weighting
14.6 Propensity Score Stratification
14.6.1 Adjusting for Confounding with Propensity Score Stratification
14.7 Conclusion
References
15 Multivariate Curve Resolution for Analysis of Heterogeneous System in Toxicogenomics
15.1 Introduction
15.2 Basic Conceptions and Application Scenarios
15.2.1 The Definition of MCR in Heterogeneous Systems
15.2.2 The Ultimate Goal of Applying MCR in TGx
15.3 Method Categories
15.3.1 Determining or Estimating k
15.3.2 Determining or Estimating E and W
15.3.3 Jointly Utilization and Alternate Estimation
15.4 Available Resources
15.4.1 Databases of TGx Data
15.4.2 Functions of MCR
15.4.3 Tools of Deconvolution by Using MCR
15.5 Perspective and Future Directions
References
Part III Machine Learning and Deep Learning for Chemical Toxicity Prediction
16 The Use of Machine Learning to Support Drug Safety Prediction
16.1 Introduction
16.2 Chemical-Based Safety Machine Learning
16.2.1 Overview
16.2.2 Databases
16.2.3 Machine Learning Algorithms
16.3 Case Study—Assessment of Pharmaceutical Impurities
16.4 Conclusions
References
17 Machine Learning-Based QSAR Models and Structural Alerts for Prediction of Mitochondrial Dysfunction
17.1 Introduction
17.2 Datasets and Methods
17.2.1 Data on Mitochondrial Dysfunction
17.2.2 Machine Learning Methods Used for Model Construction
17.2.3 Model Evaluation
17.2.4 Methods to Identify Structural Alerts
17.3 Mitochondrial Dysfunction QSAR Models and Structural Alerts
17.3.1 Mitochondrial Dysfunction QSAR Models
17.3.2 Structural Alerts for Mitochondrial Dysfunction
17.4 Conclusions and Future Directions
References
18 Machine Learning and Deep Learning Applications to Evaluate Mutagenicity
18.1 In Silico Methods to Predict Bacterial Mutagenicity
18.2 Data for Modeling Mutagenicity
18.3 Traditional Machine Learning for Mutagenicity Prediction
18.4 Deep Learning for Mutagenicity Prediction
18.5 Discussion and Perspective
References
19 Modeling Tox21 Data for Toxicity Prediction and Mechanism Deconvolution
19.1 Introduction
19.2 Tox21 10K Compound Library and Assay Data
19.2.1 Tox21 Compound Collection
19.2.2 Tox21 qHTS Process
19.3 Modeling Tox21 Data for Toxicity Prediction
19.3.1 Multiple Species In Vivo Toxicity
19.3.2 Human In Vivo Toxicity
19.3.3 In Vitro Toxicity
19.4 Toxicity Pathways and Mechanisms
19.5 Conclusions and Moving Forward
References
20 Identification of Structural Alerts by Machine Learning and Their Applications in Toxicology
20.1 Introduction
20.2 Approaches for Identification of Structural Alerts
20.2.1 Expert Systems
20.2.2 Computational Approaches
20.2.3 Comparison of Data-Driven Structural Alerts with Expert Systems
20.3 Application of Structural Alerts in Toxicology
20.3.1 Toxicity Prediction
20.3.2 Explanation of QSAR Models
20.3.3 Molecular Optimization
20.3.4 Exploring New Mechanisms
20.4 Perspectives and Outlook
References
21 Machine Learning in Prediction of Nanotoxicology
21.1 Introduction
21.2 Toxicity of Nanomaterials
21.2.1 Toxicity of Carbon Nanomaterials
21.2.2 Toxicity of Transition Metal Dichalcogenides
21.2.3 Toxicity of MOFs
21.3 Prediction of Nanotoxicity by Machine Learning
21.3.1 Prediction of Carbon Nanomaterials Toxicity by Machine Learning
21.3.2 Prediction of Nanometal Toxicity by Machine Learning
21.3.3 Prediction of Nanometal Oxide Toxicity by Machine Learning
21.3.4 Prediction of Other Nanomaterials Toxicity by Machine Learning
21.4 Future Directions of Machine Learning in Nanotoxicology Prediction
References
22 Machine Learning for Predicting Organ Toxicity
22.1 Introduction
22.2 Machine Learning Algorithms
22.2.1 Classification and Regression Tree
22.2.2 k-Nearest Neighbors (kNN)
22.2.3 Naïve Bayes (NB)
22.2.4 Random Forest
22.2.5 Support Vector Machine
22.3 Organ Toxicity Prediction
22.3.1 Liver Toxicity
22.3.2 Kidney Toxicity
22.3.3 Heart Toxicity
22.4 A Case Study for Organ Toxicity Prediction
22.4.1 Data Sources
22.4.2 Supervised Machine Learning
22.4.3 Results
22.5 Summary
References
Part IV The Progress of Machine Learning and Deep Learning in New Areas
23 Computational Modeling for the Prediction of Hepatotoxicity Caused by Drugs and Chemicals
23.1 Introduction
23.2 Machine Learning Methods for Predicting Hepatotoxicity
23.2.1 Toxicity Dataset for Machine Learning
23.2.2 Metrics for Evaluating Model Performance
23.2.3 Machine Learning Algorithms
23.3 A Case Study: Machine Learning Modeling for Hepatotoxicity Prediction
23.3.1 Data Sources
23.3.2 Modeling by Machine Learning Approaches
23.3.3 Results
23.4 Summary and Future Direction
References
24 Artificial Intelligence for Risk Assessment of Cancer Therapy-Related Cardiotoxicity and Precision Cardio-Oncology
24.1 Introduction
24.2 Methods and Materials
24.2.1 Data Resources
24.2.2 Molecular Feature and Vector Generation
24.2.3 Defining Biological Endpoints and Clinical Outcomes
24.2.4 AI/ML Algorithm and Model Selection
24.2.5 Evaluating Model Performance
24.3 Variable Network Construction
24.4 Case Studies
24.4.1 In Silico Pharmacoepidemiologic Evaluation of Drug-Induced Cardiovascular Complications Using Combined Classifiers
24.4.2 Machine Learning-Based Risk Assessment for Cancer Therapy-Related Cardiac Dysfunction in 4300 Longitudinal Oncology Patients
24.4.3 Cardiac Risk Stratification in Cancer Patients: A Longitudinal Patient-Patient Network Analysis
24.5 Future Directions and Conclusion
References
25 Deep Learning Model for Prediction of Compound Activities Over a Panel of Major Toxicity-Related Proteins
25.1 Introduction
25.2 Methods
25.2.1 Dataset
25.2.2 Chemical Diversity Analysis
25.2.3 Prediction Models
25.2.4 Evaluation Metrics
25.3 Results and Discussion
25.3.1 Data Collection and Analysis
25.3.2 Drug and Target Representations Selection
25.3.3 Model Performance
25.3.4 Comparison with Conventional per Protein Models
25.3.5 External Validation
25.4 Conclusions
References
26 Machine Learning for Analyzing Drug Safety in Electronic Health Records
26.1 Introduction
26.2 Drug Safety Problems to Solve with ML
26.2.1 Prescription Error
26.2.2 Medication Misuse
26.2.3 Drug-Drug Interactions
26.3 Recent Trends of NLP and ML Methods in Pharmacovigilance
26.3.1 The Existing of NLP Approaches
26.3.2 Machine Learning Methods
26.4 Discussions
References
27 Powering Toxicogenomic Studies by Applying Machine Learning to Genomic Sequencing and Variant Detection
27.1 Introduction
27.2 Machine Learning in Genomic Variant Detections
27.2.1 Machine Learning Algorithms in Germline Variant Detection
27.2.2 Challenges in Somatic Mutation Calling
27.2.3 Machine Learning to Improve Accuracy of Somatic Mutation Detection
27.3 Training Data for Machine Learning-Based Variant Callers
27.4 Conclusion
References
28 Machine Learning for Predicting Gas Adsorption Capacities of Metal Organic Framework
28.1 Introduction
28.2 Data Sources
28.3 Descriptors of MOFs
28.4 ML Algorithms
28.5 ML Models for Predicting Gas Adsorption of MOFs
28.5.1 ML Models for CH4 Adsorption
28.5.2 ML Models for H2 Adsorption
28.5.3 ML Models for CO2 Adsorption
28.5.4 ML Models for Xe/Kr Selective Adsorption
28.6 Conclusion Remarks and Future Perspective
References