QSPR/QSAR Analysis Using SMILES and Quasi-SMILES

This contributed volume surveys recently presented approaches to QSPR/QSAR analysis that use the simplified molecular input-line entry system (SMILES) to represent molecular structure. In contrast to traditional SMILES, quasi-SMILES is a sequence of special symbol codes that reflect molecular features together with codes for experimental conditions. SMILES and quasi-SMILES serve as a basis for developing QSPR/QSAR as well as Nano-QSPR/QSAR models via Monte Carlo calculations that provide the so-called optimal descriptors. The book presents a reliable technology for developing Nano-QSPR/QSAR models and describes the algorithms of the Monte Carlo optimization. It discusses the theory and practice of variational autoencoders (VAEs) based on SMILES and analyses in detail the index of ideality of correlation (IIC) and the correlation intensity index (CII), which are new criteria for the predictive potential of a model. The mathematical apparatus used is simple, so students of relevant specializations can follow it easily. This volume is a valuable contribution to the field and will be of great interest to developers of models of physicochemical properties and biological activity, chemical technologists, and toxicologists involved in the area of drug design.



Author(s): Alla P. Toropova, Andrey A. Toropov
Series: Challenges and Advances in Computational Chemistry and Physics, 33
Publisher: Springer
Year: 2023

Language: English
Pages: 469
City: Cham

Preface
Contents
Contributors
Abbreviations
Part I Theoretical Conceptions
1 Fundamentals of Mathematical Modeling of Chemicals Through QSPR/QSAR
1.1 Introduction
1.2 QSPR/QSAR: Tools and Tasks
1.3 Five OECD Principles
1.4 Praxis of the QSPR/QSAR Development
1.5 Molecular Descriptors are the Basis for the QSPR/QSAR
1.5.1 Principal Component Analysis
1.5.2 Multiple Linear Regressions
1.5.3 Partial Least Squares
1.5.4 K-Nearest Neighbor Classification
1.5.5 Artificial Neural Network
1.5.6 Support Vector Machine
1.5.7 Random Forest
1.5.8 Monte Carlo Method
1.5.9 Data Curation
1.6 Reproducibility
1.6.1 Applicability Domain
1.6.2 Model Validation
1.7 Recommendations for Building Robust QSPR/QSAR Models
1.8 Is It Possible to Obtain Correlations Suitable for QSPR/QSAR Using SMILES?
1.9 The Main Quality of a Descriptor Is to Indicate the Differences Between Molecules
1.10 Significant Notes
1.11 Conclusions
References
2 Molecular Descriptors in QSPR/QSAR Modeling
2.1 Introduction
2.1.1 History
2.1.2 QSPR/QSAR Modeling
2.1.3 Molecular Descriptors
2.2 Descriptors for Nano-QSPR/QSAR
2.3 SMILES and Quasi-SMILES Descriptors
2.3.1 Quasi-SMILES Examples in Peer-Reviewed Papers
2.4 Software for Generation of Molecular Descriptors
2.5 Conclusion and Future Direction
References
3 Application of SMILES to Cheminformatics and Generation of Optimum SMILES Descriptors Using CORAL Software
3.1 Introduction
3.1.1 The CORAL software description
3.1.2 An Example of Model Training and Validation (Graphically)
3.2 Conclusions
References
Part II SMILES Based Descriptors
4 All SMILES Variational Autoencoder for Molecular Property Prediction and Optimization
4.1 Introduction
4.1.1 Summary of Novel Contributions
4.2 Efficient Molecular Encoding with Multiple SMILES Strings
4.3 Review of Recurrent Neural Networks
4.4 All SMILES VAE Architecture
4.4.1 Computational Complexity
4.4.2 Latent Space Optimization
4.5 Datasets
4.5.1 ZINC
4.5.2 Tox21
4.6 Results
4.6.1 Reconstruction Accuracy and Validity
4.6.2 Property Prediction
4.6.3 Molecular Optimization
4.6.4 Ablation of Model Components
4.7 SMILES Grammar Can Be Enforced with a Pushdown Automaton
4.7.1 Ringbond and Valence Shell Semantic Constraints
4.7.2 Redundancy in Graph-Based and SMILES Representations of Molecules
4.8 Conclusion
References
5 SMILES-Based Bioactivity Descriptors to Model the Anti-dengue Virus Activity: A Case Study
5.1 Introduction
5.2 Materials and Methods
5.2.1 Importance of Bioactivity Descriptors
5.2.2 Dataset Collection
5.2.3 Calculation of Molecular Descriptors
5.2.4 Development of Linear 2D-QSAR Models
5.2.5 Statistical Analysis of Models
5.2.6 Applicability Domain of the Models
5.2.7 Non-linear Model Development
5.3 Results and Discussion
5.4 Conclusions
References
Part III SMILES for QSPR/QSAR with Optimal Descriptors
6 QSPR Models for Prediction of Redox Potentials Using Optimal Descriptors
6.1 Introduction, Redox Potential, and Its Significance
6.2 Relationship Between Redox Potential and Structure
6.3 Optimal Descriptors in QSPR of Redox Potential
6.3.1 Basic Principles of Employing Optimal Descriptors in QSPR
6.3.2 Published Studies on SMILES-Based QSPR for Redox Potential
6.3.3 Case Study of Two Large Data Sets
6.4 Conclusions
References
7 Building Up QSPR for Polymers Endpoints by Using SMILES-Based Optimal Descriptors
7.1 Introduction
7.1.1 The General Scheme of QSPR/QSAR Analysis of Endpoints Related to Polymers
7.1.2 QSPR Analysis of Endpoints Related to Polymers with MLR
7.1.3 QSPR/QSAR Analysis of Endpoints Related to Polymers with PLS
7.1.4 QSPR Analysis of Endpoints Related to Polymers with ANN
7.1.5 QSPR Analysis of Endpoints Related to Polymers with SVM
7.2 Significant Notes
7.3 Building Up Models of Polymers Endpoints Using SMILES
7.3.1 SMILES
7.3.2 Optimal SMILES-Based Descriptors
7.3.3 The Monte Carlo Optimization Procedure
7.3.4 The Classic Scheme of Building Up the QSPR/QSAR Model Using the Optimal Descriptors
7.3.5 The Balance of Correlations for the QSPR/QSAR Model Using the Optimal Descriptors
7.3.6 Search and Use for Reliable Criteria of the Predictive Potential of QSPR/QSAR Models Based on the Optimal Descriptors
7.3.7 Hybrid Optimal Descriptors
7.3.8 Model Complication
7.4 Examples of Improving Models Built Up with Optimal Descriptors
7.4.1 Development of a New Conception to Building Up a Model
7.4.2 QSPR Models for the Glass Transition Temperature
7.4.3 QSPR Models for the Refractive Index
7.5 Comparison QSPR-Models
7.6 Possible Ways of Evolution of the QSPR for Polymers
7.7 Quasi-SMILES Can Be a Tool for the Discussion of Experimentalists and Model Developers
7.8 Conclusions
References
Part IV Quasi-SMILES for QSPR/QSAR
8 Quasi-SMILES-Based QSPR/QSAR Modeling
8.1 Introduction
8.2 Principals of QSPR/QSAR Models
8.3 Monte Carlo Technique for Nano-QSPR/QSAR
8.3.1 SMILES and Quasi-SMILES
8.3.2 The Main Step for QSPR/QSAR Modeling by SMILES or Quasi-SMILES
8.4 Examples of Quasi-SMILES-Based QSPR/QSAR Models
8.5 Conclusion and Future Direction
References
9 Quasi-SMILES-Based Mathematical Model for the Prediction of Percolation Threshold for Conductive Polymer Composites
9.1 Introduction
9.2 Theoretical Background of the Percolation Threshold
9.2.1 Effect of the Conductive Fillers
9.2.2 Effect of the Host Polymers
9.3 Methods for the Synthesis of Conductive Polymers
9.3.1 Chemical Method
9.3.2 Metathesis Method
9.3.3 Photochemical Method
9.3.4 Electro-Chemical Method
9.3.5 Plasma Polymerisation
9.3.6 Solid-State Method
9.3.7 Inclusion Method
9.4 Various Properties of Conducting Polymers
9.4.1 Magnetic Properties
9.4.2 Optical Properties
9.4.3 Electrical Properties
9.5 Applications of Conductive Polymers
9.5.1 Sensors
9.5.2 Solar Cells
9.5.3 Supercapacitors
9.5.4 Data Storage Transistors
9.5.5 Batteries
9.6 Mathematical Models for the Prediction of Percolation Threshold
9.6.1 Data and Building the Quasi-SMILES Codes
9.6.2 Optimal Descriptor
9.7 Results and Discussion
9.8 Conclusion
References
10 On the Possibility to Build up the QSAR Model of Different Kinds of Inhibitory Activity for a Large List of Human Intestinal Transporter Using Quasi-SMILES
10.1 Introduction
10.1.1 Literature Review on Various QSAR Models for Human Intestinal Transporter
10.1.2 An Overview of Computer Simulations Study of Human Intestinal Transporter
10.2 Materials and Methods
10.2.1 Experimental Data Curation
10.2.2 Development of the Models
10.3 Result and Discussion
10.4 Conclusion
References
11 Quasi-SMILES as a Tool for Peptide QSAR Modelling
11.1 Introduction
11.2 A Brief Overview of QSAR
11.3 Peptide QSAR Modelling
11.4 SMILES-Based Descriptors for QSAR Model Development
11.5 Quasi-SMILES
11.5.1 Development of QSAR Model by Quasi-SMILES
11.5.2 Optimal Descriptor Approach
11.6 Different Application of SMILES/Quasi-SMILES in Peptide QSPR/QSAR Modelling
11.6.1 Antimicrobial Peptides
11.6.2 Epitope Peptides with Class I Major Histocompatibility Complex (MHC)
11.7 Mathematical Approaches Used for Peptide QSAR Modelling
11.7.1 Multiple Linear Regressions (MLR)
11.7.2 Partial Least Square (PLS)
11.7.3 Principal Component Analysis (PCA)
11.7.4 Genetic Algorithm (GA)-Based Peptide QSAR
11.7.5 Particle Swarm Optimization Algorithm (PSO)
11.7.6 Artificial Neural Network (ANN)
11.7.7 Support Vector Machine (SVM)
11.7.8 Other Methods
11.8 Conclusions
References
Part V SMILES and Quasi-SMILES for QSPR/QSAR
12 SMILES and Quasi-SMILES Descriptors in QSAR/QSPR Modeling of Diverse Materials Properties in Safety and Environment Application
12.1 Introduction
12.1.1 QSAR/QSPR Methods
12.1.2 Brief Description of the QSAR/QSPR Methodology
12.2 SMILES and Quasi-SMILES Descriptors
12.3 Study of Several Important Properties/Activities in Safety and Environmental Applications
12.3.1 The Cytotoxicity of Metal Oxide Nanoparticles
12.3.2 Flammability Properties of Chemicals and Their Mixtures
12.3.3 Thermal Hazard Properties of Ionic Liquids and Their Mixtures
12.3.4 Toxicity of Ionic Liquids and Their Mixtures
12.4 Limitations and Outlook in Safety and Environmental Applications
12.4.1 Limitations
12.4.2 Outlook
References
13 SMILES and Quasi-SMILES in QSAR Modeling for Prediction of Physicochemical and Biochemical Properties
13.1 Introduction
13.2 Fundamentals of SMILES and Quasi-SMILES
13.3 Application of SMILES and Quasi-SMILES-Based QSAR Model
13.3.1 Nanoparticles Toxicity and Property Prediction
13.3.2 Toxicity Predictions and Risk Assessment of Organic Chemicals
13.3.3 Miscellaneous Physicochemical and Biochemical Property Predictions of Organic Chemicals
13.4 Conclusion
References
Part VI Possible Ways of Nano-QSPR/Nano-QSAR Evolution
14 The CORAL Software as a Tool to Develop Models for Nanomaterials’ Endpoints
14.1 Introduction
14.2 Theory and Practices of QSPR/QSAR
14.3 SMILES and Nanomaterials
14.4 Quasi-SMILES and Nanomaterials
14.5 Optimal SMILES-Based Descriptor
14.6 The Monte Carlo Optimization
14.7 Conclusions
References
15 Employing Quasi-SMILES Notation in Development of Nano-QSPR Models for Nanofluids
15.1 Introduction
15.1.1 Nanofluids
15.1.2 Theoretical Methods Applied for Study of Nanofluids’ Properties
15.1.3 The Importance of QSPR Study for Nanofluids
15.2 Methodology of CORAL-Based Models Generation
15.2.1 Collection of a Valid Data Set
15.2.2 Quasi-SMILES for Nanofluids
15.2.3 Optimal Descriptors, Predictability Criteria, and Optimization
15.3 Successful Nano-QSPR Studies on Nanofluids
15.4 Conclusion and Perspective Outlook
References
Part VII Possible Ways of QSPR/QSAR Evolution in the Future
16 On Complementary Approaches of Assessing the Predictive Potential of QSPR/QSAR Models
16.1 Introduction
16.2 Software for Building Up QSPR/QSAR Models
16.3 The Critical Analysis of Existing Approaches to Assessing the Predictive Potential
16.4 Convenience and Inconvenience of Correlation
16.5 Convenience and Inconvenience of Causation
16.6 Note on “Secrets of QSPR/QSAR”
16.7 Index Ideality of Correlation (IIC)
16.8 Correlation Intensity Index (CII)
16.9 Can IIC and CII Be Useful?
16.10 Is It Possible to Improve the Predictive Potential of Such Models Using IIC?
16.11 Is It Possible to Improve the Predictive Potential of Such Models Using CII?
16.12 Testing Assumptions About the Significance of IIC and CII
16.13 The Comparison of Criteria of the Predictive Potential of QSPR/QSAR
16.14 The System of Self-consistent Models
16.14.1 Examples of Successful Applications of Self-consistent Models
16.15 Conclusions
References
17 CORAL: Predictions of Quality of Rice Based on Retention Index Using a Combination of Correlation Intensity Index and Consensus Modelling
17.1 Introduction
17.2 Materials and Method
17.2.1 Data
17.2.2 Model
17.2.3 Optimal Descriptor
17.2.4 Monte Carlo Optimization
17.2.5 Applicability Domain
17.2.6 Validation
17.2.7 Consensus Modelling
17.3 Results and Discussion
17.3.1 QSRR Modelling and Validation
17.3.2 Mechanistic Interpretation
17.3.3 Consensus Modelling
17.4 Conclusions
References
Index