DATA MINING AND MACHINE LEARNING APPLICATIONS
The book elaborates in detail on the current needs of data mining and machine learning and promotes mutual understanding among researchers in different disciplines, thus facilitating research development and collaboration.
Data, the latest currency of today’s world, is the new gold. In this new form of gold, the most beautiful jewels are data analytics and machine learning. Data mining and machine learning are considered interdisciplinary fields. Data mining is a subset of data analytics, and machine learning involves the use of algorithms that automatically improve through experience based on data.
Massive datasets can be classified and clustered to obtain accurate results. The most common techniques include classification and clustering methods. Accuracy and error rates are calculated for regression, classification, and clustering to assess the results produced by algorithms such as support vector machines and neural networks with forward and backward propagation. Applications include fraud detection, image processing, medical diagnosis, weather prediction, e-commerce, and so forth.
The book features:
- A review of the state-of-the-art in data mining and machine learning,
- A review and description of the learning methods in human-computer interaction,
- Implementation strategies and future research directions for meeting the long-term design and application requirements of several modern and real-time applications,
- The scope and implementation of a majority of data mining and machine learning strategies,
- A discussion of real-time problems.
Audience
Industry and academic researchers, scientists, and engineers in information technology, data science and machine and deep learning, as well as artificial intelligence more broadly.
Author(s): Rohit Raja, Kapil Kumar Nagwanshi, Sandeep Kumar, K. Ramya Laxmi
Publisher: Wiley-Scrivener
Year: 2022
Language: English
Pages: 473
City: Beverly
Cover
Half-Title Page
Series Page
Title Page
Copyright Page
Contents
Preface
1 Introduction to Data Mining
1.1 Introduction
1.1.1 Data Mining
1.2 Knowledge Discovery in Database (KDD)
1.2.1 Importance of Data Mining
1.2.2 Applications of Data Mining
1.2.3 Databases
1.3 Issues in Data Mining
1.4 Data Mining Algorithms
1.5 Data Warehouse
1.6 Data Mining Techniques
1.7 Data Mining Tools
1.7.1 Python for Data Mining
1.7.2 KNIME
1.7.3 Rapid Miner
References
2 Classification and Mining Behavior of Data
2.1 Introduction
2.2 Main Characteristics of Mining Behavioral Data
2.2.1 Mining Dynamic/Streaming Data
2.2.2 Mining Graph & Network Data
2.2.3 Mining Heterogeneous/Multi-Source Information
2.2.3.1 Multi-Source and Multidimensional Information
2.2.3.2 Multi-Relational Data
2.2.3.3 Background and Connected Data
2.2.3.4 Complex Data, Sequences, and Events
2.2.3.5 Data Protection and Morals
2.2.4 Mining High Dimensional Data
2.2.5 Mining Imbalanced Data
2.2.5.1 The Class Imbalance Issue
2.2.6 Mining Multimedia Data
2.2.6.1 Common Applications Multimedia Data Mining
2.2.6.2 Multimedia Data Mining Utilizations
2.2.6.3 Multimedia Database Management
2.2.7 Mining Scientific Data
2.2.8 Mining Sequential Data
2.2.9 Mining Social Networks
2.2.9.1 Social-Media Data Mining Reasons
2.2.10 Mining Spatial and Temporal Data
2.2.10.1 Utilizations of Spatial and Temporal Data Mining
2.3 Research Method
2.4 Results
2.5 Discussion
2.6 Conclusion
References
3 A Comparative Overview of Hybrid Recommender Systems: Review, Challenges, and Prospects
3.1 Introduction
3.2 Related Work on Different Recommender System
3.2.1 Challenges in RS
3.2.2 Research Questions and Architecture of This Paper
3.2.3 Background
3.2.4 Analysis
3.2.5 Materials and Methods
3.2.6 Comparative Analysis With Traditional Recommender System
3.2.7 Practical Implications
3.2.8 Conclusion & Future Work
References
4 Stream Mining: Introduction, Tools & Techniques and Applications
4.1 Introduction
4.2 Data Reduction: Sampling and Sketching
4.2.1 Sampling
4.2.2 Sketching
4.3 Concept Drift
4.4 Stream Mining Operations
4.4.1 Clustering
4.4.2 Classification
4.4.3 Outlier Detection
4.4.4 Frequent Itemsets Mining
4.5 Tools & Techniques
4.5.1 Implementation in Java
4.5.2 Implementation in Python
4.5.3 Implementation in R
4.6 Applications
4.6.1 Stock Prediction in Share Market
4.6.2 Weather Forecasting System
4.6.3 Finding Trending News and Events
4.6.4 Analyzing User Behavior in Electronic Commerce Site (Click Stream)
4.6.5 Pollution Control Systems
4.7 Conclusion
References
5 Data Mining Tools and Techniques: Clustering Analysis
5.1 Introduction
5.2 Data Mining Task
5.2.1 Data Summarization
5.2.2 Data Clustering
5.2.3 Classification of Data
5.2.4 Data Regression
5.2.5 Data Association
5.3 Data Mining Algorithms and Methodologies
5.3.1 Data Classification Algorithm
5.3.2 Predication
5.3.3 Association Rule
5.3.4 Neural Network
5.3.4.1 Data Clustering Algorithm
5.3.5 In-Depth Study of Gathering Techniques
5.3.6 Data Partitioning Method
5.3.7 Hierarchical Method
5.3.8 Framework-Based Method
5.3.9 Model-Based Method
5.3.10 Thickness-Based Method
5.4 Clustering the Nearest Neighbor
5.4.1 Fuzzy Clustering
5.4.2 K-Algorithm Means
5.5 Data Mining Applications
5.6 Materials and Strategies for Document Clustering
5.6.1 Features Generation
5.7 Discussion and Results
5.7.1 Discussion
5.7.2 Conclusion
References
6 Data Mining Implementation Process
6.1 Introduction
6.2 Data Mining Historical Trends
6.3 Processes of Data Analysis
6.3.1 Data Attack
6.3.2 Data Mixing
6.3.3 Data Collection
6.3.4 Data Conversion
6.3.4.1 Data Mining
6.3.4.2 Design Evaluation
6.3.4.3 Data Illustration
6.3.4.4 Implementation of Data Mining in the Cross-Industry Standard Process
6.3.5 Business Understanding
6.3.6 Data Understanding
6.3.7 Data Preparation
6.3.8 Modeling
6.3.9 Evaluation
6.3.10 Deployment
6.3.11 Contemporary Developments
6.3.12 An Assortment of Data Mining
6.3.12.1 Using Computational & Connectivity Tools
6.3.12.2 Web Mining
6.3.12.3 Comparative Statement
6.3.13 Advantages of Data Mining
6.3.14 Drawbacks of Data Mining
6.3.15 Data Mining Applications
6.3.16 Methodology
6.3.17 Results
6.3.18 Conclusion and Future Scope
References
7 Predictive Analytics in IT Service Management (ITSM)
7.1 Introduction
7.2 Analytics: An Overview
7.2.1 Predictive Analytics
7.3 Significance of Predictive Analytics in ITSM
7.4 Ticket Analytics: A Case Study
7.4.1 Input Parameters
7.4.2 Predictive Modeling
7.4.3 Random Forest Model
7.4.4 Performance of the Predictive Model
7.5 Conclusion
References
8 Modified Cross-Sell Model for Telecom Service Providers Using Data Mining Techniques
8.1 Introduction
8.2 Literature Review
8.3 Methodology and Implementation
8.3.1 Selection of the Independent Variables
8.4 Data Partitioning
8.4.1 Interpreting the Results of Logistic Regression Model
8.5 Conclusions
References
9 Inductive Learning Including Decision Tree and Rule Induction Learning
9.1 Introduction
9.2 The Inductive Learning Algorithm (ILA)
9.3 Proposed Algorithms
9.4 Divide & Conquer Algorithm
9.4.1 Decision Tree
9.5 Decision Tree Algorithms
9.5.1 ID3 Algorithm
9.5.2 Separate and Conquer Algorithm
9.5.3 RULE EXTRACTOR-1
9.5.4 Inductive Learning Applications
9.5.4.1 Education
9.5.4.2 Making Credit Decisions
9.5.5 Multidimensional Databases and OLAP
9.5.6 Fuzzy Choice Trees
9.5.7 Fuzzy Choice Tree Development From a Multidimensional Database
9.5.8 Execution and Results
9.6 Conclusion and Future Work
References
10 Data Mining for Cyber-Physical Systems
10.1 Introduction
10.1.1 Models of Cyber-Physical System
10.1.2 Statistical Model-Based Methodologies
10.1.3 Spatial-and-Transient Closeness-Based Methodologies
10.2 Feature Recovering Methodologies
10.3 CPS vs. IT Systems
10.4 Collections, Sources, and Generations of Big Data for CPS
10.4.1 Establishing Conscious Computation and Information Systems
10.5 Spatial Prediction
10.5.1 Global Optimization
10.5.2 Big Data Analysis CPS
10.5.3 Analysis of Cloud Data
10.5.4 Analysis of Multi-Cloud Data
10.6 Clustering of Big Data
10.7 NoSQL
10.8 Cyber Security and Privacy Big Data
10.8.1 Protection of Big Computing and Storage
10.8.2 Big Data Analytics Protection
10.8.3 Big Data CPS Applications
10.9 Smart Grids
10.10 Military Applications
10.11 City Management
10.12 Clinical Applications
10.13 Calamity Events
10.14 Data Streams Clustering by Sensors
10.15 The Flocking Model
10.16 Calculation Depiction
10.17 Initialization
10.18 Representative Maintenance and Clustering
10.19 Results
10.20 Conclusion
References
11 Developing Decision Making and Risk Mitigation: Using CRISP-Data Mining
11.1 Introduction
11.2 Background
11.3 Methodology of CRISP-DM
11.4 Stage One—Determine Business Objectives
11.4.1 What Are the Ideal Yields of the Venture?
11.4.2 Evaluate the Current Circumstance
11.4.3 Realizes Data Mining Goals
11.5 Stage Two—Data Sympathetic
11.5.1 Portray Data
11.5.2 Investigate Facts
11.5.3 Confirm Data Quality
11.5.4 Data Excellence Description
11.6 Stage Three—Data Preparation
11.6.1 Select Your Data
11.6.2 The Data Is Processed
11.6.3 Data Needed to Build
11.6.4 Combine Information
11.7 Stage Four—Modeling
11.7.1 Select Displaying Strategy
11.7.2 Produce an Investigation Plan
11.7.3 Fabricate Ideal
11.7.4 Evaluation Model
11.8 Stage Five—Evaluation
11.8.1 Assess Your Outcomes
11.8.2 Survey Measure
11.8.3 Decide on the Subsequent Stages
11.9 Stage Six—Deployment
11.9.1 Plan Arrangement
11.9.2 Plan Observing and Support
11.9.3 Produce the Last Report
11.9.4 Audit Venture
11.10 Data on ERP Systems
11.11 Usage of CRISP-DM Methodology
11.12 Modeling
11.12.1 Association Rule Mining (ARM) or Association Analysis
11.12.2 Classification Algorithms
11.12.3 Regression Algorithms
11.12.4 Clustering Algorithms
11.13 Assessment
11.14 Distribution
11.15 Results and Discussion
11.16 Conclusion
References
12 Human–Machine Interaction and Visual Data Mining
12.1 Introduction
12.2 Related Researches
12.2.1 Data Mining
12.2.2 Data Visualization
12.2.3 Visual Learning
12.3 Visual Genes
12.4 Visual Hypotheses
12.5 Visual Strength and Conditioning
12.6 Visual Optimization
12.7 The Vis 09 Model
12.8 Graphic Monitoring and Contact With Human–Computer
12.9 Mining HCI Information Using Inductive Deduction Viewpoint
12.10 Visual Data Mining Methodology
12.11 Machine Learning Algorithms for Hand Gesture Recognition
12.12 Learning
12.13 Detection
12.14 Recognition
12.15 Proposed Methodology for Hand Gesture Recognition
12.16 Result
12.17 Conclusion
References
13 MSDTrA: A Boosting Based-Transfer Learning Approach for Class Imbalanced Skin Lesion Dataset for Melanoma Detection
13.1 Introduction
13.2 Literature Survey
13.3 Methods and Material
13.3.1 Proposed Methodology: Multi Source Dynamic TrAdaBoost Algorithm
13.4 Experimental Results
13.5 Libraries Used
13.6 Comparing Algorithms Based on Decision Boundaries
13.7 Evaluating Results
13.8 Conclusion
References
14 New Algorithms and Technologies for Data Mining
14.1 Introduction
14.2 Machine Learning Algorithms
14.3 Supervised Learning
14.4 Unsupervised Learning
14.5 Semi-Supervised Learning
14.6 Regression Algorithms
14.7 Case-Based Algorithms
14.8 Regularization Algorithms
14.9 Decision Tree Algorithms
14.10 Bayesian Algorithms
14.11 Clustering Algorithms
14.12 Association Rule Learning Algorithms
14.13 Artificial Neural Network Algorithms
14.14 Deep Learning Algorithms
14.15 Dimensionality Reduction Algorithms
14.16 Ensemble Algorithms
14.17 Other Machine Learning Algorithms
14.18 Data Mining Assignments
14.19 Data Mining Models
14.20 Non-Parametric & Parametric Models
14.21 Flexible vs. Restrictive Methods
14.22 Unsupervised vs. Supervised Learning
14.23 Data Mining Methods
14.24 Proposed Algorithm
14.24.1 Organization Formation Procedure
14.25 The Regret of Learning Phase
14.26 Conclusion
References
15 Classification of EEG Signals for Detection of Epileptic Seizure Using Restricted Boltzmann Machine Classifier
15.1 Introduction
15.2 Related Work
15.3 Material and Methods
15.3.1 Dataset Description
15.3.2 Proposed Methodology
15.3.3 Normalization
15.3.4 Preprocessing Using PCA
15.3.5 Restricted Boltzmann Machine (RBM)
15.3.6 Stochastic Binary Units (Bernoulli Variables)
15.3.7 Training
15.4 Experimental Framework
15.5 Experimental Results and Discussion
15.5.1 Performance Measurement Criteria
15.5.2 Experimental Results
15.6 Discussion
15.7 Conclusion
References
16 An Enhanced Security of Women and Children Using Machine Learning and Data Mining Techniques
16.1 Introduction
16.2 Related Work
16.2.1 WoSApp
16.2.2 Abhaya
16.2.3 Women Empowerment
16.2.4 Nirbhaya
16.2.5 Glympse
16.2.6 Fightback
16.2.7 Versatile-Based
16.2.8 RFID
16.2.9 Self-Preservation Framework for Women With Area Following and SMS Alarming Through GSM Network
16.2.10 Safe: A Women Security Framework
16.2.11 Intelligent Safety System For Women Security
16.2.12 A Mobile-Based Women Safety Application
16.2.13 Self-Salvation—The Women’s Security Module
16.3 Issue and Solution
16.3.1 Inspiration
16.3.2 Issue Statement and Choice of Solution
16.4 Selection of Data
16.5 Pre-Preparation Data
16.5.1 Simulation
16.5.2 Assessment
16.5.3 Forecast
16.6 Application Development
16.6.1 Methodology
16.6.2 AI Model
16.6.3 Innovations Used The Proposed Application Has Utilized After Technologies
16.7 Use Case For The Application
16.7.1 Application Icon
16.7.2 Enlistment Form
16.7.3 Login Form
16.7.4 Misconduct Place Detector
16.7.5 Help Button
16.8 Conclusion
References
17 Conclusion and Future Direction in Data Mining and Machine Learning
17.1 Introduction
17.2 Machine Learning
17.2.1 Neural Network
17.2.2 Deep Learning
17.2.3 Three Activities for Object Recognition
17.3 Conclusion
References
Index