Learning to Quantify

This open access book provides an introduction to and an overview of learning to quantify (a.k.a. “quantification”), i.e., the task of training estimators of class proportions in unlabeled data by means of supervised learning. In data science, learning to quantify is a task of its own, related to classification yet different from it, since estimating class proportions by simply classifying all data and counting the labels assigned by the classifier is known to often return inaccurate (“biased”) class proportion estimates.
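The bias of the naive approach can be illustrated with a minimal Python sketch. It is not taken from the book: the function names are hypothetical, the data are synthetic, and the classifier's true positive rate and false positive rate are assumed to be known exactly (in practice they would have to be estimated on held-out labelled data). The sketch compares "classify and count" with the standard "adjusted classify and count" correction, both of which appear in the table of contents below.

```python
def classify_and_count(predictions):
    """Naive estimate: fraction of items the classifier labels positive."""
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """Correct the naive estimate using the classifier's error rates.

    Since E[cc] = tpr * p + fpr * (1 - p), solving for the true
    prevalence p gives p = (cc - fpr) / (tpr - fpr).
    """
    cc = classify_and_count(predictions)
    if tpr == fpr:
        # The correction is undefined; fall back to the naive estimate.
        return cc
    # Clip to [0, 1] so that the result is a valid proportion.
    return min(1.0, max(0.0, (cc - fpr) / (tpr - fpr)))


def simulate_predictions(n_pos, n_neg, tpr, fpr):
    """Hypothetical helper: deterministic outputs of an imperfect classifier."""
    tp = round(n_pos * tpr)  # positives correctly labelled positive
    fp = round(n_neg * fpr)  # negatives wrongly labelled positive
    return [1] * tp + [0] * (n_pos - tp) + [1] * fp + [0] * (n_neg - fp)


# Assumed error rates of the classifier (illustrative values only).
TPR, FPR = 0.8, 0.1

# Synthetic unlabelled set: 200 positives and 800 negatives (prevalence 0.2).
preds = simulate_predictions(n_pos=200, n_neg=800, tpr=TPR, fpr=FPR)

cc_estimate = classify_and_count(preds)
acc_estimate = adjusted_classify_and_count(preds, TPR, FPR)

print(f"true prevalence:          {200 / 1000:.3f}")
print(f"classify and count:       {cc_estimate:.3f}")
print(f"adjusted classify, count: {acc_estimate:.3f}")
```

Under these assumptions the naive estimate overshoots the true prevalence of 0.2 because false positives outnumber false negatives, while the adjusted estimate recovers it. With error rates estimated from data rather than known, the correction would only be approximate.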

The book introduces learning to quantify by looking at the supervised learning methods that can be used to perform it, at the evaluation measures and evaluation protocols that should be used for evaluating the quality of the returned predictions, at the numerous fields of human activity in which the use of quantification techniques may provide improved results with respect to the naive use of classification techniques, and at advanced topics in quantification research.

The book is suitable for researchers, data scientists, and PhD students who want to come up to speed with the state of the art in learning to quantify, and also for researchers wishing to apply data science technologies to fields of human activity (e.g., the social sciences, political science, epidemiology, market research) that focus on aggregate (“macro”) data rather than on individual (“micro”) data.



Author(s): Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani
Series: The Information Retrieval Series, 47
Publisher: Springer
Year: 2023

Language: English
Pages: 144
City: Cham

Preface
Acknowledgments
Contents
Acronyms
1 The Case for Quantification
1.1 Class Distributions and Their Estimation
1.2 The Suboptimality of Classify and Count
1.3 Notational Conventions
1.4 Quantification Problems
1.5 Dataset Shift and Quantification
1.5.1 Types of Dataset Shift and Their Relation to Quantification
1.6 Quantification and Bias Mitigation
1.7 Structure of This Book
2 Applications of Quantification
2.1 Improving Classification Accuracy
2.1.1 Word Sense Disambiguation
2.2 Fairness
2.2.1 Improving Fairness
2.2.2 Measuring Fairness
2.3 Sentiment Analysis
2.4 Social and Political Sciences
2.5 Market Research
2.6 Epidemiology
2.7 Ecological Modelling
2.8 Resource Allocation
3 Evaluation of Quantification Algorithms
3.1 Measures for Evaluating SLQ, BQ, and MLQ
3.1.1 Properties of Evaluation Measures for SLQ, BQ, and MLQ
3.1.2 Bias
3.1.3 Absolute Error and its Variants
3.1.4 Relative Absolute Error and its Variants
3.1.5 Kullback-Leibler Divergence and its Variants
3.1.6 Which Measure is the Best for SLQ?
3.2 Measures for Evaluating OQ
3.2.1 Earth Mover's Distance
3.2.2 Root Normalised Order-Aware Divergence
3.3 Measures for Evaluating Regression Quantification
3.4 Experimental Protocols for Evaluating Quantification
3.4.1 Natural Prevalence Protocol (NPP)
3.4.2 Artificial Prevalence Protocol (APP)
3.4.3 A Variant of the APP Based on the Kraemer Algorithm
3.4.4 Should we Use the NPP or the APP?
3.5 Model Selection in Quantification
4 Methods for Learning to Quantify
4.1 Maximum Likelihood Prevalence Estimation
4.2 Aggregative Methods Based on General-Purpose Learners
4.2.1 Classify and Count
4.2.2 Probabilistic Classify and Count
4.2.3 Adjusted Classify and Count
4.2.4 Probabilistic Adjusted Classify and Count
4.2.5 X, MAX, and T50
4.2.6 Median Sweep
4.2.7 The Ratio Estimator
4.2.8 Mixture Models
4.2.9 Expectation Maximisation for Quantification
4.2.10 Class Distribution Estimation
4.2.11 Ensemble Methods for Quantification
4.2.12 QuaNet
4.3 Aggregative Methods Based on Special-Purpose Learners
4.3.1 Methods Based on Explicit Loss Minimisation
4.3.2 Quantification Trees and Quantification Forests
4.4 Non-Aggregative Methods
4.4.1 The ReadMe Method
4.4.2 The iSA Method
4.4.3 The ReadMe2 Method
4.4.4 The HDx Method
4.4.5 The MMD-RKHS Method
4.4.6 The Uncertainty-Aware Generative Model
4.4.7 Deep Quantification Network
5 Advanced Topics
5.1 Ordinal Quantification
5.2 Regression Quantification
5.3 Cross-Lingual Quantification
5.4 Quantification for Networked Data
5.5 Cost Quantification
5.6 Quantification in Data Streams
5.7 One-Class Quantification
5.8 Confidence Intervals for Class Prevalence Estimates
6 The Quantification Landscape
6.1 Historical Development
6.1.1 The Trajectory of Quantification
6.1.2 Shared Tasks
6.2 Software
6.2.1 Publicly Available Implementations
6.2.2 QuaPy: A Comprehensive Framework for Quantification
6.3 How Do Different Quantification Methods Fare?
6.3.1 A Tour of Experimental Results
6.3.2 Visualisation Tools for the Analysis of Results
6.4 Related Tasks
6.4.1 Links to Existing Tasks
6.4.2 A Possible Variant of the Quantification Task
7 The Road Ahead
Bibliography
Index