Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and the social sciences. This volume is the first comprehensive treatment of how they can be applied to design and analyze innovative educational assessments.
Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in the Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and their ability to model the complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as an integral component of a principled design process, and illustrates the ideas with an in-depth look at the BioMass project: an interactive, standards-based, web-delivered demonstration assessment of science inquiry in genetics.
This book is a resource for both professionals interested in assessment and advanced students. Its clear exposition, worked-through numerical examples, and demonstrations from real and didactic applications illustrate how to use Bayes nets in educational assessment. Exercises follow each chapter, and the online companion site provides a glossary, data sets and problem setups, and links to computational resources.
Author(s): Russell G. Almond, Robert J. Mislevy, Linda Steinberg, Duanli Yan, David Williamson
Series: Statistics for Social and Behavioral Sciences
Publisher: Springer
Year: 2015
Language: English
Pages: C, XXXIII, 666
Cover
Statistics for Social and Behavioral Sciences
Bayesian Networks in Educational Assessment
Copyright
Springer Science+Business Media New York 2015
ISBN 978-1-4939-2124-9
ISBN 978-1-4939-2125-6 (eBook)
DOI 10.1007/978-1-4939-2125-6
Library of Congress Control Number: 2014958291
Dedication
Acknowledgements
Using This Book
Notation
Random Variables
Sets
Probability Distributions and Related Functions
Transcendental Functions
Usual Use of Letters for Indices
Contents
List of Figures
List of Tables
Part I Building Blocks for Bayesian Networks
1 Introduction
1.1 An Example Bayes Network
1.2 Cognitively Diagnostic Assessment
1.3 Cognitive and Psychometric Science
1.4 Ten Reasons for Considering Bayesian Networks
1.5 What Is in This Book
2 An Introduction to Evidence-Centered Design
2.1 Overview
2.2 Assessment as Evidentiary Argument
2.3 The Process of Design
2.4 Basic ECD Structures
2.4.1 The Conceptual Assessment Framework
2.4.2 Four-Process Architecture for Assessment Delivery
2.4.3 Pretesting and Calibration
2.5 Conclusion
3 Bayesian Probability and Statistics: a Review
3.1 Probability: Objective and Subjective
3.1.1 Objective Notions of Probability
3.1.2 Subjective Notions of Probability
3.1.3 Subjective–Objective Probability
3.2 Conditional Probability
3.3 Independence and Conditional Independence
3.3.1 Conditional Independence
3.3.2 Common Variable Dependence
3.3.3 Competing Explanations
3.4 Random Variables
3.4.1 The Probability Mass and Density Functions
3.4.2 Expectation and Variance
3.5 Bayesian Inference
3.5.1 Re-expressing Bayes Theorem
3.5.2 Bayesian Paradigm
3.5.3 Conjugacy
3.5.4 Sources for Priors
3.5.5 Noninformative Priors
3.5.6 Evidence-Centered Design and the Bayesian Paradigm
4 Basic Graph Theory and Graphical Models
4.1 Basic Graph Theory
4.1.1 Simple Undirected Graphs
4.1.2 Directed Graphs
4.1.3 Paths and Cycles
4.2 Factorization of the Joint Distribution
4.2.1 Directed Graph Representation
4.2.2 Factorization Hypergraphs
4.2.3 Undirected Graphical Representation
4.3 Separation and Conditional Independence
4.3.1 Separation and D-Separation
4.3.2 Reading Dependence and Independence from Graphs
4.3.3 Gibbs–Markov Equivalence Theorem
4.4 Edge Directions and Causality
4.5 Other Representations
4.5.1 Influence Diagrams
4.5.2 Structural Equation Models
4.5.3 Other Graphical Models
5 Efficient Calculations
5.1 Belief Updating with Two Variables
5.2 More Efficient Procedures for Chains and Trees
5.2.1 Propagation in Chains
5.2.2 Propagation in Trees
5.2.3 Virtual Evidence
5.3 Belief Updating in Multiply Connected Graphs
5.3.1 Updating in the Presence of Loops
5.3.2 Constructing a Junction Tree
5.3.3 Propagating Evidence Through a Junction Tree
5.4 Application to Assessment
5.4.1 Proficiency and Evidence Model Bayes Net Fragments
5.4.2 Junction Trees for Fragments
5.4.3 Calculation with Fragments
5.5 The Structure of a Test
5.5.1 The Q-Matrix for Assessments Using Only Discrete Items
5.5.2 The Q-Matrix for a Test Using Multi-observable Tasks
5.6 Alternative Computing Algorithms
5.6.1 Variants of the Propagation Algorithm
5.6.2 Dealing with Unfavorable Topologies
6 Some Example Networks
6.1 A Discrete IRT Model
6.1.1 General Features of the IRT Bayes Net
6.1.2 Inferences in the IRT Bayes Net
6.2 The “Context” Effect
6.3 Compensatory, Conjunctive, and Disjunctive Models
6.4 A Binary-Skills Measurement Model
6.4.1 The Domain of Mixed Number Subtraction
6.4.2 A Bayes Net Model for Mixed-Number Subtraction
6.4.3 Inferences from the Mixed-Number Subtraction Bayes Net
6.5 Discussion
7 Explanation and Test Construction
7.1 Simple Explanation Techniques
7.1.1 Node Coloring
7.1.2 Most Likely Scenario
7.2 Weight of Evidence
7.2.1 Evidence Balance Sheet
7.2.2 Evidence Flow Through the Graph
7.3 Activity Selection
7.3.1 Value of Information
7.3.2 Expected Weight of Evidence
7.3.3 Mutual Information
7.4 Test Construction
7.4.1 Computer Adaptive Testing
7.4.2 Critiquing
7.4.3 Fixed-Form Tests
7.5 Reliability and Assessment Information
7.5.1 Accuracy Matrix
7.5.2 Consistency Matrix
7.5.3 Expected Value Matrix
7.5.4 Weight of Evidence as Information
Part II Learning and Revising Models from Data
8 Parameters for Bayesian Network Models
8.1 Parameterizing a Graphical Model
8.2 Hyper-Markov Laws
8.3 The Conditional Multinomial—Hyper-Dirichlet Family
8.3.1 Beta-Binomial Family
8.3.2 Dirichlet-Multinomial Family
8.3.3 The Hyper-Dirichlet Law
8.4 Noisy-OR and Noisy-AND Models
8.4.1 Separable Influence
8.5 DiBello's Effective Theta Distributions
8.5.1 Mapping Parent Skills to Space
8.5.2 Combining Input Skills
8.5.3 Samejima's Graded Response Model
8.5.4 Normal Link Function
8.6 Eliciting Parameters and Laws
8.6.1 Eliciting Conditional Multinomial and Noisy-AND
8.6.2 Priors for DiBello's Effective Theta Distributions
8.6.3 Linguistic Priors
9 Learning in Models with Fixed Structure
9.1 Data, Models, and Plate Notation
9.1.1 Plate Notation
9.1.2 A Bayesian Framework for a Generic Measurement Model
9.1.3 Extension to Covariates
9.2 Techniques for Learning with Fixed Structure
9.2.1 Bayesian Inference for the General Measurement Model
9.2.2 Complete Data Tables
9.3 Latent Variables as Missing Data
9.4 The EM Algorithm
9.5 Markov Chain Monte Carlo Estimation
9.5.1 Gibbs Sampling
9.5.2 Properties of MCMC Estimation
9.5.3 The Metropolis–Hastings Algorithm
9.6 MCMC Estimation in Bayes Nets in Assessment
9.6.1 Initial Calibration
9.6.2 Online Calibration
9.7 Caution: MCMC and EM are Dangerous!
10 Critiquing and Learning Model Structure
10.1 Fit Indices Based on Prediction Accuracy
10.2 Posterior Predictive Checks
10.3 Graphical Methods
10.4 Differential Task Functioning
10.5 Model Comparison
10.5.1 The DIC Criterion
10.5.2 Prediction Criteria
10.6 Model Selection
10.6.1 Simple Search Strategies
10.6.2 Stochastic Search
10.6.3 Multiple Models
10.6.4 Priors Over Models
10.7 Equivalent Models and Causality
10.7.1 Edge Orientation
10.7.2 Unobserved Variables
10.7.3 Why Unsupervised Learning cannot Prove Causality
10.8 The “True” Model
11 An Illustrative Example
11.1 Representing the Cognitive Model
11.1.1 Representing the Cognitive Model as a Bayesian Network
11.1.2 Representing the Cognitive Model as a Bayesian Network
11.1.3 Higher-Level Structure of the Proficiency Model; i.e., p(θ|λ) and p(λ)
11.1.4 High Level Structure of the Evidence Models; i.e., p(π)
11.1.5 Putting the Pieces Together
11.2 Calibrating the Model with Field Data
11.2.1 MCMC Estimation
11.2.2 Scoring
11.2.3 Online Calibration
11.3 Model Checking
11.3.1 Observable Characteristic Plots
11.3.2 Posterior Predictive Checks
11.4 Closing Comments
Part III Evidence-Centered Assessment Design
12 The Conceptual Assessment Framework
12.1 Phases of the Design Process and Evidentiary Arguments
12.1.1 Domain Analysis and Domain Modeling
12.1.2 Arguments and Claims
12.2 The Student Proficiency Model
12.2.1 Proficiency Variables
12.2.2 Relationships Among Proficiency Variables
12.2.3 Reporting Rules
12.3 Task Models
12.4 Evidence Models
12.4.1 Rules of Evidence (for Evidence Identification)
12.4.2 Statistical Models of Evidence (for Evidence Accumulation)
12.5 The Assembly Model
12.6 The Presentation Model
12.7 The Delivery Model
12.8 Putting It All Together
13 The Evidence Accumulation Process
13.1 The Four-Process Architecture
13.1.1 A Simple Example of the Four-Process Framework
13.2 Producing an Assessment
13.2.1 Tasks and Task Model Variables
13.2.2 Evidence Rules
13.2.3 Evidence Models, Links, and Calibration
13.3 Scoring
13.3.1 Basic Scoring Protocols
13.3.2 Adaptive Testing
13.3.3 Technical Considerations
13.3.4 Score Reports
14 Biomass: An Assessment of Science Standards
14.1 Design Goals
14.2 Designing Biomass
14.2.1 Reconceiving Standards
14.2.2 Defining Claims
14.2.3 Defining Evidence
14.3 The Biomass Conceptual Assessment Framework
14.3.1 The Proficiency Model
14.3.2 The Assembly Model
14.3.3 Task Models
14.3.4 Evidence Models
14.4 The Assessment Delivery Processes
14.4.1 Biomass Architecture
14.4.2 The Presentation Process
14.4.3 Evidence Identification
14.4.4 Evidence Accumulation
14.4.5 Activity Selection
14.4.6 The Task/Evidence Composite Library
14.4.7 Controlling the Flow of Information Among the Processes
14.5 Conclusion
15 The Biomass Measurement Model
15.1 Specifying Prior Distributions
15.1.1 Specification of Proficiency Variable Priors
15.1.2 Specification of Evidence Model Priors
15.1.3 Summary Statistics
15.2 Pilot Testing
15.2.1 A Convenience Sample
15.2.2 Item and other Exploratory Analyses
15.3 Updating Based on Pilot Test Data
15.3.1 Posterior Distributions
15.3.2 Some Observations on Model Fit
15.3.3 A Quick Validity Check
15.4 Conclusion
16 The Future of Bayesian Networks in Educational Assessment
16.1 Applications of Bayesian Networks
16.2 Extensions to the Basic Bayesian Network Model
16.2.1 Object-Oriented Bayes Nets
16.2.2 Dynamic Bayesian Networks
16.2.3 Assessment-Design Support
16.3 Connections with Instruction
16.3.1 Ubiquitous Assessment
16.4 Evidence-Centered Assessment Design and Validity
16.5 What We Still Do Not Know
A Bayesian Network Resources
A.1 Software
A.1.1 Bayesian Network Manipulation
A.1.2 Manual Construction of Bayesian Networks
A.1.3 Markov Chain Monte Carlo
A.2 Sample Bayesian Networks
References
Author Index
Subject Index