Managing Machine Learning Projects: From design to deployment (Final Release)

Guide machine learning projects from design to production with the techniques in this one-of-a-kind project management guide. No ML skills required.

In Managing Machine Learning Projects you’ll learn essential machine learning project management techniques, including:

Understanding an ML project’s requirements
Setting up the infrastructure for the project and resourcing a team
Working with clients and other stakeholders
Dealing with data resources and bringing them into the project for use
Handling the lifecycle of models in the project
Managing the application of ML algorithms
Evaluating the performance of algorithms and models
Making decisions about which models to adopt for delivery
Taking models through development and testing
Integrating models with production systems to create effective applications
Steps and behaviors for managing the ethical implications of ML technology

Managing Machine Learning Projects is an end-to-end guide for delivering machine learning applications on time and under budget. It lays out tools, approaches, and processes designed to handle the unique challenges of machine learning project management. You’ll follow an in-depth case study through a series of sprints and see how to put each technique into practice. The book’s close attention to data privacy and community impact helps ensure your projects are ethical, compliant with global legislation, and protected from failures caused by bias and other issues.

About the Technology

Ferrying machine learning projects to production often feels like navigating uncharted waters. From accounting for large data resources to tracking and evaluating multiple models, machine learning technology has radically different requirements than traditional software. Never fear! This book lays out the unique practices you’ll need to ensure your projects succeed.

About the Book

Managing Machine Learning Projects is an amazing source of battle-tested techniques for effective delivery of real-life machine learning solutions. The book is laid out across a series of sprints that take you from a project proposal all the way to deployment into production. You’ll learn how to plan essential infrastructure, coordinate experimentation, protect sensitive data, and reliably measure model performance. Many ML projects fail to create real value. Read this book to make sure your project is a success.

What’s Inside

Set up infrastructure and resource a team
Bring data resources into a project
Accurately estimate time and effort
Evaluate which models to adopt for delivery
Integrate models into effective applications

Author(s): Simon Thompson
Publisher: Manning Publications Co.
Year: 2023

Language: English
Pages: 273

inside front cover
Delivering Machine Learning Projects
Copyright
contents
front matter
preface
acknowledgments
about this book
How this book is organized: A roadmap
LiveBook discussion forum
about the author
about the cover illustration
1 Introduction: Delivering machine learning projects is hard; let’s do it better
1.1 What is machine learning?
1.2 Why is ML important?
1.3 Other machine learning methodologies
1.4 Understanding this book
1.5 Case study: The Bike Shop
Summary
2 Pre-project: From opportunity to requirements
2.1 Pre-project backlog
2.2 Project management infrastructure
2.3 Project requirements
2.3.1 Funding model
2.3.2 Business requirements
2.4 Data
2.5 Security and privacy
2.6 Corporate responsibility, regulation, and ethical considerations
2.7 Development architecture and process
2.7.1 Development environment
2.7.2 Production architecture
Summary
3 Pre-project: From requirements to proposal
3.1 Build a project hypothesis
3.2 Create an estimate
3.2.1 Time and effort estimates
3.2.2 Team design for ML projects
3.2.3 Project risks
3.3 Pre-sales/pre-project administration
3.4 Pre-project/pre-sales checklist
3.5 The Bike Shop pre-sales
3.6 Pre-project postscript
Summary
4 Getting started
4.1 Sprint 0 backlog
4.2 Finalize team design and resourcing
4.3 A way of working
4.3.1 Process and structure
4.3.2 Heartbeat and communication plan
4.3.3 Tooling
4.3.4 Standards and practices
4.3.5 Documentation
4.4 Infrastructure plan
4.4.1 System access
4.4.2 Technical infrastructure evaluation
4.5 The data story
4.5.1 Data collection motivation
4.5.2 Data collection mechanism
4.5.3 Lineage
4.5.4 Events
4.6 Privacy, security, and an ethics plan
4.7 Project roadmap
4.8 Sprint 0 checklist
4.9 Bike Shop: project setup
Summary
5 Diving into the problem
5.1 Sprint 1 backlog
5.2 Understanding the data
5.2.1 The data survey
5.2.2 Surveying numerical data
5.2.3 Surveying categorical data
5.2.4 Surveying unstructured data
5.2.5 Reporting and using the survey
5.3 Business problem refinement, UX, and application design
5.4 Building data pipelines
5.4.1 Data fusion challenges
5.4.2 Pipeline jungles
5.4.3 Data testing
5.5 Model repository and model versioning
5.5.1 Features, foundational models, and training regimes
5.5.2 Overview of versioning
Summary
6 EDA, ethics, and baseline evaluations
6.1 Exploratory data analysis (EDA)
6.1.1 EDA objectives
6.1.2 Summarizing and describing data
6.1.3 Plots and visualizations
6.1.4 Unstructured data
6.2 Ethics checkpoint
6.3 Baseline models and performance
6.4 What if there are problems?
6.5 Pre-modeling checklist
6.6 The Bike Shop: Pre-modeling
6.6.1 After the survey
6.6.2 EDA implementation
Summary
7 Making useful models with ML
7.1 Sprint 2 backlog
7.2 Feature engineering and data augmentation
7.2.1 Data augmentation
7.3 Model design
7.3.1 Design forces
7.3.2 Overall design
7.3.3 Choosing component models
7.3.4 Inductive bias
7.3.5 Multiple disjoint models
7.3.6 Model composition
7.4 Making models with ML
7.4.1 Modeling process
7.4.2 Experiment tracking and model repositories
7.4.3 AutoML and model search
7.5 Stinky, dirty, no good, smelly models
Summary
8 Testing and selection
8.1 Why test and select?
8.2 Testing processes
8.2.1 Offline testing
8.2.2 Offline test environments
8.2.3 Online testing
8.2.4 Field trials
8.2.5 A/B testing
8.2.6 Multi-armed bandits (MABs)
8.2.7 Nonfunctional testing
8.3 Model selection
8.3.1 Quantitative selection
8.3.2 Choosing with comparable tests
8.3.3 Choosing with many tests
8.3.4 Qualitative selection measures
8.4 Post-modeling checklist
8.5 The Bike Shop: sprint 2
Summary
9 Sprint 3: system building and production
9.1 Sprint 3 backlog
9.2 Types of ML implementations
9.2.1 Assistive systems: recommenders and dashboards
9.2.2 Delegative systems
9.2.3 Autonomous systems
9.3 Nonfunctional review
9.4 Implementing the production system
9.4.1 Production data infrastructure
9.4.2 The model server and the inference service
9.4.3 User interface design
9.5 Logging, monitoring, management, feedback, and documentation
9.5.1 Model governance
9.5.2 Documentation
9.6 Pre-release testing
9.7 Ethics review
9.8 Promotion to production
9.9 You aren’t done yet
9.10 The Bike Shop sprint 3
Summary
10 Post project (sprint Ω)
10.1 Sprint Ω backlog
10.2 Off your hands and into production?
10.2.1 Getting a grip
10.2.2 ML technical debt and model drift
10.2.3 Retraining
10.2.4 In an emergency
10.2.5 Problems in review
10.3 Team post-project review
10.4 Improving practice
10.5 New technology adoption
10.6 Case study
10.7 Goodbye and good luck
Summary
references
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
index