Begin your journey toward efficient data manipulation with this robust technical guide, and enhance your aggregation skills while building pipelines for a variety of tasks.
Key Features
Build effective aggregation pipelines for increased productivity and performance
Solve common data manipulation and analysis problems with the help of practical examples
Learn essential strategies for aggregating time-series data in financial and IoT datasets
Book Description
Officially endorsed by MongoDB, Inc., Practical MongoDB Aggregations helps you unlock the full potential of the MongoDB aggregation framework, including the latest features of MongoDB 7.0. This book provides practical, easy-to-digest principles and approaches for increasing your effectiveness in developing aggregation pipelines, supported by examples for building pipelines to solve complex data manipulation and analytical tasks.
This book is tailored to developers, architects, data analysts, data engineers, and data scientists with some familiarity with the aggregation framework. It begins by explaining the framework's architecture and then shows you how to build pipelines optimized for productivity and scale.
Given the critical role arrays play in MongoDB's document model, the book delves into best practices for optimally manipulating arrays. The latter part of the book equips you with examples to solve common data processing challenges so you can apply the lessons you've learned to practical situations. By the end of this MongoDB book, you’ll have learned how to utilize the MongoDB aggregation framework to streamline your data analysis and manipulation processes effectively.
What you will learn
Develop dynamic aggregation pipelines tailored to changing business requirements
Master essential techniques to optimize aggregation pipelines for rapid data processing
Achieve optimal efficiency for applying aggregations to vast datasets with effective sharding strategies
Eliminate the performance penalties of processing data externally by filtering, grouping, and calculating aggregated values directly within the database
Use pipelines to help you secure your data access and distribution
Who this book is for
This book is for intermediate-level developers, architects, analysts, engineers, and data scientists who are interested in learning about aggregation capabilities in MongoDB. Working knowledge of MongoDB is needed to get the most out of this book.
Author(s): Paul Done
Publisher: Packt Publishing Pvt. Ltd.
Year: 2023
Language: English
Pages: 542
First edition
Acknowledgements
Foreword
Preface
How will this book help you?
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Download a free PDF copy of this book
1
MongoDB Aggregations Explained
What is the MongoDB aggregation framework?
What is the MongoDB aggregation language?
What do developers use the aggregation framework for?
A short history of MongoDB aggregations
Aggregation capabilities in MongoDB server releases
Getting going
Setting up your environment
Database
Client tool
Getting further help
Summary
Part 1: Guiding Tips and Principles
2
Optimizing Pipelines for Productivity
Embrace composability for increased productivity
Guiding principles to promote composability
Using macro functions
So, what's the best way of factoring out code?
Better alternatives for a projection stage
When to use $set and $unset
When to use $project
The hidden danger of $project
Key projection takeaways
Summary
3
Optimizing Pipelines for Performance
Using explain plans to identify performance bottlenecks
Viewing an explain plan
Understanding the explain plan
Guidance for optimizing pipeline performance
Be cognizant of streaming vs blocking stages ordering
Avoid unwinding and regrouping documents just to process each array's elements
Encourage match filters to appear early in the pipeline
Summary
4
Harnessing the Power of Expressions
Aggregation expressions explained
What do expressions produce?
Chaining operator expressions together
Can all stages use expressions?
What is using $expr inside $match all about?
Restrictions when using expressions within $match
Advanced use of expressions for array processing
if-else conditional comparison
The "power" array operators
for-each looping to transform an array
for-each looping to compute a summary value from an array
for-each looping to locate an array element
Reproducing $map behavior using $reduce
Adding new fields to existing objects in an array
Rudimentary schema reflection using arrays
Summary
5
Optimizing Pipelines for Sharded Clusters
A brief summary of MongoDB sharded clusters
Sharding implications for pipelines
Sharded aggregation constraints
Where does a sharded aggregation run?
Pipeline splitting at runtime
Execution of the shards part of the split pipeline
Execution of the merger part of the split pipeline
Difference in merging behavior for grouping versus sorting
Performance tips for sharded aggregations
Summary
Part 2: Aggregations by Example
6
Foundational Examples: Filtering, Grouping, and Unwinding
Filtered top subset
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline results
Pipeline observations
Group and total
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Unpack arrays and group differently
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline results
Pipeline observations
Distinct list of values
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
7
Joining Data Examples
One-to-one join
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Multi-field join and one-to-many
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
8
Fixing and Generating Data Examples
Strongly typed conversion
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Converting incomplete date strings
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Generating mock test data
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
9
Trend Analysis Examples
Faceted classification
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Largest graph network
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Incremental analytics
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
10
Securing Data Examples
Redacted view
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Mask sensitive fields
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Role programmatic restricted view
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
11
Time-Series Examples
IoT power consumption
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
State change boundaries
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
12
Array Manipulation Examples
Summarizing arrays for first, last, minimum, maximum, and average values
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Pivoting array items by a key
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Array sorting and percentiles
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Array element grouping
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Array fields joining
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Comparison of two arrays
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Jagged array condensing
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
13
Full-Text Search Examples
What is Atlas Search?
Compound text search criteria
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Facets and counts text search
Scenario
Populating the sample data
Defining the aggregation pipeline
Executing the aggregation pipeline
Expected pipeline result
Pipeline observations
Summary
Appendix
Create an Atlas Search index
Afterword
Index
Why subscribe?
Other books you may enjoy
Packt is searching for authors like you
Download a free PDF copy of this book