Database Performance at Scale: A Practical Guide

Discover critical considerations and best practices for improving database performance based on what has worked, and failed, across thousands of teams and use cases in the field. This book provides practical guidance for understanding the database-related opportunities, trade-offs, and traps you might encounter while trying to optimize data-intensive applications for high throughput and low latency. Whether you are building a new system from the ground up or trying to optimize an existing use case for increased demand, this book covers the essentials. The book begins with a look at the many factors impacting database performance at the extreme scale that today’s game-changing applications face, or at least hope to achieve. You’ll gain insight into the performance impact of both technical and business requirements, and how those should influence your decisions around database infrastructure and topology. The authors share an inside perspective on often-overlooked engineering details that could be constraining, or helping, your team’s database performance. The book also covers benchmarking and monitoring practices by which to measure and validate the outcomes from the decisions that you make. The ultimate goal of the book is to help you discover new ways to optimize database performance for your team’s specific use cases, requirements, and expectations.

What You Will Learn
• Understand often overlooked factors that impact database performance at scale
• Recognize data-related performance and scalability challenges associated with your project
• Select a database architecture that’s suited to your workloads, use cases, and requirements
• Avoid common mistakes that could impede your long-term agility and growth
• Jumpstart teamwide adoption of best practices for optimizing database performance at scale

Who This Book Is For
Individuals and teams looking to optimize distributed database performance for an existing project or to begin a new performance-sensitive project with a solid and scalable foundation. This will likely include software architects, database architects, and senior software engineers who are either experiencing or anticipating pain related to database latency and/or throughput.

You are most likely:
• Experiencing or anticipating some pain related to database latency and/or throughput
• Working primarily on a use case with terabytes to petabytes of raw (unreplicated) data, over 10K operations per second, and with P99 latencies measured in milliseconds
• At least somewhat familiar with scalable distributed databases such as Apache Cassandra, ScyllaDB, Amazon DynamoDB, Google Cloud Bigtable, CockroachDB, and so on
• A software architect, database architect, software engineer, VP of engineering, or technical CTO/founder working with a data-intensive application

What This Book Is NOT
A few things that this book is not attempting to be:
• A reference for infrastructure engineers building databases. We focus on people working with a database.
• A “definitive guide” to distributed databases, NoSQL, or data-intensive applications. We focus on the top database considerations most critical to performance.
• A guide on how to configure, work with, optimize, or tune any specific database. We focus on broader strategies you can “port” across databases.

Author(s): Felipe Cardeneti Mendes, Piotr Sarna, Pavel Emelyanov
Publisher: Apress
Year: 2023

Language: English
Pages: 270

Table of Contents
About the Authors
About the Technical Reviewers
Acknowledgments
Introduction
Chapter 1: A Taste of What You’re Up Against: Two Tales
Joan Dives Into Drivers and Debugging
Joan’s Diary of Lessons Learned, Part I
The Tuning
Joan’s Diary of Lessons Learned, Part II
Patrick’s Unlucky Green Fedoras
Patrick’s Diary of Lessons Learned, Part I
The First Spike
Patrick’s Diary of Lessons Learned, Part II
The First Loss
Patrick’s Diary of Lessons Learned, Part III
The Spike Strikes Again
Patrick’s Diary of Lessons Learned, Part IV
Backup Strikes Back
Patrick’s Diary of Lessons Learned, Part V
Summary
Chapter 2: Your Project, Through the Lens of Database Performance
Workload Mix (Read/Write Ratio)
Write-Heavy Workloads
Read-Heavy Workloads
Mixed Workloads
Delete-Heavy Workloads
Competing Workloads (Real-Time vs Batch)
Item Size
Item Type
Dataset Size
Throughput Expectations
Latency Expectations
Concurrency
Connected Technologies
Demand Fluctuations
ACID Transactions
Consistency Expectations
Geographic Distribution
High-Availability Expectations
Summary
Chapter 3: Database Internals: Hardware and Operating System Interactions
CPU
Share Nothing Across Cores
Futures-Promises
Execution Stages
Frontend
Branch Speculation
Backend
Retiring
Implications for Databases
Memory
Allocation
Cache Control
I/O
Traditional Read/Write
mmap
Direct I/O (DIO)
Asynchronous I/O (AIO/DIO)
Understanding the Tradeoffs
Copying and MMU Activity
I/O Scheduling
Thread Scheduling
I/O Alignment
Application Complexity
Choosing the Filesystem and/or Disk
Filesystems vs Raw Disks
Appending Writes
How Modern SSDs Work
Networking
DPDK
IRQ Binding
Summary
Chapter 4: Database Internals: Algorithmic Optimizations
Optimizing Collections
To B- or Not to B-Tree
Linear Search on Steroids
Scanning the Tree
When the Tree Size Matters
The Secret Life of Separation Keys
Summary
Chapter 5: Database Drivers
Relationship Between Clients and Servers
Workload Types
Interactive Workloads
Batch (Analytical) Workloads
Mixed Workloads
Throughput vs Goodput
Timeouts
Client-Side Timeouts
Server-Side Timeouts
A Cautionary Tale
Contextual Awareness
Topology and Metadata
Current Load
Request Caching
Query Locality
Retries
Error Categories
Idempotence
Retry Policies
Paging
Concurrency
Modern Hardware
Modern Software
What to Look for When Selecting a Driver
Summary
Chapter 6: Getting Data Closer
Databases as Compute Engines
User-Defined Functions and Procedures
Determinism
Latency
Just-in-Time Compilation (JIT)
Examples
Best Practices
User-Defined Aggregates
Built-In Aggregates
Components
Initial Value
State Transition Function
Final Function
Reduce Function
Examples
State Transition Function
Final Function
Aggregate Definition
Distributed User-Defined Aggregate
Best Practices
WebAssembly for User-Defined Functions
Runtime
Back to Latency
Edge Computing
Performance
Conflict-Free Replicated Data Types
G-Counter
PN-Counter
G-Set
LWW-Set
Summary
Chapter 7: Infrastructure and Deployment Models
Core Hardware Considerations for Speed at Scale
Identifying the Source of Your Performance Bottlenecks
Achieving Balance
Setting Realistic Expectations
Recommendations for Specific Hardware Components
Storage
Disk Types
Disk Setup
Disk Size
Raw Devices and Custom Drivers
Maintaining Disk Performance Over Time
Tiered Storage
CPUs (Cores)
Memory (RAM)
Network
Considerations in the Cloud
Fully Managed Database-as-a-Service
Serverless Deployment Models
Containerization and Kubernetes
Summary
Chapter 8: Topology Considerations
Replication Strategy
Rack Configuration
Multi-Region or Global Replication
Multi-Availability Zones vs. Multi-Region
Scaling Up vs Scaling Out
Workload Isolation
More on Workload Prioritization for Logical Isolation
Abstraction Layers
Load Balancing
External Caches
An External Cache Adds Latency
An External Cache Is an Additional Cost
External Caching Decreases Availability
Application Complexity: Your Application Needs to Handle More Cases
External Caching Ruins the Database Caching
External Caching Might Increase Security Risks
External Caching Ignores the Database Knowledge and Database Resources
Summary
Chapter 9: Benchmarking
Latency or Throughput: Choose Your Focus
Less Is More (at First): Taking a Phased Approach
Benchmarking Do’s and Don’ts
Know What’s Under the Hood of Your Database (Or Find Someone Who Knows)
Choose an Environment That Takes Advantage of the Database’s Potential
Use an Environment That Represents Production
Don’t Overlook Observability
Use Standardized Benchmarking Tools Whenever Feasible
Use Representative Data Models, Datasets, and Workloads
Data Models
Dataset Size
Workloads
Exercise Your Cache Realistically
Look at Steady State
Watch Out for Client-Side Bottlenecks
Also Watch Out for Networking Issues
Document Meticulously to Ensure Repeatability
Reporting Do’s and Don’ts
Be Careful with Aggregations
Don’t Assume People Will Believe You
Take Coordinated Omission Into Account
Special Considerations for Various Benchmarking Goals
Preparing for Growth
Comparing Different Databases
Comparing the Same Database on Different Infrastructure
Assessing the Impact of a Data Modeling or Database Configuration Change
Beyond the Usual Benchmark
Benchmarking Admin Operations
Testing Disaster Recovery
Benchmarking at Extreme Scale
Summary
Chapter 10: Monitoring
Taking a Proactive Approach
Tracking Core Database KPIs
Database Cluster KPIs
What to Look for at Different Levels (Datacenter, Node, CPU/Shard)
Three Industry-Specific Examples
Application KPIs
Infrastructure/Hardware KPIs
Creating Effective Custom Alerts
Walking Through Sample Scenarios
One Replica Is Lagging in Acknowledging Requests
Disappointing P99 Read Latencies
Monitoring Options
The Database Vendor’s Monitoring Stack
Build Your Own Dashboards and Alerting (Grafana, Grafana Loki)
Third-Party Database Monitoring Tools
Full Stack Application Performance Monitoring (APM) Tool
Summary
Chapter 11: Administration
Admin Operations and Performance
Looking at Admin Operations Through the Lens of Performance
Backups
Impacts
Optimization
Compaction
Impacts
Optimization
Summary
Appendix A: A Brief Look at Fundamental Database Design Decisions
Sharding and Replication
Sharding
Replication
Learning More
Consensus Algorithms
Raft
Paxos
Comparing Leaderless and “Leader-Based” Classes
Learning More
B-Tree vs LSM Tree
Learning More
Record Storage Approach
Row-Oriented Databases
Column-Oriented Databases
Learning More
Index