Optimize the decisions that define your code by exploring the common mistakes and intentional tradeoffs made by expert developers.
In Software Mistakes and Tradeoffs you will learn how to:
Reason about your systems to make better, more intuitive design decisions
Understand consequences and how to balance tradeoffs
Pick the right library for your problem
Thoroughly analyze all of your service’s dependencies
Understand delivery semantics and how they influence distributed architecture
Design and execute performance tests to detect code hot paths and validate a system’s SLA
Detect and optimize hot paths in your code to focus optimization efforts on root causes
Decide on a suitable data model for date/time handling to avoid common (but subtle) mistakes
Reason about compatibility and versioning to prevent unexpected problems for API clients
Understand tight/loose coupling and how it influences coordination of work between teams
Clarify requirements until they are precise, easily implemented, and easily tested
Optimize your APIs for a friendly user experience
Code performance versus simplicity. Delivery speed versus duplication. Flexibility versus maintainability. Every decision you make in software engineering involves balancing tradeoffs. In Software Mistakes and Tradeoffs you’ll learn from costly mistakes that Tomasz Lelek and Jon Skeet have encountered over their impressive careers. You’ll explore real-world scenarios where a poor understanding of tradeoffs leads to major problems down the road, so you can pre-empt your own mistakes with a more thoughtful approach to decision making.
Learn how code duplication impacts the coupling and evolution speed of your systems, and how simple-sounding requirements can have hidden nuances with respect to date and time information. Discover how to efficiently narrow your optimization scope according to the 80/20 Pareto principle, and how to ensure consistency in your distributed systems. You’ll soon have built up the kind of knowledge base that only comes from years of experience.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Every step in a software project involves making tradeoffs. When you’re balancing speed, security, cost, delivery time, features, and more, reasonable design choices may prove problematic in production. The expert insights and relatable war stories in this book will help you make good choices as you design and build applications.
About the book
Software Mistakes and Tradeoffs explores real-world scenarios where the wrong tradeoff decisions were made and illuminates what could have been done differently. In it, authors Tomasz Lelek and Jon Skeet share wisdom based on decades of software engineering experience, including some delightfully instructive mistakes. You’ll appreciate the specific tips and practical techniques that accompany each example, along with evergreen patterns that will change the way you approach your next projects.
What's inside
How to reason about your software systematically
How to pick tools, libraries, and frameworks
How tight and loose coupling affect team coordination
Requirements that are precise, easy to implement, and easy to test
About the reader
For mid- and senior-level developers and architects who make decisions about software design and implementation.
About the authors
Tomasz Lelek works daily with a wide range of production services, architectures, and JVM languages. A Google engineer and author of C# in Depth, Jon Skeet is famous for his many practical contributions to Stack Overflow.
Table of Contents
1 Introduction
2 Code duplication is not always bad: Code duplication vs. flexibility
3 Exceptions vs. other patterns of handling errors in your code
4 Balancing flexibility and complexity
5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance
6 Simplicity vs. cost of maintenance for your API
7 Working effectively with date and time data
8 Leveraging data locality and memory of your machines
9 Third-party libraries: Libraries you use become your code
10 Consistency and atomicity in distributed systems
11 Delivery semantics in distributed systems
12 Managing versioning and compatibility
13 Keeping up to date with trends vs. cost of maintenance of your code
Author(s): Tomasz Lelek, Jon Skeet
Edition: 1
Publisher: Manning
Year: 2022
Language: English
Pages: 416
Tags: Software Development; Software Development Best Practices; Software Development Process; Reasoning about Software Development
contents
front matter
preface
acknowledgments
about this book
about the authors
about the cover illustration
1 Introduction
1.1 Consequences of every decision and pattern
Unit testing decisions
Proportions of unit and integration tests
1.2 Code design patterns and why they do not always work
Measuring our code
1.3 Architecture design patterns and why they do not always work
Scalability and elasticity
Development speed
Complexity of microservices
2 Code duplication is not always bad: Code duplication vs. flexibility
2.1 Common code between codebases and duplication
Adding a new business requirement that requires code duplication
Implementing the new business requirement
Evaluating the result
2.2 Libraries and sharing code between codebases
Evaluating the tradeoffs and disadvantages of shared libraries
Creating a shared library
2.3 Code extraction to a separate microservice
Looking at the tradeoffs and disadvantages of a separate service
Conclusions about separate service
2.4 Improving loose coupling by code duplication
2.5 An API design with inheritance to reduce duplication
Extracting a base request handler
Looking at inheritance and tight coupling
Looking at the tradeoffs between inheritance and composition
Looking at inherent and incidental duplication
3 Exceptions vs. other patterns of handling errors in your code
3.1 Hierarchy of exceptions
Catch-all vs. a more granular approach to handling errors
3.2 Best patterns to handle exceptions in the code that you own
Handling checked exceptions in a public API
Handling unchecked exceptions in a public API
3.3 Anti-patterns in exception handling
Closing resources in case of an error
Anti-pattern of using exceptions to control application flow
3.4 Exceptions from third-party libraries
3.5 Exceptions in multithread environments
Exceptions in an async workflow with a promise API
3.6 Functional approach to handling errors with Try
Using Try in production code
Mixing Try with code that throws an exception
3.7 Performance comparison of exception-handling code
4 Balancing flexibility and complexity
4.1 A robust but not extensible API
Designing a new component
Starting with the most straightforward code
4.2 Allowing clients to provide their own metrics framework
4.3 Providing extensibility of your APIs via hooks
Guarding against unpredictable usage of the hooks API
Performance impact of the hook API
4.4 Providing extensibility of your APIs via listeners
Using listeners vs. hooks
Immutability of our design
4.5 Flexibility analysis of an API vs. the cost of maintenance
5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance
5.1 When premature optimization is evil
Creating accounts processing pipeline
Optimizing processing based on false assumptions
Benchmarking performance optimization
5.2 Hot paths in your code
Understanding the Pareto principle in the context of software systems
Configuring the number of concurrent users (threads) for a given SLA
5.3 A word service with a potential hot path
Getting the word of the day
Validating if the word exists
Exposing the WordsService using HTTP service
5.4 Hot path detection in your code
Creating API performance tests using Gatling
Measuring code paths using MetricRegistry
5.5 Improvements for hot path performance
Creating JMH microbenchmark for the existing solution
Optimizing word exists using a cache
Modifying performance tests to have more input words
6 Simplicity vs. cost of maintenance for your API
6.1 A base library used by other tools
Creating a cloud service client
Exploring authentication strategies
Understanding the configuration mechanism
6.2 Directly exposing settings of a dependent library
Configuring the batch tool
6.3 A tool that is abstracting settings of a dependent library
Configuring the streaming tool
6.4 Adding new setting for the cloud client library
Adding a new setting to the batch tool
Adding a new setting to the streaming tool
Comparing both solutions for UX friendliness and maintainability
6.5 Deprecating/removing a setting in the cloud client library
Removing a setting from the batch tool
Removing a setting from the streaming tool
Comparing both solutions for UX friendliness and maintainability
7 Working effectively with date and time data
7.1 Concepts in date and time information
Machine time: Instants, epochs, and durations
Civil time: Calendar systems, dates, times, and periods
Time zones, UTC, and offsets from UTC
Date and time concepts that hurt my head
7.2 Preparing to work with date and time information
Limiting your scope
Clarifying date and time requirements
Using the right libraries or packages
7.3 Implementing date and time code
Applying concepts consistently
Improving testability by avoiding defaults
Representing date and time values in text
Explaining code with comments
7.4 Corner cases to specify and test
Calendar arithmetic
Time zone transitions at midnight
Handling ambiguous or skipped times
Working with evolving time zone data
8 Leveraging data locality and memory of your machines
8.1 What is data locality?
Moving computations to data
Scaling processing using data locality
8.2 Data partitioning and splitting data
Offline big data partitioning
Partitioning vs. sharding
Partitioning algorithms
8.3 Join big data sets from multiple partitions
Joining data within the same physical machine
Joining that requires data movement
Optimizing join leveraging broadcasting
8.4 Data processing: Memory vs. disk
Using disk-based processing
Why do we need MapReduce?
Calculating access times
RAM-based processing
8.5 Implement joins using Apache Spark
Implementing a join without broadcast
Implementing a join with broadcast
9 Third-party libraries: Libraries you use become your code
9.1 Importing a library and taking full responsibility for its settings: Beware of the defaults
9.2 Concurrency models and scalability
Using async and sync APIs
Distributed scalability
9.3 Testability
Testing library
Testing with fakes (test double) and mocks
Integration testing toolkit
9.4 Dependencies of third-party libraries
Avoiding version conflicts
Too many dependencies
9.5 Choosing and maintaining third-party dependencies
First impressions
Different approaches to reusing code
Vendor lock-in
Licensing
Libraries vs. frameworks
Security and updates
Decision checklist
10 Consistency and atomicity in distributed systems
10.1 At-least-once delivery of data sources
Traffic between one-node services
Retrying an application’s call
Producing data and idempotency
Understanding Command Query Responsibility Segregation (CQRS)
10.2 A naive implementation of a deduplication library
10.3 Common mistakes when implementing deduplication in distributed systems
One node context
Multiple nodes context
10.4 Making your logic atomic to prevent race conditions
11 Delivery semantics in distributed systems
11.1 Architecture of event-driven applications
11.2 Producer and consumer applications based on Apache Kafka
Looking at the Kafka consumer side
Understanding the Kafka brokers setup
11.3 The producer logic
Choosing consistency vs. availability for the producer
11.4 Consumer code and different delivery semantics
Committing a consumer manually
Restarting from the earliest or latest offsets
(Effectively) exactly-once semantic
11.5 Leveraging delivery guarantees to provide fault tolerance
12 Managing versioning and compatibility
12.1 Versioning in the abstract
Properties of versions
Backward and forward compatibility
Semantic versioning
Marketing versions
12.2 Versioning for libraries
Source, binary, and semantic compatibility
Dependency graphs and diamond dependencies
Techniques for handling breaking changes
Managing internal-only libraries
12.3 Versioning for network APIs
The context of network API calls
Customer-friendly clarity
Common versioning strategies
Further versioning considerations
12.4 Versioning for data storage
A brief introduction to Protocol Buffers
What is a breaking change?
Migrating data within a storage system
Expecting the unexpected
Separating API and storage representations
Evaluating storage formats
13 Keeping up to date with trends vs. cost of maintenance of your code
13.1 When to use dependency injection frameworks
Do-it-yourself (DIY) dependency injection
Using a dependency injection framework
13.2 When to use reactive programming
Creating single-threaded, blocking processing
Using CompletableFuture
Implementing a reactive solution
13.3 When to use functional programming
Creating functional code in a nonfunctional language
Tail recursion optimization
Leveraging immutability
13.4 Using lazy vs. eager evaluation
index