Software Mistakes and Tradeoffs: How to make good programming decisions

Optimize the decisions that define your code by exploring the common mistakes and intentional tradeoffs made by expert developers.

In Software Mistakes and Tradeoffs you will learn how to:
• Reason about your systems to make intuitive and better design decisions
• Understand consequences and how to balance tradeoffs
• Pick the right library for your problem
• Thoroughly analyze all of your service’s dependencies
• Understand delivery semantics and how they influence distributed architecture
• Design and execute performance tests to detect code hot paths and validate a system’s SLA
• Detect and optimize hot paths in your code to focus optimization efforts on root causes
• Decide on a suitable data model for date/time handling to avoid common (but subtle) mistakes
• Reason about compatibility and versioning to prevent unexpected problems for API clients
• Understand tight/loose coupling and how it influences coordination of work between teams
• Clarify requirements until they are precise, easily implemented, and easily tested
• Optimize your APIs for friendly user experience

Code performance versus simplicity. Delivery speed versus duplication. Flexibility versus maintainability. Every decision you make in software engineering involves balancing tradeoffs. In Software Mistakes and Tradeoffs you’ll learn from costly mistakes that Tomasz Lelek and Jon Skeet have encountered over their impressive careers. You’ll explore real-world scenarios where a poor understanding of tradeoffs leads to major problems down the road, so you can pre-empt your own mistakes with a more thoughtful approach to decision making. Learn how code duplication impacts the coupling and evolution speed of your systems, and how simple-sounding requirements can have hidden nuances with respect to date and time information. Discover how to efficiently narrow your optimization scope according to the 80/20 Pareto principle, and ensure consistency in your distributed systems. You’ll soon have built up the kind of knowledge base that only comes from years of experience.

About the technology
Every step in a software project involves making tradeoffs. When you’re balancing speed, security, cost, delivery time, features, and more, reasonable design choices may prove problematic in production. The expert insights and relatable war stories in this book will help you make good choices as you design and build applications.

About the book
Software Mistakes and Tradeoffs explores real-world scenarios where the wrong tradeoff decisions were made and illuminates what could have been done differently. In it, authors Tomasz Lelek and Jon Skeet share wisdom based on decades of software engineering experience, including some delightfully instructive mistakes. You’ll appreciate the specific tips and practical techniques that accompany each example, along with evergreen patterns that will change the way you approach your next projects.

What's inside
• How to reason about your software systematically
• How to pick tools, libraries, and frameworks
• How tight and loose coupling affect team coordination
• Requirements that are precise, easy to implement, and easy to test

About the reader
For mid- and senior-level developers and architects who make decisions about software design and implementation.

About the authors
Tomasz Lelek works daily with a wide range of production services, architectures, and JVM languages. A Google engineer and author of C# in Depth, Jon Skeet is famous for his many practical contributions to Stack Overflow.

Author(s): Tomasz Lelek, Jon Skeet
Publisher: Manning Publications
Year: 2021

Language: English
Commentary: Vector PDF
City: Shelter Island, NY
Tags: Programming; Distributed Systems; Maintainability; Consistency; Design Patterns; Best Practices; API Design; Error Handling; Complexity; Version Control Systems

Software Mistakes and Tradeoffs
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized
About the code
liveBook discussion forum
about the authors
Tomasz Lelek
Jon Skeet
about the cover illustration
1 Introduction
1.1 Consequences of every decision and pattern
1.1.1 Unit testing decisions
1.1.2 Proportions of unit and integration tests
1.2 Code design patterns and why they do not always work
1.2.1 Measuring our code
1.3 Architecture design patterns and why they do not always work
1.3.1 Scalability and elasticity
1.3.2 Development speed
1.3.3 Complexity of microservices
Summary
2 Code duplication is not always bad: Code duplication vs. flexibility
2.1 Common code between codebases and duplication
2.1.1 Adding a new business requirement that requires code duplication
2.1.2 Implementing the new business requirement
2.1.3 Evaluating the result
2.2 Libraries and sharing code between codebases
2.2.1 Evaluating the tradeoffs and disadvantages of shared libraries
2.2.2 Creating a shared library
2.3 Code extraction to a separate microservice
2.3.1 Looking at the tradeoffs and disadvantages of a separate service
2.3.2 Conclusions about separate service
2.4 Improving loose coupling by code duplication
2.5 An API design with inheritance to reduce duplication
2.5.1 Extracting a base request handler
2.5.2 Looking at inheritance and tight coupling
2.5.3 Looking at the tradeoffs between inheritance and composition
2.5.4 Looking at inherent and incidental duplication
Summary
3 Exceptions vs. other patterns of handling errors in your code
3.1 Hierarchy of exceptions
3.1.1 Catch-all vs. a more granular approach to handling errors
3.2 Best patterns to handle exceptions in the code that you own
3.2.1 Handling checked exceptions in a public API
3.2.2 Handling unchecked exceptions in a public API
3.3 Anti-patterns in exception handling
3.3.1 Closing resources in case of an error
3.3.2 Anti-pattern of using exceptions to control application flow
3.4 Exceptions from third-party libraries
3.5 Exceptions in multithread environments
3.5.1 Exceptions in an async workflow with a promise API
3.6 Functional approach to handling errors with Try
3.6.1 Using Try in production code
3.6.2 Mixing Try with code that throws an exception
3.7 Performance comparison of exception-handling code
Summary
4 Balancing flexibility and complexity
4.1 A robust but not extensible API
4.1.1 Designing a new component
4.1.2 Starting with the most straightforward code
4.2 Allowing clients to provide their own metrics framework
4.3 Providing extensibility of your APIs via hooks
4.3.1 Guarding against unpredictable usage of the hooks API
4.3.2 Performance impact of the hook API
4.4 Providing extensibility of your APIs via listeners
4.4.1 Using listeners vs. hooks
4.4.2 Immutability of our design
4.5 Flexibility analysis of an API vs. the cost of maintenance
Summary
5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance
5.1 When premature optimization is evil
5.1.1 Creating accounts processing pipeline
5.1.2 Optimizing processing based on false assumptions
5.1.3 Benchmarking performance optimization
5.2 Hot paths in your code
5.2.1 Understanding the Pareto principle in the context of software systems
5.2.2 Configuring the number of concurrent users (threads) for a given SLA
5.3 A word service with a potential hot path
5.3.1 Getting the word of the day
5.3.2 Validating if the word exists
5.3.3 Exposing the WordsService using HTTP service
5.4 Hot path detection in your code
5.4.1 Creating API performance tests using Gatling
5.4.2 Measuring code paths using MetricRegistry
5.5 Improvements for hot path performance
5.5.1 Creating JMH microbenchmark for the existing solution
5.5.2 Optimizing word exists using a cache
5.5.3 Modifying performance tests to have more input words
Summary
6 Simplicity vs. cost of maintenance for your API
6.1 A base library used by other tools
6.1.1 Creating a cloud service client
6.1.2 Exploring authentication strategies
6.1.3 Understanding the configuration mechanism
6.2 Directly exposing settings of a dependent library
6.2.1 Configuring the batch tool
6.3 A tool that is abstracting settings of a dependent library
6.3.1 Configuring the streaming tool
6.4 Adding new setting for the cloud client library
6.4.1 Adding a new setting to the batch tool
6.4.2 Adding a new setting to the streaming tool
6.4.3 Comparing both solutions for UX friendliness and maintainability
6.5 Deprecating/removing a setting in the cloud client library
6.5.1 Removing a setting from the batch tool
6.5.2 Removing a setting from the streaming tool
6.5.3 Comparing both solutions for UX friendliness and maintainability
Summary
7 Working effectively with date and time data
7.1 Concepts in date and time information
7.1.1 Machine time: Instants, epochs, and durations
7.1.2 Civil time: Calendar systems, dates, times, and periods
7.1.3 Time zones, UTC, and offsets from UTC
7.1.4 Date and time concepts that hurt my head
7.2 Preparing to work with date and time information
7.2.1 Limiting your scope
7.2.2 Clarifying date and time requirements
7.2.3 Using the right libraries or packages
7.3 Implementing date and time code
7.3.1 Applying concepts consistently
7.3.2 Improving testability by avoiding defaults
7.3.3 Representing date and time values in text
7.3.4 Explaining code with comments
7.4 Corner cases to specify and test
7.4.1 Calendar arithmetic
7.4.2 Time zone transitions at midnight
7.4.3 Handling ambiguous or skipped times
7.4.4 Working with evolving time zone data
Summary
8 Leveraging data locality and memory of your machines
8.1 What is data locality?
8.1.1 Moving computations to data
8.1.2 Scaling processing using data locality
8.2 Data partitioning and splitting data
8.2.1 Offline big data partitioning
8.2.2 Partitioning vs. sharding
8.2.3 Partitioning algorithms
8.3 Join big data sets from multiple partitions
8.3.1 Joining data within the same physical machine
8.3.2 Joining that requires data movement
8.3.3 Optimizing join leveraging broadcasting
8.4 Data processing: Memory vs. disk
8.4.1 Using disk-based processing
8.4.2 Why do we need MapReduce?
8.4.3 Calculating access times
8.4.4 RAM-based processing
8.5 Implement joins using Apache Spark
8.5.1 Implementing a join without broadcast
8.5.2 Implementing a join with broadcast
Summary
9 Third-party libraries: Libraries you use become your code
9.1 Importing a library and taking full responsibility for its settings: Beware of the defaults
9.2 Concurrency models and scalability
9.2.1 Using async and sync APIs
9.2.2 Distributed scalability
9.3 Testability
9.3.1 Testing library
9.3.2 Testing with fakes (test double) and mocks
9.3.3 Integration testing toolkit
9.4 Dependencies of third-party libraries
9.4.1 Avoiding version conflicts
9.4.2 Too many dependencies
9.5 Choosing and maintaining third-party dependencies
9.5.1 First impressions
9.5.2 Different approaches to reusing code
9.5.3 Vendor lock-in
9.5.4 Licensing
9.5.5 Libraries vs. frameworks
9.5.6 Security and updates
9.5.7 Decision checklist
Summary
10 Consistency and atomicity in distributed systems
10.1 At-least-once delivery of data sources
10.1.1 Traffic between one-node services
10.1.2 Retrying an application’s call
10.1.3 Producing data and idempotency
10.1.4 Understanding Command Query Responsibility Segregation (CQRS)
10.2 A naive implementation of a deduplication library
10.3 Common mistakes when implementing deduplication in distributed systems
10.3.1 One node context
10.3.2 Multiple nodes context
10.4 Making your logic atomic to prevent race conditions
Summary
11 Delivery semantics in distributed systems
11.1 Architecture of event-driven applications
11.2 Producer and consumer applications based on Apache Kafka
11.2.1 Looking at the Kafka consumer side
11.2.2 Understanding the Kafka brokers setup
11.3 The producer logic
11.3.1 Choosing consistency vs. availability for the producer
11.4 Consumer code and different delivery semantics
11.4.1 Committing a consumer manually
11.4.2 Restarting from the earliest or latest offsets
11.4.3 (Effectively) exactly-once semantic
11.5 Leveraging delivery guarantees to provide fault tolerance
Summary
12 Managing versioning and compatibility
12.1 Versioning in the abstract
12.1.1 Properties of versions
12.1.2 Backward and forward compatibility
12.1.3 Semantic versioning
12.1.4 Marketing versions
12.2 Versioning for libraries
12.2.1 Source, binary, and semantic compatibility
12.2.2 Dependency graphs and diamond dependencies
12.2.3 Techniques for handling breaking changes
12.2.4 Managing internal-only libraries
12.3 Versioning for network APIs
12.3.1 The context of network API calls
12.3.2 Customer-friendly clarity
12.3.3 Common versioning strategies
12.3.4 Further versioning considerations
12.4 Versioning for data storage
12.4.1 A brief introduction to Protocol Buffers
12.4.2 What is a breaking change?
12.4.3 Migrating data within a storage system
12.4.4 Expecting the unexpected
12.4.5 Separating API and storage representations
12.4.6 Evaluating storage formats
Summary
13 Keeping up to date with trends vs. cost of maintenance of your code
13.1 When to use dependency injection frameworks
13.1.1 Do-it-yourself (DIY) dependency injection
13.1.2 Using a dependency injection framework
13.2 When to use reactive programming
13.2.1 Creating single-threaded, blocking processing
13.2.2 Using CompletableFuture
13.2.3 Implementing a reactive solution
13.3 When to use functional programming
13.3.1 Creating functional code in a nonfunctional language
13.3.2 Tail recursion optimization
13.3.3 Leveraging immutability
13.4 Using lazy vs. eager evaluation
Summary
index