Software Mistakes and Tradeoffs: How to make good programming decisions

Optimize the decisions that define your code by exploring the common mistakes and intentional tradeoffs made by expert developers.

In Software Mistakes and Tradeoffs you will learn how to:

Reason about your systems to make intuitive and better design decisions
Understand consequences and how to balance tradeoffs
Pick the right library for your problem
Thoroughly analyze all of your service’s dependencies
Understand delivery semantics and how they influence distributed architecture
Design and execute performance tests to detect code hot paths and validate a system’s SLA
Detect and optimize hot paths in your code to focus optimization efforts on root causes
Decide on a suitable data model for date/time handling to avoid common (but subtle) mistakes
Reason about compatibility and versioning to prevent unexpected problems for API clients
Understand tight/loose coupling and how it influences coordination of work between teams
Clarify requirements until they are precise, easily implemented, and easily tested
Optimize your APIs for friendly user experience

Code performance versus simplicity. Delivery speed versus duplication. Flexibility versus maintainability. Every decision you make in software engineering involves balancing tradeoffs. In Software Mistakes and Tradeoffs you’ll learn from costly mistakes that Tomasz Lelek and Jon Skeet have encountered over their impressive careers. You’ll explore real-world scenarios where a poor understanding of tradeoffs led to major problems down the road, so you can pre-empt your own mistakes with a more thoughtful approach to decision making.

Learn how code duplication impacts the coupling and evolution speed of your systems, and how simple-sounding requirements can have hidden nuances with respect to date and time information. Discover how to efficiently narrow your optimization scope according to the 80/20 Pareto principle, and ensure consistency in your distributed systems. You’ll soon have built up the kind of knowledge base that only comes from years of experience.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Every step in a software project involves making tradeoffs. When you’re balancing speed, security, cost, delivery time, features, and more, reasonable design choices may prove problematic in production. The expert insights and relatable war stories in this book will help you make good choices as you design and build applications.

About the book
Software Mistakes and Tradeoffs explores real-world scenarios where the wrong tradeoff decisions were made and illuminates what could have been done differently. In it, authors Tomasz Lelek and Jon Skeet share wisdom based on decades of software engineering experience, including some delightfully instructive mistakes. You’ll appreciate the specific tips and practical techniques that accompany each example, along with evergreen patterns that will change the way you approach your next projects.

What's inside

How to reason about your software systematically
How to pick tools, libraries, and frameworks
How tight and loose coupling affect team coordination
Requirements that are precise, easy to implement, and easy to test

About the reader
For mid- and senior-level developers and architects who make decisions about software design and implementation.

About the authors
Tomasz Lelek works daily with a wide range of production services, architectures, and JVM languages. A Google engineer and author of C# in Depth, Jon Skeet is famous for his many practical contributions to Stack Overflow.

Table of Contents
1 Introduction
2 Code duplication is not always bad: Code duplication vs. flexibility
3 Exceptions vs. other patterns of handling errors in your code
4 Balancing flexibility and complexity
5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance
6 Simplicity vs. cost of maintenance for your API
7 Working effectively with date and time data
8 Leveraging data locality and memory of your machines
9 Third-party libraries: Libraries you use become your code
10 Consistency and atomicity in distributed systems
11 Delivery semantics in distributed systems
12 Managing versioning and compatibility
13 Keeping up to date with trends vs. cost of maintenance of your code

Author(s): Tomasz Lelek, Jon Skeet
Edition: 1
Publisher: Manning
Year: 2022

Language: English
Pages: 416
Tags: Software Development; Software Development Best Practices; Software Development Process; Reasoning about Software Development

contents

front matter

preface

acknowledgments

about this book

about the authors

about the cover illustration

1 Introduction

1.1 Consequences of every decision and pattern

Unit testing decisions

Proportions of unit and integration tests

1.2 Code design patterns and why they do not always work

Measuring our code

1.3 Architecture design patterns and why they do not always work

Scalability and elasticity

Development speed

Complexity of microservices

2 Code duplication is not always bad: Code duplication vs. flexibility

2.1 Common code between codebases and duplication

Adding a new business requirement that requires code duplication

Implementing the new business requirement

Evaluating the result

2.2 Libraries and sharing code between codebases

Evaluating the tradeoffs and disadvantages of shared libraries

Creating a shared library

2.3 Code extraction to a separate microservice

Looking at the tradeoffs and disadvantages of a separate service

Conclusions about separate service

2.4 Improving loose coupling by code duplication

2.5 An API design with inheritance to reduce duplication

Extracting a base request handler

Looking at inheritance and tight coupling

Looking at the tradeoffs between inheritance and composition

Looking at inherent and incidental duplication

3 Exceptions vs. other patterns of handling errors in your code

3.1 Hierarchy of exceptions

Catch-all vs. a more granular approach to handling errors

3.2 Best patterns to handle exceptions in the code that you own

Handling checked exceptions in a public API

Handling unchecked exceptions in a public API

3.3 Anti-patterns in exception handling

Closing resources in case of an error

Anti-pattern of using exceptions to control application flow

3.4 Exceptions from third-party libraries

3.5 Exceptions in multithread environments

Exceptions in an async workflow with a promise API

3.6 Functional approach to handling errors with Try

Using Try in production code

Mixing Try with code that throws an exception

3.7 Performance comparison of exception-handling code

4 Balancing flexibility and complexity

4.1 A robust but not extensible API

Designing a new component

Starting with the most straightforward code

4.2 Allowing clients to provide their own metrics framework

4.3 Providing extensibility of your APIs via hooks

Guarding against unpredictable usage of the hooks API

Performance impact of the hook API

4.4 Providing extensibility of your APIs via listeners

Using listeners vs. hooks

Immutability of our design

4.5 Flexibility analysis of an API vs. the cost of maintenance

5 Premature optimization vs. optimizing the hot path: Decisions that impact code performance

5.1 When premature optimization is evil

Creating accounts processing pipeline

Optimizing processing based on false assumptions

Benchmarking performance optimization

5.2 Hot paths in your code

Understanding the Pareto principle in the context of software systems

Configuring the number of concurrent users (threads) for a given SLA

5.3 A word service with a potential hot path

Getting the word of the day

Validating if the word exists

Exposing the WordsService using HTTP service

5.4 Hot path detection in your code

Creating API performance tests using Gatling

Measuring code paths using MetricRegistry

5.5 Improvements for hot path performance

Creating JMH microbenchmark for the existing solution

Optimizing word exists using a cache

Modifying performance tests to have more input words

6 Simplicity vs. cost of maintenance for your API

6.1 A base library used by other tools

Creating a cloud service client

Exploring authentication strategies

Understanding the configuration mechanism

6.2 Directly exposing settings of a dependent library

Configuring the batch tool

6.3 A tool that is abstracting settings of a dependent library

Configuring the streaming tool

6.4 Adding new setting for the cloud client library

Adding a new setting to the batch tool

Adding a new setting to the streaming tool

Comparing both solutions for UX friendliness and maintainability

6.5 Deprecating/removing a setting in the cloud client library

Removing a setting from the batch tool

Removing a setting from the streaming tool

Comparing both solutions for UX friendliness and maintainability

7 Working effectively with date and time data

7.1 Concepts in date and time information

Machine time: Instants, epochs, and durations

Civil time: Calendar systems, dates, times, and periods

Time zones, UTC, and offsets from UTC

Date and time concepts that hurt my head

7.2 Preparing to work with date and time information

Limiting your scope

Clarifying date and time requirements

Using the right libraries or packages

7.3 Implementing date and time code

Applying concepts consistently

Improving testability by avoiding defaults

Representing date and time values in text

Explaining code with comments

7.4 Corner cases to specify and test

Calendar arithmetic

Time zone transitions at midnight

Handling ambiguous or skipped times

Working with evolving time zone data

8 Leveraging data locality and memory of your machines

8.1 What is data locality?

Moving computations to data

Scaling processing using data locality

8.2 Data partitioning and splitting data

Offline big data partitioning

Partitioning vs. sharding

Partitioning algorithms

8.3 Join big data sets from multiple partitions

Joining data within the same physical machine

Joining that requires data movement

Optimizing join leveraging broadcasting

8.4 Data processing: Memory vs. disk

Using disk-based processing

Why do we need MapReduce?

Calculating access times

RAM-based processing

8.5 Implement joins using Apache Spark

Implementing a join without broadcast

Implementing a join with broadcast

9 Third-party libraries: Libraries you use become your code

9.1 Importing a library and taking full responsibility for its settings: Beware of the defaults

9.2 Concurrency models and scalability

Using async and sync APIs

Distributed scalability

9.3 Testability

Testing library

Testing with fakes (test double) and mocks

Integration testing toolkit

9.4 Dependencies of third-party libraries

Avoiding version conflicts

Too many dependencies

9.5 Choosing and maintaining third-party dependencies

First impressions

Different approaches to reusing code

Vendor lock-in

Licensing

Libraries vs. frameworks

Security and updates

Decision checklist

10 Consistency and atomicity in distributed systems

10.1 At-least-once delivery of data sources

Traffic between one-node services

Retrying an application’s call

Producing data and idempotency

Understanding Command Query Responsibility Segregation (CQRS)

10.2 A naive implementation of a deduplication library

10.3 Common mistakes when implementing deduplication in distributed systems

One node context

Multiple nodes context

10.4 Making your logic atomic to prevent race conditions

11 Delivery semantics in distributed systems

11.1 Architecture of event-driven applications

11.2 Producer and consumer applications based on Apache Kafka

Looking at the Kafka consumer side

Understanding the Kafka brokers setup

11.3 The producer logic

Choosing consistency vs. availability for the producer

11.4 Consumer code and different delivery semantics

Committing a consumer manually

Restarting from the earliest or latest offsets

(Effectively) exactly-once semantic

11.5 Leveraging delivery guarantees to provide fault tolerance

12 Managing versioning and compatibility

12.1 Versioning in the abstract

Properties of versions

Backward and forward compatibility

Semantic versioning

Marketing versions

12.2 Versioning for libraries

Source, binary, and semantic compatibility

Dependency graphs and diamond dependencies

Techniques for handling breaking changes

Managing internal-only libraries

12.3 Versioning for network APIs

The context of network API calls

Customer-friendly clarity

Common versioning strategies

Further versioning considerations

12.4 Versioning for data storage

A brief introduction to Protocol Buffers

What is a breaking change?

Migrating data within a storage system

Expecting the unexpected

Separating API and storage representations

Evaluating storage formats

13 Keeping up to date with trends vs. cost of maintenance of your code

13.1 When to use dependency injection frameworks

Do-it-yourself (DIY) dependency injection

Using a dependency injection framework

13.2 When to use reactive programming

Creating single-threaded, blocking processing

Using CompletableFuture

Implementing a reactive solution

13.3 When to use functional programming

Creating functional code in a nonfunctional language

Tail recursion optimization

Leveraging immutability

13.4 Using lazy vs. eager evaluation

index