Cloud Observability in Action (MEAP v10)

Generate actionable insights about your cloud native systems. This book teaches you how to set up an observability system that learns from a cloud application’s signals, logging, and monitoring using free and open source tools.

In Cloud Observability in Action you will learn how to:
Apply observability in cloud native systems
Understand observability signals, including their costs and benefits
Apply good practices around instrumentation and signal collection
Deliver dashboarding, alerting, and SLOs/SLIs at scale
Choose the correct signal types for given roles or tasks
Pick the right observability tool for any given function
Communicate the benefits of observability to management

Cloud native, serverless, and containerized applications are made up of hundreds of moving parts. When something goes wrong, it’s not enough to just know there is a problem; you need to know where it is, what it is, and even how to fix it. Cloud Observability in Action shows you how to go beyond traditional monitoring and build observability systems that turn application telemetry into actionable insight.

About the technology
A well-designed observability system provides insight into bugs and performance issues in cloud native applications. Often, observability is the difference between an error message and an explanation! You know exactly which service is affected, who’s responsible for its repair, and even how it can be optimized in the future. Best of all, observability allows you to easily automate your error handling with machine users applying fixes without any human help.

About the book
Cloud Observability in Action teaches you to apply observability practices to cloud-based serverless and Kubernetes environments. In this one-of-a-kind guide, author Michael Hausenblas shares insights from his extensive experience building, monitoring, and improving cloud native systems. You’ll use open source tools like Prometheus and Grafana to build your own observability system without having to rely on proprietary software. Learn how to use telemetry and destinations to continuously generate and discover insights from different signals, including logs, metrics, traces, and profiles. Throughout, use cases and rigorous cost-benefit analysis make sure you’re getting a real return on your investment in observability.

About the reader
For developers and SREs who have worked with cloud native applications. This book can be used with any public cloud.

Author(s): Michael Hausenblas
Publisher: Manning Publications
Year: 2023

Language: English
Pages: 234

Cloud Observability in Action MEAP V10
Copyright
Welcome
Brief contents
Chapter 1: End-to-end Observability Example
1.1 What is Observability?
1.2 Roles and Goals
1.3 Example Microservices App
1.4 Challenges and How Observability Helps
1.5 Summary
Chapter 2: Signal Types
2.1 Reference Example
2.2 Assessing Instrumentation Costs
2.3 Logs
2.3.1 Instrumentation
2.3.2 Telemetry
2.3.3 Costs and Benefits
2.3.4 Observability with Logs
2.4 Metrics
2.4.1 Instrumentation
2.4.2 Telemetry
2.4.3 Costs and Benefits
2.4.4 Observability with Metrics
2.5 Traces
2.5.1 Distributed Traces
2.5.2 Instrumentation
2.5.3 Telemetry
2.5.4 Costs and Benefits
2.5.5 Observability with Traces
2.6 Selecting Signals
2.7 Summary
Chapter 3: Sources
3.1 Selecting Sources
3.2 Compute-related Sources
3.2.1 Basics
3.2.2 Containers
3.2.3 Kubernetes
3.2.4 Serverless Compute
3.3 Storage-related Sources
3.3.1 Relational Databases and NoSQL Datastores
3.3.2 Filesystems and Object Stores
3.4 Network-related Sources
3.4.1 Network Interfaces
3.4.2 Higher-level Network Sources
3.5 Your Code
3.5.1 Instrumentation
3.5.2 Proxy Sources
3.6 Summary
Chapter 4: Agents & Instrumentation
4.1 Log Routers
4.1.1 Fluentd & Fluent Bit
4.1.2 Other Log Routers
4.2 Metrics Collection
4.2.1 Prometheus
4.2.2 Other Metrics Agents
4.3 OpenTelemetry
4.3.1 Instrumentation
4.3.2 Collector
4.4 Other Agents
4.5 Selecting An Agent
4.5.1 Security For and Of the Agent
4.5.2 Agent Performance and Resource Usage
4.5.3 Agent Non-Functional Requirements (NFRs)
4.6 Summary
Chapter 5: Back-end Destinations
5.1 Back-end Destinations Terminology
5.2 Back-end Destinations for Logs
5.2.1 Cloud Providers
5.2.2 Open Source Log Back-ends
5.2.3 Commercial Offerings for Log Back-ends
5.3 Back-end Destinations for Metrics
5.3.1 Cloud Providers
5.3.2 Open Source Metrics Back-ends
5.3.3 Commercial Offerings for Metrics Back-ends
5.4 Back-end Destinations for Traces
5.4.1 Cloud Providers
5.4.2 Open Source Traces Back-ends
5.4.3 Commercial Offerings for Traces Back-ends
5.5 Columnar Datastores
5.6 Selecting Back-End Destinations
5.6.1 Costs
5.6.2 Open Standards
5.6.3 Cardinality and Queries
5.7 Summary
Chapter 6: Front-end Destinations
6.1 Front-ends
6.1.1 Grafana
6.1.2 Kibana and OpenSearch Dashboards
6.1.3 Other Open Source Front-ends
6.1.4 Cloud Providers and Commercial Front-ends
6.2 All-In-Ones
6.2.1 CNCF Jaeger
6.2.2 CNCF Pixie
6.2.3 Zipkin
6.2.4 Apache Skywalking
6.2.5 SigNoz
6.2.6 Uptrace
6.2.7 Commercial Offerings
6.3 Selecting Front-ends and All-in-ones
6.4 Summary
Chapter 7: Cloud Operations
7.1 Incident Management
7.1.1 Health and Performance Monitoring
7.1.2 Handling the Incident
7.1.3 Learning from the Incident After The Fact
7.2 Alerting
7.2.1 Prometheus Alerting
7.2.2 Using Grafana for Alerting
7.2.3 Cloud Providers
7.3 Usage Tracking
7.3.1 Users
7.3.2 Costs
7.4 Summary
Chapter 8: Distributed Tracing
8.1 Intro and Terminology
8.1.1 Motivational Example
8.1.2 Terminology
8.1.3 Use Cases
8.2 Using Distributed Tracing to Troubleshoot a Microservices App
8.2.1 Example App Overview
8.2.2 Implementing the Example App
8.2.3 Happy Path in the Example App
8.2.4 Exploring a Failure in the Example App
8.3 Considerations
8.3.1 Sampling
8.3.2 Observability Tax
8.3.3 Traces vs. Metrics vs. Logs
8.4 Summary
Chapter 9: Developer Observability
9.1 Continuous Profiling
9.1.1 The Humble Beginnings
9.1.2 Common Technologies
9.1.3 Open Source CP Tooling
9.1.4 Commercial CP Offerings
9.1.5 Using CP to Assess Resource Usage
9.2 Developer Productivity
9.2.1 Challenges
9.2.2 Tooling
9.3 Considerations
9.3.1 Challenges
9.4 Summary
Chapter 10: Service Level Objectives
10.1 The Fundamentals of SLOs
10.1.1 Types of Services
10.1.2 Service Level Indicator
10.1.3 Service Level Objective
10.1.4 Service Level Agreement
10.2 Implementing SLOs
10.2.1 High-level Example
10.2.2 Using Prometheus to Implement SLOs
10.2.3 Commercial SLO Offerings
10.3 Considerations
10.4 Summary
Chapter 11: Signal Correlation
11.1 Correlation Fundamentals
11.1.1 Correlation With OpenTelemetry
11.1.2 Correlating Traces
11.1.3 Correlating Metrics
11.1.4 Correlating Logs
11.1.5 Correlating Profiles
11.2 Using Prometheus, Jaeger, and Grafana to Implement Signal Correlation
11.2.1 Metrics-Traces Correlation Example Setup
11.2.2 Using Metrics-Traces Correlation
11.3 Signal Correlation Support in Commercial Offerings
11.4 Considerations
11.4.1 Early Days
11.4.2 Signals
11.4.3 User Experience (UX)
11.5 Summary
Notes