Stream Processing: Hands-on with Apache Flink

Get on board this journey into the land of streams. This is a complete hands-on book about Apache Flink that follows real-life use cases and will help you learn how to create scalable end-to-end stream processing pipelines. We will be using Redpanda and Apache Kafka, along with other technologies, so an understanding of Apache Kafka and Redpanda concepts like topics/partitions and producers/consumers is nice to have. The book is designed to teach you the theory and the practice as quickly as possible. With enough practice on the concepts introduced in the book, the reader should be able to get from zero to production-ready applications fast, with enough knowledge to debug and troubleshoot when things go wrong. I hope you will enjoy it and use it as a guide on your journey in the land of streams.

Author(s): Giannis Polyzos
Publisher: Independently published
Year: 2023

Language: English
Pages: 234

Stream Processing: Hands-on with Apache Flink
Introduction
In the land of streams
The Streaming Layer: Redpanda
Flink’s Runtime
Summary
Streams and Tables
Streaming SQL Semantics
Flink SQL Logical Components
Running SQL Queries
Operators
The TableEnvironment
Summary
Watermarks & Windows
The Notion of Time
Time Windows
What is a Watermark?
How do watermarks work?
Watermark Generation
Watermark Propagation
Idle Sources
Summary
Streaming Joins
Introduction
Regular Joins
Interval Joins
Temporal Joins
Lookup Joins
Summary
User Defined Functions
Scalar Functions
Table Functions
Aggregate & Table Aggregate Functions
External Service Lookup UDF
Summary
The DataStream API
Sources
DataStream Operators
Merging Multiple Streams
Event Buffering & Enrichment
Handling Late Arriving Data
Summary
Fault Tolerance
Why the need for checkpoints?
Failure in Practise
Flink’s Checkpointing Algorithm
Aligned and Unaligned Checkpoints
Checkpoints vs. Savepoints
Summary
State Backends
State Backends
Using RocksDB
Inspecting RocksDB
Tuning and Troubleshooting
Summary
Monitoring and Troubleshooting
Metrics System
Prometheus and Grafana Setup
Setting up Flink Dashboards
Troubleshooting tips
Summary