Gain deep insight into real-time analytics, including the features of these systems and the problems they solve. With this practical book, data engineers at organizations that use event-processing systems such as Kafka, Google Pub/Sub, and AWS Kinesis will learn how to analyze data streams in real time. The faster you derive insights, the quicker you can spot changes in your business and act accordingly.
In the first part of this book, authors Mark Needham and Dunith Dhanushka from StarTree give an overview of the real-time analytics space and explain what goes into building real-time applications. The second part offers a series of hands-on tutorials that show you how to combine multiple software products to build real-time analytics applications for an imaginary pizza delivery service.
The term “streaming” describes a continuous flow of data with no defined beginning or end. The data becomes available incrementally over time, so you can act on it as it arrives instead of waiting to download everything in one go. A data stream consists of a series of data points ordered in time. Each data point represents an “event”: a change in state that has occurred in the business. For example, a customer purchasing a product is an event that captures facts about the person, product, price, and transaction time.
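As a rough illustration of the purchase event described above, the following Python sketch models an event as an immutable record and a stream as a generator that hands over one event at a time. The class name, field names, and sample values are hypothetical, chosen only for this example, and the stream is finite here purely so that the code terminates; a real stream would keep producing events.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterator


@dataclass(frozen=True)
class PurchaseEvent:
    """One data point in the stream (hypothetical field names)."""
    customer_id: str
    product: str
    price: float
    transaction_time: datetime


def purchase_stream() -> Iterator[PurchaseEvent]:
    """Stand-in for a real stream: yields events one by one, ordered in time."""
    start = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
    yield PurchaseEvent("c1", "margherita", 9.50, start)
    yield PurchaseEvent("c2", "pepperoni", 11.00, start + timedelta(seconds=5))
    yield PurchaseEvent("c1", "garlic bread", 4.25, start + timedelta(seconds=9))


def describe(event: PurchaseEvent) -> str:
    """Act on a single event without needing the rest of the stream."""
    return f"{event.customer_id} bought {event.product} for {event.price:.2f}"


if __name__ == "__main__":
    # Each event is handled as soon as the generator produces it.
    for event in purchase_stream():
        print(describe(event))
```

The consumer loop never asks for the whole dataset; it only ever sees the next event, which is the property that lets you act on data before the stream has finished.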
What Is Real-Time Analytics?
Raw events are of little use until we derive insights from them. A “data analytics system” does this by capturing, storing, processing, and analyzing raw events to produce actionable insights. Real-time analytics systems capture, analyze, and act upon events as soon as they become available. They are the unbounded, incrementally processed counterpart to the batch processing systems that have dominated the data analytics space for many years.
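To make the contrast with batch processing concrete, here is a minimal Python sketch, not taken from the book, that computes the same revenue total in both styles. The function names and sample prices are assumptions made for this example. The batch version needs the complete dataset before it can answer, while the incremental version emits an updated answer after every event.

```python
from typing import Iterable, Iterator, Sequence


def batch_revenue(prices: Sequence[float]) -> float:
    """Batch style: wait for the full, bounded dataset, then compute once."""
    return sum(prices)


def running_revenue(prices: Iterable[float]) -> Iterator[float]:
    """Incremental style: update the total and emit it after each event."""
    total = 0.0
    for price in prices:
        total += price
        yield total


if __name__ == "__main__":
    prices = [9.5, 11.0, 8.25]  # hypothetical order values

    print("batch:", batch_revenue(prices))
    for total in running_revenue(prices):
        print("running total:", total)
```

Both approaches agree on the final figure, but only the incremental one has a usable answer while events are still arriving, which is what allows a real-time system to react before the data is complete.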
With this book, you will:
Learn common architectures for real-time analytics
Discover how event processing differs from real-time analytics
Ingest event data from Apache Kafka into Apache Pinot
Combine event streams with static data using Kafka Streams
Write real-time queries against event data stored in Apache Pinot
Build a real-time dashboard, fraud detection pipeline, order tracking app, and anomaly detection system
Learn how organizations like Uber, Stripe, and Just Eat use real-time analytics
Author(s): Mark Needham and Dunith Dhanushka
Publisher: O'Reilly Media, Inc.
Year: 2023
Language: English
Commentary: early release, raw and unedited
Pages: 130