Practical Apache Spark: Using the Scala API

Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses the main components of Spark, including Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark, with practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You’ll follow a learn-by-doing approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to gain overall exposure. On completion, you’ll have knowledge of the functional programming aspects of Scala and hands-on expertise in various Spark components. You’ll also become familiar with machine learning algorithms and their real-time usage.

What You Will Learn
• Discover the functional programming features of Scala
• Understand the complete architecture of Spark and its components
• Integrate Apache Spark with Hive and Kafka
• Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries
• Work with different machine learning concepts and libraries using Spark's MLlib packages

Who This Book Is For
Developers and professionals who deal with batch and stream data processing.

Author(s): Subhashini Chellappan, Dharanitharan Ganesan
Edition: 1
Publisher: Apress
Year: 2019

Language: English
Commentary: True PDF
Pages: 280
Tags: Databases; Functional Programming; Apache Spark; Apache Kafka; Clusters; Scala; Spark; DataFrames; GraphX; Spark SQL; Spark MLlib; Resilient Distributed Datasets; Spark Streaming

Front Matter
Scala: Functional Programming Aspects
Single and Multinode Cluster Setup
Introduction to Apache Spark and Spark Core
Spark SQL, DataFrames, and Datasets
Introduction to Spark Streaming
Spark Structured Streaming
Spark Streaming with Kafka
Spark Machine Learning Library
Working with SparkR
Spark Real-Time Use Case
Back Matter