Advanced Platform Development with Kubernetes: Enabling Data Management, the Internet of Things, Blockchain, and Machine Learning

Leverage Kubernetes for the rapid adoption of emerging technologies. Kubernetes is the future of enterprise platform development and has become the most popular, and is often considered the most robust, container orchestration system available today. This book focuses on platforming technologies that power the Internet of Things, Blockchain, Machine Learning, and the many layers of data and application management supporting them.

Advanced Platform Development with Kubernetes takes you through the process of building platforms with these in-demand capabilities. You'll progress through the development of Serverless, CI/CD integration, data processing pipelines, event queues, distributed query engines, modern data warehouses, data lakes, distributed object storage, indexing and analytics, data routing and transformation, and data science/machine learning environments. You'll also see how to implement and tie together numerous essential and trending technologies, including Kafka, NiFi, Hive, Keycloak, Cassandra, MySQL, Zookeeper, Mosquitto, Elasticsearch, Logstash, Kibana, Presto, MinIO, OpenFaaS, Ethereum, WireGuard, MLflow, and Seldon Core.

The book uses Golang and Python to demonstrate the development and integration of custom containers and Serverless functions, including interaction with the Kubernetes API. The exercises throughout teach Kubernetes through the lens of platform development, expressing the power and flexibility of Kubernetes with clear and pragmatic examples. Discover why Kubernetes is an excellent choice for any individual or organization looking to embark on developing a successful data and application platform.

Author(s): Craig Johnston
Publisher: Apress
Year: 2020

Language: English
Pages: 519

Table of Contents
About the Author
About the Technical Reviewer
Acknowledgments
Chapter 1: Software Platform and the API
Software Applications vs. Software Platforms
Dependency Management and Encapsulation
Network of Applications
Application Platform
Platform Requirements
Platform Architecture
Platform Capabilities
IoT
Ingestion of Data
Edge Gateway
IoT OS
Blockchain
Private Managed Blockchains
Use Cases
Machine Learning
Automation and Management
Core Components
Configuration
Application Parameters
Ingress
Data Management
Metrics
APIs and Protocols
Summary
Chapter 2: DevOps Infrastructure
Cloud Computing
Cloud Native and Vendor Neutral
Redundancy
Portable Platforms
Getting Started Vendor Neutral
DevOps Toolchain
Repositories
Registries
CI/CD
GitLab for DevOps
k3s + GitLab
Server Setup
Configure DNS
Install k3s
Remote Access
Install Cert Manager/Let’s Encrypt
Install GitLab
Namespace
TLS Certificate
Services
ConfigMap
Deployment
Ingress
Disable Sign-up
Summary
Next Steps
Chapter 3: Development Environment
Custom Development Kubernetes Cluster
Nodes
Server Setup
Prepare Nodes
Install Dependencies
Install WireGuard VPN
Install Docker
Install Kubernetes Utilities
Install Master Node
Join Worker Nodes
DNS
Remote Access
Configuration
Repository
Ingress
TLS/HTTPS with Cert Manager
Persistent Volumes with Rook Ceph
Block Storage
Shared Filesystem
Monitoring
Summary
Chapter 4: In-Platform CI/CD
Development and Operations
Platform Integration
Yet Another Development Cluster
RBAC
GitLab Group Kubernetes Access
Configure Kubernetes Cluster Integration
Enable Dependencies
Custom JupyterLab Image
Repository and Container Source
Local Testing
Port-Forwarding
Test Notebook
Additional Learning
Automation
GitLab CI
.gitlab-ci.yml
Kaniko
Integrated Environment Variables
Running a Pipeline
Manual Testing in Kubernetes
Prepare Namespace
Run Notebook
Repository Access
GitOps
Summary
Chapter 5: Pipeline
Statefulness and Kubernetes
Real-Time Data Architecture
Message and Event Queues
Distributed Streaming Platform
MQTT and IoT
Development Environment
Cluster-Wide Configuration
Data Namespace
TLS Certificates
Basic Auth
Apache Zookeeper
Apache Kafka
Kafka Client Utility Pod
Mosquitto (MQTT)
Summary
Chapter 6: Indexing and Analytics
Search and Analytics
Data Science Environment
Development Environment
TLS Certificates
Basic Auth
ELK
Elasticsearch
Logstash
Kibana
Data Lab
Keycloak
Realm, Client, and User
Namespace
JupyterHub
JupyterLab
Kubernetes API
Kafka
Elasticsearch
Mosquitto (MQTT)
Summary
Chapter 7: Data Lakes
Data Processing Pipeline
Development Environment
Data Lake as Object Storage
MinIO Operator
MinIO Cluster
MinIO Client
MinIO Events
Process Objects
Configure Notifications
Event Notebook
Test Data
Containerized Application
Programmatic Deployments
Serverless Object Processing
Summary
Chapter 8: Data Warehouses
Data and Data Science
Data Platform
Development Environment
Data and Metadata Sources
MySQL
MySQL Operator
MySQL Cluster
Apache Cassandra
Cassandra Operator
Cassandra Cluster
Apache Hive
Containerization
Local Hive Testing
Modern Data Warehouse
Hive
Kubernetes Configuration
Test Data
Create Schema
Presto
Kubernetes Configuration
Query
Summary
Chapter 9: Routing and Transformation
ETL and Data Processing
Development Environment
Serverless
OpenFaaS
Install OpenFaaS
Install Sentiment Analysis
ETL
Apache NiFi
Install Apache NiFi
Example ETL Data Pipeline
NiFi Template
Prepare Elasticsearch
Dataflow
Analysis and Programmatic Control
Analysis and Visualization
Programming NiFi
Summary
Chapter 10: Platforming Blockchain
Private Blockchain Platform
Development Environment
Private Ethereum Network
Bootnodes
Bootnode Registrar
Ethstats
Geth Miners
Geth Transaction Nodes
Private Networks
Blockchain Interaction
Geth Attach
Jupyter Environment
Serverless/OpenFaaS
Summary
Chapter 11: Platforming AIML
Data
Hybrid Infrastructure
Development Environment
DNS
k3s Hybrid Cloud
Kilo VPN
Master Node
Worker Nodes
On-premises
GPU
GPU/CUDA
Install Ubuntu
NVIDIA GPU Support
k3s with NVIDIA Runtime
IoT / Raspberry Pi
Raspberry Pi OS
WireGuard
k3s on Raspberry Pi
Node Roles
Install Kilo
Platform Applications
Data Collection
MQTT IoT Client
ETL
Apache NiFi
Python CronJob
Machine Learning Automation
Jupyter Notebook GPU Support
CUDA Data Science Container
JupyterHub Spawner Options
Model Development
MLflow
Installation
Tracking Models
Deploy Artificial Intelligence
Seldon Core
Summary
Index