Practical Data Privacy: Solving Privacy and Security Problems in Your Data Science Workflow (Fifth Early Release)

Between major privacy regulations like the GDPR and CCPA and expensive, notorious data breaches, there has never been so much pressure on data scientists to ensure data privacy. Unfortunately, integrating privacy into your data science workflow is still complicated. This essential guide gives you solid advice and best practices on breakthrough privacy-enhancing technologies such as encrypted learning and differential privacy, as well as a look at emerging technologies and techniques in the field.

Federated Learning (FL) and distributed Data Science provide new ways to think about how you do data analysis by keeping data at the edge: on phones, laptops, and edge services, or even on on-premises architecture or separate cloud architecture when working with partners. The data is not collected or copied to your own cloud or storage before you do analysis or Machine Learning. In this chapter, you’ll learn how this works in practice and determine when this approach is appropriate for a given use case. You’ll also evaluate how to offer privacy via other tools, along with what types of data or engineering problems federated approaches can solve and which are a poor fit. In Data Science, you are almost always using distributed data. Every time you start up a Kubernetes or Hadoop cluster or use a multi-cloud setup for data analysis, your data is de facto distributed.

Author(s): Katharine Jarmul
Edition: 5
Publisher: O'Reilly Media, Inc.
Year: 2022

Language: English
Pages: 306