In the world of data science there are myriad tools available to analyze data. This book describes some of the popular software tools, along with the processes for downloading them and using them effectively. The content includes data analysis using Microsoft Excel, KNIME, R, and OpenOffice (Spreadsheet). Each of these tools is used to apply statistical concepts, including confidence intervals, the normal distribution, t-tests, linear regression, histograms, and geographic analysis, using real data from federal government sources.
Features
Analyzes data using popular applications such as Excel, R, KNIME, and OpenOffice
Covers statistical concepts including confidence intervals, the normal distribution, t-tests, linear regression, histograms, and geographic analysis
Capstone exercises analyze data using the different software packages
Author(s): Christopher Greco
Publisher: Mercury Learning and Information
Year: 2020
Language: English
Pages: 206
Tags: Data Science Tools, R, Excel, KNIME, & OpenOffice
CONTENTS
Preface
Acknowledgments
Notes on Permissions
Chapter 1: First Steps
1.1 Introduction to Data Tools
1.1.1 The Software Is Easy to Use
1.1.2 The Software Is Available from Anywhere
1.1.3 The Software Is Updated Regularly
1.1.4 Summary
1.2 Why Data Analysis (Data Science) at All?
1.3 Where to Get Data
Chapter 2: Importing Data
2.1 Excel
2.1.1 Excel Analysis ToolPak
2.2 OpenOffice
2.3 Import into R and Rattle
2.4 Import into RStudio
2.5 Rattle Import
2.6 Import into KNIME
2.6.1 Stoplight Approach
Chapter 3: Statistical Tests
3.1 Descriptive Statistics
3.1.1 Excel
3.1.2 OpenOffice
3.1.3 RStudio/Rattle
3.1.4 KNIME
3.2 Cumulative Probability Charts
3.2.1 Excel
3.2.2 OpenOffice
3.2.3 R/RStudio/Rattle
3.2.4 KNIME
3.3 T-Test (Parametric)
3.3.1 Excel
3.3.2 OpenOffice
3.3.3 R/RStudio/Rattle
3.3.4 KNIME
Chapter 4: More Statistical Tests
4.1 Correlation
4.1.1 Excel
4.1.2 OpenOffice
4.1.3 R/RStudio/Rattle
4.1.4 KNIME
4.2 Regression
4.2.1 Excel
4.2.2 OpenOffice
4.2.3 R/RStudio/Rattle
4.2.4 KNIME
4.3 Confidence Interval
4.3.1 Excel
4.3.2 OpenOffice
4.3.3 R/RStudio/Rattle
4.3.4 KNIME
4.4 Random Sampling
4.4.1 Excel
4.4.2 OpenOffice
4.4.3 R/RStudio/Rattle
4.4.4 KNIME
Chapter 5: Statistical Methods for Specific Tools
5.1 Power
5.1.1 R/RStudio/Rattle
5.2 F-Test
5.2.1 Excel
5.2.2 R/RStudio/Rattle
5.2.3 KNIME
5.3 Multiple Regression/Correlation
5.3.1 Excel
5.3.2 OpenOffice
5.3.3 R/RStudio/Rattle
5.3.4 KNIME
5.4 Benford’s Law
5.4.1 Rattle
5.5 Lift
5.5.1 KNIME
5.6 Wordcloud
5.6.1 R/RStudio
5.6.2 KNIME
5.7 Filtering
5.7.1 Excel
5.7.2 OpenOffice
5.7.3 R/RStudio/Rattle
5.7.4 KNIME
Chapter 6: Summary
6.1 Packages
6.2 Analysis ToolPak
Chapter 7: Supplemental Information
7.1 Exercise One – Tornado and the States
7.1.1 Answer to Exercise 7.1
7.1.2 Pairing Exercise
References
Index