Statistics and Data Visualisation with Python

Author(s): Jesús Rogel-Salazar
Series: The Python Series
Edition: Chapman & Hall/CRC
Publisher: CRC Press

Language: English

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
1. Data, Stats and Stories – An Introduction
1.1. From Small to Big Data
1.2. Numbers, Facts and Stats
1.3. A Sampled History of Statistics
1.4. Statistics Today
1.5. Asking Questions and Getting Answers
1.6. Presenting Answers Visually
2. Python Programming Primer
2.1. Talking to Python
2.1.1. Scripting and Interacting
2.1.2. Jupyter Notebook
2.2. Starting Up with Python
2.2.1. Types in Python
2.2.2. Numbers: Integers and Floats
2.2.3. Strings
2.2.4. Complex Numbers
2.3. Collections in Python
2.3.1. Lists
2.3.2. List Comprehension
2.3.3. Tuples
2.3.4. Dictionaries
2.3.5. Sets
2.4. The Beginning of Wisdom: Logic & Control Flow
2.4.1. Booleans and Logical Operators
2.4.2. Conditional Statements
2.4.3. While Loop
2.4.4. For Loop
2.5. Functions
2.6. Scripts and Modules
3. Snakes, Bears & Other Numerical Beasts: NumPy, SciPy & pandas
3.1. Numerical Python – NumPy
3.1.1. Matrices and Vectors
3.1.2. N-Dimensional Arrays
3.1.3. N-Dimensional Matrices
3.1.4. Indexing and Slicing
3.1.5. Descriptive Statistics
3.2. Scientific Python – SciPy
3.2.1. Matrix Algebra
3.2.2. Numerical Integration
3.2.3. Numerical Optimisation
3.2.4. Statistics
3.3. Panel Data = pandas
3.3.1. Series and Dataframes
3.3.2. Data Exploration with pandas
3.3.3. Pandas Data Types
3.3.4. Data Manipulation with pandas
3.3.5. Loading Data to pandas
3.3.6. Data Grouping
4. The Measure of All Things – Statistics
4.1. Descriptive Statistics
4.2. Measures of Central Tendency and Dispersion
4.3. Central Tendency
4.3.1. Mode
4.3.2. Median
4.3.3. Arithmetic Mean
4.3.4. Geometric Mean
4.3.5. Harmonic Mean
4.4. Dispersion
4.4.1. Setting the Boundaries: Range
4.4.2. Splitting One’s Sides: Quantiles, Quartiles, Percentiles and More
4.4.3. Mean Deviation
4.4.4. Variance and Standard Deviation
4.5. Data Description – Descriptive Statistics Revisited
5. Definitely Maybe: Probability and Distributions
5.1. Probability
5.2. Random Variables and Probability Distributions
5.2.1. Random Variables
5.2.2. Discrete and Continuous Distributions
5.2.3. Expected Value and Variance
5.3. Discrete Probability Distributions
5.3.1. Uniform Distribution
5.3.2. Bernoulli Distribution
5.3.3. Binomial Distribution
5.3.4. Hypergeometric Distribution
5.3.5. Poisson Distribution
5.4. Continuous Probability Distributions
5.4.1. Normal or Gaussian Distribution
5.4.2. Standard Normal Distribution Z
5.4.3. Shape and Moments of a Distribution
5.4.4. The Central Limit Theorem
5.5. Hypothesis and Confidence Intervals
5.5.1. Student’s t Distribution
5.5.2. Chi-squared Distribution
6. Alluring Arguments and Ugly Facts – Statistical Modelling and Hypothesis Testing
6.1. Hypothesis Testing
6.1.1. Tales and Tails: One- and Two-Tailed Tests
6.2. Normality Testing
6.2.1. Q-Q Plot
6.2.2. Shapiro-Wilk Test
6.2.3. D’Agostino K-squared Test
6.2.4. Kolmogorov-Smirnov Test
6.3. Chi-square Test
6.3.1. Goodness of Fit
6.3.2. Independence
6.4. Linear Correlation and Regression
6.4.1. Pearson Correlation
6.4.2. Linear Regression
6.4.3. Spearman Correlation
6.5. Hypothesis Testing with One Sample
6.5.1. One-Sample t-test for the Population Mean
6.5.2. One-Sample z-test for Proportions
6.5.3. Wilcoxon Signed Rank with One-Sample
6.6. Hypothesis Testing with Two Samples
6.6.1. Two-Sample t-test – Comparing Means, Same Variances
6.6.2. Levene’s Test – Testing Homoscedasticity
6.6.3. Welch’s t-test – Comparing Means, Different Variances
6.6.4. Mann-Whitney Test – Testing Non-normal Samples
6.6.5. Paired Sample t-test
6.6.6. Wilcoxon Matched Pairs
6.7. Analysis of Variance
6.7.1. One-factor or One-way ANOVA
6.7.2. Tukey’s Range Test
6.7.3. Repeated Measures ANOVA
6.7.4. Kruskal-Wallis – Non-parametric One-way ANOVA
6.7.5. Two-factor or Two-way ANOVA
6.8. Tests as Linear Models
6.8.1. Pearson and Spearman Correlations
6.8.2. One-sample t- and Wilcoxon Signed Rank Tests
6.8.3. Two-Sample t- and Mann-Whitney Tests
6.8.4. Paired Sample t- and Wilcoxon Matched Pairs Tests
6.8.5. One-way ANOVA and Kruskal-Wallis Test
7. Delightful Details – Data Visualisation
7.1. Presenting Statistical Quantities
7.1.1. Textual Presentation
7.1.2. Tabular Presentation
7.1.3. Graphical Presentation
7.2. Can You Draw Me a Picture? – Data Visualisation
7.3. Design and Visual Representation
7.4. Plotting and Visualising: Matplotlib
7.4.1. Keep It Simple: Plotting Functions
7.4.2. Line Styles and Colours
7.4.3. Titles and Labels
7.4.4. Grids
7.5. Multiple Plots
7.6. Subplots
7.7. Plotting Surfaces
7.8. Data Visualisation – Best Practices
8. Dazzling Data Designs – Creating Charts
8.1. What Is the Right Visualisation for Me?
8.2. Data Visualisation and Python
8.2.1. Data Visualisation with Pandas
8.2.2. Seaborn
8.2.3. Bokeh
8.2.4. Plotly
8.3. Scatter Plot
8.4. Line Chart
8.5. Bar Chart
8.6. Pie Chart
8.7. Histogram
8.8. Box Plot
8.9. Area Chart
8.10. Heatmap
A. Variance: Population v Sample
B. Sum of First n Integers
C. Sum of Squares of the First n Integers
D. The Binomial Coefficient
D.1. Some Useful Properties of the Binomial Coefficient
E. The Hypergeometric Distribution
E.1. The Hypergeometric vs Binomial Distribution
F. The Poisson Distribution
F.1. Derivation of the Poisson Distribution
F.2. The Poisson Distribution as a Limit of the Binomial Distribution
G. The Normal Distribution
G.1. Integrating the PDF of the Normal Distribution
G.2. Maximum and Inflection Points of the Normal Distribution
H. Skewness and Kurtosis
I. Kruskal-Wallis Test – No Ties
Bibliography
Index