Chapter 6 - Descriptive Statistics
Experimentation
Data and Statistical Inference
Data: mixture of nature and noise
Our goal: we want to represent data with a probability distribution
Statistical inference: the science of deducing properties of an underlying probability distribution from data
Samples
Population: set of all the possible observations from a particular probability distribution
Sample: a subset of a population
Random sample: sample where the elements are chosen at random from the population
A sample should be representative of the population
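The relationship between a population and a random sample can be sketched in a few lines of Python. This is only an illustration: the population of integers, the seed, and the sample size are made-up assumptions, not values from these notes.

```python
import random

# Hypothetical finite population: the integers 1..1000 stand in for
# "all possible observations".
population = list(range(1, 1001))

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simple random sample of size 50, drawn without replacement.
sample = rng.sample(population, k=50)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

# A representative sample should give a mean reasonably close to the
# population mean, although the two will rarely be equal.
print(population_mean, sample_mean)
```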
Types of observations:
Numerical (just numbers)
Nominal (classes, e.g. male/female, shirts/socks/pants...)
Data Presentation
Histograms
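A histogram groups numerical observations into bins and shows how many fall into each one. The sketch below counts bin frequencies by hand and prints a text histogram; the data, the bin width, and the lower edge are illustrative assumptions.

```python
from collections import Counter

# Made-up numerical observations.
data = [2.1, 3.4, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 7.3, 9.6]

lower = 2.0      # assumed left edge of the first bin
bin_width = 2.0  # assumed bin width


def bin_index(x, lower, width):
    """Index of the half-open bin [lower + k*width, lower + (k+1)*width) containing x."""
    return int((x - lower) // width)


counts = Counter(bin_index(x, lower, bin_width) for x in data)

# One row per non-empty bin, one '#' per observation.
for k in sorted(counts):
    left = lower + k * bin_width
    print(f"[{left:.1f}, {left + bin_width:.1f}) " + "#" * counts[k])
```

In practice a plotting library would normally draw the bars; the counting step is the part that defines the histogram.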
Stem-and-Leaf Plots
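A stem-and-leaf plot splits each observation into a leading part (the stem) and a final digit (the leaf), so the shape of the data is visible while the individual values are kept. A minimal sketch for two-digit integers, using made-up data:

```python
from collections import defaultdict

# Made-up two-digit observations.
data = [12, 15, 21, 23, 23, 28, 30, 34, 41, 47]

stems = defaultdict(list)
for x in sorted(data):
    stem, leaf = divmod(x, 10)  # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

# One row per stem, leaves listed in increasing order.
for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")
```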
Outliers
Outlier: an observation that does not come from the same distribution as the main body of the sample
Outliers should be removed from the sample before analysis
Box Plots
Sample Statistics
Sample: $x_1, x_2, \ldots, x_n$
Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Sample median
The $\frac{n+1}{2}$-th smallest sample value when $n$ is odd
The average of the $\frac{n}{2}$-th and the $\left(\frac{n}{2}+1\right)$-th smallest sample values when $n$ is even
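The sample mean and the odd/even rule for the sample median can be sketched directly in Python. The function names and the example data are illustrative; the standard library's `statistics.mean` and `statistics.median` do the same job.

```python
def sample_mean(xs):
    """Arithmetic average of the observations."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle value of the sorted sample (average of the two middle values if n is even)."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n+1)/2-th smallest value sits at 0-based index n // 2.
        return ordered[mid]
    # n even: average the n/2-th and (n/2 + 1)-th smallest values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_sample = [7, 1, 5, 3, 9]
even_sample = [7, 1, 5, 3]

print(sample_mean(odd_sample))     # 5.0
print(sample_median(odd_sample))   # 5
print(sample_median(even_sample))  # 4.0
```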
Sample trimmed mean
The average of the observations that remain after removing a fixed proportion of the largest and of the smallest values from the sample
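A minimal sketch of the trimmed mean follows. The trimming proportion is a parameter; the default of 10% from each end is an assumption chosen for illustration, and the data are made up.

```python
def trimmed_mean(xs, fraction=0.10):
    """Mean after dropping `fraction` of the observations from each end.

    `fraction` is an illustrative parameter (assumed default: 10%).
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(xs)
    k = int(len(ordered) * fraction)  # number of values dropped at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


data = [2, 4, 5, 5, 6, 6, 7, 7, 8, 100]

print(trimmed_mean(data, 0))  # plain mean: 15.0
print(trimmed_mean(data))     # 10% trimmed mean: 6.0
```

The single large value 100 pulls the plain mean up to 15.0, while the trimmed mean stays near the bulk of the data.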
Sample mode
The value at which the sample frequency is the largest
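The mode can be found by counting how often each value occurs. Since several values may share the largest frequency, this sketch returns all of them; the function name is illustrative.

```python
from collections import Counter


def sample_modes(xs):
    """All values whose frequency in the sample is the largest."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


print(sample_modes([1, 2, 2, 3, 3]))           # [2, 3]
print(sample_modes(["socks", "shirts", "socks"]))  # ['socks']
```

Unlike the mean and the median, the mode is also defined for nominal observations.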
Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
$s = \sqrt{s^2}$ is called the sample standard deviation
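The sample variance with the $n - 1$ divisor and the corresponding standard deviation can be computed as below. The function names and data are illustrative; the standard library's `statistics.variance` and `statistics.stdev` use the same $n - 1$ convention.

```python
import math
import statistics


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_std(xs):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared deviations sum to 32

print(sample_variance(data))  # 32 / 7
print(sample_std(data))

# Cross-check against the standard library.
assert math.isclose(sample_variance(data), statistics.variance(data))
```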
Sample quantile
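The $p$-th sample quantile is the value $x$ satisfying: $\frac{|\{x_i : x_i \le x\}|}{n} \ge p$ and $\frac{|\{x_i : x_i \ge x\}|}{n} \ge 1 - p$
The sample quantiles for $p = 0.25$ and $p = 0.75$ (the 25th and 75th percentiles) are called the 1st quartile ($Q_1$) and the 3rd quartile ($Q_3$), respectively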
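One way to make the quantile definition concrete is sketched below, assuming the $p$-th sample quantile is a value $x$ such that at least a proportion $p$ of the sample is $\le x$ and at least a proportion $1 - p$ is $\ge x$. Picking the $\lceil np \rceil$-th smallest value satisfies both conditions. This is only one of several conventions; libraries often interpolate between neighbouring values instead. Function names and data are illustrative.

```python
import math


def sample_quantile(xs, p):
    """The ceil(n * p)-th smallest value, for 0 < p <= 1."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(xs)
    k = math.ceil(len(ordered) * p)
    return ordered[k - 1]


def satisfies_definition(xs, x, p):
    """Check the two proportion conditions assumed in the lead-in."""
    n = len(xs)
    at_most = sum(1 for v in xs if v <= x)
    at_least = sum(1 for v in xs if v >= x)
    return at_most / n >= p and at_least / n >= 1 - p


data = [8, 3, 5, 1, 7, 2, 6, 4]

q1 = sample_quantile(data, 0.25)  # 2
q3 = sample_quantile(data, 0.75)  # 6
print(q1, q3)
print(satisfies_definition(data, q1, 0.25), satisfies_definition(data, q3, 0.75))
```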
Interquartile range (IQR): $Q_3 - Q_1$
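A common use of the IQR is to flag potential outliers: observations further than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$. This rule is a widely used convention (it is how box plot whiskers are usually drawn), not something stated in these notes, and the factor 1.5 is an adjustable assumption. The sketch uses `statistics.quantiles` (Python 3.8+), which interpolates between observations, so its quartiles may differ slightly from other conventions.

```python
import statistics


def iqr_outliers(xs, k=1.5):
    """Observations outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is a convention."""
    q1, _median, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < low or x > high]


data = [10, 12, 12, 13, 14, 15, 15, 16, 18, 95]  # made-up sample

outliers = iqr_outliers(data)
print(outliers)  # [95]
```

Flagged values are candidates for inspection; whether they are truly outliers depends on how the data were collected.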
Coefficient of variation: $CV = \frac{s}{\bar{x}}$
where $s$ is the sample standard deviation and $\bar{x}$ is the sample mean
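Because the coefficient of variation divides the standard deviation by the mean, it does not depend on the unit of measurement. A small sketch with made-up height data illustrates this; the function name is illustrative.

```python
import statistics


def coefficient_of_variation(xs):
    """Sample standard deviation divided by the sample mean."""
    mean = statistics.mean(xs)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(xs) / mean


heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

# Same relative spread regardless of whether heights are in cm or m.
print(cv_cm, cv_m)
```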