GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_6_descriptive_statistics.ipynb
Kernel: Python 3
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from scipy.ndimage import mean, median, variance
from stemgraphic import stem_graphic

Chapter 6 Descriptive Statistics

Data Presentation

Show the data

# Load data
data = np.array([45,62,52,72,91,88,64,65,69,59,70,63,80,70,59,87,59,69,68,69,56,59,74,60,79,56,177,61,60,78,66,61,47,63,63,57,77,67,55,55,56,39,65,60,80,41,72,77,54,81,63,70,73,76,61,75,62,59,64,61,70,65,83,61,56,64,72,90,86,63,63,63,65,80,69,62,75,59,81,79,94,63,64,55,61,66,65,72,61,76,48,92,135,67,73,66,143,82,71,51,70,71,45,64,89,66,66,65,60,64,59,93,84,47,48,65,74,57,62,79,62,68,73,54,55,78,69,69,61,186,55,68,76,70,69,61,55,61,82,83,66,59,69,61,93,76,81,65,67,51,69,77,78,63,77,61,61,66,87,53,67,78,68,80,89,77,63,67,95,54,64,63,28,73,75,65,67,62,65,88,78,75,71,72,60,53,67,81,85,71,49,70,49,58,63,105,62,72,66,79])

# Reshape into a single column and wrap in a DataFrame
data = data.reshape(len(data), 1)
data_frame = pd.DataFrame(data, columns=['Service Times'])
print(data_frame)
     Service Times
0               45
1               62
2               52
3               72
4               91
..             ...
195            105
196             62
197             72
198             66
199             79

[200 rows x 1 columns]

Data graphs

Include:

  • Histograms

  • Stem-and-Leaf Plots

  • Box Plots

# Generate the histogram
data_frame.hist(['Service Times'], grid=False)
plt.show()

[Figure: histogram of Service Times]
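To see the numbers behind the histogram, the bin counts can be computed directly. The following is a minimal sketch, assuming ten equal-width bins (the default for both `np.histogram` and `DataFrame.hist`) and, to keep it short, an illustrative subset made of the first ten service times rather than the full array.

```python
import numpy as np

# Illustrative subset: the first ten service times from the data above.
sample = np.array([45, 62, 52, 72, 91, 88, 64, 65, 69, 59])

# np.histogram returns the count per bin and the bin edges.
# With bins=10 there are 11 edges spanning min(sample) to max(sample).
counts, edges = np.histogram(sample, bins=10)

# Print each bin's range and how many observations fall into it.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

Replacing `sample` with `data_frame['Service Times']` should give the counts that the plotted bars represent.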
# Generate the stem-and-leaf plot
stem_graphic(data_frame['Service Times'])
plt.show()

[Figure: stem-and-leaf plot of Service Times]
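The idea behind a stem-and-leaf plot can also be shown in plain text. This is a hand-rolled sketch, not the algorithm `stem_graphic` uses (which may pick a different scale): it assumes the stem is the value divided by ten and the leaf is the units digit, and it uses only the first ten service times as an illustrative subset.

```python
from collections import defaultdict

# Illustrative subset: the first ten service times from the data above.
sample = [45, 62, 52, 72, 91, 88, 64, 65, 69, 59]

# Stem = value // 10 (tens and above), leaf = value % 10 (units digit).
stems = defaultdict(list)
for value in sorted(sample):
    stems[value // 10].append(value % 10)

# One row per stem, with the leaves in ascending order.
for stem in sorted(stems):
    leaves = "".join(str(leaf) for leaf in stems[stem])
    print(f"{stem:>2} | {leaves}")
```

Each row keeps the individual observations visible, which is what distinguishes this display from a histogram.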
# Generate the box plot
plt.boxplot(data_frame, vert=False)
plt.show()

[Figure: horizontal box plot of Service Times]
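The points drawn beyond the whiskers of a box plot follow from a simple rule. The sketch below assumes matplotlib's default whisker setting (`whis=1.5`), under which any point more than 1.5 times the interquartile range outside the quartiles is drawn individually. The sample is an illustrative subset: the first ten service times plus two extreme values (177 and 28) that also appear in the data.

```python
import numpy as np

# Illustrative subset of the service times, including two extreme values.
sample = np.array([45, 62, 52, 72, 91, 88, 64, 65, 69, 59, 177, 28])

# Quartiles with linear interpolation (np.percentile's default).
q1, q3 = np.percentile(sample, [25, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles; points outside are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sample[(sample < lower_fence) | (sample > upper_fence)]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Flagged points:", outliers)
```

Applied to the full data set, the same rule would be expected to flag the largest service times, which the histogram also shows as an isolated right tail.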

Sample Statistics

  • Mean

  • Variance

  • Median

  • Trimmed Mean

  • Mode

  • Quantiles and Interquartile Range

  • Coefficient of Variation

# Calculate the mean
mu = mean(data_frame)

# Calculate the sample variance (square of the sample standard deviation)
var = stats.tstd(data_frame) ** 2

# Calculate the median
medi = median(data_frame)

# Calculate the trimmed mean (drop 5% of the values from each end)
r = 0.05
trim_mean = stats.trim_mean(data_frame, r)

# Calculate the mode
mode = stats.mode(data_frame)

# Calculate the upper and lower quartiles
upper_quantile = stats.mstats.mquantiles(data_frame, prob=[0.75])
lower_quantile = stats.mstats.mquantiles(data_frame, prob=[0.25])

# Calculate the interquartile range
inter_quantile = upper_quantile - lower_quantile

# Calculate the coefficient of variation (standard deviation / mean)
coeff = stats.variation(data_frame)

# Output
print('----- Mean -----\n{}'.format(mu))
print('----- Varn -----\n{}'.format(var))
print('----- Medi -----\n{}'.format(medi))
print('----- Trim -----\n{}'.format(trim_mean))
print('----- Mode -----\n{}'.format(mode))
print('----- Up-Q -----\n{}'.format(upper_quantile))
print('----- Lo-Q -----\n{}'.format(lower_quantile))
print('----- Inte -----\n{}'.format(inter_quantile))
print('----- Coef -----\n{}'.format(coeff))
----- Mean -----
Service Times    69.345
dtype: float64
----- Varn -----
[309.31253769]
----- Medi -----
66.0
----- Trim -----
[67.88333333]
----- Mode -----
ModeResult(mode=array([[61]]), count=array([[13]]))
----- Up-Q -----
[76.]
----- Lo-Q -----
[61.]
----- Inte -----
[15.]
----- Coef -----
[0.25298522]
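To make the definitions behind these numbers explicit, several of the statistics can be recomputed by hand with NumPy alone. This is a sketch under stated assumptions, not a replacement for the SciPy calls above: it uses only the first twenty service times as an illustrative subset, so its results will not match the output for the full 200 observations.

```python
import numpy as np

# Illustrative subset: the first twenty service times from the data above.
sample = np.array([45, 62, 52, 72, 91, 88, 64, 65, 69, 59,
                   70, 63, 80, 70, 59, 87, 59, 69, 68, 69])
n = len(sample)

# Sample variance divides by n - 1 (ddof=1), like stats.tstd(...) ** 2.
sample_var = sample.var(ddof=1)

# Trimmed mean: sort, drop int(r * n) values from each end, then average.
r = 0.05
k = int(r * n)
trimmed = np.sort(sample)[k:n - k]
trimmed_mean = trimmed.mean()

# Coefficient of variation: standard deviation divided by the mean.
# ddof=0 (population standard deviation) mirrors the default of stats.variation.
coeff_var = sample.std(ddof=0) / sample.mean()

# Interquartile range from the 25th and 75th percentiles.
# np.percentile interpolates linearly, which can differ slightly from
# the plotting positions used by stats.mstats.mquantiles.
q1, q3 = np.percentile(sample, [25, 75])
iqr = q3 - q1

print("Mean:", sample.mean())
print("Sample variance:", sample_var)
print("Trimmed mean:", trimmed_mean)
print("Coefficient of variation:", coeff_var)
print("Interquartile range:", iqr)
```

In the full data set the trimmed mean (67.88) sits below the mean (69.345) and closer to the median (66.0), which is consistent with a few very large service times pulling the ordinary mean upward.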