GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_8_inferences_on_a_population_mean.ipynb
Kernel: Python 3
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.stats.weightstats as sms
from scipy import stats

Chapter 8 Inferences on a Population Mean

Confidence Intervals

Confidence Interval - an interval of plausible values for an unknown parameter, computed from the sample.

Confidence Level $1-\alpha$ - the probability that an interval constructed this way contains the parameter.
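The confidence level can be checked empirically. The sketch below repeatedly draws samples from a normal population with a known mean, builds the two-sided t-interval described in the next section, and counts how often the interval contains the true mean. The population parameters, the number of trials, and the helper name `coverage_rate` are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

def coverage_rate(mu=50.0, sigma=0.134, n=60, alpha=0.1, trials=5000, seed=0):
    """Fraction of simulated two-sided t-intervals that contain mu."""
    rng = np.random.default_rng(seed)
    # One row per simulated sample of size n
    samples = rng.normal(mu, sigma, size=(trials, n))
    x_bar = samples.mean(axis=1)
    s = samples.std(axis=1, ddof=1)  # sample standard deviation
    wing = stats.t.ppf(1 - alpha / 2, n - 1) * s / np.sqrt(n)
    covered = (x_bar - wing <= mu) & (mu <= x_bar + wing)
    return covered.mean()

rate = coverage_rate()
print('Empirical coverage\t{:.3f}'.format(rate))
```

With `alpha = 0.1` the printed fraction should be close to 0.9, up to simulation noise.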

Two-sided t-interval

Requirements:

  • Continuous data set of length $n$

  • Sample mean $\bar{x}$

  • Sample standard deviation $s$

  • N.B. The population standard deviation $\sigma$ is unknown.

The $1-\alpha$ confidence interval for the population mean $\mu$ is

$$\left( \bar{x} - \frac{t_{\alpha/2, n-1} S}{\sqrt{n}}, \bar{x} + \frac{t_{\alpha/2, n-1} S}{\sqrt{n}} \right)$$

T distribution:

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$
# Two-sided t-interval
# Input
n = 60
s = 0.134
x_bar = 49.9999
alpha = 0.1
# Calculate the critical value t_{alpha/2, n-1} and the interval half-width
t = stats.t.ppf(1 - alpha / 2, n - 1)
wing = t * s / math.sqrt(n)
# Output
print('Critical Value\t\t{:.4f}'.format(t))
print('Interval Length\t\t{:.4f}'.format(2 * wing))
print('Confidence Interval\t({:.4f}, {:.4f})'.format(x_bar - wing, x_bar + wing))

Critical Value 1.6711
Interval Length 0.0578
Confidence Interval (49.9710, 50.0288)
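The cell above starts from summary statistics. When the raw observations are available, the same interval can be computed directly from them. The following is a minimal sketch: the helper `t_interval` and the simulated data are hypothetical, not part of the original notebook. The result is cross-checked against `scipy.stats.t.interval`, with the confidence level passed positionally.

```python
import numpy as np
from scipy import stats

def t_interval(data, alpha=0.1):
    """Two-sided (1 - alpha) t-interval for the mean of a 1-D sample."""
    data = np.asarray(data, dtype=float)
    n = data.size
    x_bar = data.mean()
    s = data.std(ddof=1)  # sample standard deviation (divides by n - 1)
    wing = stats.t.ppf(1 - alpha / 2, n - 1) * s / np.sqrt(n)
    return x_bar - wing, x_bar + wing

# Hypothetical data: 60 simulated measurements, not the author's data set
rng = np.random.default_rng(42)
data = rng.normal(loc=50.0, scale=0.134, size=60)

low, high = t_interval(data, alpha=0.1)

# Cross-check: stats.sem returns s / sqrt(n) with ddof=1 by default
ref_low, ref_high = stats.t.interval(0.9, data.size - 1,
                                     loc=data.mean(), scale=stats.sem(data))
print('Confidence Interval\t({:.4f}, {:.4f})'.format(low, high))
print('SciPy Interval\t\t({:.4f}, {:.4f})'.format(ref_low, ref_high))
```

Both printed intervals should agree, since they implement the same formula.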

One-sided t-interval

The one-sided $1-\alpha$ confidence interval for the population mean $\mu$ is either

$$\left( -\infty, \bar{x} + \frac{t_{\alpha, n-1} S}{\sqrt{n}} \right)$$

or

$$\left( \bar{x} - \frac{t_{\alpha, n-1} S}{\sqrt{n}}, +\infty \right)$$
# One-sided t-interval
# Input
n = 60
s = 0.134
x_bar = 49.9999
alpha = 0.1
# Calculate the critical value t_{alpha, n-1} (the whole alpha goes in one tail)
t = stats.t.ppf(1 - alpha, n - 1)
wing = t * s / math.sqrt(n)
# Output
print('Critical Value\t\t{:.4f}'.format(t))
print('----- Upper Bound -----')
print('Confidence Interval\t(-inf, {:.4f})'.format(x_bar + wing))
print('----- Lower Bound -----')
print('Confidence Interval\t({:.4f}, +inf)'.format(x_bar - wing))

Critical Value 1.2961
----- Upper Bound -----
Confidence Interval (-inf, 50.0223)
----- Lower Bound -----
Confidence Interval (49.9775, +inf)
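A one-sided bound at level $\alpha$ uses the critical value $t_{\alpha, n-1}$, which is the same critical value a two-sided interval uses at level $2\alpha$. The short sketch below reuses the summary statistics from the cell above to illustrate this. The variable names are illustrative only.

```python
import math
from scipy import stats

n, s, x_bar, alpha = 60, 0.134, 49.9999, 0.1
se = s / math.sqrt(n)  # estimated standard error of the mean

# One-sided upper bound at level alpha
one_sided_upper = x_bar + stats.t.ppf(1 - alpha, n - 1) * se
# Upper endpoint of the two-sided interval at level 2 * alpha
two_sided_upper = x_bar + stats.t.ppf(1 - (2 * alpha) / 2, n - 1) * se

print('One-sided upper bound\t{:.4f}'.format(one_sided_upper))
print('Two-sided upper end\t{:.4f}'.format(two_sided_upper))
```

The two printed values coincide, so a 90% one-sided bound is an endpoint of the 80% two-sided interval.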

Two-sided z-interval

When the population standard deviation $\sigma$ is known, the $1-\alpha$ confidence interval for the population mean $\mu$ is

$$\left( \bar{x} - \frac{z_{\alpha/2} \sigma}{\sqrt{n}}, \bar{x} + \frac{z_{\alpha/2} \sigma}{\sqrt{n}} \right)$$
# Two-sided z-interval
# Input
n = 60
sigma = 0.134
x_bar = 49.9999
alpha = 0.1
# Calculate the critical value z_{alpha/2} and the interval half-width
z = stats.norm.ppf(1 - alpha / 2)
wing = z * sigma / math.sqrt(n)
# Output
print('Critical Value\t\t{:.4f}'.format(z))
print('Interval Length\t\t{:.4f}'.format(2 * wing))
print('Confidence Interval\t({:.4f}, {:.4f})'.format(x_bar - wing, x_bar + wing))

Critical Value 1.6449
Interval Length 0.0569
Confidence Interval (49.9714, 50.0284)
# One-sided z-interval
# Input
n = 60
sigma = 0.134
x_bar = 49.9999
alpha = 0.1
# Calculate the critical value z_{alpha}
# (norm.ppf takes no degrees of freedom; a second positional argument would be read as loc)
z = stats.norm.ppf(1 - alpha)
wing = z * sigma / math.sqrt(n)
# Output
print('Critical Value\t\t{:.4f}'.format(z))
print('----- Upper Bound -----')
print('Confidence Interval\t(-inf, {:.4f})'.format(x_bar + wing))
print('----- Lower Bound -----')
print('Confidence Interval\t({:.4f}, +inf)'.format(x_bar - wing))

Critical Value 1.2816
----- Upper Bound -----
Confidence Interval (-inf, 50.0221)
----- Lower Bound -----
Confidence Interval (49.9777, +inf)
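The t-interval is slightly wider than the z-interval because $t_{\alpha/2, n-1}$ exceeds $z_{\alpha/2}$, and the gap shrinks as $n$ grows. The sketch below tabulates both critical values. The sample sizes are arbitrary choices for illustration.

```python
from scipy import stats

alpha = 0.1
z_crit = stats.norm.ppf(1 - alpha / 2)

sample_sizes = (5, 15, 30, 60, 1000)
# Critical value t_{alpha/2, n-1} for each sample size
t_crits = [stats.t.ppf(1 - alpha / 2, n - 1) for n in sample_sizes]

for n, t_crit in zip(sample_sizes, t_crits):
    print('n = {:4d}\tt = {:.4f}\tz = {:.4f}'.format(n, t_crit, z_crit))
```

For large $n$ the two critical values are nearly identical, which is why the z-interval is often used as an approximation for large samples.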

Hypothesis

Null Hypothesis $H_0$ - a statement that designates a value (or range of values) for the parameter.

Alternative Hypothesis $H_A$ - the complement of the null hypothesis.

Two-sided hypothesis:

$$H_0:\mu = \mu_0 \quad \text{versus} \quad H_A:\mu \ne \mu_0$$

One-sided hypothesis:

$$H_0:\mu \leq \mu_0 \quad \text{versus} \quad H_A:\mu > \mu_0$$

$$H_0:\mu \geq \mu_0 \quad \text{versus} \quad H_A:\mu < \mu_0$$

$p$-value - the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one obtained.

  • $p$-value $<$ significance level - reject the null hypothesis

  • $p$-value $\geq$ significance level - accept (fail to reject) the null hypothesis

  • N.B. Accepting the null hypothesis does not prove that it is true.

# Two-sided t-test (H_A: mu != mu_0)
# Input
n = 60
s = 0.1334
x_bar = 49.99856
mu_0 = 50
# Calculate the t-statistic and the probability in both tails
t = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = 2 * stats.t.sf(abs(t), n - 1)
# Output
print('P-Value\t{:.4f}'.format(p_value))

P-Value 0.9336
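# One-sided t-test (H_0: mu <= mu_0 versus H_A: mu > mu_0)
# Input
n = 60
s = 0.1334
x_bar = 49.99856
mu_0 = 50
# Calculate the t-statistic and the upper-tail probability P(T >= t)
# (for H_A: mu < mu_0, use stats.t.cdf(t, n - 1) instead)
t = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = stats.t.sf(t, n - 1)
# Output
print('P-Value\t{:.4f}'.format(p_value))

P-Value 0.5332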
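Which tail the $p$-value uses depends on the alternative hypothesis. The sketch below collects the three cases in one helper. The function name `t_test_p_value` and the `alternative` labels are assumptions made for this illustration (the labels mirror the ones SciPy uses for its own tests).

```python
import math
from scipy import stats

def t_test_p_value(x_bar, s, n, mu_0, alternative='two-sided'):
    """p-value of a one-sample t-test computed from summary statistics."""
    t = (x_bar - mu_0) / (s / math.sqrt(n))
    if alternative == 'two-sided':  # H_A: mu != mu_0, both tails
        return 2 * stats.t.sf(abs(t), n - 1)
    if alternative == 'greater':    # H_A: mu > mu_0, upper tail
        return stats.t.sf(t, n - 1)
    if alternative == 'less':       # H_A: mu < mu_0, lower tail
        return stats.t.cdf(t, n - 1)
    raise ValueError('unknown alternative: {}'.format(alternative))

p_two = t_test_p_value(49.99856, 0.1334, 60, 50, 'two-sided')
p_greater = t_test_p_value(49.99856, 0.1334, 60, 50, 'greater')
p_less = t_test_p_value(49.99856, 0.1334, 60, 50, 'less')
for name, p in (('two-sided', p_two), ('greater', p_greater), ('less', p_less)):
    print('{}\t{:.4f}'.format(name, p))
```

The two one-sided $p$-values always sum to 1, which gives a quick sanity check on the sign conventions.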
# Two-sided acceptance region
# Input
n = 60
s = 0.1334
x_bar = 49.99856
mu_0 = 50
alpha = 0.1
# Calculate the set of mu_0 values that a size-alpha two-sided test accepts;
# H_0 is accepted when mu_0 lies inside this interval
t = stats.t.ppf(1 - alpha / 2, n - 1)
wing = t * (s / math.sqrt(n))
# Output
print('Acceptance Region\t({:.4f}, {:.4f})'.format(x_bar - wing, x_bar + wing))

Acceptance Region (49.9698, 50.0273)
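# One-sided acceptance region (H_0: mu >= mu_0 versus H_A: mu < mu_0)
# Input
n = 60
s = 0.1334
x_bar = 49.99856
mu_0 = 50
alpha = 0.1
# Calculate the set of mu_0 values that a size-alpha one-sided test accepts;
# H_0 is accepted when mu_0 lies inside this interval
t = stats.t.ppf(1 - alpha, n - 1)
wing = t * (s / math.sqrt(n))
# Output
print('Acceptance Region\t(-inf, {:.4f})'.format(x_bar + wing))

Acceptance Region (-inf, 50.0209)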
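The interval of accepted $\mu_0$ values and the $p$-value rule lead to the same decision: a two-sided test of size $\alpha$ accepts $H_0$ exactly when $\mu_0$ lies in the $1-\alpha$ two-sided interval around $\bar{x}$. The sketch below checks this for a few candidate values of $\mu_0$. The candidates and the helper `accepts` are hypothetical.

```python
import math
from scipy import stats

n, s, x_bar, alpha = 60, 0.1334, 49.99856, 0.1
se = s / math.sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, n - 1)
low, high = x_bar - t_crit * se, x_bar + t_crit * se

def accepts(mu_0):
    """True when the two-sided p-value is at least alpha."""
    p_value = 2 * stats.t.sf(abs((x_bar - mu_0) / se), n - 1)
    return bool(p_value >= alpha)

# Each entry: (mu_0, is mu_0 inside the interval?, does the test accept H_0?)
checks = [(mu_0, low <= mu_0 <= high, accepts(mu_0))
          for mu_0 in (49.95, 50.0, 50.05)]
for mu_0, inside, accepted in checks:
    print('mu_0 = {:.2f}\tinside = {}\taccepted = {}'.format(mu_0, inside, accepted))
```

In every row the two boolean columns agree.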

Z-test hypothesis

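# Two-sided z-test (H_A: mu != mu_0)
# Input
n = 60
sigma = 0.1334
x_bar = 49.99856
mu_0 = 50
# Calculate the z-statistic and the probability in both tails
# (the standard normal has no degrees of freedom, so n - 1 must not be passed)
z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = 2 * stats.norm.sf(abs(z))
# Output
print('P-Value\t{:.4f}'.format(p_value))

P-Value 0.9334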
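# One-sided z-test (H_0: mu <= mu_0 versus H_A: mu > mu_0)
# Input
n = 60
sigma = 0.1334
x_bar = 49.99856
mu_0 = 50
# Calculate the z-statistic and the upper-tail probability P(Z >= z)
# (for H_A: mu < mu_0, use stats.norm.cdf(z) instead)
z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = stats.norm.sf(z)
# Output
print('P-Value\t{:.4f}'.format(p_value))

P-Value 0.5333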
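A note on SciPy's argument order, since it is easy to trip over when switching from t-tests to z-tests: for `stats.t` the second positional argument is the degrees of freedom, but for `stats.norm` it is `loc`, the mean of the distribution. Passing `n - 1` to `stats.norm.sf` or `stats.norm.cdf` therefore shifts the distribution instead of raising an error. The sketch below shows the effect with a hypothetical z-statistic.

```python
from scipy import stats

z = -0.0836  # hypothetical z-statistic
n = 60

# Correct: standard normal, no extra argument
correct = 2 * stats.norm.sf(abs(z))
# Pitfall: n - 1 is interpreted as loc, i.e. a normal centred at 59
shifted = 2 * stats.norm.sf(abs(z), n - 1)

print('Correct p-value\t\t{:.4f}'.format(correct))
print('With loc = n - 1\t{:.4f}'.format(shifted))
```

A "p-value" outside the range from 0 to 1, or one that is essentially 0 or 1 for an unremarkable statistic, is a sign that this has happened.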