GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_5_normal_distribution.ipynb
Kernel: Python 3

In [1]:

from scipy.stats import norm
from scipy.stats import chi2
from scipy.stats import t
import math

In [4]:

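def print_status(exp, var, pmf, cdf):
    # N.B. the 'PMF' row is the interval probability P(a <= X <= b) computed by each cell,
    # not a mass/density value; the label is kept so it matches the outputs shown.
    print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'.format(exp, var, pmf, cdf))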

Chapter 5 Normal Distribution

$X\sim N(\mu, \sigma^2):f(x;\mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

The standard normal distribution

$X\sim N(0, 1): f(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$

In [13]:

# Normal Distribution
# Input
mu = 2
sigma = 4 # N.B. NOT sigma^2
pdf_variable = [-4, 7.2] # [start point, end point]
cdf_variable = 2

# Calculate
exp, var = norm.stats(loc=mu, scale=sigma, moments='mv')
pdf = norm.cdf(pdf_variable[1], loc=mu, scale=sigma) - \
norm.cdf(pdf_variable[0], loc=mu, scale=sigma)
cdf = norm.cdf(cdf_variable, loc=mu, scale=sigma)

# Output
print_status(exp, var, pdf, cdf)

Out[13]:

Exp	 2.000000
Var	 16.000000
PMF	 0.836392
CDF	 0.500000
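The interval probability above can also be obtained by standardizing, i.e. mapping $X$ to $Z = (X-\mu)/\sigma \sim N(0, 1)$ and using the standard normal CDF. The following is a minimal sketch of that equivalence, reusing the parameters of the cell above; the variable names are illustrative only.

```python
from scipy.stats import norm

mu, sigma = 2, 4      # same parameters as the cell above (sigma, not sigma^2)
a, b = -4, 7.2        # interval end points

# Interval probability computed directly with loc/scale
p_direct = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)

# Same probability after standardizing: Z = (X - mu) / sigma
z_a = (a - mu) / sigma
z_b = (b - mu) / sigma
p_standardized = norm.cdf(z_b) - norm.cdf(z_a)

print('{:.6f} {:.6f}'.format(p_direct, p_standardized))
```

Both numbers should agree up to floating-point rounding, since `loc` and `scale` perform exactly this shift and rescaling.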

In [6]:

# Normal Distribution inverse CDF (CDF^-1, the quantile function)
# Input
mu = 0
sigma = 1
ppf_x = 0.16 # Probability

# Calculate
x = norm.ppf(ppf_x, loc=mu, scale=sigma)

# Output
print(x)

Out[6]:

-0.994457883209753

In [16]:

# Standard Normal Distribution
# Input
mu = 0
sigma = 1
pdf_variable = [-0.19, 0.29] # [start point, end point]
cdf_variable = 0.2

# Calculate
exp, var = norm.stats(loc=mu, scale=sigma, moments='mv')
pdf = norm.cdf(pdf_variable[1], loc=mu, scale=sigma) - \
norm.cdf(pdf_variable[0], loc=mu, scale=sigma)
cdf = norm.cdf(cdf_variable, loc=mu, scale=sigma)

# Output
print_status(exp, var, pdf, cdf)

Out[16]:

Exp	 0.000000
Var	 1.000000
PMF	 0.189437
CDF	 0.579260

In [15]:

# Standard Normal Distribution inverse CDF (CDF^-1, the quantile function)
# Input
mu = 0
sigma = 1
ppf_x = 0.2 # Probability

# Calculate
x = norm.ppf(ppf_x, loc=mu, scale=sigma)

# Output
print(x)

Out[15]:

-0.8416212335729142
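As a sanity check on the inverse CDF, the sketch below confirms that `norm.cdf` undoes `norm.ppf`, and that the upper-tail point returned by `norm.isf` is the mirror image of the lower-tail one for the standard normal. The probability 0.2 is taken from the cell above; the other names are illustrative.

```python
from scipy.stats import norm

p = 0.2
x = norm.ppf(p)        # quantile: the x with P(Z <= x) = p
p_back = norm.cdf(x)   # applying the CDF recovers p
upper = norm.isf(p)    # inverse survival function: the x with P(Z >= x) = p

print(x, p_back, upper)
```

Because the standard normal is symmetric about 0, `upper` should equal `-x`.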

5.2 Linear combinations of normal distribution

$X\sim N(\mu, \sigma^2)\Longrightarrow aX+b\sim N(a\mu+b, a^2\sigma^2)$

$X_1\sim N(\mu_1, \sigma_1^2), X_2\sim N(\mu_2, \sigma_2^2)$ independent $\Longrightarrow X_1+X_2\sim N(\mu_1+\mu_2, \sigma_1^2+\sigma_2^2)$

$X_1, \dots, X_n\sim N(\mu, \sigma^2)$ independent $\Longrightarrow \bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i\sim N(\mu, \frac{\sigma^2}{n})$
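The three rules above only change the mean and the variance, so they can be sketched as small helper functions. The helpers `affine`, `add_independent` and `sample_mean` are hypothetical names introduced here for illustration, and the numbers are made up; independence is assumed for the sum and for the sample mean.

```python
import math
from scipy.stats import norm

def affine(mu, var, a, b):
    """Mean and variance of a*X + b for X ~ N(mu, var)."""
    return a * mu + b, a ** 2 * var

def add_independent(mu1, var1, mu2, var2):
    """Mean and variance of X1 + X2 for independent normal X1, X2."""
    return mu1 + mu2, var1 + var2

def sample_mean(mu, var, n):
    """Mean and variance of the average of n independent N(mu, var) draws."""
    return mu, var / n

m, v = affine(2, 16, a=3, b=1)            # 3X + 1 for X ~ N(2, 16)
s_m, s_v = add_independent(1, 4, 2, 9)    # N(1, 4) + N(2, 9)
b_m, b_v = sample_mean(2, 16, n=25)       # mean of 25 draws from N(2, 16)

# P(sample mean <= 3); scale expects the standard deviation, not the variance
p = norm.cdf(3, loc=b_m, scale=math.sqrt(b_v))
print(m, v, s_m, s_v, b_m, b_v, p)
```

Note that `scale` in `scipy.stats.norm` is the standard deviation, so the variance returned by each helper has to be passed through `math.sqrt`.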

5.3 Approximating distribution with normal distribution

For $X\sim B(n, p)$ and $Z\sim N(0, 1)$, applying a continuity correction of $0.5$, we have:

$P(X\leq x)\approx P(Z\leq\frac{x+0.5-np}{\sqrt{np(1-p)}})$
$P(X\geq x)\approx P(Z\geq\frac{x-0.5-np}{\sqrt{np(1-p)}})$

In [6]:

# Input
n = 100
p = 0.2
pdf_variable_x = [0, 50] # [start point, end point]
cdf_variable_x = 100

# Calculate the value
mu = n * p
sigma = math.sqrt(n * p * (1 - p)) # standard deviation, NOT the variance
def approximate(x, mu, sigma):
    # Continuity-corrected z-score for P(X <= x)
    return (x + 0.5 - mu) / sigma
pdf_variable = [approximate(pdf_variable_x[0], mu, sigma), \
                approximate(pdf_variable_x[1], mu, sigma)] # [start point, end point]
cdf_variable = approximate(cdf_variable_x, mu, sigma)

# Calculate
exp, var = norm.stats(loc=mu, scale=sigma, moments='mv')
# The z-scores are already standardized, so use the standard normal CDF
pdf = norm.cdf(pdf_variable[1]) - norm.cdf(pdf_variable[0])
cdf = norm.cdf(cdf_variable)

# Output
print_status(exp, var, pdf, cdf)

Out[6]:

Exp	 20.000000
Var	 16.000000
PMF	 0.999999
CDF	 1.000000
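To see how close the continuity-corrected approximation is, it can be compared with the exact binomial probabilities from `scipy.stats.binom`. This is only a sketch: `n` and `p` match the cell above, while the cut-off `x = 25` is an arbitrary value chosen for illustration.

```python
import math
from scipy.stats import binom, norm

n, p = 100, 0.2
x = 25  # illustrative cut-off, not from the notebook

mu = n * p
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of B(n, p)

# P(X <= x): exact vs. normal approximation with +0.5 correction
exact_le = binom.cdf(x, n, p)
approx_le = norm.cdf((x + 0.5 - mu) / sigma)

# P(X >= x) = 1 - P(X <= x - 1): exact vs. approximation with -0.5 correction
exact_ge = binom.sf(x - 1, n, p)
approx_ge = norm.sf((x - 0.5 - mu) / sigma)

print('P(X <= {}): exact {:.4f}, approx {:.4f}'.format(x, exact_le, approx_le))
print('P(X >= {}): exact {:.4f}, approx {:.4f}'.format(x, exact_ge, approx_ge))
```

The approximation tends to be better when $np$ and $n(1-p)$ are both reasonably large; for small $n$ or extreme $p$ the exact binomial is the safer choice.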

5.3.2 Central limit theorem

If $X_1, X_2, \dots, X_n$ are independent and identically distributed with mean $\mu$ and finite variance $\sigma^2$, then

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i\approx N(\mu, \frac{\sigma^2}{n})$

for large $n$, with the approximation improving as $n \to \infty$.
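A quick simulation can illustrate the theorem. The sketch below draws sample means from an exponential distribution (mean 1, variance 1, clearly not normal) and checks that they behave roughly like $N(\mu, \sigma^2/n)$. The distribution, sample size, number of trials and seed are all arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
n = 50                           # sample size per mean
trials = 20000                   # number of simulated sample means

# Exponential with scale 1 has mu = 1 and sigma^2 = 1
samples = rng.exponential(scale=1.0, size=(trials, n))
means = samples.mean(axis=1)

# Compare with the CLT prediction N(mu, sigma^2 / n)
se = np.sqrt(1.0 / n)
frac_within = np.mean(np.abs(means - 1.0) <= 1.96 * se)

print('mean of sample means:', means.mean())
print('variance of sample means:', means.var(), 'predicted:', 1.0 / n)
print('fraction within 1.96 standard errors:', frac_within)
```

If the normal approximation holds, about 95% of the sample means should fall within 1.96 standard errors of $\mu$.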

5.4 Distributions related to the Normal distribution

5.4.1 The Lognormal Distribution

$\ln(X)\sim N(\mu, \sigma^2): f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma x}\exp(-\frac{(\ln(x)-\mu)^2}{2\sigma^2})$

CDF: $F(x;\mu, \sigma) = \Phi(\frac{\ln(x)-\mu}{\sigma})$

$E(X) = \exp(\mu + \frac{\sigma^2}{2})$

$Var(X) = e^{2\mu+\sigma^2}(e^{\sigma^2}-1)$
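The notebook has no cell for the lognormal distribution, so here is a minimal sketch using `scipy.stats.lognorm`. In SciPy's parameterization the shape `s` corresponds to $\sigma$ and `scale` to $e^{\mu}$; the values of `mu`, `sigma` and `x` below are made up for illustration.

```python
import math
from scipy.stats import lognorm, norm

mu, sigma = 0.5, 0.8   # illustrative parameters of ln(X)
x = 2.0                # illustrative evaluation point

# SciPy's lognorm: shape s = sigma, scale = exp(mu)
dist = lognorm(sigma, scale=math.exp(mu))

# CDF via SciPy vs. the closed form Phi((ln(x) - mu) / sigma)
cdf_scipy = dist.cdf(x)
cdf_formula = norm.cdf((math.log(x) - mu) / sigma)

# Mean and variance via the closed-form expressions
mean_formula = math.exp(mu + sigma ** 2 / 2)
var_formula = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)

print(cdf_scipy, cdf_formula)
print(dist.mean(), mean_formula)
print(dist.var(), var_formula)
```

Each printed pair should agree up to rounding, which is a convenient way to check that the parameter mapping is right.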

5.4.2 Chi-Square Distribution

$X_i\sim N(0, 1)$ independent, $X = \sum_{i=1}^{v}X_i^2 \sim \chi_v^2$

$f(x;v) = \frac{\frac{1}{2}e^{-x/2}(\frac{x}{2})^{v/2-1}}{\Gamma(\frac{v}{2})}$

$\chi_v^2 = Gam(\frac{v}{2}, \frac{1}{2})$

$v$ : degrees of freedom

$E(X) = v$

$Var(X) = 2v$
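The Gamma relation can be checked numerically. The density above corresponds to a Gamma distribution with shape $v/2$ and rate $1/2$; `scipy.stats.gamma` takes a scale rather than a rate, so rate $1/2$ becomes `scale=2`. The values of `v` and `x` below are illustrative.

```python
from scipy.stats import chi2, gamma

v = 12      # degrees of freedom
x = 13.3    # evaluation point

chi2_cdf = chi2.cdf(x, v)
# Gamma with shape v/2 and rate 1/2, i.e. scale = 1 / rate = 2
gamma_cdf = gamma.cdf(x, v / 2, scale=2)

mean, var = chi2.stats(v, moments='mv')

print(chi2_cdf, gamma_cdf)
print(float(mean), float(var))   # should match v and 2v
```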

In [9]:

# Input
df = 12
pdf_variable = [0, 13.3] # [start point, end point]
cdf_variable = 13.3

# Calculate
exp, var = chi2.stats(df, moments='mv')
pdf = chi2.cdf(pdf_variable[1], df) - \
chi2.cdf(pdf_variable[0], df)
cdf = chi2.cdf(cdf_variable, df)

# Output
print_status(exp, var, pdf, cdf)

Out[9]:

Exp	 12.000000
Var	 24.000000
PMF	 0.652382
CDF	 0.652382

In [8]:

# Input
df = 8
alpha = 0.12

# Calculate
x = chi2.ppf(1-alpha, df)

# Output
print('Critical points: {:.6f}'.format(x))

Out[8]:

Critical points: 12.770329

5.4.3 T Distribution

$Z\sim N(0, 1), W\sim \chi_v^2$, with $Z$ and $W$ independent

$T_v = \frac{Z}{\sqrt{W/v}}\sim t_v$

$P(T_v\geq t_{\alpha, v})=\alpha$

N.B. The t distribution is symmetric about $x=0$
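The symmetry has a practical consequence: the lower-tail and upper-tail critical points differ only in sign. The sketch below checks this with `scipy.stats.t`; the degrees of freedom, `alpha` and `x` are illustrative values.

```python
from scipy.stats import t

df = 15
alpha = 0.025

lower = t.ppf(alpha, df)       # point with P(T <= lower) = alpha
upper = t.ppf(1 - alpha, df)   # point with P(T >= upper) = alpha

# Symmetry of the CDF: P(T <= -x) = 1 - P(T <= x)
x = 2.131
sym_gap = t.cdf(-x, df) - (1 - t.cdf(x, df))

# Probability between the two critical points is 1 - 2 * alpha
two_sided = t.cdf(upper, df) - t.cdf(lower, df)

print(lower, upper, sym_gap, two_sided)
```

`t.isf(alpha, df)` is an equivalent way to get the upper-tail point directly.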

In [15]:

# Input
df = 16
pdf_variable = [-2.131, 2.131] # [start point, end point]
cdf_variable = 2.131

# Calculate
exp, var = t.stats(df, moments='mv')
pdf = t.cdf(pdf_variable[1], df) - \
t.cdf(pdf_variable[0], df)
cdf = t.cdf(cdf_variable, df)

# Output
print_status(exp, var, pdf, cdf)

Out[15]:

Exp	 0.000000
Var	 1.142857
PMF	 0.951053
CDF	 0.975526

In [17]:

# Input
df = 15
alpha = 0.025

# Calculate
# Lower-tail point: P(T <= x) = alpha; by symmetry the upper-tail point is -x
x = t.ppf(alpha, df)

# Output
print('Critical points: {:.6f}'.format(x))

Out[17]:

Critical points: -2.131450

5.4.4 F Distribution

$W_i\sim \chi_{v_i}^2, i = 1, 2$, with $W_1$ and $W_2$ independent

$F = \frac{W_1/v_1}{W_2/v_2}\sim F_{v_1, v_2}$
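The notebook has no cell for the F distribution, so the following sketch simulates the ratio of two scaled chi-square variables and compares its empirical CDF with `scipy.stats.f`. The degrees of freedom, evaluation point, sample size and seed are arbitrary choices for illustration, and the two chi-square samples are drawn independently.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
v1, v2 = 5, 10                   # illustrative degrees of freedom
size = 200000

# Independent chi-square draws, each divided by its degrees of freedom
w1 = rng.chisquare(v1, size)
w2 = rng.chisquare(v2, size)
ratio = (w1 / v1) / (w2 / v2)

x = 2.0
empirical = np.mean(ratio <= x)    # simulated P(F <= x)
theoretical = f.cdf(x, v1, v2)     # CDF of F with (v1, v2) degrees of freedom

print(empirical, theoretical)
```

Critical points can be obtained in the same way as for the chi-square and t cells, for example `f.ppf(1 - alpha, v1, v2)` for an upper-tail point.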
