
GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_3_discrete_probability_distributions.ipynb

Kernel: Python 3

In [1]:

from scipy.stats import bernoulli
from scipy.stats import binom
from scipy.stats import geom
from scipy.stats import nbinom
from scipy.stats import hypergeom
from scipy.stats import poisson
from scipy.stats import multinomial

In [2]:

def print_status(exp, var, pmf, cdf):
    print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'\
          .format(exp, var, pmf, cdf))

Chapter 3 Discrete Probability Distributions

3.1 The Binomial and Bernoulli Distribution

3.1.1 Bernoulli Distribution

Run the experiment exactly once; it has only two outcomes, success and failure.

$x$ : 1 means success, 0 means failure
$p$ : probability of success

$X\sim Ber(p): f(x;p)=p^x(1-p)^{1-x}, x=0, 1$

$E(X) = p$

$Var(X) = p(1-p)$

In [3]:

# Input
p = 0.3
pmf_variable = 0
cdf_variable = 0

# Calculate
exp, var = bernoulli.stats(p, moments='mv')
pmf = bernoulli.pmf(pmf_variable, p)
cdf = bernoulli.cdf(cdf_variable, p)

# Output
print_status(exp, var, pmf, cdf)

Out[3]:

Exp	 0.300000
Var	 0.210000
PMF	 0.700000
CDF	 0.700000
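As a quick sanity check, the PMF formula $p^x(1-p)^{1-x}$ can be evaluated by hand and compared with `scipy`. The helper name `bernoulli_pmf` is introduced here only for illustration. The sketch also assumes the standard fact that a Bernoulli variable is a binomial variable with a single trial.

```python
from scipy.stats import bernoulli, binom

p = 0.3  # same success probability as the cell above


def bernoulli_pmf(x, p):
    """Evaluate f(x; p) = p^x (1 - p)^(1 - x) for x in {0, 1} (illustrative helper)."""
    return p**x * (1 - p)**(1 - x)


for x in (0, 1):
    # hand-written formula, scipy's Bernoulli, and a binomial with n = 1
    print(x, bernoulli_pmf(x, p), bernoulli.pmf(x, p), binom.pmf(x, 1, p))
```

The three columns should agree for both values of $x$.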

3.1.2 Binomial Distribution

Repeat an experiment $n$ times independently; each trial has only two outcomes, success and failure.

$x$ : number of successes
$n$ : total number of trials
$p$ : probability of success in each trial

$X\sim B(n, p): f(x;n, p) = C_n^xp^x(1-p)^{n-x}, x=0, 1, \dots, n$

$E(X) = np$

$Var(X) = np(1-p)$

In [16]:

# Input
n = 12
p = 0.7878
pmf_variable = 7
cdf_variable = 9

# Calculate
exp, var = binom.stats(n, p, moments='mv')
pmf = binom.pmf(pmf_variable, n, p)
cdf = binom.cdf(cdf_variable, n, p)

# Output
print_status(exp, var, pmf, cdf)

Out[16]:

Exp	 9.453600
Var	 2.006054
PMF	 0.064175
CDF	 0.484489
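The PMF formula can also be evaluated directly with `math.comb`, and the CDF is then just the sum of PMF values from $x=0$ up to the cut-off. This is a minimal sketch that reuses the inputs of the cell above; `binom_pmf` is an illustrative helper, not part of `scipy`.

```python
from math import comb
from scipy.stats import binom

n, p = 12, 0.7878


def binom_pmf(x, n, p):
    """C(n, x) * p^x * (1 - p)^(n - x), the binomial PMF written out."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


pmf_manual = binom_pmf(7, n, p)
# P(X <= 9): the sum includes x = 0, i.e. the case of no successes at all
cdf_manual = sum(binom_pmf(k, n, p) for k in range(0, 10))

print(pmf_manual, binom.pmf(7, n, p))
print(cdf_manual, binom.cdf(9, n, p))
```

Each printed pair should match up to floating-point rounding.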

3.2 Geometric and Negative Binomial Distributions

3.2.1 Geometric Distribution

Repeat independent trials until the first success occurs: the first $x-1$ trials fail and trial $x$ succeeds.

$x$ : total number of trials, including the final success
$p$ : probability of success in each trial

$X\sim G(p):f(x;p) = (1-p)^{x-1}p, x=1, 2, \dots, \infty$

$E(X) = \frac{1}{p}$

$Var(X) = \frac{1-p}{p^2}$

In [6]:

# Input
p = 0.23
pmf_variable = 5
cdf_variable = 4

# Calculate
exp, var = geom.stats(p, moments='mv')
pmf = geom.pmf(pmf_variable, p)
cdf = geom.cdf(cdf_variable, p)

# Output
print_status(exp, var, pmf, cdf)

Out[6]:

Exp	 4.347826
Var	 14.555766
PMF	 0.080852
CDF	 0.648470
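The geometric CDF has a closed form: $P(X\le x) = 1-(1-p)^x$, because $X > x$ means the first $x$ trials all fail. The sketch below compares the closed form with a direct sum of the PMF and with `scipy`, using the same $p$ as the cell above. The helper `geom_cdf` is illustrative only.

```python
from scipy.stats import geom

p = 0.23


def geom_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)^x: at least one success within the first x trials."""
    return 1 - (1 - p)**x


cdf_closed = geom_cdf(4, p)
# Same probability as a sum of the PMF (1 - p)^(k - 1) * p over k = 1..4
cdf_summed = sum((1 - p)**(k - 1) * p for k in range(1, 5))

print(cdf_closed, cdf_summed, geom.cdf(4, p))
```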

3.2.2 Negative Binomial Distribution

Repeat independent trials until the $n$-th success occurs. In total there are $x+n$ trials: $n$ successes and $x$ failures, and the last trial is a success.

$x$ : number of failures
$n$ : number of successes
$p$ : probability of success in each trial

$X\sim NB(n, p):f(x;n, p) = C_{x+n-1}^{n-1}p^{n}(1-p)^{x}, x=0, 1, 2, \dots$

$E(X) = \frac{n(1-p)}{p}$

$Var(X) = \frac{n(1-p)}{p^2}$

In [9]:

# Input
n = 3 # number of successes
p = 0.77
pmf_variable = 3 # number of failures
cdf_variable = 5

# Calculate
# N.B. scipy's nbinom counts failures, so exp is the expected number of failures
exp, var = nbinom.stats(n, p, moments='mv')
pmf = nbinom.pmf(pmf_variable, n, p)
cdf = nbinom.cdf(cdf_variable, n, p)

# Output
print_status(exp, var, pmf, cdf)

Out[9]:

Exp	 0.896104
Var	 1.163771
PMF	 0.055546
CDF	 0.997325
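Because `nbinom` counts failures rather than trials, it can help to see both conventions side by side. The sketch below assumes the standard closed forms for the failure count, $n(1-p)/p$ and $n(1-p)/p^2$, and shifts the mean by $n$ to get the expected total number of trials, $n/p$. The variance is unchanged by the shift.

```python
from scipy.stats import nbinom

n, p = 3, 0.77  # same inputs as the cell above

# Closed forms for X = number of failures before the n-th success
mean_failures = n * (1 - p) / p
var_failures = n * (1 - p) / p**2

# Total trials = failures + n successes, so the mean shifts by n
mean_trials = mean_failures + n

scipy_mean, scipy_var = nbinom.stats(n, p, moments='mv')
print(mean_failures, float(scipy_mean))
print(var_failures, float(scipy_var))
print(mean_trials, n / p)
```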

3.3 Hypergeometric Distribution

Choose $n$ items without replacement from $N$ items, of which $r$ belong to one type and the remaining $N-r$ to another. $X$ counts the chosen items of the first type.

$N$ : total number of items
$r$ : number of items of the first type among all $N$ items
$n$ : number of chosen items
$x$ : number of items of the first type among the chosen items

$X\sim Hypergeometric(N, n, r): f(x;N, n, r) = \frac{C_r^xC_{N-r}^{n-x}}{C_N^n}$

$E(X) = n\frac{r}{N}$

$Var(X) = \frac{N-n}{N-1}n\frac{r}{N}(1-\frac{r}{N})$

In [12]:

# Input
N = 15
r = 9
n = 5
pmf_variable = 2 # number of items in r type
cdf_variable = 1

# Calculate
# scipy orders the parameters as (population size, items of the first type, draws)
exp, var = hypergeom.stats(N, r, n, moments='mv')
pmf = hypergeom.pmf(pmf_variable, N, r, n)
cdf = hypergeom.cdf(cdf_variable, N, r, n)

# Output
print_status(exp, var, pmf, cdf)

Out[12]:

Exp	 3.000000
Var	 0.857143
PMF	 0.239760
CDF	 0.046953
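The counting formula can be checked by hand with `math.comb`. In the sketch below, `hypergeom_pmf` is an illustrative helper. Note that `scipy` documents its own parameter names as `(M, n, N)` for population size, items of the first type, and draws, which differ from the symbols used in this section.

```python
from math import comb
from scipy.stats import hypergeom

N, r, n = 15, 9, 5  # population, items of the first type, draws


def hypergeom_pmf(x, N, r, n):
    """C(r, x) * C(N - r, n - x) / C(N, n): ways to pick x of the first type."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


pmf_manual = hypergeom_pmf(2, N, r, n)        # 36 * 20 / 3003
pmf_scipy = hypergeom.pmf(2, N, r, n)         # positional order: M, n, N in scipy's naming
cdf_manual = sum(hypergeom_pmf(k, N, r, n) for k in range(0, 2))

print(pmf_manual, pmf_scipy)
print(cdf_manual, hypergeom.cdf(1, N, r, n))
```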

3.4 Poisson Distribution

$X\sim Pois(\lambda): f(x;\lambda) = \frac{e^{-\lambda}\lambda^x}{x!}, x=0, 1, 2, \dots, \infty$

$E(X) = \lambda$

$Var(X) = \lambda$

In [15]:

# Input
lamb = 6
pmf_variable = 1
cdf_variable = 4 

# Calculate
exp, var = poisson.stats(lamb, moments='mv')
pmf = poisson.pmf(pmf_variable, lamb) 
cdf = poisson.cdf(cdf_variable, lamb)

# Output
print_status(exp, var, pmf, cdf)

Out[15]:

Exp	 6.000000
Var	 6.000000
PMF	 0.014873
CDF	 0.285057

In [ ]:

# Input
lamb = 6 # assumed rate, same as the previous cell
ppf_x = 0.01 # Probability

# Calculate
x = poisson.ppf(ppf_x, lamb) # smallest x with CDF(x) >= ppf_x

# Output
print(x)
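For a discrete distribution, the percent point function (the inverse of the CDF) returns the smallest integer whose CDF reaches the requested probability. The sketch below illustrates this for a Poisson variable by comparing `poisson.ppf` with a plain search over the CDF. The rate `lamb = 6` is an assumption taken from the earlier Poisson cell.

```python
from scipy.stats import poisson

lamb = 6   # assumed rate, matching the earlier Poisson cell
q = 0.01   # target probability

k = poisson.ppf(q, lamb)

# Linear search for the smallest integer k_search with CDF(k_search) >= q
k_search = 0
while poisson.cdf(k_search, lamb) < q:
    k_search += 1

print(k, k_search)
print(poisson.cdf(k_search - 1, lamb), poisson.cdf(k_search, lamb))
```

The last line prints the CDF just below and at the returned value, which should bracket `q`.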

3.5 Multinomial Distribution

Run $n$ independent trials, each with $k$ possible outcomes; outcome $i$ has probability $p_i, i=1, 2, \dots, k$. The variables $x_i, i=1, 2, \dots, k$ count how many times outcome $i$ occurs.

$f(x_1, x_2, \dots, x_k;p_1, p_2, \dots, p_k, n) = C_{n}^{x_1, x_2, \dots, x_k}p_1^{x_1}p_2^{x_2}\dots p_k^{x_k}$

Here $C_{n}^{x_1, x_2, \dots, x_k} = \frac{n!}{x_1!x_2!\dots x_k!}$, and we require $\sum_{i=1}^{k}x_i=n$ and $\sum_{i=1}^{k}p_i=1$

$E(X_i) = np_i$

$Var(X_i) = np_i(1-p_i)$

In [59]:

# Input
n = 8
p = [0.3, 0.2, 0.5]
pmf_variable = [2, 2, 4]

# Calculate
pmf = multinomial.pmf(pmf_variable, n, p) 

# Output
print(pmf)

Out[59]:

0.09450000000000003
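The same value can be reproduced from the formula: the multinomial coefficient $\frac{n!}{x_1!x_2!\dots x_k!}$ times the product of $p_i^{x_i}$. This is a minimal sketch using the inputs of the cell above; the variable names `coef` and `pmf_manual` are illustrative.

```python
from math import factorial
from scipy.stats import multinomial

n = 8
p = [0.3, 0.2, 0.5]
x = [2, 2, 4]

# Multinomial coefficient n! / (x_1! * x_2! * ... * x_k!), kept as an exact integer
coef = factorial(n)
for xi in x:
    coef //= factorial(xi)

# Multiply by p_i^{x_i} for every outcome
pmf_manual = float(coef)
for xi, pi in zip(x, p):
    pmf_manual *= pi**xi

print(coef)
print(pmf_manual, multinomial.pmf(x, n, p))
```

The coefficient here is 420, and the two PMF values should agree up to floating-point rounding.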