GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_3_discrete_probability_distributions.ipynb
Kernel: Python 3
from scipy.stats import bernoulli
from scipy.stats import binom
from scipy.stats import geom
from scipy.stats import nbinom
from scipy.stats import hypergeom
from scipy.stats import poisson
from scipy.stats import multinomial

def print_status(exp, var, pmf, cdf):
    # Print mean, variance, PMF value and CDF value with six decimals
    print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
          .format(exp, var, pmf, cdf))

Chapter 3 Discrete Probability Distributions

3.1 The Binomial and Bernoulli Distribution

3.1.1 Bernoulli Distribution

Run the experiment exactly once. It has only two outcomes: success and failure.

  • $x$: 1 means success, 0 means failure

  • $p$: probability of success

$$X\sim Ber(p): f(x;p)=p^x(1-p)^{1-x}, x=0, 1$$

$$E(X) = p$$

$$Var(X) = p(1-p)$$

# Input
p = 0.3
pmf_variable = 0
cdf_variable = 0

# Calculate
exp, var = bernoulli.stats(p, moments='mv')
pmf = bernoulli.pmf(pmf_variable, p)
cdf = bernoulli.cdf(cdf_variable, p)

# Output
print_status(exp, var, pmf, cdf)

Exp 0.300000
Var 0.210000
PMF 0.700000
CDF 0.700000

3.1.2 Binomial Distribution

Repeat an experiment $n$ times independently. Each trial has only two outcomes: success and failure.

  • $x$: number of successes

  • $n$: total number of trials

  • $p$: probability of success in each trial

$$X\sim B(n, p): f(x;n, p) = C_n^xp^x(1-p)^{n-x}, x=0, 1, \dots, n$$

$$E(X) = np$$

$$Var(X) = np(1-p)$$

# Input
n = 12
p = 0.7878
pmf_variable = 7
cdf_variable = 9

# Calculate
exp, var = binom.stats(n, p, moments='mv')
pmf = binom.pmf(pmf_variable, n, p)
cdf = binom.cdf(cdf_variable, n, p)

# Output
print_status(exp, var, pmf, cdf)

Exp 9.453600
Var 2.006054
PMF 0.064175
CDF 0.484489
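As a cross-check, the binomial formulas can be evaluated by hand with only the standard library. This is a minimal sketch, not part of the original notebook; the helper name `binom_pmf` is hypothetical, and the inputs are the same as in the cell above.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), evaluated directly from the formula."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 12, 0.7878  # same inputs as the SciPy cell above

pmf_7 = binom_pmf(7, n, p)
# The CDF is the sum of the PMF over the support up to 9, starting at x = 0
cdf_9 = sum(binom_pmf(k, n, p) for k in range(0, 10))
mean = n * p
var = n * p * (1 - p)

print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
      .format(mean, var, pmf_7, cdf_9))
```

The printed values should agree with the SciPy output above to six decimals.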

3.2 Geometric and Negative Binomial Distribution

3.2.1 Geometric Distribution

Repeat the experiment until the first success: $x$ trials are run and only the last one succeeds.

  • $x$: total number of trials

  • $p$: probability of success in each trial

$$X\sim G(p):f(x;p) = (1-p)^{x-1}p, x=1, 2, \dots, \infty$$

$$E(X) = \frac{1}{p}$$

$$Var(X) = \frac{1-p}{p^2}$$

# Input
p = 0.23
pmf_variable = 5
cdf_variable = 4

# Calculate
exp, var = geom.stats(p, moments='mv')
pmf = geom.pmf(pmf_variable, p)
cdf = geom.cdf(cdf_variable, p)

# Output
print_status(exp, var, pmf, cdf)

Exp 4.347826
Var 14.555766
PMF 0.080852
CDF 0.648470
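The geometric distribution has a closed-form CDF, $P(X \le x) = 1 - (1-p)^x$, since the first $x$ trials all fail with probability $(1-p)^x$. A minimal standard-library sketch (the helper names `geom_pmf` and `geom_cdf` are hypothetical) that reproduces the numbers above:

```python
def geom_pmf(x, p):
    """P(X = x): x - 1 failures followed by one success."""
    return (1 - p)**(x - 1) * p

def geom_cdf(x, p):
    """P(X <= x): one minus the probability that the first x trials all fail."""
    return 1 - (1 - p)**x

p = 0.23  # same input as the SciPy cell above

pmf_5 = geom_pmf(5, p)
cdf_4 = geom_cdf(4, p)
mean = 1 / p
var = (1 - p) / p**2

print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
      .format(mean, var, pmf_5, cdf_4))
```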

3.2.2 Negative Binomial Distribution

Run $x+n$ trials, of which $n$ succeed and $x$ fail, with the last trial being a success.

  • $x$: total number of failures;

  • $n$: total number of successes;

  • $p$: probability of success in each trial;

$$X\sim NB(n, p):f(x;n, p) = C_{x+n-1}^{n-1}p^{n}(1-p)^{x}, x=0, 1, 2, \dots, \infty$$

$$E(X) = \frac{n(1-p)}{p}$$

$$Var(X) = \frac{n(1-p)}{p^2}$$

Here $X$ counts failures only, so the expected total number of trials is $E(X) + n = \frac{n}{p}$.

# Input
n = 3  # number of successes
p = 0.77
pmf_variable = 3  # number of failures
cdf_variable = 5

# Calculate
exp, var = nbinom.stats(n, p, moments='mv')  # N.B. exp is the expected number of failures
pmf = nbinom.pmf(pmf_variable, n, p)
cdf = nbinom.cdf(cdf_variable, n, p)

# Output
print_status(exp, var, pmf, cdf)

Exp 0.896104
Var 1.163771
PMF 0.055546
CDF 0.997325
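Because the variable counts failures rather than trials, it is easy to mix up the two means. The sketch below (standard library only; `nbinom_pmf` is a hypothetical helper name) evaluates the PMF by hand and shows that the failure count has mean $n(1-p)/p$ and variance $n(1-p)/p^2$, while the expected total number of trials is $n/p$.

```python
from math import comb

def nbinom_pmf(x, n, p):
    """P(X = x): x failures before the n-th success."""
    return comb(x + n - 1, n - 1) * p**n * (1 - p)**x

n, p = 3, 0.77  # same inputs as the SciPy cell above

pmf_3 = nbinom_pmf(3, n, p)
cdf_5 = sum(nbinom_pmf(k, n, p) for k in range(0, 6))  # support starts at 0 failures

mean_failures = n * (1 - p) / p
var_failures = n * (1 - p) / p**2
mean_trials = n / p  # failures plus the n successes

print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
      .format(mean_failures, var_failures, pmf_3, cdf_5))
print('Expected total trials\t {:.6f}'.format(mean_trials))
```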

3.3 Hypergeometric Distribution

Choose $n$ items without replacement from $N$ items, of which $r$ are of one type and the remaining $N-r$ are of another type. The variable $x$ is the number of chosen items of the first type.

  • $N$: total number of items;

  • $r$: number of items of the first type among all items;

  • $n$: number of chosen items;

  • $x$: number of items of the first type among the chosen items;

$$X\sim Hypergeometric(N, n, r): f(x;N, n, r) = \frac{C_r^xC_{N-r}^{n-x}}{C_N^n}$$

$$E(X) = n\frac{r}{N}$$

$$Var(X) = \frac{N-n}{N-1}n\frac{r}{N}(1-\frac{r}{N})$$

# Input
N = 15
r = 9
n = 5
pmf_variable = 2  # number of chosen items of the first type
cdf_variable = 1

# Calculate
exp, var = hypergeom.stats(N, r, n, moments='mv')
pmf = hypergeom.pmf(pmf_variable, N, r, n)
cdf = hypergeom.cdf(cdf_variable, N, r, n)

# Output
print_status(exp, var, pmf, cdf)

Exp 3.000000
Var 0.857143
PMF 0.239760
CDF 0.046953
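The argument order is the usual stumbling block here: SciPy's documentation names the three shape parameters `M` (population size), `n` (number of first-type items) and `N` (number of draws), which correspond to this chapter's $N$, $r$ and $n$ in that order. A hand computation with the standard library (the helper name `hypergeom_pmf` is hypothetical) confirms that the call above uses the intended roles:

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x) using this chapter's notation: N items, r of the first type, n drawn."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, r, n = 15, 9, 5  # same inputs as the SciPy cell above

pmf_2 = hypergeom_pmf(2, N, r, n)
cdf_1 = sum(hypergeom_pmf(k, N, r, n) for k in range(0, 2))
mean = n * r / N
var = (N - n) / (N - 1) * n * (r / N) * (1 - r / N)

print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
      .format(mean, var, pmf_2, cdf_1))
```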

3.4 Poisson Distribution

$$X\sim Pois(\lambda): f(x;\lambda) = \frac{e^{-\lambda}\lambda^x}{x!}, x=0, 1, 2, \dots, \infty$$

$$E(X) = \lambda$$

$$Var(X) = \lambda$$

# Input
lamb = 6
pmf_variable = 1
cdf_variable = 4

# Calculate
exp, var = poisson.stats(lamb, moments='mv')
pmf = poisson.pmf(pmf_variable, lamb)
cdf = poisson.cdf(cdf_variable, lamb)

# Output
print_status(exp, var, pmf, cdf)

Exp 6.000000
Var 6.000000
PMF 0.014873
CDF 0.285057
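The Poisson values can likewise be recomputed from the formula alone. A minimal sketch using only the standard library (`poisson_pmf` is a hypothetical helper name), with the same $\lambda = 6$:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Pois(lam), evaluated directly from the formula."""
    return exp(-lam) * lam**x / factorial(x)

lam = 6  # same input as the SciPy cell above

pmf_1 = poisson_pmf(1, lam)
cdf_4 = sum(poisson_pmf(k, lam) for k in range(0, 5))  # P(X <= 4)

# Mean and variance are both equal to lam
print('Exp\t {:.6f}\nVar\t {:.6f}\nPMF\t {:.6f}\nCDF\t {:.6f}'
      .format(lam, lam, pmf_1, cdf_4))
```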
# Input (lamb is reused from the cell above)
ppf_x = 0.01  # Probability

# Calculate the smallest x with P(X <= x) >= ppf_x
x = poisson.ppf(ppf_x, lamb)

# Output
print(x)

3.5 Multinomial Distribution

A multinomial experiment consists of $n$ independent trials, each with $k$ possible outcomes; outcome $i$ has probability $p_i, i=1, 2, \dots, k$. The variables are the counts $x_i, i=1, 2, \dots, k$, of how many times each outcome occurs.

$$f(x_1, x_2, \dots, x_k;p_1, p_2, \dots, p_k, n) = C_{n}^{x_1, x_2, \dots, x_k}p_1^{x_1}p_2^{x_2}\dots p_k^{x_k}$$

We require $\sum_{i=1}^{k}x_i=n$ and $\sum_{i=1}^{k}p_i=1$.

$$E(X_i) = np_i$$

$$Var(X_i) = np_i(1-p_i)$$

# Input
n = 8
p = [0.3, 0.2, 0.5]
pmf_variable = [2, 2, 4]

# Calculate
pmf = multinomial.pmf(pmf_variable, n, p)

# Output
print(pmf)

0.09450000000000003
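To make the multinomial coefficient concrete, here is a minimal standard-library sketch (the helper name `multinomial_pmf` is hypothetical) that computes $C_{n}^{x_1, \dots, x_k} = \frac{n!}{x_1! \cdots x_k!}$ explicitly and also evaluates the per-outcome mean and variance, which the cell above does not print. It needs Python 3.8 or later for `math.prod`.

```python
from math import factorial, prod

def multinomial_pmf(xs, n, ps):
    """P(X_1 = x_1, ..., X_k = x_k) for n trials with outcome probabilities ps."""
    if sum(xs) != n:
        raise ValueError('counts must sum to n')
    # Multinomial coefficient n! / (x_1! * ... * x_k!), kept as an exact integer
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)
    return coef * prod(p**x for p, x in zip(ps, xs))

n = 8
p = [0.3, 0.2, 0.5]          # same inputs as the SciPy cell above
counts = [2, 2, 4]

pmf = multinomial_pmf(counts, n, p)
means = [n * p_i for p_i in p]                   # E(X_i) = n * p_i
variances = [n * p_i * (1 - p_i) for p_i in p]   # Var(X_i) = n * p_i * (1 - p_i)

print(pmf)
print(means)
print(variances)
```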