GitHub Repository: Probability-Statistics-Jupyter-Notebook/probability-statistics-notebook
Path: blob/master/notebook-for-reviewing/chapter_2_random_variables.ipynb
Kernel: Python 3
```python
from scipy.stats import expon
from scipy.integrate import quad
from scipy.integrate import dblquad
import numpy as np
```

Chapter 2 Random Variables

2.0 Random Variables

Random variable $X: S \to R$: a function from the sample space $S$ to the real line (which serves as the domain of the probability function).

Probability function $P(x): R \to [0, 1]$: a function from the real line to a probability value.

2.1 Discrete Variables

Discrete random variables

  • Probability Mass Function (p.m.f.)

  • Cumulative Distribution Function (c.d.f.)

  • Expectation $E(X) = \sum_i p_i x_i$

```python
# Calculate the expectation of a discrete random variable
# Input
x = [1, 2, 3, 4, 5]
p = [0.2, 0.2, 0.2, 0.2, 0.2]

# Calculation
exp = 0
for i, j in zip(x, p):
    exp += i * j

# Output
print('Expectation: {}'.format(exp))
```
Expectation: 3.0
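
The list above also names the p.m.f. and the c.d.f., which the expectation cell does not compute. As a minimal sketch, assuming the same uniform distribution on 1 to 5, the c.d.f. can be obtained as the running sum of the p.m.f. The helper name `discrete_cdf` is hypothetical and not part of the original notebook.

```python
import numpy as np

# Same illustrative distribution as the expectation example
x = np.array([1, 2, 3, 4, 5])
p = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

# c.d.f. at each support point: F(x_k) = sum of p_i over all x_i <= x_k
cdf = np.cumsum(p)

def discrete_cdf(t):
    """Return P(X <= t) for any real t (hypothetical helper)."""
    return float(np.sum(p[x <= t]))

for xi, Fi in zip(x, cdf):
    print('F({}) = {:.1f}'.format(xi, Fi))
```

Between support points the c.d.f. is flat, so `discrete_cdf(3.7)` returns the same value as `discrete_cdf(3)`.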

2.2 Continuous Variables

Continuous random variables

  • Probability Density Function (p.d.f.)

  • Cumulative Distribution Function (c.d.f.)

  • Expectation $E(X) = \int_{-\infty}^{\infty} x f(x) dx$

```python
# Check that the p.d.f. integrates to 1 over its domain
# (equivalently, the c.d.f. evaluated at the upper bound)
# Input
func = lambda x: 1.5 - 6 * (x - 50) ** 2
domain = (49.5, 50.5)

# Calculation
cumulation = quad(func, domain[0], domain[1])

# Output
print('Cumulation from {} to {}: {:.6f}'.format(domain[0], domain[1], cumulation[0]))
```
Cumulation from 49.5 to 50.5: 1.000000

```python
# Calculate the expectation of a continuous random variable
# Input
func = lambda x: x * (2/11)
domain = (5, 6)

# Calculation
def exp_func(x):
    return x * func(x)

exp = quad(exp_func, domain[0], domain[1])

# Output
print('Expectation: {:.6f}'.format(exp[0]))
```
Expectation: 5.515152
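
The first cell in this section integrates the density over its whole domain. A c.d.f. evaluated at an arbitrary point can be sketched the same way by integrating from the lower bound up to that point. This is a minimal sketch assuming the density $f(x) = 1.5 - 6(x-50)^2$ on $[49.5, 50.5]$ from that cell; the function name `cdf` is an illustrative choice.

```python
from scipy.integrate import quad

# Density and support taken from the normalization example
pdf = lambda x: 1.5 - 6 * (x - 50) ** 2
domain = (49.5, 50.5)

def cdf(x):
    """F(x) = P(X <= x), by numerical integration of the p.d.f."""
    if x <= domain[0]:
        return 0.0
    if x >= domain[1]:
        return 1.0
    return quad(pdf, domain[0], x)[0]

# Interval probabilities follow from differences of the c.d.f.
print('F(50.0) = {:.4f}'.format(cdf(50.0)))
print('P(49.8 <= X <= 50.1) = {:.4f}'.format(cdf(50.1) - cdf(49.8)))
```

Because the density is symmetric about 50, the first value should come out as one half.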

2.3 Expectation and Variance

Variance of random variables

$Var(X) = E\left((X-E(X))^2\right) = E(X^2) - E(X)^2$

```python
# Calculate the variance of a discrete random variable
# Input
x = [1, 2, 3, 4, 5]
p = [0.2, 0.2, 0.2, 0.2, 0.2]

# Calculation
exp = 0
exp2 = 0
for i, j in zip(x, p):
    exp += i * j
    exp2 += i * i * j
var = exp2 - exp ** 2

# Output
print('Variance: {:.4f}'.format(var))
print('Standard deviation: {:.4f}'.format(var ** (1/2)))
```
Variance: 2.0000
Standard deviation: 1.4142

```python
# Calculate the variance of a continuous random variable
# Input
func = lambda x: (2/11) * x
domain = (5, 6)

# Calculation
def exp_func(x):
    return x * func(x)

def exp_func2(x):
    return x * x * func(x)

exp = quad(exp_func, domain[0], domain[1])
exp2 = quad(exp_func2, domain[0], domain[1])
var = exp2[0] - exp[0] ** 2

# Output
print('Variance: {:.4f}'.format(var))
print('Standard deviation: {:.4f}'.format(var ** (1/2)))
```
Variance: 0.0831
Standard deviation: 0.2883

Quantiles of Random Variables

  • Upper quartile $Q_3$: the point where the c.d.f. equals 0.75

  • Lower quartile $Q_1$: the point where the c.d.f. equals 0.25

  • Interquartile range $IQR$: $Q_3 - Q_1$
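
The quartile definitions above can be sketched numerically by solving $F(x) = q$ for $x$. This example assumes the density $f(x) = \frac{2}{11}x$ on $[5, 6]$ used in the expectation and variance cells, and uses `scipy.optimize.brentq` as one possible root finder; the helper names `cdf` and `quantile` are hypothetical.

```python
from scipy.integrate import quad
from scipy.optimize import brentq

# Density and support from the continuous expectation example
pdf = lambda x: (2 / 11) * x
domain = (5, 6)

def cdf(x):
    """F(x) by numerical integration from the lower bound of the support."""
    return quad(pdf, domain[0], x)[0]

def quantile(q):
    """Solve F(x) = q on the support; F is increasing there, so the root is unique."""
    return brentq(lambda x: cdf(x) - q, domain[0], domain[1])

q1 = quantile(0.25)
q3 = quantile(0.75)
iqr = q3 - q1

print('Q1: {:.4f}\nQ3: {:.4f}\nIQR: {:.4f}'.format(q1, q3, iqr))
```

For this density the c.d.f. has the closed form $F(x) = (x^2 - 25)/11$, so the numerical quartiles can be checked against $\sqrt{27.75}$ and $\sqrt{33.25}$.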

Chebyshev's Inequality

$P(\mu - c\sigma \leq X \leq \mu + c\sigma) \geq 1 - \frac{1}{c^2}, \quad c \geq 1$
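
As an illustration only, the inequality can be checked on the uniform distribution over 1 to 5 from the earlier discrete examples. The values of $c$ below are arbitrary choices, and the `results` list is a hypothetical container for the comparison.

```python
import numpy as np

# Discrete uniform distribution from the earlier examples
x = np.array([1, 2, 3, 4, 5])
p = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

mu = np.sum(x * p)
sigma = np.sqrt(np.sum(x ** 2 * p) - mu ** 2)

results = []
for c in (1.0, 1.5, 2.0):
    # Exact probability that X falls within c standard deviations of the mean
    inside = (x >= mu - c * sigma) & (x <= mu + c * sigma)
    prob = np.sum(p[inside])
    bound = 1 - 1 / c ** 2  # Chebyshev lower bound
    results.append((c, prob, bound))
    print('c = {:.1f}: P = {:.4f} >= bound {:.4f}'.format(c, prob, bound))
```

The bound holds for any distribution with finite variance, so it is often far from tight for a specific one, as the printed gaps suggest.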

2.4 Jointly Distributed Random Variables

Jointly distributed discrete random variables:
\begin{equation} P(X = x_i, Y = y_j) = p_{ij} \geq 0 \text{ satisfying } \sum_i \sum_j p_{ij} = 1 \end{equation}

Jointly distributed continuous random variables:
\begin{equation} f(x,y) \geq 0 \text{ satisfying } \int \int f(x,y) dxdy = 1 \end{equation}

Cumulative Distribution Function:
\begin{equation} F(x,y) = P(X \leq x, Y \leq y) \end{equation}

Discrete CDF:
\begin{equation} F(x,y) = \sum_{i:x_i \leq x} \sum_{j:y_j \leq y} p_{ij} \end{equation}

Continuous CDF:
\begin{equation} F(x,y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f(w, z) dwdz \end{equation}
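
The discrete definitions above can be sketched with a probability table. This example borrows the joint table from the discrete covariance cell in section 2.5, with rows indexed by the values of X and columns by the values of Y; the names `marginal_x`, `marginal_y`, and `joint_cdf` are illustrative.

```python
import numpy as np

# Joint p.m.f. table: prob_matrix[i, j] = P(X = value_x[i], Y = value_y[j])
value_x = np.array([0, 1, 2, 3])
value_y = np.array([0, 1, 2, 3])
prob_matrix = np.array([[1/16, 1/16, 0, 0],
                        [1/16, 3/16, 2/16, 0],
                        [0, 2/16, 3/16, 1/16],
                        [0, 0, 1/16, 1/16]])

# A valid joint p.m.f. is non-negative and sums to 1
total = prob_matrix.sum()

# Marginal distributions: sum out the other variable
marginal_x = prob_matrix.sum(axis=1)
marginal_y = prob_matrix.sum(axis=0)

# Joint c.d.f. on the grid: cumulative sums along both axes
joint_cdf = prob_matrix.cumsum(axis=0).cumsum(axis=1)

print('Total probability: {:.4f}'.format(total))
print('F(1, 1) = P(X <= 1, Y <= 1) = {:.4f}'.format(joint_cdf[1, 1]))
```

Indexing `joint_cdf[i, j]` works here because the support values are sorted in increasing order along each axis.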

2.5 Covariance and Correlation

$Cov(X, Y) = E(XY) - E(X)E(Y)$

$Corr(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}}$

```python
# Calculate the covariance for discrete variables
# Input [X * Y]: rows index the values of X, columns the values of Y
value_x = np.array([0, 1, 2, 3])
value_y = np.array([0, 1, 2, 3])
prob_matrix = np.array([[1/16, 1/16, 0, 0],
                        [1/16, 3/16, 2/16, 0],
                        [0, 2/16, 3/16, 1/16],
                        [0, 0, 1/16, 1/16]])

# Expectation of x (marginal of X: sum over columns)
exp_x = 0
for i in range(len(value_x)):
    exp_x += value_x[i] * np.sum(prob_matrix, axis=1)[i]

# Expectation of y (marginal of Y: sum over rows)
exp_y = 0
for i in range(len(value_y)):
    exp_y += value_y[i] * np.sum(prob_matrix, axis=0)[i]

# Variance of x
exp_x2 = 0
for i in range(len(value_x)):
    exp_x2 += (value_x[i] ** 2) * np.sum(prob_matrix, axis=1)[i]
var_x = exp_x2 - (exp_x ** 2)

# Variance of y
exp_y2 = 0
for i in range(len(value_y)):
    exp_y2 += (value_y[i] ** 2) * np.sum(prob_matrix, axis=0)[i]
var_y = exp_y2 - (exp_y ** 2)

# Covariance
exp_xy = 0
for i in range(len(value_x)):
    for j in range(len(value_y)):
        exp_xy += value_x[i] * value_y[j] * prob_matrix[i, j]
cov = exp_xy - exp_x * exp_y

# Correlation
corr = cov / ((var_x * var_y) ** (1/2))

# Output
print('EXP X\t {:.4f}\nEXP Y\t {:.4f}\nVAR X\t {:.4f}\nVAR Y\t {:.4f}\nCOV\t {:.4f}\nCORR\t {:.4f}'.format(
    exp_x, exp_y, var_x, var_y, cov, corr))
```
EXP X    1.5000
EXP Y    1.5000
VAR X    0.7500
VAR Y    0.7500
COV      0.5000
CORR     0.6667
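
The same quantities can also be written without explicit loops. The following is a sketch of one vectorized alternative, not the notebook's own method. It assumes the same joint table and uses NumPy's `@` operator for the weighted sums.

```python
import numpy as np

value_x = np.array([0, 1, 2, 3])
value_y = np.array([0, 1, 2, 3])
prob_matrix = np.array([[1/16, 1/16, 0, 0],
                        [1/16, 3/16, 2/16, 0],
                        [0, 2/16, 3/16, 1/16],
                        [0, 0, 1/16, 1/16]])

# Marginal p.m.f.s, computed once instead of inside each loop iteration
marginal_x = prob_matrix.sum(axis=1)
marginal_y = prob_matrix.sum(axis=0)

# First and second moments as dot products
exp_x = value_x @ marginal_x
exp_y = value_y @ marginal_y
var_x = (value_x ** 2) @ marginal_x - exp_x ** 2
var_y = (value_y ** 2) @ marginal_y - exp_y ** 2

# E(XY) = sum_i sum_j x_i * p_ij * y_j, written as a bilinear form
exp_xy = value_x @ prob_matrix @ value_y
cov = exp_xy - exp_x * exp_y
corr = cov / np.sqrt(var_x * var_y)

print('COV\t {:.4f}\nCORR\t {:.4f}'.format(cov, corr))
```

Computing the marginals once also avoids re-summing the whole table on every loop iteration.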
```python
# Calculate the covariance for continuous variables
# Input
func_x = lambda x: 2 * x
func_y = lambda y: 4 - 2 * y
func_xy = lambda x, y: 4 * x * (2 - y)
domain_x = (0, 1)
domain_y = (1, 2)

# Expectation of x
def exp_func_x(x):
    return x * func_x(x)
exp_x = quad(exp_func_x, domain_x[0], domain_x[1])[0]

# Expectation of y
def exp_func_y(y):
    return y * func_y(y)
exp_y = quad(exp_func_y, domain_y[0], domain_y[1])[0]

# Variance of x
def exp_func_x2(x):
    return x * x * func_x(x)
exp_x2 = quad(exp_func_x2, domain_x[0], domain_x[1])[0]
var_x = exp_x2 - exp_x ** 2

# Variance of y
def exp_func_y2(y):
    return y * y * func_y(y)
exp_y2 = quad(exp_func_y2, domain_y[0], domain_y[1])[0]
var_y = exp_y2 - exp_y ** 2

# Covariance
# dblquad calls the integrand as func(y, x): the inner variable (y) comes first,
# and the outer variable (x) runs over the limits passed as a and b.
def exp_func_xy(y, x):
    return x * y * func_xy(x, y)
exp_xy = dblquad(exp_func_xy, domain_x[0], domain_x[1],
                 lambda x: domain_y[0], lambda x: domain_y[1])[0]
cov = exp_xy - exp_x * exp_y

# Correlation
corr = cov / ((var_x * var_y) ** (1/2))

# Output
print('EXP X\t {:.4f}\nEXP Y\t {:.4f}\nVAR X\t {:.4f}\nVAR Y\t {:.4f}\nCOV\t {:.4f}\nCORR\t {:.4f}'.format(
    exp_x, exp_y, var_x, var_y, cov, corr))
```
EXP X    0.6667
EXP Y    1.3333
VAR X    0.0556
VAR Y    0.0556
COV      0.0000
CORR     0.0000

The joint density factors as $f(x, y) = f_X(x) f_Y(y)$, so X and Y are independent and both the covariance and the correlation are zero up to floating-point rounding (the sign of the printed zero may vary). A correlation must always lie in $[-1, 1]$; a value outside that range indicates that the integrand's arguments were passed to dblquad in the wrong order.

2.6 Combinations and Functions of Random Variables

$E(aX+b) = aE(X) + b$

$E(X_1+X_2) = E(X_1) + E(X_2)$

$E(\bar{X}) = E(X)$

$Var(aX+b) = a^2Var(X)$

$Var(X_1+X_2) = Var(X_1) + Var(X_2) + 2Cov(X_1, X_2)$

$Var(\bar{X}) = \frac{\sigma^2}{n}$, where $\bar{X}$ is the mean of $n$ independent observations, each with variance $\sigma^2$
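
Two of these rules can be checked exactly on the discrete joint table from section 2.5. This is a minimal sketch. The helper `moments` and the constants `a = 2`, `b = 1` are hypothetical choices for illustration.

```python
import numpy as np

value_x = np.array([0, 1, 2, 3])
value_y = np.array([0, 1, 2, 3])
prob_matrix = np.array([[1/16, 1/16, 0, 0],
                        [1/16, 3/16, 2/16, 0],
                        [0, 2/16, 3/16, 1/16],
                        [0, 0, 1/16, 1/16]])

def moments(values, probs):
    """Mean and variance of a discrete variable given values and matching probabilities."""
    mean = np.sum(values * probs)
    var = np.sum(values ** 2 * probs) - mean ** 2
    return mean, var

marginal_x = prob_matrix.sum(axis=1)
marginal_y = prob_matrix.sum(axis=0)
exp_x, var_x = moments(value_x, marginal_x)
exp_y, var_y = moments(value_y, marginal_y)
cov = value_x @ prob_matrix @ value_y - exp_x * exp_y

# Linear transformation aX + b, computed directly from the transformed values
a, b = 2, 1
exp_lin, var_lin = moments(a * value_x + b, marginal_x)
print('E(aX+b):   {:.4f} vs a*E(X)+b:   {:.4f}'.format(exp_lin, a * exp_x + b))
print('Var(aX+b): {:.4f} vs a^2*Var(X): {:.4f}'.format(var_lin, a ** 2 * var_x))

# Sum S = X + Y, computed directly from the joint table
sum_values = value_x[:, None] + value_y[None, :]
exp_s, var_s = moments(sum_values, prob_matrix)
var_rhs = var_x + var_y + 2 * cov
print('Var(X+Y):  {:.4f} vs Var(X)+Var(Y)+2Cov(X,Y): {:.4f}'.format(var_s, var_rhs))
```

Since X and Y are positively correlated in this table, the variance of the sum exceeds the sum of the variances.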