Chapter 9 - Comparing Two Population Means
Introduction
Two Sample Problems
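Suppose we have:
A set of data observations $x_1, \dots, x_n$ from a population A with cumulative distribution function $F_A(x)$
A set of data observations $y_1, \dots, y_m$ from a population B with cumulative distribution function $F_B(x)$
Goal: compare the means $\mu_A$ and $\mu_B$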
Common practices
To avoid bias, we can use:
Randomization
Placebo, blind and double blind experiments
Testing
Consider testing $H_0: \mu_A = \mu_B$ versus $H_A: \mu_A \neq \mu_B$
The $p$-value can be found in the same way as for one-sample problems
Paired Samples Versus Independent Samples
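Data from paired samples are of the form $(x_1, y_1), \dots, (x_n, y_n)$, with one pair from each experimental subject (e.g. the heart rate reduction of each patient measured under both treatments)
Blocking: comparing the two treatments within the same subject keeps unwanted subject-to-subject influences out of the comparison
Comparison is done via the pairwise differences $z_i = x_i - y_i$, $1 \le i \le n$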
Analysis of Paired Samples
Methodology
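Data:
$$x_i = \mu_A + \gamma_i + \epsilon_i^A, \qquad y_i = \mu_B + \gamma_i + \epsilon_i^B$$
where $\mu_A$ (or $\mu_B$) is the effect of treatment A (or B), $\gamma_i$ is the effect of subject $i$, and $\epsilon_i^A$ is the measurement error for subject $i$ under treatment A (similarly $\epsilon_i^B$ for B)
The subject effect cancels in the differences, so $z_i = x_i - y_i = \mu_A - \mu_B + \epsilon_i^A - \epsilon_i^B$ are observations from a distribution with mean $\mu_A - \mu_B$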
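Since the differences behave like a single sample with mean $\mu_A - \mu_B$, one common way to analyse them is a one-sample t-procedure on the differences. The sketch below assumes that approach; the function name `paired_t_statistic` and the heart-rate numbers are hypothetical, chosen only for illustration.

```python
import math
import statistics


def paired_t_statistic(x, y, delta=0.0):
    """t-statistic for H0: mu_A - mu_B = delta, computed from paired differences.

    Assumption: a one-sample t-procedure is applied to z_i = x_i - y_i.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    z = [xi - yi for xi, yi in zip(x, y)]   # pairwise differences
    n = len(z)
    z_bar = statistics.mean(z)              # point estimate of mu_A - mu_B
    s_z = statistics.stdev(z)               # sample std. dev. (n - 1 denominator)
    se = s_z / math.sqrt(n)                 # standard error of z_bar
    t = (z_bar - delta) / se
    return z_bar, se, t, n - 1              # n - 1 degrees of freedom


# Hypothetical heart rates for five patients under treatments A and B
rate_a = [72, 75, 80, 68, 77]
rate_b = [70, 71, 78, 69, 72]
z_bar, se, t, df = paired_t_statistic(rate_a, rate_b)
print(f"mean difference = {z_bar:.2f}, t = {t:.3f}, df = {df}")
```

The resulting `t` would then be compared with a t-distribution on `df` degrees of freedom, exactly as in a one-sample problem.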
Analysis of Independent Samples
Population | Samples | Size | Mean | Standard deviation |
---|---|---|---|---|
Population A | $x_1, \dots, x_n$ | $n$ | $\mu_A$ | $\sigma_A$ |
Population B | $y_1, \dots, y_m$ | $m$ | $\mu_B$ | $\sigma_B$ |
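Point estimate of $\mu_A - \mu_B$: $\bar{x} - \bar{y}$
Standard error: $\mathrm{s.e.}(\bar{x} - \bar{y}) = \sqrt{\dfrac{\sigma_A^2}{n} + \dfrac{\sigma_B^2}{m}}$
Assume $\sigma_A^2$ and $\sigma_B^2$ are unknown. Then we have:
General procedure: $\mathrm{s.e.}(\bar{x} - \bar{y}) = \sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}$
Pooled variance procedure: $\mathrm{s.e.}(\bar{x} - \bar{y}) = s_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}$ where $s_p^2 = \dfrac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$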
When the variances are known, we use a two-sample z-test.
General Procedure (Smith-Satterthwaite test)
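We use the statistic
$$T = \frac{\bar{x} - \bar{y} - (\mu_A - \mu_B)}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$
This statistic approximately follows a t-distribution whose degrees of freedom $\nu$ are the largest integer not larger than
$$\nu^* = \frac{\left(\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}\right)^2}{\dfrac{s_x^4}{n^2(n-1)} + \dfrac{s_y^4}{m^2(m-1)}}$$
A two-sided $1 - \alpha$ level confidence interval for $\mu_A - \mu_B$ is given by
$$\bar{x} - \bar{y} \pm t_{\alpha/2,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$$
For testing $H_0: \mu_A - \mu_B = \delta$ vs $H_A: \mu_A - \mu_B \neq \delta$ the t-statistic is
$$t = \frac{\bar{x} - \bar{y} - \delta}{\sqrt{\dfrac{s_x^2}{n} + \dfrac{s_y^2}{m}}}$$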
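The general procedure can be sketched with the standard library alone. The function name `general_t` and the sample values are hypothetical. The code computes only the statistic and the degrees of freedom, so the critical point $t_{\alpha/2,\nu}$ still has to come from a table or a statistics package.

```python
import math
import statistics


def general_t(x, y, delta=0.0):
    """t-statistic and degrees of freedom for the unequal-variance procedure."""
    n, m = len(x), len(y)
    a = statistics.variance(x) / n          # s_x^2 / n
    b = statistics.variance(y) / m          # s_y^2 / m
    se = math.sqrt(a + b)                   # standard error of x_bar - y_bar
    t = (statistics.mean(x) - statistics.mean(y) - delta) / se
    nu_star = (a + b) ** 2 / (a ** 2 / (n - 1) + b ** 2 / (m - 1))
    return t, se, math.floor(nu_star)       # largest integer not larger than nu*


# Hypothetical samples from populations A and B
sample_a = [10, 12, 14, 16, 18]
sample_b = [9, 10, 11, 12]
t_gen, se_gen, df_gen = general_t(sample_a, sample_b)
print(f"t = {t_gen:.3f}, s.e. = {se_gen:.3f}, df = {df_gen}")
```

Note that `a ** 2 / (n - 1)` equals $s_x^4 / (n^2 (n-1))$, so the expression for `nu_star` follows the usual Satterthwaite approximation term by term.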
Pooled Variance Procedure
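Assume $\sigma_A^2 = \sigma_B^2 = \sigma^2$
An unbiased estimate of $\sigma^2$ is given by:
$$s_p^2 = \frac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$$
The t-statistic becomes
$$T = \frac{\bar{x} - \bar{y} - (\mu_A - \mu_B)}{s_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}} \sim t_{n+m-2}$$
A two-sided $1 - \alpha$ level confidence interval for $\mu_A - \mu_B$ is given by the end-points
$$\bar{x} - \bar{y} \pm t_{\alpha/2,\,n+m-2}\; s_p \sqrt{\frac{1}{n} + \frac{1}{m}}$$
For testing $H_0: \mu_A - \mu_B = \delta$ vs $H_A: \mu_A - \mu_B \neq \delta$ the t-statistic is
$$t = \frac{\bar{x} - \bar{y} - \delta}{s_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$$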
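A minimal sketch of the pooled variance procedure, assuming the equal-variance assumption is reasonable for the data at hand. The function name `pooled_t` and the sample values are hypothetical.

```python
import math
import statistics


def pooled_t(x, y, delta=0.0):
    """t-statistic, pooled variance and degrees of freedom (equal variances assumed)."""
    n, m = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n - 1) * statistics.variance(x)
           + (m - 1) * statistics.variance(y)) / (n + m - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n + 1 / m)
    t = (statistics.mean(x) - statistics.mean(y) - delta) / se
    return t, sp2, n + m - 2


# Hypothetical samples from populations A and B
group_a = [10, 12, 14, 16, 18]
group_b = [9, 10, 11, 12]
t_pool, sp2, df_pool = pooled_t(group_a, group_b)
print(f"t = {t_pool:.3f}, pooled variance = {sp2:.3f}, df = {df_pool}")
```

Here the degrees of freedom are always $n + m - 2$, so no approximation is needed, but the result is only trustworthy when the two population variances are plausibly equal.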
z-Procedure
When the population variances are known for the two samples, we can use a z-statistic instead of a t-statistic.
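A two-sided $1 - \alpha$ level confidence interval for $\mu_A - \mu_B$ is given by the end-points
$$\bar{x} - \bar{y} \pm z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n} + \frac{\sigma_B^2}{m}}$$
For testing $H_0: \mu_A - \mu_B = \delta$ versus $H_A: \mu_A - \mu_B \neq \delta$ we use
$$z = \frac{\bar{x} - \bar{y} - \delta}{\sqrt{\dfrac{\sigma_A^2}{n} + \dfrac{\sigma_B^2}{m}}}$$
which follows a standard normal distribution $N(0, 1)$ under $H_0$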
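Because the normal distribution is available in Python's standard library through `statistics.NormalDist`, the z-procedure can be sketched end to end, including the confidence interval and a two-sided p-value. The function name `two_sample_z`, the data, and the "known" standard deviations are hypothetical.

```python
import math
from statistics import NormalDist, mean


def two_sample_z(x, y, sigma_a, sigma_b, delta=0.0, alpha=0.05):
    """Two-sample z-test and confidence interval with known population std. devs."""
    n, m = len(x), len(y)
    se = math.sqrt(sigma_a ** 2 / n + sigma_b ** 2 / m)
    diff = mean(x) - mean(y)                        # point estimate of mu_A - mu_B
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # critical point z_{alpha/2}
    ci = (diff - z_crit * se, diff + z_crit * se)
    z = (diff - delta) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided alternative assumed
    return z, p_value, ci


# Hypothetical samples; sigma_a and sigma_b are assumed known in advance
obs_a = [10, 12, 14, 16, 18]
obs_b = [9, 10, 11, 12]
z_stat, p_value, ci = two_sample_z(obs_a, obs_b, sigma_a=3.0, sigma_b=1.5)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```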
Interval length
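The interval length (suppose we are using the general procedure) is:
$$L = 2\, t_{\alpha/2,\nu} \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}$$
Then, to obtain an interval length of at most $L_0$ with equal sample sizes ($n = m$), the minimum number of samples is given by
$$n = m \ge \frac{4\, t_{\alpha/2,\nu}^2 \left(s_x^2 + s_y^2\right)}{L_0^2}$$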
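The sample size calculation is simple arithmetic once a critical point and variance estimates are fixed. In the sketch below, the function name `min_equal_sample_size` and all input values are hypothetical. It assumes `t_crit` is supplied by the user (for instance from a pilot study), since the exact critical point itself depends on the final sample size.

```python
import math


def min_equal_sample_size(t_crit, var_x, var_y, target_length):
    """Smallest n = m for which 2 * t_crit * sqrt((var_x + var_y) / n) <= target_length."""
    return math.ceil(4 * t_crit ** 2 * (var_x + var_y) / target_length ** 2)


# Hypothetical inputs: critical point 2.0, sample variances 10 and 5, target length 4
n_required = min_equal_sample_size(t_crit=2.0, var_x=10.0, var_y=5.0, target_length=4.0)
print(n_required)
```

The bound comes from setting $n = m$ in the interval length and solving the resulting inequality for $n$.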