Egwald Statistics — Probability and Stochastic Processes: Hypotheses Testing

Egwald Statistics: Probability and Stochastic Processes

by

Elmer G. Wiens

Hypotheses Testing

A. Testing the mean

1. Small sample from a normal population with known variance ø²

Suppose you know that a random variable X has a normal distribution with known variance ø², but that its mean is unknown. The sample mean of n observations, {X₁, ..., X_n}, is

Y = (X₁ + ... + X_n)/n

To test that the r.v. X has mean µ, use the fact that

Z = (Y - µ)* sqrt(n) / ø

has the standard normal distribution when E(X) = µ. If the absolute value of the observed value z is greater than 1.96, we can reject, at the 95% level of confidence (the 5% significance level), the hypothesis that the mean of X, E(X), is equal to µ.
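
The test above can be sketched in Python using only the standard library. This is a minimal illustration, not the author's code: the function name, the data values, and the choice of µ = 20 and ø = 2 are all invented for the example, and `sigma` in the code stands for ø in the text.

```python
import math
from statistics import NormalDist

def z_test_known_variance(sample, mu0, sigma):
    """Return (z, two-sided p-value) for H0: E(X) = mu0 when sigma is known."""
    n = len(sample)
    y = sum(sample) / n                      # sample mean Y
    z = (y - mu0) * math.sqrt(n) / sigma     # Z = (Y - mu) * sqrt(n) / sigma
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical observations, invented purely for illustration.
data = [21.1, 19.4, 23.0, 22.2, 20.8, 24.1, 21.7, 22.9]
z, p = z_test_known_variance(data, mu0=20.0, sigma=2.0)

critical = NormalDist().inv_cdf(0.975)       # two-sided 5% cutoff, about 1.96
print(f"z = {z:.3f}, p = {p:.4f}, reject H0: {abs(z) > critical}")
```

Comparing |z| with the critical value and comparing the p-value with 0.05 are equivalent ways of reaching the same decision.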

2. Large sample, n >= 30, with unknown mean and (finite) variance

For large n, the sample variance S² is a close approximation to ø², and the Central Limit Theorem permits us to assume that the r.v.

Z = (Y - µ) * sqrt(n) / S

has, approximately, a standard normal distribution.
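
As a rough sketch of the large-sample case, the snippet below draws a deliberately non-normal (exponential) sample and applies the same statistic with S in place of ø. The function name, the seed, the sample size of 200, and the use of the n - 1 divisor for S² (which is what `statistics.stdev` computes) are assumptions made for illustration; the text does not specify them.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def large_sample_z(sample, mu0):
    """Approximate z-test for H0: E(X) = mu0 using the sample standard deviation."""
    n = len(sample)
    y = fmean(sample)                        # sample mean Y
    s = stdev(sample)                        # S, computed with the n - 1 divisor
    z = (y - mu0) * math.sqrt(n) / s
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical non-normal data: exponential draws whose true mean is 1.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(200)]

z, p = large_sample_z(sample, mu0=1.0)
print(f"n = {len(sample)}, z = {z:.3f}, approximate p = {p:.4f}")
```

The p-value here is only approximate, since normality of Z rests on the Central Limit Theorem rather than on the population itself being normal.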

3. Small sample from a normal population whose mean and (finite) variance are unknown.

If Y is the sample mean and S² is the sample variance, then the r.v.

Z = (Y - µ) * sqrt(n) / S

has a t-distribution with n-1 degrees of freedom.

For example, suppose a sample of 10 observations yields a sample mean of y = 22 and a sample variance of s² = 7. We want to test the hypothesis that the population mean µ = 20.

z = (22 - 20)* sqrt(10) / sqrt(7) = 2.39

Since 2.39 > 2.262, the two-sided critical value of the t-distribution with 9 degrees of freedom, we can reject the hypothesis at the 95% confidence level.
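
The arithmetic of this example can be checked with a short script. The standard library has no t-distribution, so the sketch below obtains a two-sided p-value by integrating the t density numerically with Simpson's rule; that integration approach and the helper names `t_pdf` and `t_two_sided_p` are choices made for this illustration, not part of the original text.

```python
import math

def t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value P(|T| > |t|), via Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps                            # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3              # P(-|t| < T < |t|), using symmetry
    return 1 - central

# Values taken from the worked example in the text.
n, y, s2, mu0 = 10, 22.0, 7.0, 20.0
t_stat = (y - mu0) * math.sqrt(n) / math.sqrt(s2)
p_value = t_two_sided_p(t_stat, df=n - 1)

print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
print(f"two-sided p-value = {p_value:.4f}, reject at 5%: {p_value < 0.05}")
```

A p-value below 0.05 corresponds to the statistic exceeding the tabulated critical value 2.262.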

B. Testing the difference between means

1. Two independent random samples from two normal populations with known variances

Let {X₁, ... , X_n} be a sample of size n from the r.v. X with variance ø². Let {W₁, ... , W_m} be a sample of size m from the r.v. W with variance ö². Compute their means as:

Y = [X₁+ ... + X_n] / n

Z = [W₁+ ... + W_m] / m

Suppose we want to test the hypotheses about the difference between the population means:

H₀: E(X) - E(W) = þ

against

H₁: E(X) - E(W) != þ

The r.v. V defined by:

V = [Y - Z - þ] / sqrt[(ø²/n) + (ö²/m)]

has a standard normal distribution when H₀ is true. We can reject H₀ at the 95% confidence level if the value of V computed from the two samples is larger than 1.96, or less than -1.96.
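
A minimal sketch of this two-sample test follows. The function name, the two small data sets, and the assumed known variances (0.25 and 0.16) are hypothetical; `var_x`, `var_w`, and `delta` stand for ø², ö², and þ in the text.

```python
import math
from statistics import NormalDist, fmean

def two_sample_z(xs, ws, var_x, var_w, delta=0.0):
    """Return (V, two-sided p-value) for H0: E(X) - E(W) = delta, variances known."""
    y = fmean(xs)                            # sample mean Y of the X sample
    z = fmean(ws)                            # sample mean Z of the W sample
    se = math.sqrt(var_x / len(xs) + var_w / len(ws))
    v = (y - z - delta) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(v)))
    return v, p

# Hypothetical samples, invented purely for illustration.
xs = [10.2, 9.8, 11.1, 10.5, 9.9, 10.7]
ws = [8.9, 9.4, 9.1, 8.7, 9.6]

v, p = two_sample_z(xs, ws, var_x=0.25, var_w=0.16)
print(f"V = {v:.2f}, p = {p:.6f}, reject H0: {abs(v) > 1.96}")
```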

2. Two large, independent random samples (n,m >= 30) from two populations (not necessarily normal) with unknown variances

For large samples we can approximate ø² with the sample variance S², and ö² with the sample variance R². By the Central Limit Theorem,

V = [Y - Z - þ] / sqrt[(S²/n) + (R²/m)]

has, approximately, a standard normal distribution.

For example, suppose that for sample 1, y = 195, n = 35, s² = 45, and for sample 2, z = 200, m = 40, r² = 55, and that we want to test that the two populations have the same mean. Then

v = [195 - 200 - 0] / sqrt[(45/35) + (55/40)] = -5 / 1.63 = -3.07

Since v < -1.96, the standard normal distribution lets us reject the hypothesis that the two population means are equal (þ = 0) at the 95% confidence level.
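
The numbers in this example can be reproduced directly from the summary statistics. The variable names below are chosen for the sketch, and the p-value is an addition that the text itself does not report.

```python
import math
from statistics import NormalDist

# Summary statistics from the worked example in the text.
y, n, s2 = 195.0, 35, 45.0     # sample 1: mean, size, sample variance
z, m, r2 = 200.0, 40, 55.0     # sample 2: mean, size, sample variance
delta = 0.0                    # hypothesized difference between population means

se = math.sqrt(s2 / n + r2 / m)            # about 1.63
v = (y - z - delta) / se                   # about -3.07
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(v)))

print(f"standard error = {se:.2f}, v = {v:.2f}")
print(f"two-sided p-value = {p_value:.4f}, reject at 5%: {abs(v) > 1.96}")
```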

3. Two small, independent random samples from two normal populations with the same variance ø²

Suppose for sample 1, the sample mean = Y, the sample variance = S², and the sample size = n; for sample 2, the sample mean = Z, the sample variance = R², and the sample size = m. Then, the distribution of the r.v.

T = [Y - Z - þ] / {sqrt[(n - 1)*S² + (m - 1)*R²] * sqrt[1 / (n + m - 2)] * sqrt[(1/n) + (1/m)]}

is the Student-t distribution with (n + m - 2) degrees of freedom.
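
The pooled statistic can be computed from summary statistics as in the sketch below. The function name and the input values are hypothetical, chosen only to show the calculation; `delta` stands for þ. The resulting T would then be compared with a critical value of the t-distribution with n + m - 2 degrees of freedom, taken from a table or a statistics library.

```python
import math

def pooled_t(y, s2, n, z, r2, m, delta=0.0):
    """Return (T, degrees of freedom) for H0: difference of means = delta,
    assuming both normal populations share the same variance."""
    df = n + m - 2
    pooled_var = ((n - 1) * s2 + (m - 1) * r2) / df   # pooled variance estimate
    denom = math.sqrt(pooled_var) * math.sqrt(1.0 / n + 1.0 / m)
    return (y - z - delta) / denom, df

# Hypothetical summary statistics, invented purely for illustration.
t_value, df = pooled_t(y=52.0, s2=16.0, n=8, z=48.0, r2=20.0, m=10)
print(f"T = {t_value:.3f} with {df} degrees of freedom")
```

Writing the denominator as the pooled standard deviation times sqrt[(1/n) + (1/m)] is algebraically the same as the three square-root factors in the formula for T.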