Exercise 3 · Binomial, normal and confidence interval — Web loading time
Binomial distribution, normal approximation and a confidence interval for the mean.
Maximum score · 2.5 points
A telecommunications company considers a web page efficient if its loading time is under 3 seconds. The company claims that at least 50 % of the pages it manages are efficient. To check this claim, a random sample of $n = 2500$ web pages managed by the company is selected.
Formulas provided: $Z \sim N(0,1) \Rightarrow P(-1.96 \le Z \le 1.96) = 0.95$ and $P(-2.58 \le Z \le 2.58) = 0.99$; confidence intervals $\left[\hat{p} - z_{\gamma}\sqrt{\tfrac{\hat{p}(1-\hat{p})}{n}},\; \hat{p} + z_{\gamma}\sqrt{\tfrac{\hat{p}(1-\hat{p})}{n}}\right]$ for the proportion and $\left[\bar{x} - z_{\gamma}\tfrac{s}{\sqrt{n}},\; \bar{x} + z_{\gamma}\tfrac{s}{\sqrt{n}}\right]$ for the mean.
- Assume that, each time a web page is selected for the sample, it can be efficient (or not) with probability $p = 0.5$, independently of all other pages. Consider the random variable $X$ counting how many of the 2500 sampled pages are efficient. What distribution does $X$ follow? Compute the probability that at most 1299 pages are efficient, using the normal approximation without continuity correction. (1.25 points)
- In the sample of $n = 2500$ web pages, the sample mean loading time was $\bar{x} = 2.95$ seconds with a sample standard deviation of $s = 0.38$ seconds. Build a 95 % confidence interval for the mean loading time of the company's pages. Based on the interval, what can be said about the company's claim that at least 50 % of its pages are efficient? (1.25 points)
Step-by-step solution
Key idea: $X$ counts successes in $n$ independent trials with constant success probability: a binomial. Since $n$ is very large, it is approximated by a normal with mean $np$ and standard deviation $\sqrt{np(1-p)}$.
a) Distribution of $X$ and the probability
Each page is efficient with probability $p = 0.5$, independently of the others, and $X$ counts the successes among $n = 2500$ pages, so $X \sim B(2500,\, 0.5)$.
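Parameters of the normal approximation: $\mu = np = 2500 \cdot 0.5 = 1250$ and $\sigma = \sqrt{np(1-p)} = \sqrt{2500 \cdot 0.5 \cdot 0.5} = \sqrt{625} = 25$.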
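Hence $X \approx N(1250,\, 25)$, where the second parameter is the standard deviation. Standardise (no continuity correction, as requested): $P(X \le 1299) \approx P\left(Z \le \tfrac{1299 - 1250}{25}\right) = P(Z \le 1.96)$.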
Since $P(-1.96 \le Z \le 1.96) = 0.95$, each tail is $0.025$, so $P(Z \le 1.96) = 0.95 + 0.025 = 0.975$. Therefore $P(X \le 1299) \approx 0.975$.
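As a numerical check of part a), the following minimal sketch recomputes the normal approximation and compares it with the exact binomial probability. The function names `normal_approx_cdf` and `exact_binomial_cdf_half` are hypothetical helpers introduced here for illustration; only the Python standard library is assumed.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) for X ~ B(n, p) with a normal, no continuity correction."""
    mu = n * p                        # mean of the binomial
    sigma = sqrt(n * p * (1 - p))     # standard deviation of the binomial
    z = (k - mu) / sigma              # standardised value
    return z, NormalDist().cdf(z)     # standard normal CDF at z


def exact_binomial_cdf_half(k, n):
    """Exact P(X <= k) for X ~ B(n, 0.5), summing binomial coefficients as integers."""
    return sum(comb(n, j) for j in range(k + 1)) / 2 ** n


z, approx = normal_approx_cdf(1299, 2500, 0.5)
exact = exact_binomial_cdf_half(1299, 2500)

print(f"z = {z:.2f}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation should agree with the tabulated value 0.975. The exact binomial probability comes out slightly higher, which is the small error that a continuity correction would partly remove.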
b) Confidence interval and interpretation
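Large sample with unknown population variance: use the interval for the mean with $z_{\gamma} = 1.96$ (95 % confidence): $\bar{x} \pm z_{\gamma}\tfrac{s}{\sqrt{n}} = 2.95 \pm 1.96 \cdot \tfrac{0.38}{\sqrt{2500}} = 2.95 \pm 1.96 \cdot 0.0076 \approx 2.95 \pm 0.0149$, that is, approximately $[2.935,\; 2.965]$ seconds.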
The whole interval lies below 3 seconds: with 95 % confidence, the mean loading time of the company's pages is under 3 seconds.
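The interval in part b) can be reproduced with a short sketch. The helper `mean_confidence_interval` is a hypothetical name introduced here. It uses the standard normal quantile from `statistics.NormalDist` instead of the rounded value 1.96, which changes the result only beyond the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Large-sample z-interval for a mean: x_bar +/- z * s / sqrt(n)."""
    # Two-sided critical value, about 1.96 for 95 % confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = mean_confidence_interval(x_bar=2.95, s=0.38, n=2500)
print(f"95% CI for the mean: [{low:.3f}, {high:.3f}] seconds")
# prints: 95% CI for the mean: [2.935, 2.965] seconds
```

Because the upper bound is below 3, the check `high < 3` is true, matching the conclusion that the whole interval lies under the 3-second threshold.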
The interval is consistent with the company's claim and lends it some support: if the mean time is clearly below 3 s, it is plausible that at least half the pages load in under 3 s. Strictly speaking, though, the interval concerns the mean loading time, while the claim concerns the proportion of efficient pages; the interval does not prove the claim, but it does not contradict it.
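To illustrate the gap between a statement about the mean and a statement about a proportion, the sketch below estimates the share of pages loading in under 3 s. It rests on an assumption that is not part of the exercise: that loading times are normally distributed with the sample mean and standard deviation. Real loading times may well be skewed, so the figure is only indicative.

```python
from statistics import NormalDist

# ASSUMPTION (not stated in the exercise): loading times follow a normal
# distribution with the sample mean 2.95 s and standard deviation 0.38 s.
loading_time = NormalDist(mu=2.95, sigma=0.38)

# Share of pages with loading time below the 3-second efficiency threshold
share_efficient = loading_time.cdf(3.0)
print(f"estimated share of efficient pages: {share_efficient:.3f}")
```

Under this assumption the estimated share is only modestly above one half, even though the confidence interval for the mean lies entirely below 3 s. This is why the interval supports the claim without proving it.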