Normal Approximation and Sample Size Calculation
The Sample \[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625} \\ Y_i & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]
The Bootstrap Sample
\[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625}^* \\ Y_i^* & 1 & 0 & \dots & 1 & 0.68 \\ \end{array} \]
The Population
\[ \begin{array}{r|rrrr|r} j & 1 & 2 & \dots & 7.23M & \bar{y}_{7.23M} \\ y_{j} & 1 & 1 & \dots & 1 & 0.70 \\ \end{array} \]
The ‘Bootstrap Population’, i.e. the Sample \[ \begin{array}{r|rrrr|r} j & 1 & 2 & \dots & 625 & \bar{y}^*_{625} \\ y_j^* & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]
\[ \small{ \begin{array}{r|rr|rr|r|rr|rrrr} \text{call} & 1 & & 2 & & \dots & 625 & & & & & & \\ \text{pollster} & X_1 & Y_1 & X_2 & Y_2 & \dots & X_{625} & Y_{625} & \overline{X} & \overline{Y} &\frac{\sum_{i:X_i=1} Y_i}{\sum_{i:X_i=1} 1} & \frac{\sum_{i:X_i=0} Y_i}{\sum_{i:X_i=0} 1} & \text{difference} \\ \hline \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}0 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}\dots & \color[RGB]{7,59,76}0 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}0.28 & \color[RGB]{7,59,76}0.68 & \color[RGB]{7,59,76}0.69 & 0.68 & \color[RGB]{7,59,76}0.01 \\ \end{array} } \]
\[ \small{ \begin{array}{r|rr|rr|r|rr|rrrr} \text{`call'} & 1 & & 2 & & \dots & 625 & & & & & & \\ \text{`pollster'} & X_1^* & Y_1^* & X_2^* & Y_2^* & \dots & X^*_{625} & Y^*_{625} & \overline{X}^* & \overline{Y}^* &\frac{\sum_{i:X_i^*=1} Y_i^*}{\sum_{i:X_i^*=1} 1} & \frac{\sum_{i:X_i^*=0} Y_i^*}{\sum_{i:X_i^*=0} 1} & \text{difference} \\ \hline \color[RGB]{239,71,111}2 & \color[RGB]{239,71,111}X_{398} & \color[RGB]{239,71,111}Y_{398} & \color[RGB]{239,71,111}X_{129} & \color[RGB]{239,71,111}Y_{129} & \color[RGB]{239,71,111}\dots & \color[RGB]{239,71,111}X_{232} & \color[RGB]{239,71,111}Y_{232} & & & & & & \\ & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}0 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}\dots & \color[RGB]{239,71,111}0 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}0.29 & \color[RGB]{239,71,111}0.68 & \color[RGB]{239,71,111}0.68 & \color[RGB]{239,71,111}0.69 & \color[RGB]{239,71,111}-0.01 \\ \color[RGB]{17,138,178}3 & \color[RGB]{17,138,178}X_{293} & \color[RGB]{17,138,178}Y_{293} & \color[RGB]{17,138,178}X_{526} & \color[RGB]{17,138,178}Y_{526} & \color[RGB]{17,138,178}\dots & \color[RGB]{17,138,178}X_{578} & \color[RGB]{17,138,178}Y_{578} & & & & & & \\ & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}\dots & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}0.28 & \color[RGB]{17,138,178}0.65 & \color[RGB]{17,138,178}0.67 & \color[RGB]{17,138,178}0.64 & \color[RGB]{17,138,178}0.03 \\ \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}X_{281} & \color[RGB]{6,214,160}Y_{281} & \color[RGB]{6,214,160}X_{520} & \color[RGB]{6,214,160}Y_{520} & \color[RGB]{6,214,160}\dots & \color[RGB]{6,214,160}X_{363} & \color[RGB]{6,214,160}Y_{363} & & & & & & \\ & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}\dots & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}0.28 & \color[RGB]{6,214,160}0.68 & \color[RGB]{6,214,160}0.71 & \color[RGB]{6,214,160}0.66 & \color[RGB]{6,214,160}0.05 \\ \end{array} } \]
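To make the resampling above concrete, here is a minimal Python sketch of one bootstrap replicate. The data are made up for illustration (the real 625 calls aren't reproduced here), and `difference_in_proportions` is just a hypothetical helper name; the point is that a bootstrap sample is drawn by picking indices like 398, 129, …, 232 uniformly at random with replacement and recomputing the same estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-ins for the n = 625 recorded calls: x[i] plays the role of X_i, y[i] of Y_i.
n = 625
x = rng.binomial(1, 0.28, size=n)
y = rng.binomial(1, 0.68, size=n)

def difference_in_proportions(x, y):
    # mean of Y among the X = 1 calls minus mean of Y among the X = 0 calls
    return y[x == 1].mean() - y[x == 0].mean()

estimate = difference_in_proportions(x, y)

# One bootstrap replicate: draw indices uniformly at random with replacement
# and recompute the same estimator on the resampled pairs.
idx = rng.integers(0, n, size=n)
bootstrap_estimate = difference_in_proportions(x[idx], y[idx])

print(estimate, bootstrap_estimate)
```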
The normal distribution is a parametric family of distributions involving two parameters.
We say a random variable \(X\) is normally distributed with mean \(\theta\) and standard deviation \(\sigma\) …
… if the probability that it’s in an interval \([a,b]\) is given by this integral of its probability density. \[ P_{\theta,\sigma}(X \in [a,b]) = \int_a^b f_{\theta,\sigma}(x)dx \qfor f_{\theta, \sigma}(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\theta)^2}{2\sigma^2}} \]
We have to talk about the probability that it’s ‘in an interval’ rather than that it ‘takes a value’ because …
… the probability it actually takes on any particular value \(x\) is zero: it’s an integral from \(x\) to \(x\).
This seems like an annoyance, but given what the normal distribution is actually used for, it’s a blessing.
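As a quick numerical illustration of both points (not part of the lecture's own code), we can use the normal CDF from scipy; the mean and standard deviation below are arbitrary.

```python
from scipy.stats import norm

theta, sigma = 0.68, 0.019   # arbitrary mean and standard deviation for illustration

# P(X in [a, b]) is the integral of the density from a to b: a difference of CDF values.
a, b = 0.66, 0.70
print(norm.cdf(b, loc=theta, scale=sigma) - norm.cdf(a, loc=theta, scale=sigma))

# The probability of taking exactly the value x is an integral from x to x: zero.
x = 0.68
print(norm.cdf(x, loc=theta, scale=sigma) - norm.cdf(x, loc=theta, scale=sigma))
```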
The Central Limit Theorem
The sampling distribution of a mean of independent random variables, \(\color[RGB]{64,64,64}\frac{1}{n}\sum\limits_{i=1}^n X_i\), gets arbitrarily close to its normal approximation
as the number of random variables \(\color[RGB]{64,64,64}n\) increases to infinity.
Caveat. This is only true if each random variable contributes infinitesimally to the mean, i.e. \(\color[RGB]{64,64,64}\max_{i\le n} X_i/n \to 0\) as \(\color[RGB]{64,64,64}n \to \infty\).
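Here is a small simulation sketch of what the theorem says for the binary variables we've been averaging. The \(\theta\) and \(n\) below are made up; the simulated sampling distribution of the mean has roughly the spread and tail probabilities of its normal approximation.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta, n, reps = 0.7, 625, 10_000

# Simulate the sampling distribution of the mean of n independent 0/1 draws.
means = rng.binomial(1, theta, size=(reps, n)).mean(axis=1)

sigma = np.sqrt(theta * (1 - theta) / n)
print(means.std(), sigma)                                        # spread matches the approximation
print((means <= theta - 1.96 * sigma).mean(), norm.cdf(-1.96))   # so do tail probabilities
```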
\[ \text{interval estimate} = \hat\theta \pm 1.96 \hat\sigma \]
\[ \begin{aligned} \text{normal approximation} & \ f_{\theta,\sigma}(x) \qfor \sigma^2 = \frac{\theta(1-\theta)}{n} \\ \text{corresponding estimate} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\sigma^2 = \frac{\hat \theta(1-\hat \theta)}{n}. \end{aligned} \]
\[ \begin{aligned} \text{normal approximation} & \ f_{\theta,\sigma}(x) \qfor \sigma^2 = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1} \\ \text{corresponding estimate} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\sigma^2 = \frac{\hat \theta(1-\hat \theta)}{n} \times \frac{m-n}{m-1}. \end{aligned} \]
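In code, both versions amount to one line each. This is a minimal sketch using the \(n\), \(m\), and \(\hat\theta\) from the running example (625 calls, a population of roughly 7.23M, \(\hat\theta = 0.68\)), not the course's own implementation.

```python
import numpy as np

n, m = 625, 7_230_000   # sample and population sizes from the example
theta_hat = 0.68        # sample proportion

# Estimated standard deviation, with and without the finite-population correction.
sigma_hat = np.sqrt(theta_hat * (1 - theta_hat) / n)
sigma_hat_fpc = np.sqrt(theta_hat * (1 - theta_hat) / n * (m - n) / (m - 1))

# Interval estimates: theta_hat +/- 1.96 estimated standard deviations.
print(theta_hat - 1.96 * sigma_hat, theta_hat + 1.96 * sigma_hat)
print(theta_hat - 1.96 * sigma_hat_fpc, theta_hat + 1.96 * sigma_hat_fpc)
```

Because \(m\) is so much bigger than \(n\) here, the factor \((m-n)/(m-1)\) is essentially 1 and the two intervals agree to several decimal places.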
\[ \begin{aligned} \text{estimator} & = \frac{\sum_{i:X_i=1} Y_i}{\sum_{i:X_i=1} 1} - \frac{\sum_{i:X_i=0} Y_i}{\sum_{i:X_i=0} 1} \\ \text{bootstrap estimate} & = \frac{\sum_{i:X_i^*=1} Y_i^*}{\sum_{i:X_i^*=1} 1} - \frac{\sum_{i:X_i^*=0} Y_i^*}{\sum_{i:X_i^*=0} 1} \end{aligned} \]
\[ \begin{aligned} \text{normal approximation:} & \ f_{\theta,\sigma}(x) \qfor \theta=\mu(1)-\mu(0) \qand \sigma^2 = \mathop{\mathrm{E}}\qty[\frac{\mu(1)\{1-\mu(1)\}}{N_1} + \frac{\mu(0)\{1-\mu(0)\}}{N_0}] \\ \text{corresponding estimate:} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\theta=\hat\mu(1)-\hat\mu(0) \qand \hat\sigma^2 = \frac{\hat \mu(1)\{1-\hat \mu(1)\}}{N_1} + \frac{\hat \mu(0)\{1-\hat \mu(0)\}}{N_0} \end{aligned} \]
Above, \(N_0\) and \(N_1\) are the sizes of the subsamples with \(X_i=0\) and \(X_i=1\), and \(\mu(0)\) and \(\mu(1)\) are the proportions of ones in the corresponding subpopulations. More on this Friday and next week.
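Reading the first row of the table above, \(\overline{X} = 0.28\) (so \(N_1 = 0.28 \times 625 = 175\) and \(N_0 = 450\)), \(\hat\mu(1) = 0.69\), and \(\hat\mu(0) = 0.68\), here is a sketch of the corresponding estimate and interval.

```python
import numpy as np

N1, N0 = 175, 450                # subsample sizes (0.28 x 625 and 0.72 x 625)
mu1_hat, mu0_hat = 0.69, 0.68    # proportions of ones in each subsample

theta_hat = mu1_hat - mu0_hat
sigma_hat = np.sqrt(mu1_hat * (1 - mu1_hat) / N1 + mu0_hat * (1 - mu0_hat) / N0)

print(theta_hat, theta_hat - 1.96 * sigma_hat, theta_hat + 1.96 * sigma_hat)
```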
\[ \begin{aligned} E ( a Y + b Z ) &= E (aY) + E (bZ) \\ &= aE(Y) + bE(Z) \\ & \text{ for random variables $Y, Z$ and numbers $a,b$ } \end{aligned} \]
\[ \begin{aligned} \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] &= \frac1n \sum_{i=1}^n \mathop{\mathrm{E}}[Y_i] && \text{linearity of expectation} \\ &= \frac1n \sum_{i=1}^n \qty{\sum_{j=1}^m y_j \times P(J_i=j) } && \text{def. expectation} \\ &= \frac1n \sum_{i=1}^n \qty{\sum_{j=1}^m y_j \times \frac{1}{m} } && \text{sampling uniformly at random} \\ &= \frac1n \sum_{i=1}^n \mu && \text{ for } \ \mu = \frac1m \sum_{j=1}^m y_j \\ &= \frac{1}{n} \times n \times \mu = \mu \end{aligned} \]
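Here is a simulation version of the same calculation, with a small made-up population standing in for the real one: the average of many simulated sample means sits right at the population mean \(\mu\).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, reps = 1_000, 25, 100_000

y = rng.binomial(1, 0.7, size=m)   # a made-up population of m binary values
mu = y.mean()                      # the population mean

# Draw J_1, ..., J_n uniformly at random and average Y_i = y_{J_i}, many times over.
J = rng.integers(0, m, size=(reps, n))
sample_means = y[J].mean(axis=1)

print(sample_means.mean(), mu)     # the average sample mean is (nearly) the population mean
```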
\[ \small{ \begin{aligned} \mathop{\mathrm{E}}\qty( a Y + b Z ) &= \sum_{y}\sum_z (a y + b z) \ P(Y=y, Z=z) && \text{ by definition of expectation} \\ &= \sum_{y}\sum_z a y \ P(Y=y, Z=z) + \sum_{z}\sum_y b z \ P(Y=y, Z=z) && \text{changing the order in which we sum} \\ &= \sum_{y} a y \ \sum_z P(Y=y,Z=z) + \sum_{z} b z \ \sum_y P(Y=y,Z=z) && \text{pulling constants out of the inner sums} \\ &= \sum_{y} a y \ P(Y=y) + \sum_{z} b z \ P(Z=z) && \text{summing to get marginal probabilities from our joint } \\ &= a\sum_{y} y \ P(Y=y) + b\sum_{z} z \ P(Z=z) && \text{ pulling constants out of the remaining sum } \\ &= a\mathop{\mathrm{E}}Y + b \mathop{\mathrm{E}}Z && \text{ by definition} \end{aligned} } \]
\[ \mathop{\mathrm{E}}[YZ] = \mathop{\mathrm{E}}[Y]\mathop{\mathrm{E}}[Z] \qqtext{when $Y$ and $Z$ are independent} \]
\[ \begin{aligned} \mathop{\mathrm{E}}[YZ] &= \sum_{yz} yz \ P(Y=y, Z=z) && \text{by definition of expectation} \\ &= \sum_y \sum_z yz \ P(Y=y) P(Z=z) && \text{factoring and ordering sums } \\ &= \textcolor[RGB]{17,138,178}{\sum_y y \ P(Y=y)} \textcolor[RGB]{239,71,111}{\sum_z z \ P(Z=z)} && \text{pulling factors that don't depend on $z$ out of the inner sum} \\ &= \textcolor[RGB]{17,138,178}{\mathop{\mathrm{E}}[Y]} \textcolor[RGB]{239,71,111}{\mathop{\mathrm{E}}[Z]} && \text{by definition of expectation} \end{aligned} \]
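Both identities are easy to sanity-check by simulation. A sketch with two independent made-up binary variables and arbitrary constants \(a\) and \(b\):

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 1_000_000

Y = rng.binomial(1, 0.7, size=reps)   # independent binary random variables
Z = rng.binomial(1, 0.3, size=reps)
a, b = 2.0, -5.0

# Linearity: E(aY + bZ) = a E(Y) + b E(Z) = 2(0.7) - 5(0.3).
print((a * Y + b * Z).mean(), a * 0.7 + b * 0.3)

# Independence: E(YZ) = E(Y) E(Z) = 0.7 x 0.3.
print((Y * Z).mean(), 0.7 * 0.3)
```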
We’ve worked out the variance (i.e. the squared standard deviation) of a binary random variable. \[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y] = \theta(1-\theta) \qqtext{ when } Y = \begin{cases} 1 & \qqtext{ with probability } \theta \\ 0 & \qqtext{ with probability } 1-\theta \end{cases} \]
When we sample uniformly at random, our sample proportion is the mean of \(n\) independent variables like this.
\[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] = \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n \sum_{i=1}^n Y_i] = \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] = \frac{\theta(1-\theta)}{n} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &\overset{\texttip{\text{ \ ❓ \ }}{Using the definitions of $\mathop{\mathrm{\mathop{\mathrm{V}}}}$ and $\hat\theta$}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 1. Centering each term. We can do this because Expectation is linear.}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 2. Squaring out the Sum. This is just arithmetic: a version of $(a+b)^2=a^2+ab+ba+b^2$ for bigger sums.}}{=} \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 3. Distributing the Expectation. Linearity of Expectation again.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 4. Taking Expectations term-by-term.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 ] \overset{\texttip{\text{ \ ❓ \ }}{By definition}}{=} \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\text{ \ ❓ \ }}{Because $Y_i$ and $Y_j$ are independent.}}{=} \mathop{\mathrm{E}}\qty[\qty(Y_i - \mathop{\mathrm{E}}[Y_i])] \mathop{\mathrm{E}}\qty[\qty(Y_j - \mathop{\mathrm{E}}[Y_j])] \overset{\texttip{\text{ \ ❓ \ }}{Because each factor has mean zero.}}{=} 0 & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\text{ \ ❓ \ }}{Because each sum over $j$ has one nonzero term---the one where $j=i$---and it's always $\theta(1-\theta)$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta) = \frac{1}{n^2} \times n \times \theta(1-\theta) = \frac{\theta(1-\theta)}{n} \end{aligned} \]
The variance of our mean is \(\color[RGB]{64,64,64}{1/n \times}\) the variance of one observation. \[ \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n\sum_{i=1}^n Y_i] = \frac{\mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_1]}{n} \]
So the standard deviation of our mean is \(\color[RGB]{64,64,64}{1/\sqrt{n} \times}\) the standard deviation of one observation.
\[ \mathop{\mathrm{sd}}\qty[\frac1n\sum_{i=1}^n Y_i] = \frac{\mathop{\mathrm{sd}}[Y_1]}{\sqrt{n}} = \sqrt{\frac{\theta(1-\theta)}{n}} \]
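A simulation check of this formula under the with-replacement (i.i.d.) model, with a made-up \(\theta\) and the \(n = 625\) from the example:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 0.7, 625, 200_000

# Each simulated sample proportion is (number of ones among n i.i.d. draws) / n.
theta_hats = rng.binomial(n, theta, size=reps) / n

print(theta_hats.var(), theta * (1 - theta) / n)             # variance of the mean
print(theta_hats.std(), np.sqrt(theta * (1 - theta) / n))    # sd of one draw / sqrt(n)
```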
\[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] = \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n \sum_{i=1}^n Y_i] = \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \frac1n \mathop{\mathrm{E}}[\sum_{i=1}^n Y_i] }^2 ] = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &\overset{\texttip{\text{ \ ❓ \ }}{Using the definitions of $\mathop{\mathrm{\mathop{\mathrm{V}}}}$ and $\hat\theta$}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 1. Centering each term. We can do this because Expectation is linear.}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 2. Squaring out the Sum. This is just arithmetic: a version of $(a+b)^2=a^2+ab+ba+b^2$ for bigger sums.}}{=} \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 3. Distributing the Expectation. Linearity of Expectation again.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\text{ \ ❓ \ }}{Step 4. Taking Expectations term-by-term.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 ] \overset{\texttip{\text{ \ ❓ \ }}{By definition}}{=} \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\text{ \ ❓ \ }}{Because $Y_i$ and $Y_j$ are independent.}}{=} \mathop{\mathrm{E}}\qty[\qty(Y_i - \mathop{\mathrm{E}}[Y_i])] \mathop{\mathrm{E}}\qty[\qty(Y_j - \mathop{\mathrm{E}}[Y_j])] \overset{\texttip{\text{ \ ❓ \ }}{Because each factor has mean zero.}}{=} 0 & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\text{ \ ❓ \ }}{Because each sum over $j$ has one nonzero term---the one where $j=i$---and it's always $\theta(1-\theta)$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta) = \frac{1}{n^2} \times n \times \theta(1-\theta) = \frac{\theta(1-\theta)}{n} \end{aligned} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &= \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \frac1n \sum_{i=1}^n \mathop{\mathrm{E}}[Y_i] }^2 ] \\ &= \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &= \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &= \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &= \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 = \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \textcolor{blue}{\mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\text{ \ ❓ \ }}{Why $-\frac{\theta(1-\theta)}{m-1}$? This'll be a homework problem.}}{=} -\frac{\theta (1-\theta)}{m-1} } & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\text{ \ ❓ \ }}{Because each sum over $j$ now includes one copy of $\theta(1-\theta)$ and $n-1$ copies of $-\frac{\theta(1-\theta)}{m-1}$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta)\qty{1 - (n-1) \times \frac{1}{m-1}} \\ &\overset{\texttip{\text{ \ ❓ \ }}{Pulling out common factors and simplifying}}{=} \frac{\theta(1-\theta)}{n} \times \qty{1 - \frac{n-1}{m-1}} = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1}. \end{aligned} \]
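And the analogous check for sampling without replacement. The population here is made up and deliberately small (\(m = 1000\)) so that the finite-population correction is visible; with \(m = 7.23\)M it would be invisible.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, reps = 1_000, 625, 20_000
population = np.repeat([1, 0], [700, 300])   # made-up population with theta = 0.7

# Each replicate: shuffle the population and keep the first n values (no replacement).
theta_hats = np.array([rng.permutation(population)[:n].mean() for _ in range(reps)])

with_replacement = 0.7 * 0.3 / n
print(theta_hats.var())                          # close to the corrected formula ...
print(with_replacement * (m - n) / (m - 1))      # theta(1-theta)/n x (m-n)/(m-1)
print(with_replacement)                          # the uncorrected value, noticeably larger here
```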
\[ \begin{aligned} \textcolor{red}{\text{binomial interval}} &= 0.6800 \pm 0.0368 \\ \textcolor{blue}{\text{normal interval}} &= 0.6800 \pm 0.0366 \end{aligned} \]
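Here is one way the two intervals could be computed, assuming the 'binomial interval' refers to the central 95% range of the \(\mathrm{Binomial}(n,\hat\theta)/n\) sampling distribution; with \(n = 625\) and \(\hat\theta = 0.68\) this reproduces the half-widths above.

```python
import numpy as np
from scipy.stats import binom

n, theta_hat = 625, 0.68

# Normal interval half-width: 1.96 estimated standard deviations.
half_width_normal = 1.96 * np.sqrt(theta_hat * (1 - theta_hat) / n)

# Binomial interval half-width (as assumed here): half the central 95% range
# of Binomial(n, theta_hat) / n.
lo = binom.ppf(0.025, n, theta_hat) / n
hi = binom.ppf(0.975, n, theta_hat) / n
half_width_binomial = (hi - lo) / 2

print(half_width_binomial, half_width_normal)   # roughly 0.0368 and 0.0366
```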
\[ \hat \theta \pm 1.96\sigma \qfor \sigma = \sqrt{\frac{\theta (1-\theta)}{n}} \qqtext{ is } \hat\theta \pm .01 \qqtext{if} 1.96\sqrt{\frac{\theta (1-\theta)}{n}} = .01 \]
\[ n = \frac{1.96^2 \ \hat\theta (1-\hat\theta)}{.01^2} \approx 8000 \]
\[ \begin{aligned} \pm 0.037 &= \pm 1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{625}} \\ \pm 0.01 &= \pm 1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{n}} \end{aligned} \]
\[ \frac{0.037}{.01} = \frac{1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{625}}}{1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{n}}} = \sqrt{\frac{n}{625}} \qqtext{ so } n = 625\left(\frac{0.037}{.01}\right)^2 \approx 8000 \]
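Both routes are one-liners in code. A sketch using the preliminary estimate \(\hat\theta = 0.68\):

```python
import numpy as np

theta_hat = 0.68   # preliminary estimate from the n = 625 survey
target = 0.01      # desired half-width

# Route 1: solve 1.96 sqrt(theta_hat (1 - theta_hat) / n) = target for n.
n1 = 1.96**2 * theta_hat * (1 - theta_hat) / target**2

# Route 2: scale up from the half-width we observed at n = 625.
observed = 1.96 * np.sqrt(theta_hat * (1 - theta_hat) / 625)
n2 = 625 * (observed / target) ** 2

print(np.ceil(n1), np.ceil(n2))   # both roughly 8000
```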
What do we do if we don’t have any data yet?
But we can still use the formula we worked out earlier to get somewhere. \[ n = \frac{1.96^2 \ \theta (1-\theta)}{.01^2} \]
We don’t have an estimate of \(\theta\), but we do know it’s between 0 and 1. And, consequently, so is \(\theta(1-\theta)\).
So we know that if we just substitute \(1\) into our formula, we’ll get a number that’s bigger than we need.
\[ n < n' = \frac{1.96^2 \cdot 1}{.01^2} \approx 38400 \]
That’s a bit excessive. In fact, we can substitute in \(1/4\) instead of \(1\). \[ n < n' = \frac{1.96^2 \cdot 1/4}{.01^2} \approx 9600 \]
Much better. That’s pretty close to the number we got with preliminary data.
Here’s the claim. Why is it true? \[ n := \frac{1.96^2 \ \theta (1-\theta)}{.01^2} < n' := \frac{1.96^2 \ \times 1/4}{.01^2} \]
Hint. Look at the graph of \(f(x) = x(1-x)\) above.
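A quick numerical check of the hint and the claim: \(x(1-x)\) peaks at \(x=1/2\), where it equals \(1/4\), so substituting \(1/4\) always gives at least as large an \(n\) as the truth requires.

```python
import numpy as np

# f(x) = x(1 - x) on a grid: its maximum is 1/4, attained at x = 1/2.
x = np.linspace(0, 1, 1001)
f = x * (1 - x)
print(x[f.argmax()], f.max())       # 0.5 0.25

# Worst-case sample sizes for a +/- 0.01 interval.
print(1.96**2 * 0.25 / 0.01**2)     # about 9600
print(1.96**2 * 1.00 / 0.01**2)     # about 38400
```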
Nothing here will show up on an exam.
\[ \qty{\sum_{i=1}^n Z_i}^2 = \sum_{i=1}^n \sum_{j=1}^n Z_i Z_j \]
\[ (a+b)^2 = \textcolor[RGB]{239,71,111}{(a+b)}\textcolor[RGB]{17,138,178}{(a+b)} = \textcolor[RGB]{239,71,111}{a}\textcolor[RGB]{17,138,178}{a} + \textcolor[RGB]{239,71,111}{a}\textcolor[RGB]{17,138,178}{b} + \textcolor[RGB]{239,71,111}{b}\textcolor[RGB]{17,138,178}{a} + \textcolor[RGB]{239,71,111}{b}\textcolor[RGB]{17,138,178}{b} = a^2 + 2ab + b^2 \]
\[ \qty{\sum_{i=1}^n Z_i}^2 = \textcolor[RGB]{239,71,111}{\sum_{i=1}^n Z_i} \textcolor[RGB]{17,138,178}{\sum_{j=1}^n Z_j} = \textcolor[RGB]{239,71,111}{\sum_{i=1}^n} \textcolor[RGB]{17,138,178}{\sum_{j=1}^n} \textcolor[RGB]{239,71,111}{Z_i} \textcolor[RGB]{17,138,178}{Z_j} \]
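A numeric spot-check of this identity, with a few arbitrary numbers standing in for the \(Z_i\):

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=10)   # arbitrary numbers standing in for Z_1, ..., Z_n

lhs = Z.sum() ** 2                                              # the square of the sum
rhs = sum(Z[i] * Z[j] for i in range(10) for j in range(10))    # the double sum of products
print(lhs, rhs)                                                 # equal up to rounding
```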