Normal Approximation and Sample Size Calculation
The Sample
\[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625} \\ Y_i & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]
The Bootstrap Sample
\[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625}^* \\ Y_i^* & 1 & 0 & \dots & 1 & 0.68 \\ \end{array} \]
The Population
\[ \begin{array}{r|rrrr|r} j & 1 & 2 & \dots & 7.23M & \bar{y}_{7.23M} \\ y_{j} & 1 & 1 & \dots & 1 & 0.70 \\ \end{array} \]
The ‘Bootstrap Population’: the Sample
\[ \begin{array}{r|rrrr|r} j & 1 & 2 & \dots & 625 & \bar{y}^*_{625} \\ y_j^* & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]
\[ \small{ \begin{array}{r|rr|rr|r|rr|rrrr} \text{call} & 1 & & 2 & & \dots & 625 & & & & & & \\ \text{pollster} & X_1 & Y_1 & X_2 & Y_2 & \dots & X_{625} & Y_{625} & \overline{X} & \overline{Y} &\frac{\sum_{i:X_i=1} Y_i}{\sum_{i:X_i=1} 1} & \frac{\sum_{i:X_i=0} Y_i}{\sum_{i:X_i=0} 1} & \text{difference} \\ \hline \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}0 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}\dots & \color[RGB]{7,59,76}0 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}0.28 & \color[RGB]{7,59,76}0.68 & \color[RGB]{7,59,76}0.69 & 0.68 & \color[RGB]{7,59,76}0.01 \\ \end{array} } \]
\[ \small{ \begin{array}{r|rr|rr|r|rr|rrrr} \text{`call'} & 1 & & 2 & & \dots & 625 & & & & & & \\ \text{`pollster'} & X_1^* & Y_1^* & X_2^* & Y_2^* & \dots & X^*_{625} & Y^*_{625} & \overline{X}^* & \overline{Y}^* &\frac{\sum_{i:X_i^*=1} Y_i^*}{\sum_{i:X_i^*=1} 1} & \frac{\sum_{i:X_i^*=0} Y_i^*}{\sum_{i:X_i^*=0} 1} & \text{difference} \\ \hline \color[RGB]{239,71,111}2 & \color[RGB]{239,71,111}X_{398} & \color[RGB]{239,71,111}Y_{398} & \color[RGB]{239,71,111}X_{129} & \color[RGB]{239,71,111}Y_{129} & \color[RGB]{239,71,111}\dots & \color[RGB]{239,71,111}X_{232} & \color[RGB]{239,71,111}Y_{232} & & & & & & \\ & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}0 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}\dots & \color[RGB]{239,71,111}0 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}0.29 & \color[RGB]{239,71,111}0.68 & \color[RGB]{239,71,111}0.68 & \color[RGB]{239,71,111}0.69 & \color[RGB]{239,71,111}-0.01 \\ \color[RGB]{17,138,178}3 & \color[RGB]{17,138,178}X_{293} & \color[RGB]{17,138,178}Y_{293} & \color[RGB]{17,138,178}X_{526} & \color[RGB]{17,138,178}Y_{526} & \color[RGB]{17,138,178}\dots & \color[RGB]{17,138,178}X_{578} & \color[RGB]{17,138,178}Y_{578} & & & & & & \\ & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}\dots & \color[RGB]{17,138,178}0 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}0.28 & \color[RGB]{17,138,178}0.65 & \color[RGB]{17,138,178}0.67 & \color[RGB]{17,138,178}0.64 & \color[RGB]{17,138,178}0.03 \\ \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}X_{281} & \color[RGB]{6,214,160}Y_{281} & \color[RGB]{6,214,160}X_{520} & \color[RGB]{6,214,160}Y_{520} & \color[RGB]{6,214,160}\dots & \color[RGB]{6,214,160}X_{363} & \color[RGB]{6,214,160}Y_{363} & & & & & & \\ & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}\dots & \color[RGB]{6,214,160}0 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}0.28 & \color[RGB]{6,214,160}0.68 & \color[RGB]{6,214,160}0.71 & \color[RGB]{6,214,160}0.66 & \color[RGB]{6,214,160}0.05 \\ \end{array} } \]
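The resampling pattern in these tables is easy to mimic in code. Here’s a minimal Python sketch, using numpy and simulated 0/1 data in place of the actual poll (the proportions 0.28 and 0.68 are just taken from the tables above), that draws bootstrap samples of the \((X_i, Y_i)\) pairs and recomputes the difference in subsample proportions each time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the poll: X_i is the 0/1 covariate, Y_i the 0/1 response.
n = 625
X = rng.binomial(1, 0.28, size=n)
Y = rng.binomial(1, 0.68, size=n)

def difference(x, y):
    # Difference in the proportion of ones between the x=1 and x=0 subsamples.
    return y[x == 1].mean() - y[x == 0].mean()

# Bootstrap: resample rows (pairs) with replacement, like rows 398, 129, ..., 232 above,
# and recompute the estimator on each bootstrap sample.
boot = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    boot.append(difference(X[idx], Y[idx]))

print(difference(X, Y))   # point estimate
print(np.std(boot))       # bootstrap estimate of its standard deviation
```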
The normal distribution is a parametric family of distributions involving two parameters.
We say a random variable \(X\) is normally distributed with mean \(\theta\) and standard deviation \(\sigma\) …
… if the probability that it’s in an interval \([a,b]\) is given by this integral of its probability density. \[ P_{\theta,\sigma}(X \in [a,b]) = \int_a^b f_{\theta,\sigma}(x)dx \qfor f_{\theta, \sigma}(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\theta)^2}{2\sigma^2}} \]
We have to talk about the probability that it’s ‘in an interval’ rather than that it ‘takes a value’ because …
… the probability it actually takes on any particular value \(x\) is zero: it’s an integral from \(x\) to \(x\).
This seems like an annoyance, but given what the normal distribution is actually used for, it’s a blessing.1
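To compute these interval probabilities in practice, we can integrate the density numerically or, equivalently, difference the normal CDF. A quick sketch using scipy, with arbitrary illustrative values of \(\theta\), \(\sigma\), \(a\), and \(b\):

```python
from scipy.stats import norm
from scipy.integrate import quad

theta, sigma = 0.68, 0.02   # arbitrary illustrative values
a, b = 0.66, 0.70

# P(X in [a, b]) two ways: integrating the density and differencing the CDF.
integral, _ = quad(lambda x: norm.pdf(x, loc=theta, scale=sigma), a, b)
via_cdf = norm.cdf(b, loc=theta, scale=sigma) - norm.cdf(a, loc=theta, scale=sigma)
print(integral, via_cdf)   # same number

# P(X = a) for any single point is the integral from a to a: zero.
print(norm.cdf(a, loc=theta, scale=sigma) - norm.cdf(a, loc=theta, scale=sigma))
```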
The Central Limit Theorem
The sampling distribution of a mean of independent random variables, \(\color[RGB]{64,64,64}\frac{1}{n}\sum\limits_{i=1}^n X_i\), gets arbitrarily close to its normal approximation
as the number of random variables \(\color[RGB]{64,64,64}n\) increases to infinity.
Caveat. This is only true if each random variable contributes infinitesimally to the mean, i.e. \(\color[RGB]{64,64,64}\max_{i\le n} X_i/n \to 0\) as \(\color[RGB]{64,64,64}n \to \infty\).
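A small simulation makes this concrete. The sketch below, assuming a binary population with \(\theta = 0.70\) and \(n = 625\), checks that roughly 95% of simulated sample proportions land within \(1.96\sigma\) of \(\theta\), as the normal approximation predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n = 0.70, 625
sigma = np.sqrt(theta * (1 - theta) / n)   # the normal approximation's standard deviation

# 10,000 draws from the sampling distribution of the mean of n independent 0/1 variables.
means = rng.binomial(n, theta, size=10_000) / n

print(means.std(), sigma)                                # simulated sd vs. sigma
print(np.mean(np.abs(means - theta) <= 1.96 * sigma))    # should be close to 0.95
```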
\[ \text{interval estimate} = \hat\theta \pm 1.96 \hat\sigma \]
\[ \begin{aligned} \text{normal approximation} & \ f_{\theta,\sigma}(x) \qfor \sigma^2 = \frac{\theta(1-\theta)}{n} \\ \text{corresponding estimate} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\sigma^2 = \frac{\hat \theta(1-\hat \theta)}{n}. \end{aligned} \]
\[ \begin{aligned} \text{normal approximation} & \ f_{\theta,\sigma}(x) \qfor \sigma^2 = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1} \\ \text{corresponding estimate} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\sigma^2 = \frac{\hat \theta(1-\hat \theta)}{n} \times \frac{m-n}{m-1}. \end{aligned} \]
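In code, the two interval estimates differ only by the factor \((m-n)/(m-1)\). Here’s a quick sketch plugging in the numbers from the polling example (\(\hat\theta = 0.68\), \(n = 625\), \(m = 7.23\)M):

```python
import numpy as np

theta_hat, n, m = 0.68, 625, 7_230_000   # sample proportion, sample size, population size

sigma_hat = np.sqrt(theta_hat * (1 - theta_hat) / n)      # sampling with replacement
sigma_hat_fpc = sigma_hat * np.sqrt((m - n) / (m - 1))    # sampling without replacement

# The 95% interval estimates, theta_hat +/- 1.96 * sigma_hat.
print(theta_hat - 1.96 * sigma_hat, theta_hat + 1.96 * sigma_hat)
print(theta_hat - 1.96 * sigma_hat_fpc, theta_hat + 1.96 * sigma_hat_fpc)
# With m in the millions and n = 625, the correction factor is essentially 1,
# so the two intervals are indistinguishable.
```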
\[ \begin{aligned} \text{estimator} & = \frac{\sum_{i:X_i=1} Y_i}{\sum_{i:X_i=1} 1} - \frac{\sum_{i:X_i=0} Y_i}{\sum_{i:X_i=0} 1} \\ \text{bootstrap estimate} & = \frac{\sum_{i:X_i^*=1} Y_i^*}{\sum_{i:X_i^*=1} 1} - \frac{\sum_{i:X_i^*=0} Y_i^*}{\sum_{i:X_i^*=0} 1} \end{aligned} \]
\[ \begin{aligned} \text{normal approximation:} & \ f_{\theta,\sigma}(x) \qfor \theta=\mu(1)-\mu(0) \qand \sigma^2 = \mathop{\mathrm{E}}\qty[\frac{\mu(1)\{1-\mu(1)\}}{N_1} + \frac{\mu(0)\{1-\mu(0)\}}{N_0}] \\ \text{corresponding estimate:} & \ f_{\hat\theta, \hat \sigma}(x) \qfor \hat\theta=\hat\mu(1)-\hat\mu(0) \qand \hat\sigma^2 = \frac{\hat \mu(1)\{1-\hat \mu(1)\}}{N_1} + \frac{\hat \mu(0)\{1-\hat \mu(0)\}}{N_0} \end{aligned} \]
Above, \(N_0\) and \(N_1\) are the sizes of the subsamples with \(X_i=0\) and \(X_i=1\), and \(\mu(0)\) and \(\mu(1)\) are the proportions of ones in the corresponding subpopulations. More on this Friday and next week.
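Translating the estimate and its estimated standard deviation into code is direct. A sketch, again with simulated 0/1 data standing in for the poll:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 625
X = rng.binomial(1, 0.28, size=n)   # simulated 0/1 covariate
Y = rng.binomial(1, 0.68, size=n)   # simulated 0/1 response

N1, N0 = (X == 1).sum(), (X == 0).sum()
mu1_hat, mu0_hat = Y[X == 1].mean(), Y[X == 0].mean()

theta_hat = mu1_hat - mu0_hat
sigma_hat = np.sqrt(mu1_hat * (1 - mu1_hat) / N1 + mu0_hat * (1 - mu0_hat) / N0)

# Interval estimate for the difference in subpopulation proportions.
print(theta_hat - 1.96 * sigma_hat, theta_hat + 1.96 * sigma_hat)
```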
\[ \begin{aligned} E ( a Y + b Z ) &= E (aY) + E (bZ) \\ &= aE(Y) + bE(Z) \\ & \text{ for random variables $Y, Z$ and numbers $a,b$ } \end{aligned} \]
\[ \begin{aligned} \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] &= \frac1n \sum_{i=1}^n \mathop{\mathrm{E}}[Y_i] && \text{linearity of expectation} \\ &= \frac1n \sum_{i=1}^n \qty{\sum_{j=1}^m y_j \times P(J_i=j) } && \text{def. expectation} \\ &= \frac1n \sum_{i=1}^n \qty{\sum_{j=1}^m y_j \times \frac{1}{m} } && \text{sampling uniformly at random} \\ &= \frac1n \sum_{i=1}^n \mu && \text{ for } \ \mu = \frac1m \sum_{j=1}^m y_j \\ &= \frac{1}{n} \times n \times \mu = \mu \end{aligned} \]
\[ \small{ \begin{aligned} \mathop{\mathrm{E}}\qty( a Y + b Z ) &= \sum_{y}\sum_z (a y + b z) \ P(Y=y, Z=z) && \text{ by definition of expectation} \\ &= \sum_{y}\sum_z a y \ P(Y=y, Z=z) + \sum_{z}\sum_y b z \ P(Y=y, Z=z) && \text{changing the order in which we sum} \\ &= \sum_{y} a y \ \sum_z P(Y=y,Z=z) + \sum_{z} b z \ \sum_y P(Y=y,Z=z) && \text{pulling constants out of the inner sums} \\ &= \sum_{y} a y \ P(Y=y) + \sum_{z} b z \ P(Z=z) && \text{summing to get marginal probabilities from our joint } \\ &= a\sum_{y} y \ P(Y=y) + b\sum_{z} z \ P(Z=z) && \text{ pulling constants out of the remaining sum } \\ &= a\mathop{\mathrm{E}}Y + b \mathop{\mathrm{E}}Z && \text{ by definition} \end{aligned} } \]
\[ \mathop{\mathrm{E}}[YZ] = \mathop{\mathrm{E}}[Y]\mathop{\mathrm{E}}[Z] \qqtext{when $Y$ and $Z$ are independent} \]
\[ \begin{aligned} \mathop{\mathrm{E}}[YZ] &= \sum_{yz} yz \ P(Y=y, Z=z) && \text{by definition of expectation} \\ &= \sum_y \sum_z yz \ P(Y=y) P(Z=z) && \text{factoring and ordering sums } \\ &= \textcolor[RGB]{17,138,178}{\sum_y y \ P(Y=y)} \textcolor[RGB]{239,71,111}{\sum_z z \ P(Z=z)} && \text{pulling factors that don't depend on $z$ out of the inner sum} \\ &= \textcolor[RGB]{17,138,178}{\mathop{\mathrm{E}}[Y]} \textcolor[RGB]{239,71,111}{\mathop{\mathrm{E}}[Z]} && \text{by definition of expectation} \end{aligned} \]
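Both facts are easy to check by simulation. A sketch with independent Bernoulli draws for \(Y\) and \(Z\) and arbitrary constants \(a\) and \(b\):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, -3.0
Y = rng.binomial(1, 0.7, size=1_000_000)   # independent draws
Z = rng.binomial(1, 0.3, size=1_000_000)

# Linearity of expectation: E[aY + bZ] = a E[Y] + b E[Z].
print((a * Y + b * Z).mean(), a * Y.mean() + b * Z.mean())

# Product rule for independent variables: E[YZ] = E[Y] E[Z] (up to simulation error).
print((Y * Z).mean(), Y.mean() * Z.mean())
```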
We’ve worked out the variance (i.e. the squared standard deviation) of a binary random variable. \[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y] = \theta(1-\theta) \qqtext{ when } Y = \begin{cases} 1 & \qqtext{ with probability } \theta \\ 0 & \qqtext{ with probability } 1-\theta \end{cases} \]
When we sample uniformly at random, our sample proportion is the mean of \(n\) independent variables like this.
\[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] = \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n \sum_{i=1}^n Y_i] = \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] = \frac{\theta(1-\theta)}{n} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &\overset{\texttip{\small{\unicode{x2753}}}{Using the definitions of $\mathop{\mathrm{\mathop{\mathrm{V}}}}$ and $\hat\theta$}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 1. Centering each term. We can do this because Expectation is linear.}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 2. Squaring out the Sum. This is just arithmetic: a version of $(a+b)^2=a^2+ab+ba+b^2$ for bigger sums.}}{=} \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 3. Distributing the Expectation. Linearity of Expectation again.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 4. Taking Expectations term-by-term.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 ] \overset{\texttip{\small{\unicode{x2753}}}{By definition}}{=} \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\small{\unicode{x2753}}}{Because $Y_i$ and $Y_j$ are independent.}}{=} \mathop{\mathrm{E}}\qty[\qty(Y_i - \mathop{\mathrm{E}}[Y_i])] \mathop{\mathrm{E}}\qty[\qty(Y_j - \mathop{\mathrm{E}}[Y_j])] \overset{\texttip{\small{\unicode{x2753}}}{Because each factor has mean zero.}}{=} 0 & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\small{\unicode{x2753}}}{Because each sum over $j$ has one nonzero term---the one where $j=i$---and it's always $\theta(1-\theta)$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta) = \frac{1}{n^2} \times n \times \theta(1-\theta) = \frac{\theta(1-\theta)}{n} \end{aligned} \]
The variance of our mean is \(\color[RGB]{64,64,64}{1/n \times}\) the variance of one observation. \[ \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n\sum_{i=1}^n Y_i] = \frac{\mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_1]}{n} \]
So the standard deviation of our mean is \(\color[RGB]{64,64,64}{1/\sqrt{n} \times}\) the standard deviation of one observation.
\[ \mathop{\mathrm{sd}}\qty[\frac1n\sum_{i=1}^n Y_i] = \frac{\mathop{\mathrm{sd}}[Y_1]}{\sqrt{n}} = \sqrt{\frac{\theta(1-\theta)}{n}} \]
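A simulation check of this formula: the standard deviation of many simulated sample proportions should match \(\sqrt{\theta(1-\theta)/n}\). A sketch with \(\theta = 0.70\) and \(n = 625\):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n = 0.70, 625

# 100,000 simulated sample proportions, each the mean of n independent 0/1 draws.
theta_hats = rng.binomial(n, theta, size=100_000) / n

print(theta_hats.std())                            # simulated sd of the mean
print(np.sqrt(theta * (1 - theta)) / np.sqrt(n))   # sd[Y_1] / sqrt(n)
```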
\[ \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] = \mathop{\mathrm{\mathop{\mathrm{V}}}}\qty[\frac1n \sum_{i=1}^n Y_i] = \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \frac1n \mathop{\mathrm{E}}[\sum_{i=1}^n Y_i] }^2 ] = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &\overset{\texttip{\small{\unicode{x2753}}}{Using the definitions of $\mathop{\mathrm{\mathop{\mathrm{V}}}}$ and $\hat\theta$}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \mathop{\mathrm{E}}\qty[\frac1n \sum_{i=1}^n Y_i] }^2 ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 1. Centering each term. We can do this because Expectation is linear.}}{=} \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 2. Squaring out the Sum. This is just arithmetic: a version of $(a+b)^2=a^2+ab+ba+b^2$ for bigger sums.}}{=} \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 3. Distributing the Expectation. Linearity of Expectation again.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &\overset{\texttip{\small{\unicode{x2753}}}{Step 4. Taking Expectations term-by-term.}}{=} \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 ] \overset{\texttip{\small{\unicode{x2753}}}{By definition}}{=} \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\small{\unicode{x2753}}}{Because $Y_i$ and $Y_j$ are independent.}}{=} \mathop{\mathrm{E}}\qty[\qty(Y_i - \mathop{\mathrm{E}}[Y_i])] \mathop{\mathrm{E}}\qty[\qty(Y_j - \mathop{\mathrm{E}}[Y_j])] \overset{\texttip{\small{\unicode{x2753}}}{Because each factor has mean zero.}}{=} 0 & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\small{\unicode{x2753}}}{Because each sum over $j$ has one nonzero term---the one where $j=i$---and it's always $\theta(1-\theta)$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta) = \frac{1}{n^2} \times n \times \theta(1-\theta) = \frac{\theta(1-\theta)}{n} \end{aligned} \]
\[ \begin{aligned} \mathop{\mathrm{\mathop{\mathrm{V}}}}[\hat\theta] &= \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n Y_i - \frac1n \sum_{i=1}^n \mathop{\mathrm{E}}[Y_i] }^2 ] \\ &= \mathop{\mathrm{E}}\qty[ \qty{ \frac1n \sum_{i=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) }^2 ] \\ &= \mathop{\mathrm{E}}\qty[ \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &= \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i])(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \\ &= \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \begin{cases} \mathop{\mathrm{E}}\qty(Y_i - \mathop{\mathrm{E}}[Y_i])^2 = \mathop{\mathrm{\mathop{\mathrm{V}}}}[Y_i] = \theta (1-\theta) & \text{ when } j=i \\ \textcolor{blue}{\mathop{\mathrm{E}}\qty[ \qty(Y_i - \mathop{\mathrm{E}}[Y_i]) \qty(Y_j - \mathop{\mathrm{E}}[Y_j]) ] \overset{\texttip{\small{\unicode{x2753}}}{Why $-\frac{\theta(1-\theta)}{m-1}$? This'll be a homework problem.}}{=} -\frac{\theta (1-\theta)}{m-1} } & \text{ when } j \neq i \end{cases} \\ &\overset{\texttip{\small{\unicode{x2753}}}{Because each sum over $j$ now includes one copy of $\theta(1-\theta)$ and $n-1$ copies of $-\frac{\theta(1-\theta)}{m-1}$}}{=} \frac{1}{n^2} \sum_{i=1}^n \theta(1-\theta)\qty{1 - (n-1) \times \frac{1}{m-1}} \\ &\overset{\texttip{\small{\unicode{x2753}}}{Pulling out common factors and simplifying}}{=} \frac{\theta(1-\theta)}{n} \times \qty{1 - \frac{n-1}{m-1}} = \frac{\theta(1-\theta)}{n} \times \frac{m-n}{m-1}. \end{aligned} \]
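We can see the correction factor in a simulation by sampling without replacement from a small binary population, where \((m-n)/(m-1)\) is far from 1. A sketch, assuming a made-up population of \(m = 1000\) with 70% ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up binary population small enough that the correction matters: m = 1000, 70% ones.
m, n = 1000, 625
population = np.repeat([1, 0], [700, 300])
theta = population.mean()

# Simulated sample proportions from draws *without* replacement.
theta_hats = [rng.choice(population, size=n, replace=False).mean() for _ in range(10_000)]

print(np.var(theta_hats))                           # simulated variance
print(theta * (1 - theta) / n * (m - n) / (m - 1))  # formula with the correction
print(theta * (1 - theta) / n)                      # formula without it, for comparison
```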
\[ \begin{aligned} \textcolor{red}{\text{binomial interval}} &= 0.6800 \pm 0.0368 \\ \textcolor{blue}{\text{normal interval}} &= 0.6800 \pm 0.0366 \end{aligned} \]
\[ \hat \theta \pm 1.96\sigma \qfor \sigma = \sqrt{\frac{\theta (1-\theta)}{n}} \qqtext{ is } \hat\theta \pm .01 \qqtext{if} 1.96\sqrt{\frac{\theta (1-\theta)}{n}} = .01 \]
\[ n = \frac{1.96^2 \ \hat\theta (1-\hat\theta)}{.01^2} \approx 8000 \]
\[ \begin{aligned} \pm 0.037 &= \pm 1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{625}} \\ \pm 0.01 &= \pm 1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{n}} \end{aligned} \]
\[ \frac{0.037}{.01} = \frac{1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{625}}}{1.96\sqrt{\frac{\hat\theta (1-\hat\theta)}{n}}} = \sqrt{\frac{n}{625}} \qqtext{ so } n = 625\left(\frac{0.037}{.01}\right)^2 \approx 8000 \]
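Either route gives roughly the same answer in code. A quick sketch using \(\hat\theta = 0.68\) from the preliminary poll:

```python
theta_hat = 0.68   # preliminary estimate of theta
half_width = 0.01  # target half-width of the interval estimate

# Solve 1.96 * sqrt(theta_hat * (1 - theta_hat) / n) = half_width for n ...
n_formula = 1.96**2 * theta_hat * (1 - theta_hat) / half_width**2

# ... or rescale the n = 625 poll, whose half-width was about 0.037.
n_rescaled = 625 * (0.037 / half_width)**2

print(n_formula, n_rescaled)   # about 8400 and 8600, both in the ballpark of 8000
```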
What do we do if we don’t have any data yet?
We can still use the formula we worked out earlier to get somewhere. \[ n = \frac{1.96^2 \ \theta (1-\theta)}{.01^2} \]
We don’t have an estimate of \(\theta\), but we do know it’s between 0 and 1. And, consequently, so is \(\theta(1-\theta)\).
So we know that if we just substitute \(1\) in for \(\theta(1-\theta)\), we’ll get a number that’s bigger than we need.
\[ n < n' = \frac{1.96^2 \cdot 1}{.01^2} \approx 38400 \]
That’s a bit excessive. In fact, we can substitute in \(1/4\) instead of \(1\). \[ n < n' = \frac{1.96^2 \cdot 1/4}{.01^2} \approx 9600 \]
Much better. That’s pretty close to the number we got with preliminary data.
Here’s the claim. Why is it true? \[ n := \frac{1.96^2 \ \theta (1-\theta)}{.01^2} < n' := \frac{1.96^2 \ \times 1/4}{.01^2} \]
Hint. Look at the graph of \(f(x) = x(1-x)\) above.
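If you’d rather check numerically than read the graph, a few lines over a grid confirm that \(x(1-x)\) never exceeds \(1/4\), with the maximum at \(x = 1/2\):

```python
import numpy as np

x = np.linspace(0, 1, 1001)
f = x * (1 - x)
print(f.max(), x[f.argmax()])   # 0.25 at x = 0.5, so theta * (1 - theta) <= 1/4 for all theta
```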
Nothing here will show up on an exam.
\[ \qty{\sum_{i=1}^n Z_i}^2 = \sum_{i=1}^n \sum_{j=1}^n Z_i Z_j \]
\[ (a+b)^2 = \textcolor[RGB]{239,71,111}{(a+b)}\textcolor[RGB]{17,138,178}{(a+b)} = \textcolor[RGB]{239,71,111}{a}\textcolor[RGB]{17,138,178}{a} + \textcolor[RGB]{239,71,111}{a}\textcolor[RGB]{17,138,178}{b} + \textcolor[RGB]{239,71,111}{b}\textcolor[RGB]{17,138,178}{a} + \textcolor[RGB]{239,71,111}{b}\textcolor[RGB]{17,138,178}{b} = a^2 + 2ab + b^2 \]
\[ \qty{\sum_{i=1}^n Z_i}^2 = \textcolor[RGB]{239,71,111}{\sum_{i=1}^n Z_i} \textcolor[RGB]{17,138,178}{\sum_{j=1}^n Z_j} = \textcolor[RGB]{239,71,111}{\sum_{i=1}^n} \textcolor[RGB]{17,138,178}{\sum_{j=1}^n} \textcolor[RGB]{239,71,111}{Z_i} \textcolor[RGB]{17,138,178}{Z_j} \]
See Slide 5.2 if you’re curious about what I mean by that.
See Slide 5.1 if this is unfamiliar.
If we used \(i\) to count terms in both sums, we’d wind up with two different things we call \(i\): one pink and one teal. We could get by saying ‘pink \(i\)’ and ‘teal \(i\)’ in class, but people you meet later on would probably get confused if you talk this way. Using \(i\) and \(j\) is conventional.
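Here’s a tiny numerical spot-check of the squaring-out identity, a sketch in Python with an arbitrary vector \(Z\):

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=10)   # an arbitrary vector

lhs = Z.sum() ** 2
rhs = sum(Z[i] * Z[j] for i in range(10) for j in range(10))
print(lhs, rhs)   # equal up to floating-point rounding
```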